CLI Use

When an agent runs sootsim do tap-id loginButton, you see a visible cursor animate to the element, the touch fires, and the canvas updates. No screenshot OCR, no XCUITest, no visual reasoning — same surface as a human, just faster.

The realtime cursor view

Every do command renders a visible agent cursor directly into the canvas. The cursor moves to the target node, the target visibly depresses on press, and the screen updates in place.

terminal

sootsim do tap-text "Sign in"
sootsim do tap-id createThread
sootsim do scroll feed-list 0 500
sootsim do type "hello world"

What this gets you:

  • Watch a remote or CI agent work in real time in a single sim window. Useful for answering “why didn’t this tap take?” without re-running.
  • Recordings (sootsim record) capture the cursor too, so flow replays look like a person used the app.
  • One canonical interaction surface — the same path a human’s pointer takes, the same hit testing, the same gesture pipeline.

Per-session claim leases

Multiple agents can attach to the bridge at once without stepping on each other:

  • Reads (describe, tree, screenshot, get *) always pass through.
  • Writes (do tap, do type, close) are gated by a per-session lease.
  • Leases expire after 10 minutes of inactivity.
  • Same-session sockets coexist — one agent spawning parallel CLI calls works without self-disconnects.

terminal

sootsim list # see connected tabs and their session ids
sootsim claim tab-2 # take the write lease on tab-2
sootsim claim tab-2 --force # boot the incumbent if locked

Two Claude Codes in two terminals, two Cursors in two IDEs, a CI test alongside a human-driven session — all coexist without fighting over the bridge.
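
A minimal sketch of a claim-before-write guard built on the commands above. It assumes sootsim claim exits with a non-zero status when another session holds the lease; this page does not state that exit-code behaviour. claim_then is a hypothetical helper name, not part of sootsim.

```shell
# claim_then <tab> <command...>
# Hypothetical helper: take the write lease on <tab>, then run the command.
# Assumption: `sootsim claim` returns a non-zero status if the claim fails.
claim_then() {
  tab="$1"
  shift
  if ! sootsim claim "$tab"; then
    # Deliberately no --force here: evicting another session's lease
    # should be an explicit decision, not a silent fallback.
    echo "could not claim $tab; another session may hold the lease" >&2
    return 1
  fi
  "$@"
}

# Example call (not executed here):
#   claim_then tab-2 sootsim do tap-id createThread
```

Keeping --force out of the helper means a blocked write surfaces as an error rather than booting a human or another agent mid-flow.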

Agent-aware auto-settle

Every write polls layout stability before returning, so the agent sees post-animation state instead of mid-transition state.

Caller   Settle budget   Why
Agent    1200 ms         Gives animations room to finish
Human    350 ms          Keeps interactive use snappy

Sootsim auto-detects agent environments via CLAUDECODE, CURSOR_TRACE_ID, and a few other env vars. If your tool isn’t detected automatically, set SOOTSIM_AGENT=1.

No sleep chaining, no do settle “just to be safe” — every write already participates.
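
The budget selection described above can be modelled in a few lines of shell. This is only an illustration of the documented rule, not sootsim's actual detection code. The function names is_agent_env and settle_budget_ms are hypothetical, and the "few other env vars" are not listed on this page, so they are left out.

```shell
# is_agent_env: hypothetical model of the documented detection rule.
# Only SOOTSIM_AGENT, CLAUDECODE, and CURSOR_TRACE_ID are named in the docs;
# any other variables sootsim checks are unknown and not modelled here.
is_agent_env() {
  [ "${SOOTSIM_AGENT:-}" = "1" ] && return 0
  [ -n "${CLAUDECODE:-}" ] && return 0
  [ -n "${CURSOR_TRACE_ID:-}" ] && return 0
  return 1
}

# settle_budget_ms: print the settle budget from the table above.
settle_budget_ms() {
  if is_agent_env; then
    echo 1200   # agent: give animations room to finish
  else
    echo 350    # human: keep interactive use snappy
  fi
}
```

A wrapper script can call settle_budget_ms to check which budget a given environment would get before deciding whether to set SOOTSIM_AGENT=1.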

The accessibility DOM mirror

Canvas nodes are mirrored into a hidden DOM tree as real semantic elements — <button>, <a>, <input>, <textarea>, <h2>, <img> — with full ARIA (role, label, state, value, hint), data-testid, and text content.

Any MCP-aware browser tool (Claude Code, Cursor, Codex, Chrome MCP) treats the sim like a website — clicks the same elements, reads the same labels, types into the same inputs.

Two modes:

  • Shallow — flat DOM, one element per hit-testable node. Default.
  • Deep — nested interactive proxies. Scrollable <div>, real <input>/<textarea>, <button> that forwards to canvas.

The mirror's update throttle defaults to 100 ms; calling window.__sootsimA11y.active() drops it to 30 ms for high-frequency agent loops.

describe, find, do — the daily loop

terminal

# 1. ground the agent in current state
sootsim describe --verbose
# 2. locate the target
sootsim find --testid createThread
# 3. act
sootsim do tap-id createThread
# 4. verify
sootsim describe --verbose
sootsim get errors 5
# 5. record the flow for a PR preview
sootsim record --mode combined --duration 8 --open

Companion reads:

  • sootsim tree — compact tree
  • sootsim find --testid loginButton — locate a node by test ID
  • sootsim get errors 5 — recent console errors
  • sootsim get requests 5 — recent network calls
  • sootsim debug state — shell state, keyboard state, scroll node, hit-test, gesture state
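
Steps 1 through 4 of the loop above can be wrapped in a small shell function. This is a sketch under one stated assumption: that sootsim find exits with a non-zero status when no node matches, which this page does not confirm. tap_and_verify is a hypothetical name.

```shell
# tap_and_verify <test-id>
# Hypothetical wrapper around the ground / locate / act / verify loop.
# Assumption: `sootsim find` returns a non-zero status when nothing matches.
tap_and_verify() {
  testid="$1"
  sootsim describe --verbose || return 1        # 1. ground in current state
  if ! sootsim find --testid "$testid"; then     # 2. locate the target
    echo "no node with test ID: $testid" >&2
    return 2
  fi
  sootsim do tap-id "$testid" || return 3       # 3. act
  sootsim describe --verbose                     # 4. verify the new state
  sootsim get errors 5                           #    and check recent errors
}

# Example call (not executed here):
#   tap_and_verify createThread
```

Because every write already auto-settles, the wrapper needs no sleep between the tap and the second describe.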

Env vars

Variable                      Effect
SOOTSIM_AGENT=1               Force agent mode (longer auto-settle, verbose JSON where applicable)
SOOTSIM_SESSION_ID=<id>       Stable session key so same-session sockets coexist
SOOTSIM_UPLOAD_ORIGIN=<url>   Override the upload host for sootsim upload and flow --preview
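
As an illustration of how the first two variables might be combined, the sketch below runs any command with agent mode forced and a fixed session key, so parallel calls share one session. run_in_session is a hypothetical helper, and the session id ci-run-42 in the example is a made-up value.

```shell
# run_in_session <session-id> <command...>
# Hypothetical helper: run a command with SOOTSIM_AGENT=1 and a stable
# SOOTSIM_SESSION_ID, without changing the calling shell's environment.
run_in_session() {
  sid="$1"
  shift
  env SOOTSIM_AGENT=1 SOOTSIM_SESSION_ID="$sid" "$@"
}

# Example (not executed here): two parallel reads under one session key.
#   run_in_session ci-run-42 sootsim tree &
#   run_in_session ci-run-42 sootsim get errors 5 &
#   wait
```

Using env keeps the variables scoped to the single command, so the same terminal can still be used interactively with the shorter human settle budget.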