| marp | true |
|---|---|
| theme | default |
| paginate | true |
An agent that keeps up with you. Sees what you see. Adjusts when you adjust.
This week: We proved the data we record is correct -- and that we can act on it.
Record raw OS events (mouse, keyboard, scroll, touchpad gestures) into SQLite Replay them 1:1 at microsecond timing to drive a real browser Verify with a 23-level E2E test suite that compares DOM state
This is the actuation layer for the agent -- the foundation for it to take actions in the real world.
Cross-platform: macOS (Quartz) + Windows (SendInput)
What you're seeing:
- Real user session recorded, then replayed into the same browser
- Clicks land, text appears, scroll positions match
- The browser can't tell the difference between the user and the replay
Narrate live over the video
What you're seeing:
- 23 test levels: record, replay, compare -- pass/fail badges in real-time
- 20 out of 23 tests pass
Narrate live over the video
Clicks, right-click, double/triple/middle-click, drag, text input, special keys, auto-repeat, copy/paste, shift+click, scroll direction, multi-line input, focus order, click accuracy (7/7 targets), form controls, drag-and-drop
| Problem | Detail |
|---|---|
| Touchpad scroll | Windows touchpad gestures go through an OS gesture recognizer that's non-deterministic -- replay produces different velocity each time. We investigated a kernel driver approach but it's too fragile to ship. |
| Mouse velocity | 3.4% distance loss from OS acceleration + browser event coalescing at high speed |
Key takeaway: We know exactly what works, what doesn't, and why. Touchpad gesture replay is a known unsolved problem industry-wide -- we're not shipping a fragile workaround.
We investigated deeply (custom kernel driver, HID injection). Conclusion: Windows touchpad gesture replay is too fragile and non-deterministic to ship. No one in the industry has solved this reliably. We'll work around it at a higher level (e.g. scroll via SendInput instead of raw gestures).
Use the replay engine to take a recording and find the minimal action sequence that achieves the same result. The actuation layer + action space lets us treat any recording as a "gym pass" and discover the shortest path to the same outcome.
Record (what the user did) + Understand (what mattered) + Act (replay/optimize) The actuation layer we proved this week is the "Act" piece.