Conversation
Add design document for a configurable browser event streaming system that captures CDP events (console, network, DOM, layout shifts, screenshots, interactions), tags them with tab/frame context, and writes them durably to S2 streams. Co-authored-by: Cursor <cursoragent@cursor.com>
Cursor Bugbot has reviewed your changes and found 2 potential issues.
- layout_settled: start 1s timer after page_load, reset on each shift, emit when timer expires. Handles zero-shift pages correctly.
- screenshots: downscale PNG by halving dimensions if base64 exceeds ~950KB, rather than truncating (which corrupts binary data).

Co-authored-by: Cursor <cursoragent@cursor.com>
| Type | Trigger |
|------|---------|
| `network_idle` | Pending request count at 0 for 500ms after navigation |
| `layout_settled` | 1s of no layout-shift entries after page_load (timer resets on each shift) |
Good catch -- the table and description were contradictory. Fixed in 7b9c491: after page_load, start a 1s timer. Each layout shift resets the timer. layout_settled fires when the timer expires (1s of quiet). For zero-shift pages, this correctly fires 1s after page_load.
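The timer rule described in this reply can be sketched as a small state machine. This is only an illustration under stated assumptions: `LayoutSettleTracker` and its methods are hypothetical names that do not appear in the RFC, and timestamps are passed in as unix millis so the logic can be exercised without real timers.

```go
package main

import "fmt"

// settleWindowMs is the quiet period before layout_settled fires (1s per the plan).
const settleWindowMs = 1000

// LayoutSettleTracker is a hypothetical helper, not a type from the RFC.
// All timestamps are unix milliseconds.
type LayoutSettleTracker struct {
	deadline int64 // time at which layout_settled should fire
	armed    bool  // true between page_load and the emitted layout_settled
}

// OnPageLoad starts the 1s timer.
func (t *LayoutSettleTracker) OnPageLoad(nowMs int64) {
	t.deadline = nowMs + settleWindowMs
	t.armed = true
}

// OnLayoutShift resets the timer. Shifts seen while the tracker is not armed are ignored.
func (t *LayoutSettleTracker) OnLayoutShift(nowMs int64) {
	if t.armed {
		t.deadline = nowMs + settleWindowMs
	}
}

// Poll reports true exactly once, when the quiet period has elapsed.
func (t *LayoutSettleTracker) Poll(nowMs int64) bool {
	if t.armed && nowMs >= t.deadline {
		t.armed = false
		return true
	}
	return false
}

func main() {
	var t LayoutSettleTracker
	t.OnPageLoad(0)
	t.OnLayoutShift(600)      // a shift at 600ms pushes the deadline to 1600ms
	fmt.Println(t.Poll(1000)) // false
	fmt.Println(t.Poll(1600)) // true
}
```

On a page with no shifts, `Poll` returns true 1s after `OnPageLoad`, which matches the zero-shift behaviour described above.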
| `interaction_key` | Injected JS | key, selector, tag |
| `interaction_scroll` | Injected JS | from_x, from_y, to_x, to_y, target_selector |
| `layout_shift` | Injected PerformanceObserver | score, sources (element, previous_rect, current_rect) |
| `screenshot` | ffmpeg x11grab (full display) | base64 PNG in data |
Valid concern -- truncating base64 PNG data produces corrupt output. We don't support 4K displays so this is unlikely in practice, but the plan now specifies: if the base64 PNG exceeds ~950KB, downscale by halving dimensions and re-encode. This keeps a usable PNG under the 1MB S2 limit. Fixed in 7b9c491.
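A minimal sketch of the downscale-and-re-encode step, using only the Go standard library. `FitScreenshot`, `halve`, and `noisePNG` are hypothetical names, the nearest-neighbour sampling is a simplification chosen to avoid external dependencies, and the limit is passed as a parameter rather than fixed at ~950KB so the example can run on small inputs.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
	"image"
	"image/color"
	"image/png"
)

// halve returns a copy of src at half width and height (nearest-neighbour sampling).
func halve(src image.Image) *image.RGBA {
	b := src.Bounds()
	w, h := b.Dx()/2, b.Dy()/2
	if w < 1 {
		w = 1
	}
	if h < 1 {
		h = 1
	}
	dst := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			dst.Set(x, y, src.At(b.Min.X+2*x, b.Min.Y+2*y))
		}
	}
	return dst
}

// FitScreenshot returns base64 PNG data no longer than limit, halving the
// image dimensions and re-encoding until it fits. Hypothetical helper.
func FitScreenshot(pngData []byte, limit int) (string, error) {
	if base64.StdEncoding.EncodedLen(len(pngData)) <= limit {
		return base64.StdEncoding.EncodeToString(pngData), nil
	}
	var img image.Image
	img, err := png.Decode(bytes.NewReader(pngData))
	if err != nil {
		return "", err
	}
	for {
		b := img.Bounds()
		if b.Dx() <= 1 && b.Dy() <= 1 {
			return "", fmt.Errorf("cannot fit screenshot within %d base64 bytes", limit)
		}
		small := halve(img)
		var buf bytes.Buffer
		if err := png.Encode(&buf, small); err != nil {
			return "", err
		}
		if base64.StdEncoding.EncodedLen(buf.Len()) <= limit {
			return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
		}
		img = small
	}
}

// noisePNG builds a deterministic, hard-to-compress PNG for demonstration.
func noisePNG(w, h int) []byte {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	seed := uint32(1)
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			seed = seed*1664525 + 1013904223 // simple LCG
			img.Set(x, y, color.RGBA{uint8(seed >> 24), uint8(seed >> 16), uint8(seed >> 8), 255})
		}
	}
	var buf bytes.Buffer
	_ = png.Encode(&buf, img)
	return buf.Bytes()
}

func main() {
	limit := 100 * 1024
	out, err := FitScreenshot(noisePNG(256, 256), limit)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(out) <= limit)
}
```

Because the image is re-encoded rather than cut off, the result stays a decodable PNG at every step, which is the property truncation loses.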
ulziibay-kernel left a comment
Is this infrastructure through which we can log the IP addresses that browser sessions are assigned? That is highly relevant for https://linear.app/onkernel/issue/KERNEL-801/residential-ip-reputation-measurement
Sayan- left a comment
building on top of CDP, S2 makes sense! I think the main risks are going to be some of the signal settling + chromium lifecycle handling but all solvable problems
| S2Stream --> Agents | ||
| ``` | ||
| The CDPMonitor opens its own CDP WebSocket to Chrome (using the existing `UpstreamManager.Current()` URL) and subscribes to configured CDP domains. It normalizes events into a common schema, tags each with tab/frame/target context, and dual-writes to both an S2 stream and a local ring buffer. The local buffer backs a `GET /events/stream` SSE endpoint. |
open question: what are the performance / IO implications of a second CDP WebSocket connection on these unikernels? with both the user's CDP session and the monitor subscribing to overlapping domains (e.g. Network), Chrome doubles the event traffic. worth benchmarking under load once implemented.
| Default state is **off**. An explicit `POST /events/start` is required to begin capture. |
when Chrome crashes and restarts mid-capture, the monitor's WebSocket dies and events are lost until reconnect. consider emitting synthetic monitor_disconnected / monitor_reconnected events so consumers know there's a gap in the stream rather than silently missing events.
| Each event is a JSON record, capped at **1MB** (S2's record size limit): | ||
| ```go | ||
| type BrowserEvent struct { |
how should a downstream consumer ensure event ordering? two events can share the same millisecond timestamp. also, how should consumers deduplicate events? (S2 provides at-least-once delivery, so duplicates are possible.)
| Timestamp int64 `json:"ts"` // unix millis | ||
| Type string `json:"type"` // snake_case event name | ||
| TargetID string `json:"target_id,omitempty"` // CDP target ID (tab/window) | ||
| SessionID string `json:"session_id,omitempty"` // CDP session ID |
nit: session_id here means the CDP session ID, but in the broader Kernel system "session" means the user's browser session. consider cdp_session_id to avoid confusion for consumers, or at minimum add a doc comment clarifying.
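On the ordering and deduplication question raised earlier in this thread, one possible approach (not something the RFC specifies) is a producer-assigned sequence number. The sketch below assumes a hypothetical `Seq` field and hypothetical `Sequencer` and `Deduper` helpers. It also assumes records arrive in stream order, so any duplicate has a `Seq` no greater than the last one accepted.

```go
package main

import "fmt"

// Event is a trimmed stand-in for BrowserEvent. Seq is a hypothetical
// field that is not part of the schema in the RFC.
type Event struct {
	Seq       uint64 `json:"seq"`  // assigned by the monitor, strictly increasing
	Timestamp int64  `json:"ts"`   // unix millis
	Type      string `json:"type"` // snake_case event name
}

// Sequencer stamps events on the producer side.
type Sequencer struct{ next uint64 }

// Stamp returns a copy of e with the next sequence number assigned.
func (s *Sequencer) Stamp(e Event) Event {
	s.next++
	e.Seq = s.next
	return e
}

// Deduper drops events the consumer has already seen.
type Deduper struct{ last uint64 }

// Accept reports whether e is new, and records it if so.
func (d *Deduper) Accept(e Event) bool {
	if e.Seq <= d.last {
		return false
	}
	d.last = e.Seq
	return true
}

func main() {
	var s Sequencer
	a := s.Stamp(Event{Timestamp: 5, Type: "network_idle"})
	b := s.Stamp(Event{Timestamp: 5, Type: "layout_settled"}) // same millisecond as a

	var d Deduper
	fmt.Println(d.Accept(a), d.Accept(b), d.Accept(b)) // true true false
}
```

Two events sharing a millisecond timestamp are still totally ordered by `Seq`. A monitor restart would reset the counter, so a real design would need an epoch or similar alongside it.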
| ### Event Types | ||
| **Raw CDP events** (forwarded from Chrome, enriched with target/frame context): |
as designed, each event type requires a custom transform from CDP params to the data schema, so adding a new event type means writing a new handler, which seems reasonable. I don't think attempting to generically pass through all CDP events across whatever domains users enable is quite right, but figured I'd double-check the semantics we're initially landing on
| - `S2_ACCESS_TOKEN` -- S2 access token (optional; if absent, S2 writes are skipped) | ||
| - `S2_BASIN` -- S2 basin name | ||
| - `S2_STREAM_NAME` -- stream name for browser events | ||
| - **Write path**: CDPMonitor batches events (every 100ms or 50 events, whichever comes first) and calls `streamClient.Append()` with `[]AppendRecord`. Each record body is the JSON-serialized `BrowserEvent`. |
should the monitor write to the ring buffer only, with the S2 writer as another reader? single write path, naturally decouples S2 latency from CDP processing, and the S2 writer is just another consumer like SSE clients.
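For reference, the batching rule in the write path (flush every 100ms or 50 events, whichever comes first) can be sketched independently of where the writer sits. `Batcher` is a hypothetical type, records are plain strings standing in for serialized events, and "every 100ms" is interpreted here as 100ms since the oldest pending record. The actual `Append()` call is left out.

```go
package main

import "fmt"

const (
	maxBatch   = 50  // flush when this many records are pending
	maxDelayMs = 100 // flush when the oldest pending record is this old
)

// Batcher accumulates records and decides when a batch is due.
// Timestamps are unix millis supplied by the caller.
type Batcher struct {
	pending []string
	firstMs int64 // arrival time of the oldest pending record
}

// Add queues a record and returns a batch if the size threshold is reached.
func (b *Batcher) Add(rec string, nowMs int64) []string {
	if len(b.pending) == 0 {
		b.firstMs = nowMs
	}
	b.pending = append(b.pending, rec)
	if len(b.pending) >= maxBatch {
		return b.flush()
	}
	return nil
}

// Tick returns a batch if the oldest pending record has waited long enough.
func (b *Batcher) Tick(nowMs int64) []string {
	if len(b.pending) > 0 && nowMs-b.firstMs >= maxDelayMs {
		return b.flush()
	}
	return nil
}

// flush hands over the pending records and resets the batcher.
func (b *Batcher) flush() []string {
	out := b.pending
	b.pending = nil
	return out
}

func main() {
	var b Batcher
	b.Add("event-1", 0)
	fmt.Println(len(b.Tick(50)))  // 0: only 50ms have passed
	fmt.Println(len(b.Tick(100))) // 1: the time threshold was reached
}
```

The same batcher works whether it is fed directly by the monitor or by a reader draining the ring buffer, as suggested in the comment above.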
| ## Testing Plan | ||
| ### Unit Tests (`server/lib/cdpmonitor/*_test.go`) |
the test plan covers happy paths well but doesn't mention failure modes: Chrome crash/restart during capture, ring buffer overflow under high event volume, or calling /events/start when Chrome isn't ready yet. worth adding at least the Chrome lifecycle case since that's a real production scenario.
Summary
Design document for a configurable browser event streaming system on the image server.
- `Target.setAutoAttach` with `flatten: true`
- `network_idle`, `layout_settled`, `navigation_settled` (composite of dom_content_loaded + network_idle + layout_settled)
- `POST /events/start` with a config body
- `truncated` flag

The full RFC is in `.cursor/plans/2026-02-05-events.md`. Also adds `devtools-protocol/` as a reference for CDP domain definitions.

Test plan
Note
Low Risk
Documentation-only change; no runtime behavior, APIs, or dependencies are modified in this PR.
Overview
Adds a new RFC document (`.cursor/plans/2026-02-05-events.md`) proposing a configurable browser event capture pipeline, including an event schema, computed “settling” meta-events, and APIs (`POST /events/start`, `POST /events/stop`, `GET /events/stream`) plus optional durable streaming to S2.

No production code changes are included in this PR; it is design/specification only, outlining planned new packages, config env vars, and a testing approach.
Written by Cursor Bugbot for commit 7b9c491.