Browser-based annotator for multimodal agent recordings. Automatically detects mobile (touch) or desktop (keyboard/mouse) datasets from MCAP files and renders the appropriate overlay on synchronized video playback. Add timestamped reasoning annotations and export as JSON.
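As a rough illustration of the annotation export described above, here is a minimal sketch. The `ReasoningAnnotation` shape, its field names, and the `exportAnnotations` helper are assumptions made for this example and may not match the tool's actual JSON schema.

```typescript
// Hypothetical annotation shape; the field names are assumptions,
// not the tool's documented schema.
interface ReasoningAnnotation {
  timestampMs: number; // position on the video timeline, in milliseconds
  text: string;        // free-form reasoning note
}

// Sort annotations by timestamp and serialize them as pretty-printed JSON.
function exportAnnotations(annotations: ReasoningAnnotation[]): string {
  const sorted = [...annotations].sort((a, b) => a.timestampMs - b.timestampMs);
  return JSON.stringify({ annotations: sorted }, null, 2);
}
```

Sorting before serializing keeps the exported file in playback order regardless of the order in which notes were added.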
npm install
npm run dev

Drop your MCAP + Video + JSON files into the browser, or load datasets from HuggingFace via URL parameters.
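Loading from HuggingFace via URL parameters could look roughly like the sketch below. The parameter names `repo`, `file`, and `revision` are hypothetical and chosen only for illustration, so check the tool's own documentation for the real ones. The only part taken from HuggingFace itself is the `/datasets/<repo>/resolve/<revision>/<path>` URL layout used for dataset file downloads.

```typescript
// Hypothetical parameter names ("repo", "file", "revision");
// the annotator's real URL parameters may differ.
interface DatasetRef {
  repo: string;     // e.g. "user/dataset"
  file: string;     // path of the file inside the dataset repository
  revision: string; // branch, tag, or commit; defaults to "main"
}

// Parse a query string such as "?repo=user/demo&file=a/b.mcap".
// Returns null when a required parameter is missing.
function parseDatasetParams(search: string): DatasetRef | null {
  const params = new URLSearchParams(search);
  const repo = params.get("repo");
  const file = params.get("file");
  if (!repo || !file) return null;
  return { repo, file, revision: params.get("revision") ?? "main" };
}

// Build a download URL using HuggingFace's dataset "resolve" layout.
function toResolveUrl(ref: DatasetRef): string {
  // Encode each path segment separately so "/" separators are preserved.
  const path = ref.file.split("/").map(encodeURIComponent).join("/");
  const revision = encodeURIComponent(ref.revision);
  return `https://huggingface.co/datasets/${ref.repo}/resolve/${revision}/${path}`;
}
```

In a browser, `parseDatasetParams(window.location.search)` would yield the reference to fetch, or `null` when the page was opened without dataset parameters and files should be dropped in manually.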