Browser-based visualization tool for exploring OWAMcap datasets with synchronized playback of screen recordings and interaction events.
Quick Start: https://huggingface.co/spaces/open-world-agents/visualize_dataset
Usage:
- Visit the viewer URL
- Either drag & drop local files, or enter a Hugging Face dataset ID
- Explore your data with synchronized video and input overlays
Features:
- Drag & Drop: Load local `.mcap` + `.mkv` files directly in browser
- Hugging Face Integration: Browse and load datasets via `?repo_id=org/dataset`
- Synchronized Playback: Video synced with keyboard/mouse overlays
- Large File Support: Uses MCAP index for seeking, never loads entire file
- Input Overlay: Keyboard (all keys), mouse (L/R/M buttons, scroll), cursor minimap
Install Node.js via nvm (recommended):
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
nvm install --lts

git clone https://github.com/open-world-agents/owa-dataset-visualizer
cd owa-dataset-visualizer
npm install
npm run dev

For browsing multiple recordings from a local directory:
# Serve a directory containing mcap/video pairs
python scripts/serve_local.py /path/to/recordings -p 8080
# Open visualizer with local server
# http://localhost:5173/?base_url=http://localhost:8080

Features:
- Auto-scans for mcap/video pairs
- HTTP Range support for streaming large videos
- Multi-threaded for concurrent requests
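
To make "HTTP Range support" concrete, here is a minimal sketch of how a single-range `Range` header can be resolved to byte offsets. It is written in JavaScript for illustration only and is not taken from `scripts/serve_local.py`; the helper name `parseByteRange` is hypothetical, and multi-range requests are deliberately left out.

```javascript
// Hypothetical helper (not the project's code): resolve a single-range
// "Range" header against a file of `size` bytes.
// Returns inclusive { start, end } offsets, or null if the range is
// malformed or cannot be satisfied (a server would answer 416 in that case).
function parseByteRange(header, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null;
  const [, first, last] = match;
  if (first === "" && last === "") return null;

  let start;
  let end;
  if (first === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const n = Number(last);
    if (n === 0) return null;
    start = Math.max(size - n, 0);
    end = size - 1;
  } else {
    // "bytes=A-B" or open-ended "bytes=A-"; clamp the end to the file size.
    start = Number(first);
    end = last === "" ? size - 1 : Math.min(Number(last), size - 1);
  }
  if (start >= size || start > end) return null;
  return { start, end };
}
```

With a resolved range, a server can reply `206 Partial Content` and send only those bytes, which is what lets the browser seek inside a large video without downloading it in full.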
| URL | Description |
|---|---|
| `/` | Landing page with featured datasets |
| `?repo_id=org/dataset` | Load Hugging Face dataset |
| `?base_url=http://localhost:8080` | Load from local file server |
| `?mcap=url&mkv=url` | Direct file URLs |
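
The URL forms in the table above could be dispatched roughly as follows. This is a sketch under stated assumptions, not the actual logic in `src/main.js`: the function name `resolveRoute`, the returned `mode` labels, and the precedence between parameters are all hypothetical.

```javascript
// Hypothetical sketch: map a query string to one of the loading modes
// listed in the URL table. The precedence order here is an assumption.
function resolveRoute(search) {
  const params = new URLSearchParams(search);

  // ?mcap=url&mkv=url : both direct file URLs must be present.
  const mcap = params.get("mcap");
  const mkv = params.get("mkv");
  if (mcap && mkv) return { mode: "direct", mcap, mkv };

  // ?repo_id=org/dataset : load from Hugging Face.
  const repoId = params.get("repo_id");
  if (repoId) return { mode: "huggingface", repoId };

  // ?base_url=http://localhost:8080 : load from a local file server.
  const baseUrl = params.get("base_url");
  if (baseUrl) return { mode: "local-server", baseUrl };

  // No recognized parameters: show the landing page.
  return { mode: "landing" };
}
```

In a browser this would typically be called as `resolveRoute(window.location.search)`.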
src/
├── main.js # Routing, landing page
├── viewer.js # Viewer logic, render loop
├── hf.js # HuggingFace API, file tree
├── state.js # StateManager, message handlers
├── mcap.js # MCAP loading, TimeSync
├── overlay.js # Keyboard/mouse canvas drawing
├── ui.js # Side panel, loading indicator
├── config.js # Featured datasets
├── constants.js # VK codes, colors, flags
└── styles.css
- Find nearest `keyboard/state` snapshot before target time
- Replay `keyboard` events from snapshot to target
- Find nearest `mouse/state` snapshot before target time
- Replay mouse events from snapshot to target
- Find latest `window` info
This enables O(snapshot interval) seek instead of O(file size).
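
The snapshot-then-replay steps above can be sketched for the keyboard case as follows. This is an illustration, not the code in `src/state.js`: it works on in-memory arrays sorted by time instead of reading through the MCAP index, and the message fields (`time`, `pressed`, `type`, `vk`) and function names are hypothetical.

```javascript
// Index of the last item with item.time <= t in a time-sorted array, or -1.
function lastAtOrBefore(items, t) {
  let lo = 0;
  let hi = items.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (items[mid].time <= t) lo = mid + 1;
    else hi = mid;
  }
  return lo - 1;
}

// Hypothetical sketch: reconstruct the set of pressed keys at `target`.
//   snapshots: [{ time, pressed: [vk, ...] }]       (assumed shape)
//   events:    [{ time, type: "press" | "release", vk }]  (assumed shape)
function pressedKeysAt(snapshots, events, target) {
  // Step 1: nearest snapshot at or before the target time.
  const i = lastAtOrBefore(snapshots, target);
  const snap = i >= 0 ? snapshots[i] : { time: -Infinity, pressed: [] };
  const pressed = new Set(snap.pressed);

  // Step 2: replay only the events after the snapshot, up to the target.
  // Events stamped exactly at the snapshot time are assumed to be
  // already reflected in the snapshot.
  let j = lastAtOrBefore(events, snap.time) + 1;
  for (; j < events.length && events[j].time <= target; j++) {
    const e = events[j];
    if (e.type === "press") pressed.add(e.vk);
    else if (e.type === "release") pressed.delete(e.vk);
  }
  return pressed;
}
```

Only the events between the chosen snapshot and the target are touched, so the replay cost is bounded by the snapshot interval rather than by the length of the recording. The same pattern would apply to mouse state.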
