Why
ROADMAP Phase 4 lists "Real-Time Subject Tracking — local MediaPipe / YOLO / SAM-2 for fast-moving subject reframing, replacing today's Claude Vision keyframe approach (vibe edit reframe --track)" as an open item.
Today's path is slow and expensive (Vision API per keyframe). A local model would unlock per-frame tracking without per-call cost.
State of code (2026-04-29)
- packages/cli/src/commands/edit-cmd.ts:478-560: the vibe edit reframe command
- edit-cmd.ts:556 ("Analyzing frames for subject tracking..."): extracts keyframes and sends them to Claude Vision for subject location
- Output: ROI hints used by FFmpeg crop+resize
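For orientation, here is a minimal sketch of how an ROI hint could become an FFmpeg crop+resize filter. The `Rect` and `Size` shapes and both function names are hypothetical; the actual ROI format and filter construction in edit-cmd.ts are not shown in this issue. Only the `crop=w:h:x:y` and `scale=w:h` filter syntax is FFmpeg's own.

```typescript
// Hypothetical shapes: the real ROI hint format in edit-cmd.ts may differ.
interface Rect { x: number; y: number; width: number; height: number } // pixels
interface Size { width: number; height: number }

/**
 * Largest window of the target aspect ratio that fits inside the frame,
 * centred on the subject ROI and clamped so it never leaves the frame.
 */
function cropForRoi(frame: Size, roi: Rect, targetAspect: number): Rect {
  // Start from full width, then shrink to full height if that overflows.
  let width = frame.width;
  let height = Math.round(width / targetAspect);
  if (height > frame.height) {
    height = frame.height;
    width = Math.round(height * targetAspect);
  }
  const clamp = (v: number, lo: number, hi: number): number =>
    Math.min(Math.max(v, lo), hi);
  const cx = roi.x + roi.width / 2; // subject centre
  const cy = roi.y + roi.height / 2;
  const x = Math.round(clamp(cx - width / 2, 0, frame.width - width));
  const y = Math.round(clamp(cy - height / 2, 0, frame.height - height));
  return { x, y, width, height };
}

/** FFmpeg filter chain: crop to the window, then scale to the output size. */
function cropScaleFilter(crop: Rect, out: Size): string {
  return `crop=${crop.width}:${crop.height}:${crop.x}:${crop.y},scale=${out.width}:${out.height}`;
}
```

For a 1920x1080 frame, a subject box at (1500, 400) sized 200x300, and a 9:16 target, this yields a 608x1080 window at x=1296, y=0.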
Limitations of current approach:
- Cost grows linearly with keyframe count
- Sparse sampling means jittery output on fast-moving subjects
- Requires ANTHROPIC_API_KEY for what should be a local-first operation
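To illustrate the sparse-sampling limitation: between two keyframes the subject position can only be estimated. The sketch below assumes linear interpolation between keyframe centres; whether the current command interpolates or holds the last position is not stated here, and the `Keyframe` shape and `centreAt` name are hypothetical.

```typescript
// Hypothetical keyframe record: time in seconds, subject centre in pixels.
interface Keyframe { t: number; cx: number; cy: number }

/**
 * Estimate the subject centre at time t from keyframes sorted by time.
 * Positions are held before the first and after the last keyframe and
 * linearly interpolated in between.
 */
function centreAt(keyframes: Keyframe[], t: number): { cx: number; cy: number } {
  if (keyframes.length === 0) throw new Error("at least one keyframe is required");
  const first = keyframes[0];
  const last = keyframes[keyframes.length - 1];
  if (t <= first.t) return { cx: first.cx, cy: first.cy };
  if (t >= last.t) return { cx: last.cx, cy: last.cy };
  for (let i = 1; i < keyframes.length; i++) {
    const a = keyframes[i - 1];
    const b = keyframes[i];
    if (t <= b.t) {
      const u = (t - a.t) / (b.t - a.t); // 0 at keyframe a, 1 at keyframe b
      return { cx: a.cx + u * (b.cx - a.cx), cy: a.cy + u * (b.cy - a.cy) };
    }
  }
  return { cx: last.cx, cy: last.cy };
}
```

With keyframes two seconds apart, any change of direction inside that window is invisible to the estimate, which is where per-frame local tracking would help.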
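Scope (sketch; design doc needed)
- [...] --tracker <mediapipe|yolo|sam2> flag.
- [...] npm install -g story. Options:
- [...] @vibeframe/tracker-native package, soft-imported
- [...] --tracker claude or default if local model not installed
- [...] (--describe): local trackers should report cost: 0 and a wall-clock estimate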
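A rough sketch of how the --tracker selection, the fallback to the Claude path, and the --describe estimate might fit together. Everything here is an assumption for discussion: the function names, the `Estimate` shape, the injected `isInstalled` check, and the throughput and price parameters are hypothetical, and the fallback rule assumes the Claude path is used whenever the requested local tracker is not installed.

```typescript
type LocalTracker = "mediapipe" | "yolo" | "sam2";
type Tracker = LocalTracker | "claude";

const LOCAL_TRACKERS: readonly string[] = ["mediapipe", "yolo", "sam2"];

/** Validate the raw --tracker value. */
function parseTracker(value: string): Tracker {
  if (value === "claude" || LOCAL_TRACKERS.indexOf(value) !== -1) {
    return value as Tracker;
  }
  throw new Error(`unknown tracker "${value}"; expected mediapipe, yolo, sam2 or claude`);
}

/**
 * Assumed fallback rule: use the Claude path when the requested local
 * tracker is not installed. isInstalled is injected so the rule stays
 * testable without the optional native package present.
 */
function resolveTracker(
  requested: Tracker,
  isInstalled: (tracker: LocalTracker) => boolean,
): Tracker {
  if (requested === "claude") return "claude";
  return isInstalled(requested) ? requested : "claude";
}

// Hypothetical --describe payload.
interface Estimate { tracker: Tracker; cost: number; estimatedSeconds: number }

// All numbers are caller-supplied placeholders, not measured values.
interface EstimateInputs {
  frameCount: number;           // frames a local tracker would process
  localFramesPerSecond: number; // assumed local throughput
  keyframeCount: number;        // keyframes sent on the Claude path
  costPerKeyframe: number;      // assumed per-call price
  secondsPerKeyframe: number;   // assumed per-call latency
}

function describeTracking(tracker: Tracker, input: EstimateInputs): Estimate {
  if (tracker === "claude") {
    // Cost grows linearly with keyframe count on the Vision path.
    return {
      tracker,
      cost: input.keyframeCount * input.costPerKeyframe,
      estimatedSeconds: input.keyframeCount * input.secondsPerKeyframe,
    };
  }
  // Local trackers report cost: 0 plus a wall-clock estimate.
  return {
    tracker,
    cost: 0,
    estimatedSeconds: input.frameCount / input.localFramesPerSecond,
  };
}
```

In a real implementation, `isInstalled` could be backed by a guarded dynamic import of the optional package, so a plain global install keeps working without native dependencies.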
Out of scope
- Tracking subjects across cuts (scene boundary detection is vibe analyze scene's job)
- Multi-subject tracking with assignment / re-id
Recommendation
Big enough to need a docs/design/ doc + plan PR series. Probably the highest-effort item on the Phase 4 list. Marked help wanted — this is a great fit for an OSS contributor with ML/vision background.
Reference
- ROADMAP.md Phase 4 "Open items in Phase 4 (v0.61+ candidates)"
- Current implementation: packages/cli/src/commands/edit-cmd.ts:478-560