Add video analysis mode via Gemini CLI #5

Open
Abeansits wants to merge 2 commits into main from feature/video-analysis

@Abeansits
Owner

Summary

  • Adds video mode to query.sh — analyze local video files or YouTube URLs using Gemini's multimodal capabilities
  • Accepts local files (@path syntax) or URLs (bare in prompt), auto-switches provider to Gemini
  • Default prompt produces a structured Markdown table (Timestamp, Visual Narrative, Text & Graphics, Cinematography, Audio & Soundscape, Emotional Beat) plus a summary covering symbolism and ad evaluation
  • Updates /midflight skill with --video <file-or-url> flag support
  • Sandboxes local files via temp staging dir (copy, not symlink) to avoid exposing parent directories
  • Hard-fails on stat failure instead of silently bypassing the 20MB size guard
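
The staging and size-guard bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in query.sh: the names `file_size_bytes`, `stage_video`, and `MAX_BYTES` are hypothetical, and the 20MB limit is taken from this description. The idea is that a failed or unparseable `stat` aborts the run instead of letting an unchecked file through, and that the file is copied (not symlinked) into a fresh temp directory so only that directory needs to be exposed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; helper names are assumptions, not taken from query.sh.
set -euo pipefail

MAX_BYTES=$((20 * 1024 * 1024))  # 20MB guard mentioned in the PR description

# Print the file size in bytes. Tries the GNU form first, then the BSD/macOS form.
file_size_bytes() {
  stat -c %s -- "$1" 2>/dev/null || stat -f %z -- "$1" 2>/dev/null
}

# Validate a local video and copy it into a fresh staging directory.
# Prints the staging directory path on success; returns non-zero on any failure.
stage_video() {
  local src=$1 size staging
  [[ -f "$src" ]] || { echo "error: file not found: $src" >&2; return 1; }

  # Hard-fail: if stat errors out or prints something that is not a plain
  # number, refuse to continue rather than skip the size check.
  size=$(file_size_bytes "$src") || { echo "error: cannot stat $src" >&2; return 1; }
  [[ "$size" =~ ^[0-9]+$ ]] || { echo "error: unexpected stat output for $src" >&2; return 1; }
  (( size <= MAX_BYTES )) || { echo "error: $src exceeds the 20MB limit" >&2; return 1; }

  # Copy rather than symlink, so the staged file does not point back into
  # the original parent directory.
  staging=$(mktemp -d) || return 1
  cp -- "$src" "$staging/" || { rm -rf -- "$staging"; return 1; }
  printf '%s\n' "$staging"
}
```

A caller would typically capture the printed path and register cleanup, for example `STAGING_DIR=$(stage_video ./ad-v3.mp4)` followed by `trap 'rm -rf -- "$STAGING_DIR"' EXIT`, which matches the "staging dir cleaned up on exit" item in the test plan.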

Use case

Paul generates video ads for the Judah app. The Slack conductor needs to "watch" these videos and provide scene breakdowns, quality feedback, and content verification — without Sebastian being involved.

Usage

# Direct — local file with default scene breakdown
query.sh ./ad-v3.mp4 video

# Direct — YouTube URL
query.sh "https://youtube.com/watch?v=abc" video

# Direct — custom prompt
query.sh ./ad-v3.mp4 video "Does this match the storyboard?"

# Via skill
/midflight --video ./ad-v3.mp4 Does this match our brand guidelines?
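
The invocations above pass either a local path or a URL as the first argument. One plausible way to tell the two apart and to pick the provider is sketched below. The helper names (`classify_video_input`, `video_reference`, `select_provider`) are hypothetical, and the assumption that a staged file is referenced by its base name is mine; the PR only states that local files use the @path syntax, URLs go bare into the prompt, and video mode switches the provider to Gemini.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; helper names and exact prompt layout are assumptions.
set -euo pipefail

# Classify a video argument as "url" or "file" based on its scheme prefix.
classify_video_input() {
  case $1 in
    http://*|https://*) echo url ;;
    *) echo file ;;
  esac
}

# Produce the reference that goes into the prompt: URLs are passed through
# unchanged, local files become @<basename> (assumes the file has already
# been copied into a staging directory used as the working root).
video_reference() {
  if [[ $(classify_video_input "$1") == url ]]; then
    printf '%s\n' "$1"
  else
    printf '@%s\n' "$(basename -- "$1")"
  fi
}

# Video mode always uses Gemini; any other mode keeps the provider it was given.
select_provider() {
  local mode=$1 default_provider=$2
  if [[ $mode == video ]]; then
    echo gemini
  else
    printf '%s\n' "$default_provider"
  fi
}
```

With these helpers, `select_provider consult codex` leaves the provider untouched, which corresponds to the backward-compatibility item in the test plan.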

Test plan

  • Local file: scene breakdown with default prompt (13MB mp4)
  • YouTube URL: default prompt produces full table output
  • Custom prompt: responds to specific question about the video
  • Backward compat: consult mode routes to Codex as before
  • Error paths: missing file, empty args, file-not-found
  • Staging dir cleaned up on exit
  • Codex code review via midflight — addressed security feedback

🤖 Generated with Claude Code

Enable mid-flight to analyze video files and URLs using Gemini's
multimodal capabilities. Supports scene breakdowns, ad review, and
quality checks — designed for the Slack conductor to "watch" video
ads without human involvement.

- Add `video` mode to query.sh alongside consult/implement
- Accept local files (@path syntax) or URLs (bare in prompt)
- Auto-switch provider to Gemini when video mode is selected
- Default prompt produces structured Markdown table with 6 columns
  (Timestamp, Visual Narrative, Text & Graphics, Cinematography,
  Audio & Soundscape, Emotional Beat)
- Sandbox local files via temp staging dir to avoid exposing parent
  directories through --include-directories
- Validate file size (20MB Gemini CLI limit) with hard-fail on
  stat failure
- Update midflight skill with --video flag parsing and URL support

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Abeansits force-pushed the feature/video-analysis branch from 986730c to 2653e0e on March 24, 2026 06:03
Document video analysis mode: usage examples, troubleshooting,
and technical details (staging dir sandboxing, URL vs local file
handling). Bump version for new feature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>