field-recording-video-generator

A shell script that turns a WAV field recording into a video with a scrolling linear spectrogram, metadata overlay, and playback cursor. The visual style is inspired by NASA telemetry displays and industrial camera footage.

Fair warning: This tool was vibe-coded with AI assistance. It works, but it has rough edges. Use at your own risk, expect quirks, and feel free to fix things. No warranty, no guarantees, no refunds.

What it produces

Scrolling linear spectrogram with configurable FFT window and frequency range
Info panel with recording metadata (file, subject, date, location, equipment, sample info, playback speed)
Optional retro monochrome map widget showing recording coordinates
Green glow playback cursor
Optional frequency grid overlay
Optional photo band
60 fps output at 1080x1080 (square) or 1080x1920 (reel)

Install

System dependencies

# macOS
brew install ffmpeg sox python3

# Ubuntu / Debian
sudo apt install ffmpeg sox python3 python3-venv

# Arch
sudo pacman -S ffmpeg sox python

Clone and run

git clone https://github.com/mrkva/field-recording-video-generator.git
cd field-recording-video-generator
./field-recording-video-generator recording.wav

Python packages (numpy, scipy, matplotlib, Pillow) are installed automatically into a local .venv/ on first run. No system-wide pip installs needed.

To add to your PATH:

sudo ln -sf "$(pwd)/field-recording-video-generator" /usr/local/bin/field-recording-video-generator

Usage

Command line

./field-recording-video-generator recording.wav

Web interface

./web.py                   # opens at http://localhost:5000
./web.py --port 8080       # custom port
./web.py --host 0.0.0.0   # listen on all interfaces

Drop a WAV file on the page, fill in the metadata, and hit Generate. Progress updates stream in real time. Works from any browser on any OS.

Interactive prompts (CLI)

The CLI walks you through an interactive dialogue:

Prompt	Default	Description
Date/time	auto-detected from filename or file metadata	ISO format, e.g. `2024-03-15T14:30:22`
Recorded subject	--	What was recorded
Recorded with	--	Equipment used
Location	(optional)	Recording location name
Coordinates	(optional, if location set)	`lat,lon` for the map widget
Freq min (Hz)	`20`	Spectrogram lower bound
Freq max (Hz)	Nyquist	Spectrogram upper bound
Playback speed	1x	`SOURCE:TARGET` (e.g. `192000:44100`) or divisor
Preset	`square`	`square` (1080x1080) or `reel` (1080x1920)
Photo	(optional)	Path to a photo to embed in the video
Frequency grid	`y`	Draw grid lines on the spectrogram
FFT window size	`2048`	STFT window (e.g. `512`, `1024`, `4096`, `8192`)
Normalize audio	`y`	Two-pass loudnorm to -16 LUFS
Output file	`{input}_video.mp4`	Output path

Playback speed

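For ultrasonic recordings (e.g. bat echolocation sampled at 192 kHz), use the playback speed setting to reinterpret the sample rate. Playing the same samples back at a lower rate slows the audio and lowers every frequency by the same factor, which makes ultrasound audible without altering the output sample rate. For example, with 192000:44100 a 40 kHz call is heard at about 9.2 kHz.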

Formats:

192000:44100 -- reinterpret 192 kHz as 44.1 kHz (4.35x slower)
4.35 -- slow by a factor of 4.35
1x -- normal speed (default)
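
The three formats above all reduce to a single slowdown factor. The following is a minimal sketch of that arithmetic, not the script's actual parser; the function names `parse_playback_speed` and `slowed_duration` are made up for illustration.

```python
def parse_playback_speed(spec: str) -> float:
    """Return the slowdown factor for a playback-speed setting.

    Accepts "SOURCE:TARGET" (e.g. "192000:44100"), a plain divisor
    (e.g. "4.35"), or a divisor with a trailing "x" (e.g. "1x").
    Hypothetical helper; the real script may parse this differently.
    """
    spec = spec.strip().lower()
    if ":" in spec:
        source, target = (float(part) for part in spec.split(":", 1))
        if source <= 0 or target <= 0:
            raise ValueError("sample rates must be positive")
        return source / target
    factor = float(spec.rstrip("x"))
    if factor <= 0:
        raise ValueError("divisor must be positive")
    return factor


def slowed_duration(duration_s: float, spec: str) -> float:
    """Length of the audio in seconds after slowing it down."""
    return duration_s * parse_playback_speed(spec)


for spec in ("192000:44100", "4.35", "1x"):
    print(f"{spec:>13}  factor {parse_playback_speed(spec):.2f}")
```

A factor of 4.35 also means the resulting video is about 4.35 times longer than the original recording, which is worth keeping in mind for long files.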

FFT window size

The spectrogram is computed using a Short-Time Fourier Transform with a Hann window and 87.5% overlap. The window size controls the trade-off between time and frequency resolution:

Smaller windows (512, 1024) -- better time resolution, good for transient-heavy recordings (clicks, impacts, birdsong)
Larger windows (4096, 8192) -- better frequency resolution, good for tonal content (drones, engines, sustained notes)
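
To make the trade-off concrete, here is a small sketch that computes the frequency bin width and the time span of one window for each size. It assumes a 48 kHz recording purely as an example, and derives the hop size from the 87.5% overlap stated above (one eighth of the window); `stft_resolution` is a hypothetical helper, not part of the tool.

```python
def stft_resolution(sample_rate: int, n_fft: int, overlap: float = 0.875) -> dict:
    """Time and frequency resolution for a given STFT window size."""
    hop = int(round(n_fft * (1.0 - overlap)))  # samples between successive frames
    return {
        "hop_samples": hop,
        "bin_width_hz": sample_rate / n_fft,        # spacing of frequency bins
        "window_ms": 1000.0 * n_fft / sample_rate,  # audio covered by one frame
        "hop_ms": 1000.0 * hop / sample_rate,       # time step between frames
    }


# 48000 Hz is an assumed example rate; substitute the rate of your recording.
for n_fft in (512, 1024, 2048, 4096, 8192):
    r = stft_resolution(48000, n_fft)
    print(f"{n_fft:>5}: {r['bin_width_hz']:7.2f} Hz/bin, "
          f"{r['window_ms']:7.2f} ms window, {r['hop_ms']:5.2f} ms hop")
```

Doubling the window halves the bin width but doubles the stretch of audio that each column of the spectrogram averages over, which is why short clicks smear with large windows.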

How it works

1. Probe the input WAV for sample rate, bit depth, channels, duration, and codec
2. Prepare audio -- optionally reinterpret sample rate for ultrasonic recordings (asetrate + aresample), then normalize loudness with a two-pass loudnorm filter
3. Generate spectrogram -- compute STFT in 30-second chunks (for memory efficiency), apply colormap, and write a wide PNG strip
4. Render overlays -- info panel, frequency scale, cursor image, and optional map widget (fetched from OpenStreetMap tiles with a retro dark/inverted filter)
5. Compose video -- ffmpeg scrolls (crops) across the spectrogram strip, overlays the cursor and frequency scale, and stacks the info panel on top
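
As an illustration of the probing step, the sketch below reads the same basic properties with Python's standard `wave` module. This is an assumption for demonstration only: the README does not say how the script probes files (it may well use ffprobe), and `wave` handles plain PCM only, so it reports no codec and may reject other encodings such as 32-bit float. `probe_wav` is a hypothetical name.

```python
import wave


def probe_wav(path: str) -> dict:
    """Read sample rate, bit depth, channel count, and duration of a PCM WAV."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        frames = wav.getnframes()  # frames per channel
        return {
            "sample_rate": sample_rate,
            "bit_depth": wav.getsampwidth() * 8,  # bytes per sample -> bits
            "channels": wav.getnchannels(),
            "duration_s": frames / sample_rate,
        }
```

Calling `probe_wav("recording.wav")` returns a dictionary whose `sample_rate` also gives the Nyquist limit (half the sample rate) used as the default upper frequency bound.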

Presets

Presets live in presets/ and set video dimensions, font size, and frame rate:

Preset	Resolution	Use case
`square`	1080x1080	Instagram post, general use
`reel`	1080x1920	Instagram/TikTok reel, vertical stories

Output

Video: H.264, CRF 18, slow preset, 60 fps
Audio: AAC 256 kbps
Container: MP4 with faststart flag
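
For reference, these settings map onto standard ffmpeg output options roughly as sketched below. The exact command line the script builds is not shown in this README, so treat the list as an approximation; in particular, `libx264` is assumed as the H.264 encoder, and `encode_args` is a hypothetical helper.

```python
import shlex


def encode_args(output_path: str, fps: int = 60) -> list[str]:
    """ffmpeg output options matching the settings listed above."""
    return [
        "-c:v", "libx264",          # H.264 video (encoder choice is an assumption)
        "-crf", "18",               # constant rate factor
        "-preset", "slow",          # slower encode, better compression
        "-r", str(fps),             # output frame rate
        "-c:a", "aac",              # AAC audio
        "-b:a", "256k",             # audio bitrate
        "-movflags", "+faststart",  # move the index to the front for streaming
        output_path,
    ]


# Print the options as they would appear on a shell command line.
print(shlex.join(encode_args("recording_video.mp4")))
```

These options would follow the input and filter arguments of a full ffmpeg invocation; they are not a complete command on their own.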

Project structure

field-recording-video-generator   # main shell script (CLI)
web.py                            # web interface (Flask)
templates/index.html              # web UI
lib/
  generate_spectrogram.py         # STFT + spectrogram image
  render_info_panel.py            # metadata overlay panel
  render_freq_scale.py            # frequency scale on left edge
  render_map_widget.py            # OSM-based retro map widget
presets/
  square.conf
  reel.conf
requirements.txt

License

Do whatever you want with it.
