spectrogram.sh writes temp WAV into watched StreamData directory, causing analyzer crashes and a runaway-growth race
Summary
scripts/spectrogram.sh creates its working file spectrogram_window.tmp.wav inside $STREAM_DIR ($RECS_DIR/StreamData). That directory is watched by birdnet_analysis.py via inotify for IN_CLOSE_WRITE events, so every time sox finishes writing the spectrogram script's working file, the analyzer picks it up and tries to process it as a recording. This causes a continuous stream of analyzer errors and wasted CPU. It also creates a race condition if two spectrogram.sh instances ever run concurrently — both write to the same fixed filename and sox will concatenate the file with itself, producing unbounded growth.
Environment
- Fork: Nachtzuster/BirdNET-Pi, branch main, commit 88985a3 ("fix: make bottom visible on smaller screens")
- Hardware: Raspberry Pi 5
- OS: Debian Trixie (the release BirdNET-Pi recommends for the Pi 5)
- Audio: USB mic at plughw:2,0, 48 kHz mono
Reproduction
- Run a normal install with birdnet_analysis.service and spectrogram_viewer.service both active.
- Tail the analyzer journal:
journalctl -u birdnet_analysis.service -f
- Within 3–6 seconds, observe repeated entries like:
[birdnet_analysis][INFO] Analyzing /home/birder/BirdSongs/StreamData/spectrogram_window.tmp.wav
[birdnet_analysis][ERROR] Unexpected error:
Traceback (most recent call last):
  File "/usr/local/bin/birdnet_analysis.py", line 91, in process_file
    file = ParseFileName(file_name)
  File "/home/birder/BirdNET-Pi/scripts/utils/classes.py", line 33, in __init__
    date_created = re.search('^[0-9]+-[0-9]+-[0-9]+', name).group()
AttributeError: 'NoneType' object has no attribute 'group'
(The crash itself is bug #2, filed separately. This issue is about why the analyzer is being asked to process that file.)
Root cause
In scripts/spectrogram.sh:
STREAM_DIR="$HOME/BirdSongs/StreamData"
...
TMP_WAV="${STREAM_DIR}/spectrogram_window.tmp.wav"
In birdnet_analysis.py (line 35):
i.add_watch(os.path.join(conf['RECS_DIR'], 'StreamData'), mask=IN_CLOSE_WRITE)
spectrogram.sh calls sox on TMP_WAV every 3 seconds (INTERVAL_SECONDS=3). Each sox write triggers IN_CLOSE_WRITE in the watched directory, the analyzer attempts to process the file, and the filename spectrogram_window.tmp.wav doesn't conform to the YYYY-MM-DD...HH:MM:SS pattern that ParseFileName expects.
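The mismatch can be reproduced in isolation. The following is a minimal sketch that assumes only the regular expression shown in the traceback above. `leading_date` is a hypothetical helper, not a function in BirdNET-Pi, and the recording-style filename is an illustrative guess at the expected shape rather than a name taken from a real install.

```python
import re

# Date pattern copied from the traceback (scripts/utils/classes.py, line 33).
DATE_PATTERN = r'^[0-9]+-[0-9]+-[0-9]+'


def leading_date(name):
    """Hypothetical helper: return the leading date string, or None if absent."""
    match = re.search(DATE_PATTERN, name)
    return match.group() if match else None


# The spectrogram working file has no leading date, so re.search returns None.
# Calling .group() directly on that result is what raises the AttributeError.
print(leading_date('spectrogram_window.tmp.wav'))

# Illustrative recording-style name (assumed shape, not from a real install).
print(leading_date('2024-05-01-birdnet-12:34:56.wav'))
```

The first call returns None, which is the value the analyzer ends up calling .group() on; the second returns the leading date portion of the name.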
Secondary problem: instance-collision race
Because TMP_WAV is a fixed filename (no PID or timestamp namespacing), two co-running instances of spectrogram.sh will both target the same file. The sox invocation in the script is:
sox -V1 "$previous" "$current" "$TMP_WAV" 2>/dev/null
If a second instance manages to use $TMP_WAV (its own previous output) as one of its inputs while still writing to $TMP_WAV, sox concatenates the file with itself. I observed this in practice after a manual restart left two instances running briefly: the temp file grew to 499 MB before I caught it. Single-instance operation is the norm under systemd, but anything that kills and respawns the script (a manual pkill followed by a nohup relaunch, for example) can produce this state, and the cleanup() EXIT trap doesn't help because both instances reference the same path.
Impact
- Analyzer CPU waste: process_file is invoked, opens the file, calls ParseFileName, throws, logs a traceback. Repeats every 3–6 seconds indefinitely.
- Log noise: thousands of identical tracebacks per hour drown out genuine analyzer errors.
- Indirect: under sustained CPU pressure (especially combined with the secondary race scenario), I observed cascading ALSA overrun errors in birdnet_recording.sh that eventually wedged the arecord process in a state where it held the audio device but no longer drained the kernel buffer. Recording silently stopped for ~6.5 hours before manual intervention. I can't prove the analyzer churn was the trigger, but it's a plausible contributor and at minimum it's not helping.
Proposed fix
Move TMP_WAV out of the watched directory and PID-namespace the filename. In scripts/spectrogram.sh:
 STREAM_DIR="$HOME/BirdSongs/StreamData"
 OUT_PNG="${EXTRACTED}/spectrogram.png"
 TMP_PNG="${OUT_PNG}.tmp"
-TMP_WAV="${STREAM_DIR}/spectrogram_window.tmp.wav"
+TMP_WAV="/tmp/spectrogram_window.$$.wav"
 INTERVAL_SECONDS=3
 WINDOW_SECONDS=15
Optionally, expand cleanup() to remove stragglers from prior crashed instances:
 cleanup() {
   rm -f "$TMP_WAV" "$TMP_PNG" 2>/dev/null
+  # Remove stragglers from prior crashed instances (older than 5 minutes
+  # so we don't yank a co-running sibling's working file).
+  find /tmp -maxdepth 1 -name 'spectrogram_window.*.wav' -mmin +5 -delete 2>/dev/null
 }
/tmp is appropriate here: the file is genuinely transient (rewritten every 3 seconds), large (a few MB), and has no value beyond the next iteration. $$ resolves to the script's PID, so concurrent instances cannot collide.
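The straggler rule in cleanup() (name matches spectrogram_window.*.wav, last modified more than 5 minutes ago) can be sanity-checked without touching the real /tmp. This is a rough Python sketch of the same selection logic, written for illustration only. `find_stragglers` and its parameters are hypothetical and not part of spectrogram.sh, the PIDs in the filenames are made up, and the sketch approximates find's -mmin +5 with a plain 300-second age threshold.

```python
import fnmatch
import os
import tempfile
import time


def find_stragglers(directory, pattern='spectrogram_window.*.wav',
                    min_age_seconds=300, now=None):
    """Hypothetical helper: list files directly in `directory` whose name
    matches `pattern` and whose mtime is older than `min_age_seconds`."""
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        if not fnmatch.fnmatch(entry.name, pattern):
            continue
        if now - entry.stat().st_mtime > min_age_seconds:
            stale.append(entry.name)
    return sorted(stale)


# Demo in a scratch directory: (name, age in seconds) with made-up PIDs.
with tempfile.TemporaryDirectory() as scratch:
    now = time.time()
    for name, age in [('spectrogram_window.1111.wav', 600),  # crashed instance
                      ('spectrogram_window.2222.wav', 10),   # live sibling
                      ('unrelated.wav', 600)]:               # other file
        path = os.path.join(scratch, name)
        open(path, 'wb').close()
        os.utime(path, (now - age, now - age))
    stragglers = find_stragglers(scratch, now=now)

print(stragglers)
```

Only the ten-minute-old file from the "crashed" instance is selected. The recently written sibling and the unrelated file are left alone, which is the behaviour the age threshold is meant to guarantee.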
This change is independent of bug #2 — fixing it removes the ongoing analyzer churn, but the analyzer should still be hardened against unexpected filenames in case any other source ever drops a file into StreamData.
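One possible shape for that hardening, as a sketch rather than a patch: check the filename before handing it to the parser. `looks_like_recording` and `filter_events` are hypothetical names, the .wav suffix check is an assumption about what recordings look like, and the dated filename is an illustrative guess. The date prefix pattern is the one from the traceback in this report.

```python
import os
import re

# Leading-date pattern taken from the traceback in this report.
_DATE_PREFIX = re.compile(r'^[0-9]+-[0-9]+-[0-9]+')


def looks_like_recording(path):
    """Hypothetical guard: True only if the basename starts with a date
    and ends in .wav (the suffix check is an assumption)."""
    name = os.path.basename(path)
    return name.endswith('.wav') and _DATE_PREFIX.match(name) is not None


def filter_events(paths):
    """Keep only the paths that are worth handing to the analyzer."""
    return [p for p in paths if looks_like_recording(p)]


events = [
    '/home/birder/BirdSongs/StreamData/spectrogram_window.tmp.wav',
    # Assumed recording-style name, not taken from a real install.
    '/home/birder/BirdSongs/StreamData/2024-05-01-birdnet-12:34:56.wav',
]
print(filter_events(events))
```

With a guard like this in front of the parser, a stray file in StreamData would be skipped quietly instead of producing a traceback on every write.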
I'd be happy to open a PR with this fix if useful.