[bugfix] guard load_audio against bytes input#9606
Conversation
There was a problem hiding this comment.
Code Review
This pull request aims to prevent a TypeError in load_audio when fallback decoding is triggered with bytes input, by ensuring audio is a string before calling startswith. It also adds a unit test to verify this behavior. However, the review feedback correctly points out that a similar TypeError will still occur in the subsequent else block when _check_path(audio) is called with bytes input, and that mocking _check_path in the test masks this underlying issue.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| res = librosa.load(audio_io, sr=sampling_rate) | ||
| except Exception: | ||
| if audio.startswith(('http://', 'https://')): | ||
| if isinstance(audio, str) and audio.startswith(('http://', 'https://')): |
There was a problem hiding this comment.
While guarding the startswith check with isinstance(audio, str) prevents a TypeError on this line, a similar TypeError will still occur on line 309 in the else block:
audio_io = _check_path(audio) or audioIf audio is of type bytes, _check_path(audio) is called. _check_path expects a str and performs string-specific operations (like data.startswith('data:') on line 118), which will raise TypeError: startswith first arg must be bytes or a tuple of bytes, not str.
To fully support bytes input without crashing, we should also avoid calling _check_path when audio is bytes.
|
|
||
| fake_librosa.load = fake_load | ||
| monkeypatch.setitem(sys.modules, 'librosa', fake_librosa) | ||
| monkeypatch.setattr(vision_utils, '_check_path', lambda p: None) |
There was a problem hiding this comment.
Mocking _check_path here masks a secondary TypeError bug in vision_utils.py line 309. When audio is bytes, _check_path(audio) is called in the else block, which crashes because _check_path performs string operations (e.g., data.startswith('data:')).
If we don't mock _check_path here, the test will correctly fail and expose this issue. Consider removing this mock and fixing the underlying bug in vision_utils.py by ensuring _check_path is only called for str inputs.
…back
When load_audio's first decode fails for bytes input, the except branch
calls _check_path(audio) with bytes, which then crashed on the str-only
checks (e.g. data.startswith('data:')). Return None for non-str input so the
caller falls back to the raw bytes, completing the bytes guard started in the
previous commit. The existing fallback test now exercises the real
_check_path (the patch is removed) and a focused _check_path bytes test is
added.
|
Extended this to fully guard |
PR type
PR information
load_audio()acceptsUnion[str, bytes]. When the firstlibrosa.load()fails (e.g. an audio format soundfile can't read on its own), it falls into theexceptblock and checksaudio.startswith(('http://', 'https://')). Forbytesinput this raisesTypeError: a bytes-like object is required, not 'str', masking the real decode error — andbytesis a documented input type (the dataset can hand the template raw audio bytes, andload_filealready handlesbytes).This guards the scheme check with
isinstance(audio, str), so byte audio takes the non-URL fallback path instead of crashing on the type mismatch. String/URL behavior is unchanged.Experiment results
Added
test_load_audio_bytes_input_does_not_crash_on_fallbackintests/general/test_template.py: it injects a stublibrosawhose firstloadraises, passesbytesaudio, and assertsload_audioreaches the fallback without aTypeError. The test fails on the previous code (TypeErrorat thestartswithline) and passes with this change. flake8 / yapf / isort are clean on both files.