A high-speed serial utility for building machine learning audio datasets. This tool lets you stream raw 16-bit PCM data from an ESP32, visualize signal levels in real time, and save recordings as labeled .wav files.
- Live Debug Mode: A real-time ASCII volume meter to verify microphone gain and wiring without saving files.
- Bulk Recording: Optimized workflow for capturing dozens of samples quickly, which suits "Wake Word" training.
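- High-Speed Transfer: Runs the serial link at 921,600 baud, well above the roughly 320,000 bps that 16 kHz, 16-bit mono audio needs (assuming standard 8N1 framing), so the stream can keep up without dropped samples.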
- Automatic Organization: Files are timestamped and sorted by label (e.g., command_170921400_1.wav).
- ESP32 (S3, C3, or Standard DevKit).
- I2S Microphone (e.g., INMP441, ICS-43434) or an Analog Mic using the ESP32 ADC.
- USB Data Cable (Ensure it is a data cable, not just a charging cable).
- Python 3.7+
- PySerial library:
pip install pyserial
- Clone the Repository:
git clone https://github.com/jhoward-embedded/EdgeImpulseDataRecorder-ESP32.git
cd EdgeImpulseDataRecorder-ESP32
- Configure the Script:
Open audio_collector.py and update the PORT variable to match your ESP32:
PORT = 'COM5' # Windows
# PORT = '/dev/ttyUSB0' # Linux/Mac
- ESP32 Firmware: Ensure your ESP32 is programmed to:
  - Send raw binary (2 bytes per sample) when it receives the character 's'.
  - Send ASCII integers (text lines) when it receives the character 'd'.
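The host side of the 's' command can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name capture_raw is hypothetical, and it assumes the firmware starts streaming little-endian 16-bit samples right after a single 's' byte. The port argument is any object with write() and read(n), such as an open serial.Serial instance from PySerial.

```python
SAMPLE_RATE = 16000       # Hz, matches the specification table
BYTES_PER_SAMPLE = 2      # 16-bit PCM


def capture_raw(port, seconds=3.0):
    """Send 's' and read `seconds` worth of raw PCM bytes from `port`.

    Assumption: one 's' byte is enough to start the binary stream.
    """
    expected = int(SAMPLE_RATE * seconds) * BYTES_PER_SAMPLE
    port.write(b"s")
    data = bytearray()
    while len(data) < expected:
        chunk = port.read(expected - len(data))
        if not chunk:
            # An empty read means the port timed out; return what arrived.
            break
        data.extend(chunk)
    return bytes(data)


# Typical use with PySerial (requires `pip install pyserial`):
#   import serial
#   with serial.Serial("COM5", 921600, timeout=1) as port:
#       pcm = capture_raw(port, seconds=3.0)
```

Opening the port with a timeout matters here: without one, read() blocks indefinitely if the ESP32 stops sending.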
Run the script:
python audio_recorder.py
Choose Option 1 to test your microphone. You will see a level meter like this:
[########## ] 4200
If the bar doesn't move when you speak, check your I2S clock (BCLK/WS) and Data (SD) wiring.
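For reference, a meter of this kind can be drawn in a few lines. The sketch below is an assumption about how such a display might work, not the script's own implementation: parse_level and render_meter are hypothetical names, and the bar is scaled against the 16-bit maximum of 32767.

```python
def parse_level(line):
    """Turn one ASCII line from the ESP32 into an int, or None if malformed."""
    text = line.decode("ascii", errors="ignore").strip()
    try:
        return int(text)
    except ValueError:
        return None


def render_meter(level, max_level=32767, width=40):
    """Return a bar such as '[#####     ] 4200' for a sample magnitude."""
    level = max(0, min(abs(level), max_level))   # clamp to the 16-bit range
    filled = level * width // max_level          # number of '#' characters
    return "[" + "#" * filled + " " * (width - filled) + "] " + str(level)


# Example: print(render_meter(parse_level(b"4200\r\n")))
```

Returning None for unparseable lines lets the caller skip partial lines that arrive right after the port is opened.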
Choose Option 2 to start building your dataset.
- Enter a label when prompted (e.g., alexa, background, noise).
- Press Enter to start a 3-second capture.
- The script automatically increments the file count and saves the recording to the dataset_recordings/ folder.
- Type q to stop the batch and return to the main menu.
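The save step above can be approximated with Python's standard wave module. This is a sketch under stated assumptions: save_wav is a hypothetical helper, and the label_timestamp_index naming is inferred from the example filename rather than taken from the script.

```python
import os
import time
import wave


def save_wav(pcm_bytes, label, index, out_dir="dataset_recordings"):
    """Write raw 16-bit mono PCM to <out_dir>/<label>_<timestamp>_<index>.wav."""
    os.makedirs(out_dir, exist_ok=True)
    # Drop a trailing odd byte so the file holds only whole samples.
    pcm_bytes = pcm_bytes[: len(pcm_bytes) - len(pcm_bytes) % 2]
    name = "{}_{}_{}.wav".format(label, int(time.time()), index)
    path = os.path.join(out_dir, name)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes = 16-bit samples
        wav.setframerate(16000)    # 16 kHz
        wav.writeframes(pcm_bytes)
    return path
```

WAV stores 16-bit samples little-endian, so bytes sent in that order by the ESP32 can be written without conversion.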
| Parameter | Value |
|---|---|
| Sample Rate | 16,000 Hz |
| Bit Depth | 16-bit Signed Integer (PCM) |
| Baud Rate | 921,600 bps |
| Format | Mono .WAV (Little Endian) |
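A quick check shows how these values relate. The calculation assumes 8N1 serial framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per byte), which is a common default but is not stated in this README.

```python
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2          # 16-bit PCM
BAUD_RATE = 921_600
BITS_PER_BYTE_ON_WIRE = 10    # assumption: 8N1 framing
CLIP_SECONDS = 3

payload_bytes_per_s = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE        # 32,000 B/s
required_bps = payload_bytes_per_s * BITS_PER_BYTE_ON_WIRE     # 320,000 bps
headroom = BAUD_RATE / required_bps                            # 2.88x
clip_bytes = payload_bytes_per_s * CLIP_SECONDS                # 96,000 bytes

print("required:", required_bps, "bps")
print("headroom: {:.2f}x".format(headroom))
print("bytes per 3-second clip:", clip_bytes)
```

Under that assumption the link has almost three times the bandwidth the raw stream needs, and each 3-second capture carries 96,000 bytes of audio.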
Contributions are welcome! If you'd like to add features like a Spectrogram preview or automatic silence trimming, please open an issue or a Pull Request.