Hello from the Whale Coda Explorer project 🐋
We're Kevin and Claude, and we've built an open-source interactive web application on top of WhAM and the DSWP dataset. We wanted to share it with you in case it's useful to the community, and to express our gratitude for making your research openly available.
Repository: https://github.com/CivicDash/whale-coda-explorer
Built with: WhAM embeddings, DSWP dataset, Python port of the Coda-detector, Gero et al. (2015) dataset
What it does
1. Interactive Cluster Explorer
- UMAP projection of WhAM embeddings for the 620 DSWP codas
- HDBSCAN clustering (15 clusters discovered)
- Click any point to listen to the coda and see its spectrogram
2. Individual Whale Identity
- Integration of the Gero, Whitehead & Rendell (2015) coda dataset (3,876 labeled codas)
- UMAP projection of ICI features colored by coda type, social unit, or individual
- Named whale profiles (17 identified individuals) with vocal repertoire statistics
- Search and highlight specific individuals
3. Acoustic Individual Identification (k-NN)
- 7-NN classifier trained on 1,602 labeled codas from the Gero dataset
- Upload any audio/video file (WAV, MP3, MP4, OGG, FLAC...)
- Automatic coda detection → feature extraction → individual matching
- Multi-coda synthesis for higher confidence across a session
- Tested successfully on YouTube and TikTok recordings
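As a rough illustration of the k-NN matching and multi-coda synthesis described above, here is a minimal standard-library sketch. It is not the repository's code (the project lists scikit-learn for k-NN). The fixed-length ICI vectors, the Euclidean distance, the vote pooling across codas, and the names `knn_votes` and `identify_session` are all assumptions made for this example, and the toy whales and values are invented.

```python
import math
from collections import Counter

def knn_votes(train, query, k=7):
    """Count the labels of the k training codas closest to the query.

    train: list of (ici_vector, whale_label) pairs.
    query: ici_vector of the same length as the training vectors (assumed).
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest)

def identify_session(train, queries, k=7):
    """Pool k-NN votes over several codas detected in one recording."""
    pooled = Counter()
    for q in queries:
        pooled.update(knn_votes(train, q, k))
    label, count = pooled.most_common(1)[0]
    confidence = count / sum(pooled.values())  # share of pooled votes
    return label, confidence

# Toy data: hypothetical ICI vectors (seconds) for two made-up whales.
train = [
    ([0.20, 0.20, 0.20, 0.20], "whale_A"),
    ([0.21, 0.19, 0.20, 0.21], "whale_A"),
    ([0.19, 0.21, 0.20, 0.19], "whale_A"),
    ([0.30, 0.30, 0.10, 0.10], "whale_B"),
    ([0.31, 0.29, 0.11, 0.10], "whale_B"),
    ([0.29, 0.31, 0.10, 0.11], "whale_B"),
]
session = [[0.30, 0.30, 0.11, 0.10], [0.31, 0.30, 0.10, 0.10]]
label, confidence = identify_session(train, session, k=3)
```

Pooling votes rather than taking the best single coda is one plausible way to raise confidence over a session; a distance-weighted vote would be another.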
4. Vocal Activity Classification
- Automatic classification of sperm whale vocal activity:
  - Echolocation (regular clicks, ICI ~0.5-2 s)
  - Codas (social communication, patterned clicks)
  - Creaks/Buzzes (prey capture attempts, ICI < 30 ms)
  - Silence
- Visual timeline with waveform overlay
- Depth estimation from echolocation ICI
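The ICI ranges listed above suggest a simple rule-based classifier. The sketch below is only an illustration under stated assumptions, not the project's implementation: it takes sorted click times (in seconds) for one analysis window, uses the median ICI as the decision statistic, and labels anything outside the creak and echolocation ranges as a coda candidate. The function name `classify_window` and the fallback rule are hypothetical.

```python
from statistics import median

def classify_window(click_times):
    """Label one analysis window from its click times (seconds, sorted)."""
    if len(click_times) < 2:
        return "silence"  # assumption: fewer than two clicks means no ICI
    icis = [b - a for a, b in zip(click_times, click_times[1:])]
    m = median(icis)
    if m < 0.030:          # ICI < 30 ms: creak/buzz
        return "creak"
    if 0.5 <= m <= 2.0:    # ICI ~0.5-2 s: regular echolocation clicks
        return "echolocation"
    return "coda_candidate"  # assumption: everything else is checked as a coda

labels = [
    classify_window([]),
    classify_window([0.0, 0.01, 0.02, 0.03]),
    classify_window([0.0, 1.0, 2.0, 3.0]),
    classify_window([0.0, 0.2, 0.4, 0.6, 0.8]),
]
```

A real classifier would also need to look at the click pattern itself, since codas are defined by their rhythm rather than by a single ICI range.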
5. Python port of the Coda-detector
- Full port of your MATLAB Coda-detector to Python
- TKEO-based click detection + graph clustering
- Integrated directly into the web UI
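For readers unfamiliar with TKEO, the Teager-Kaiser energy operator is defined for a discrete signal as psi[n] = x[n]^2 - x[n-1] * x[n+1], which emphasizes short, sharp transients such as clicks. The sketch below shows the operator with a naive threshold detector. It is not the Coda-detector's implementation: the threshold, the `min_gap` refractory rule, and the function names are assumptions for illustration, and the graph clustering step is not shown.

```python
def tkeo(x):
    """Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The output is two samples shorter than the input (no edge values).
    """
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def detect_clicks(x, threshold, min_gap):
    """Return sample indices whose TKEO energy exceeds the threshold.

    A crossing is kept only if it lies at least min_gap samples after
    the previously kept click, so each click is reported once.
    """
    clicks = []
    for i, energy in enumerate(tkeo(x)):
        n = i + 1  # tkeo output index i corresponds to input sample i + 1
        if energy > threshold and (not clicks or n - clicks[-1] >= min_gap):
            clicks.append(n)
    return clicks

# Toy signal: silence with two impulses of opposite polarity.
signal = [0.0] * 50
signal[10] = 1.0
signal[30] = -1.0
clicks = detect_clicks(signal, threshold=0.5, min_gap=5)
```

Because the operator squares the center sample, both positive and negative impulses produce positive energy, so polarity does not affect detection in this sketch.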
6. Educational Guide
- Interactive glossary explaining UMAP, clusters, ICI, codas, spectrograms, etc.
- Study area map (Dominica, Eastern Caribbean)
- Designed to make the science accessible to non-specialists
Technical stack
- Frontend: Gradio 6 with Plotly interactive plots
- ML: scikit-learn (k-NN), UMAP, HDBSCAN
- Audio: pydub + ffmpeg (supports any audio/video format)
- Visualization: Plotly, Matplotlib
Why we're sharing this
This project started as a personal exploration of animal communication and grew into something we believe could be useful:
- For researchers: quick visual exploration of coda datasets, automated coda detection and individual matching
- For citizen science: anyone can upload a whale recording from YouTube/TikTok and get immediate analysis
- For education: accessible explanations of the science behind whale communication
We'd love your feedback on:
- The accuracy of our k-NN identification approach (we know that matching on ICI features alone has limitations compared with full acoustic analysis)
- Whether this tool could be useful to your team or collaborators
- Any suggestions for improvement
Everything is MIT-licensed and open source. We're happy to adapt the tool to better serve the research community.
Thank you for your incredible work on WhAM and Project CETI. The idea that we might one day understand what sperm whales are saying to each other is deeply inspiring.
— Kevin (CivicDash / Civis-Consilium) & Claude