DCUSV — Deep Clustering of Ultrasonic Vocalizations in Rodents

DCUSV is a Python pipeline for the unsupervised discovery of vocalization types in rodent ultrasonic vocalizations (USVs). It segments raw audio into spectrogram images using methods from ContourUSV, compresses them with a dense autoencoder, and clusters the latent representations with UMAP + HDBSCAN + agglomerative meta-clustering.

The pipeline was developed for 22 kHz USVs recorded in a PTSD rat model, but the approach generalizes to any USV frequency band.


Pipeline overview

Raw .wav files  (USCMed dataset)
      │
      ▼
 data_prep.py        — segment audio, extract USV spectrogram patches → 512×512 PNG images
      │
      ▼
 autoencoder.py      — train a dense autoencoder; save 10-D latent embeddings (.npy)
      │
      ▼
 dcusv.py            — UMAP → HDBSCAN → agglomerative meta-clustering; UMAP/t-SNE/PCA plots
      │
      ▼
 cluster_dist.py     — per-animal cluster-proportion heatmaps; symmetric KL divergence

param_optuna.py is an optional Optuna hyperparameter search over the UMAP + HDBSCAN + meta-clustering space.


Directory structure

dcusv/
├── data_prep.py          # Step 1 – spectrogram generation & USV extraction
├── autoencoder.py        # Step 2 – dense autoencoder training & embedding
├── dcusv.py              # Step 3 – UMAP / HDBSCAN / meta-clustering & visualization
├── cluster_dist.py       # Step 4 – cluster-distribution analysis across animals/conditions
├── param_optuna.py       # (Optional) Optuna hyperparameter search
├── requirements.txt
└── README.md

USCMed/                    # Downloaded dataset (see Step 0 below)
└── PTSD16/
    └── Context/
        └── *.wav

clustering_data/           # Created by data_prep.py — 512×512 spectrogram patches
clustering_results/
├── models/                # Autoencoder weights, .npy embeddings, file-path list
├── cluster_dcusv_*/       # Per-cluster image grids (dcusv.py)
├── cluster_dcusv_vis_*/   # UMAP / t-SNE / PCA scatter plots (dcusv.py)
└── cluster_map/           # Heatmaps and KL-divergence CSV/PNG (cluster_dist.py)

Installation

Python 3.9–3.11 is recommended (TensorFlow 2.x compatibility).

  1. Clone the repository:

    git clone https://github.com/lina-usc/dcusv.git
    cd dcusv

  2. Create and activate a virtual environment:

    python -m venv venv
    source venv/bin/activate  # Linux/macOS
    venv\Scripts\activate     # Windows

  3. Install dependencies:

    pip install -r requirements.txt

Usage

Step 0 — Download the data

Download the USCMed dataset from Zenodo:

https://zenodo.org/records/15029872

USCMed contains audio recordings and hand-scored annotations for 27 male rats across a Context trial, collected at the University of South Carolina School of Medicine.

Extract the archive into the dcusv/ directory so that the layout matches:

dcusv/
└── USCMed/
    └── PTSD16/
        └── Context/
            └── *.wav

If you extract to a different location, update root_path in data_prep.py accordingly.
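
Before running Step 1, it can help to confirm that the recordings are where the scripts expect them. The following is a minimal sketch, not part of the repository: `find_wavs` is a hypothetical helper, and it assumes the `root_path/<experiment>/<condition>/*.wav` layout shown above together with the `root_path` and `experiment_tests_mapping` defaults listed under Configuration.

```python
from pathlib import Path


def find_wavs(root_path, experiment_tests_mapping):
    """Return sorted .wav paths under root_path/<experiment>/<condition>/.

    Hypothetical helper: it only mirrors the directory layout described
    in this README and does not reproduce any logic from data_prep.py.
    """
    wavs = []
    for experiment, conditions in experiment_tests_mapping.items():
        for condition in conditions:
            folder = Path(root_path) / experiment / condition
            # glob() yields nothing if the folder is missing
            wavs.extend(sorted(folder.glob("*.wav")))
    return wavs
```

Calling `find_wavs(Path("USCMed"), {"PTSD16": ["Context"]})` from the dcusv/ directory should return a non-empty list once the archive has been extracted correctly. An empty list suggests that `root_path` needs to be updated.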


Step 1 — Generate spectrogram images

python data_prep.py

This reads the raw .wav files from USCMed/PTSD16/Context/, extracts 22 kHz USV regions, and saves 512×512 grayscale PNG spectrogram patches under:

clustering_data/all_data_512x512//<recording_stem>/

Step 2 — Train the autoencoder & save embeddings

python autoencoder.py

Outputs written to clustering_results/models/:

  • dense_encoder_all.h5 — saved Keras encoder
  • dense_encoded_images_all.npy — (N, 10) latent embedding matrix
  • file_paths_all.npy — ordered list of source image paths
  • dense_autoencoder_loss_all.png — train/val loss curve
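
To get a feel for the size of the encoder described by `dims` (see Configuration), the sketch below counts the trainable parameters of a stack of fully connected layers. It rests on two assumptions that this README does not confirm: that `N` is the length of a flattened 512×512 grayscale image, and that each layer is a plain dense layer with a bias term. `dense_param_counts` is a hypothetical helper, not a function from autoencoder.py.

```python
def dense_param_counts(dims):
    """Weights plus biases for each consecutive dense layer in dims."""
    return [n_in * n_out + n_out for n_in, n_out in zip(dims, dims[1:])]


# Assumption: N is a flattened 512x512 image; the script may resize first.
N = 512 * 512
encoder_dims = [N, 2048, 512, 128, 10]

per_layer = dense_param_counts(encoder_dims)
encoder_params = sum(per_layer)

for (n_in, n_out), count in zip(zip(encoder_dims, encoder_dims[1:]), per_layer):
    print(f"{n_in:>7} -> {n_out:<5} {count:>12,} parameters")
print(f"encoder total: {encoder_params:,}")
```

Under these assumptions the first layer holds nearly all of the parameters, so the input resolution, rather than the 10-D bottleneck, drives memory use during training.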

Step 3 — Cluster and visualize

python dcusv.py

Outputs:

  • Per-cluster image grids in clustering_results/cluster_dcusv_<embedding>/
  • UMAP, t-SNE, and PCA scatter plots in clustering_results/cluster_dcusv_vis_<embedding>_<silhouette>/

Step 4 — Cluster-distribution analysis

python cluster_dist.py

Outputs written to clustering_results/cluster_map/:

  • cluster_counts_<cond>.csv / cluster_props_<cond>_colnorm.csv
  • cluster_clustermap_<cond>.png — per-condition cluster heatmap
  • kl_divergence_per_animal.csv / kl_divergence_per_animal.png
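
As a reference for interpreting the KL-divergence output, here is a minimal sketch of a symmetric KL divergence between two cluster-proportion vectors. The exact formula in cluster_dist.py is not documented here. This version assumes the sum KL(p‖q) + KL(q‖p) and a small additive constant `eps` to avoid taking the log of zero; the script may instead average the two terms or smooth differently. The example proportions are invented.

```python
from math import log


def normalize(values, eps=1e-10):
    """Add eps to every entry, then rescale so the entries sum to 1."""
    smoothed = [v + eps for v in values]
    total = sum(smoothed)
    return [v / total for v in smoothed]


def symmetric_kl(p, q, eps=1e-10):
    """Return KL(p || q) + KL(q || p) for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must cover the same clusters")
    p, q = normalize(p, eps), normalize(q, eps)
    kl_pq = sum(a * log(a / b) for a, b in zip(p, q))
    kl_qp = sum(b * log(b / a) for a, b in zip(p, q))
    return kl_pq + kl_qp


# Hypothetical cluster proportions for one animal in two conditions.
context = [0.70, 0.20, 0.10, 0.00]
other = [0.25, 0.25, 0.25, 0.25]
print(f"symmetric KL = {symmetric_kl(context, other):.3f}")
```

The value is zero when the two distributions are identical and grows as an animal's cluster usage diverges between conditions.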

(Optional) Hyperparameter search

python param_optuna.py

Runs 500 Optuna trials optimizing silhouette score over UMAP, HDBSCAN, and meta-clustering hyperparameters.


ContourUSV integration

data_prep.py uses the detection methods from ContourUSV, our earlier USV detection pipeline.


Configuration

Key parameters are set as variables at the top of each script.

| Script | Variable | Default | Description |
|---|---|---|---|
| data_prep.py | root_path | Path("USCMed") | Root directory of the downloaded dataset |
| data_prep.py | experiment_tests_mapping | {'PTSD16': ['Context']} | Experiments and conditions to process |
| data_prep.py | freq_min / freq_max | 0 / 115 kHz | Frequency range for spectrogram |
| autoencoder.py | dims | [N, 2048, 512, 128, 10] | Autoencoder layer sizes |
| autoencoder.py | pretrain_epochs | 300 | Max training epochs (early stopping applies) |
| dcusv.py | embedding | "dense_encoded_images_all" | Embedding file stem to load |
| dcusv.py | UMAP n_neighbors / min_dist | 7 / 0.0 | UMAP hyperparameters |
| dcusv.py | HDBSCAN min_cluster_size / min_samples | 10 / 14 | HDBSCAN hyperparameters |
| cluster_dist.py | agg n_clusters | 4 | Number of meta-clusters |

Output examples

  • Cluster image grids — a panel of up to 10 representative spectrogram patches for each cluster.
  • UMAP / t-SNE / PCA scatter plots — 2-D embedding colored by merged cluster label.
  • Cluster heatmaps — rows = clusters, columns = animals; color = proportion of animal's USVs in that cluster.
  • KL divergence bar chart — per-animal divergence between ACQ and Context cluster distributions.
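
The heatmap values described above (proportion of each animal's USVs per cluster) can be sketched as a column-normalized count table. This is an illustration only: `cluster_proportions` is a hypothetical helper, the animal IDs are invented, and dropping noise points (label -1) before normalizing is an assumption about how cluster_dist.py treats them.

```python
from collections import Counter


def cluster_proportions(labels_by_animal, noise_label=-1):
    """Rows = clusters, columns = animals; each animal's column sums to 1.

    labels_by_animal maps an animal ID to the cluster labels of its USVs.
    Noise points are dropped before normalizing (an assumption).
    """
    counts = {
        animal: Counter(label for label in labels if label != noise_label)
        for animal, labels in labels_by_animal.items()
    }
    clusters = sorted({c for counter in counts.values() for c in counter})
    table = {}
    for cluster in clusters:
        table[cluster] = {
            animal: (counter[cluster] / sum(counter.values()) if counter else 0.0)
            for animal, counter in counts.items()
        }
    return table


# Hypothetical labels for two animals.
example = {"rat01": [0, 0, 1, -1], "rat02": [1, 1, 1, 2]}
for cluster, row in cluster_proportions(example).items():
    print(cluster, {animal: round(value, 2) for animal, value in row.items()})
```

Normalizing per animal makes animals with very different call counts comparable, which is what allows their columns to be compared in a single heatmap.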

Cluster quality metrics

dcusv.py prints the following metrics (noise points excluded):

| Metric | Interpretation |
|---|---|
| Silhouette score | Higher is better (−1 to 1) |
| Davies-Bouldin score | Lower is better |
| Calinski-Harabasz score | Higher is better |
| HDBSCAN validity index | Higher is better (−1 to 1) |
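
Because HDBSCAN marks noise points with the label -1, they must be filtered out before any of these scores is computed. The sketch below shows this filtering together with a plain-Python silhouette score. It is not the code used in dcusv.py, which most likely relies on a library implementation; `silhouette_excluding_noise` is a hypothetical name, and Euclidean distance is assumed.

```python
from math import dist


def silhouette_excluding_noise(points, labels, noise_label=-1):
    """Mean silhouette over non-noise points (Euclidean distance)."""
    kept = [(p, lab) for p, lab in zip(points, labels) if lab != noise_label]
    clusters = {}
    for point, label in kept:
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in kept:
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # Mean distance to the other members of the same cluster.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # Smallest mean distance to any other cluster.
        b = min(
            sum(dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Leaving noise points in would treat -1 as one more cluster and usually lower the score, so scores computed with and without noise are not comparable.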

Dependencies

See requirements.txt for the full list of dependencies.


Citation

If you use DCUSV in your research, please cite the following:

DCUSV (this work):

Deep Clustering of Ultrasonic Vocalizations in Rodents. Research Square, 2025. https://www.researchsquare.com/article/rs-9068431/v1

ContourUSV (spectrogram preprocessing):

Anis, S. S., Kellis, D. M., Kaigler, K. F., Wilson, M. A., & O'Reilly, C. (2025). A Reliable and Efficient Detection Pipeline for Rodent Ultrasonic Vocalizations. arXiv:2503.18928. https://arxiv.org/abs/2503.18928

USCMed dataset:

https://zenodo.org/records/15029872
