
Exploring the Descript Audio Codec

Notebooks showing how to encode/decode audio, create your own dac files, morph in the DAC latent space, etc.

(Forked from the Descript RVQGAN (see their README_Descript in this repository) and modified to work with the Descript Audio Codec (DAC) and the Descript Audio Codec Morphing (DACM) models. The Descript Audio Codec is a VQ-VAE model that compresses audio files to a small number of bits, and the Descript Audio Codec Morphing model is a GAN that morphs between two audio files in the DAC latent space. The Descript Audio Codec is available at https://github.com/descriptinc/descript-audio-codec)

First let's set up a container with all the libraries and requirements you will need to run the notebooks.

If you don't already have access to a container built from this repository (lonce:dacdevupf in Singapore, or /home/lwyse/working/DACodecMorphing/lwyse_dacdevupf23.sif on the HPC cluster in Barcelona (BCN)), you can build the container yourself. To build the docker container:

docker image build --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) --file Dockerfile --tag yourname:containername  .

Uncompress dacdevdata.zip (which holds the data files used in the notebooks). The notebooks read their audio files from the folder that is created.

Then run your container:

docker run --ipc=host --gpus "all" -p 8889:8889 -it -v /fullpath-to-local-dir:/app --name foo --rm yourname:containername

where:

/fullpath-to-local-dir - is the absolute path to the folder where this repository lives. The -v a:b flag maps a full host path 'a' to a path 'b' inside the container; /app is required because that is how the notebooks reference it.
foo - is whatever name you want to give to the running docker container.
-p - is the port mapping; use the same port that you will run jupyter on (see below).
--gpus - leave this flag out if you don't have any GPUs! (The notebooks can easily run on the CPU.)

If you are running on the HPC in BCN:

Start an interactive session (interactive -g 1).

While you are in an interactive session on a research compute node, pull the docker image from the UPF registry (it is automatically built into a Singularity container):

singularity pull --docker-login docker://registry.sb.upf.edu/mtg/lwyse:dacdev23

Go get a cup of coffee.

Then run the container:

singularity run --nv  --bind $(pwd):/app lwyse_dacdevupf23.sif

Run jupyter inside your container and then point a browser to the running notebook (using an ssh tunnel as explained below). You might have your own way of doing this, but because I am running remotely, I do it this way:

Run jupyter notebooks in the background:

jupyter lab --no-browser --port=8889 --ip=0.0.0.0 &

Point your local browser to

localhost:xxxx

where xxxx is the local port you are using for the ssh tunnel to the remote machine and the port jupyter is running on there.

If you are running a docker container, find the IP address the container is using this way:

Press ^P ^Q to "detach" from the docker while leaving the container running ('docker attach foo' will put you back in the container).

Run docker inspect foo to find the IPAddress that the container is using.
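If you would rather not read the address out of the JSON by eye, here is a minimal sketch of pulling it out programmatically. The helper name container_ips is hypothetical, and the layout it assumes (a top-level list whose entries carry NetworkSettings, then Networks, then IPAddress per network) is what docker inspect normally prints; check your own output if it looks different.

```python
import json
import subprocess


def container_ips(inspect_json):
    """Return the non-empty IP addresses found in `docker inspect` JSON output."""
    ips = []
    for entry in json.loads(inspect_json):
        # Assumed layout: NetworkSettings -> Networks -> <network name> -> IPAddress
        networks = entry.get("NetworkSettings", {}).get("Networks") or {}
        for net in networks.values():
            ip = net.get("IPAddress")
            if ip:
                ips.append(ip)
    return ips


def inspect_container(name):
    """Run `docker inspect <name>` on this machine and extract the addresses."""
    result = subprocess.run(
        ["docker", "inspect", name], capture_output=True, text=True, check=True
    )
    return container_ips(result.stdout)


# A trimmed, made-up sample of what `docker inspect foo` might print.
sample = '[{"NetworkSettings": {"Networks": {"bridge": {"IPAddress": "172.17.0.5"}}}}]'
print(container_ips(sample))
```

On the machine where the container is running, inspect_container("foo") would do the same thing against the live container.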

You need an ssh tunnel from your local machine to the IP and port that jupyter is running on. Let's say the docker is using 172.17.0.5:8889 (we specified the port when we ran jupyter); then you need a tunnel from some local port (for example 7775) to 172.17.0.5:8889. Once you have the tunnel set up, you can point your browser to localhost:7775 and, voilà, you can open the notebooks and run them.

If you are running on HPC research nodes (e.g. node022), the remote end of the ssh tunnel would look like node022:8889, where 8889 is the port you specified when you ran jupyter. Your local port number for the tunnel is whatever you set it to.
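For reference, here is one way the tunnel described above could be spelled out, as a sketch rather than the only way to do it. It builds a standard ssh local port forward (-L local_port:target_host:target_port, with -N so no remote command is run). The function name tunnel_command and the logins user@remote-machine and user@hpc-login-host are placeholders, not names from this repository.

```python
import shlex


def tunnel_command(local_port, target_host, target_port, login):
    """Build the argv for `ssh -N -L local_port:target_host:target_port login`."""
    forward = f"{local_port}:{target_host}:{target_port}"
    # -N: do not run a remote command; -L: forward a local port to the target
    return ["ssh", "-N", "-L", forward, login]


# Docker case: local port 7775 to the container at 172.17.0.5:8889.
# "user@remote-machine" is a placeholder for your own login.
docker_cmd = tunnel_command(7775, "172.17.0.5", 8889, "user@remote-machine")
print(shlex.join(docker_cmd))

# HPC case: same idea, with the compute node as the target.
# "user@hpc-login-host" is a placeholder.
hpc_cmd = tunnel_command(7775, "node022", 8889, "user@hpc-login-host")
print(shlex.join(hpc_cmd))
```

Paste the printed command into a terminal on your local machine (or pass the list to subprocess.run) and leave it running while you use the notebooks.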

If you don't want to wait for the Descript pre-trained models to download every time (see the top of any of the notebooks), you can download the models once, keep them locally, and point model_path at the file. To download them, run wget at a command-line prompt:

wget https://github.com/descriptinc/descript-audio-codec/releases/download/0.0.5/weights_16khz.pth

See dac/utils/__init__.py for other models using other sample rates.
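A minimal sketch of the "use the local file if it is there, otherwise download" logic follows. It assumes the dac package in this fork still exposes dac.utils.download and dac.DAC.load as in the upstream Descript README; the helper names find_local_weights and load_dac_model are made up for illustration, and the notebooks themselves may set model_path differently.

```python
from pathlib import Path


def find_local_weights(path):
    """Return `path` as a Path if the weights file exists locally, else None."""
    candidate = Path(path).expanduser()
    return candidate if candidate.is_file() else None


def load_dac_model(local_weights="weights_16khz.pth", model_type="16khz"):
    """Load DAC from a local file when present, otherwise fall back to downloading."""
    import dac  # imported here so the helper above works without the package installed

    model_path = find_local_weights(local_weights)
    if model_path is None:
        # Assumption: same download helper as in the upstream Descript README.
        model_path = dac.utils.download(model_type=model_type)
    return dac.DAC.load(str(model_path))
```

With weights_16khz.pth already fetched by the wget command above, load_dac_model() would skip the download entirely.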

Notebooks

These notebooks were all built to explore the DAC API, examine the embeddings (z) and the latents (the 8D vectors that get quantized), compare in-code encoding/decoding with command-line encoding/decoding, etc.

ClampExperiment.ipynb - For checking the clamping and rescaling of vectors we sometimes need to do. Also has a nice display of the distribution of vector values.
DACCodecbookTests16.ipynb - testing the difference that the number of codebooks makes for 16kHz sample rate sounds
DACCodebookTest44 - testing the difference that the number of codebooks makes for 44kHz sample rate sounds
dacfiles.ipynb - exploring dataloading of dac files
MorphProjectedLatents.ipynb - morphing in the 8D pre-quantization latent space
morphZ.ipynb - morphing in the 1024D z-space (prior to the projection to 8D)
MyEncode_Folder - encoding exploration
MyEncode_vs_Compress.ipynb - compares programmatic coding to command-line coding (which create files that differ, unfortunately!)
perturb16.ipynb - testing what adding a little noise to codebook codes does to the audio
randomcodes16.ipynb - what do random codes sound like, anyway?
randomcodes44.ipynb - and does that say anything about the codec biases?
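To give a feel for what "morphing" in a latent space means, here is a toy sketch using plain linear interpolation between two latent sequences. This is only the simplest possible scheme, picked for illustration; the morphing notebooks may well do something different. The function names and the toy values are made up, and real latents would be tensors or arrays rather than nested lists.

```python
def lerp_latents(a, b, alpha):
    """Blend two latent sequences frame by frame: (1 - alpha) * a + alpha * b."""
    if len(a) != len(b):
        raise ValueError("both sequences need the same number of frames")
    return [
        [(1.0 - alpha) * x + alpha * y for x, y in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(a, b)
    ]


def morph_steps(a, b, steps):
    """Return `steps` evenly spaced blends from a (alpha=0) to b (alpha=1)."""
    if steps < 2:
        raise ValueError("need at least two steps to include both endpoints")
    return [lerp_latents(a, b, i / (steps - 1)) for i in range(steps)]


# Two toy "sounds", each two frames of 8D latents (made-up values).
sound_a = [[0.0] * 8, [1.0] * 8]
sound_b = [[2.0] * 8, [3.0] * 8]
halfway = lerp_latents(sound_a, sound_b, 0.5)
print(halfway[0][0], halfway[1][0])
```

Each blended sequence would then be passed on through the rest of the codec (quantization and decoding) to hear the result.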

