The Jazz Trio Database (JTD) is a dataset composed of about 45 hours of jazz performances annotated by an automated signal processing pipeline. For more information, check out the docs or our paper published in Transactions of the International Society for Music Information Retrieval.
JTD is now integrated into the latest version of mirdata, which is the recommended way to work with the database annotations. To install mirdata, run the following command (ideally inside a virtual environment):
pip install mirdata
Now, you can download the dataset and access the annotations simply by running the following lines of Python code:
import mirdata
jtd = mirdata.initialize('jtd')
jtd.download()
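Once the annotations are downloaded, a quick sanity check can be useful. The following is a minimal sketch rather than an official recipe: `summarize_tracks` and `inspect_jtd` are hypothetical helper names that are not part of mirdata or JTD. The mirdata calls they rely on (`initialize`, `download`, `validate`, `choice_track` and the `track_ids` attribute) belong to mirdata's generic dataset interface, and we assume they behave for `jtd` as they do for other loaders.

```python
def summarize_tracks(dataset, limit=3):
    """Return the total track count and the first few track IDs.

    `dataset` can be any mirdata-style object exposing a `track_ids` list.
    (Hypothetical helper, not part of mirdata.)
    """
    track_ids = list(dataset.track_ids)
    return {"n_tracks": len(track_ids), "first_ids": track_ids[:limit]}


def inspect_jtd(data_home=None):
    """Download the JTD annotations and print a short summary.

    (Hypothetical helper; `data_home=None` uses mirdata's default location.)
    """
    import mirdata  # imported lazily so summarize_tracks works without it

    jtd = mirdata.initialize("jtd", data_home=data_home)
    jtd.download()  # fetches the annotations, as in the lines above
    jtd.validate()  # reports local files that are missing or fail checksums
    print(summarize_tracks(jtd))

    track = jtd.choice_track()  # picks one track at random
    print(track.track_id)
    return jtd
```

Calling `inspect_jtd()` runs the whole check. Note that `validate()` may list the audio files as missing until they have been obtained separately and placed in the expected location.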
Audio recordings (both mixed and unmixed) can be found on this Zenodo record. Further instructions on where to place these recordings are provided when you run jtd.download(). Access must be requested before the JTD audio can be downloaded, and will only be granted to valid research projects. When requesting access, please describe in as much detail as possible how you hope to use JTD.
For more information on using mirdata together with JTD, refer to the mirdata documentation and the examples given in our documentation. Building the JTD annotations directly from source is possible but not recommended; for instructions, read the relevant page of our docs.
The dataset is made available under the MIT License. Please note that your use of the audio files linked to on YouTube is not covered by the terms of this license.
If you use the Jazz Trio Database in your work, please cite the paper where it was first introduced:
@article{jazz-trio-database,
title = {Jazz Trio Database: Automated Annotation of Jazz Piano Trio Recordings Processed Using Audio Source Separation},
url = {https://doi.org/10.5334/tismir.186},
doi = {10.5334/tismir.186},
journal = {Transactions of the International Society for Music Information Retrieval},
author = {Cheston, Huw and Schlichting, Joshua L and Cross, Ian and Harrison, Peter M C},
year = {2024},
}
Further information on mirdata can be found in the following paper:
@inproceedings{bittner_fuentes_2019,
title={mirdata: Software for Reproducible Usage of Datasets},
author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
year={2019}
}
The Jazz Trio Database has been used in the following published research outputs:
- Cheston, H., Bance, R., & Harrison, P. M. C. (2025). Deconstructing Jazz Piano Style Using Machine Learning. arXiv. https://doi.org/10.48550/arXiv.2504.05009
- Cheston, H., Schlichting, J. L., Cross, I., & Harrison, P. M. C. (2024). Rhythmic Qualities of Jazz Improvisation Predict Performer Identity and Style in Source-Separated Audio Recordings. Royal Society Open Science. https://doi.org/10.1098/rsos.240920
- Cheston, H., Cross, I., & Harrison, P. M. C. (2023). An Automated Pipeline for Characterizing Timing in Jazz Trios. Proceedings of the DMRN+18 Digital Music Research Network. Digital Music Research Network, Queen Mary University of London, London, United Kingdom.