Build a model that correctly recognizes the genre of a piece of music from its characteristic audio features. Build an interface that uses a given model to predict the genre of an audio file with acceptable accuracy.
The dataset used in this project is the GTZAN Genre Collection. It was compiled by George Tzanetakis and is distributed through the popular software framework MARSYAS. Marsyas (Music Analysis, Retrieval and Synthesis for Audio Signals) is an open-source software framework for audio processing with a specific emphasis on Music Information Retrieval applications. The dataset consists of 1000 audio tracks, each 30 seconds long. It contains 10 genres (Blues, Classical, Country, Disco, Hip-Hop, Jazz, Metal, Pop, Reggae and Rock), each represented by 100 tracks. All tracks are 22050 Hz, mono, 16-bit audio files in .wav format.
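Before training, it can help to confirm that a local copy of the dataset matches the description above. The following is a minimal sketch using only the Python standard library. The folder layout (one sub-folder per genre under a root directory), the lowercase folder names and the helper function names are assumptions made for illustration, not part of the dataset specification.

```python
import wave
from pathlib import Path

# Assumed folder names, one per genre; adjust to match the local copy.
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

EXPECTED_RATE = 22050        # samples per second
EXPECTED_CHANNELS = 1        # mono
EXPECTED_SAMPLE_WIDTH = 2    # bytes per sample, i.e. 16-bit


def matches_expected_format(path):
    """Return True if the WAV file is 22050 Hz, mono, 16-bit."""
    with wave.open(str(path), "rb") as wav:
        return (wav.getframerate() == EXPECTED_RATE
                and wav.getnchannels() == EXPECTED_CHANNELS
                and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH)


def duration_seconds(path):
    """Return the length of the WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def count_tracks_per_genre(root):
    """Count .wav files in each assumed genre folder under root."""
    root = Path(root)
    counts = {}
    for genre in GENRES:
        folder = root / genre
        # A missing folder is reported as zero tracks rather than an error.
        counts[genre] = len(list(folder.glob("*.wav"))) if folder.is_dir() else 0
    return counts
```

On a complete copy laid out this way, count_tracks_per_genre would be expected to report 100 tracks for each of the 10 genres, and duration_seconds should return roughly 30 for each track.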