Music-Genre-Classification (GTZAN Dataset)

Prerequisites

Python 3.6.8
CUDA 9.0 (Follow this guide: https://blog.quantinsti.com/install-tensorflow-gpu/ )
Sublime Text (Optional)

Packages Required

numpy (For numerical computation)
librosa (For loading and processing audio)
tqdm (For showing a progress bar)
sklearn (For splitting the data into train, validation and test sets, and for the confusion matrix)
keras (For deep learning: CNN and VGG16)
matplotlib (For plotting train and validation loss and accuracy)
collections (For storing the genre names shown in the confusion matrix)
itertools (For iterating over elements)
pickle (For saving computed data to disk so that it does not have to be recomputed on every run; load the pickle file instead)
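
The caching role described for pickle can be sketched as follows. This is a minimal illustration, not the repository's code: the helper `load_or_compute`, the stand-in function `expensive_step`, and the file name `features.pkl` are all hypothetical.

```python
import os
import pickle
import tempfile


def load_or_compute(cache_path, compute_fn):
    """Return the pickled result if the cache file exists, else compute and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    data = compute_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(data, f)
    return data


# Hypothetical stand-in for a slow step such as feature extraction.
calls = []


def expensive_step():
    calls.append(1)  # record how often the slow path actually runs
    return {"features": [[0.1, 0.2], [0.3, 0.4]], "labels": [0, 1]}


cache_file = os.path.join(tempfile.mkdtemp(), "features.pkl")
first = load_or_compute(cache_file, expensive_step)   # computes, then writes the file
second = load_or_compute(cache_file, expensive_step)  # reads the file, no recomputation
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.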

System Requirements

16GB RAM
Nvidia GPU with 4GB memory

Dataset

GTZAN Genre Collection (Download Link: http://marsyas.info/downloads/datasets.html )

Steps of Execution:

Read_file.py : Reads the audio files and assigns each one a label corresponding to its genre.
Audio_Segment.py : Clips each audio file into short segments whose duration depends on the window size and overlap.
Feature_Extraction.py : Contains the feature extraction techniques (STFT, Melspectrogram and MFCC). To extract a different feature, replace the function name 'to_stft' with the function for the feature you want.
Split_Data.py : Splits the data into train, validation and test sets.
CNN_Model.py : Contains the CNN model. A CNN + RNN variant is included as commented-out code, so you can train with an RNN if required.
CNN_BiDirectional.py : Contains the CNN + BiRNN model.
VGG16_Model.py : Contains the VGG16 model. A VGG16 + RNN variant is included as commented-out code, so you can train with an RNN if required.
VGG16_BiDirectional.py : Contains the VGG16 + BiRNN model.
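
The segmentation step can be sketched roughly as below. This is an assumption-laden illustration rather than the contents of Audio_Segment.py: the function name `split_into_segments`, the parameters `clip_seconds` and `overlap`, and the synthetic signal are hypothetical, and the real script may define window size differently.

```python
import numpy as np


def split_into_segments(signal, label, sr, clip_seconds=3.0, overlap=0.5):
    """Cut a 1-D signal into overlapping clips; every clip inherits the track label.

    clip_seconds -- length of each clip in seconds (assumed convention)
    overlap      -- fraction of each clip shared with the next one
    """
    chunk = int(clip_seconds * sr)      # samples per clip
    step = int(chunk * (1 - overlap))   # hop between consecutive clip starts
    if chunk <= 0 or step <= 0:
        raise ValueError("clip_seconds and overlap must give a positive clip length and hop")
    starts = range(0, len(signal) - chunk + 1, step)
    clips = np.array([signal[i:i + chunk] for i in starts])
    labels = np.full(len(clips), label)
    return clips, labels


# Synthetic 30-second "track" at 22050 Hz standing in for a real audio file.
sr = 22050
track = np.random.default_rng(0).standard_normal(30 * sr)
clips, labels = split_into_segments(track, label=3, sr=sr)
print(clips.shape, labels.shape)
```

With 50% overlap each sample (apart from the edges) appears in two clips, which multiplies the number of training examples at the cost of correlated samples. Splitting tracks into train, validation and test sets before segmenting avoids clips from one track leaking across sets.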

The other files are only for exploratory data analysis (EDA):

Waveform.py : Plots the waveform of an audio file.

Plot_Audio.py : Plots the spectrogram of the audio segments. (Uncomment the relevant code to see the STFT, Melspectrogram or MFCC.)

Plot_CM.py : A module for printing the confusion matrix.
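
As a rough idea of what a confusion-matrix helper involves, the sketch below counts predictions with plain NumPy and lists the misclassified genre pairs. It is not the code in Plot_CM.py: the counting function stands in for sklearn's confusion matrix so the example has no extra dependencies, and the three-genre mapping and sample labels are made up for illustration.

```python
import itertools
from collections import OrderedDict

import numpy as np

# Hypothetical index-to-name mapping for three of the GTZAN genres.
genres = OrderedDict([(0, "blues"), (1, "classical"), (2, "country")])


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


# Made-up labels and predictions, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, len(genres))

# Walk every (true, predicted) cell and keep the off-diagonal errors.
mistakes = [
    (genres[i], genres[j], int(cm[i, j]))
    for i, j in itertools.product(range(len(genres)), repeat=2)
    if i != j and cm[i, j] > 0
]

print(cm)
for true_name, pred_name, count in mistakes:
    print(f"{true_name} predicted as {pred_name}: {count}")
```

The same `itertools.product` loop over cells is a common way to annotate each square when the matrix is drawn with matplotlib.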

Result

The results for each model with each feature extraction technique are in the 'Result and Output' folder:

Train, validation and test loss and accuracy
Test confusion matrix

Description

A full description of this project is given in the thesis, 'Major Thesis_(Final).pdf'.

Publication

Faiyaz Ahmad, Sahil, 'Music Genre Classification using Spectral Analysis Techniques With Hybrid Convolution-Recurrent Neural Network', International Journal of Innovative Technology and Exploring Engineering (IJITEE), Volume-9, Issue-1, November 2019

Link: https://www.ijitee.org/wp-content/uploads/papers/v9i1/A3956119119.pdf
