
Genre Classification of Music Using Machine Learning

Partha Ghosh, Soham Mahapatra, Subhadeep Jana, Ritesh Kr. Jha

Abstract


Among the most important technological advances of the 21st century are artificial intelligence (AI) and machine learning. They are revolutionising computing, banking, healthcare, agriculture, music and travel, and powerful models have mastered many difficult learning tasks. Audio analysis is one such area of artificial intelligence; it includes music information retrieval, music generation, and music classification. Music is one of the most complex types of source data available today, primarily because it is challenging to extract relevant correlated features from it. Various methods, from classical machine learning to neural networks and hybrids, have been applied to music data and have achieved excellent accuracy. This study analyses and contrasts various approaches to identifying musical genres. On a small sample of the Free Music Archive (FMA) dataset, the accuracy rates were as follows: 46% using a Support Vector Classifier (SVC), 40% using Logistic Regression, 67% using an Artificial Neural Network (ANN), 77% using a Convolutional Neural Network (CNN), 90% using a Convolutional-Recurrent Neural Network (CRNN), 88% using a Parallel Convolutional-Recurrent Neural Network (PCRNN), 73% without an ensemble technique, and 85% using the AdaBoost ensemble technique. We established SVC, with an accuracy of 46%, as our baseline model and required the succeeding models to exceed that accuracy. On the test dataset, the ANN gave us a score of 67%, whereas the CNN outperformed it with a score of 77%. We discovered that image-based features categorised the labels better than typical audio-extracted features. The dataset responded best to a combination of CNN and RNN, with a series CRNN model providing the highest accuracy. We then trained an ensemble model on our dataset and investigated how it performed. This research thoroughly examines numerous ways of classifying music genres, with an emphasis on parallel models and ensembling strategies.
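The abstract's observation that image-based features outperform typical audio-extracted features refers to time-frequency representations such as spectrograms, which CNN- and CRNN-style models consume as images. As an illustration only (not code from the paper), the following minimal NumPy sketch turns a waveform into such a time-frequency "image"; the function name and frame parameters are assumptions for the example:

```python
import numpy as np

def stft_spectrogram(signal, frame_len=1024, hop=512):
    """Magnitude spectrogram via a Hann-windowed short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping windowed frames
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Each frame's FFT magnitude becomes one column of the time-frequency image
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, n_frames)

# Toy example: one second of a 440 Hz tone sampled at 22050 Hz
sr = 22050
t = np.arange(sr) / sr
spec = stft_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (513, 42)
```

In a genre-classification pipeline, such a 2-D array (typically mel-scaled and log-compressed) would be fed to the CNN or CRNN in place of hand-crafted scalar audio features.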




