Groenbroek, Herman (2021) A Machine Learning Approach to Automatic Language Identification of Vocals in Music. Master's Thesis / Essay, Artificial Intelligence.
Abstract
Audio classification is an important field within data science, and Automatic Language Identification (LID) is an important first step of it. At present, there is no publicly accessible system that accurately classifies the language in which music is sung, nor a labelled dataset on which to train one. In this thesis, a novel music dataset with language labels is described: the 6L5K Music Corpus. A vocal fragment dataset is obtained by taking 3-second audio fragments from the 6L5K Music Corpus that a pretrained vocal detector classifies as containing vocals. Two neural network architectures are implemented: a feedforward DNN and VGGish. Mel spectrograms and MFCCs are computed as input features. The results in this thesis indicate that LID of sung music is a non-trivial task. The DNN, in various setups, performs better than chance, obtaining at best 35% accuracy with six languages. VGGish shows more promising results on the vocal fragment data, obtaining 41% accuracy on the same six-class dataset. On unseen test data, however, the DNN drops to 18.1% accuracy, whereas VGGish drops to a more respectable 35.2%. Finally, we implement an ensemble that combines the two models, but its results are no better than the average of the two. These results, combined with the fact that little research has been done on LID of sung music, indicate that this subset of audio classification still has plenty of potential for novel research.
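The fragmenting and ensembling steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis code. The function names `split_into_fragments` and `average_ensemble` are hypothetical, and the choices of non-overlapping fragments, a 16 kHz sample rate in the usage note, and plain averaging of class probabilities are assumptions, since the abstract does not state how fragments are cut or how the two models are combined.

```python
from typing import List, Sequence, Tuple


def split_into_fragments(
    samples: Sequence[float],
    sample_rate: int,
    fragment_seconds: float = 3.0,
) -> List[Sequence[float]]:
    """Cut a mono signal into consecutive fixed-length fragments.

    Assumption: fragments do not overlap, and a trailing remainder
    shorter than one full fragment is dropped.
    """
    size = int(sample_rate * fragment_seconds)
    if size <= 0:
        raise ValueError("fragment length must be at least one sample")
    last_start = len(samples) - size
    return [samples[i:i + size] for i in range(0, last_start + 1, size)]


def average_ensemble(
    probs_a: Sequence[float],
    probs_b: Sequence[float],
) -> Tuple[int, List[float]]:
    """Combine two per-class probability vectors by simple averaging.

    Assumption: averaging is only one possible way to combine two
    models. Returns the index of the most likely class and the
    averaged probabilities.
    """
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same classes")
    averaged = [(a + b) / 2 for a, b in zip(probs_a, probs_b)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return best, averaged
```

For example, a 10-second signal at an assumed 16 kHz sample rate yields three complete 3-second fragments of 48,000 samples each, with the final second discarded. In a fuller pipeline, each fragment would first pass through a vocal detector and then be converted to mel spectrogram or MFCC features before classification.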
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Supervisor name: | Wiering, M.A. |
Degree programme: | Artificial Intelligence |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 08 Apr 2021 14:11 |
Last Modified: | 08 Apr 2021 14:11 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/24222 |