A Comparative Analysis of Optimized Neural Networks and Large-Scale Language Models for Music Genre Classification

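Marzuqi, Ahmad Naufal Luthfan (2026) Analisis Komparatif antara Neural Network yang Dioptimalkan dan Large-Scale Language Models untuk Klasifikasi Genre Musik [A Comparative Analysis of Optimized Neural Networks and Large-Scale Language Models for Music Genre Classification]. Undergraduate thesis, Universitas Muhammadiyah Malang.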

Files:
PENDAHULUAN.pdf (Text, 1MB)
BAB I.pdf (Text, 366kB)
BAB II.pdf (Text, 406kB)
BAB III.pdf (Text, 366kB, restricted to registered users)
BAB IV.pdf (Text, 714kB, restricted to registered users)
BAB V.pdf (Text, 315kB, restricted to registered users)
LAMPIRAN.pdf (Text, 463kB, restricted to registered users)
POSTER.png (Image, 674kB, restricted to registered users)

Abstract

The rapid growth of the digital music industry calls for accurate music genre classification to improve the user experience of streaming services. This study compares a domain-specific Long Short-Term Memory (LSTM) network with three audio Large Language Models (LLMs), HuBERT, WavLM, and wav2vec 2.0, on Music Genre Classification (MGC). The LSTM was trained on Mel-spectrograms computed from the GTZAN dataset, while the LLMs were fine-tuned on a smaller set of raw audio samples because of computational constraints. All models were tested on datasets with identical genre labels to keep the evaluation fair. The LSTM achieved the highest accuracy, 97.10%, outperforming HuBERT (86.00%), WavLM (83.00%), and wav2vec 2.0 (80.00%). It also generalized well and trained stably without overfitting, whereas the LLMs struggled to separate genres with similar acoustic characteristics. These findings indicate that general-purpose pre-trained models, although powerful, are less effective on music-specific tasks because of domain mismatch. Incorporating music-specific features and architectures therefore remains essential for accurate and reliable automatic genre classification.
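
The evaluation outlined in the abstract (a shared genre label set, overall accuracy, and confusion between acoustically similar genres) can be illustrated with a minimal sketch. This is not the thesis's code: the function names and the toy predictions are hypothetical and chosen only for illustration, and the ten genre names are those of the public GTZAN dataset rather than anything stated in the abstract.

```python
from collections import Counter

# The ten GTZAN genre labels (from the public dataset, not from the abstract).
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def check_label_set(labels, allowed=GENRES):
    """Raise if any label falls outside the shared genre set."""
    unknown = set(labels) - set(allowed)
    if unknown:
        raise ValueError(f"labels outside the shared genre set: {sorted(unknown)}")


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted genre matches the reference genre."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def most_confused_pairs(y_true, y_pred, top=3):
    """Return the most frequent (true, predicted) error pairs, most common first."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(top)


# Hypothetical toy labels, invented purely for illustration (not thesis data).
y_true = ["rock", "country", "country", "jazz", "disco", "pop", "blues", "metal"]
y_pred = ["rock", "rock", "rock", "jazz", "pop", "pop", "blues", "metal"]

check_label_set(y_true + y_pred)
print(f"accuracy: {accuracy(y_true, y_pred):.2%}")
print("most confused:", most_confused_pairs(y_true, y_pred, top=1))
```

Running the same functions on each model's predictions over the same test clips gives directly comparable accuracies, and the error pairs show which genres a model tends to mix up.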

Item Type: Thesis (Undergraduate)
Student ID: 202210370311072
Keywords: audio large language models, comparative deep learning, music genre classification
Subjects: M Music and Books on Music > M Music
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Divisions: Faculty of Engineering > Department of Informatics (55201)
Depositing User: 202210370311072 an4667112
Date Deposited: 11 May 2026 01:31
Last Modified: 11 May 2026 01:31
URI: https://eprints.umm.ac.id/id/eprint/29810
