The Effect of Augmentation Techniques on English-Language News Classification Using the BERT Algorithm

Daffa, Muhammad (2025) Pengaruh Teknik Augmentasi Dalam Klasifikasi Berita Berbahasa Inggris Menggunakan Algoritma BERT. Undergraduate thesis, Universitas Muhammadiyah Malang.

PENDAHULUAN.pdf (1MB)
BAB I.pdf (226kB)
BAB II.pdf (326kB)
BAB III.pdf (332kB, restricted to registered users)
BAB IV.pdf (749kB, restricted to registered users)
BAB V.pdf (200kB, restricted to registered users)
POSTER.pdf (60kB, restricted to registered users)

Abstract

This study investigates the impact of data augmentation techniques on the performance of a BERT-based news classification model. Five experimental scenarios were evaluated: one without augmentation, three each applying a single augmentation technique (Synonym Replacement, Back Translation, or Random Swap), and one combining all three. Model performance was assessed using classification metrics and cosine similarity analysis. Synonym Replacement yielded the largest improvement, with the highest F1-score (0.984) and the lowest evaluation loss (0.0489). However, several techniques generated augmented texts that were nearly identical to the original data (cosine similarity > 0.99), which limited their ability to increase data diversity. The study concludes that the success of data augmentation depends largely on the semantic variety and quality of the generated data, which should enrich the training set without introducing noise.
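To make the similarity concern concrete, the following is a minimal sketch of two of the augmentation techniques named above together with a cosine similarity check. It is not the thesis's implementation: the abstract does not say how texts were vectorized, so the bag-of-words representation, the function names, the example sentence, and the tiny synonym table are all assumptions chosen for illustration. Back Translation is omitted because it requires a translation model.

```python
import math
import random
from collections import Counter


def random_swap(text: str, n_swaps: int = 1, seed: int = 0) -> str:
    """Swap the positions of two randomly chosen words, n_swaps times."""
    words = text.split()
    if len(words) < 2:
        return text
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return " ".join(words)


def synonym_replace(text: str, synonyms: dict[str, str]) -> str:
    """Replace every word that has an entry in the (hypothetical) synonym table."""
    return " ".join(synonyms.get(word, word) for word in text.split())


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors (an assumed representation)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical headline and synonym table, not taken from the thesis data.
original = "stocks rally as markets recover from early losses"
swapped = random_swap(original, n_swaps=2)
replaced = synonym_replace(original, {"rally": "surge", "losses": "declines"})

print("random swap:        ", cosine_similarity(original, swapped))
print("synonym replacement:", cosine_similarity(original, replaced))
```

Under this word-count representation, Random Swap only reorders words, so its similarity to the original stays at 1 (up to floating-point rounding), while replacing two of the eight words lowers it to about 0.75. Embedding-based similarity, which the thesis may have used instead, would give different values, but the sketch shows why an augmentation that barely changes the text adds little diversity to the training set.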

Item Type: Thesis (Undergraduate)
Student ID: 202110370311047
Keywords: Back Translation, BERT, Data Augmentation, News Classification, Random Swap, Synonym Replacement
Subjects: Q Science > Q Science (General)
Divisions: Faculty of Engineering > Department of Informatics (55201)
Depositing User: 202110370311047 daffa25
Date Deposited: 30 Jul 2025 08:34
Last Modified: 30 Jul 2025 08:34
URI: https://eprints.umm.ac.id/id/eprint/20818
