Agam, Muh. Ridha (2024) Klasifikasi Berita Bahasa Indonesia dengan Recurrent Neural Network (RNN). Undergraduate thesis, Universitas Muhammadiyah Malang.
Abstract
This study develops a classification model for Indonesian news documents using a
Recurrent Neural Network (RNN). With the advancement of information technology
over the past decade, online media has become the primary source of information
for society, which calls for efficient mechanisms to manage and classify
ever-growing content. RNNs are well suited to classifying Indonesian news
documents because they can learn temporal dependencies in sequential data.
However, RNNs often achieve low accuracy and tend to overfit when sample data
conditions are uncontrolled and the dataset is small. This study therefore seeks
to improve RNN accuracy through data augmentation, which reduces the likelihood
of overfitting. The research proceeds in four main stages: data collection by
scraping the Kompas.com news portal; text preprocessing (cleaning, filtering,
stemming, and tokenization); data augmentation using Back Translation; and RNN
model training. Augmentation is applied to address class imbalance in the
dataset and to raise model accuracy. Back Translation produces richer data
variations and helps the model recognize more diverse patterns, thereby reducing
overfitting and improving overall data quality. The results show that the RNN
model trained with data augmentation performs better than conventional methods
such as the Naïve Bayes Classifier. The RNN model without augmentation achieved
an accuracy of 92%, while the RNN model with augmentation achieved 96%, a gain
of 4 percentage points. This improvement is also reflected in the other
evaluation metrics: precision, recall, and F1-score. This research adds to the
literature on text classification and opens up opportunities for applying deep
learning techniques in text-based research and information system development.
The augmentation technique helps the RNN model identify and classify news
categories more accurately and efficiently. The study is also expected to
contribute to the development of more advanced and adaptive automated news
classification systems.
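
The preprocessing and augmentation stages named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the stopword list, the function names (`preprocess`, `back_translate`, `augment_minority`), the English pivot language, and the `translate` callable are all assumptions made for illustration, and stemming is left out because the abstract does not name the stemmer used.

```python
import re
from collections import Counter
from typing import Callable, List, Tuple

# Assumption: a tiny illustrative stopword list, not the one used in the thesis.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "untuk", "pada", "ini", "itu"}

# Hypothetical translator signature: translate(text, source_lang, target_lang).
Translator = Callable[[str, str, str], str]


def preprocess(text: str) -> List[str]:
    """Clean, tokenize, and filter one news text (stemming omitted)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # cleaning: drop URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # cleaning: keep letters only
    tokens = text.split()                      # tokenization on whitespace
    return [t for t in tokens if t not in STOPWORDS]  # filtering: stopwords


def back_translate(text: str, translate: Translator,
                   source: str = "id", pivot: str = "en") -> str:
    """Translate to a pivot language and back to obtain a paraphrase."""
    return translate(translate(text, source, pivot), pivot, source)


def augment_minority(samples: List[Tuple[str, str]],
                     translate: Translator) -> List[Tuple[str, str]]:
    """Add back-translated copies until every class matches the largest one."""
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    augmented = list(samples)
    for label, count in counts.items():
        pool = [text for text, lab in samples if lab == label]
        # Cycle through the class's texts; a deterministic translator will
        # repeat paraphrases once the pool has been used up.
        for i in range(target - count):
            paraphrase = back_translate(pool[i % len(pool)], translate)
            augmented.append((paraphrase, label))
    return augmented
```

In practice `translate` would wrap whichever machine translation service or model is available. Augmentation should run on the raw training text before preprocessing, and only on the training split, so that paraphrases of test articles do not leak into training.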
Item Type: | Thesis (Undergraduate) |
---|---|
Student ID: | 202010370311035 |
Keywords: | Recurrent Neural Network, Back Translation, Data Augmentation, Text Classification, News Classification. |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; T Technology > T Technology (General) |
Divisions: | Faculty of Engineering > Department of Informatics (55201) |
Depositing User: | 202010370311035 muhridhaagam |
Date Deposited: | 08 Aug 2024 05:56 |
Last Modified: | 08 Aug 2024 05:56 |
URI: | https://eprints.umm.ac.id/id/eprint/9229 |