Klasifikasi Berita Bahasa Indonesia dengan Recurrent Neural Network (RNN) [Indonesian News Classification with a Recurrent Neural Network (RNN)]

Agam, Muh. Ridha (2024) Klasifikasi Berita Bahasa Indonesia dengan Recurrent Neural Network (RNN). Undergraduate thesis, Universitas Muhammadiyah Malang.

Abstract

This study develops a classification model for Indonesian news documents using
a Recurrent Neural Network (RNN). As information technology has advanced over
the past decade, online media has become society's primary source of
information, which calls for efficient mechanisms to manage and classify the
ever-growing volume of content. RNNs are well suited to classifying Indonesian
news documents because they can learn temporal dependencies in sequential data.
However, RNNs often achieve low accuracy and tend to overfit, a problem that
arises from uncontrolled sample data conditions and limited dataset sizes. This
study therefore aims to improve RNN accuracy by using data augmentation to
reduce the likelihood of overfitting. The research involves several main
stages, including data collection by scraping the Kompas.com news portal; text
preprocessing, comprising cleaning, filtering, stemming, and tokenization; data
augmentation using Back Translation; and RNN model training. Data augmentation
is performed to address class imbalance in the dataset and to enhance model
accuracy. Back Translation produces richer data variations and helps the model
recognize more diverse patterns, thus reducing overfitting and improving
overall data quality. The results show that the RNN model trained with data
augmentation performs significantly better than conventional methods such as
the Naïve Bayes Classifier. The RNN model without augmentation achieved an
accuracy of 92%, while the RNN model with data augmentation achieved 96%. This
improvement is also reflected in the other evaluation metrics: precision,
recall, and F1-Score. This research not only strengthens the literature on text
classification but also opens up opportunities for applying deep learning
techniques in text-based research and information system development. The
augmentation technique contributes significantly to helping the RNN model
identify and classify news categories more accurately and efficiently. The
study is also expected to contribute to the development of more advanced and
adaptive automated news classification systems.
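
The abstract describes using Back Translation to address class imbalance, and the idea can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the `Translator` callable, the English pivot language, the category labels, and the balancing rule (top up each class toward the size of the largest class) are all assumptions made for illustration. A real translation model or service would be plugged in where the toy translator stands.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

Sample = Tuple[str, str]                      # (text, label)
Translator = Callable[[str, str, str], str]   # (text, source_lang, target_lang) -> text


def back_translate(text: str, translate: Translator, pivot: str = "en") -> str:
    """Translate Indonesian text to a pivot language and back to Indonesian."""
    pivot_text = translate(text, "id", pivot)
    return translate(pivot_text, pivot, "id")


def augment_minority_classes(
    samples: List[Sample],
    translate: Translator,
    pivot: str = "en",
    seed: int = 0,
) -> List[Sample]:
    """Add back-translated paraphrases to classes smaller than the largest one.

    Each original text is back-translated at most once, and paraphrases that
    duplicate an existing text are skipped, so a class may stay below the
    target size if too few distinct paraphrases are produced.
    """
    if not samples:
        return []
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    target = max(counts.values())
    seen = {text for text, _ in samples}
    augmented = list(samples)
    for label, count in counts.items():
        needed = target - count
        pool = [text for text, lab in samples if lab == label]
        rng.shuffle(pool)
        for text in pool:
            if needed <= 0:
                break
            paraphrase = back_translate(text, translate, pivot)
            if paraphrase in seen:
                continue
            seen.add(paraphrase)
            augmented.append((paraphrase, label))
            needed -= 1
    return augmented


def toy_translate(text: str, source: str, target: str) -> str:
    """Hypothetical stand-in for a real translator.

    It returns the text unchanged on the way to the pivot language and swaps
    a few synonyms on the way back, mimicking the rewording that a round trip
    through a real translation system tends to introduce.
    """
    synonyms = {"naik": "meningkat", "turun": "menurun"}
    if target == "id":
        return " ".join(synonyms.get(word, word) for word in text.split())
    return text


if __name__ == "__main__":
    data = [
        ("harga saham naik", "ekonomi"),
        ("rupiah turun tajam", "ekonomi"),
        ("timnas naik peringkat", "olahraga"),
    ]
    balanced = augment_minority_classes(data, toy_translate)
    print(Counter(label for _, label in balanced))
```

Passing the translator in as a callable keeps the balancing logic independent of any particular translation backend, so the same function can be exercised with a stub, as here, or with a real model.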

Item Type: Thesis (Undergraduate)
Student ID: 202010370311035
Keywords: Recurrent Neural Network, Back Translation, Data Augmentation, Text Classification, News Classification.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
T Technology > T Technology (General)
Divisions: Faculty of Engineering > Department of Informatics (55201)
Depositing User: 202010370311035 muhridhaagam
Date Deposited: 08 Aug 2024 05:56
Last Modified: 08 Aug 2024 05:56
URI: https://eprints.umm.ac.id/id/eprint/9229
