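M. Rafly Rahman, Rahman (2025) INTEGRASI DATA TABULAR DAN REPRESENTASI TEKS UNTUK PREDIKSI RISIKO KLINIS MENGGUNAKAN MACHINE LEARNING DAN LARGE LANGUAGE MODELS [Integration of Tabular Data and Text Representation for Clinical Risk Prediction Using Machine Learning and Large Language Models]. Undergraduate thesis, Universitas Muhammadiyah Malang.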
Abstract
Global health currently faces serious challenges from the growing number of patients with chronic diseases such as heart failure, diabetes, and cancer. The problem is compounded by the limitations of electronic health record (EHR) systems, which cannot yet fully ensure accurate clinical diagnoses because of potential data-entry errors and delays in symptom identification by medical personnel. In response, this study integrates medical tabular data with a classification approach based on classical machine learning (ML) and large language models (LLMs) to improve the accuracy of patient diagnosis predictions. It develops and compares the performance of several ML models, such as XGBoost, SVM, and Logistic Regression, as well as LLMs such as Gemini, LLaMA, and Qwen in fine-tuning, few-shot, and zero-shot scenarios. The results show that LLaMA combined with the few-shot approach (250 shots) achieved the highest accuracy, up to 96.0%, in predicting heart failure risk. The main finding is that a narrative text representation of tabular data, processed with an LLM, significantly enhances contextual understanding and classification accuracy, making this approach highly promising for AI-based clinical decision-making.
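The narrative text representation and few-shot setup described in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual pipeline: the column names (`age`, `resting_bp`), the label strings, the sentence template, and both function names are hypothetical, chosen only to show how a tabular record might be serialized into a sentence and placed in a few-shot prompt.

```python
from typing import Mapping, Sequence, Tuple

Row = Mapping[str, object]


def serialize_row(row: Row) -> str:
    """Turn one tabular record into a single narrative sentence.

    The template is an assumption for illustration, not the thesis's format.
    """
    parts = [f"{name.replace('_', ' ')} is {value}" for name, value in row.items()]
    return "The patient's " + ", ".join(parts) + "."


def build_few_shot_prompt(examples: Sequence[Tuple[Row, str]], query: Row) -> str:
    """Assemble a few-shot classification prompt from labeled rows plus one query row."""
    lines = ["Classify each patient as 'high risk' or 'low risk' of heart failure.", ""]
    for row, label in examples:
        lines.append(serialize_row(row))
        lines.append(f"Answer: {label}")
        lines.append("")
    # The query row is left unanswered so the model completes the label.
    lines.append(serialize_row(query))
    lines.append("Answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    shots = [
        ({"age": 70, "resting_bp": 160}, "high risk"),  # hypothetical records
        ({"age": 35, "resting_bp": 115}, "low risk"),
    ]
    print(build_few_shot_prompt(shots, {"age": 63, "resting_bp": 145}))
```

In a zero-shot scenario the `examples` list would simply be empty. The resulting string would then be sent to whichever LLM is being evaluated.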
| Item Type: | Thesis (Undergraduate) |
|---|---|
| Student ID: | 202110370311159 |
| Keywords: | Medical Tabular Data, Large Language Models (LLM), Clinical Risk Prediction, Data Serialization, Few-shot Learning |
| Subjects: | T Technology > T Technology (General) |
| Divisions: | Faculty of Engineering > Department of Informatics (55201) |
| Depositing User: | 202110370311159 raflyrahmanr060902 |
| Date Deposited: | 07 Feb 2026 04:43 |
| Last Modified: | 07 Feb 2026 04:43 |
| URI: | https://eprints.umm.ac.id/id/eprint/26988 |
