Zuhairi, Ainaa Hanis and Yakub, Fitri and Omar, Mas and Faruq, Amrul (2024) Imbalanced Flood Forecast Dataset Resampling Using SMOTE-Tomek Link. In: Proceeding of International Exchange and Innovation Conference on Engineering & Sciences (IEICES). Kyushu University.
Abstract
Imbalanced data is common and presents a significant challenge for data classification. In this research,
we present a combination of two techniques for handling class imbalance in datasets: SMOTE (Synthetic Minority
Over-sampling Technique) and Tomek Links. Each technique handles the class imbalance problem in a different way, and
their combination aims to create a more balanced and cleaner dataset for training machine learning models for
binary classification by addressing problematic or difficult-to-classify data. The machine learning classifiers used in this
study are K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Logistic Regression, Decision Tree (DT),
Random Forest (RF), Gradient Boosting, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM),
AdaBoost and CatBoost. The mean F1 scores obtained on the resampled datasets were found to give more trustworthy
results for flood forecasting.
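The two resampling steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`nearest`, `smote`, `remove_tomek_links`, `smote_tomek`), the neighbour count `k=3`, the choice to drop both members of each Tomek link, and the toy data are all assumptions made for illustration. In practice a library implementation such as `SMOTETomek` from `imblearn.combine` would normally be used instead.

```python
import math
import random


def nearest(idx, points, candidates, k=1):
    """Return up to k indices from `candidates` closest to points[idx], excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, seed=0):
    """Add synthetic minority points by interpolating between minority neighbours
    until both classes have the same number of samples."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)                    # a random minority sample
        j = rng.choice(nearest(i, X, min_idx, k))  # one of its minority neighbours
        gap = rng.random()                         # position on the segment i -> j
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link, i.e. every pair of mutual
    nearest neighbours that carry different labels."""
    all_idx = range(len(X))
    nn = [nearest(i, X, all_idx)[0] for i in all_idx]
    drop = {i for i in all_idx if nn[nn[i]] == i and y[i] != y[nn[i]]}
    keep = [i for i in all_idx if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_tomek(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with Tomek link removal."""
    X_bal, y_bal = smote(X, y, minority, k=k, seed=seed)
    return remove_tomek_links(X_bal, y_bal)


# Hypothetical toy data: label 1 stands for the rare "flood" class.
X = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.1), (0.2, 0.9), (0.8, 0.7), (1.2, 1.0),
     (0.4, 0.5), (1.5, 0.4), (2.0, 2.0), (2.2, 1.8), (1.8, 2.3)]
y = [0] * 8 + [1] * 3

X_res, y_res = smote_tomek(X, y, minority=1)
print("before:", y.count(0), y.count(1))
print("after: ", y_res.count(0), y_res.count(1))
```

Because every Tomek link pairs one sample from each class, removing both members keeps the classes balanced after cleaning. A common variant removes only the majority-class member of each link, which would leave the minority class slightly larger.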
Item Type: | Book Section / Proceedings |
---|---|
Keywords: | Imbalanced Dataset, flood forecast, Resampling, SMOTE-Tomek |
Subjects: | Q Science > QA Mathematics > QA76 Computer software; T Technology > TK Electrical engineering. Electronics. Nuclear engineering |
Divisions: | Faculty of Engineering > Department of Electrical Engineering (20201) |
Depositing User: | faruq Amrul Faruq |
Date Deposited: | 30 Jan 2025 02:40 |
Last Modified: | 30 Jan 2025 02:40 |
URI: | https://eprints.umm.ac.id/id/eprint/14289 |