(2) Azka Khoirunnisa
*corresponding author
Abstract: The “Makan Bergizi Gratis” (MBG, Free Nutritious Meals) program is a strategic policy of the Government of Indonesia that has drawn a wide range of public opinion, especially on social media. This study classifies public sentiment toward the MBG program using ensemble machine learning and evaluates how effectively the SMOTE algorithm handles class imbalance in opinion data. The dataset was collected from X (formerly Twitter) for January–April 2025 and comprises 4,374 tweets labeled as 1,783 positive, 1,634 negative, and 957 neutral. Preprocessing includes data cleaning, normalization, stemming, and TF-IDF vectorization. Five ensemble algorithms (Random Forest, AdaBoost, Bagging, Stacking, and Voting) were tested in two scenarios: with and without SMOTE. Random Forest gave the best and most consistent performance, with its F1-score rising from 72.03% to 72.66% after SMOTE was applied. Not all models benefited, however: the F1-score of Voting dropped with SMOTE. These findings suggest that SMOTE can increase a model's sensitivity to minority classes, but that its benefit depends on the characteristics of the algorithm used. The study recommends choosing balancing methods selectively and developing more adaptive approaches for handling unstructured opinion data.
Keywords: Ensemble Learning; MBG Program; Imbalanced Data; Sentiment Analysis; SMOTE
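To make the balancing step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is an interpolation between a real sample and one of its nearest same-class neighbours. The abstract does not say which implementation or parameters the study used, so the function name `smote_oversample`, the neighbour count `k`, and the toy 2-D vectors are illustrative assumptions; only the label counts (1,783 / 1,634 / 957) come from the abstract.

```python
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Oversample every class up to the majority size (illustrative sketch).

    Each synthetic point is base + gap * (neighbour - base), where the
    neighbour is one of the k nearest same-class samples and gap is in [0, 1).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # nothing to interpolate toward
        for _ in range(target - n):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            # k nearest same-class neighbours by squared Euclidean distance
            neighbours = sorted(
                others,
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            neighbour = rng.choice(neighbours)
            gap = rng.random()
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy feature vectors standing in for TF-IDF rows (hypothetical data).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],  # positive
     [1.0, 1.1], [1.2, 0.9], [0.9, 1.0],              # negative
     [2.0, 0.0], [2.2, 0.1]]                          # neutral
y = ["positive"] * 4 + ["negative"] * 3 + ["neutral"] * 2

X_bal, y_bal = smote_oversample(X, y, k=2)
balanced_counts = Counter(y_bal)

# Synthetic samples needed to balance the label distribution in the abstract,
# assuming (hypothetically) that balancing were applied to the full dataset.
label_counts = {"positive": 1783, "negative": 1634, "neutral": 957}
majority = max(label_counts.values())
synthetic_needed = {lab: majority - n for lab, n in label_counts.items()}
print(synthetic_needed)  # {'positive': 0, 'negative': 149, 'neutral': 826}
```

In practice, oversampling is normally applied only to the training split so that synthetic points never leak into evaluation. Whether the study followed this practice is not stated in the abstract.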
DOI: https://doi.org/10.29099/ijair.v9i1.1.1495

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
________________________________________________________
The International Journal of Artificial Intelligence Research
Organized by: Prodi Teknik Informatika Fakultas Teknologi Bisnis dan Sains
Published by: Universitas Dharma Wacana
Jl. Kenanga No. 03 Mulyojati 16C Metro Barat Kota Metro Lampung
Email: jurnal.ijair@gmail.com
