Experimental of information gain and AdaBoost feature for machine learning classifier in media social data

Indonesian Journal of Electrical Engineering and Computer Science

Abstract

In this research, we apply several machine learning methods together with feature selection to social media data, namely restaurant reviews. The feature selection used is a combination of information gain (IG) and adaptive boosting (AdaBoost). The aim of this research is to observe the effect of this combination on the classification performance of machine learning methods such as Naïve Bayes (NB), K-nearest neighbor (KNN), and random forest (RF). NB is simple and efficient but very sensitive to feature selection. KNN has known weaknesses such as biased k values, overly complex computation, memory limitations, and ignoring irrelevant attributes. RF also has weaknesses, including that its evaluation values can change significantly with only small changes in the data. In text classification, feature selection can improve scalability, efficiency, and accuracy. Based on tests carried out on these machine learning methods with the combination of the two feature selection techniques, the best classifier was found to be the RF algorithm. RF shows a significant improvement after applying IG and AdaBoost: accuracy increased by 10%, precision by 12.43%, recall by 8.14%, and F1-score by 10.37%. RF also produces evenly balanced values across accuracy, precision, recall, and F1-score after applying IG and AdaBoost: an accuracy of 84.5%, a precision of 85.58%, a recall of 86.36%, and an F1-score of 85.97%.
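
To make the IG step concrete, the following is a minimal sketch of how information gain can rank terms for a binary sentiment task. It is not the authors' implementation: the toy reviews, the labels, and the names `entropy`, `information_gain`, and `top_terms` are assumptions introduced only for illustration, and the study's actual preprocessing and tooling are not described in this abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(documents, labels, term):
    """Class entropy minus the entropy remaining after splitting on term presence."""
    with_term = [y for doc, y in zip(documents, labels) if term in doc]
    without_term = [y for doc, y in zip(documents, labels) if term not in doc]
    total = len(labels)
    conditional = (
        (len(with_term) / total) * entropy(with_term)
        + (len(without_term) / total) * entropy(without_term)
    )
    return entropy(labels) - conditional


# Hypothetical toy reviews (not data from the study), each tokenized into a set of words.
reviews = [
    {"food", "delicious", "friendly"},
    {"food", "delicious", "cheap"},
    {"food", "cold", "slow"},
    {"service", "slow", "rude"},
]
sentiments = ["pos", "pos", "neg", "neg"]

# Score every term in the vocabulary and keep the two most informative ones.
vocabulary = sorted(set().union(*reviews))
scores = {term: information_gain(reviews, sentiments, term) for term in vocabulary}
top_terms = sorted(scores, key=scores.get, reverse=True)[:2]
```

In this toy setup, terms that appear in only one class receive the highest score, while a term such as "food" that appears in both classes scores lower. A classifier such as NB, KNN, or RF would then be trained on the retained terms only.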
