Performance analysis of 10 machine learning models in lung cancer prediction

Indonesian Journal of Electrical Engineering and Computer Science

Abstract

Lung cancer is among the diseases with the highest incidence and mortality worldwide. Machine learning (ML) models can play an important role in its early detection. This study aims to identify the ML algorithm that performs best in predicting lung cancer. The algorithms compared were logistic regression (LR), decision tree (DT), k-nearest neighbors (KNN), Gaussian naive Bayes (GNB), multinomial naive Bayes (MNB), support vector classifier (SVC), random forest (RF), extreme gradient boosting (XGBoost), multilayer perceptron (MLP), and gradient boosting (GB). The dataset was obtained from Kaggle and contains 309 records and 16 attributes. The study was carried out in several phases, including the description of the ML models and the analysis of the dataset. The models were then compared using the metrics of specificity, sensitivity, F1 score, accuracy, and precision. The results showed that the SVC, RF, MLP, and GB models obtained the best performance metrics, achieving 98% accuracy, 98% precision, and 98% sensitivity.
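The five comparison metrics named in the abstract can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch in Python under stated assumptions: the function name `classification_metrics`, the 0/1 label encoding, and the toy labels are hypothetical illustrations and are not taken from the study or its dataset.

```python
def classification_metrics(y_true, y_pred):
    """Compute five binary-classification metrics (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a class is absent.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)        # recall on the positive class
    specificity = ratio(tn, tn + fp)        # recall on the negative class
    precision = ratio(tp, tp + fp)
    accuracy = ratio(tp + tn, len(pairs))
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Toy labels for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a comparison like the one described, such a function would be applied to each model's predictions on the same held-out records so that the resulting values are directly comparable.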
