Stable and accurate customer churn prediction: comparative analysis of eight classification algorithms

Indonesian Journal of Electrical Engineering and Computer Science

Abstract

Predicting customer churn is a challenging problem in many subscription-based industries, yet retaining existing customers is considered more cost-effective than acquiring new ones. In this research, customer churn is predicted using a public dataset from an internet service provider, with 72,274 instances and a 55% churn rate. The main contribution is a comprehensive comparison of the stability and performance of eight classification algorithms for customer churn prediction on a large-scale public dataset. The research process includes data collection, data preprocessing, feature engineering, and model evaluation. The evaluation reports test accuracy, accuracy gap, precision, recall, F1-score, and ROC AUC, computed with stratified K-fold cross-validation. Since the proportions of churn and non-churn customers in the dataset are relatively balanced, the F1-score is used as the primary evaluation metric, as it provides a balanced assessment of precision and recall for both classes. The results show that CatBoost and XGBoost are the most effective models, achieving F1-scores of 94.97% and 94.92%, respectively.
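
As a rough illustration of the evaluation described in the abstract, the following Python sketch shows how stratified folds and some of the reported metrics could be computed. It is not the authors' code: the function names (stratified_folds, binary_metrics, accuracy_gap) are hypothetical, the toy labels and predictions are invented, and the accuracy gap is assumed here to mean training accuracy minus test accuracy, which the abstract does not define.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split indices into k folds with similar class proportions in each fold."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's indices round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the positive (churn) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def accuracy_gap(train_accuracy, test_accuracy):
    """Assumed definition: how much training accuracy exceeds test accuracy."""
    return train_accuracy - test_accuracy


if __name__ == "__main__":
    # Toy labels with a 55% churn rate (1 = churn), mirroring the dataset's balance.
    labels = [1] * 11 + [0] * 9
    for fold_number, fold in enumerate(stratified_folds(labels, k=5), start=1):
        churners = sum(labels[i] for i in fold)
        print(f"fold {fold_number}: {len(fold)} samples, {churners} churners")

    # Invented predictions, only to exercise the metric function.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]
    y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

In practice, a library routine such as scikit-learn's StratifiedKFold would typically replace the hand-written split. The pure-Python version is used here only to keep the sketch free of dependencies.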
