PeerJ Computer Science
An ensemble based approach using a combination of clustering and classification algorithms to enhance customer churn prediction in telecom industry
Syed Fakhar Bilal [1], Saba Bashir [1], Abdulaleem Ali Almazroi [2], Farhan Hassan Khan [3], Abdulwahab Ali Almazroi [4]
[1] Computer Science Department, Federal Urdu University of Arts, Science and Technology, Islamabad, Pakistan
[2] Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Rabigh, Saudi Arabia
[3] Knowledge & Data Science Research Center (KDRC), Computer Engineering Department, National University of Science and Technology, Islamabad, Pakistan
[4] University of Jeddah, College of Computing and Information Technology at Khulais, Department of Information Technology, Jeddah, Saudi Arabia
Keywords: Churn prediction; Hybrid model; Classification; Clustering; Decision support system
DOI: 10.7717/peerj-cs.854
Source: DOAJ
【 Abstract 】
Mobile communication has become a dominant medium of communication over the past two decades. New technologies and competitors are emerging rapidly, and churn prediction has become a major concern for telecom companies. A customer churn prediction model can accurately identify potential churners so that a retention solution can be offered to them. The proposed churn prediction model is a hybrid that combines clustering and classification algorithms using an ensemble. First, four clustering algorithms (K-means, K-medoids, X-means and random clustering) were evaluated individually on two churn prediction datasets. Hybrid models were then built by combining the clusters with each of seven classification algorithms, and the resulting models were evaluated using ensembles. The evaluation used two benchmark telecom datasets obtained from the GitHub and BigML platforms. The proposed model attained its highest prediction accuracy of 94.7% on the GitHub dataset and 92.43% on the BigML dataset. The proposed model was also compared with state-of-the-art churn prediction models and performed significantly better than them.
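The general idea of a clustering-plus-classification hybrid with an ensemble can be illustrated with a minimal sketch. The abstract does not say how the clusters are combined with the classifiers, so everything below is an assumption made for illustration: the K-means cluster index is appended as an extra feature, three simple stand-in classifiers (nearest class centroid, 3-nearest-neighbours and a decision stump) replace the paper's seven, the ensemble is a plain majority vote, and the toy data and all function names are hypothetical. It is not the authors' implementation and does not reproduce the reported accuracies.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def fit_nearest_centroid(X, y):
    """Predict the class whose mean feature vector is closest."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(means, key=lambda c: dist2(x, means[c]))

def fit_knn(X, y, k=3):
    """Predict the majority label among the k nearest training rows."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda r: dist2(x, r[0]))[:k]
        return Counter(t for _, t in nearest).most_common(1)[0][0]
    return predict

def fit_stump(X, y):
    """One-feature threshold rule for binary 0/1 labels."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for left, right in ((0, 1), (1, 0)):
                hits = sum((left if x[j] <= thr else right) == t
                           for x, t in zip(X, y))
                if best is None or hits > best[0]:
                    best = (hits, j, thr, left, right)
    _, j, thr, left, right = best
    return lambda x: left if x[j] <= thr else right

def fit_hybrid(X, y, k=2):
    """Assumed hybrid: cluster id as extra feature, then majority vote."""
    centroids = kmeans(X, k)

    def augment(x):
        return list(x) + [float(assign(x, centroids))]

    Xa = [augment(x) for x in X]
    models = [fit(Xa, y) for fit in (fit_nearest_centroid, fit_knn, fit_stump)]

    def predict(x):
        votes = [m(augment(x)) for m in models]
        return Counter(votes).most_common(1)[0][0]
    return predict

# Hypothetical toy data: [monthly usage, support calls]; label 1 = churner.
X = [[1.0, 4.0], [1.2, 3.8], [0.8, 4.2], [1.1, 4.5],
     [4.0, 1.0], [4.2, 0.8], [3.8, 1.2], [4.5, 1.1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

predict = fit_hybrid(X, y)
print(predict([0.9, 4.1]), predict([4.2, 0.8]))
```

With an odd number of base classifiers and binary labels, the majority vote never ties, which is why three stand-in models are used here. A real evaluation would also hold out test data rather than predicting on points close to the training set.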
【 License 】
Unknown