A Comparative Study on Customer Churn Analysis Using Machine Learning and Data Enrichment Techniques


Karaarslan H., Baştuğ M., Şen C. G., Işık E. E.

Journal of Soft Computing and Decision Analytics, vol. 2, no. 1, pp. 225-235, 2024 (Peer-Reviewed Journal)

Abstract

As online shopping grows, companies can collect more customer data. They use this data to get to know their customers better and to provide customized services. Churn analysis, which provides information about when a customer will stop shopping with the company, is one of the most essential analyses derived from this vast amount of data. In this study, we perform a churn analysis using machine learning (ML) algorithms on the customer behavior data of a fashion retail company. The analysis follows a four-stage methodology. First, we carried out data preparation and visualization, and then we built models using various ML algorithms on the baseline data. Second, we enriched the data with the RFM (Recency, Frequency, Monetary) score and repeated the analysis. Third, we used the Synthetic Minority Oversampling Technique (SMOTE) to correct the class imbalance in the data and performed parameter optimization on the algorithms using the SMOTE data. We compared the accuracy and F1 score values obtained at each stage and examined the effect of the algorithms. In the last stage, we divided the whole dataset into clusters using the k-means technique and applied the ML algorithms to the clustered data. We then compared all these results and examined the effect of segmentation. The analysis shows that the extreme gradient boosting algorithm provides better accuracy and F1 score values than the other algorithms. Using these results, the company can identify customers who are likely to churn and begin funding Customer Relationship Management (CRM) efforts. Additionally, experts can determine the company's development directions by organizing campaigns for these customers and analysing their reasons for churn in more detail.
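The abstract does not say how the RFM score was computed, so the following is only a minimal sketch of one common approach: derive recency, frequency, and monetary values per customer from a transaction log, then convert each to a rank-based score. The transaction data, the snapshot date, the three-bin scoring, and all function names are assumptions made for illustration and are not taken from the study.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2023, 1, 5), 40.0),
    ("c1", date(2023, 3, 20), 60.0),
    ("c2", date(2022, 11, 2), 25.0),
    ("c3", date(2023, 3, 28), 120.0),
    ("c3", date(2023, 2, 14), 80.0),
    ("c3", date(2023, 1, 30), 50.0),
]
snapshot = date(2023, 4, 1)  # assumed reference date for measuring recency


def rfm_table(rows, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for cid, day, amount in rows:
        last[cid] = max(last.get(cid, day), day)  # most recent purchase
        freq[cid] += 1                            # number of purchases
        money[cid] += amount                      # total spend
    return {c: ((snapshot - last[c]).days, freq[c], money[c]) for c in last}


def rank_scores(values, bins=3, reverse=False):
    """Map each customer to a score in 1..bins by rank position.

    The last-ranked customers get the highest score. Ties are broken by
    the stable sort order, which is a simplification of this sketch.
    """
    ordered = sorted(values, key=values.get, reverse=reverse)
    n = len(ordered)
    return {cid: 1 + (i * bins) // n for i, cid in enumerate(ordered)}


def rfm_scores(rows, snapshot, bins=3):
    """Return {customer: (R, F, M)} where higher scores are better."""
    table = rfm_table(rows, snapshot)
    # Recency is reversed: fewer days since the last purchase scores higher.
    r = rank_scores({c: v[0] for c, v in table.items()}, bins, reverse=True)
    f = rank_scores({c: v[1] for c, v in table.items()}, bins)
    m = rank_scores({c: v[2] for c, v in table.items()}, bins)
    return {c: (r[c], f[c], m[c]) for c in table}


scores = rfm_scores(transactions, snapshot)
```

In this sketch the resulting (R, F, M) tuples would be appended to each customer's feature row before model training, which is one way to read the data enrichment step described above.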