We helped a client in the telecommunications industry, one with a significant number of business users and group accounts, create a company churn prediction model. The model detects companies that are likely to stop using the client's services.
The input for this project was a dataset of monthly reports on customers' SIM card usage, as described below.
The output is the model (code) and a report that includes the results and a detailed description of our approach.
The number of telco businesses on the market is growing, and our client wants to remain competitive. Therefore, the client approached SmartCat to improve the existing churn model that the telco's internal team had developed. They had already tried to address the cases with the highest churn probability, with a hit rate of 5-6% on average. Our goal was to increase the accuracy by 5-10%, which the client believed was achievable.
Through our regular process, which includes regular sessions with the client, agile delivery cycles, and refinement sessions, we delivered a model that identifies the business customers who are most likely to churn. This group is crucial for sales and marketing promotions, since such actions can prevent churn, so we created a list that makes targeting much more efficient.
Our approach to this project included multiple stages, as follows:
- Phase 1: Data cleaning and validation. Exploratory data analysis.
- Phase 2: Feature extraction.
- Phase 3: Implementation and evaluation of predictive models for churn one and two months in advance.
We used historical data to analyze typical patterns, trends, and potential seasonality, computing various statistics and building visualizations along the way. We also validated and cleaned the data, since some related columns had inconsistent values in some cases. We carried out these steps in continuous communication with a dedicated person on the client's side. Before the modeling phase started, we extracted many features that were used as input to train machine learning models. Because the labels in the dataset are highly imbalanced, we measured model accuracy using precision and recall for churners and compared the results with the client's model. We also analyzed seasonalities and anomalies.
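To illustrate why precision and recall for churners are more informative than plain accuracy when labels are imbalanced, here is a minimal sketch in plain Python. The data is a made-up toy example (100 companies, 5 churners), not the client's data, and the function names `precision_recall` and `accuracy` are our own for illustration.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for the positive (churner) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (assumption): 5 churners (1) followed by 95 non-churners (0).
y_true = [1] * 5 + [0] * 95

# A trivial "model" that never predicts churn.
never_churn = [0] * 100

# A hypothetical model that flags 10 companies, 3 of them real churners.
flags_ten = [1, 1, 1, 0, 0] + [1] * 7 + [0] * 88

acc_trivial = accuracy(y_true, never_churn)
pr_trivial = precision_recall(y_true, never_churn)
acc_model = accuracy(y_true, flags_ten)
pr_model = precision_recall(y_true, flags_ten)

print("never_churn:", acc_trivial, pr_trivial)
print("flags_ten:  ", acc_model, pr_model)
```

In this toy case the trivial model scores higher on accuracy while finding no churners at all, which is why the churner-class metrics are the ones worth tracking.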
The goal of this project was to improve the accuracy of the current churn prediction algorithm, and we set our target at 15-20% or higher.
A predictive algorithm was trained on historical data and optimized toward our defined accuracy goal. Many of the features we designed from the provided data significantly improved the final accuracy. Compared with the client's baseline model, at the same recall values, our final model had 15-20% higher precision, which satisfies the customer's benchmark.
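A comparison "at the same recall" can be sketched as follows: rank companies by predicted churn score, walk down the ranking until the target share of true churners is covered, and report precision at that cutoff. This is only an illustration in plain Python; the helper `precision_at_recall`, the labels, and both score lists are hypothetical toy values, not the project's code or results, and ties between scores are not handled specially.

```python
def precision_at_recall(y_true, scores, target_recall):
    """Precision at the smallest top-k cutoff whose recall reaches target_recall.

    y_true: 0/1 labels (1 = churner); scores: higher means more likely to churn.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("y_true contains no churners")
    # Rank companies from highest to lowest predicted churn score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    for flagged, (_, label) in enumerate(ranked, start=1):
        tp += label
        if tp / total_pos >= target_recall:
            return tp / flagged
    return tp / len(ranked)


# Toy data (assumption): 10 companies, the first two are churners.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
baseline_scores = [0.90, 0.30, 0.80, 0.70, 0.60, 0.20, 0.10, 0.15, 0.05, 0.01]
candidate_scores = [0.90, 0.70, 0.80, 0.20, 0.10, 0.30, 0.12, 0.05, 0.01, 0.15]

# Compare both score lists at the same recall level (here: all churners found).
baseline_precision = precision_at_recall(labels, baseline_scores, 1.0)
candidate_precision = precision_at_recall(labels, candidate_scores, 1.0)

print("baseline precision at recall 1.0: ", baseline_precision)
print("candidate precision at recall 1.0:", candidate_precision)
```

Fixing recall first keeps the comparison fair: both models are asked to catch the same share of churners, and the one that needs to flag fewer companies to do so has the higher precision.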
“Very smart people, great company. With detailed preparation and data sharing principles in mind (GDPR and security) they helped us develop algorithms to get the probability that customers will churn.”