Risk profiling for customer churn analysis

6 min read · Aug 1, 2020

“It is not the employer who pays the wages. Employers only handle the money. It is the customer who pays the wages.” (Henry Ford)

The above message is just as relevant today as it was a century ago. In today’s world, customers are spoilt for choice and businesses have to fight harder than ever before to retain their customers.

A customer’s journey starts with acquisition. This side of the equation is usually well understood by the business, and substantial amounts of money are invested in marketing campaigns to attract new customers. This is not surprising, since more customers means more revenue. However, the story shouldn’t stop here. Even today, when a great deal of data is available about customer journeys, far less effort and money are invested in measuring churn risk.

Churn refers to customers quitting a service or no longer using a product. Addressing customer churn is just as important as acquisition, since a decrease in the number of customers leads to lost revenue. The question is, how do we know which customers are at a high risk of leaving?

Risk profiling using predictive analytics

The rise of machine learning techniques has driven a shift towards predictive business solutions in recent decades. Data is the new oil, and soon it will be impossible for businesses to compete in the market without leveraging the power of data and machine learning.

Among the many advantages of predictive analytics is the ability to take into account a wealth of customer characteristics that would be too complex to consider otherwise. Furthermore, a machine learning model can provide near real-time results for a large pool of customers in an automated way.

Machine learning is a powerful tool if we want to gain a better understanding of customer behaviour, for example the general sentiment towards a product, a customer’s next purchase, or whether they will leave a service. This allows businesses to continuously improve their service, and hence increase customer loyalty and stickiness in the long run.

In this project I will illustrate how predictive analytics can be used to profile customers based on their churn risk. The code for this project is written in Python, and it can be found on my GitHub.

Getting the dataset

The dataset used in this exercise was downloaded from Kaggle. It contains information about bank customers, such as their location, credit score, age, etc. Let’s have a look at the data.

import pandas as pd
# Load the Kaggle dataset and preview the first rows
df = pd.read_csv('Churn_Modelling.csv')
df.head()

The last column, ‘Exited’, marks the churn outcome: 0 means the customer is still with the bank, 1 means the customer has churned. This label is needed to train a machine learning model, so that new cases can be predicted in the future. In real-life examples, churn is not always clearly defined. In some cases a customer can stop using a product without notifying the service provider; online retail shopping is a good example. In such cases it’s up to the business and the data scientist to come up with a definition, which is usually based on how long the customer has gone without a purchase.
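To make the inactivity-based definition concrete, here is a minimal sketch of how a churn label could be derived from a purchase log. The data, the column names `customer_id` and `purchase_date`, the snapshot date and the 180-day window are all assumptions made for illustration; they are not part of the bank dataset used in this article.

```python
import pandas as pd

# Hypothetical purchase log: one row per purchase (made-up data).
purchases = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3],
    "purchase_date": pd.to_datetime([
        "2020-01-05", "2020-06-20", "2019-11-02", "2020-03-15", "2020-07-01",
    ]),
})

# Assumptions: the date on which churn is assessed and the inactivity window.
snapshot_date = pd.Timestamp("2020-07-31")
inactivity_window = pd.Timedelta(days=180)

# Most recent purchase per customer.
last_purchase = purchases.groupby("customer_id")["purchase_date"].max()

# A customer counts as churned if their last purchase is older than the window.
churned = ((snapshot_date - last_purchase) > inactivity_window).astype(int)
churned = churned.rename("Exited")
print(churned)
```

The right window length depends on the business: a grocery retailer and a car dealership would pick very different values.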

In order to apply machine learning to this dataset and make predictions, first we need to transform the data.

Data preparation

Categorical inputs will be converted into dummy variables via one-hot encoding. This step is necessary when dealing with string-type variables, or when the values have no real numeric relationship with each other. ‘Geography’ is a good example of this. Read more about dummy variables here.

Geography column converted into dummy variables
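As a rough sketch of this step, pandas offers `get_dummies` for one-hot encoding. The toy rows below only mimic a few columns of the dataset (the values are made up), and dropping the first category with `drop_first=True` is my assumption here, not necessarily what the original notebook does.

```python
import pandas as pd

# Toy rows mimicking a few of the bank dataset's columns (values are made up).
df_demo = pd.DataFrame({
    "CreditScore": [619, 608, 502],
    "Geography": ["France", "Spain", "Germany"],
    "Exited": [1, 0, 1],
})

# One-hot encode 'Geography'; drop_first=True drops one category
# to avoid perfectly collinear dummy columns.
df_encoded = pd.get_dummies(
    df_demo, columns=["Geography"], drop_first=True, dtype=int
)
print(df_encoded.columns.tolist())
```

With one category dropped, a row with zeros in all remaining dummy columns represents the dropped category.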

The machine learning model is trained on historical examples of customer characteristics (the input, X) and the churn outcome (the output, y). Once X and y are selected, feature scaling is applied.

Feature scaling brings the features to a similar order of magnitude; otherwise features with large values would dominate those with small values in the predictions (e.g. salary vs. credit score).

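# Select X and y (df is assumed to be fully numeric at this point:
# dummy variables created, identifier columns removed)
X = df.drop(columns=['Exited'])
y = df['Exited'].values

# Apply feature scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X = sc.fit_transform(X)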
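To see what standardisation does, here is a small sketch on made-up numbers with two features on very different scales. The array `X_demo` is purely illustrative and does not come from the dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two made-up features on very different scales: salary and credit score.
X_demo = np.array([
    [100000.0, 620.0],
    [50000.0, 700.0],
    [150000.0, 540.0],
])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_demo)

# Each column is now centred on 0 with unit standard deviation.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

After scaling, both columns vary over the same range, so neither dominates purely because of its units.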

Model training

Once the data is transformed, we are ready to train the model. In this example I will use logistic regression because the goal is to estimate churn probability and turn the predictions into categories (low risk, medium risk, high risk). Logistic regression is also easy to implement and efficient to train.

Alongside training, we use k-fold cross-validation to estimate the accuracy of the model. The dataset is split into k folds; each fold is held out once for evaluation while the model is trained on the remaining k-1 folds, and the k scores are combined to obtain a better estimate of the algorithm’s performance.

# Train the logistic regression model
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(random_state=0)
model.fit(X, y)

# Apply k-fold cross-validation
from sklearn.model_selection import cross_val_score
accuracies = cross_val_score(estimator=model, X=X, y=y, cv=10)

The mean accuracy across the 10 folds is 84%. To improve on this, one could experiment with different algorithms, or with further data wrangling to include stronger predictors of churn. These improvements are beyond the scope of this project.
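One possible refinement of the cross-validation step above is to wrap the scaler and the model in a scikit-learn pipeline, so that scaling is fitted on the training folds only and the held-out fold stays unseen. The sketch below uses synthetic data from `make_classification` as a stand-in for the bank dataset, so the printed accuracy is not comparable to the 84% reported here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the bank data (assumption: not the real dataset).
X_demo, y_demo = make_classification(
    n_samples=1000, n_features=10, random_state=0
)

# Scaling sits inside the pipeline, so it is re-fitted on each training fold.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))

# cv=10: each fold is held out once while the model trains on the other nine.
accuracies = cross_val_score(
    pipeline, X_demo, y_demo, cv=10, scoring="accuracy"
)
print(f"Mean accuracy: {accuracies.mean():.3f} (std: {accuracies.std():.3f})")
```

Reporting the standard deviation next to the mean gives a feel for how much the score varies from fold to fold.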

Assessing the risk

The trained model will be used to make churn risk predictions. In a real-life scenario we would apply this to fresh data where churn outcomes are unknown. Since we do not have fresh data, we will rely on X for illustration purposes.

y_pred_proba = model.predict_proba(X)

The risk probabilities can then be binned and converted into risk categories, as the pie chart below illustrates:

Risk profiling based on the churn probability predictions

About 65% of the customers belong to the low-risk category (<20% churn risk). This is good news: roughly two thirds of the customers have a small risk of churning. Low-risk customers don’t require special attention. Creating such risk buckets is one of the main advantages of churn risk profiling, as we can now focus on the cases that matter most.
32% of the customers belong to the medium-risk category (20%–80% churn risk). It is unclear at this stage whether these customers will churn or not, and this is the group where proactive intervention can actually make a difference.
3% of the customers are in the high-risk category (>80% churn risk). These customers have likely already churned and it might be too late to bring them back. Additional analysis might shed light on the reasons behind the high-risk results.
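The bucketing described above can be sketched with `pd.cut`, using the 20% and 80% thresholds. The probabilities below are made up for illustration; in practice they would come from the second column of `model.predict_proba(X)`, which holds the probability of class 1 (churned). The labels and the handling of values that fall exactly on a threshold are my assumptions.

```python
import numpy as np
import pandas as pd

# Made-up churn probabilities standing in for model.predict_proba(X)[:, 1],
# i.e. the column holding the probability of class 1 (churned).
churn_proba = np.array([0.05, 0.15, 0.35, 0.60, 0.85, 0.10, 0.95])

# Bin edges follow the thresholds above: up to 20% low, 20-80% medium,
# above 80% high. Intervals are closed on the right, so exactly 0.2 is "low".
risk = pd.cut(
    churn_proba,
    bins=[0.0, 0.2, 0.8, 1.0],
    labels=["low", "medium", "high"],
    include_lowest=True,
)

# Share of customers in each risk bucket.
shares = (
    pd.Series(risk)
    .value_counts(normalize=True)
    .reindex(["low", "medium", "high"])
)
print(shares)
```

The resulting shares are what a pie chart of the risk profile would display.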

The three groups would be treated differently, and it is usually up to the business to come up with a marketing strategy that accounts for the differences between them. For example, low-risk customers might be rewarded with loyalty perks, whereas medium- and high-risk customers would require more strategic targeting (discounts, customised deals, etc.). Either way, the business would have a much better understanding of customer behaviour at this stage thanks to churn risk profiling, and this can be beneficial both for the business and the customer.