Who is interested in getting your vehicle insurance?

An analysis of health insurance customer data, and a prediction of whether those customers would be interested in additional vehicle insurance.

Anastasia Syntychaki
Nov 27, 2020

Introduction

The number of companies offering insurance services is constantly increasing, making the insurance industry one of the most competitive. The large number of providers, policies and services often overwhelms potential customers.

For this reason, insurance companies often offer extended plans that include coverage for health, vehicles, property, etc. This way, customers can keep all their policies with a single provider, which in turn can increase that provider's revenue.

Photo by Ulises Baga on Unsplash

Companies interested in offering such extended packages, however, need a strategy for approaching the right group of customers. What is the chance that a given customer will be interested in an extra insurance package? By answering this question, companies can save time and optimize their business.

In this article I discuss my findings from analyzing the data of an insurance provider that wants to predict whether its medical insurance customers would be interested in vehicle insurance. To help answer this question, I performed an exploratory analysis and built a model for the predictions. More specifically, my analysis aimed to answer the following questions:

  1. At what age are customers more likely to want vehicle insurance? Is there any difference between men and women?
  2. Are there any communication channels that are more successful in convincing customers?
  3. Can we predict the probability that a customer would be interested in vehicle insurance?

1. At what age are customers more likely to want vehicle insurance? Is there any difference between men and women?

Figure 1: Profile of customers who already have vehicle insurance

Figures 1 and 2 show the gender and age profiles of customers who already have vehicle insurance and of customers who are interested in it.

Figure 2: Who is more interested in getting vehicle insurance?

The two main conclusions of this analysis are:

  1. Female and male customers behave similarly.
  2. Customers might respond quite differently depending on their age.

More specifically, customers aged 30 to 50 seem the most likely to be interested in vehicle insurance (with a probability from 15% up to a bit more than 20%). This probability drops abruptly below age 30, probably because a large share of customers under 30 already have vehicle insurance. Interest also drops for customers over 50.
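As a rough illustration of how such interest rates can be computed (this is not the code behind the figures), the sketch below groups customers by gender and age band and takes the share of positive responses in each group. The records, the field order and the band boundaries are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical toy records: (gender, age, response), where response 1 = interested.
records = [
    ("Male", 24, 0), ("Female", 27, 0), ("Male", 35, 1), ("Female", 38, 0),
    ("Male", 42, 1), ("Female", 45, 1), ("Male", 47, 0), ("Female", 33, 0),
    ("Male", 58, 0), ("Female", 63, 0), ("Male", 29, 0), ("Female", 52, 1),
]

def age_band(age):
    """Assumed bands that mirror the age ranges discussed in the text."""
    if age < 30:
        return "under 30"
    if age <= 50:
        return "30-50"
    return "over 50"

def interest_rate_by_group(rows):
    """Return {(gender, band): share of positive responses in that group}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for gender, age, response in rows:
        key = (gender, age_band(age))
        totals[key] += 1
        positives[key] += response
    return {key: positives[key] / totals[key] for key in totals}

rates = interest_rate_by_group(records)
for (gender, band), rate in sorted(rates.items()):
    print(f"{gender:6s} {band:9s} {rate:.0%}")
```

With real data, the same grouping run separately for men and women would show whether the two curves follow a similar shape across age bands.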

2. Are there any communication channels that are more successful in convincing customers?

Figure 3: Top 10 successful policy sales channels (anonymized codes)

Here I looked at the different channels used to reach out to customers, and whether any of them had a higher success rate.

This figure shows the 10 most successful policy sales channels. Out of a total of 155 different channels, only channels 123.0 and 43.0 (anonymized channel codes) had a 100% success rate in convincing customers to get vehicle insurance. Other channels in the top 9 had a maximum success rate of 33%, and the channels that followed the top 9 had a mean success rate of only 10%.

This finding indicates that it is important for companies to select the right channels for reaching out to customers, since doing so can increase the chance of a positive response.
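A minimal sketch of a per-channel success rate is shown below, assuming each record is a (channel code, response) pair. Channel 123.0 is one of the codes mentioned above; the other codes and all responses are invented for illustration.

```python
from collections import Counter

# Hypothetical toy records: (policy sales channel code, response).
contacts = [
    (123.0, 1),
    (26.0, 0), (26.0, 1), (26.0, 0), (26.0, 0),
    (152.0, 0), (152.0, 0), (152.0, 0), (152.0, 1), (152.0, 0),
    (124.0, 1), (124.0, 0), (124.0, 0),
]

def channel_success(rows):
    """Return (channel, success rate, customers contacted), best rate first."""
    contacted = Counter()
    converted = Counter()
    for channel, response in rows:
        contacted[channel] += 1
        converted[channel] += response
    table = [(ch, converted[ch] / n, n) for ch, n in contacted.items()]
    return sorted(table, key=lambda row: row[1], reverse=True)

ranking = channel_success(contacts)
for channel, rate, n in ranking[:10]:  # top 10 channels by success rate
    print(f"channel {channel}: {rate:.0%} of {n} customers")
```

Keeping the customer count next to each rate helps judge how much weight a rate deserves. In this toy data the 100% rate of channel 123.0 rests on a single customer; the article does not state how many customers each real channel reached.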

3. Can we predict the probability that a customer would be interested in vehicle insurance?

In this part I wanted to create a model that predicts whether customers would like to get vehicle insurance, given their demographic, vehicle and policy data. The model is trained on data from previous customers, including their responses, so that it can learn relationships in the data and predict the responses of new customers.
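The article does not name the model that was used, so the sketch below swaps in a small logistic regression, written in plain Python, purely to illustrate the train-then-predict idea. The three binary features (gender_female, previously_insured, vehicle_damage), the toy data and the learning-rate settings are all assumptions for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one customer: predicted probability - label.
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += error * xj
            grad_b += error
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(w, b, x):
    """Predicted probability of a positive response for one customer."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features: [gender_female, previously_insured, vehicle_damage].
X_train = [
    [0, 0, 1], [1, 0, 1], [0, 0, 1], [1, 0, 1],
    [0, 1, 0], [1, 1, 0], [0, 1, 1], [1, 1, 0],
    [0, 0, 0], [1, 0, 0],
]
# Hypothetical responses of previous customers (1 = interested).
y_train = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

w, b = fit_logistic(X_train, y_train)

# Two hypothetical new customers.
p_uninsured_damage = predict_proba(w, b, [0, 0, 1])
p_already_insured = predict_proba(w, b, [1, 1, 0])
print(f"{p_uninsured_damage:.2f} {p_already_insured:.2f}")
```

The output of such a model is a probability per customer rather than a yes/no answer, which is the form needed for ranking customers by how likely they are to respond.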

Table 1: Actual responses vs predicted probability for a positive response

This table shows the actual responses of the existing policyholders next to the model's predicted probabilities of a positive response. Given the customer data, the model predicted customer responses with a high accuracy of 95%.
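To make the accuracy figure concrete, here is a small sketch of how an accuracy can be derived from predicted probabilities. The article does not say how probabilities were turned into yes/no predictions, so the 0.5 threshold is an assumption, and the values are invented. The majority-class baseline is included only as a reference point for reading an accuracy number.

```python
# Hypothetical toy values in the shape of Table 1: actual response and
# predicted probability of a positive response.
actual = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
predicted_proba = [0.05, 0.10, 0.80, 0.20, 0.02, 0.30, 0.40, 0.15, 0.01, 0.07]

def accuracy(y_true, proba, threshold=0.5):
    """Share of customers whose thresholded prediction matches the response."""
    predicted = [1 if p >= threshold else 0 for p in proba]
    hits = sum(int(p == t) for p, t in zip(predicted, y_true))
    return hits / len(y_true)

def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most common response."""
    positives = sum(y_true)
    return max(positives, len(y_true) - positives) / len(y_true)

model_accuracy = accuracy(actual, predicted_proba)
baseline_accuracy = majority_baseline(actual)
print(f"accuracy: {model_accuracy:.0%}")
print(f"majority-class baseline: {baseline_accuracy:.0%}")
```

When most customers give the same response, the baseline is already high, so comparing the two numbers shows how much the model adds.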

Figure 4: Interest of existing customers vs. predicted interest of new customers. (Males: Gender_Female=0, Females: Gender_Female=1)

The top plot of Figure 4 shows the profiles of existing customers and their actual responses, while the bottom plot shows the profiles of new customers and their predicted responses. The trends in the predicted customer behavior are similar to those we observed previously (see part 1): customers aged between 30 and 50 are more likely to want vehicle insurance than those below 30 or above 50.

The take-home message of this analysis is that predictive models can predict customer behavior quite accurately and can help companies target the right audience.

Conclusions

In this analysis I looked into factors that can help determine whether a policyholder would be interested in additional vehicle insurance. This information can be very useful for companies that want to optimize their business model and increase their revenue by targeting the right audience.

More specifically:

  1. The analysis of the gender and age profiles of customers showed that both men and women aged between 30 and 50 are more likely to be interested in vehicle insurance.
  2. I then analyzed the customer response across the different communication channels. Channels 123.0 and 43.0 (anonymized codes) had a 100% success rate in convincing customers. The choice of channels for reaching out to customers is therefore important.
  3. Finally, I built a model to predict the response of new customers given their data (demographic, etc.). Customer response could be predicted with a high accuracy of 95%.

The code and more information on this analysis are available on my GitHub.

Anastasia Syntychaki

Physicist, Ph.D. in Biophysics. Interested in data analytics, machine learning and their applications in healthcare and the life sciences.