When it comes to marketing, reaching the right people with the right message at the right time is the ultimate goal for any data-driven business. In e-commerce, hyper-personalisation has become an expected part of the customer experience and making it easy for customers to find what they are looking for quickly is crucial to conversion.
One of the most effective ways to personalise customer experience is to forecast customer behaviour using propensity scoring. This statistical approach to data analysis predicts future actions, taking your data beyond what has happened and pushing it towards what will probably happen in the future. It accounts for the variables in your data that affect that behaviour and, when it comes to customers, enables you to serve relevant content, offers and recommendations based on your predictions.
When building a propensity model, it’s important to consider a few factors to ensure it is truly effective and works with your dataset. In marketing applications, A/B testing and experimentation allow you to validate the accuracy of propensity scores and understand how to improve the model for your specific requirements.
A great propensity model should be dynamic, retraining and continuously evolving based on the feedback loop created by the data pipeline. As new data becomes available, the model needs to change to become smarter and more accurate based on the underlying trends in the data.
A dynamic model requires a robust data pipeline to regularly ingest data, retrain, validate and deploy. For that reason, your model needs to be productionised and deliver understandable and actionable predictions into your business processes, often in real time.
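That ingest-retrain-validate-deploy loop can be sketched in a few lines. This is a minimal illustration, not a reference to any specific platform: the function name, the validation split and the AUC threshold are all illustrative assumptions, and the data here is synthetic.

```python
# A minimal sketch of the retrain -> validate -> deploy step of a
# propensity-model pipeline. Names and thresholds are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def retrain(X, y, deployed_model=None, min_auc=0.6):
    """Fit a candidate on the latest data; deploy it only if it validates."""
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    candidate = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    auc = roc_auc_score(y_val, candidate.predict_proba(X_val)[:, 1])
    # Keep the currently deployed model if the candidate does not clear the bar.
    return candidate if auc >= min_auc else deployed_model

# Synthetic "latest batch" of data standing in for the ingest step.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

model = retrain(X, y)
```

In a real pipeline this function would run on a schedule as new data lands, which is what creates the feedback loop described above.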
Your model also needs to be scalable. Rather than building a new model for each campaign or use case, an effective model should be capable of producing large volumes of predictions and also be adaptable for similar scenarios across the business.
You don’t need to be a data scientist or mathematician to use propensity scoring, but it helps to have a basic understanding of regression analysis, the core analytical process behind it.
Regression analysis is simply a predictive modelling technique that analyses the relationship between a dependent variable (e.g. average order value per customer) and independent variables, or features (e.g. product attributes). The two types of regression analysis used for propensity modelling in machine learning are linear regression and logistic regression.
Linear regression is used where the outcome is continuous, meaning there can be an infinite number of potential values. Technically, when your data involves more than one independent variable (feature), the model is a multiple linear regression.
The linear regression model is denoted by the equation y = mx + c + e, where y is the dependent variable, x is the independent variable (feature), m is the slope of the line, c is the intercept, and e represents the error in the model.
The best-fit line is determined by varying the values of m and c. The error is the difference between the observed values and the predicted values, and m and c are selected to minimise that error. Because every data point pulls on the fitted line, a simple linear regression model is susceptible to outliers, and for this reason it can be a poor choice on its own for large, noisy data volumes.
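The fit described above can be reproduced in a few lines with a least-squares solver. The data below is synthetic, with a known slope and intercept, purely for illustration:

```python
# A minimal sketch of fitting y = mx + c by ordinary least squares
# on synthetic data (true m = 2, c = 1, plus random error e).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=100)

# polyfit with deg=1 picks the m and c that minimise the squared error.
m, c = np.polyfit(x, y, deg=1)

# A single extreme outlier noticeably shifts the fitted line,
# illustrating the sensitivity mentioned above.
y_out = y.copy()
y_out[0] = 100.0
m_out, c_out = np.polyfit(x, y_out, deg=1)
```

The recovered m and c land close to the true values (2 and 1), while the outlier run produces a visibly different line from the same underlying data.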
Logistic regression is a predictive analysis algorithm based on the concept of probability, where the outcome has a limited number of potential values. This method is used when the dependent variable is discrete (i.e. individually distinct: 0 or 1, true or false, etc.) and there is little or no correlation between the independent variables in the dataset. In the binary case, the target variable can only take two values, and a sigmoid curve (a mathematical function with a characteristic ‘S’-shaped curve) describes the relation between the predicted probability and the independent variables (features).
The logit function, also referred to as log-odds, is used in logistic regression to link the target variable to the independent variables: logit(p) = ln(p / (1 − p)). The model fits this quantity as a linear function of the features, and inverting it through the sigmoid curve yields estimated probabilities between 0 and 1.
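Put together, this is all a propensity score is: the predicted probability from a logistic regression. The sketch below uses synthetic customer data; the three features (recency, order count, average order value) and the coefficients generating the target are illustrative assumptions, not real customer data.

```python
# A minimal sketch of propensity scoring with logistic regression
# on synthetic customer features (all values are illustrative).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = np.column_stack([
    rng.uniform(0, 90, 1000),    # days since last visit (recency)
    rng.poisson(3, 1000),        # number of past orders (frequency)
    rng.uniform(10, 200, 1000),  # average order value
])

# Synthetic target: recent, frequent customers convert more often.
logits = -0.05 * X[:, 0] + 0.6 * X[:, 1] - 1.0
y = (rng.uniform(size=1000) < 1 / (1 + np.exp(-logits))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Propensity scores: each customer's predicted probability of converting.
scores = model.predict_proba(X)[:, 1]
```

In practice these scores would be written back into your CRM or activation layer, so that each customer can be targeted according to their predicted likelihood to act.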
In recent years, machine learning has unlocked the potential of propensity modelling for most businesses with a data science team. But, creating an effective, scalable propensity model that includes the kind of feedback loop necessary for continuous improvement is complicated. Most CRM or marketing automation platforms will have some propensity models built-in for users but these often have shortcomings that mean the predictions they produce won’t be accurate enough to deliver real marketing ROI and uplift.
The reason for this is that most basic models will rely on a small number of features, typically limited to customer data and campaign-specific transaction history. They tend to overlook broader transaction history and activity data.
Models created by an in-house data science team might be more applicable to that specific business, but they won’t necessarily be scalable or robust enough. Similar to those in a CRM, these tend to be static, which means they don’t adapt to changes in the underlying data and, as a result, they don’t become more accurate over time.
For businesses that don’t have a data science team, innovative tools, like our own platform product, bridge the gap and allow for the implementation of propensity models (and other ML capabilities) in a user-friendly way.