In the digital world, online companies must embrace data and analytics to ensure their media spend is competitive. Whilst Google Analytics and AdWords are bread-and-butter tools for any e-commerce business, companies can also leverage their first-party data to help drive smarter bidding strategies for paid search terms on Google.
With the rise in Big Data and Machine Learning capabilities, surely there is something a company can learn from its existing customers’ behaviour to help optimise audience selection for ad bidding?
Can a company use its first-party data to identify audiences that would likely return to the site and purchase, even if a branded ad wasn’t displayed? Such returning visitors would need a high likelihood of returning and buying regardless of whether a branded ad appeared on a Google results page; perhaps they would return via a non-paid source, such as an organic link, or via email (paid for, but low cost).
This article is part of our Marketing Analytics series. The series aims to illustrate that finding smarter ways to utilise Analytics, Machine Learning and Google Cloud based solutions is about generating real and measurable business impact, whether by reducing cost or increasing revenue, as well as driving technical and process efficiencies.
MandM Direct is one of the largest online fashion retailers in the UK, with over 2 million active customers and more than 100 million visits a year. They pride themselves on offering a huge product range from over 200 of the world’s biggest brands at low prices.
With such a large number of products on offer and millions of customers, MandM Direct has enough data to identify certain types of customers, allowing them to provide more targeted and personalised user experiences. In this case, we were looking to optimise ad spend.
Through the development of a best-practice automated machine learning pipeline, we provided MandM Direct with the capability to retrain propensity models and rescore the known customer base each day. The customer scores could then be used to define propensity-based audiences to optimise Google Ads campaigns (and beyond) by reducing spend, and therefore ad exposure, on those who are likely to return and buy anyway. To find out how Datatonic delivered this, read on…
MandM Direct wanted to provide more targeted and personalised user experiences, and through this, optimise ad spend. In order to use Machine Learning successfully, first we needed to identify the attribute that we wanted to predict. In this case, the attribute should help us distinguish customers very likely to return and make a purchase in the near future from other customers.
To help MandM Direct achieve their goals, the Datatonic team developed a propensity model that assigned each customer a likelihood of returning and purchasing a product the next day. Given that the primary use of this model was optimising ad spend, we focused specifically on identifying customers likely to return and purchase through a paid ad on Google.
Propensity modelling is a popular tool in the personalisation game, particularly when a client has rich customer data. It is a neat approach for differentiating customers and identifying those who are most and least likely to make a future purchase, based on past behaviours (e.g. recent visits, historic spend, promotional email open and click rates).
The propensity model we developed for MandM Direct outputs a score between 0 and 1 representing the likelihood of a customer returning to the site via a Google Ad and making a purchase, based on a wide range of previously captured on-site behaviours. A score close to 1 indicates that the model is highly confident the customer will purchase, whilst a score close to 0 indicates the opposite. In fact, if the model is developed carefully and on appropriately sampled data, its score can be interpreted as a true probability of purchasing.
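One way to check whether scores behave like probabilities is a simple calibration table: group customers by score and compare the average score in each group with the share who actually purchased. The sketch below is illustrative only; the function name, the bin count and the sample data are our own assumptions, not part of the MandM Direct solution.

```python
from collections import defaultdict


def calibration_table(scores, purchased, n_bins=5):
    """Compare mean predicted score with observed purchase rate per bin.

    Scores are grouped into equal-width bins over [0, 1]. For a well
    calibrated model, the two values should be close in every bin.
    """
    bins = defaultdict(list)
    for score, label in zip(scores, purchased):
        # Clamp so that a score of exactly 1.0 falls in the last bin.
        idx = min(int(score * n_bins), n_bins - 1)
        bins[idx].append((score, label))

    table = []
    for idx in sorted(bins):
        rows = bins[idx]
        mean_score = sum(s for s, _ in rows) / len(rows)
        observed_rate = sum(y for _, y in rows) / len(rows)
        table.append((idx, len(rows), mean_score, observed_rate))
    return table


# Hypothetical scores and next-day purchase outcomes (1 = purchased).
scores = [0.05, 0.10, 0.15, 0.30, 0.35, 0.55, 0.65, 0.85, 0.90, 0.95]
purchased = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]

for idx, n, mean_score, observed_rate in calibration_table(scores, purchased):
    print(f"bin {idx}: n={n} mean_score={mean_score:.2f} "
          f"observed_rate={observed_rate:.2f}")
```

On real data each bin would hold many customers; large gaps between the mean score and the observed rate suggest the scores should be treated as a ranking rather than as probabilities.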
So, we’ve discussed the approach, but how can we use a propensity-to-buy score assigned to each customer to inform audience optimisation for paid ad campaigns? This requires a model that predicts the likelihood to purchase well, and a sensible measurement framework for learning from, and acting on, the results.
Ranking all known visitors by their propensity scores gives us a clear and intuitive ordering, from most to least likely to purchase. Provided that the model works well (judged by typical ranking metrics on a test dataset, such as Area Under the ROC Curve), we have a reliable ordering. We can then use propensity scores to select customer cohorts that are more likely to return and purchase in the next day (as below).
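To make the ranking idea concrete, here is a minimal sketch that computes Area Under the ROC Curve directly from its definition and then selects the top-scoring share of customers. The customer IDs, scores, outcomes and the 40% cut-off are hypothetical, and the pairwise AUC loop is written for readability rather than for scale.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen purchaser is scored
    above a randomly chosen non-purchaser (ties count as half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one purchaser and one non-purchaser")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def top_cohort(customer_scores, fraction):
    """Return the IDs of the highest-scoring `fraction` of customers."""
    ranked = sorted(customer_scores, key=customer_scores.get, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    return ranked[:cutoff]


# Hypothetical held-out scores and next-day outcomes (1 = purchased).
customer_scores = {"c1": 0.92, "c2": 0.15, "c3": 0.71, "c4": 0.40, "c5": 0.08}
outcomes = {"c1": 1, "c2": 0, "c3": 1, "c4": 0, "c5": 0}

ids = list(customer_scores)
auc = roc_auc([customer_scores[i] for i in ids], [outcomes[i] for i in ids])
high_propensity = top_cohort(customer_scores, fraction=0.4)
print(f"AUC={auc:.2f}, high-propensity cohort={high_propensity}")
```

An AUC near 0.5 means the ordering is no better than chance, in which case cohorts built from it should not be trusted.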
This process allows MandM Direct to identify the customers that are most likely to return and buy, which is the most appropriate audience to focus on for optimising short-term ad spend. Deciding which high-propensity segments to reduce spend on relies crucially on a robust test methodology that makes measurement and outcomes clear (we cover an approach later in this article). Any ad spend saved can then be re-allocated to other audiences or even other marketing channels.
A key part of this project was delivering a fully automated model training and scoring pipeline that runs each day, making best use of Google Cloud components to develop a scalable and pragmatic solution architecture. A customer’s propensity-to-buy can be extremely sensitive to time (recent on-site behaviours have a significant effect on the model). A scheduled framework for refreshing customer-level scores therefore provides a useful and powerful ongoing view of each customer’s current and changing intent to purchase.
To address this, we developed a solution architecture that leverages Google Cloud Platform’s components for controlled productionisation of a scheduled machine learning pipeline, resulting in daily retraining and scoring of the full customer base.
Each day, once the latest clickstream (Google Analytics) data is loaded into BigQuery, the full model retraining and customer scoring process is triggered, illustrated below.
We chose Google Cloud Composer, a managed version of the popular Apache Airflow workflow orchestrator, to chain each step together in an automated fashion and give MandM Direct a simple-to-use, but robust and scalable, solution. As you can see in the diagram above, once the data is ingested into BigQuery, the rest takes care of itself, so the business can focus on measuring the impact of using customer propensity scores and the Data Science team can get back to exploring new opportunities and machine learning techniques.
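The idea of chaining steps so that each runs only once its inputs are ready can be sketched in plain Python. This is not Composer or Airflow code: it uses the standard library's `graphlib` (Python 3.9+) purely to show dependency ordering, and the step names are our own hypothetical labels for the stages described above.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each step maps to the steps it depends on.
PIPELINE = {
    "build_features": {"clickstream_loaded"},
    "train_model": {"build_features"},
    "score_customers": {"train_model", "build_features"},
    "export_audiences": {"score_customers"},
}


def run_order(pipeline):
    """Return an execution order that respects every dependency."""
    return list(TopologicalSorter(pipeline).static_order())


def run_pipeline(pipeline, tasks):
    """Run each step's callable in dependency order; return what ran."""
    completed = []
    for step in run_order(pipeline):
        # Steps without a registered callable are treated as no-ops.
        tasks.get(step, lambda: None)()
        completed.append(step)
    return completed


order = run_order(PIPELINE)
print(" -> ".join(order))
```

In an orchestrator such as Airflow, each of these steps would be a task in a scheduled workflow, with the scheduler handling retries, logging and the daily trigger.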
A robust measurement framework is required to evaluate whether propensity scores can be used to successfully optimise ad spend. For any data-driven optimisation, it is crucial to carry out tests and draw conclusions from the results. In MandM Direct’s case, we wanted to evaluate whether the ad optimisation was successful.
In this case, an A/B test framework can be used to assess the value of any change to bid strategy for a segment. We were interested in understanding whether it made sense to reduce ad spend on customers with an innately high propensity to engage, independently of the specific bid strategy. First, you use the scores output by the propensity model to define target visitors with a high propensity to buy. Then, you randomly split these visitors into two groups (or into multiple segments, to generate more granular learnings and a more refined optimisation): group A is served with the current standard bid strategy, group B with heavily down-weighted (or removed) ad spend. Finally, you upload these audiences to your Google Ads campaign tool and apply the different bid strategies (for a given set of keywords, for example).
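The first two steps, selecting high-propensity visitors and splitting them at random, might look like the sketch below. The 0.8 threshold, the 50/50 split, the seed and the customer IDs are all assumptions made for illustration; uploading the resulting audiences to Google Ads is not shown.

```python
import random


def select_high_propensity(customer_scores, threshold):
    """Keep customers whose propensity score meets the threshold."""
    return [cid for cid, score in customer_scores.items() if score >= threshold]


def split_audience(customer_ids, treatment_share=0.5, seed=42):
    """Randomly split customers into group A (current bid strategy) and
    group B (down-weighted or removed ad spend)."""
    rng = random.Random(seed)        # fixed seed keeps the split reproducible
    shuffled = sorted(customer_ids)  # sort first so input order does not matter
    rng.shuffle(shuffled)
    cutoff = int(len(shuffled) * treatment_share)
    return {"A": shuffled[cutoff:], "B": shuffled[:cutoff]}


# Hypothetical daily scores for twelve known customers.
customer_scores = {f"c{i}": i / 12 for i in range(1, 13)}

target = select_high_propensity(customer_scores, threshold=0.8)
groups = split_audience(target)
print(f"group A: {groups['A']}")
print(f"group B: {groups['B']}")
```

Randomising within the high-propensity segment, rather than comparing it with other customers, is what lets any difference in revenue be attributed to the change in bid strategy.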
At the end of the outcome period, if groups A and B return the same revenue per customer (or group B shows only a non-significant fall in revenue), then MandM Direct can remove ad spend for this high propensity-to-engage segment and thus maximise ROI. If successful, the business has smartly reduced ad cost for this customer audience without compromising revenue.
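One way to judge whether a difference in revenue per customer is significant is a permutation test, sketched below. This is our own illustrative choice of test; the article does not state which test was used for MandM Direct, and the revenue figures, permutation count and seed are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(revenue_a, revenue_b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in mean revenue per
    customer between group A and group B."""
    rng = random.Random(seed)
    observed = abs(mean(revenue_a) - mean(revenue_b))
    pooled = list(revenue_a) + list(revenue_b)
    n_a = len(revenue_a)

    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction avoids reporting a p-value of exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical revenue per customer over the outcome period (0 = no purchase).
revenue_a = [0, 0, 35, 0, 60, 0, 0, 42, 0, 18]
revenue_b = [0, 28, 0, 0, 55, 0, 0, 40, 0, 0]

p_value = permutation_p_value(revenue_a, revenue_b)
print(f"mean A={mean(revenue_a):.2f}, mean B={mean(revenue_b):.2f}, p={p_value:.3f}")
```

A large p-value on a small sample says little either way, so the outcome period and group sizes need to be large enough for a meaningful fall in revenue to be detectable.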
Whilst we have focused on applying our solution to optimising Google Ads spend, the delivered end-to-end methodology is completely transferable to other marketing opportunities for MandM Direct, who now have the right knowledge, tools and code base to extend Machine Learning to exciting new opportunities.
If you are a digital business interested in using your data to optimise your paid media marketing spend, or more generally interested in how we at Datatonic can help you unlock value from your customer data with the power of Google Cloud Platform, do not hesitate to contact us!