A/B Testing for Machine Learning

A/B testing is an optimisation technique often used to understand how an altered variable affects audience or user engagement. It’s a common method used in marketing, web design, product development, and user experience design to improve campaigns and goal conversion rates. Machine learning is the training of models which learn the relationship between input and output data. Machine learning models can then be used to complete complex tasks such as classifying or identifying data or predicting trends.  

Although A/B testing and machine learning have different use cases, they can complement and enhance one another. A/B testing can be a useful technique when optimising machine learning algorithms, whether testing new models or gaining insight into training data. In addition, machine learning algorithms can be used to automate the A/B testing process, making experimentation more efficient than manual processes.

This guide explores A/B testing for machine learning, the differences between the two, and how A/B testing can enhance the machine learning process. 

What is A/B testing?

A/B testing is a type of split testing, commonly used to drive improvements to specific variables or elements by measuring user or audience engagement. The approach is often used to optimise marketing campaigns or digital assets like websites. In A/B testing, a specific variable is altered, such as a title, image, or element layout. A sample of the audience is shown the control version and the altered version in a 50/50 split: half of the traffic interacts with the original version, and the other half interacts with the newer version. Engagement, or the completion of a defined goal, is the metric compared between the versions after a set period of time.

Both versions in an A/B test run in tandem for a specific period of time, and the audience sample for each variation is randomised. An example would be an A/B test for a marketing campaign. An element such as banner size, button colour, header image or title text is altered. The control or original version and the newly altered version are then deployed in tandem to an audience sample over a set period of time.

The resulting engagement metrics identify which version the user prefers, so the campaign can be refined and improved for best effect. The preferred version is then implemented, with the aim of improving engagement with future users. The metric is usually the conversion rate of a defined business goal, such as the portion of users that click on a button or make a purchase.   
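
As a minimal sketch of how the conversion rates of two versions might be compared, the snippet below applies a two-proportion z-test, one common choice that this guide does not itself prescribe. The function name and the visitor and conversion counts are hypothetical, and the normal approximation assumes reasonably large samples.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two_sided_p_value) for the conversion rate of B minus A."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("conversion rate is 0 or 1 in both groups; test undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 5,000 users saw each version.
    z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference in conversion rate is unlikely to be down to chance alone. The significance threshold is a decision for the team running the experiment, ideally fixed before the test starts.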

A/B testing is an important method for optimising products or digital assets like websites. Web designers may use A/B testing to refine web page layouts, as two versions of a webpage can be compared on user engagement. Experiments are regularly performed so that improvements can be continuously implemented. Because users are randomly assigned to each version, A/B testing can be used to evaluate decisions on evidence rather than bias or assumptions. The deciding factor is user preference in a dynamic environment. A/B testing works best with audience segmentation, so that user insights can be refined even more.

A/B testing can be used to: 

  • Refine marketing campaign messaging and design. 
  • Improve conversion rates through enhancements to user experience. 
  • Continuously optimise assets like web pages by considering user engagement. 

A/B testing vs machine learning

Machine learning is when a system learns the relationship between input and output data without direct human control. The process is used to create algorithms and models that make sense of data without the model being directly written by a human programmer. These algorithms can be used to make decisions, classify data, or perform complex tasks. 

A/B testing is the split testing of an altered variable, with success usually measured by user engagement or conversions. The overall aim of A/B testing and machine learning is therefore very different. A major difference in approach is that machine learning models are usually developed in an offline environment before being deployed to live, dynamic data. In comparison, A/B testing is performed on live or online data.  

Most machine learning models are trained in an offline or local environment on training data. This means that, once deployed, models will often experience a form of machine learning drift such as concept drift or covariate drift. The model may become less accurate or efficient over time because the data in the dynamic, live environment has shifted or evolved away from the original training data. A way around this is the regular retraining or refitting of models when machine learning drift is detected, keeping models accurate to emerging data.
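
One way to flag covariate drift on a single numeric feature is the population stability index (PSI), sketched below. This is an illustrative technique rather than one the guide names. The function name, the bin count, and the `eps` smoothing value are assumptions, and any alert threshold would need tuning for the use case.

```python
import math


def population_stability_index(reference, live, bins=10, eps=1e-6):
    """Compare the live distribution of a feature with its training distribution.

    Bins are equal-width and derived from the reference (training) data.
    Returns 0.0 for identical distributions; larger values mean more shift.
    """
    low, high = min(reference), max(reference)
    width = (high - low) / bins or 1.0  # avoid zero width for constant data

    def bin_shares(values):
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            index = min(max(index, 0), bins - 1)  # clamp out-of-range live values
            counts[index] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(count / len(values), eps) for count in counts]

    ref_shares = bin_shares(reference)
    live_shares = bin_shares(live)
    return sum(
        (cur - ref) * math.log(cur / ref)
        for ref, cur in zip(ref_shares, live_shares)
    )


if __name__ == "__main__":
    training = [i / 100 for i in range(1000)]   # hypothetical training feature
    shifted = [value + 3 for value in training]  # hypothetical drifted live data
    print(population_stability_index(training, training))
    print(population_stability_index(training, shifted))
```

In practice a check like this would run on a schedule against recent live data, with a rising score prompting a closer look or a retraining job.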

A/B testing is used to gain insight into live or online data. For example an A/B test might be used to refine an email campaign using a sample of the audience data before the main campaign is sent. A/B testing is used to measure effectiveness of a change or alteration using audience feedback and engagement. 

Another major difference is the number of variables. Machine learning models process and map many different variables within datasets. The aim is to learn the multivariate functions between input and output data. In contrast, A/B testing deals with individual variables, and is used to understand the effects of often very small changes to these variables.

Using A/B testing with machine learning

Although A/B testing and machine learning have different aims and approaches, there is a range of ways in which the two can be combined to good effect. A/B testing as an optimisation technique can be used to refine and inform the development and deployment of machine learning models. Machine learning methods can also be used to automate and improve the process of A/B testing.

A/B testing and machine learning can be combined to:

  • Test and refine the deployment of new machine learning models. 
  • Automate A/B testing to make the process more efficient and effective. 
  • Discover useful information about datasets and variables when developing or aligning algorithms.  

A/B testing for machine learning deployment

The technique of A/B testing can be used to test and improve machine learning models. The technique can be used to decide whether a new model is an improvement over a current model. The organisation should choose a metric to compare the control model with the new model. This metric will be used to measure success, and outline the difference between the two deployments. The two models will need to be simultaneously deployed on a sample of data for a defined period of time. Half of the users will be interacting with the control model, and the other half will be interacting with the new model. 
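
The 50/50 split between the control model and the new model could be implemented in several ways. The sketch below shows one common pattern, deterministic hashing of a user identifier, so that the same user always reaches the same model for the duration of the experiment. The function name, experiment name, and variant labels are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment="model-ab-test", treatment_share=0.5):
    """Deterministically map a user to the control model or the new model."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Use the first 32 bits of the hash as a roughly uniform number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "new_model" if bucket < treatment_share else "control_model"


if __name__ == "__main__":
    for user in ("user-1", "user-2", "user-3"):
        print(user, assign_variant(user))
```

Including the experiment name in the hash key means a later experiment reshuffles users, so the same group is not always exposed to the experimental variant.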

A/B testing for machine learning models is a useful experiment to gauge user preference in a dynamic environment. However, there are limitations to the technique to consider. With 50 percent of users exposed to the control version and the other half to a newer version, half of users are presented with a less-than-optimal option during experimentation. The overall preference of the audience can also be relatively close: even if the majority of the audience prefers option B, 40 percent may still prefer option A.

Another common approach to evaluating machine learning models is the multi-armed bandit (MAB) test, which addresses this particular issue with A/B tests. The approach is similar to A/B testing, in that a MAB test explores the performance of different variations of a machine learning model in a live environment. However, MAB testing isn’t purely a 50/50 split over an experiment phase as in A/B testing. Instead, MAB experiments are adaptive and will dynamically favour the best performing iteration of the model. Whereas A/B testing will present half of user traffic with each model, MAB experiments dynamically shift traffic in light of user preference.
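
To illustrate how a MAB experiment shifts traffic, here is a minimal sketch using Thompson sampling, one of several bandit strategies (epsilon-greedy and UCB are others). The class name, model labels, and conversion rates are hypothetical, and the simulation assumes a simple converted or not-converted outcome per user.

```python
import random


class ThompsonSamplingRouter:
    """Route each request to a model, favouring the one that converts best."""

    def __init__(self, arms, seed=None):
        self._rng = random.Random(seed)
        self.successes = {arm: 0 for arm in arms}
        self.failures = {arm: 0 for arm in arms}

    def choose(self):
        # Draw a plausible conversion rate for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior) and pick the arm with the highest draw.
        draws = {
            arm: self._rng.betavariate(self.successes[arm] + 1, self.failures[arm] + 1)
            for arm in self.successes
        }
        return max(draws, key=draws.get)

    def update(self, arm, converted):
        if converted:
            self.successes[arm] += 1
        else:
            self.failures[arm] += 1


def simulate(true_rates, rounds=5000, seed=0):
    """Simulate user traffic and return how often each arm was served."""
    outcome_rng = random.Random(seed)
    router = ThompsonSamplingRouter(list(true_rates), seed=seed + 1)
    pulls = {arm: 0 for arm in true_rates}
    for _ in range(rounds):
        arm = router.choose()
        pulls[arm] += 1
        router.update(arm, outcome_rng.random() < true_rates[arm])
    return pulls


if __name__ == "__main__":
    # Hypothetical true conversion rates, unknown to the router.
    print(simulate({"control_model": 0.05, "new_model": 0.15}))
```

Early on both models receive traffic while the router is uncertain. As evidence accumulates, most requests go to the better performer, which limits how many users see the weaker option.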

Automating A/B testing with machine learning

Machine learning as a tool can power and enhance A/B testing beyond manual experimentation. To be effective, A/B testing should be regularly performed so that optimisation is continuous. This can be labour- and resource-intensive. Machine learning can be used to automate A/B testing, lowering the resource cost of ongoing or multiple experiments. Machine learning algorithms help to streamline the A/B testing process, with many off-the-shelf products available.

Machine learning models are also very useful for clustering audiences into different segments. Different types of machine learning models have different real-world applications, but grouping data into clusters of similar records is a common use case. These models can be used to segment audience data into similar groups across a range of dimensions. These segments can then be used to perform focused A/B testing on more granular audience groupings.
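
As a rough illustration of audience segmentation, the sketch below implements a small k-means clustering routine in plain Python. K-means is only one possible clustering method, and the deterministic farthest-first initialisation is a simplification chosen to keep the example reproducible. The feature values are hypothetical, and a production system would more likely use an established library and scaled features.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centroids(points, k):
    """Pick the first point, then repeatedly the point farthest from those chosen."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=20):
    """Return (labels, centroids) after a fixed number of k-means iterations."""
    centroids = farthest_first_centroids(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical features per user: (visits per week, average order value).
    users = [(1, 1), (1.2, 0.8), (0.9, 1.1), (10, 10), (10.5, 9.5), (9.8, 10.2)]
    labels, centroids = kmeans(users, k=2)
    print(labels)
    print(centroids)
```

Each resulting label can then serve as a segment within which a separate A/B test is run or analysed.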

Discovering information about datasets with A/B testing

Machine learning models are developed to understand and process multiple variables, mapping the relationship between input and output data. A/B tests are very useful in identifying causal relationships within datasets, understanding how changes in a single variable may affect an outcome. Existing A/B testing results can therefore be used by data scientists during the development or refinement phase of the machine learning model process. A/B testing results provide clear insight from the dynamic live environment, so are incredibly useful when considering machine learning models. Data scientists can use this information to refit models to meet a specific use case. 

Deploying machine learning models in your organisation

Seldon moves machine learning from POC to production to scale, reducing time-to-value so models can get to work up to 85% quicker. In this rapidly changing environment, Seldon can give you the edge you need to supercharge your performance.

With Seldon Deploy, your business can efficiently manage and monitor machine learning, minimise risk, and understand how machine learning models impact decisions and business processes. This means you know your team has done its due diligence in creating a more equitable system while boosting performance.

Deploy machine learning in your organisation effectively and efficiently. Talk to our team about machine learning solutions today.
