Advancing MLOps Capabilities with Babyshop

Client: Babyshop

Tech stack: Google Cloud

Solution: MLOps

Service: AI + Machine Learning

Babyshop is a world-leading online retailer of children’s clothing. Its data science team plays a key role in delivering a high degree of personalisation to customers with cutting-edge machine learning models. The team is now taking its next step towards AI automation to enable rapid iterative experimentation, reduce time to production, and minimise the burden of repetitive manual processes. In a single-day workshop, Datatonic helped Babyshop enhance its MLOps capabilities by improving a single machine learning pipeline on Kubeflow Pipelines.

Our impact

  • Enhanced an existing machine learning training pipeline with a focus on reusability, robustness, and rapid experimentation
  • Used Kubeflow Pipelines to train models at scale on AI Platform, including hyperparameter tuning
  • Shared best practices for developing machine learning pipeline code on Kubeflow Pipelines

 

The challenge

Developing and deploying AI models is a complex, iterative process. While the freedom to experiment is vital in any data science workflow, engineering precision is required to get a stable model out into production.

Successful AI automation revolves around balancing these often competing needs: efficient, reproducible, and transparent experimentation that allows rapid R&D iterations, and rigour in deploying and monitoring models. The field of MLOps has developed in recent years in response to these challenges (for more information about AI automation with MLOps, check out our recent blog).

The data science team at Babyshop is an early adopter of Kubeflow, the open-source machine learning platform originally developed at Google. Several aspects of Kubeflow appealed to Babyshop, and the team wanted to take full advantage of them in its day-to-day work:

  1. Agility, to accelerate the journey from model ideation to deployment and business impact
  2. Automation, to free up data scientists for the modelling work they’d rather be doing
  3. Consistency and repeatability
  4. Performance and scalability
  5. Visibility, through a centralised place to collect, visualise, and compare information about data, models, and experiments

Babyshop was interested in best practices around Kubeflow and how to use Kubeflow on Google Cloud for its training pipelines.

“We received an MLOps workshop from the team at Datatonic. The workshop was built around our own ML models and challenges and therefore was easily applied in our daily operations. Datatonic are true experts in this field and the workshop was leading to valuable insights in the way of working Kubeflow Pipelines and AI Platform for multiple projects that Babyshop is currently implementing.” – David Feldell, Data Science Lead, Babyshop

 

Our solution

To help address these challenges, Datatonic and Babyshop joined forces in a single-day workshop to restructure a training pipeline. During this workshop, we focused on concrete improvements to the pipeline:

  1. Increasing the potential reusability of pipeline components by encouraging a modular approach to pipeline development
  2. Allowing for step-specific hardware optimisation in order to decrease costs and training times
  3. Generating informative visualisations to better track experiment results
  4. Leveraging AI Platform for distributed model training and hyperparameter tuning
  5. Encouraging best practices for pipeline code
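
To make points 2 and 4 more concrete, here is a minimal sketch of the kind of request a training step might send to AI Platform Training. It is not Babyshop's actual configuration. The helper name `build_training_input`, the machine type, the region, the metric tag `val_auc`, and the tuned parameters are all illustrative assumptions, and a real job request would also need fields such as the training package and Python module, which are left out here.

```python
import json


def build_training_input(machine_type, max_trials, max_parallel_trials):
    """Build the `trainingInput` part of an AI Platform Training job request.

    Hypothetical helper for illustration only; all values are assumptions.
    """
    return {
        # CUSTOM scale tier lets this pipeline step pick its own hardware,
        # independently of cheaper steps such as preprocessing.
        "scaleTier": "CUSTOM",
        "masterType": machine_type,
        "region": "europe-west1",  # assumed region
        "hyperparameters": {
            "goal": "MAXIMIZE",
            # Assumed name of the metric reported by the training code.
            "hyperparameterMetricTag": "val_auc",
            "maxTrials": max_trials,
            "maxParallelTrials": max_parallel_trials,
            "params": [
                {
                    # Assumed parameter, searched on a log scale.
                    "parameterName": "learning_rate",
                    "type": "DOUBLE",
                    "minValue": 1e-4,
                    "maxValue": 1e-1,
                    "scaleType": "UNIT_LOG_SCALE",
                },
                {
                    # Assumed parameter with a fixed set of candidates.
                    "parameterName": "batch_size",
                    "type": "DISCRETE",
                    "discreteValues": [64, 128, 256],
                },
            ],
        },
    }


training_input = build_training_input("n1-highmem-8", 20, 4)
print(json.dumps(training_input, indent=2))
```

Keeping the hardware and tuning settings in one small function like this means a pipeline step can be reused with a different machine type or trial budget without touching the training code itself.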

At Datatonic, we emphasise knowledge sharing with our clients. With these best practices and transferable lessons, the data science team at Babyshop is well placed to continue delivering innovative solutions that enhance the unique shopping experience its customers have come to expect.

The team is now able to take advantage of the big data and machine learning capabilities of Google Cloud, limit costs, facilitate development and experimentation, and accelerate the path to production.