Advancing MLOps Capabilities with Babyshop

Babyshop is a world-leading online retailer for children’s clothing. Their data science team play a key role in providing a high degree of personalization to customers with their cutting-edge machine learning models. The team is now taking its next step towards AI automation to enable rapid iterative experimentation, reduce time to production and minimise the burden of repetitive manual processes. In a single-day workshop, Datatonic helped Babyshop enhance their MLOps capabilities by focusing on improving a single machine learning pipeline on Kubeflow Pipelines.

Our Impact

Enhanced an existing machine learning training pipeline with a focus on reusability, robustness, and rapid experimentation

Used Kubeflow Pipelines to train models at scale on AI Platform with hyperparameter tuning

Shared best practices for developing machine learning pipeline code on Kubeflow Pipelines

The Challenge

Developing and deploying AI models is a complex, iterative process. While the freedom to experiment is vital in any data science workflow, engineering precision is required to get a stable model out into production.

Successful AI automation revolves around addressing two often competing needs: efficient, reproducible, and transparent experimentation that allows for rapid R&D iterations; and rigor in deploying and monitoring models. The burgeoning field of MLOps has developed in recent years in response to these challenges (for more information about AI automation with MLOps, check out our recent blog).

The data science team at Babyshop is an early adopter of Kubeflow, the open-source machine learning platform from Google. There are several aspects of Kubeflow that appealed to Babyshop and that they wanted to take full advantage of in their day-to-day work:

  1. Agility to accelerate the journey from model ideation to deployment and business impact
  2. Automation to free up data scientists to do the fun modelling work they’d rather be doing 
  3. Consistency and repeatability 
  4. Performance and scalability 
  5. Visibility by offering a centralized place to collect, visualize, and compare information about data, models, and experiments

Babyshop were interested in best practices around Kubeflow and how to use Kubeflow on GCP for their training pipelines.

We received an MLOps workshop from the team at Datatonic. The workshop was built around our own ML models and challenges and therefore was easily applied in our daily operations. Datatonic are true experts in this field and the workshop led to valuable insights into ways of working with Kubeflow Pipelines and AI Platform for multiple projects that Babyshop is currently implementing.

David Feldell, Data Science Lead - Babyshop Group

Solution

To help address these challenges, Datatonic and Babyshop joined forces in a single-day workshop to restructure a training pipeline. During this workshop, we focused on concrete improvements to the pipeline:

  1. Increasing the potential reusability of pipeline components by encouraging a modular approach to pipeline development
  2. Allowing for step-specific hardware optimization in order to decrease costs and training times
  3. Generating informative visualizations to better track experiment results
  4. Leveraging AI Platform for distributed model training and hyperparameter tuning
  5. Encouraging best practices for pipeline code

At Datatonic we emphasize knowledge sharing with our clients. With these best practices and transferable lessons, the data science team at Babyshop is well situated to continue delivering innovative solutions that enhance the unique shopping experience their customers have come to expect.

They’re now able to take advantage of the big data and machine learning capabilities of Google Cloud, limit costs, facilitate development and experimentation, and accelerate the path to production.
