11th May 2022 – Datatonic announced today that they have open-sourced their MLOps Turbo Templates, co-developed with Google Cloud’s Vertex Pipelines Product Team, to help data teams kickstart their Vertex AI MLOps initiatives.
As businesses continue to look to big data to remain competitive, the push for data science teams to leverage the power of Machine Learning (ML) is ever-growing. However, there are industry-wide challenges with achieving optimal business value from ML, and the companies seeing the biggest bottom-line impact from AI adoption engage in both core and advanced AI best practices, including Machine Learning Operations (MLOps) and cloud technologies (The State of AI in 2021, McKinsey).
Building out your MLOps solution can prove tricky, though. Cutting-edge tools, such as Google Cloud’s Vertex AI, provide an excellent way to simplify MLOps for data scientists and ML engineers so that businesses can accelerate the time to value for ML initiatives. To help data teams kickstart their Vertex AI MLOps initiatives even faster, Datatonic has been working closely with Google Cloud’s Vertex Pipelines Product Team to develop an open-source, production-ready MLOps solution: that is how MLOps Turbo Templates were born.
Just as DevOps benefits software development, MLOps is a set of practices developed to benefit the development of ML systems. MLOps considers every stage of the ML lifecycle, from building, deploying, and serving to monitoring ML models, helping businesses get models to production faster and with higher success rates through the right platform, processes, and people.
Executed well, these capabilities combine to manage the additional complexity introduced when managing ML systems and reduce model performance degradation. This decreases overhead and operating costs for the business, while enabling the use of advanced analytics, ML-powered decisioning, and unlocking new revenue streams.
Google Cloud has been a pioneer in building state-of-the-art MLOps tools and solutions for nearly a decade. Their latest offering, Vertex AI, helps teams build, deploy, and scale ML models faster, with pre-trained and custom tooling in a unified artificial intelligence platform designed to satisfy the varied needs of data science teams and other ML practitioners. This ecosystem of products is accessible to teams across the business that need to support ML and MLOps solutions. Depending on its main priorities, a business might want to consider infrastructure-as-code, CI/CD pipelines, continuous training pipelines, and batch or online prediction services, as well as other capabilities. The exact approach depends on the organisation’s requirements and the needs of the ML use case.
The MLOps Turbo Templates are an open-source codebase that provide a template, or reference implementation, of the end-to-end ML lifecycle. Co-developed by Datatonic and Google Cloud’s Vertex Pipelines Product Team to help businesses kickstart their MLOps initiatives, they aim to enable both Data Scientists and ML Engineers to use Google Cloud Platform (GCP) better and faster.
With more clients deploying ML models to production than ever before, our teams realised the power of MLOps solutions for maximising the value of clients’ ML models early on. Our MLOps solution for Sky’s Content team, for example, has helped them reduce time to production by 4-5x, and Delivery Hero’s MLOps Platform allows their teams to seamlessly leverage MLOps best practices. The MLOps Turbo Templates combine Datatonic’s experience implementing MLOps solutions with Google Cloud’s technical excellence to help reduce technical debt and adoption challenges that many businesses may face.
These templates can be used to create a production-ready MLOps solution on Google Cloud, covering the key components of an ML system in production, including:
The diagrams below give an indication of the process flow for two primary processes in the ML lifecycle: model training and model scoring. The training pipeline shows an automated pipeline for ingesting data, validating it against a schema, and then training an ML model. This incorporates a champion-challenger model evaluation strategy to ensure the best-performing model is always promoted to production. The prediction pipeline shows how the prediction data can be monitored for training-serving skew before predictions are computed using the previously trained model.
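To make the two ideas above concrete, here is a minimal sketch of a champion-challenger promotion check and a crude training-serving skew signal. The function names, metric values, and the z-score-based skew measure are illustrative assumptions for this sketch, not the templates’ actual API.

```python
# Illustrative sketch only: names, metrics, and the skew measure are
# assumptions, not the MLOps Turbo Templates' actual implementation.
from statistics import mean, stdev


def should_promote(champion_metric: float, challenger_metric: float,
                   higher_is_better: bool = True) -> bool:
    """Champion-challenger: promote the newly trained (challenger) model
    only if it beats the current production (champion) model."""
    if higher_is_better:
        return challenger_metric > champion_metric
    return challenger_metric < champion_metric


def skew_score(training_values: list[float], serving_values: list[float]) -> float:
    """Crude training-serving skew signal for one numeric feature: the
    z-score of the shift in the serving mean relative to the training
    distribution. A large score suggests the serving data has drifted."""
    mu, sigma = mean(training_values), stdev(training_values)
    if sigma == 0:
        return 0.0
    return abs(mean(serving_values) - mu) / sigma


# Example: the challenger improves the evaluation metric, so it is promoted.
promote = should_promote(champion_metric=0.81, challenger_metric=0.84)

# Example: serving data has shifted upwards relative to the training data,
# producing a large skew score that could trigger an alert or retraining.
drift = skew_score([1.0, 2.0, 3.0, 2.0, 2.0], [4.0, 5.0, 4.5, 5.5, 4.0])
```

In a real Vertex AI pipeline, these checks would run as pipeline components, with the promotion decision gating model deployment and the skew check running before batch prediction.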
The template utilises GCP tools and products throughout, with the following component breakdown:
The MLOps Turbo Templates are now available in Google Cloud’s open-source repository here, including all the documentation teams need to fork the templates and start on their own use cases.