Vertex AI Assessment: Choosing MLOps Tooling
Introduction
In recent years, many companies have looked to AI and Machine Learning as key technologies to drive growth by creating new revenue streams, reducing operating costs, or mitigating risks. However, leveraging these technologies can be difficult and costly without the right tools, best practices, or expertise.
Fortunately, many tools have emerged in recent years to address these challenges, such as Google Cloud’s Vertex AI. These powerful ecosystems enable companies to harness the full potential of Data Science (DS) and Machine Learning (ML), streamline their Machine Learning operations (MLOps), and create incremental value from ML. But with so many options available, it can be overwhelming for companies to determine which tool best fits their specific needs.
As Google Cloud’s Machine Learning Partner of the Year, Datatonic has deep expertise in Vertex AI and Google Cloud. In this blog, we will explore how Datatonic leverages its expertise to advise clients on how to use Vertex AI to onboard strategic ML use cases over a two-week-long engagement.
First, we will discuss how our Vertex AI Assessment package introduces Google Cloud + Vertex AI. Then, we will look at how we support companies in getting started with Vertex AI and help them assess its suitability for their ML use cases. Finally, we will examine a client case study and how our expertise enabled an informed decision between AWS SageMaker and Vertex AI for the client’s ML platform.
Vertex AI Assessment – Enablement + Hands-on Workshops
Datatonic’s Vertex AI Assessment is designed to provide businesses with a detailed view of Vertex AI and to help them to assess whether or not it is the right fit for their organisation. The engagement primarily covers how Data Scientists and Engineers can use Vertex AI tools to leverage their data and produce state-of-the-art Machine Learning systems. In addition to this, clients are also shown how ML systems can be deployed, tracked and managed using Vertex AI Pipelines.
Workshops run over two weeks and include the initial set-up of Vertex AI infrastructure, a series of deep dive discussions on a curated list of Vertex AI services, and hands-on time for the client to try out Vertex AI. Data Scientists and Machine Learning Engineers from Datatonic are also available throughout the two weeks to support clients as they explore the tools.
Upon completion of the workshops, Datatonic provides expert advice on the suitability of Vertex AI for the client’s needs and assists them in comparing it to any alternative tools they may be considering.
Case Study
Client Background
The business provides a B2B sales platform that helps companies find their next sales opportunities by matching clients with more relevant customers, allowing for more business to be completed with fewer interactions. The platform accesses high-dimensional databases and harnesses the power of AI and Machine Learning methods to identify sales leads efficiently and reliably.
Client Challenge
Our client has a talented and advanced Data Science research team creating innovative solutions to give the business an edge over its competitors. However, with a hybrid, cross-cloud set-up, the client lacked a standardised process for taking a Data Science solution to production using cloud technologies. As a result, despite developing Artificial Intelligence and Machine Learning models to improve and automate decision-making, there was still a large manual overhead. These are common challenges among businesses that are looking to adopt, or have already adopted, Machine Learning.
A direct consequence of these challenges is a slow time to production and low ROI from ML initiatives. Similarly, the difficulty of monitoring and maintaining these models in production means that their quality may degrade over time. For these reasons, the client was keen to build a platform that could cater for its Data Science and MLOps needs, allowing for a more efficient, reliable and standardised way of productionising cutting-edge Machine Learning solutions and keeping it ahead of the competition.
While Vertex AI on Google Cloud was identified as a platform where the value of Data Science work could be realised, the client also considered Amazon Web Services’ SageMaker as an alternative. Datatonic provided a comprehensive evaluation of the Vertex AI platform to allow the client to make an informed decision on how best to continue its AI practice.
Our Solution
Google Cloud and Vertex AI provide tooling for the entire lifecycle of a Machine Learning solution, from data ingestion, exploration and model development all the way to Machine Learning pipelines and model deployment. As the client already had strong Data Science expertise, emphasis was placed on how a Data Scientist would use Google Cloud on a day-to-day basis and leverage the relevant services to improve and automate their workflow. With this focus on improving existing workflows, deep dives into Vertex AI Pipelines and MLOps were also at the forefront of the engagement.
A series of workshops was delivered on the relevant Google Cloud offerings, including deep dives into the services, tutorials and hands-on lab activities. The first group of workshops, “An Introduction to Data Science on Google Cloud”, emphasised how Data Scientists could leverage Google Cloud solutions to reduce the manual overhead in their existing workflows. Following Google Cloud’s recommended Data Science workflow, pre-packaged solutions were explored first, followed by no-code and low-code solutions, and then more advanced custom training options. Beyond modelling capabilities, the workshops also covered other Vertex AI services, including Managed Notebooks, Managed Datasets and Matching Engine.
Getting Started
Initially, the client explored the various methods for ingesting and storing data in Google Cloud, and the tools and databases available depending on the data type and use case requirements. This involved covering best practices for ETL on Google Cloud and how to automate data processing, so that Data Scientists can spend less time on this aspect and more time generating insights and delivering value from Machine Learning.
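To make the ETL discussion concrete, below is a minimal sketch of loading a raw CSV file from Cloud Storage into BigQuery using the BigQuery Python client library. The project, bucket and table names are hypothetical placeholders for illustration only.

```python
from google.cloud import bigquery

# Hypothetical project, bucket and table names for illustration only.
client = bigquery.Client(project="my-project")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # infer the table schema from the file
)

# Load a CSV file from Cloud Storage into a BigQuery table.
load_job = client.load_table_from_uri(
    "gs://my-bucket/raw/sales_leads.csv",
    "my-project.sales.leads_raw",
    job_config=job_config,
)
load_job.result()  # wait for the load job to complete

table = client.get_table("my-project.sales.leads_raw")
print(f"Loaded {table.num_rows} rows")
```

In practice, a load like this can be scheduled or triggered automatically, which is one way of removing the manual data-preparation overhead described above.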
Vertex AI Workbench
With the data transformed and in place, the client’s Data Scientists were eager to get hands-on with it. Before the engagement, much of the Data Scientists’ work was done locally, so we were keen to show the power and flexibility of Vertex AI Workbench.
Vertex AI Workbench is a Jupyter notebook-based development environment that integrates seamlessly with Vertex AI and other Google Cloud services, giving full control to your Data Scientists. It is ideal for end-to-end Data Science workflows: data preparation, exploratory data analysis, custom model training and much more. Within Vertex AI Workbench, there are two options for development: Managed Notebooks and User-Managed Notebooks.
Managed Notebooks are Google-managed environments that integrate seamlessly with BigQuery and Google Cloud Storage, meaning that Data Scientists do not need to leave the JupyterLab environment to view and explore their data. Managed Notebooks also give you control over the hardware and framework of your instance: from within the notebook interface you can change the number of CPUs and GPUs and the amount of RAM your code runs on, without having to restart the instance. This allows for rapid experimentation and the flexibility to handle varying data and processing demands.
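As a small illustration of that integration, the sketch below queries a BigQuery table directly from a Workbench notebook into a pandas DataFrame using the BigQuery Python client; the project and table names are assumed for this example.

```python
from google.cloud import bigquery

# From inside a Workbench notebook, pull a sample of a BigQuery table
# straight into a pandas DataFrame (table name is hypothetical).
client = bigquery.Client(project="my-project")

df = client.query(
    """
    SELECT company_id, description, label
    FROM `my-project.sales.leads_raw`
    LIMIT 1000
    """
).to_dataframe()

df.head()
```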
User-Managed Notebooks are ideal for Data Scientists who require more rigorous control over their environment. These notebooks are Deep Learning Virtual Machine (VM) instances, for which you specify the machine type and framework. Notebooks can be configured with CPUs or GPUs and come with a suite of pre-installed deep learning packages supporting TensorFlow, PyTorch, scikit-learn, NLTK and more. You can further customise your notebook by installing additional packages from within the instance itself, giving you full control of your environment.
The client was particularly drawn to the customisability of the notebooks and the ability to rapidly scale to the desired machine type and frameworks to cater for varying processing and data requirements, something that heavily held them back on local machines. One of the client’s major pain points was the manual overhead of their existing workflow, so the scheduled orchestration of notebook runs was incredibly attractive: model retraining and experimentation could be automated. The built-in integrations with BigQuery and Git were also appealing, as these connectors help reduce development time.
No-Code and Low-Code Capabilities
On the Machine Learning modelling side, the first workshops revolved around pre-trained models and off-the-shelf APIs that can reduce the manual workload for Data Scientists whilst being highly performant on generic use cases. As the client regularly uses unstructured text data in their Machine Learning solutions, the Cloud Natural Language APIs were explored, with particular emphasis on state-of-the-art content classification and entity analysis capabilities.
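The sketch below shows roughly how the off-the-shelf Cloud Natural Language API can be called from Python for content classification and entity analysis; the sample text is invented purely for illustration.

```python
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

# Invented sample text for illustration.
document = language_v1.Document(
    content=(
        "Acme Corp announced a new partnership with a major retailer to expand "
        "its e-commerce logistics network across Europe, starting with "
        "distribution hubs in London and Berlin."
    ),
    type_=language_v1.Document.Type.PLAIN_TEXT,
)

# Off-the-shelf content classification: no model training required.
for category in client.classify_text(document=document).categories:
    print(category.name, round(category.confidence, 2))

# Entity analysis: extract organisations, locations, etc. from the same text.
for entity in client.analyze_entities(document=document).entities:
    print(entity.name, language_v1.Entity.Type(entity.type_).name)
```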
With an advanced in-house Data Science research team, the client was also eager to learn how they could use Vertex AI for custom NLP use cases with sector-specific content. The client was impressed with how straightforward it was to create a labelled Managed Dataset for text classification, and with how quickly an AutoML model could be trained without writing any code.
AutoML Natural Language provides a simple interface for training powerful language models without having to configure distributed training strategies or worry about machine types and other hardware considerations. With Large Language Models (LLMs) very much the trend, a service that allows for rapid training and high model performance is becoming increasingly popular amongst forward-thinking businesses.
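As a rough illustration of this workflow with the Vertex AI Python SDK, the sketch below creates a text classification Managed Dataset from a labelled JSONL file and launches an AutoML training job. The project, region, bucket path and display names are assumptions for this example, not the client’s actual set-up.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="europe-west2")

# Create a Managed Dataset for single-label text classification.
# The JSONL import file path is a hypothetical placeholder.
dataset = aiplatform.TextDataset.create(
    display_name="sector-descriptions",
    gcs_source="gs://my-bucket/labelled/descriptions.jsonl",
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.single_label_classification,
)

# Train an AutoML text classification model; no model code or hardware
# configuration is needed.
job = aiplatform.AutoMLTextTrainingJob(
    display_name="sector-classifier",
    prediction_type="classification",
    multi_label=False,
)

model = job.run(
    dataset=dataset,
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
)
```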
The client was also keen to get hands-on with the data they had stored in BigQuery. The first port of call was BigQuery ML (BQML), where models can be trained from the BigQuery UI using SQL. Together with the client, we trialled a range of the classification algorithms that BQML accommodates on their pre-embedded text dataset. The client was impressed with the performance of BQML and with how rapidly it provided a challenging baseline for their custom models to beat.
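A minimal sketch of such a baseline is shown below, training and evaluating a BQML logistic regression classifier by submitting SQL from Python; the dataset, table and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Train a baseline logistic regression classifier directly in BigQuery.
# Dataset, table, column and model names are hypothetical.
client.query(
    """
    CREATE OR REPLACE MODEL `my-project.sales.lead_classifier`
    OPTIONS (
      model_type = 'LOGISTIC_REG',
      input_label_cols = ['label']
    ) AS
    SELECT * EXCEPT(company_id)
    FROM `my-project.sales.lead_features`
    """
).result()

# Evaluate the trained model on its automatically held-out evaluation split.
for row in client.query(
    "SELECT * FROM ML.EVALUATE(MODEL `my-project.sales.lead_classifier`)"
).result():
    print(dict(row))
```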
Vertex Explainable AI integrates easily with BQML to provide a better understanding of model decision-making, at both a global and a local level. The client was impressed that they could analyse the reasoning behind model predictions on an example-by-example basis, which is becoming ever more important for the wide-scale adoption of AI.
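Continuing the hypothetical example above, the sketch below uses ML.EXPLAIN_PREDICT to return the top feature attributions for each individual prediction.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Explainability in BQML: top feature attributions per prediction.
# Model and table names follow the earlier hypothetical example.
sql = """
SELECT *
FROM ML.EXPLAIN_PREDICT(
  MODEL `my-project.sales.lead_classifier`,
  (SELECT * FROM `my-project.sales.new_leads`),
  STRUCT(3 AS top_k_features)
)
"""
for row in client.query(sql).result():
    print(dict(row))
```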
We also explored hyperparameter tuning via the Vertex AI Vizier tuning algorithm to improve model performance. BQML is becoming an increasingly popular choice amongst clients due to its ever-expanding model options and clever integrations such as hyperparameter tuning and Explainable AI, whilst also being a managed service: Google takes care of the compute resources, so the user does not need to worry about the hardware or machine types used in training.
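As a rough sketch of what BQML hyperparameter tuning looks like, the example below re-trains the hypothetical classifier with Vizier-driven trials over the regularisation parameters; the search ranges, number of trials and objective are illustrative choices.

```python
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# Re-train the hypothetical baseline with hyperparameter tuning.
# NUM_TRIALS enables tuning and Vizier searches l1_reg/l2_reg over the ranges.
client.query(
    """
    CREATE OR REPLACE MODEL `my-project.sales.lead_classifier_tuned`
    OPTIONS (
      model_type = 'LOGISTIC_REG',
      input_label_cols = ['label'],
      num_trials = 20,
      hparam_tuning_algorithm = 'VIZIER_DEFAULT',
      hparam_tuning_objectives = ['ROC_AUC'],
      l1_reg = HPARAM_RANGE(0, 1),
      l2_reg = HPARAM_RANGE(0, 1)
    ) AS
    SELECT * EXCEPT(company_id)
    FROM `my-project.sales.lead_features`
    """
).result()
```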
Vertex AI Matching Engine
The client was particularly interested in vector similarity matching at low latency, so we introduced them to Vertex AI Matching Engine: a high-scale, low-latency vector database that allows embeddings to be queried against indexes containing billions of embedding vectors to retrieve the most similar items. The service is ideal for recommendation engines, text similarity, image search and ad-targeting systems, and it is built on the same technology used by YouTube and Google Search.
By deploying the client’s pre-embedded data to an index, we were able to showcase the rapid querying capabilities of Matching Engine, retrieving the closest embeddings at incredibly low latency. This unique offering of Vertex AI further added to the client’s confidence that Google Cloud was the right platform for them.
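For a flavour of how this looks with the Vertex AI Python SDK, the sketch below builds a Tree-AH index from pre-computed embeddings in Cloud Storage, deploys it to an endpoint and queries it for nearest neighbours. All names, the embedding dimension and the Cloud Storage path are assumptions, and a reasonably recent SDK version is assumed for the public endpoint option.

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="europe-west2")

# Build an approximate nearest neighbour (Tree-AH) index from pre-computed
# embeddings stored as JSON files in Cloud Storage (path is hypothetical).
index = aiplatform.MatchingEngineIndex.create_tree_ah_index(
    display_name="lead-embeddings",
    contents_delta_uri="gs://my-bucket/embeddings/",
    dimensions=256,
    approximate_neighbors_count=50,
)

# Deploy the index to an endpoint so it can be queried online.
endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name="lead-embeddings-endpoint",
    public_endpoint_enabled=True,
)
endpoint.deploy_index(index=index, deployed_index_id="lead_embeddings_v1")

# Query with a single embedding vector and retrieve its ten nearest neighbours.
query_embedding = [0.0] * 256  # placeholder; use a real embedding in practice
response = endpoint.find_neighbors(
    deployed_index_id="lead_embeddings_v1",
    queries=[query_embedding],
    num_neighbors=10,
)
for neighbor in response[0]:
    print(neighbor.id, neighbor.distance)
```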
Conclusion
Using our MLOps Tooling Evaluation Framework, the client performed a like-for-like comparison between Google Cloud’s Vertex AI and AWS SageMaker and decided to move its Machine Learning capabilities to Google Cloud. Key factors in the decision were how Vertex AI could reduce pain points such as manual overhead, and the variety of powerful, flexible services it puts at Data Scientists’ fingertips compared with an on-premises set-up. The client is now rolling out its first use case on Vertex AI, drawing on the MLOps sessions we delivered and using Vertex Pipelines and our MLOps Turbo Templates to accelerate the productionisation of its Machine Learning models.
Datatonic is Google Cloud’s Machine Learning Partner of the Year with a wealth of experience developing and deploying impactful Machine Learning models and MLOps Platform builds. Need help developing an ML model, or deploying your Machine Learning models fast?
Have a look at our MLOps 101 webinar, where our experts talk you through how to get started with Machine Learning at scale, or get in touch to discuss your ML or MLOps requirements!