TonicTalk with Sky: Your Questions Answered!

How do organisations make personalisation a reality? Why is realising you’re at “Level 0” Tactical AI (i.e. highly manual and disjointed ML processes and models) a good thing? And how are solid data foundations and MLOps best practices the key pillars of your personalisation success?

At our February TonicTalk, Datatonic had the pleasure of hosting Hamish Neugebauer and Aidan Dunlop from the leading media company Sky to discuss these burning questions, which Data Scientists and Machine Learning Engineers often ask when scaling their personalisation efforts.

Joined by our own specialists, Thomas Gaddy and Jamie Curtis, the speakers were inundated with questions, ranging from advice on how to get started with MLOps, to perspectives on how to evolve from “tactical” to “transformational” stages of AI.  

We simply didn’t have the time to answer all of the audience’s questions, so we’ve summarised the responses from Datatonic and Sky in this blog. If you’re also attempting to scale your ML models in an efficient and cost-effective way, you have no doubt faced similar challenges or asked yourself a similar set of questions!

 

How to Get Started on Your MLOps Journey

Q: Do you have any process-specific hacks or steps to get started on the MLOps journey? How do you prioritise the requirements? How do you decide on which aspects to focus on first? 

Datatonic: The most important prerequisites are to invest in the right people and in your data infrastructure. Assuming you have these in place and are already running some proof-of-concept (POC) data science projects, the first step is simply to identify all the bottlenecks in the process. Guides such as the Google AI Adoption Framework are a good place to start. Are all processes manual and disjointed? Are there no training pipelines in place? Once you have identified these pain points at “Level 0” Tactical AI, they become easier to prioritise, because you can see what is slowing you down. This allows a more structured approach and makes the journey less intimidating, as it is broken into smaller steps.

Live Poll Results: By far, the biggest “Level 0 problem” was Highly Manual Processes. But as our speakers reiterated, it is a good thing to clearly identify your MLOps gaps!

 

How to Move from Tactical to Transformational AI

Q: At any point during your attempt to move from level 0 (tactical AI), did you find that you had chosen the wrong tool, and did that bring about any major problems in moving away from that tool? 

Sky: So far, we have been lucky, though we have had to pivot on some details. One example was keeping our serving and training clusters separate. It made sense to us, but the maintenance burden probably isn’t worth the benefit, so we moved both into one Kubernetes (K8s) cluster. We can change this later, which is great, but we are starting with as simple a setup and as low a support overhead as we can. The main takeaway for us was to think ahead and choose the flexibility that brings value in our ecosystem and environment within Sky.

Q: How do you prevent scope creep and keep it manageable? 

Sky: We see the inertia of tweaking and developing new ML capabilities as “low enough” that scope creep is something we can welcome if it will ultimately improve the customer’s experience. The key tenet is democratising ML/AI and empowering Data Science experts to use their knowledge with as little overhead and support as possible. In addition, getting the cross-functional groups working together as early as possible means that you can set expectations early. If, for instance, a data silo will prevent the solution from ever really working in production, you can find out early and work out what to do together, keeping stakeholders informed.

Live Poll Results: For our audience, the top three challenges to delivering hyper-personalised experiences to customers were: (1) Data Silos, (2) Data Quality, (3) Business Readiness.

Q: There’s a balance between managed services and building in-house: while the former is faster, the latter can often be more advantageous further down the line. What are your thoughts here?

Datatonic: This is a great question, and there isn’t a single solution. The appropriate balance depends heavily on the needs and capabilities of your organisation. In general, we like technologies such as TFX and Kubeflow Pipelines because they integrate very nicely with managed services on Google Cloud Platform, but are also open source and can easily be used and extended in-house.

 

MLOps Tools and Data

Q: How do you track drift closer to real time once the model is deployed into production?

Datatonic: One way to think of drift detection is just as a statistical problem: are these samples drawn from the same distribution? In general, we’d want to find the appropriate balance between the following: 

  • False alarm rate: what are the implications if we erroneously detect drift?
  • Misdetection rate: what are the implications if we fail to detect drift?
  • Detection delay: what is the optimal delay between when distributions start to shift and when an alert fires?

With that in mind, it is important to note that if we are tracking drift in near real time, we are probably basing our decisions on a very small amount of data and therefore risk more false alarms. Alternatively, if we wait to collect a lot of data in order to be more confident that the data has in fact shifted, we risk a large detection delay. That said, we can use similar techniques in both scenarios.
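To make the trade-off concrete, here is a minimal sketch of drift detection framed as a two-sample test. It is not taken from the talk: it uses a hand-rolled two-sample Kolmogorov-Smirnov statistic with the standard asymptotic critical value, and the function names, the `alpha` parameter, and the toy data are all assumptions chosen for illustration.

```python
import bisect
import math


def ks_statistic(reference, window):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    win = sorted(window)
    gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in ref + win:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_win = bisect.bisect_right(win, x) / len(win)
        gap = max(gap, abs(cdf_ref - cdf_win))
    return gap


def ks_threshold(n, m, alpha=0.05):
    """Asymptotic critical value; a smaller alpha means fewer false alarms."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


def drift_detected(reference, window, alpha=0.05):
    """Flag drift when the observed gap exceeds the critical value."""
    gap = ks_statistic(reference, window)
    return gap > ks_threshold(len(reference), len(window), alpha)


# Toy data (assumed): training-time feature values vs. a recent serving window.
reference = [i / 1000 for i in range(1000)]
recent = [0.5 + i / 400 for i in range(200)]
print(drift_detected(reference, recent))
```

In this sketch, `alpha` controls the false alarm rate, while the size of the serving window controls the detection delay: `ks_threshold` grows as the window shrinks, so a small near-real-time window needs a much larger shift before it fires, and lowering that bar raises the false alarm rate.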

Here are a few interesting papers which present different approaches to the general problem of detecting drift: 

Q: What do you use for experiment tracking? Comet, W&B or Neptune? TensorBoard? DVC? How do you reason when choosing? And where, more specifically, do these tests sit?

Sky: In our space, we couldn’t find anything off the shelf that was going to let us connect this all up. We are also developing a platform in our space that we hope to bolt onto the ML project work. We have tests that sit in our training pipeline codebase, that we run as we make changes to the code, and we have tests that sit inside the TFX pipeline itself – so if there’s a problem with the data input, the pipeline will fail and we can go in and fix it, before we cause a problem in production. 
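The idea of a pipeline step that fails on bad input data can be sketched in plain Python. This is not Sky's implementation and not TFX's API (TFX provides its own data validation components); the schema, field names, and `validate_batch` function below are hypothetical stand-ins that only show the "fail early, before production" pattern.

```python
from dataclasses import dataclass
from typing import Optional


class DataValidationError(Exception):
    """Raised to fail the pipeline step before bad data reaches training."""


@dataclass(frozen=True)
class FieldRule:
    types: tuple                       # accepted Python types for the field
    min_value: Optional[float] = None  # inclusive lower bound, if any
    max_value: Optional[float] = None  # inclusive upper bound, if any


# Hypothetical schema, for illustration only.
SCHEMA = {
    "user_id": FieldRule(types=(str,)),
    "watch_minutes": FieldRule(types=(int, float), min_value=0),
}


def validate_batch(rows, schema):
    """Return the rows unchanged, or raise on the first schema violation."""
    for index, row in enumerate(rows):
        missing = sorted(set(schema) - set(row))
        if missing:
            raise DataValidationError(f"row {index}: missing fields {missing}")
        for name, rule in schema.items():
            value = row[name]
            if not isinstance(value, rule.types):
                raise DataValidationError(
                    f"row {index}: {name!r} has type {type(value).__name__}"
                )
            if rule.min_value is not None and value < rule.min_value:
                raise DataValidationError(f"row {index}: {name!r} below minimum")
            if rule.max_value is not None and value > rule.max_value:
                raise DataValidationError(f"row {index}: {name!r} above maximum")
    return rows
```

Placing a check like this as the first step of a training pipeline means a malformed batch stops the run with a clear error, rather than silently producing a model trained on bad data.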

If you missed out on the event, you can watch the TonicTalk on-demand here.
