
Drifting Away: Testing ML Models in Production | Databricks Lakehouse Monitoring

3 min read · Feb 6, 2024

Hi all,

“Deploying machine learning models has become a relatively frictionless process. However, properly deploying a model with a robust testing and monitoring framework is a vastly more complex task. There is no one-size-fits-all solution when it comes to productionizing ML models, oftentimes requiring custom implementations utilising multiple libraries and tools. There are however, a set of core statistical tests and metrics one should have in place to detect phenomena such as data and concept drift to prevent models from becoming unknowingly stale and detrimental to the business.

Combining our experiences from working with Databricks customers, we do a deep dive on how to test your ML models in production using open source tools such as MLflow, SciPy and statsmodels. You will come away from this talk armed with knowledge of the key tenets for testing both model and data validity in production, along with a generalizable demo which uses MLflow to assist with the reproducibility of this process.” (Source: Databricks)
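To make the idea of a "core statistical test" for data drift a bit more concrete, here is a minimal sketch using SciPy's two-sample Kolmogorov-Smirnov test on a single numeric feature. It is not taken from the Databricks talk or demo: the helper name `detect_numeric_drift`, the `alpha` threshold of 0.05, and the synthetic data are all assumptions made purely for illustration.

```python
import numpy as np
from scipy import stats


def detect_numeric_drift(reference, current, alpha=0.05):
    """Compare two samples of one numeric feature with a two-sample KS test.

    Hypothetical helper for illustration only. `alpha` is an assumed
    significance level, not a recommended default.
    """
    result = stats.ks_2samp(reference, current)
    return {
        "statistic": float(result.statistic),
        "p_value": float(result.pvalue),
        # A small p-value suggests the two samples come from different distributions.
        "drift_detected": bool(result.pvalue < alpha),
    }


# Synthetic data standing in for a training-time sample and a production sample.
rng = np.random.default_rng(42)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)
shifted = rng.normal(loc=0.5, scale=1.0, size=5_000)  # mean shifted by 0.5

same_sample_report = detect_numeric_drift(reference, reference)
shifted_report = detect_numeric_drift(reference, shifted)
print(shifted_report)
```

In practice you would likely run a check like this per feature, in which case a multiple-comparison correction (for example Bonferroni) is worth considering, and categorical features would need a different test such as chi-squared.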

Types of Drift in Machine Learning


Written by Caio Moreno
