Running machine learning (ML) experiments is becoming an increasingly simple task. ML experimentation focuses on data preparation, algorithm selection and tuning, and model validation and verification. With the progress made in tooling and packages designed to automate many of these steps, obtaining a good fit to the available data is something anyone can do with their eyes shut. Well, maybe not shut, since you need to see and click the “auto train” button; but you get the point.
Yet many organizations fail to put their ML models into production. This is a nightmare scenario for managers who have spent thousands of dollars staffing, researching, and developing machine learning models. These models often work perfectly well in a test case run by a data scientist on a laptop, but they fail when deployed in the field.
Operationalization aims to streamline the process of productionizing models. It focuses on the processes used to deploy models and on the subsequent consumption and monitoring of resilient, efficient, and measurable services. This is most often the hardest part of a machine learning project, and it is where most companies fail to deliver value from the money spent on data science initiatives.
More broadly, reliable operationalization of ML is part of machine learning operations (MLOps), an engineering discipline that aims to make building, versioning, deploying, monitoring, and updating ML models fast, reliable, and reproducible. As the figure below shows, machine learning projects have traditionally followed a linear pipeline: they start by defining objectives and move through a series of experiments to fine-tune the algorithms, and the exercise concludes by putting the models into production, albeit without a clearly defined and automated mechanism for making future adjustments. A modern MLOps pipeline resolves this issue by providing a way to monitor the system and make changes in the data collection, experimentation, and operationalization stages. For instance, this helps address issues such as concept drift, where the performance of a machine learning model deteriorates because the statistical properties of the incoming data, and the relationships between input and output parameters, change over time.
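To make drift detection concrete, here is a minimal, illustrative sketch of one simple check: comparing the mean of an incoming feature batch against the distribution seen at training time. The function names and the z-score threshold are assumptions for illustration only; production systems typically use richer statistical tests across many features.

```python
import statistics

def mean_shift_score(reference, current):
    """Standardized shift of the current batch mean relative to the
    reference (training-time) data for a single numeric feature."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    cur_mean = statistics.mean(current)
    # z-statistic of the current batch mean under the reference distribution
    return (cur_mean - ref_mean) / (ref_std / len(current) ** 0.5)

def has_drifted(reference, current, threshold=3.0):
    """Flag drift when the batch mean moves more than `threshold`
    standard errors away from the reference mean."""
    return abs(mean_shift_score(reference, current)) > threshold

# Example: a batch whose values shifted upward by 2 units is flagged.
reference = [float(i % 10) for i in range(100)]
shifted = [x + 2.0 for x in reference]
print(has_drifted(reference, reference), has_drifted(reference, shifted))
# prints: False True
```

When such a check fires, the natural response is exactly what the MLOps loop provides: feed the signal back into the data collection and experimentation stages.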
MLOps adds value by systematically organizing the machine learning lifecycle and reducing risks for AI-driven enterprises. It facilitates collaboration between data scientists by providing a unified story for model versioning and control. It also tracks the performance of models after deployment, alerting data scientists and assisting them in making the necessary adjustments.
At Maillance, we hold this concept near and dear to our hearts.
Thanks to Kubernetes, our containerized workflows run at scale. Once you deploy your models to production, a unique, automated system starts to monitor the performance of the deployed models against new data. With its built-in intelligence, it detects many common problems, including concept drift, and sends alerts to data scientists. The system can also take action automatically, making corrections by running a new set of experiments and redeploying the models. All this happens without interrupting your operations, while you enjoy your coffee.
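The monitor-detect-retrain-redeploy cycle described above can be sketched as a simple control loop. This is not Maillance's actual implementation; the callbacks (`has_drifted`, `retrain`, `deploy`) are hypothetical placeholders for the real monitoring, experimentation, and deployment services.

```python
def monitor_and_remediate(model, batches, has_drifted, retrain, deploy):
    """Watch incoming data batches; when drift is detected, retrain on
    the fresh data and redeploy, without stopping the serving loop."""
    for batch in batches:
        if has_drifted(model, batch):
            model = retrain(batch)  # run a new round of experiments
            deploy(model)           # swap the fresh model into production
    return model

# Toy demo with stub callbacks: drift on the second batch triggers one redeploy.
deployed = []
model = {"version": 1}
final = monitor_and_remediate(
    model,
    ["ok", "drift", "ok"],
    has_drifted=lambda m, b: b == "drift",
    retrain=lambda b: {"version": 2},
    deploy=lambda m: deployed.append(m["version"]),
)
print(final["version"], deployed)  # prints: 2 [2]
```

In a real deployment the loop would consume a stream rather than a list, and `deploy` would perform a rolling update so that serving continues uninterrupted.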