What is MLOps?

I recently started a new job at a Machine Learning startup. I’ve given up trying to explain what I do to non-technical friends and family (my mum still just tells people I work with computers). For those of you who at least understand that “AI” is just an overused marketing term for Machine Learning, I can break it down for you using the latest buzzword in the field: MLOps.

The term “MLOps” (a compound of Machine Learning and Operations) refers to the practice of deploying, managing and monitoring machine learning models in production. It takes best practices from the field of DevOps and applies them to the unique challenges that arise when running machine learning systems in production.

Search interest for “MLOps” over the past 5 years, Google Trends

The term is relatively new, and its usage has grown rapidly over the last year as a direct result of a maturing Machine Learning landscape. As businesses get good at collecting data and at designing and training ML models, their focus shifts towards integrating those models into their software estates. This brings all sorts of new challenges around infrastructure, scalability, performance and monitoring that most data science teams are not traditionally equipped to deal with.

One approach is to segregate duties between Data Science and DevOps like so:
– Data Science design, build and evaluate the models
– DevOps deploy, monitor and manage the models

This seems like a good idea at first, but we only need to ask a few questions to see where we might struggle:

  • When do we retrain a model and deploy a new version?
  • What are the expected input/output formats of the model? Do we need to validate them?
  • Can the model performance be optimised by using a GPU?
  • How do we continually test models?

Answering any of these questions requires knowledge of both the model itself and the complex environment it’s deployed in.
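
To make the input/output question a bit more concrete, here is a minimal sketch of what validation around a deployed model might look like. Everything in it is an assumption made up for illustration: the feature names (`age`, `income`), their ranges, and the idea that the model returns a probability-like score. The point is only that the schema comes from whoever built the model, while the check itself runs in the serving environment.

```python
# Hypothetical schema: feature names and allowed ranges are invented for
# illustration. In practice they would come from the model's training data.
EXPECTED_FEATURES = {
    "age": (0.0, 120.0),
    "income": (0.0, 1e7),
}


def validate_input(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    for name, (low, high) in EXPECTED_FEATURES.items():
        if name not in payload:
            errors.append(f"missing feature: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name} must be numeric")
        elif not low <= value <= high:
            errors.append(f"{name} out of range [{low}, {high}]")
    # Flag anything the model was never trained to expect.
    for name in sorted(set(payload) - set(EXPECTED_FEATURES)):
        errors.append(f"unexpected feature: {name}")
    return errors


def validate_output(score) -> bool:
    """Assumes the model emits a probability-like float in [0, 1]."""
    return isinstance(score, float) and 0.0 <= score <= 1.0
```

A serving layer could call `validate_input` before the model and `validate_output` after it, rejecting or logging anything that fails. Deciding what counts as "out of range" is a Data Science question; deciding what to do with a rejected request is a DevOps one.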

The reality is that the whole lifecycle of an ML system is tightly coupled and highly iterative in nature. Production ML is hard and requires expertise in Data Engineering, Data Science and DevOps. The umbrella term “MLOps” provides an easy way to refer to the techniques, tools and skilled engineers who inhabit the growing space between these disciplines.

Mandatory Venn Diagram, Wikipedia

So is MLOps just another buzzword? Absolutely! But for now it’s the best we’ve got and it serves an important purpose.