Let us begin by becoming acquainted with the term “DevOps”. DevOps is a set of software engineering best practices that makes it possible to ship software to a production environment within minutes and to keep the application running reliably once it is there. In other words, DevOps is a software engineering practice that unifies software DEVelopment and software OPerationS.
DevOps = Software Development + Software Operations
Here is where the issue lies. When building traditional software applications, most of the DevOps team’s concern is with the code. Machine Learning applications, on the other hand, involve data as well as code; this is, in effect, the fundamental difference between a Machine Learning application and a traditional software application.
The final machine learning model deployed into production encompasses an algorithm that has been fitted to a large set of data (better known as the training data), and it is this data that determines the model’s behavior in a production environment. The model’s behavior also depends on the input data it receives at inference time, which we have no way of knowing in advance.
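To make this concrete, here is a minimal sketch (not from the original text, using scikit-learn and hypothetical synthetic data): the training code is identical in both cases, yet the resulting models can behave differently in production simply because they were fitted on different data.

```python
# Minimal sketch: same code, different data, different model behavior.
# The datasets below are synthetic and purely illustrative.
from sklearn.linear_model import LogisticRegression
import numpy as np

def train(X, y):
    # Identical "code" in both cases: same algorithm, same hyperparameters.
    return LogisticRegression().fit(X, y)

rng = np.random.default_rng(0)

# Two hypothetical snapshots of "real-world" data drawn from different distributions.
X_old = rng.normal(loc=0.0, size=(200, 2))
y_old = (X_old[:, 0] > 0).astype(int)

X_new = rng.normal(loc=1.5, size=(200, 2))
y_new = (X_new[:, 1] > 1.5).astype(int)

model_old = train(X_old, y_old)
model_new = train(X_new, y_new)

# Same input at inference time, potentially different predictions,
# even though not a single line of code has changed.
sample = np.array([[0.5, 0.5]])
print(model_old.predict(sample), model_new.predict(sample))
```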
Essentially, the code in both traditional software applications and machine learning applications is crafted in a controlled development environment. Machine learning applications, however, also depend on data that comes from a never-ending source: the real world. Data never stops changing, and there is no way to fully control how it changes. A useful way to picture this is to imagine code and data living on two independent planes that share only the dimension of time.
The gap between these planes represents a disconnect that is the root of several critical challenges facing anyone attempting to deploy a Machine Learning model into production successfully. These challenges include: