As organisations use machine learning for a growing number of tasks, ensuring proper governance, auditing and discoverability of those models is becoming increasingly important. With metadata, organisations can track the activity of their machine learning models and explain how these models contribute to business functions, which supports better compliance and risk management.
With the 1.2 release of Seldon Deploy, core metadata capabilities expose key information about the models available and running in production. Stakeholders across the business can easily find models that may be useful for their business case, understand the models’ capabilities, including inputs and outputs, and find running versions of those models. This is the first release of our metadata capabilities, and in future releases we will extend the functionality to ensure models can easily be pushed to production with confidence.
The image below shows a schematic of the metadata architecture in Seldon Deploy.
Seldon Deploy 1.2 enables organisations to manage their machine learning compliance via a rich UI and API to view, edit and search metadata of all their deployed models, while also providing capabilities to integrate this deployed model information into their internal corporate metadata systems. This feature provides organisations with full visibility of their machine learning assets, as well as the ability for them to build hierarchical taxonomies to ensure they can identify and manage production machine learning risk and compliance.
Organisations are often required to track and create taxonomies of their digital assets, such as databases, database tables and model artifacts, in order to manage their compliance and risk. Deployed production machine learning models are no different. At Seldon we recognise that these are also digital assets with compliance and risk management requirements, which is why we have introduced an enterprise metadata management solution for models deployed in production.
The internal Seldon Deploy metadata store was designed with interoperability in mind, introducing programmatic interfaces through the Enterprise API that provide actionable insights on the metadata of your production machine learning layer. These new metadata features in the Seldon Deploy Enterprise platform also provide further interoperability with the open source ecosystem of metadata tools. They offer integration touchpoints for tools and frameworks in the machine learning training and experimentation space, such as DVC, Pachyderm and MLflow, for metadata management systems like Amundsen, DataHub and Atlas, and for custom internal corporate metadata management systems.
The example below shows how a new model can be registered in the platform using the Seldon Deploy user interface.
Editing / Extending Model
Similarly, it is possible to register a model in the model catalogue programmatically with the Seldon Deploy SDK, which simplifies interaction with the Seldon Deploy REST APIs through a simple Python interface. This SDK powers Seldon Deploy’s production integration with CI and ETL systems, ensuring interoperability with external systems throughout the end-to-end MLOps lifecycle. Below is a snippet of how model metadata can be inserted programmatically; you can find the full worked notebook here.
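As a rough illustration of programmatic registration, here is a minimal sketch that builds a metadata entry and the HTTP request that would register it. It uses only the Python standard library, not the Seldon Deploy SDK. The endpoint path, the field names (uri, name, version, artifact_type, task_type, tags), the host and the bucket URI are all assumptions made for illustration, so check the Seldon Deploy API reference for the real schema. Nothing is sent over the network.

```python
import json
import urllib.request

# Assumed endpoint path, for illustration only; consult the API reference.
METADATA_PATH = "/seldon-deploy/api/v1alpha1/model/metadata"


def build_model_metadata(uri, name, version, artifact_type, task_type, tags=None):
    """Assemble one model catalogue entry as a plain dict (field names are assumed)."""
    if not uri or not name:
        raise ValueError("uri and name are required")
    return {
        "uri": uri,
        "name": name,
        "version": version,
        "artifact_type": artifact_type,
        "task_type": task_type,
        "tags": dict(tags or {}),
    }


def make_register_request(base_url, token, metadata):
    """Build, but do not send, a POST request that would register the entry."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + METADATA_PATH,
        data=json.dumps(metadata).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


entry = build_model_metadata(
    uri="gs://example-bucket/models/income-classifier",  # placeholder URI
    name="income-classifier",
    version="v1.0",
    artifact_type="SKLEARN",
    task_type="classification",
    tags={"team": "risk", "stage": "production"},
)
request = make_register_request("https://deploy.example.com", "<token>", entry)
# To send the request against a real deployment: urllib.request.urlopen(request)
```

Keeping the payload construction separate from the request makes it easy to call the same function from a CI or ETL job and to validate entries before anything reaches the metadata store.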
Clive is Chief Technology Officer at Seldon and has more than 25 years of experience in IT and technology. Since the company’s launch in 2014, Clive has led the creation of its core product offering. Clive is a regular speaker at industry-leading conferences, where he relays the story of Seldon’s technology, and has contributed to many ML open source projects, including Kubeflow. His PhD research was on Natural Language Processing, and he previously worked for early speech recognition pioneers.