MLOps London May: Talks on Data Quality and Optimizing Inference

May saw the return of our MLOps London meetup, and record-breaking attendance too. The buzz in the AI scene clearly remains strong, but it was awesome to see that nearly everyone attending was a long-time practitioner of data science or machine learning.

A family emergency meant that Srivalsan was unable to make it. Don’t worry, you’ll be able to catch his talk on Kubeflow pipelines at our July meetup. To fill the slot, I offered the community a choice: extra networking time, or a recent talk I’d delivered at the AI at Scale conference. Since people voted to hear my talk, I thought I’d include the fantastic summary that community member Atharva Lad created in his LinkedIn post:

Andrew Jones (Principal Engineer, GoCardless) introduced the concept of data contracts as a catalyst for changing an organization’s data culture. He walked through the data architecture at GoCardless in 2021 (services linked to a database -> data lakehouse -> analytics/ML models/internal services), highlighting how changes at the database level could break analytics and ML models downstream, with a bottleneck forming at the ETL/dbt step in the data lakehouse stage.

To ensure reliable access to documented and versioned data, the responsibility falls on the data generators (Product and Engineering teams) to provide high-quality data. A data contract serves as an agreement between generators and consumers (BI analysts and data scientists), defining a stable interface, clear responsibilities, compliance with personal data regulations, and alignment with business requirements. In a production pipeline, the data contract is validated, merged to the main branch, and wired up to isolated GCP resources and Kubernetes services as well as integrated central services (e.g. data catalogs, metrics, dashboards, alerts, BigQuery SQL).
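Andrew’s talk covers the real contract format GoCardless uses; purely as an illustration of the idea, here’s a minimal Python sketch of a contract and a validation check, where every name, field, and structure is hypothetical:

```python
from dataclasses import dataclass

# Hypothetical, minimal shape of a data contract. The real GoCardless
# format (covered in the talk) is richer, with ownership, versioning,
# and personal-data annotations, among other things.
@dataclass(frozen=True)
class Field:
    name: str
    dtype: type
    contains_pii: bool = False  # flags fields subject to personal-data rules

CONTRACT = {
    "name": "payments",        # dataset the generators publish
    "version": 2,              # consumers can rely on a stable interface
    "owner": "payments-team",  # clear responsibility for data quality
    "fields": [
        Field("payment_id", str),
        Field("amount_pence", int),
        Field("customer_email", str, contains_pii=True),
    ],
}

def validate(record: dict) -> list[str]:
    """Return a list of contract violations for one record (empty = valid)."""
    errors = []
    for field in CONTRACT["fields"]:
        if field.name not in record:
            errors.append(f"missing field: {field.name}")
        elif not isinstance(record[field.name], field.dtype):
            errors.append(f"wrong type for {field.name}: "
                          f"expected {field.dtype.__name__}")
    return errors

# A CI step might run checks like this over sample data before a contract
# change is merged to main and the GCP/Kubernetes resources are provisioned.
print(validate({"payment_id": "pm_123", "amount_pence": "100",
                "customer_email": "a@example.com"}))
# -> ['wrong type for amount_pence: expected int']
```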

The goal is to increase collaboration and foster a genuinely data-driven system that communicates business value, thereby incentivizing generators to produce the data quality that consumers and end users require.

Ed Shee (Head of DevRel at Seldon and organizer of the MLOps London meetup) spontaneously agreed to present his talk on using Seldon’s open-source ML inference server, MLServer, which lets data science teams serve models while tackling production challenges such as maximizing infrastructure usage, dependency management, working with multiple ML frameworks, standardizing API definitions, and capturing payloads.

Demos showcasing the deployment of common XGBoost and SKLearn models started with the creation of settings.json and model-settings.json files to configure the server and models respectively. The inference server supports multi-model serving (running multiple models within the same server instance, each with its own REST or gRPC endpoint for separate predictions), parallel inference (processing multiple prediction requests concurrently to improve overall throughput and reduce latency), and adaptive batching (adjusting the batch size of prediction requests to balance resource utilization and response time).
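For a flavour of what those files contain, here’s a minimal sketch assuming MLServer’s documented settings; the model name, artifact path, and batching numbers below are made-up examples:

```python
import json

# settings.json configures the server itself, e.g. parallel inference:
server_settings = {
    "parallel_workers": 4,  # worker processes handling requests concurrently
}

# model-settings.json configures one model (one file per model when
# multi-model serving). Name, URI, and batch values are illustrative.
model_settings = {
    "name": "iris-classifier",
    "implementation": "mlserver_sklearn.SKLearnModel",  # SKLearn runtime
    "parameters": {"uri": "./model.joblib"},            # trained artifact
    # Adaptive batching: group requests up to this size / wait time.
    "max_batch_size": 16,
    "max_batch_time": 0.1,  # seconds to wait while filling a batch
}

with open("settings.json", "w") as f:
    json.dump(server_settings, f, indent=2)
with open("model-settings.json", "w") as f:
    json.dump(model_settings, f, indent=2)
```

With those files in place, starting the server from that directory (e.g. with `mlserver start .`) picks up both configs.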

Overall, these features aim to mitigate the issues that arise when building model servers from scratch, providing a free alternative that can take state-of-the-art models and deploy them to production via Docker and/or Kubernetes.
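On the standardized API point: MLServer implements the V2 (Open) Inference Protocol, so every model is queried the same way regardless of framework. Assuming the sketched model above is being served locally on MLServer’s default HTTP port (8080), a prediction request might look like this, with illustrative input values:

```python
import requests

# V2 inference protocol: each model gets its own endpoint under
# /v2/models/<name>/infer, which is how multi-model serving exposes
# separate predictions per model. Payload values here are illustrative.
payload = {
    "inputs": [{
        "name": "predict",
        "shape": [1, 4],
        "datatype": "FP64",
        "data": [5.1, 3.5, 1.4, 0.2],  # one iris sample, four features
    }]
}

response = requests.post(
    "http://localhost:8080/v2/models/iris-classifier/infer",
    json=payload,
)
print(response.json()["outputs"])  # predictions in the standard V2 format
```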

Warm thanks to Ed Shee and Seldon for providing a wonderful venue for data professionals and enthusiasts to network, enjoy food and drinks, and take part in engaging discussions!

You can catch the full replay of Andrew’s talk here.

You can watch the full recording of Ed’s session here.

As always, the evening was capped off with plenty of networking time and lots of excellent conversations over a drink or two. If you’re local to London, make sure you come along in person to the next MLOps London meetup, which takes place on 18th July. If not, you can follow all the action live on our YouTube channel.
