New Year, new major upgrades to Seldon’s technology that will soon revolutionize the way you handle authentication, deploy models, manage errors, and more. Our commitment to providing a seamless and efficient experience for our users remains unwavering. Let’s dive in:
Seldon Enterprise Platform: Announcing New UI Features, Configuration, and Customisation Options
Improved Authentication Configurability
With our latest update, we’ve removed the hassle of needing to build custom logic for authorizing access based on different groups across your business. Effortlessly manage access permissions to save valuable time and resources that were previously invested in custom solutions, and have more freedom to tailor authentication based on your unique organizational structure.
Support for Model URI in the UI
The user interface now supports specifying a model URI, allowing users to point to model artifacts stored in a storage bucket. This means custom models are no longer limited to Docker images, giving you more flexibility in how models are packaged and deployed.
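Under the hood this corresponds to referencing artifacts via a storage URI in the model resource. The sketch below shows the general shape of a Core v2 Model manifest using such a URI; the bucket path, model name, and requirements are illustrative placeholders, not values from this release.

```python
import json

# Sketch of a Core v2 Model resource that pulls artifacts from a storage
# bucket instead of a custom Docker image. Bucket path and model name are
# illustrative examples only.
model_manifest = {
    "apiVersion": "mlops.seldon.io/v1alpha1",
    "kind": "Model",
    "metadata": {"name": "iris-classifier"},
    "spec": {
        # Artifacts are fetched from the bucket at deploy time.
        "storageUri": "gs://my-bucket/models/iris-classifier",
        "requirements": ["sklearn"],
    },
}

print(json.dumps(model_manifest, indent=2))
```

The UI now accepts the equivalent of the storageUri field directly, so no hand-written manifest is required.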
Improved Error Messages for Model Deployment Failures
To further enhance the user experience, Seldon Enterprise Platform’s UI now provides detailed error messages in case of model deployment failures. No more sifting through logs to understand what went wrong – our improved error messages pinpoint the issue, allowing for quick resolution, streamlining your troubleshooting processes.
Full History of Inference Requests Exposed via API/SDK
Data scientists can now retrieve the full history of inference requests through the API/SDK, empowering users with valuable insights for model retraining. Dive deep into the performance of your models, understand patterns, and make informed decisions for continuous improvement to unleash the full potential of your data science projects.
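As a rough sketch of how such a query might be assembled programmatically: the endpoint path and parameter names below are assumptions for illustration, not the documented Enterprise Platform API, so consult the API/SDK reference for the exact routes.

```python
from urllib.parse import urlencode

# Hypothetical base URL and route; parameter names are assumptions.
BASE_URL = "https://seldon.example.com/api"

def inference_history_url(deployment: str, namespace: str, page_size: int = 100) -> str:
    """Build a paginated request URL for a deployment's inference history."""
    query = urlencode({"namespace": namespace, "pageSize": page_size})
    return f"{BASE_URL}/deployments/{deployment}/requests?{query}"

url = inference_history_url("income-classifier", "production")
print(url)
```

The returned records (inputs, outputs, timestamps) can then be joined into a retraining dataset with your tool of choice.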
Differentiate Between Users and Machines via OIDC Configuration
For those seeking enhanced programmatic deployment workflows, Enterprise Platform now allows differentiation between users and machines via OIDC configuration. Streamline deployment processes and ensure efficient resource utilization by categorizing users and machines with precision, elevating your model deployment strategies.
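In standard OIDC/OAuth 2.0 terms, interactive users typically authenticate via the authorization-code flow, while machine clients (CI jobs, services) use the client-credentials grant. The sketch below shows the form body a machine client would POST to a token endpoint; the endpoint URL and client values are illustrative placeholders, not your deployment's configuration.

```python
from urllib.parse import urlencode

# Illustrative token endpoint; substitute your identity provider's URL.
TOKEN_ENDPOINT = "https://keycloak.example.com/realms/seldon/protocol/openid-connect/token"

# Standard OAuth 2.0 client-credentials grant: no human in the loop.
machine_token_request = urlencode({
    "grant_type": "client_credentials",
    "client_id": "ci-deployer",          # placeholder machine client
    "client_secret": "<secret-from-vault>",
})

# POSTing this form body to TOKEN_ENDPOINT yields a bearer token that the
# machine client attaches to subsequent API/SDK calls.
print(machine_token_request)
```

Distinguishing the two flows in the OIDC configuration is what lets the platform treat human and machine traffic differently.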
Configurable Resource Allocation for Batch Job Pods
Running batch jobs with large files just got a whole lot smoother with the introduction of configurable resource allocation for batch job pods. Teams can now set CPU and RAM limits, eliminating “out of memory” errors and ensuring seamless batch job execution with optimal resource allocation, giving users effortless control.
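The values follow the standard Kubernetes requests/limits shape; the sketch below shows that shape, though where exactly it plugs into your batch job definition may vary by platform version, and the sizes are examples only.

```python
# Kubernetes-style resource block for a batch job pod. The request is the
# guaranteed baseline; the limit is the hard ceiling. Raising the memory
# limit is what prevents large input files from triggering "out of memory".
batch_pod_resources = {
    "requests": {"cpu": "2", "memory": "4Gi"},
    "limits": {"cpu": "4", "memory": "8Gi"},
}
```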
UI Support for Folder-based Batch Jobs
Embrace the freedom to structure and manage your batch jobs effortlessly with UI support for providing batch jobs via a folder instead of a single file. This offers unparalleled flexibility for customers, allowing you to specify batch jobs in a way that best suits your unique needs.
Seldon Core v1 and v2: Unveiling Enhanced Flexibility and Functionality
Our latest updates to Seldon Core v1 and v2 reflect our commitment to enhancing flexibility and overall user experience. Let’s delve into key features, updates and fixes that will help you get the most out of your MLOps projects.
Seldon Core v1: Streamlining Helm Chart Management
Make Helm Chart Manager User ID Optional
The user ID requirement, once a constraint during Helm chart installations in Kubernetes, is now optional. This enhancement is small yet impactful, allowing for more adaptable usage of Helm charts, particularly benefiting programmatically driven setups. Removing mandatory user IDs reduces installation issues and minimizes frustration, while giving users the flexibility they need for seamless Helm chart management.
Seldon Core v2: Catering to Your Specific Needs
Implement OAuth 2 SASL Mechanism for Confluent Kafka
For users seeking a more secure authentication and authorization method when using Confluent Kafka, we have implemented the OAuth 2 SASL mechanism in Core v2, helping users connect seamlessly to Confluent Kafka.
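On the client side, SASL/OAUTHBEARER with an OIDC token endpoint looks roughly like the configuration below. The property names are standard librdkafka settings; the bootstrap server, client ID, and token endpoint are placeholders, and this is a sketch rather than the exact configuration Core v2 generates.

```python
# Client-side SASL/OAUTHBEARER configuration using librdkafka's OIDC support.
# Endpoint, client ID, and secret are illustrative placeholders.
kafka_oauth_config = {
    "bootstrap.servers": "my-cluster.confluent.cloud:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "OAUTHBEARER",
    "sasl.oauthbearer.method": "oidc",
    "sasl.oauthbearer.client.id": "core-v2-pipeline",
    "sasl.oauthbearer.client.secret": "<secret>",
    "sasl.oauthbearer.token.endpoint.url": "https://idp.example.com/oauth2/token",
}

# This dict would be passed to a Kafka client, e.g.:
# from confluent_kafka import Producer
# producer = Producer(kafka_oauth_config)
```

Tokens are fetched and refreshed automatically by the client, so no long-lived credentials need to be shipped with the deployment.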
Enhanced Error Handling and Configurability
Error Handling: Key improvements have been made in reporting status messages when Core v2 pipelines fail due to a lack of required resources or dataflow engines, or when no server matches a model’s requirements. These improvements surface specific error and status information and logs, making troubleshooting easier when a deployment fails.
Kafka Configurability: Similar to the improvements in Core v1, we’ve made the Kafka consumer group ID prefix configurable, providing users with the flexibility they need in their Kafka setup.
- Change message.max.bytes on the broker side to align with producer and consumer settings
- (Auth) Add checks for empty/whitespace namespaces to Kafka auth stores
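The message.max.bytes alignment matters because a payload cap set in only one place rejects or stalls large messages elsewhere. The sketch below shows the standard Kafka/librdkafka properties involved, with 10 MiB as an example value; it illustrates the principle rather than Core v2's defaults.

```python
# Keep message size limits aligned across broker, producer, and consumer so
# large payloads are neither rejected by the broker nor unfetchable by
# consumers. 10 MiB is an example value.
MAX_MESSAGE_BYTES = 10 * 1024 * 1024

broker_config = {"message.max.bytes": MAX_MESSAGE_BYTES}    # broker-side cap
producer_config = {"message.max.bytes": MAX_MESSAGE_BYTES}  # largest message a producer will send
consumer_config = {
    "fetch.max.bytes": MAX_MESSAGE_BYTES,             # total fetch response cap
    "max.partition.fetch.bytes": MAX_MESSAGE_BYTES,   # per-partition fetch cap
}
```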
SeldonConfig Configurability: We’ve made SeldonConfig configurable in the SeldonRuntime Helm chart, allowing users to adapt all their configurations according to their specific requirements.
Core v2 Documentation Additions and Support for MLServer’s New HuggingFace Runtime
This update allows users to experiment with deploying HuggingFace models across a wide range of model types, including computer vision, language, and multi-modal models, helping to increase experimentation and efficiency in your deployments.
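As a sketch of what wiring up the runtime looks like: MLServer models are described by a model-settings.json file, and the example below shows its general shape for the HuggingFace runtime. The task and model name are illustrative choices; check the MLServer documentation for the full settings schema.

```python
import json

# Example model-settings.json for MLServer's HuggingFace runtime.
# Task and pretrained model are illustrative examples.
model_settings = {
    "name": "text-generator",
    "implementation": "mlserver_huggingface.HuggingFaceRuntime",
    "parameters": {
        "extra": {
            "task": "text-generation",
            "pretrained_model": "distilgpt2",
        }
    },
}

print(json.dumps(model_settings, indent=2))
```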
Bug Fixes and General Improvements
Security and Useability Updates to Alibi Detect and Alibi Explain
Alibi Detect and Alibi Explain stand as formidable tools for monitoring and explainability, empowering teams to swiftly capture and troubleshoot issues while gaining a deeper understanding of their model outputs. While Alibi Detect and Explain remain compatible with Core v1, the latest updates cater specifically to Core v2 users, offering the capability to employ multiple detectors and explainer methods. This enhancement is particularly valuable for users dealing with more complex models, providing them with a versatile toolkit to address their unique needs.
As security is a primary focus, we have also prioritized updates to Common Vulnerabilities and Exposures (CVEs), resolved bugs, and executed critical dependency upgrades to ensure a seamless integration with Seldon Core.
Upgrades to Supported Dependencies
We’ve implemented crucial upgrades to support changes in dependencies, including Go, Grafana, and rclone. These updates ensure our platform remains seamlessly integrated with the latest technologies, providing you with a robust and up-to-date foundation for your operations.
Bug Fixes and Quality Improvements to Scheduler
We have prioritized quality improvements to our scheduler and operator components. The scheduler now seamlessly coordinates the loading and unloading of models, while the operator efficiently manages Kubernetes resources, providing real-time status updates to the scheduler. These enhancements ensure a smoother and more reliable experience for managing your models with Seldon.
Other updates and fixes to scheduler include:
- Manually trigger Envoy updates
- Fix deleting models that are still progressing
- Fix consistency of non-latest models when a server disconnects
- Fix scheduler segfault when agent 0 disconnects
This update also introduces essential data processing improvements, enhancing the platform’s ability to handle raw tensors with increased efficiency and precision. These fixes contribute to a seamless experience when working with raw tensors, ensuring optimal data processing performance for your specific needs.
Links to other important updates and improvements
- [SCV2-18] Fix for batching not handling mixed raw and normal tensors
- Fix for batching not supporting raw tensors