Alibi Detect v0.11.0: Scaling up Drift Detection to Large Datasets

We are excited to announce the release of Alibi Detect v0.11.0, featuring widened serialisation support and a new backend that allows drift detection to be rapidly performed on large datasets.

Drift detection on large datasets

The sensitivity of a drift detector scales with the number of data samples. However, the memory and computational costs of a number of convenient and powerful kernel-based drift detectors do not scale favourably with increasing dataset size.
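To give a rough sense of the scaling problem, the sketch below estimates the memory needed to hold one dense kernel matrix. It assumes the detector materialises a full matrix over the combined reference and test sets in 32-bit floats. The helper name `dense_kernel_matrix_gib` is hypothetical and is not part of Alibi Detect.

```python
def dense_kernel_matrix_gib(n_ref: int, n_test: int, bytes_per_entry: int = 4) -> float:
    """Approximate memory (in GiB) of one dense kernel matrix.

    Assumption: the matrix covers the combined reference and test sets,
    so it has (n_ref + n_test)^2 entries, each stored as float32 (4 bytes).
    """
    n = n_ref + n_test
    return n * n * bytes_per_entry / 2**30


# Memory grows quadratically: 10x more instances needs 100x more memory.
for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} instances per set: {dense_kernel_matrix_gib(n, n):9.3f} GiB")
```

Under these assumptions, 100,000 instances in each set would need roughly 149 GiB for a single matrix, far beyond the memory of a consumer GPU.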

Alibi Detect v0.11.0 adds a new KeOps backend for the MMD and learned kernel MMD detectors. The public API of these detectors is almost unchanged from the existing TensorFlow and PyTorch implementations, but internally they work with symbolic kernel matrices by leveraging the KeOps library. This drastically speeds up the detectors and scales them to large datasets, with hundreds of thousands of instances easily achievable on a single consumer-grade GPU.

Prediction time for the PyTorch and KeOps MMD detectors, versus number of instances in the reference and test sets. The KeOps detector is significantly faster for a given number of instances, and is capable of running with much larger numbers of instances.
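As a minimal sketch of what switching backends might look like, the helper below builds an MMD detector with `backend="keops"` and runs it on a test batch. The function name `detect_drift_keops` and its default `p_val` are illustrative choices. The sketch assumes `alibi-detect` is installed together with its KeOps dependencies, and that both inputs are NumPy arrays of shape `(n_instances, n_features)`.

```python
def detect_drift_keops(x_ref, x_test, p_val=0.05):
    """Run an MMD drift test using the KeOps backend.

    x_ref, x_test: NumPy arrays of shape (n_instances, n_features).
    Returns (is_drift, p_value) taken from the detector's prediction dict.
    """
    # Imported lazily so this module loads even without alibi-detect installed.
    from alibi_detect.cd import MMDDrift

    # Same constructor as the TensorFlow and PyTorch versions;
    # only the backend string changes.
    detector = MMDDrift(x_ref, backend="keops", p_val=p_val)

    preds = detector.predict(x_test)
    return preds["data"]["is_drift"], preds["data"]["p_val"]
```

Because only the `backend` argument differs, an existing PyTorch or TensorFlow pipeline should need little change beyond that string.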

Serialisation support

As well as numerous other enhancements and some bug fixes, Alibi Detect v0.11.0 includes significantly widened serialisation support. In addition to the existing TensorFlow support, PyTorch, scikit-learn and KeOps based detectors can now all be serialised. Additionally, online detectors now have methods available to save and load their state to disk, allowing users to create checkpoints and restart from them later.
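The sketch below shows how saving a detector and checkpointing an online detector might look in practice. It assumes the `save_detector` and `load_detector` functions from `alibi_detect.saving`, and the `save_state` and `load_state` methods on online detectors. The wrapper function names and the file paths are hypothetical.

```python
def save_and_reload(detector, path="my_detector"):
    """Serialise a detector to `path` and load it back from disk."""
    # Imported lazily so this module loads even without alibi-detect installed.
    from alibi_detect.saving import load_detector, save_detector

    save_detector(detector, path)
    return load_detector(path)


def checkpoint(online_detector, path="checkpoint"):
    """Write the online detector's current state to disk."""
    online_detector.save_state(path)
    return path


def restore(online_detector, path="checkpoint"):
    """Restore a previously saved state so the detector resumes from it."""
    online_detector.load_state(path)
    return online_detector
```

A typical pattern would be to call `checkpoint` periodically while streaming data through the detector, then `restore` after a restart to resume from the last saved state.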
