MLOps Glossary

Multi-armed bandits:

(aka MABs) A family of online learning algorithms that use feedback to choose among several options, balancing exploration of less-tried options against exploitation of the one that has performed best so far. Often used in online advertising as an extension of A/B testing. Note that MABs can be used in two distinct ways: as the model itself (the traditional use), e.g. given 3 different versions of a website, automatically route visitors to the best-performing one (as measured, e.g., by engagement time); or as a router to underlying ML models, e.g. given 3 different trained ML models, route requests to the best-performing model.
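To make the feedback loop concrete, here is a minimal sketch of one common MAB strategy, epsilon-greedy, written in Python. It is an illustration under stated assumptions, not a reference implementation: the class name `EpsilonGreedyBandit`, the helper `simulate`, and the example reward rates are all hypothetical, and the reward is assumed to be a simple 0/1 signal (e.g. "visitor engaged" or "model prediction accepted"). Each arm could stand for a website version or for a trained ML model behind a router.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit: explore a random arm with probability epsilon,
    otherwise exploit the arm with the highest estimated mean reward."""

    def __init__(self, n_arms, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of times each arm was chosen
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore: pick any arm uniformly at random.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Exploit: pick the arm with the best estimate (ties go to the lowest index).
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean update, so past rewards need not be stored.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def simulate(true_rates, n_rounds=5000, epsilon=0.1, seed=0):
    """Hypothetical environment: arm i pays reward 1 with probability true_rates[i]."""
    bandit = EpsilonGreedyBandit(len(true_rates), epsilon=epsilon, seed=seed)
    env = random.Random(seed + 1)  # separate RNG for the simulated feedback
    for _ in range(n_rounds):
        arm = bandit.select_arm()
        reward = 1.0 if env.random() < true_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit
```

Calling, say, `simulate([0.1, 0.2, 0.8])` returns a bandit whose `counts` show how traffic was split across the three arms. In a real deployment the simulated reward would be replaced by an observed metric such as engagement time or model accuracy, and other strategies (e.g. UCB or Thompson sampling) may be preferable to epsilon-greedy.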