We put DeepSeek’s R1 reasoning model to the test in a live demo that shows how context windows impact LLM performance in real-world scenarios. Using Seldon Core 2 and the LLM Module, we orchestrate controlled experiments with DeepSeek’s 8B and 14B models, demonstrating how GPU allocation, prompt engineering, and pipeline configuration shape the results.
Through a logic-based mystery use case, we explore what it takes to get reliable, explainable reasoning from open-source LLMs, and how to deploy them at scale in your Kubernetes environment.
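
To give a concrete sense of how a deployed model is exercised in a setup like this, here is a minimal sketch of querying an LLM served behind Seldon Core 2 using the Open Inference Protocol. The host, model name, and input tensor name below are assumptions for illustration; the actual values depend on how the LLM Module exposes the model in your cluster.

```python
import requests

# Assumed in-cluster endpoint for the Seldon Core 2 mesh; replace the host
# and model name with the values from your own deployment.
SELDON_MESH = "http://seldon-mesh.seldon.svc.cluster.local"
MODEL_NAME = "deepseek-r1-8b"  # hypothetical model name

# Open Inference Protocol (V2) request: a single BYTES tensor carrying the prompt.
payload = {
    "inputs": [
        {
            "name": "prompt",  # tensor name is an assumption
            "shape": [1],
            "datatype": "BYTES",
            "data": ["Who was in the library at midnight, and why?"],
        }
    ]
}

response = requests.post(
    f"{SELDON_MESH}/v2/models/{MODEL_NAME}/infer",
    json=payload,
    headers={"Seldon-Model": MODEL_NAME},  # routes the request via the Seldon mesh
    timeout=120,
)
response.raise_for_status()

# The model's reasoning comes back as output tensors in the same protocol.
for output in response.json()["outputs"]:
    print(output["name"], output["data"])
```

The same request shape works whether the prompt targets a single model or a multi-step pipeline, which is what makes it easy to compare the 8B and 14B variants under identical conditions.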