DeepSeek Reasoning, the Cake Mystery

About

We put DeepSeek’s R1 reasoning model to the test in a live demo showcasing how context windows impact LLM performance in real-world scenarios. Using Seldon Core 2 and the LLM module, we orchestrate controlled experiments with DeepSeek’s 8B and 14B models, demonstrating how GPU allocation, prompt engineering, and pipeline configuration shape the outcome.

Through a logic-based mystery use case, we explore what it takes to get reliable, explainable reasoning from open-source LLMs, and how to deploy them at scale in your Kubernetes environment.

Learnings
  • Mixture-of-Experts in Action: DeepSeek uses a MoE architecture to reduce memory requirements and enable large-scale reasoning without maxing out GPU capacity.

  • Scaling with Context Windows: Compare 4K vs. 12K token contexts to see how reasoning depth and consistency change with prompt length.

  • Deployable Reasoning Pipelines: Orchestrate multi-step LLM pipelines using Core 2, persistent volumes, and GPU-aware configuration via Kubernetes.

  • Real-Time Debugging & Resilience: Gain tips for troubleshooting model loading, token allocation issues, and CRD management during LLM deployment.

  • Structured Outputs with Think Tags: Learn how to structure system prompts for reasoning traceability, by splitting outputs into ‘thinking’ and ‘answer’ segments.

  • Practical Insights for LLM Engineering: From temperature tuning (0.6 recommended) to inference formatting, walk through the real setup of repeatable, testable LLM experiments.

  • When Bigger Isn’t Always Better: Discover why larger models don’t guarantee better reasoning, and how context alignment and prompt structure matter just as much.
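The think-tag splitting mentioned above can be sketched as a small parser. This is a minimal illustration, not Seldon's implementation; it assumes the model emits its reasoning inside `<think>...</think>` tags, as DeepSeek-R1-style models do, followed by the final answer.

```python
import re

# R1-style models emit chain-of-thought inside <think>...</think>,
# followed by the final answer. Splitting the two keeps the reasoning
# traceable while leaving the user-facing answer clean.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw_output: str) -> dict:
    """Split a raw completion into 'thinking' and 'answer' segments."""
    match = THINK_RE.search(raw_output)
    if match is None:
        # Model skipped the tags: treat the whole output as the answer.
        return {"thinking": "", "answer": raw_output.strip()}
    thinking = match.group(1).strip()
    answer = raw_output[match.end():].strip()
    return {"thinking": thinking, "answer": answer}
```

In the experiments this kind of split is paired with a sampling temperature of 0.6, the setting recommended in the session for repeatable reasoning runs.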

Alex has been founding and building technology companies since 2003, specializing in mobile, data, and machine learning (ML). He experienced the challenges of scaling model deployment infrastructure whilst building a startup that served billions of personalized news article recommendations. In 2014, Alex founded Seldon with the aim of democratizing ML operations to solve the world's most pressing issues. He served as CEO until 2023, shaping the MLOps industry and delivering measurable, meaningful results for clients worldwide. As Founder and Seldon Board Member, Alex is responsible for the product and technology strategy and execution, focusing on the needs of customers today and in the future.

Complex, real-time use cases are what we do best

Talk with an expert to explore how Seldon can support more streamlined deployments for real-time, complex projects like fraud detection, personalization, and so much more.


Stay Ahead in MLOps with our
Monthly Newsletter!

Join over 25,000 MLOps professionals who read Seldon’s MLOps Monthly Newsletter, your source for industry insights, practical tips, and cutting-edge innovations to keep you informed and inspired. You can opt out anytime with just one click.
