What is Speculative Decoding?

Explains speculative decoding, which pairs a small draft model with a large target model to speed up LLM inference without changing the target model's output distribution.

June 24, 2026 · 2 min

Local + Frontier Model Collaboration Patterns in Open Source Harnesses

Adds a new file, `notes/ml/local-frontier-model-collaboration-patterns.md`, to the Notes section, positioned alphabetically after 'LLM Thinking Token Budgets' and before 'GGUF Models'.

June 24, 2026 · 2 min