Transformers on knowledged.to

Recent content in Transformers on knowledged.to
https://knowledged.to/tags/transformers/
Thu, 21 May 2026 21:14:09 +0530

Why LLM Caching Is Only for Input Tokens
https://knowledged.to/notes/ml/llm-caching-input-tokens/
Thu, 21 May 2026 15:43:26 +0000
Explains why LLM prompt caching applies to reusable input-token prefill, not sequential output decoding.

Attention in Machine Learning
https://knowledged.to/notes/ml/attention/
Sun, 17 May 2026 05:54:45 +0000
Explanation of the attention mechanism in ML, covering Query/Key/Value, self-attention, multi-head, causal, cross-attention, and efficiency variants like FlashAttention and GQA.

Mixture of Experts (MoE)
https://knowledged.to/notes/ml/mixture-of-experts/
Thu, 23 Apr 2026 16:04:47 +0000
Overview of MoE architecture, routing, key components, variants, and trade-offs in machine learning models.