<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Kv-Cache on knowledged.to</title><link>https://knowledged.to/tags/kv-cache/</link><description>Recent content in Kv-Cache on knowledged.to</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 21 May 2026 22:20:05 +0530</lastBuildDate><atom:link href="https://knowledged.to/tags/kv-cache/index.xml" rel="self" type="application/rss+xml"/><item><title>LLM Prompt Cache Options Across Providers</title><link>https://knowledged.to/notes/ml/llm-prompt-cache-provider-options/</link><pubDate>Thu, 21 May 2026 16:49:20 +0000</pubDate><guid>https://knowledged.to/notes/ml/llm-prompt-cache-provider-options/</guid><description>Compares prompt/KV cache TTLs, controls, pricing, scope, and strategies across major LLM providers.</description></item><item><title>LLM Prompt Caching: Implicit vs Explicit</title><link>https://knowledged.to/notes/ml/llm-prompt-caching-implicit-vs-explicit/</link><pubDate>Thu, 21 May 2026 16:08:55 +0000</pubDate><guid>https://knowledged.to/notes/ml/llm-prompt-caching-implicit-vs-explicit/</guid><description>Explains implicit vs explicit LLM prompt caching, prefix constraints, provider support, and when to use each.</description></item><item><title>Vectors vs Tensors</title><link>https://knowledged.to/notes/ml/vectors-vs-tensors/</link><pubDate>Thu, 21 May 2026 15:49:59 +0000</pubDate><guid>https://knowledged.to/notes/ml/vectors-vs-tensors/</guid><description>Explains how vectors relate to tensors in ML, including rank, framework terminology, and KV cache shapes.</description></item><item><title>Why LLM Caching Is Only for Input Tokens</title><link>https://knowledged.to/notes/ml/llm-caching-input-tokens/</link><pubDate>Thu, 21 May 2026 15:43:26 +0000</pubDate><guid>https://knowledged.to/notes/ml/llm-caching-input-tokens/</guid><description>Explains why LLM prompt caching applies to reusable input-token prefill, not sequential output decoding.</description></item></channel></rss>