DeepSeek

LLM Prompt Cache Options Across Providers

Compares prompt/KV cache TTLs, controls, pricing, scope, and strategies across major LLM providers.

Critic-free RL algorithm that replaces PPO's value model with group-relative rewards for LLM fine-tuning.
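
As a rough illustration of the "group-relative" idea in the summary above, the sketch below scores each sampled completion against the other completions drawn for the same prompt, so no learned value model is needed. The function name `group_relative_advantages`, the `eps` term, and the choice of population standard deviation are assumptions made for this example, not details taken from the note.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar reward per completion sampled for a single
    prompt. `eps` (an assumption of this sketch) avoids division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, two rewarded and two not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get a positive advantage and the rest get a negative one. This group baseline is what stands in for the critic's value estimate.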