Where RL Fits: Training vs. Inference in the LLM Pipeline

Explains that RL in LLMs is a training/alignment stage, not inference, with pipeline context.