Ml | knowledged.to

Where RL Fits: Training vs. Inference in the LLM Pipeline

Explains that RL in LLMs is a training/alignment stage, not inference, with pipeline context.

Reinforcement Learning (ELI-Teen Explainer)

Teen-friendly explainer of reinforcement learning agents, rewards, exploration, delayed rewards, and applications.

Model Type Classification by Modality (Multimodal, Vision, Image Generation)

Classifies multimodal, vision, and image-generation models by their input/output modalities.

TurboQuant

Explains TurboQuant, a rotation-based vector quantization method for KV-cache compression and vector search.

Anti-Narration in Harness Engineering

Harness pattern that forces verification before accepting fluent AI outputs as correct.

Model Drift

Overview of model drift, detection, mitigation, and LLM-specific issues like knowledge staleness and provider drift.

Attention in Machine Learning

Explanation of the attention mechanism in ML, covering Query/Key/Value, self-attention, multi-head, causal, cross-attention, and efficiency variants like FlashAttention and GQA.

Fine-Tuning Techniques for LLMs

Comprehensive guide to LLM fine-tuning methods, including full, parameter-efficient, and preference-based approaches, with modern recipes and techniques such as LoRA and DPO.

Unsloth Studio — Fine-tuning Dataset Formats

Overview of the dataset formats supported by Unsloth Studio for fine-tuning, including JSONL, Alpaca, ShareGPT, ChatML, and Reasoning formats, with rules, best practices, and dataset size guidelines.

Mixture of Experts (MoE)

Overview of the MoE architecture, its routing, key components, variants, and trade-offs in machine learning models.

Chain of Thought (CoT)

Prompting technique where an AI model is guided — or learns — to reason through a problem step by step before arriving at the final answer, rather than jumping straight to the conclusion.

Visual Chain-of-Thought Reasoning

Extension of chain-of-thought prompting to multimodal settings, where models reason step by step over both visual and textual information.

Agent Harness Engineering

Overview of agent harness engineering: the scaffolding, infrastructure, and tooling surrounding an AI agent, covering execution environments, tool orchestration, memory management, control flow, tracing, safety, and state persistence.

Diffusion Models in AI

Overview of diffusion models: how they reverse a gradual noising process to generate data, key variants such as DDPM, DDIM, and Latent Diffusion Models, and how text-to-image conditioning works.