Anti-Narration in Harness Engineering

Anti-narration in Harness Engineering In AI harness engineering, “anti‑narration” means the harness is designed to prevent large language models (LLMs) from producing fluent but unverified stories — it enforces verification before accepting outputs, ensuring correctness over coherence. It’s not about stopping hallucinations directly, but about breaking the tendency of AI systems to narrate confidently without grounding. 🔎 What “Anti‑Narration” Means Narration vs. Hallucination Narration: The structural tendency of LLMs to produce coherent, completed stories or answers. Hallucination: Fabricated or false information. Harness engineering focuses on narration because coherence can mask errors — a fluent answer may sound right but be wrong. moltbook.com Anti‑Narration Guardrails ...

May 25, 2026 · 2 min

PPO — Proximal Policy Optimization

PPO — Proximal Policy Optimization PPO is a reinforcement learning algorithm from OpenAI (Schulman et al., 2017) that became the default workhorse for RLHF — it’s what trained InstructGPT and the original ChatGPT. Core Idea Policy gradient methods are unstable because a single large update can collapse the policy. PPO fixes this by staying close to the previous policy on each update — the “proximal” part. It does this with a clipped surrogate objective: ...

May 19, 2026 · 2 min

GRPO — Group Relative Policy Optimization

GRPO — Group Relative Policy Optimization GRPO is a reinforcement learning algorithm introduced by DeepSeek (DeepSeekMath, later DeepSeek-R1) as a more efficient alternative to PPO for fine-tuning LLMs with RL. Core Idea PPO needs a separate value model (critic) of comparable size to the policy to estimate the baseline for advantage calculation. That doubles memory and compute. GRPO ditches the critic entirely. Instead, for each prompt it samples a group of G outputs from the current policy, scores each with the reward model, and uses the group’s mean and standard deviation as the baseline: ...

May 19, 2026 · 2 min

Tool-DC Strategic Anchor Grouping — Web Search Example

Tool-DC: Strategic Anchor Grouping — Web Search Example This is a concrete example illustrating how the Strategic Anchor Grouping mechanism works in the Tool-DC framework. See also: notes/ml/tool-dc-framework.md. Setup Query: “search the web for recent AI news” Tool library: 20 tools total Retriever returns top 3: T_top = [Google Search, Bing Search, DuckDuckGo Search] T_tail = 17 remaining tools (Calculator, Weather API, Wikipedia, Code Executor, etc.) With K=3, Tool-DC creates 4 groups: ...

May 19, 2026 · 4 min

AgentFlow

AgentFlow: In-the-Flow Agentic System Optimization Source: arXiv:2510.05592 — ICLR 2026 Oral (Top 1.1%) Authors: Zhuofeng Li, Haoxiang Zhang, Seungju Han, Sheng Liu, Jianwen Xie, Yu Zhang, Yejin Choi, James Zou, Pan Lu (Stanford University, Texas A&M, UC San Diego, Lambda) The Problem It Solves Standard tool-augmented LLMs (like Search-R1 or ToRL) train a single monolithic policy that interleaves thinking and tool calls in one big context. This works okay on short tasks but scales poorly on long-horizon problems: the context grows, the reward signal is sparse (you only find out at the very end whether you succeeded), and the model generalizes weakly to new tool configurations. AgentFlow is built to fix all three of those. ...

May 19, 2026 · 3 min

Tool-DC Framework

Tool-DC Framework: Try, Check and Retry for Long-context Tool-Calling Source: arXiv:2603.11495 — Accepted at ACL 2026 Authors: Kunfeng Chen, Qihuang Zhong, Juhua Liu, Bo Du (Wuhan University), Dacheng Tao (NTU) The Core Problem When you give an LLM access to a large library of tools — say 20, 50, or hundreds of APIs — performance degrades sharply. The paper shows that even going from fewer than 10 tools to 20 causes significant accuracy drops across all tested models, especially smaller ones. Two things go wrong: the sheer length of the context buries the signal, and semantically similar tools with slightly different argument schemas confuse the model when it’s trying to fill in the right parameters. ...

May 19, 2026 · 3 min

Attention in Machine Learning

Attention in Machine Learning Attention is a mechanism that lets a model dynamically decide which parts of the input matter most when producing each piece of output. Instead of compressing everything into one fixed representation, the model computes a weighted combination of inputs where the weights are learned and depend on context. Intuition When translating “the cat sat on the mat” to French, generating the word for “cat” should mostly pay attention to “cat” in the source — not “mat” or “on.” Attention makes this routing explicit and differentiable. ...

May 17, 2026 · 3 min

Six-Dimension Art Evaluation Rubric

Six-Dimension Art Evaluation Rubric Source paper: Learning-based Artificial Intelligence Artwork: Methodology Taxonomy and Quality Evaluation, ACM Computing Surveys (2024). Origin The rubric was built from art vocabulary and traditional principles of painting analysis, then validated through a user study to confirm the weightings felt reasonable across different artwork types. The goal was a consistent, repeatable way to evaluate AI-generated artworks across different styles. The Six Dimensions Beauty (50%) — The dominant criterion. Encompasses overall compositional harmony: balance, proportion, the arrangement of visual elements, and the pleasing relationship between subjects. An image can score well on every other dimension and still feel wrong if the composition is off. This is where Gestalt principles are most directly applied — does the whole hang together? ...

May 14, 2026 · 3 min

Commitment Gate (Harness Engineering)

In harness engineering, a commitment gate is a point in the workflow where the agent must prove a change meets defined criteria before it can be merged or committed. It’s a quality-control checkpoint that turns “looks good” into an enforceable decision, usually through tests, lint rules, architectural checks, or explicit approval rules.[1][2] What it does A commitment gate is meant to stop bad or incomplete work from being accepted just because the agent produced it. In harness-engineering terms, this fits the broader pattern of using constraints, feedback loops, and quality gates to make AI agents reliable. OpenAI’s harness-engineering write-up emphasizes that the real job is designing environments and feedback loops so agents can work safely and consistently, rather than relying on humans to catch every mistake.[2][1] ...

May 13, 2026 · 2 min

Commitment Gate (Harness Engineering)

Commitment Gate (Harness Engineering) A commitment gate is a verification checkpoint between an agent producing a candidate output and that output being “locked in” — emitted as a final answer, written to disk, or used to call an irreversible tool. Instead of running skills along a fixed script and fusing results at the end (where errors propagate silently into late fusion), a harness with commitment gates pauses at each commit point, runs relative checks, and either lets the result through or triggers a targeted recovery loop. ...

May 13, 2026 · 2 min

Deterministic Graders (for LLM / AI Evaluation)

Deterministic Graders (for LLM / AI Evaluation) Definition A deterministic grader is an evaluation function that produces the same result every time for the same input — no randomness, no LLM-in-the-loop judgment. You check the model’s output against a fixed, code-based rule. Concrete Examples Exact string match — “Does the output equal Paris?” Regex match — “Does the output contain a valid ISO date?” Structured-output validation — “Does this parse as JSON and pass the schema?” Code execution / unit tests — “Run the generated function against these test cases. Did they pass?” Numeric tolerance — “Is the answer within 0.01 of the expected value?” Set membership — “Is the classification label one of {positive, negative, neutral}?” Contrast: Model-Graded / LLM-as-Judge The opposite approach is a model-graded (or “LLM-as-judge”) evaluator, where you ask another model something like “Is this answer helpful and correct?” ...

April 24, 2026 · 2 min

Chain of Thought (CoT)

Chain of Thought (CoT) Chain of Thought is a prompting technique where an AI model is guided — or learns — to reason through a problem step by step before arriving at a final answer, rather than jumping straight to the conclusion. The core idea is that breaking down complex reasoning into intermediate steps leads to more accurate and reliable outputs, much like how a person might work through a math problem by showing their work. ...

April 23, 2026 · 2 min

Multi-Turn Conversation in AI

Multi-Turn Conversation in AI Multi-turn conversation in AI refers to a dialogue system where a model maintains context across multiple exchanges — rather than treating each message as an isolated input. Single-Turn vs Multi-Turn In a single-turn interaction, the model sees one prompt and produces one response, with no memory of anything before or after. In a multi-turn interaction, the model receives the full conversation history (all prior messages) with each new request, allowing it to: ...

April 21, 2026 · 2 min

Agent Harness Engineering

Agent Harness Engineering Agent harness engineering is the practice of building the scaffolding, infrastructure, and tooling that surrounds an AI agent — everything that isn’t the model itself but makes the model useful, reliable, and safe in production. The model is the engine; the harness is the chassis, controls, safety systems, and instrumentation around it. Core Components Execution Environment The runtime that manages how the agent runs — process lifecycle, sandboxing, resource limits, timeouts, and isolation between agent instances. ...

April 17, 2026 · 2 min