Commit Intent in AI Harness Engineering

Commit intent is the discipline of having an agent explicitly declare what it is about to do, and why, immediately before it invokes a tool, so that the decision and the execution become two distinct steps in the harness. Concretely, before a tool call goes out, the agent emits a short, structured statement: the action being taken, the target, the expected outcome, and often the reasoning that justifies it. Only after that intent is committed does the harness fire the actual tool call. This sounds redundant, since the tool call itself already encodes the “what”, but it solves several real problems in agentic systems. ...

May 25, 2026 · 3 min

Sub-Agent vs Tool-Agent in AI Harness Engineering

A sub-agent is another agentic process that has been delegated a goal. It has its own prompt and context, can reason over multiple steps, may call tools, and returns a synthesized result or handoff. Use it when the work benefits from independent judgment. Example: “Investigate why the auth tests are flaky and report root cause plus fix options.” A tool-agent is a tool-shaped interface that may use agentic behavior internally, but from the harness perspective it is invoked like a tool: bounded input, bounded output, narrower contract. Use it when you want a capability, not an independent collaborator. ...

May 25, 2026 · 2 min

Tool-DC Strategic Anchor Grouping — Web Search Example

This is a concrete example of how the Strategic Anchor Grouping mechanism works in the Tool-DC framework. See also: notes/ml/tool-dc-framework.md. Setup: the query is “search the web for recent AI news”; the tool library holds 20 tools in total; the retriever returns the top 3, T_top = [Google Search, Bing Search, DuckDuckGo Search]; and T_tail is the 17 remaining tools (Calculator, Weather API, Wikipedia, Code Executor, etc.). With K=3, Tool-DC creates 4 groups: ...

May 19, 2026 · 4 min

AgentFlow

AgentFlow: In-the-Flow Agentic System Optimization. Source: arXiv:2510.05592, ICLR 2026 Oral (Top 1.1%). Authors: Zhuofeng Li, Haoxiang Zhang, Seungju Han, Sheng Liu, Jianwen Xie, Yu Zhang, Yejin Choi, James Zou, Pan Lu (Stanford University, Texas A&M, UC San Diego, Lambda). The Problem It Solves: Standard tool-augmented LLMs (like Search-R1 or ToRL) train a single monolithic policy that interleaves thinking and tool calls in one big context. This works okay on short tasks but scales poorly on long-horizon problems: the context grows, the reward signal is sparse (you only find out at the very end whether you succeeded), and the model generalizes weakly to new tool configurations. AgentFlow is built to fix all three of those. ...

May 19, 2026 · 3 min

Tool-DC Framework

Tool-DC Framework: Try, Check and Retry for Long-context Tool-Calling. Source: arXiv:2603.11495, accepted at ACL 2026. Authors: Kunfeng Chen, Qihuang Zhong, Juhua Liu, Bo Du (Wuhan University), Dacheng Tao (NTU). The Core Problem: When you give an LLM access to a large library of tools (say 20, 50, or hundreds of APIs), performance degrades sharply. The paper shows that even going from fewer than 10 tools to 20 causes significant accuracy drops across all tested models, especially smaller ones. Two things go wrong: the sheer length of the context buries the signal, and semantically similar tools with slightly different argument schemas confuse the model when it’s trying to fill in the right parameters. ...

May 19, 2026 · 3 min

Commitment Gate (Harness Engineering)

A commitment gate is a verification checkpoint between an agent producing a candidate output and that output being “locked in”: emitted as a final answer, written to disk, or used to call an irreversible tool. Instead of running skills along a fixed script and fusing results at the end (where errors propagate silently into late fusion), a harness with commitment gates pauses at each commit point, runs the relevant checks, and either lets the result through or triggers a targeted recovery loop. ...

May 13, 2026 · 2 min

Agent Harness Engineering

Agent harness engineering is the practice of building the scaffolding, infrastructure, and tooling that surrounds an AI agent: everything that isn’t the model itself but makes the model useful, reliable, and safe in production. The model is the engine; the harness is the chassis, controls, safety systems, and instrumentation around it. Core Components. Execution Environment: the runtime that manages how the agent runs, covering process lifecycle, sandboxing, resource limits, timeouts, and isolation between agent instances. ...

April 17, 2026 · 2 min