Tool-DC Strategic Anchor Grouping — Web Search Example

This is a concrete example illustrating how the Strategic Anchor Grouping mechanism works in the Tool-DC framework. See also: notes/ml/tool-dc-framework.md.

Setup
Query: “search the web for recent AI news”
Tool library: 20 tools total
Retriever returns top 3: T_top = [Google Search, Bing Search, DuckDuckGo Search]
T_tail = 17 remaining tools (Calculator, Weather API, Wikipedia, Code Executor, etc.)

With K=3, Tool-DC creates 4 groups: ...

May 19, 2026 · 4 min

AgentFlow

AgentFlow: In-the-Flow Agentic System Optimization

Source: arXiv:2510.05592 (ICLR 2026 Oral, Top 1.1%)
Authors: Zhuofeng Li, Haoxiang Zhang, Seungju Han, Sheng Liu, Jianwen Xie, Yu Zhang, Yejin Choi, James Zou, Pan Lu (Stanford University, Texas A&M, UC San Diego, Lambda)

The Problem It Solves
Standard tool-augmented LLMs (like Search-R1 or ToRL) train a single monolithic policy that interleaves thinking and tool calls in one big context. This works okay on short tasks but scales poorly on long-horizon problems: the context grows, the reward signal is sparse (you only find out at the very end whether you succeeded), and the model generalizes weakly to new tool configurations. AgentFlow is built to fix all three of those. ...

May 19, 2026 · 3 min

Tool-DC Framework

Tool-DC Framework: Try, Check and Retry for Long-context Tool-Calling

Source: arXiv:2603.11495 (accepted at ACL 2026)
Authors: Kunfeng Chen, Qihuang Zhong, Juhua Liu, Bo Du (Wuhan University), Dacheng Tao (NTU)

The Core Problem
When you give an LLM access to a large library of tools (say 20, 50, or hundreds of APIs), performance degrades sharply. The paper shows that even going from fewer than 10 tools to 20 causes significant accuracy drops across all tested models, especially smaller ones. Two things go wrong: the sheer length of the context buries the signal, and semantically similar tools with slightly different argument schemas confuse the model when it’s trying to fill in the right parameters. ...

May 19, 2026 · 3 min