AgentFlow
AgentFlow: In-the-Flow Agentic System Optimization Source: arXiv:2510.05592 — ICLR 2026 Oral (Top 1.1%) Authors: Zhuofeng Li, Haoxiang Zhang, Seungju Han, Sheng Liu, Jianwen Xie, Yu Zhang, Yejin Choi, James Zou, Pan Lu (Stanford University, Texas A&M, UC San Diego, Lambda) The Problem It Solves Standard tool-augmented LLMs (like Search-R1 or ToRL) train a single monolithic policy that interleaves thinking and tool calls in one big context. This works okay on short tasks but scales poorly on long-horizon problems: the context grows, the reward signal is sparse (you only find out at the very end whether you succeeded), and the model generalizes weakly to new tool configurations. AgentFlow is built to fix all three of those. ...