Poster Tue, Jul 7, 2026 • 6:30 PM – 8:15 PM PDT HALL A #808

SCOUT: Active Information Foraging for Long-Text Understanding with Decoupled Epistemic States

Zhenliang Zhang ⋅ Wenqing Wang ⋅ Yong Hu ⋅ Yaming Yang ⋅ Jiaheng Gao ⋅ Chen Shen ⋅ Xiaojun Wan


Abstract

Long-Text Understanding (LTU) at million-token scale requires balancing reasoning fidelity with computational efficiency. Frontier long-context LLMs can process million-token contexts end-to-end, but they suffer from high token consumption and attention dilution. In parallel, specialized LTU agents often sacrifice fidelity through task-agnostic abstractions such as graph construction or indexing. We identify a key insight for LTU: query-relevant information is typically sparse relative to the full document, so effective reasoning should rely on a query-sufficient subset rather than the entire context. Building on this insight, we propose SCOUT, a new paradigm for LTU that shifts from passive processing to active information foraging. SCOUT treats the document as an explorable environment and answers from a compact, provenance-grounded epistemic state. Guided by state-level gap diagnosis, it adaptively alternates between coarse-to-fine exploration and anchored state updates that progressively contract its epistemic state toward query sufficiency. Experiments show that SCOUT matches state-of-the-art proprietary models while reducing token consumption by up to 8×. Moreover, SCOUT remains stable as context length scales, substantially alleviating the practical cost-capability trade-off in long-context reasoning. Resources are available at our Project Page.
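The loop described in the abstract (diagnose gaps in the current state, explore coarse-to-fine, then update the state with provenance) can be illustrated with a toy sketch. This is not the authors' implementation: the names `Evidence`, `EpistemicState`, `tokenize`, and `forage` are hypothetical, and simple keyword overlap stands in for the model-driven gap diagnosis and exploration that the abstract describes. Paragraphs play the role of the coarse level and sentences the fine level.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    text: str
    paragraph: int  # provenance: index of the source paragraph
    sentence: int   # provenance: index of the sentence within that paragraph


@dataclass
class EpistemicState:
    query_terms: set
    evidence: list = field(default_factory=list)

    def gaps(self):
        """Query terms not yet supported by any collected evidence."""
        covered = set()
        for ev in self.evidence:
            covered |= tokenize(ev.text) & self.query_terms
        return self.query_terms - covered


def tokenize(text):
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,;:!?()").lower() for w in text.split()} - {""}


def forage(paragraphs, query_terms, max_steps=10):
    """Toy foraging loop: stop once no query term is left uncovered."""
    state = EpistemicState(query_terms={t.lower() for t in query_terms})
    visited = set()
    for _ in range(max_steps):
        gaps = state.gaps()  # gap diagnosis on the state, not the document
        if not gaps:
            break
        # Coarse step: pick the unvisited paragraph that best matches the gaps
        # (ties go to the earliest paragraph).
        candidates = [
            (len(tokenize(p) & gaps), -i)
            for i, p in enumerate(paragraphs)
            if i not in visited
        ]
        if not candidates:
            break
        score, neg_index = max(candidates)
        if score == 0:
            break  # nothing left in the document addresses the gaps
        best = -neg_index
        visited.add(best)
        # Fine step: keep only the sentences that address a gap, with provenance.
        for j, sent in enumerate(paragraphs[best].split(". ")):
            if tokenize(sent) & gaps:
                state.evidence.append(Evidence(sent.strip(), best, j))
    return state
```

In this sketch the answer would be produced from `state.evidence` alone, so the amount of text passed to the final reasoning step depends on how much evidence the query needs rather than on the length of the document.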

Lay Summary

Many important tasks require reading very long documents, such as scientific papers, technical manuals, books, or large codebases. Today’s language models can sometimes accept very long inputs, but processing everything at once is expensive, and the models can still miss important details. Other systems try to shorten the document first, but they may throw away information that later turns out to be needed. This paper introduces SCOUT, a method that treats a long document more like a space to explore than a block of text to read all at once. Given a question, SCOUT searches the document step by step, collects only the pieces of information that are useful for answering, and keeps a compact record of what it has learned and where each piece came from. It also checks what information is still missing before deciding where to look next. Across long-document benchmarks, SCOUT answers questions as accurately as strong existing systems while using far fewer tokens, which can reduce the cost of analyzing very long documents. It also remains more stable as documents grow to very large sizes. These results suggest that careful, question-guided exploration can be a practical alternative to feeding an entire long document into a model at once.