Thinking in Scales: Accelerating Gigapixel Pathology Image Analysis via Adaptive Continuous Reasoning
Abstract
Traditional whole slide image (WSI) analysis methods typically rely on the multiple instance learning (MIL) paradigm, which extracts patch-level features at high magnification and aggregates them for slide-level prediction. However, such exhaustive patch-level processing is computationally expensive, which severely limits the efficiency and scalability of WSI analysis. To address this challenge, we propose \textbf{PathCTM} (a \textbf{\ul{Path}}ology-oriented \textbf{\ul{C}}ontinuous \textbf{\ul{T}}hought \textbf{\ul{M}}odel), which enables token-efficient continuous reasoning in scale space for gigapixel WSIs. PathCTM formulates diagnostic inference as a dynamic sequential information pursuit: it progresses from global inspection at low magnification to local inspection at high magnification, and adaptively terminates inference once it has gathered enough evidence to bound decision uncertainty. Specifically, it uses conditional computation to switch scales dynamically, prunes regions under attention guidance, and stops early based on prediction confidence. Extensive experiments demonstrate that, compared with state-of-the-art MIL methods, PathCTM reduces the number of required image patches by 95.95\%, shortens inference time by 95.62\%, and improves AUC by an average of 2.3\%. Code is available at \url{https://anonymous.4open.science/r/PathCTM}.
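
The coarse-to-fine procedure described above can be illustrated with a minimal sketch. It is not the PathCTM implementation: the function names (`adaptive_inference`, `predict_fn`, `score_fn`, `expand_fn`), the top-$k$ pruning rule, and the entropy threshold used as the stopping criterion are all assumptions chosen for illustration. The sketch only shows the general control flow: predict at the current scale, stop if the prediction is confident enough, otherwise keep the highest-scoring regions and inspect them at the next magnification.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def adaptive_inference(root_regions, scales, predict_fn, score_fn, expand_fn,
                       keep_ratio=0.25, entropy_threshold=0.3):
    """Coarse-to-fine inference with region pruning and early stopping.

    All callables are hypothetical placeholders, not part of PathCTM:
      predict_fn(scale, regions) -> list of class probabilities
      score_fn(scale, region)    -> attention-like relevance score
      expand_fn(region)          -> sub-regions at the next finer scale
    Returns (probs, scale at which inference stopped, regions processed).
    """
    if not scales:
        raise ValueError("at least one scale is required")
    regions = list(root_regions)
    n_processed = 0
    for i, scale in enumerate(scales):
        n_processed += len(regions)
        probs = predict_fn(scale, regions)
        # Assumed stopping rule: low predictive entropy, or no finer scale left.
        if entropy(probs) <= entropy_threshold or i == len(scales) - 1:
            return probs, scale, n_processed
        # Assumed pruning rule: keep the top-k regions by relevance score.
        k = max(1, math.ceil(keep_ratio * len(regions)))
        ranked = sorted(range(len(regions)),
                        key=lambda j: score_fn(scale, regions[j]),
                        reverse=True)
        kept = [regions[j] for j in ranked[:k]]
        # Only the kept regions are inspected at the next magnification.
        regions = [child for r in kept for child in expand_fn(r)]
```

Under these assumptions, the number of processed regions grows with the number of kept regions per scale rather than with the full patch grid at the highest magnification, which is the source of the savings that this kind of scheme targets.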