Deep Trajectory Supervision: Deep Supervision Strikes Back
Abstract
Interpreting residual architectures as discretizations of Ordinary Differential Equations characterizes the forward pass as a continuous latent flow. Although this framework defines the mechanics of inference, conventional training paradigms constrain only the terminal state, leaving the intermediate evolution unregulated. In this work, we formalize the forward pass as a Conditional Discriminative Flow and investigate its intrinsic kinematic laws. Using Tuned Lens analysis, we find that the accumulation of semantic evidence follows a consistent exponential schedule: deep models naturally require an extended phase of feature construction before a rapid transition toward categorical certainty in the terminal layers. Standard training ignores this latent progression. To resolve this impedance mismatch, we propose Deep Trajectory Supervision (DTS), a framework that aligns auxiliary supervision with this intrinsic exponential bias. By rectifying the trajectory of the inference flow, DTS functions as a critical physically motivated inductive bias. Empirical evaluations on ImageNet-1K and additional benchmarks demonstrate that DTS significantly accelerates convergence and improves terminal performance.