Feed-Forward Taylor-Gaussians-Flow: Towards Non-uniform Motion for Novel View Synthesis from Monocular Video
Abstract
Long-term non-uniform motion poses a significant challenge for feed-forward Novel View Synthesis (\textbf{NVS}), as it requires modeling higher-order motion such as acceleration. Existing methods rely primarily on deformation fields or scene flow, which are limited to first-order approximations. Because they neglect higher-order motion representation and supervision, these approaches degrade in long-term non-uniform motion scenarios. Inspired by Taylor's theorem, we propose Taylor-Gaussians-Flow (\textbf{TGsF}) to represent and supervise non-uniform motion through first- and second-order motion components. TGsF comprises two key modules: Taylor-Gaussians (\textbf{TGs}) and Taylor-Gaussians-Flow (\textbf{TGs-Flow}). TGs represent motion using Gaussian means with a quadratic temporal term and time-dependent opacity. Unlike previous methods, TGs-Flow decouples scene-flow supervision into separate depth and 2D optical-flow constraints, which mitigates error propagation from either depth or motion estimation while circumventing the scarcity of labeled scene-flow data. Guided by this analysis, we develop the Feed-Forward Taylor-Gaussians-Flow (\textbf{FF-TGsF}) framework, which sets a new state of the art on four dynamic benchmarks.
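The quadratic temporal term mentioned above can be sketched as a truncated second-order Taylor expansion of each Gaussian mean about a reference time $t_0$; the symbols $\boldsymbol{\mu}_i^0$, $\mathbf{v}_i$, and $\mathbf{a}_i$ are illustrative placeholders for the per-Gaussian zeroth-, first-, and second-order motion components, not necessarily the paper's notation:

\[
\boldsymbol{\mu}_i(t) \;=\; \boldsymbol{\mu}_i^0 \;+\; \mathbf{v}_i\,(t - t_0) \;+\; \tfrac{1}{2}\,\mathbf{a}_i\,(t - t_0)^2,
\]

where $\mathbf{v}_i$ captures velocity (the first-order component used by deformation-field or scene-flow methods) and $\mathbf{a}_i$ adds the acceleration needed to model long-term non-uniform motion.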