Oral in Workshop: Text, camera, action! Frontiers in controllable video generation

Alignment is All You Need: A Training-free Augmentation Strategy for Pose-guided Video Generation

Xiaoyu Jin ⋅ Zunnan Xu ⋅ Mingwen Ou ⋅ Wenming Yang

Keywords: Action-controllable video models

Abstract

Character animation is a transformative field in computer graphics and vision, enabling dynamic and realistic video animations from static images. Despite recent advances, maintaining appearance consistency in animations remains a challenge. Our approach addresses this by introducing a training-free framework that ensures the generated video sequence preserves the reference image's subtleties, such as physique and proportions, through a dual alignment strategy. We decouple skeletal and motion priors from pose information, enabling precise control over animation generation. Our method also improves pixel-level alignment for conditional control from the reference character, enhancing the temporal consistency and visual cohesion of animations. It significantly enhances the quality of video generation without the need for large datasets or extensive computational resources.
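To make the idea of decoupling skeletal and motion priors concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical 5-joint 2D skeleton (`PARENTS`) and treats bone lengths as the skeletal prior and unit bone directions as the motion prior; the function names `decompose` and `retarget` are invented for this example. The driving pose contributes only directions, while the reference pose contributes proportions and root position.

```python
import math

# Hypothetical 5-joint 2D skeleton: index -> parent index (-1 marks the root).
# Parents are assumed to appear before their children.
PARENTS = [-1, 0, 1, 0, 3]


def decompose(pose, parents=PARENTS):
    """Split a pose into bone lengths (skeletal prior) and unit bone directions (motion prior)."""
    lengths, directions = [], []
    for joint, parent in enumerate(parents):
        if parent < 0:
            # The root has no incoming bone.
            lengths.append(0.0)
            directions.append((0.0, 0.0))
            continue
        dx = pose[joint][0] - pose[parent][0]
        dy = pose[joint][1] - pose[parent][1]
        length = math.hypot(dx, dy)
        lengths.append(length)
        directions.append((dx / length, dy / length) if length > 0 else (0.0, 0.0))
    return lengths, directions


def retarget(reference_pose, driving_pose, parents=PARENTS):
    """Rebuild the driving pose using the reference character's bone lengths and root."""
    ref_lengths, _ = decompose(reference_pose, parents)
    _, drive_dirs = decompose(driving_pose, parents)
    out = [None] * len(parents)
    for joint, parent in enumerate(parents):
        if parent < 0:
            out[joint] = (float(reference_pose[joint][0]), float(reference_pose[joint][1]))
        else:
            px, py = out[parent]
            ux, uy = drive_dirs[joint]
            out[joint] = (px + ref_lengths[joint] * ux, py + ref_lengths[joint] * uy)
    return out


# Reference character: long limbs (length 2). Driving pose: short limbs (length 1).
reference = [(0, 0), (0, 2), (0, 4), (2, 0), (4, 0)]
driving = [(5, 5), (6, 5), (7, 5), (5, 6), (5, 7)]
aligned = retarget(reference, driving)
```

In this toy setup, `aligned` follows the driving pose's limb directions but keeps the reference character's limb lengths, which is one simple way to keep physique and proportions stable across generated frames.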
