

Poster

Generative Point Cloud Registration

Haobo Jiang · Jin Xie · Jian Yang · Liang Yu · Jianmin Zheng

West Exhibition Hall B2-B3 #W-201
Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

In this paper, we propose a novel 3D registration paradigm, Generative Point Cloud Registration, which bridges advanced 2D generative models with 3D matching tasks to enhance registration performance. Our key idea is to generate cross-view consistent image pairs that are well-aligned with the source and target point clouds, enabling geometric-color feature fusion to facilitate robust matching. To ensure high-quality matching, the generated image pair should feature both 2D-3D geometric consistency and cross-view texture consistency. To achieve this, we introduce Match-ControlNet, a matching-specific, controllable 2D generative model. Specifically, it leverages the depth-conditioned generation capability of ControlNet to produce images that are geometrically aligned with depth maps derived from point clouds, ensuring 2D-3D geometric consistency. Additionally, by incorporating a coupled conditional denoising scheme and coupled prompt guidance, Match-ControlNet further promotes cross-view feature interaction, guiding texture-consistent generation. Our generative 3D registration paradigm is general and can be seamlessly integrated into various registration methods to enhance their performance. Extensive experiments on the 3DMatch and ScanNet datasets verify the effectiveness of our approach.
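The depth-conditioned generation step described above can be approximated with off-the-shelf tools. Below is a minimal sketch, not the authors' Match-ControlNet: it renders a depth map from a camera-frame point cloud by pinhole projection and feeds it to a publicly available depth ControlNet via the diffusers library. The camera intrinsics, file names, checkpoints, and prompt are illustrative assumptions.

```python
# Minimal sketch: depth-conditioned image generation from a point cloud.
# NOT the paper's Match-ControlNet; an approximation using an off-the-shelf
# depth ControlNet. Intrinsics, checkpoints, and prompt are assumptions.
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

def render_depth(points, K, hw=(512, 512)):
    """Z-buffer pinhole projection of an (N, 3) camera-frame point cloud."""
    h, w = hw
    depth = np.full((h, w), np.inf, dtype=np.float32)
    z = points[:, 2]
    valid = z > 1e-6
    u = (K[0, 0] * points[valid, 0] / z[valid] + K[0, 2]).astype(int)
    v = (K[1, 1] * points[valid, 1] / z[valid] + K[1, 2]).astype(int)
    in_img = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    np.minimum.at(depth, (v[in_img], u[in_img]), z[valid][in_img])
    depth[np.isinf(depth)] = 0.0  # empty pixels become background
    # Normalize to an 8-bit, 3-channel image as the pipeline expects.
    d = (255 * depth / max(depth.max(), 1e-6)).astype(np.uint8)
    return Image.fromarray(np.stack([d] * 3, axis=-1))

# Assumed public checkpoints: an SD-1.5 backbone with a depth ControlNet.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

K = np.array([[525.0, 0, 256.0],
              [0, 525.0, 256.0],
              [0, 0, 1.0]])                    # assumed intrinsics
points = np.load("source_points_cam.npy")      # hypothetical (N, 3) cloud
depth_image = render_depth(points, K)
image = pipe("an indoor room scan", image=depth_image,
             num_inference_steps=20).images[0]
image.save("generated_source_view.png")
```

Running this once per point cloud yields an image pair geometrically aligned with the two depth maps; the paper's coupled denoising and prompt guidance (not sketched here) are what additionally enforce cross-view texture consistency.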

Lay Summary:

Our research introduces a new 3D matching paradigm called Generative Point Cloud Registration. We've found a way to use advanced 2D generative AI to help with this 3D problem. Our key idea is to generate realistic image pairs from different viewpoints of the 3D scene. These images are carefully created to be consistent with both the 3D scene geometry and their visual appearance across views. By combining information from both the 3D scans and these generated 2D images, we can more accurately and reliably match the 3D objects.

To achieve this, we developed Match-ControlNet, a specialized 2D generative model for the 3D matching task. It uses information about the 3D depth of the scene to create images that are geometrically accurate. Additionally, it ensures that the generated images from different viewpoints have consistent textures, making them ideal for matching.

Our proposed paradigm is plug-and-play and can improve the performance of many existing 3D registration techniques. We've shown its effectiveness through extensive testing on widely used 3D datasets.
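As a rough illustration of how 3D and 2D information can be combined, the sketch below is an assumption about the fusion step, not the paper's exact scheme: it samples features from a generated image at each point's projected pixel and concatenates them with per-point geometric descriptors, so a downstream matcher sees both cues.

```python
# Minimal sketch of geometric-color feature fusion (an assumption, not the
# paper's exact scheme): sample generated-image features at each point's
# projected pixel location and concatenate with geometric descriptors.
import torch
import torch.nn.functional as F

def fuse_features(points, geo_feats, image_feats, K):
    """points: (N, 3) camera-frame coordinates; geo_feats: (N, Dg);
    image_feats: (1, Dc, H, W) from the generated image; K: 3x3 intrinsics."""
    _, _, H, W = image_feats.shape
    z = points[:, 2].clamp(min=1e-6)
    u = K[0, 0] * points[:, 0] / z + K[0, 2]
    v = K[1, 1] * points[:, 1] / z + K[1, 2]
    # Normalize pixel coordinates to [-1, 1] for grid_sample.
    grid = torch.stack([2 * u / (W - 1) - 1, 2 * v / (H - 1) - 1], dim=-1)
    grid = grid.view(1, 1, -1, 2)
    sampled = F.grid_sample(image_feats, grid, align_corners=True)  # (1, Dc, 1, N)
    color_feats = sampled.squeeze(0).squeeze(1).t()                 # (N, Dc)
    return torch.cat([geo_feats, color_feats], dim=-1)              # (N, Dg + Dc)
```

Fused descriptors of this kind can then be handed to any correspondence-based registration pipeline, which is what makes the paradigm plug-and-play.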
