Rethinking Serialization in Linear 3D Vision: Decoupling Anisotropic Geometry from Isotropic Semantics
Abstract
Current linear state-space models for 3D point clouds typically rely on 1D serialization (e.g., Hilbert curves) for global modeling. Such rigid ordering disrupts spatial continuity in dense scenes, introducing what we term Serialization Bias. We propose AnIsoNet, a framework that decouples anisotropic geometry from isotropic semantics via two dedicated modules: Local Anisotropy Geometric Modeling (LAGM) and Global Isotropy Semantic Aggregation (GISA). LAGM employs ellipsoidal encoding to capture local directionality without imposing a global order. GISA adapts to geometric characteristics via two modes: content-based accumulation (Identity Mode) for dense scenes and Morton serialization for sparse objects. This eliminates redundant multi-view scanning while maintaining O(N) complexity. Experiments show that avoiding artificial serialization in dense scenes yields 82.62 % mIoU on S3DIS (surpassing PCM by 3.0 %), while Morton serialization for sparse objects achieves 94.21 % OA on ScanObjectNN (+1.6 %). On ScanNetV2, we reach 78.52 % mIoU, surpassing PTv3 (77.5 %) without pre-training. We achieve these results with only 12.2 M parameters (26.4 % of PTv3's).
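To make the Morton (Z-order) serialization mentioned above concrete, here is a minimal illustrative sketch, not taken from AnIsoNet: coordinates are quantized to an integer grid and their bits interleaved, so spatially nearby points tend to receive nearby keys. The helper `part_bits` and the toy point cloud are our own assumptions for illustration.

```python
def part_bits(v: int, bits: int) -> int:
    """Spread the low `bits` bits of v so they occupy every third position."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (3 * i)
    return out

def morton3d(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the bits of quantized x, y, z into one Z-order key."""
    return part_bits(x, bits) | (part_bits(y, bits) << 1) | (part_bits(z, bits) << 2)

# Serialize a toy point cloud: sort point indices by their Morton key.
points = [(3, 1, 0), (0, 0, 0), (1, 1, 1), (2, 2, 2)]
order = sorted(range(len(points)), key=lambda i: morton3d(*points[i]))
```

Sorting by this key gives a 1D traversal of the 3D grid; production pipelines typically use branch-free bit tricks or library routines instead of the loop shown here.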