Radial Scaling Voxelization for Accurate Small Object 3D Detection
Abstract
Voxel-based 3D object detectors typically discretize space with a uniform Cartesian grid, which assigns the same voxel size to near-range and far-range regions. This uniform discretization is suboptimal for small objects such as pedestrians and cyclists: they occupy only a few voxels, so their fine-grained geometric details are poorly captured. Increasing the global voxel resolution alleviates this problem, but it substantially increases memory consumption and computational overhead. In this paper, we propose Radial Scaling Voxelization (RSV), a simple yet effective non-uniform discretization strategy that adaptively modulates the effective voxel size according to the radial distance from the LiDAR sensor. Unlike previous cylindrical or polar discretization schemes, RSV preserves the Cartesian grid topology by applying a continuous radial scaling function to the input coordinates before standard voxelization. This operation yields a near-high, far-unchanged resolution pattern: the effective voxel size becomes finer in near regions, where the geometric structures of small objects are difficult to capture, and remains nearly unchanged in far regions, which avoids unnecessary computational cost. Importantly, RSV is architecture-agnostic and can directly replace the discretization module of any voxel-based detector without modifying the backbone, network design, or training pipeline. Extensive experiments on the KITTI and nuScenes datasets demonstrate that integrating RSV into several voxel-based baselines consistently improves small-object detection performance, especially for the Pedestrian and Cyclist categories, while incurring only marginal additional computational overhead.
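To make the core idea concrete, the following is a minimal sketch of scaling coordinates radially before ordinary Cartesian voxelization. The abstract does not specify the scaling function, so everything here is an assumption for illustration only: the exponential form s(r) = 1 + alpha * exp(-r / r0), the parameters `alpha` and `r0`, the choice to scale only x and y, and the function names `radial_scale` and `voxelize`.

```python
import numpy as np

def radial_scale(points, alpha=1.0, r0=10.0):
    """Scale x and y by an ASSUMED factor s(r) = 1 + alpha * exp(-r / r0).

    Near the sensor s(r) approaches 1 + alpha (coordinates are stretched, so a
    fixed voxel covers less real space); far away s(r) approaches 1, so
    coordinates are left almost unchanged. alpha and r0 are hypothetical.
    """
    xy = points[:, :2]
    r = np.linalg.norm(xy, axis=1, keepdims=True)  # radial distance in the x-y plane
    s = 1.0 + alpha * np.exp(-r / r0)
    scaled = points.copy()
    scaled[:, :2] = xy * s
    return scaled

def voxelize(points, voxel_size, pc_min):
    """Standard uniform Cartesian voxelization: integer voxel index per point."""
    return np.floor((points[:, :3] - pc_min) / voxel_size).astype(np.int64)

# Two near-range points 0.1 m apart, voxelized with a 0.2 m grid.
near = np.array([[2.02, 0.0, 0.0],
                 [2.12, 0.0, 0.0]])
pc_min = np.array([-50.0, -50.0, -3.0])

idx_uniform = voxelize(near, 0.2, pc_min)               # both fall in one voxel
idx_scaled = voxelize(radial_scale(near), 0.2, pc_min)  # now in different voxels
```

In this sketch the grid itself is untouched: the same `voxelize` call is used in both cases, and only the input coordinates change. Under the assumed function, the mapping r to r * s(r) remains monotonic for moderate `alpha`, so the ordering of points along each ray is preserved.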