Position: Human-Centric Vision Requires Topological Generalization Beyond Fixed Skeletal Topologies
Abstract
In this position paper, we argue that human-centric vision must generalize across skeletal topologies rather than assume a single fixed skeleton. Mainstream pose and body pipelines enforce a fixed skeleton graph: an indexed joint list with fixed adjacency. Because this joint inventory has no way to represent structural absence, an anatomically absent joint becomes an ill-posed prediction target for individuals with limb deficiencies. Anatomical absence is not a visibility state, so masking and forced completion can hide the structural mismatch and produce hallucinated structure that contaminates downstream reasoning in prosthesis-facing settings. We argue that scaling data and model size alone does not resolve this mismatch while the skeleton schema remains fixed. Nor is this a niche concern: these failures affect a large population and reach accessibility-facing systems. We advocate instance-adaptive skeletal topology, in which a model jointly predicts joint existence and skeletal connectivity to produce an instance-specific skeleton graph that supports consistent inference and evaluation. We outline measurement upgrades, including existence-aware annotations with explicit absence semantics, skeletal-topology-aware scoring, and hallucination-under-absence penalties. We close with a call to action for dataset curators, benchmark organizers, and model builders to treat morphological variation as a first-class generalization axis.
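To make the proposal concrete, the following is a minimal sketch of what an instance-specific skeleton graph and a hallucination-under-absence score could look like. It is illustrative only and is not the paper's implementation: the joint names, the 0.5 threshold, the probability dictionaries, and the function names `instance_skeleton` and `hallucination_under_absence` are all assumptions introduced here.

```python
from __future__ import annotations


def instance_skeleton(
    joint_exist: dict[str, float],
    edge_prob: dict[tuple[str, str], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> tuple[set[str], set[tuple[str, str]]]:
    """Build a per-person skeleton from predicted existence and connectivity.

    A joint is kept if its predicted existence probability reaches the
    threshold. An edge is kept only if its own probability reaches the
    threshold and both of its endpoints were kept.
    """
    joints = {name for name, p in joint_exist.items() if p >= threshold}
    edges = {
        (a, b)
        for (a, b), p in edge_prob.items()
        if p >= threshold and a in joints and b in joints
    }
    return joints, edges


def hallucination_under_absence(
    pred_joints: set[str], gt_exists: dict[str, bool]
) -> float:
    """Fraction of anatomically absent joints that the prediction asserts.

    Returns 0.0 when the ground truth contains no absent joints.
    """
    absent = {name for name, exists in gt_exists.items() if not exists}
    if not absent:
        return 0.0
    return len(pred_joints & absent) / len(absent)


# Hypothetical example: a left arm with no wrist joint.
joint_exist = {"shoulder_l": 0.98, "elbow_l": 0.95, "wrist_l": 0.10}
edge_prob = {("shoulder_l", "elbow_l"): 0.97, ("elbow_l", "wrist_l"): 0.20}
gt_exists = {"shoulder_l": True, "elbow_l": True, "wrist_l": False}

adaptive_joints, adaptive_edges = instance_skeleton(joint_exist, edge_prob)

# A fixed-schema pipeline that always emits every joint (forced completion).
forced_joints = set(joint_exist)
```

Under these assumptions, the adaptive prediction drops the absent wrist and the edge leading to it, while the forced-completion output asserts a joint that does not exist and would be penalized by the absence-aware score.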