Poster in Workshop: Pluralistic Alignment Workshop

Position: Align AI to Our Aspirations, Not Our Flaws

Nikita Kazeev ⋅ Bui Nhat Huyen Phan

Abstract

This position paper argues that aligning AI to aggregated human preferences is the wrong target. With current technology and modest expense, we can train AIs to share the values of a Silicon Valley techno-optimist, a degrowth environmentalist, a national-conservative culture warrior, a single-party state cadre, or a devout religious traditionalist. We should not. Human values produce societies that thrive or fail on the merits of those values; the failures range from failed states and extreme inequality to falling happiness, political polarization, and government dysfunction in the world's wealthiest democracies. The pluralistic-alignment program correctly diagnoses that there is no single "humanity" to align with, but it is dangerous if taken as the main directive. We argue that AI should be trained to meet a non-negotiable floor of objective alignment goals: factual accuracy, competence, honesty, and lawfulness. Pluralism belongs at the surface (language, register, conventions, missing-context defaults) and across the wide band of legitimate value tradeoffs that respect the floor, but not at the level of values that violate it. We highlight the empirical reality of unfiltered pluralistic values, propose four commitments as a constructive alternative, and engage five credible objections: commercial pressure, democratic legitimacy, practical feasibility, the charge that the floor itself is culturally laden, and the limits of Coherent Extrapolated Volition.