Preference Alignment Improves Information Conveyance in Language Models
Yuwei Cheng ⋅ Weiyi Tian ⋅ Haifeng Xu
Abstract
Preference alignment is often believed to reduce uncertainty and diversity in large language models, but existing analyses overlook output length, a key confounder, and therefore fail to capture how uncertainty is distributed across an entire generation rollout. To address this, we propose Canopy Entropy ($\mathrm{CE}^\star$), a measure that treats language generation as a tree whose "canopy" is the space of all possible rollouts, so that $\mathrm{CE}^\star$ naturally quantifies the effective size of the generation space. $\mathrm{CE}^\star$ jointly captures the output length $N$ and the generated sequence $Y_{1:N}$, and we show that it equals the total Shannon entropy $H(N, Y_{1:N}\mid X)$, where $X$ denotes the prompt. This formulation yields interpretable metrics, including a length--uncertainty correlation term $\rho(N, r_N)$, where $r_N$ is the entropy rate. This term quantifies information conveyance efficiency by indicating whether longer outputs are more or less informative per token. Empirically, across tasks and model families, we find that aligned instruction-tuned models consistently exhibit a stronger positive correlation $\rho(N, r_N)$, even when total entropy decreases. Furthermore, after controlling for model family, task, prompt, and output-length effects, we find that alignment nearly triples the strength of the relationship between entropy rate and semantic diversity, suggesting that aligned models convert uncertainty into semantically meaningful variation much more efficiently. Overall, these results suggest that preference alignment does not simply reduce uncertainty, but fundamentally reorganizes it into more informative and semantically meaningful generations.
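To make the quantities in the abstract concrete, the following is a minimal sketch of how $H(N, Y_{1:N}\mid X)$ and $\rho(N, r_N)$ might be estimated from sampled rollouts for a single prompt. It is not the authors' implementation. It assumes that each rollout is given as a list of per-token natural-log probabilities that includes the end-of-sequence token, so that the summed log-probability is the joint log-probability of the length and the sequence. It also assumes that $r_N$ can be approximated per rollout as surprisal divided by length, and that $\rho$ is a Pearson correlation; the abstract does not specify either choice. The names `rollout_stats` and `pearson` are hypothetical.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def rollout_stats(rollouts):
    """Estimate total entropy and the length/entropy-rate correlation.

    rollouts: list of non-empty lists of per-token log-probs (natural log),
    each sampled from the model for the same prompt, EOS token included.
    Returns (total_entropy_in_nats, rho).
    """
    lengths = [len(lp) for lp in rollouts]
    # Surprisal of a whole rollout: -log p(N, Y_{1:N} | X).
    surprisals = [-sum(lp) for lp in rollouts]
    # Monte Carlo estimate of H(N, Y_{1:N} | X): mean surprisal over samples.
    total_entropy = mean(surprisals)
    # Assumed per-rollout proxy for the entropy rate r_N: nats per token.
    rates = [s / n for s, n in zip(surprisals, lengths)]
    return total_entropy, pearson(lengths, rates)


if __name__ == "__main__":
    # Toy log-probs, not real model outputs.
    toy = [[-1.0], [-2.0, -2.0], [-3.0, -3.0, -3.0]]
    print(rollout_stats(toy))
```

In this toy input, longer rollouts also carry more surprisal per token, so the correlation is positive, which is the pattern the abstract reports as stronger in aligned models.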