Adaptive Action Chunking Strategy from World Feedback in Mixed Traffic
Hongki Kim ⋅ Sangeun Park ⋅ Minhae Kwon
Abstract
Learning from real-world feedback requires policies that can act on measurable interaction outcomes rather than on human preference signals alone. In sequential decision-making tasks, such feedback often depends on consequences that unfold over multiple future timesteps under uncertainty. We propose $\textbf{UA2C}$, an uncertainty-aware adaptive action chunking framework for learning from world feedback in mixed traffic. UA2C first learns a flow-matching chunk policy from offline data and then refines it through online interaction. To account for behaviorally diverse surrounding vehicles, UA2C incorporates a driving-style inference module that augments the policy with local behavior context. At execution time, UA2C estimates uncertainty from sampled action chunks and executes only a reliable prefix of the chunk before replanning. Experiments show that UA2C improves driving performance over both a one-step method and fixed-chunk execution.
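The prefix-execution idea can be illustrated with a minimal sketch. The abstract does not specify how uncertainty is computed or how the prefix is chosen, so everything below is an assumption made for illustration: uncertainty is taken as the variance across sampled chunks at each timestep, the prefix ends at the first timestep whose variance exceeds a fixed `threshold`, and the names `per_step_uncertainty`, `reliable_prefix_length`, and `select_prefix` are hypothetical rather than taken from UA2C.

```python
from statistics import pvariance
from typing import List, Sequence

# A chunk is a list of H actions; each action is a list of floats.
Chunk = Sequence[Sequence[float]]


def per_step_uncertainty(chunks: Sequence[Chunk]) -> List[float]:
    """Assumed uncertainty measure: variance across sampled chunks,
    averaged over action dimensions, computed at each timestep."""
    horizon = len(chunks[0])
    action_dim = len(chunks[0][0])
    scores = []
    for t in range(horizon):
        dim_vars = [
            pvariance([chunk[t][d] for chunk in chunks])
            for d in range(action_dim)
        ]
        scores.append(sum(dim_vars) / action_dim)
    return scores


def reliable_prefix_length(
    scores: Sequence[float], threshold: float, min_steps: int = 1
) -> int:
    """Count leading timesteps whose uncertainty stays at or below the
    threshold; always return at least min_steps so the agent acts."""
    length = 0
    for score in scores:
        if score > threshold:
            break
        length += 1
    return max(min_steps, length)


def select_prefix(chunks: Sequence[Chunk], threshold: float) -> List[List[float]]:
    """Return the reliable prefix of the first sampled chunk.
    The remaining steps are discarded and the policy replans."""
    k = reliable_prefix_length(per_step_uncertainty(chunks), threshold)
    return [list(action) for action in chunks[0][:k]]


if __name__ == "__main__":
    # Three sampled chunks (horizon 4, one action dimension) that agree
    # on the first two steps and diverge afterwards.
    samples = [
        [[0.0], [0.1], [0.2], [0.3]],
        [[0.0], [0.1], [0.5], [0.9]],
        [[0.0], [0.1], [-0.1], [-0.3]],
    ]
    print(select_prefix(samples, threshold=0.01))
```

In this toy case the samples agree on the first two steps, so only those would be executed before sampling a new set of chunks. A fixed-chunk baseline would execute all four steps, and a one-step method would execute only the first.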