

Poster

DC-W2S: Dual-Consensus Weak-to-Strong Training for Reliable Process Reward Modeling in Biological Reasoning

Chi-Min Chan ⋅ Ehsan Hajiramezanali ⋅ Xiner Li ⋅ Edward De Brouwer ⋅ Carl Edwards ⋅ Wei Xue ⋅ Sirui Han ⋅ Yike Guo ⋅ Gabriele Scalia

Abstract
