Search All 2024 Events
 

377 Results

Page 31 of 32
Workshop
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge?
Zhaorun Chen · Yichao Du · Zichen Wen · Yiyang Zhou · Chenhang Cui · Zhenzhen Weng · Haoqin Tu · Chaoqi Wang · Zhengwei Tong · Leria HUANG · Canyu Chen · Qinghao Ye · Zhihong Zhu · Yuqing Zhang · Jiawei Zhou · Zhuokai Zhao · Rafael Rafailov · Chelsea Finn · Huaxiu Yao
Poster
Tue 2:30 Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wołczyk · Bartłomiej Cupiał · Mateusz Ostaszewski · Michał Bortkiewicz · Michał Zając · Razvan Pascanu · Lukasz Kucinski · Piotr Milos
Workshop
Generative Model for Small Molecules with Latent Space RL Fine-Tuning to Protein Targets
Ulrich Armel Mbou Sob · Qiulin Li · Miguel Arbesú · Oliver Bent · Andries Smit · Arnu Pretorius
Poster
Thu 4:30 BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Gaurav Pandey · Yatin Nandwani · Tahira Naseem · Mayank Mishra · Guangxuan Xu · Dinesh Raghu · Sachindra Joshi · Asim Munawar · Ramón Astudillo
Poster
Wed 4:30 Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman · Michał Bortkiewicz · Piotr Milos · Tomasz Trzcinski · Mateusz Ostaszewski · Marek Cygan
Poster
Wed 4:30 Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Michael Matthews · Michael Beukman · Benjamin Ellis · Mikayel Samvelyan · Matthew T Jackson · Samuel Coward · Jakob Foerster
Workshop
Generative Design of Decision Tree Policies for Reinforcement Learning
Jacob Pettit · Chak Shing Lee · Jiachen Yang · Alex Ho · Daniel Faissol · Brenden Petersen · Mikel Landajuela
Workshop
Language Model-In-The-Loop: Data Optimal Approach to Recommend Actions in Text Games
Arjun V SS · Prasanna Parthasarathi · Janarthanan Rajendran · Sarath Chandar
Poster
Thu 2:30 Configurable Mirror Descent: Towards a Unification of Decision Making
Pengdeng Li · Shuxin Li · Chang Yang · Xinrun Wang · Shuyue Hu · Xiao Huang · Hau Chan · Bo An
Poster
Wed 2:30 No-Regret Reinforcement Learning in Smooth MDPs
Davide Maran · Alberto Maria Metelli · Matteo Papini · Marcello Restelli
Poster
Tue 4:30 Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro · Marco Mussi · Alberto Maria Metelli · Matteo Papini
Poster
Tue 4:30 Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Yen-Ju Chen · Nai-Chieh Huang · Ching-pei Lee · Ping-Chun Hsieh