Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Wenshuo Guo

Author Information

Wenshuo Guo (UC Berkeley)