Workshop
Assessing World Models: Methods and Metrics for Evaluating Understanding
Kenneth Li · Belinda Li · Hao Tang · Keyon Vafa · Lio Wong · Lionel Wong
Generative models across domains can produce outputs that appear to mimic the real world. But have these systems actually understood the laws that govern the world? Researchers across subfields are attempting to answer this question: in natural language processing, researchers measure whether LLMs understand real-world mechanisms in order to gauge how robust they are to new tasks; in video generation, researchers assess whether a model has understood the laws of physics in order to evaluate how realistic its videos are; and in scientific domains, foundation models are being developed in order to uncover new theories about the world. Despite studying similar questions, these communities remain disparate. This workshop will explore the question: how can we formalize and evaluate whether generative models have understood the real world? While this question is important across communities, there are no unified frameworks for defining and evaluating world models. The workshop will bring together these computer science communities along with scientists from outside computer science working on relevant applications. Our invited speakers include Jacob Andreas, Shiry Ginosar, Shirley Ho, Sendhil Mullainathan, and Martin Wattenberg, all of whom have confirmed that they will speak and attend in person.