Invited Talk Mon, Jul 6, 2026 • 4:30 PM – 5:30 PM PDT HALL C

Towards AI Agents In the Real World

Pascale FUNG

Abstract

Recent advances in AI agents have been driven by imitation learning with reinforcement learning in the digital world, based on large-scale generative models, yielding strong performance in many online tasks but limited capability in physical-world settings. I argue for a shift toward AI agents grounded in world modeling, which allows them to understand the physical environment, user intentions, and social contexts, thereby enhancing their ability to perform complex tasks autonomously in the real world. World modeling encompasses the integration of multimodal perception, planning through reasoning for action and control, and memory to create a comprehensive understanding of the physical world. I argue that achieving advanced machine intelligence requires modeling both the physical world and the mental world, including latent variables such as intent, attention, and context. I outline key challenges toward building context-aware, interactive agents in the real world. This essential trajectory demands continued efforts to develop robust world models and embodied agents that can truly assist humans with real tasks in the real world.

Speaker

Pascale FUNG

Pascale Fung’s long-term research background is in multimodal interactive systems, including audio, speech, text, and video. She started research on world modeling after studying the limitations of generative models due to hallucinations. She is the Co-founder and Chief Research & Innovation Officer at AMI Labs. She was previously the Senior Director of AI Research at Meta-FAIR, leading research on embodied AI agents. She is also a Chair Professor of ECE at The Hong Kong University of Science & Technology (HKUST). She is a Fellow of the ACL, AAAI, IEEE, and ISCA for her significant contributions to human-machine interactions.