Videos of actions are complex signals containing rich compositional structure in space and time. Current video generation methods lack the ability to condition the generation on multiple coordinated and potentially simultaneous timed actions. To address this challenge, we propose to represent the actions in a graph structure called Action Graph and present the new "Action Graph To Video" synthesis task. Our generative model for this task (AG2Vid) disentangles motion and appearance features and, by incorporating a scheduling mechanism for actions, facilitates timely and coordinated video generation. We train and evaluate AG2Vid on the CATER and Something-Something V2 datasets, where it produces videos with better visual quality and semantic consistency than baselines. Finally, our model demonstrates zero-shot abilities by synthesizing novel compositions of the learned actions.
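As a rough illustration of what a graph of timed actions might look like in code, the sketch below stores each action as an edge with a start and end frame and looks up which actions are active at a given frame. It is not the AG2Vid implementation: the names `TimedAction` and `active_actions`, the inclusive frame interval, the linear progress value, and the example objects and actions are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedAction:
    """One edge of a hypothetical action graph (illustrative only)."""
    subject: str  # object performing the action
    action: str   # action label, e.g. "slide"
    target: str   # object the action is directed at
    start: int    # first frame at which the action is active (inclusive)
    end: int      # last frame at which the action is active (inclusive)


def active_actions(graph, t):
    """Return (edge, progress) pairs for every edge active at frame t.

    progress runs linearly from 0.0 at edge.start to 1.0 at edge.end
    (an assumed convention, not taken from the paper).
    """
    result = []
    for edge in graph:
        if edge.start <= t <= edge.end:
            span = max(edge.end - edge.start, 1)  # avoid division by zero
            result.append((edge, (t - edge.start) / span))
    return result


# Two overlapping actions: both are active between frames 5 and 10.
graph = [
    TimedAction("cone", "slide", "cube", start=0, end=10),
    TimedAction("sphere", "rotate", "sphere", start=5, end=15),
]

for edge, progress in active_actions(graph, 5):
    print(f"{edge.subject} {edge.action} {edge.target}: {progress:.2f}")
```

Querying the graph frame by frame in this way is one simple way to handle simultaneous actions: a generator conditioned on the returned pairs would know both which actions apply and how far along each one is.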
Author Information
Amir Bar (Tel Aviv University)
Roi Herzig (Tel Aviv University)
Xiaolong Wang (UCSD)

Our group has broad interests in computer vision, machine learning, and robotics. Our focus is on learning 3D and dynamics representations from videos and physical robotic interaction data. We explore various supervision signals from the data itself, language, and common-sense knowledge. We leverage these comprehensive representations to facilitate the learning of robot skills, with the goal of enabling robots to interact effectively with a wide range of objects and environments in the real physical world. Our research topics include self-supervised learning, video understanding, common-sense reasoning, RL and robotics, 3D interaction, and dexterous hands.
Anna Rohrbach (UC Berkeley)
Gal Chechik (NVIDIA / Bar-Ilan University)
Trevor Darrell (University of California at Berkeley)
Amir Globerson (Tel Aviv University, Google)
Related Events (a corresponding poster, oral, or spotlight)
2021 Spotlight: Compositional Video Synthesis with Action Graphs »
Wed. Jul 21st 02:35 -- 02:40 AM Room
More from the Same Authors
2021 : Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation »
Nicklas Hansen · Hao Su · Xiaolong Wang -
2021 : Disentangled Attention as Intrinsic Regularization for Bimanual Multi-Object Manipulation »
Minghao Zhang · Pingcheng Jian · Yi Wu · Harry (Huazhe) Xu · Xiaolong Wang -
2021 : Learning Vision-Guided Quadrupedal Locomotion with Cross-Modal Transformers »
Ruihan Yang · Minghao Zhang · Nicklas Hansen · Harry (Huazhe) Xu · Xiaolong Wang -
2021 : Explaining Reinforcement Learning Policies through Counterfactual Trajectories »
Julius Frost · Olivia Watkins · Eric Weiner · Pieter Abbeel · Trevor Darrell · Bryan Plummer · Kate Saenko -
2023 : Learning to Initiate and Reason in Event-Driven Cascading Processes »
Yuval Atzmon · Eli Meirom · Shie Mannor · Gal Chechik -
2023 : LLM-grounded Text-to-Image Diffusion Models »
Long (Tony) Lian · Boyi Li · Adam Yala · Trevor Darrell -
2023 Poster: Learning Dense Correspondences between Photos and Sketches »
Xuanchen Lu · Xiaolong Wang · Judith E. Fan -
2023 Poster: Learning to Initiate and Reason in Event-Driven Cascading Processes »
Yuval Atzmon · Eli Meirom · Shie Mannor · Gal Chechik -
2023 Oral: Equivariant Architectures for Learning in Deep Weight Spaces »
Aviv Navon · Aviv Shamsian · Idan Achituve · Ethan Fetaya · Gal Chechik · Haggai Maron -
2023 Poster: Equivariant Architectures for Learning in Deep Weight Spaces »
Aviv Navon · Aviv Shamsian · Idan Achituve · Ethan Fetaya · Gal Chechik · Haggai Maron -
2023 Poster: MonoNeRF: Learning Generalizable NeRFs from Monocular Videos without Camera Poses »
Yang Fu · Ishan Misra · Xiaolong Wang -
2023 Poster: On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline »
Nicklas Hansen · Zhecheng Yuan · Yanjie Ze · Tongzhou Mu · Aravind Rajeswaran · Hao Su · Huazhe Xu · Xiaolong Wang -
2023 Poster: Graph Positional Encoding via Random Feature Propagation »
Moshe Eliasof · Fabrizio Frasca · Beatrice Bevilacqua · Eran Treister · Gal Chechik · Haggai Maron -
2023 Poster: Auxiliary Learning as an Asymmetric Bargaining Game »
Aviv Shamsian · Aviv Navon · Neta Glazer · Kenji Kawaguchi · Gal Chechik · Ethan Fetaya -
2022 : Back to the Source: Test-Time Diffusion-Driven Adaptation »
Jin Gao · Jialing Zhang · Xihui Liu · Trevor Darrell · Evan Shelhamer · Dequan Wang -
2022 Poster: Temporal Difference Learning for Model Predictive Control »
Nicklas Hansen · Hao Su · Xiaolong Wang -
2022 Poster: Visual Attention Emerges from Recurrent Sparse Reconstruction »
Baifeng Shi · Yale Song · Neel Joshi · Trevor Darrell · Xin Wang -
2022 Spotlight: Visual Attention Emerges from Recurrent Sparse Reconstruction »
Baifeng Shi · Yale Song · Neel Joshi · Trevor Darrell · Xin Wang -
2022 Spotlight: Temporal Difference Learning for Model Predictive Control »
Nicklas Hansen · Hao Su · Xiaolong Wang -
2022 Poster: Efficient Learning of CNNs using Patch Based Features »
Alon Brutzkus · Amir Globerson · Eran Malach · Alon Regev Netser · Shai Shalev-Shwartz -
2022 Poster: Optimizing Tensor Network Contraction Using Reinforcement Learning »
Eli Meirom · Haggai Maron · Shie Mannor · Gal Chechik -
2022 Poster: Zero-Shot Reward Specification via Grounded Natural Language »
Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell -
2022 Spotlight: Efficient Learning of CNNs using Patch Based Features »
Alon Brutzkus · Amir Globerson · Eran Malach · Alon Regev Netser · Shai Shalev-Shwartz -
2022 Spotlight: Zero-Shot Reward Specification via Grounded Natural Language »
Parsa Mahmoudieh · Deepak Pathak · Trevor Darrell -
2022 Spotlight: Optimizing Tensor Network Contraction Using Reinforcement Learning »
Eli Meirom · Haggai Maron · Shie Mannor · Gal Chechik -
2022 Poster: Multi-Task Learning as a Bargaining Game »
Aviv Navon · Aviv Shamsian · Idan Achituve · Haggai Maron · Kenji Kawaguchi · Gal Chechik · Ethan Fetaya -
2022 Spotlight: Multi-Task Learning as a Bargaining Game »
Aviv Navon · Aviv Shamsian · Idan Achituve · Haggai Maron · Kenji Kawaguchi · Gal Chechik · Ethan Fetaya -
2021 Workshop: ICML Workshop on Human in the Loop Learning (HILL) »
Trevor Darrell · Xin Wang · Li Erran Li · Fisher Yu · Zeynep Akata · Wenwu Zhu · Pradeep Ravikumar · Shiji Zhou · Shanghang Zhang · Kalesha Bullard -
2021 Poster: GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning »
Idan Achituve · Aviv Navon · Yochai Yemini · Gal Chechik · Ethan Fetaya -
2021 Spotlight: GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning »
Idan Achituve · Aviv Navon · Yochai Yemini · Gal Chechik · Ethan Fetaya -
2021 Poster: Personalized Federated Learning using Hypernetworks »
Aviv Shamsian · Aviv Navon · Ethan Fetaya · Gal Chechik -
2021 Spotlight: Personalized Federated Learning using Hypernetworks »
Aviv Shamsian · Aviv Navon · Ethan Fetaya · Gal Chechik -
2021 Poster: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent »
Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry -
2021 Oral: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent »
Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry -
2021 Poster: Towards Understanding Learning in Neural Networks with Linear Teachers »
Roei Sarussi · Alon Brutzkus · Amir Globerson -
2021 Poster: Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks »
Eli Meirom · Haggai Maron · Shie Mannor · Gal Chechik -
2021 Spotlight: Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks »
Eli Meirom · Haggai Maron · Shie Mannor · Gal Chechik -
2021 Spotlight: Towards Understanding Learning in Neural Networks with Linear Teachers »
Roei Sarussi · Alon Brutzkus · Amir Globerson -
2020 Workshop: 2nd ICML Workshop on Human in the Loop Learning (HILL) »
Shanghang Zhang · Xin Wang · Fisher Yu · Jiajun Wu · Trevor Darrell -
2020 Poster: Video Prediction via Example Guidance »
Jingwei Xu · Harry (Huazhe) Xu · Bingbing Ni · Xiaokang Yang · Trevor Darrell -
2020 Poster: Frustratingly Simple Few-Shot Object Detection »
Xin Wang · Thomas Huang · Joseph E Gonzalez · Trevor Darrell · Fisher Yu -
2020 Poster: On Learning Sets of Symmetric Elements »
Haggai Maron · Or Litany · Gal Chechik · Ethan Fetaya -
2020 Poster: Deep Isometric Learning for Visual Recognition »
Haozhi Qi · Chong You · Xiaolong Wang · Yi Ma · Jitendra Malik -
2019 : Fisher Yu: "Motion and Prediction for Autonomous Driving" »
Fisher Yu · Trevor Darrell -
2019 Poster: Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem »
Alon Brutzkus · Amir Globerson -
2019 Oral: Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem »
Alon Brutzkus · Amir Globerson -
2018 Poster: CyCADA: Cycle-Consistent Adversarial Domain Adaptation »
Judy Hoffman · Eric Tzeng · Taesung Park · Jun-Yan Zhu · Philip Isola · Kate Saenko · Alexei Efros · Trevor Darrell -
2018 Oral: CyCADA: Cycle-Consistent Adversarial Domain Adaptation »
Judy Hoffman · Eric Tzeng · Taesung Park · Jun-Yan Zhu · Philip Isola · Kate Saenko · Alexei Efros · Trevor Darrell -
2018 Poster: Learning to Optimize Combinatorial Functions »
Nir Rosenfeld · Eric Balkanski · Amir Globerson · Yaron Singer -
2018 Poster: Predict and Constrain: Modeling Cardinality in Deep Structured Prediction »
Nataly Brukhim · Amir Globerson -
2018 Oral: Learning to Optimize Combinatorial Functions »
Nir Rosenfeld · Eric Balkanski · Amir Globerson · Yaron Singer -
2018 Oral: Predict and Constrain: Modeling Cardinality in Deep Structured Prediction »
Nataly Brukhim · Amir Globerson -
2017 Poster: Curiosity-driven Exploration by Self-supervised Prediction »
Deepak Pathak · Pulkit Agrawal · Alexei Efros · Trevor Darrell -
2017 Poster: Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs »
Alon Brutzkus · Amir Globerson -
2017 Poster: Learning Infinite Layer Networks without the Kernel Trick »
Roi Livni · Daniel Carmon · Amir Globerson -
2017 Talk: Curiosity-driven Exploration by Self-supervised Prediction »
Deepak Pathak · Pulkit Agrawal · Alexei Efros · Trevor Darrell -
2017 Talk: Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs »
Alon Brutzkus · Amir Globerson -
2017 Talk: Learning Infinite Layer Networks without the Kernel Trick »
Roi Livni · Daniel Carmon · Amir Globerson