Workshop
ICML Workshop on Machine Learning for Autonomous Vehicles 2017
Li Erran Li · Raquel Urtasun · Andrew Gray · Silvio Savarese

Wed Aug 09 03:30 PM -- 12:30 AM (PDT) @ C4.10
Event URL: https://sites.google.com/site/ml4autovehicles2017/home

Although dramatic progress has been made in the field of autonomous driving, many major challenges remain on the path to full autonomy. For example, how can perception be made accurate and robust enough for safe autonomous driving? How can cars, pedestrians, and cyclists be tracked reliably? How can long-term driving strategies (known as driving policies) be learned, so that autonomous vehicles are equipped with adaptive human negotiation skills for merging, overtaking, giving way, and so on? How can near-zero fatality be achieved?

These complex challenges associated with autonomy in the physical world naturally suggest a machine learning approach. Deep learning and computer vision have found many real-world applications, such as face tagging. However, perception for autonomous driving has a unique set of requirements, including safety and explainability. Autonomous vehicles must choose actions, e.g., steering commands, which affect the subsequent inputs (driving scenes) they encounter. This setting is well suited to reinforcement learning for determining the best actions to take. Many autonomous driving tasks, such as perception and tracking, require large data sets of labeled examples to learn rich, high-performance visual representations. However, progress is hampered by the sheer expense of the human labeling needed. Naturally, we would like to employ unsupervised learning, transfer learning that leverages simulators, and techniques that can learn efficiently.
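To make this sequential decision setting concrete, the sketch below shows the basic interaction loop in which a policy's steering action determines the next driving scene it observes. It is a toy illustration only: the one-dimensional lane-keeping environment, its dynamics, and the stand-in proportional policy are all hypothetical, not any system discussed at the workshop.

import numpy as np

class DrivingEnv:
    """Toy 1-D lane-keeping environment: the state is the lateral offset from lane center."""
    def reset(self):
        self.offset = np.random.uniform(-1.0, 1.0)
        return self.offset

    def step(self, steering):
        # The chosen action directly changes the next observation the policy sees.
        self.offset += steering + np.random.normal(scale=0.05)  # action plus drift noise
        reward = -abs(self.offset)       # reward staying near the lane center
        done = abs(self.offset) > 2.0    # episode ends if the car leaves the road
        return self.offset, reward, done

def policy(offset):
    # A trivial proportional controller standing in for a learned driving policy.
    return -0.5 * offset

env = DrivingEnv()
state, episode_return = env.reset(), 0.0
for _ in range(100):
    state, reward, done = env.step(policy(state))
    episode_return += reward
    if done:
        break
print(f"episode return: {episode_return:.2f}")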
The goal of this workshop is to bring together researchers and practitioners in the field of autonomous driving to address core challenges with machine learning. These challenges include, but are not limited to:
accurate and efficient pedestrian detection, pedestrian intent detection,
machine learning for object tracking,
unsupervised representation learning for autonomous driving,
deep reinforcement learning for learning driving policies,
cross-modal and simulator to real-world transfer learning,
scene classification, real-time perception and prediction of traffic scenes,
uncertainty propagation in deep neural networks,
efficient inference with deep neural networks.

The workshop will include invited speakers, panels, and presentations of accepted papers and posters. We invite short, long, and position papers addressing the core challenges mentioned above. We encourage researchers and practitioners working on self-driving cars, transportation systems, and ride-sharing platforms to participate. Since this is a topic of broad and current interest, we expect at least 200 participants from leading universities, auto companies, and ride-sharing companies.

Wed 3:20 p.m. - 3:30 p.m.
Opening Remarks: Drew Gray and Li Erran Li (Uber ATG) (Opening)
Wed 3:30 p.m. - 4:00 p.m.
Carl Wellington, Uber ATG (Invited Talk)
Wed 4:00 p.m. - 4:30 p.m.
Jose M. Alvarez, Toyota Research Institute (Invited Talk)

Abstract: Convolutional neural networks have achieved impressive success in many computer vision tasks such as image classification, object detection/recognition, and semantic segmentation. While these networks have proven effective in all these applications, they come at a high memory and computational cost, making them infeasible for applications where power and computational resources are limited. In addition, the training process reduces productivity: it not only requires large compute servers but also takes a significant amount of time (several weeks), with the additional cost of engineering the architecture. In this talk, I first introduce our efficient architecture based on filter compositions and then present a novel approach to jointly learn the architecture and explicitly account for compression during the training process. Our results show that we can learn much more compact models and significantly reduce training and inference time.
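The joint architecture-and-compression training described above is specific to the talk and not reproduced here. As a generic, hypothetical illustration of the kind of filter-level compression it targets, the sketch below ranks convolutional filters by L1 norm and keeps only the strongest ones (magnitude-based pruning); the layer shape and keep ratio are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical conv layer: 64 filters, each spanning 32 input channels with 3x3 kernels.
filters = rng.normal(size=(64, 32, 3, 3))

keep_ratio = 0.5
scores = np.abs(filters).sum(axis=(1, 2, 3))   # L1 norm as a per-filter importance score
n_keep = int(len(scores) * keep_ratio)
kept = np.argsort(scores)[-n_keep:]            # indices of the strongest filters

pruned = filters[kept]
print(f"filters: {filters.shape[0]} -> {pruned.shape[0]} "
      f"({pruned.size / filters.size:.0%} of the original parameters)")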

Bio: Dr. Jose Alvarez is a senior research scientist at Toyota Research Institute. His main research interests are in developing robust and efficient deep learning algorithms for perception, with a focus on autonomous vehicles. Previously, he was a researcher at Data61/CSIRO (formerly NICTA), a postdoctoral researcher at the Courant Institute of Mathematical Sciences, New York University, and a visiting scholar at the University of Amsterdam and at Group Research Electronics at Volkswagen. Dr. Alvarez graduated in 2012 and was awarded the best Ph.D. thesis award. He serves as an associate editor for IEEE Transactions on Intelligent Transportation Systems.

Wed 4:30 p.m. - 5:00 p.m.
Manmohan Chandraker, UC San Diego and NEC Labs America (Invited Talk)

Abstract: Modern advanced driver assistance systems (ADAS) rely on a range of sensors including radar, ultrasound, LIDAR, and cameras. Active sensors have found applications in detecting traffic participants (TPs) such as cars or pedestrians and scene elements (SEs) such as roads. However, camera-based systems have the potential to achieve or augment these capabilities at a much lower cost, while enabling new ones such as determining TP and SE semantics and their interactions in complex traffic scenes.

In this talk, we present several technical advances for vision-based ADAS. A common theme is to overcome the challenges posed by lack of large-scale annotations in deep learning frameworks. We introduce approaches to correspondence estimation that are trained on purely synthetic data but adapt well to real data at test-time. We introduce object detectors that are light enough for ADAS, trained with knowledge distillation to retain accuracies of deeper architectures. Our semantic segmentation methods are trained on weak supervision that requires only a tenth of conventional annotation time. We propose methods for 3D reconstruction that use deep supervision to recover fine TP part locations while relying on purely synthetic 3D CAD models. We develop deep learning frameworks for multi-target tracking, as well as occlusion-reasoning in TP localization and SE layout estimation. Finally, we present a framework for TP behavior prediction in complex traffic scenes that accounts for TP-TP and TP-SE interactions. Our approach allows prediction of diverse multimodal outcomes and aims to account for long-term strategic behaviors in complex scenes.
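One of the techniques the abstract names, knowledge distillation, has a standard formulation (Hinton et al., 2015) that can be sketched briefly: a lightweight student is trained to match the teacher's temperature-softened class probabilities. The logits below are random placeholders, not outputs of the detectors described in the talk.

import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
teacher_logits = rng.normal(size=(8, 10))   # stand-in outputs of a deep teacher (10 classes)
student_logits = rng.normal(size=(8, 10))   # stand-in outputs of a lightweight student

T = 4.0                                     # temperature softens both distributions
p_teacher = softmax(teacher_logits, T)
p_student = softmax(student_logits, T)

# Cross-entropy of the student against the softened teacher targets, averaged over
# the batch; minimizing this (plus a hard-label term) trains the student.
distill_loss = -np.mean((p_teacher * np.log(p_student + 1e-12)).sum(axis=-1))
print(f"distillation loss: {distill_loss:.3f}")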

Bio: Manmohan Chandraker is an assistant professor at the CSE department of the University of California, San Diego and leads the computer vision research effort at NEC Labs America in Cupertino. He received a B.Tech. in Electrical Engineering at the Indian Institute of Technology, Bombay and a PhD in Computer Science at the University of California, San Diego. His personal research interests are 3D scene understanding and reconstruction, with applications to autonomous driving and human-computer interfaces. His works have received the Marr Prize Honorable Mention for Best Paper at ICCV 2007, the 2009 CSE Dissertation Award for Best Thesis at UCSD, a PAMI special issue on best papers of CVPR 2011 and the Best Paper Award at CVPR 2014.

Wed 5:00 p.m. - 5:30 p.m.
Coffee (Break)
Wed 5:30 p.m. - 6:00 p.m.

Jonathan Binas, Daniel Neil, Shih-Chii Liu, Tobi Delbruck, DDD17: End-To-End DAVIS Driving Dataset

Ransalu Senanayake and Fabio Ramos, Bayesian Hilbert Maps for Continuous Occupancy Mapping in Dynamic Environments

Wed 6:00 p.m. - 6:30 p.m.
Matthew Johnson-Roberson, University of Michigan (Invited Talk)

Self-driving cars now deliver vast amounts of sensor data from large, unstructured environments. In attempting to process and interpret this data, there are many unique challenges in bridging the gap between prerecorded data sets and the field. This talk will present recent work applying deep learning techniques to robotic perception. We focus on solutions to several pervasive problems that arise when deploying such techniques on fielded robotic systems. The themes of the talk revolve around alternatives to gathering and curating data sets for training. Are there ways of avoiding the labor-intensive human labeling required for supervised learning? These questions give rise to several lines of research based on self-supervision, adversarial learning, and simulation. We will show how these approaches, applied to self-driving car problems, have great potential to change the way we train, test, and validate machine learning-based systems.

Bio: Matthew Johnson-Roberson is an Assistant Professor of Engineering in the Department of Naval Architecture & Marine Engineering and the Department of Electrical Engineering and Computer Science at the University of Michigan. He received a PhD from the University of Sydney in 2010. He has held prior postdoctoral appointments with the Centre for Autonomous Systems (CAS) at KTH Royal Institute of Technology in Stockholm and the Australian Centre for Field Robotics at the University of Sydney. He is a recipient of the NSF CAREER award (2015). He has worked in robotic perception since the first DARPA Grand Challenge, and his group focuses on enabling robots to better see and understand their environment.

Wed 6:30 p.m. - 7:00 p.m.
Jianxiong Xiao, AutoX (Invited Talk)

Today, there are two major paradigms for vision-based autonomous driving systems: mediated perception approaches that parse an entire scene to make a driving decision, and behavior reflex approaches that directly map an input image to a driving action with a regressor. In this work, we propose a third paradigm: a direct perception-based approach to estimate the affordance for driving. We propose to map an input image to a small number of key perception indicators that directly relate to the affordance of a road/traffic state for driving. Our representation provides a set of compact yet complete descriptions of the scene that enables a simple controller to drive autonomously. Falling in between the two extremes of mediated perception and behavior reflex, we argue that our direct perception representation provides the right level of abstraction. We evaluate our approach in a virtual racing game as well as in real-world driving and show that our model can drive a car well in a very diverse set of virtual and realistic environments.
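As a rough illustration of the direct perception idea, the sketch below shows how a handful of regressed affordance indicators could feed a simple hand-written controller. The indicator names follow the spirit of the abstract, but the specific indicators, gains, and headway rule are illustrative assumptions, not the talk's actual interface.

import numpy as np

def controller(affordances):
    """Map a few affordance indicators to (steering, target_speed) commands."""
    angle = affordances["angle_to_lane"]        # rad; positive = heading right of lane
    offset = affordances["dist_to_center"]      # m; positive = right of lane center
    gap = affordances["dist_to_lead_car"]       # m to the preceding vehicle

    steering = -0.8 * angle - 0.3 * offset      # steer back toward the lane center
    target_speed = np.clip(gap - 10.0, 0.0, 25.0)  # keep roughly a 10 m headway
    return steering, target_speed

# Example: slightly right of center, lead car 40 m ahead.
steer, speed = controller({"angle_to_lane": 0.05,
                           "dist_to_center": 0.4,
                           "dist_to_lead_car": 40.0})
print(f"steering={steer:+.2f}, target_speed={speed:.1f} m/s")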

Bio: Jianxiong Xiao (a.k.a. Professor X) is the Founder and CEO of AutoX Inc., a high-tech startup working on A.I. software solutions for self-driving vehicles. AutoX's mission is to democratize autonomy and make autonomous driving universally accessible to everyone. Its innovative camera-first self-driving solution costs only a tiny fraction of traditional LiDAR-based approaches. Dr. Xiao has over ten years of research and engineering experience in computer vision, autonomous driving, and robotics. In particular, he is a pioneer in the fields of 3D deep learning, RGB-D recognition and mapping, big data, large-scale crowdsourcing, and deep learning for robotics. Jianxiong received a B.Eng. and an M.Phil. in Computer Science from the Hong Kong University of Science and Technology in 2009. He received his Ph.D. from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT) in 2013, and he was an Assistant Professor at Princeton University and the founding director of the Princeton Computer Vision and Robotics Labs from 2013 to 2016. His work has received the Best Student Paper Award at the European Conference on Computer Vision (ECCV) in 2012 and the Google Research Best Papers Award for 2012, and has appeared in the popular press. He was awarded the Google U.S./Canada Fellowship in Computer Vision in 2012, the MIT CSW Best Research Award in 2011, the NSF/Intel VEC Research Award in 2016, and two Google Faculty Awards, in 2014 and 2015 respectively. He co-led the MIT+Princeton joint team in the Amazon Picking Challenge in 2016, winning 3rd and 4th place worldwide. More information can be found at: http://www.jianxiongxiao.com.

Wed 9:30 p.m. - 10:00 p.m.

David Isele, Akansel Cosgun, To Go or Not to Go: A Case for Q-Learning at Unsignalized Intersections

Tomoki Nishi, Prashant Doshi, Danil Prokhorov, Freeway Merging in Congested Traffic based on Multipolicy Decision Making with Passive Actor Critic

Wed 10:00 p.m. - 10:30 p.m.
Coffee and Posters (Break)
Wed 10:30 p.m. - 11:00 p.m.
Min Sun, National Tsing Hua University (Invited Talk)

It is critical for a self-driving car in the wild to assess risk and adapt to changes on the road. In this talk, we will first go over our proposed accident anticipation method, which is tested on a large dataset of real-world accident videos. Then, we will present our latest ICCV paper on adapting a semantic segmentation model across four cities on three continents.

Bio: Min Sun is an assistant professor at National Tsing Hua University in Taiwan. Before that, he was a postdoctoral researcher at the University of Washington in Seattle, and he graduated from the University of Michigan with a Ph.D. in EE: Systems. He won the best paper award at 3DRR in 2007 and best paper awards at CVGIP in 2015 and 2016.

Wed 11:00 p.m. - 11:30 p.m.

Research, media, and corporate interest in autonomous road vehicles has exploded. Large industrial labs are hiring large teams to engineer solutions to the problem. As a machine learner, I conjecture that big data and cutting-edge end-to-end reinforcement learning techniques, rather than the vastly hand-engineered, rule-based approaches of current teams, will have the best chance of achieving autonomy over the next decade. A crucial but overlooked component of self-driving car systems is model uncertainty. I claim that model uncertainty is as important as model accuracy, and I encourage the community to pursue research in this direction. Good decisions require good predictions and well-calibrated uncertainty estimates around those predictions.
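One common recipe for the kind of calibrated uncertainty estimate argued for here is Monte Carlo dropout: keep dropout active at test time and treat the spread of repeated stochastic predictions as model uncertainty. The sketch below illustrates the idea on a tiny random regression network; the weights, input, and dropout rate are toy assumptions, not the speaker's method.

import numpy as np

rng = np.random.default_rng(2)
W1, W2 = rng.normal(size=(4, 16)), rng.normal(size=(16, 1))  # toy regression network

def forward(x, drop_p=0.5):
    h = np.maximum(x @ W1, 0.0)              # ReLU hidden layer
    mask = rng.random(h.shape) > drop_p      # dropout stays ON at test time
    h = h * mask / (1.0 - drop_p)
    return h @ W2

x = rng.normal(size=(1, 4))                  # a single test input
samples = np.concatenate([forward(x) for _ in range(100)])

mean, std = samples.mean(), samples.std()
print(f"prediction: {mean:.2f} +/- {std:.2f} (predictive std as the uncertainty estimate)")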

Wed 11:30 p.m. - 11:40 p.m.
  1. Kangwook Lee, Hoon Kim, Changho Suh, Crash To Not Crash: Playing Video Games To Predict Vehicle Collisions

  2. Edward Schwalb, Bernhard Bieder, Daniel Wiesenhütter, What Makes It Testable? Conceptual Model for Safety Quantification

  3. Ahmad El Sallab, Mahmoud Saeed, Omar Abdel Tawab, Mohammed Abdou, Meta learning Framework for Automated Driving

Wed 11:40 p.m. - 12:25 a.m.
Panel Discussion (Jose M. Alvarez, Manmohan Chandraker, Matthew Johnson-Roberson, Min Sun, Carl Wellington) (Panel)
Thu 12:25 a.m. - 12:30 a.m.
Closing Remarks: Li Erran Li and Drew Gray (Uber ATG) (Closing)

Author Information

Li Erran Li (Uber Technologies)

Li Erran Li received his Ph.D. in Computer Science from Cornell University, advised by Joseph Halpern. He is currently with Uber ATG and is an adjunct professor in the Computer Science Department of Columbia University. Before that, he worked as a researcher at Bell Labs. His research interests are AI and machine learning algorithms and systems. He is an IEEE Fellow and an ACM Distinguished Scientist.

Raquel Urtasun (University of Toronto)
Andrew Gray (Uber Technologies)
Silvio Savarese (Stanford University)
