Please watch the pre-recorded talk on SlidesLive.
Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of CNNs to graph-structured data, and neural message-passing approaches. These advances in graph neural networks and related techniques have led to new state-of-the-art results in numerous domains: chemical synthesis, 3D vision, recommender systems, question answering, continuous control, self-driving, and social network analysis. Building on the successes of three related workshops from last year (at ICML, ICLR and NeurIPS), the primary goal of this workshop is to facilitate community building and to support the expansion of graph representation learning into more interdisciplinary projects with the natural and social sciences. With hundreds of new researchers beginning projects in this area, we hope to bring them together to consolidate this fast-growing area into a healthy and vibrant subfield. In particular, we aim to strongly promote novel and exciting applications of graph representation learning across the sciences, as reflected in our choice of invited speakers.
Over the years, ML models have steadily grown in complexity, gaining predictivity often at the expense of interpretability. An active research area called explainable AI (or XAI) has emerged with the goal of producing models that are both predictive and understandable. XAI has achieved important successes, such as robust heatmap-based explanations of DNN classifiers. From an application perspective, there is now a need to engage broadly with new scenarios, such as explaining unsupervised and reinforcement learning, and to produce explanations that are optimally structured for human users. In particular, our planned workshop will cover the following topics:
- Explaining beyond DNN classifiers: random forests, unsupervised learning, reinforcement learning
- Explaining beyond heatmaps: structured explanations, Q/A and dialog systems, human-in-the-loop
- Explaining beyond explaining: Improving ML models and algorithms, verifying ML, getting insights
XAI has attracted exponentially growing interest in the research community, and awareness of the need to explain ML models has grown in similar proportions in industry and in the sciences. With the sizable XAI research community that has formed, there is now a key opportunity to achieve this push towards successful applications. Our hope is that our proposed XXAI workshop can accelerate this process, foster a more …
Even though supervised learning using large annotated corpora is still the dominant approach in machine learning, self-supervised learning is gaining considerable popularity. Applying self-supervised learning to audio and speech sequences, however, remains particularly challenging. Speech signals are not only high-dimensional, long, and variable-length sequences, but also have a complex hierarchical structure (e.g., phonemes, syllables, words) that is difficult to infer without supervision. Moreover, speech varies considerably with speaker identity, accent, recording conditions, and noise, all of which greatly increase the complexity of the problem.
We believe that self-supervised learning will play a crucial role in the future of artificial intelligence, and we think that great research effort is needed to efficiently take advantage of it in audio and speech applications. With our initiative, we wish to foster more progress in the field, and we hope to encourage a discussion amongst experts and practitioners from both academia and industry that might bring different points of view on this topic. Furthermore, we plan to extend the debate to multiple disciplines, encouraging discussions on how insights from other fields (e.g., computer vision and robotics) can be applied to speech, and how findings on speech can be used on other …
This workshop will bring together artificial intelligence (AI) researchers who study the interpretability of AI systems, develop interpretable machine learning algorithms, and develop methodology to interpret black-box machine learning models (e.g., post-hoc interpretations). This is a very exciting time to study interpretable machine learning, as the advances in large-scale optimization and Bayesian inference that have enabled the rise of black-box machine learning are now also starting to be exploited to develop principled approaches to large-scale interpretable machine learning. Interpretability also forms a key bridge between machine learning and other AI research directions such as machine reasoning and planning. Participants in the workshop will exchange ideas on these and allied topics.
Description: the “Law and Machine Learning” workshop aims to contribute to research on the social and legal risks of deploying AI systems that make decisions based on machine learning. Today, algorithms are infiltrating and governing every aspect of our lives as individuals and as a society. Specifically, Algorithmic Decision Systems (ADS) are involved in many social decisions. For instance, such systems are increasingly used to support decision-making in fields such as child welfare, criminal justice, school assignment, teacher evaluation, fire risk assessment, homelessness prioritization, healthcare, Medicaid benefits, immigration decisions, risk assessment, and predictive policing, among others. Law enforcement agencies are increasingly using facial recognition and algorithmic predictive policing systems to forecast criminal activity and allocate police resources. However, these predictive systems challenge fundamental rights and the guarantees of criminal procedure. For several years, numerous studies have revealed the social risks of ML, especially the risks of opacity, bias, and manipulation of information.
While the deployment of such systems is only beginning, more interdisciplinary research is needed. Our purpose is to contribute to this new field which brings together legal researchers, mathematicians and computer scientists, by bridging the gap between the performance of algorithmic …
Analysis of large amounts of data offers new opportunities to understand many processes better. Yet, data accumulation often implies relaxing acquisition procedures or compounding diverse sources, leading to many observations with missing features. From questionnaires to collaborative filtering, from electronic health records to single-cell analysis, missingness is at play everywhere and is the norm rather than the exception. Even “clean” data sets are often barely “cleaned” versions of incomplete data sets, with all the unfortunate biases this cleaning process may have created.
Despite this ubiquity, tackling missing values is often overlooked. Handling missing values poses many challenges, and there is a vast literature in the statistical community, with many implementations available. Yet many open issues remain, and new methods and new points of view are needed: to handle missing values in supervised-learning settings and in deep learning architectures, to adapt available methods to high-dimensional observed data with different types of missing values, and to deal with feature mismatch and distribution mismatch. For Judea Pearl, missing data is one of the eight pillars of causal wisdom, and he brought graphical-model reasoning to bear on some missing-not-at-random values.
To the best of our knowledge, this is the …
A person’s health is determined by a variety of factors beyond those captured by electronic health records or the genome. Many healthcare organizations recognize the importance of the social determinants of health (SDH) such as socioeconomic status, employment, food security, education, and community cohesion. Capturing such comprehensive portraits of patient data is necessary to transform a healthcare system and improve population health while simultaneously delivering personalized healthcare provisions. Machine learning (ML) is well-positioned to transform system-level healthcare through the design of intelligent algorithms that incorporate SDH into clinical and policy interventions, such as population health programs and clinical decision support systems. Innovations in health-tech through wearable devices and mobile health, among others, provide rich sources of data, including those characterizing SDH. The guiding metric of success should be health outcomes: the improvement of health and care at both the individual and population levels. This workshop will identify the needs of system-level healthcare transformation that ML may satisfy. We will bring together ML researchers, health policy practitioners, clinical organization experts, and individuals from all areas of clinic-, hospital-, and community-based healthcare.
Self-driving cars and advanced safety features present one of today’s greatest challenges and opportunities for Artificial Intelligence (AI). Despite billions of dollars of investments and encouraging progress under certain operational constraints, there are no driverless cars on public roads today without human safety drivers. Autonomous Driving research spans a wide spectrum, from modular architectures -- composed of hardcoded or independently learned sub-systems -- to end-to-end deep networks with a single model from sensors to controls. In any system, Machine Learning is a key component. However, there are formidable learning challenges due to safety constraints, the need for large-scale manual labeling, and the complex, high-dimensional structure of driving data, whether inputs (from cameras, HD maps, inertial measurement units, wheel encoders, LiDAR, radar, etc.) or predictions (e.g., world state representations, behavior models, trajectory forecasts, plans, controls). The goal of this workshop is to explore the frontier of learning approaches for safe, robust, and efficient Autonomous Driving (AD) at scale. The workshop will span both theoretical frameworks and practical issues, especially in the area of deep learning.
Website: https://sites.google.com/view/aiad2020
Until recently, Machine Learning has been mostly applied in industry by consulting academics, data scientists within larger companies, and a number of dedicated Machine Learning research labs within a few of the world’s most innovative tech companies. Over the last few years we have seen the dramatic rise of companies dedicated to providing Machine Learning software-as-a-service tools, with the aim of democratizing access to the benefits of Machine Learning. All these efforts have revealed major hurdles to ensuring the continual delivery of good performance from deployed Machine Learning systems. These hurdles range from challenges in MLOps, to fundamental problems with deploying certain algorithms, to solving the legal issues surrounding the ethics involved in letting algorithms make decisions for your business.
This workshop will invite papers related to the challenges in deploying and monitoring ML systems. It will encourage submissions on: subjects related to MLOps for deployed ML systems (such as testing ML systems, debugging ML systems, monitoring ML systems, debugging ML models, deploying ML at scale); subjects related to the ethics around deploying ML systems (such as ensuring fairness, trust and transparency of ML systems, providing privacy and security on ML systems); useful tools and programming languages for deploying ML …
The ICML Workshop on Retrospectives in Machine Learning will build upon the success of the 2019 NeurIPS Retrospectives workshop to further encourage the publication of retrospectives. A retrospective of a paper or a set of papers, by its author, takes the form of an informal paper. It provides a venue for authors to reflect on their previous publications, to talk about how their thoughts have changed following publication, to identify shortcomings in their analysis or results, and to discuss resulting extensions. The overarching goal of MLRetrospectives is to improve the science, openness, and accessibility of the machine learning field, by widening what is publishable and helping to identify opportunities for improvement. Retrospectives also give researchers and practitioners unable to attend conferences access to the author’s updated understanding of their work, which would otherwise only be accessible to their immediate circle. The machine learning community would benefit from retrospectives on much of the research which shapes our field, and this workshop will present an opportunity for a few retrospectives to be presented.
The workshop will showcase recent research in the field of Computational Biology. Computational biology is an interdisciplinary field that develops and applies analytical methods, mathematical and statistical modeling, and simulation to analyze and interpret vast collections of biological data, such as genetic sequences, cellular features, protein structures, and imaging datasets, in order to predict clinical response, discover new biology, or aid drug discovery. The availability of high-dimensional data at multiple spatial and temporal resolutions has made machine learning and deep learning methods increasingly critical for computational analysis and interpretation of the data. Conversely, biological data has also exposed unique challenges and problems that call for the development of new machine learning methods.
This workshop aims to bring together researchers working at the unique intersection of Machine Learning and Biology, including, but not limited to, areas such as computational genomics, neuroscience, pathology, radiology, evolutionary biology, population genomics, phenomics, ecology, cancer biology, causality, and representation learning and disentanglement, to present recent advances and open questions to the ML community.
The workshop is a sequel to the WCB workshops we organized over the last four years: ICML 2019, Long Beach; Joint ICML and IJCAI 2018, Stockholm; ICML 2017, …
Extreme classification is a rapidly growing research area focusing on multi-class and multi-label problems, where the label space is extremely large. It brings many diverse approaches under the same umbrella including natural language processing (NLP), computer vision, information retrieval, recommendation systems, computational advertising, and embedding methods. Extreme classifiers have been deployed in many real-world applications in the industry ranging from language modelling to document tagging in NLP, face recognition to learning universal feature representations in computer vision, etc. Moreover, extreme classification finds application in recommendation, tagging, and ranking systems since these problems can be reformulated as multi-label learning tasks where each item to be ranked or recommended is treated as a separate label. Such reformulations have led to significant gains over traditional collaborative filtering and content-based recommendation techniques. 
The proposed workshop aims to offer a timely collection of information to benefit the researchers and practitioners working in the aforementioned research fields of core supervised learning, theory of extreme classification, as well as application domains. These issues are well-covered by the Topics of Interest in ICML 2020. The workshop aims to bring together researchers interested in these areas to encourage discussion, facilitate interaction and collaboration and improve upon the state-of-the-art in …
The designers of a machine learning (ML) system typically have far more power over the system than the individuals who are ultimately impacted by the system and its decisions. Recommender platforms shape the users’ preferences; the individuals classified by a model often do not have means to contest a decision; and the data required by supervised ML systems necessitates that the privacy and labour of many yield to the design choices of a few.
The fields of algorithmic fairness and human-centered ML often focus on centralized solutions, lending increasing power to system designers and operators, and less to users and affected populations. In response to the growing social-science critique of the power imbalance present in the research, design, and deployment of ML systems, we wish to consider a new set of technical formulations for the ML community on the subject of more democratic, cooperative, and participatory ML systems.
Our workshop aims to explore methods that, by design, enable and encourage the perspectives of those impacted by an ML system to shape the system and its decisions. By involving affected populations in shaping the goals of the overall system, we hope to move beyond just tools for enabling human participation and …
Machine learning systems are commonly applied to isolated tasks or narrow domains (e.g. control over similar robotic bodies). It is further assumed that the learning system has simultaneous access to all the data points of the tasks at hand. In contrast, Continual Learning (CL) studies the problem of learning from a stream of data from changing domains, each connected to a different learning task. The objective of CL is to quickly adapt to new situations or tasks by exploiting previously acquired knowledge, while protecting previous learning from being erased. Meeting the objectives of CL will provide an opportunity for systems to quickly learn new skills given knowledge accumulated in the past and continually extend their capabilities to changing environments, a hallmark of natural intelligence.
Objects, and the interactions between them, are the foundations on which our understanding of the world is built. Similarly, abstractions centered around the perception and representation of objects play a key role in building human-like AI, supporting high-level cognitive abilities like causal reasoning, object-centric exploration, and problem solving. Indeed, prior works have shown how relational reasoning and control problems can greatly benefit from having object descriptions. Yet, many of the current methods in machine learning focus on a less structured approach in which objects are only implicitly represented, posing a challenge for interpretability and the reuse of knowledge across tasks. Motivated by the above observations, there has been a recent effort to reinterpret various learning problems from the perspective of object-oriented representations.
In this workshop, we will showcase a variety of approaches in object-oriented learning, with three particular emphases. Our first interest is in learning object representations in an unsupervised manner. Although computer vision has made an enormous amount of progress in learning about objects via supervised methods, we believe that learning about objects with little to no supervision is preferable: it minimizes labeling costs, and also supports adaptive representations that can be changed depending on the particular situation and …
In many settings such as education, healthcare, drug design, robotics, transportation, and achieving better-than-human performance in strategic games, it is important to make decisions sequentially. This poses two interconnected algorithmic and statistical challenges: effectively exploring to learn information about the underlying dynamics and effectively planning using this information. Reinforcement Learning (RL) is the main paradigm for tackling both of these challenges simultaneously, which is essential in the aforementioned applications. Over the last few years, reinforcement learning has seen enormous progress both in solidifying our understanding of its theoretical underpinnings and in applying these methods in practice.
This workshop aims to highlight recent theoretical contributions, with an emphasis on addressing significant challenges on the road ahead. Such theoretical understanding is important in order to design algorithms that have robust and compelling performance in real-world applications. As part of the ICML 2020 conference, this workshop will be held virtually. It will feature keynote talks from six reinforcement learning experts tackling different significant facets of RL. It will also offer the opportunity for contributed material (see below the call for papers and our outstanding program committee). The authors of each accepted paper will prerecord a 10-minute presentation and will also appear in a poster session. …
ML has shown great promise in modeling and predicting complex phenomena in many scientific disciplines, such as predicting cardiovascular risk factors from retinal images, understanding how electrons behave at the atomic level [3], identifying patterns of weather and climate phenomena, etc. Further, models are able to learn directly (and better) from raw data as opposed to human-selected features. The ability to interpret the model and find significant predictors could provide new scientific insights. Traditionally, the scientific discovery process has been based on careful observations of natural phenomena, followed by systematic human analysis (of hypothesis generation and experimental validation). ML interpretability has the potential to bring a radically different yet principled approach. While general interpretability relies on ‘human parsing’ (common sense), scientific domains have semi-structured and highly structured bases for interpretation. Thus, despite differences in data modalities and domains, be it brain sciences, the behavioral sciences, or material sciences, there is a need for a common set of tools that address a similar flavor of problem, one of interpretability or fitting models to a known structure. This workshop aims to bring together members from the ML and physical sciences communities to introduce exciting problems to the broader community, and stimulate the production of new approaches towards solving open scientific …
There has been growing interest in rectifying deep neural network instabilities. Challenges arise when models receive samples drawn from outside the training distribution. For example, a neural network tasked with classifying handwritten digits may assign high-confidence predictions to cat images. Anomalies are frequently encountered when deploying ML models in the real world. Well-calibrated predictive uncertainty estimates are indispensable for many machine learning applications, such as self-driving vehicles and medical diagnosis systems. Generalization to unseen and worst-case inputs is also essential for robustness to distributional shift. For ML models to predict reliably in open environments, we must deepen technical understanding in the emerging areas of: (1) learning algorithms that can detect changes in data distribution (e.g., out-of-distribution examples); (2) mechanisms to estimate and calibrate confidence produced by neural networks in typical and unforeseen scenarios; (3) methods to improve out-of-distribution generalization, including generalization to temporal, geographical, hardware, adversarial, and image-quality changes; (4) benchmark datasets and protocols for evaluating model performance under distribution shift; and (5) key applications of robust and uncertainty-aware deep learning (e.g., computer vision, robotics, self-driving vehicles, medical imaging) as well as broader machine learning tasks.
This workshop will bring together researchers and practitioners from the machine …
Optimization lies at the heart of many exciting developments in machine learning, statistics and signal processing. As models become more complex and datasets get larger, finding efficient, reliable and provable methods is one of the primary goals in these fields.
In the last few decades, much effort has been devoted to the development of first-order methods. These methods enjoy a low per-iteration cost and have optimal complexity, are easy to implement, and have proven to be effective for most machine learning applications. First-order methods, however, have significant limitations: (1) they require fine hyper-parameter tuning, (2) they do not incorporate curvature information, and thus are sensitive to ill-conditioning, and (3) they are often unable to fully exploit the power of distributed computing architectures.
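To make limitation (2) concrete, here is a minimal sketch on a toy problem. The quadratic objective, its condition number, the starting point, and the step size are all illustrative assumptions, not taken from the workshop text. It contrasts plain gradient descent with a single Newton step, which rescales the gradient by the inverse Hessian.

```python
import numpy as np

# Hypothetical ill-conditioned quadratic f(x) = 0.5 * x^T A x, minimized at x = 0.
A = np.diag([1.0, 1000.0])  # condition number 1000 (illustrative choice)
x0 = np.array([1.0, 1.0])


def gradient(x):
    """Gradient of the quadratic: A x."""
    return A @ x


def gradient_descent(x, steps):
    """First-order method with step size 1/L, where L is the largest eigenvalue of A."""
    step_size = 1.0 / 1000.0
    for _ in range(steps):
        x = x - step_size * gradient(x)
    return x


def newton_step(x):
    """Second-order method: the Hessian of this quadratic is A itself."""
    return x - np.linalg.solve(A, gradient(x))


x_gd = gradient_descent(x0, steps=100)
x_newton = newton_step(x0)

print("distance to optimum, gradient descent (100 steps):", np.linalg.norm(x_gd))
print("distance to optimum, Newton (1 step):", np.linalg.norm(x_newton))
```

On this toy problem the poorly scaled direction shrinks by only a factor of 0.999 per gradient step, while the Newton step lands on the optimum up to rounding. The sketch ignores the cost of forming and solving with the Hessian, which is the main practical obstacle at scale.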
Higher-order methods, such as Newton, quasi-Newton and adaptive gradient descent methods, are extensively used in many scientific and engineering domains. At least in theory, these methods possess several nice features: they exploit local curvature information to mitigate the effects of ill-conditioning, they avoid or diminish the need for hyper-parameter tuning, and they have enough concurrency to take advantage of distributed computing environments. Researchers have even developed stochastic versions of higher-order methods that feature speed and scalability by incorporating …
One of the most significant and challenging open problems in Artificial Intelligence (AI) is the problem of Lifelong Learning. Lifelong Machine Learning considers systems that can continually learn many tasks (from one or more domains) over a lifetime. A lifelong learning system efficiently and effectively:
1. retains the knowledge it has learned from different tasks;
2. selectively transfers knowledge (from previously learned tasks) to facilitate the learning of new tasks;
3. ensures the effective and efficient interaction between (1) and (2).
Lifelong Learning introduces several fundamental challenges in training models that generally do not arise in a single-task batch learning setting. These include problems like catastrophic forgetting and capacity saturation. This workshop aims to explore solutions for these problems in both supervised learning and reinforcement learning settings.
Normalizing flows are explicit likelihood models that use invertible neural networks to construct flexible probability distributions of high-dimensional data. Compared to other generative models, the main advantage of normalizing flows is that they can offer exact and efficient likelihood computation and data generation. Since their recent introduction, flow-based models have attracted significant and growing interest in the machine learning community. As a result, powerful flow-based models have been developed, with successes in density estimation, variational inference, and generative modeling of images, audio and video.
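For reference, the exact likelihood mentioned above follows from the change-of-variables formula. The notation here is assumed for illustration and does not come from the workshop text: $f$ is an invertible, differentiable map from data $x$ to a latent variable $z = f(x)$, and $p_Z$ is a simple base density such as a standard Gaussian.

```latex
\log p_X(x) \;=\; \log p_Z\bigl(f(x)\bigr) \;+\; \log \left| \det \frac{\partial f(x)}{\partial x} \right|
```

Data generation runs the map in reverse: draw $z \sim p_Z$ and set $x = f^{-1}(z)$. Both directions are efficient only when $f$ is designed so that its inverse and its Jacobian determinant are cheap to compute.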
This workshop is the second iteration of the ICML 2019 workshop on Invertible Neural Networks and Normalizing Flows. While the main goal of last year’s workshop was to make flow-based models more accessible to the general machine learning community, as the field is moving forward, we believe there is now a need to consolidate recent progress and connect ideas from related fields. In light of the interpretation of latent variable models and autoregressive models as flows, this year we expand the scope of the workshop and consider likelihood-based models more broadly, including flow-based models, latent variable models and autoregressive models. We encourage researchers to use these models in conjunction to exploit their benefits at …
One proposed route towards the goal of designing machines that can extrapolate experience across environments and tasks is the use of inductive biases. Providing algorithms with inductive biases from the start might help them learn invariances, e.g., a causal graph structure, which in turn will allow the agent to generalize across environments and tasks.
While some inductive biases are already available and correspond to common knowledge, one key requirement for learning inductive biases from data seems to be the ability to perform and learn from interventions. This assumption is partially motivated by the accepted hypothesis in psychology about the need to experiment in order to discover causal relationships. This corresponds to a reinforcement learning environment, where the agent can discover causal factors by performing interventions and observing their effects.
We believe that one reason that has hampered progress on building intelligent agents is the limited availability of good inductive biases. Learning inductive biases from data is difficult since this corresponds to an interactive learning setting, which, compared to classical regression or classification frameworks, is far less understood; for example, even formal definitions of generalization in RL have not been developed. While Reinforcement Learning has already achieved impressive results, the sample complexity required to achieve consistently good …
Models of negative dependence and submodularity are increasingly important in machine learning. Whether selecting training data, finding an optimal experimental design, exploring in reinforcement learning and Bayesian optimization, or designing recommender systems, selecting high-quality yet diverse items has become a core challenge. This workshop aims to bring together researchers who, using theoretical or applied techniques, leverage negative dependence and submodularity in their work. Expanding upon last year's workshop, we will highlight recent developments in the rich mathematical theory of negative dependence, cover novel critical applications, and discuss the most promising directions for future research.
Training machine learning models in a centralized fashion often faces significant challenges due to regulatory and privacy concerns in real-world use cases. These include distributed training data, computational resources to create and maintain a central data repository, and regulatory guidelines (GDPR, HIPAA) that restrict sharing sensitive data. Federated learning (FL) is a new paradigm in machine learning that can mitigate these challenges by training a global model using distributed data, without the need for data sharing. The extensive application of machine learning to analyze and draw insight from real-world, distributed, and sensitive data necessitates familiarization with and adoption of this relevant and timely topic among the scientific community. Despite the advantages of federated learning, and its successful application in certain industry-based cases, this field is still in its infancy due to new challenges that are imposed by limited visibility of the training data, potential lack of trust among participants training a single model, potential privacy inferences, and in some cases, limited or unreliable connectivity. The goal of this workshop is to bring together researchers and practitioners interested in FL. This day-long event will facilitate interaction among students, scholars, and industry professionals from around the world to understand the topic, identify technical challenges, …
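The idea of training a global model on distributed data without sharing it can be sketched with federated averaging, one common FL algorithm that the workshop text does not itself specify. Everything below is an illustrative assumption: the synthetic linear-regression data, the three hypothetical clients, and the learning rate and round counts. Each client runs a few local gradient steps on its own data, and the server only ever sees and averages model weights.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])  # hypothetical ground-truth weights


def make_client(n):
    """Synthetic private dataset held by one client (never sent to the server)."""
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    return X, y


clients = [make_client(n) for n in (50, 80, 120)]


def local_update(w, X, y, lr=0.1, epochs=5):
    """A few steps of gradient descent on the client's local mean squared error."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def federated_round(w_global):
    """Server step: collect locally updated weights, average them by dataset size."""
    updates = [local_update(w_global, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    return np.average(updates, axis=0, weights=sizes)


w = np.zeros(2)
for _ in range(20):
    w = federated_round(w)

print("global weights after 20 rounds:", w)
```

Keeping raw data on the clients does not by itself rule out the privacy inferences mentioned above, since shared weights can still leak information; the sketch also assumes reliable connectivity and honest participants.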
Machine learning is increasingly being applied to problems in the healthcare domain. However, there is a risk that the development of machine learning models for improving health remains focused on areas and diseases that are more economically incentivised and resourced. As research and technological entities aim to develop machine-learning-assisted consumer healthcare devices, or bespoke algorithms for their populations within a certain geographical region, the challenges of healthcare in resource-constrained settings may be overlooked. The predominant research focus of machine learning for healthcare in the “economically advantaged” world means that there is a skew in our current knowledge of how machine learning can be used to improve health on a more global scale, for everyone. This workshop aims to draw attention to the ways that machine learning can be used for problems in global health, and to promote research on problems outside high-resource environments.
Deep learning has achieved great success in a variety of tasks, such as recognizing objects in images, predicting the sentiment of sentences, and image/speech synthesis, by training on large amounts of data. However, most existing successes are in perceptual tasks, which are also known as System I intelligence. In the real world, many complicated tasks, such as autonomous driving, public policy decision making, and multi-hop question answering, require understanding the relationship between high-level variables in the data to perform logical reasoning, which is known as System II intelligence. Integrating System I and System II intelligence lies at the core of artificial intelligence and machine learning.
Graph is an important structure for System II intelligence, with the universal representation ability to capture the relationship between different variables, and support interpretability, causality, and transferability / inductive generalization. Traditional logic and symbolic reasoning over graphs has relied on methods and tools which are very different from deep learning models, such Prolog language, SMT solvers, constrained optimization and discrete algorithms. Is such a methodology separation between System I and System II intelligence necessary? How to build a flexible, effective and efficient bridge to smoothly connect these two systems, and create higher order artificial intelligence? …
Machine learning has achieved considerable successes in recent years, but this success often relies on human experts, who construct appropriate features, design learning architectures, set their hyperparameters, and develop new learning algorithms. Driven by the demand for off-the-shelf machine learning methods from an ever-growing community, the research area of AutoML targets the progressive automation of machine learning, aiming to make effective methods available to everyone. Hence, the workshop targets a broad audience ranging from core machine learning researchers in different fields of ML connected to AutoML, such as neural architecture search, hyperparameter optimization, meta-learning, and learning to learn, to domain experts aiming to apply machine learning to new types of problems.
The schedule is given in CEST (i.e., the time zone of Vienna).
Although data is considered to be the “new oil”, it is very hard to price. Raw use of data has been invaluable in several sectors such as advertising, healthcare, etc., but often in violation of people’s privacy. Labeled data has also been extremely valuable for the training of machine learning models (for example, in the driverless car industry). This is also indicated by the growth of annotation companies such as Figure8 and Scale.AI, especially in the image space. Yet, it is not clear what the right pricing is for data workers who annotate the data or for the individuals who contribute their personal data while using digital services. In the latter case, it is very unclear how the value of the services offered compares to the private data exchanged. While the first data marketplaces have appeared, such as AWS, Narattive.io, nitrogen.ai, etc., they suffer from a lack of good pricing models. They also fail to maintain the right of the data owners to define how their own data will be used. There have been numerous suggestions for sharing data while maintaining privacy, such as training generative models that preserve original data statistics.
This workshop aims to bring together researchers from academia and industry to discuss major challenges, outline recent advances, and highlight future directions pertaining to novel and existing large-scale real-world experiment design and active learning problems. We aim to highlight new and emerging research opportunities for the machine learning community that arise from the evolving need to make experiment design and active learning procedures theoretically and practically relevant for realistic applications.
The intended audience and participants include everyone whose research interests, activities, and applications involve experiment design, active learning, bandit/Bayesian optimization, efficient exploration, and parameter search methods and techniques. We expect the workshop to attract substantial interest from researchers working in both academia and industry. The research of our invited speakers spans both theory and applications, and represents a diverse range of domains where experiment design and active learning are of fundamental importance (including robotics & control, biology, physical sciences, crowdsourcing, citizen science, etc.).
The schedule is given with respect to the UTC (i.e., Universal Time) time zone.
In situations where a task can be cleanly formulated and data is plentiful, modern machine learning (ML) techniques have achieved impressive (and often super-human) results. Here, “plentiful data” can mean labels from humans, access to a simulator and well designed reward function, or other forms of interaction and supervision.
On the other hand, in situations where tasks cannot be cleanly formulated and plentifully supervised, ML has not yet shown the same progress. We still seem far from flexible agents that can learn without human engineers carefully designing or collating their supervision. This is problematic in the many real-world settings where machine learning is or will be applied, where these agents have to interact with human users and may be used in ways that go beyond any initial clean training data used during system development. A key open question is how to make machine learning effective and robust enough to operate in real-world open domains.
Artificial open worlds are ideal laboratories for studying how to extend the successes of ML to build such agents.
Open worlds are characterized by:
- Large (or perhaps infinite) collections of tasks, often not specified until test time; or lack of well defined tasks …
Language is one of the most impressive human accomplishments and is believed to be core to our ability to learn, teach, reason and interact with others. Yet, current state-of-the-art reinforcement learning agents are unable to use or understand human language at all. The ability to integrate and learn from language, in addition to rewards and demonstrations, has the potential to improve the generalization, scope and sample efficiency of agents. Furthermore, many real-world tasks, including personal assistants and general household robots, require agents to process language by design, whether to enable interaction with humans, or simply to use existing interfaces. The aim of our workshop is to advance this emerging field of research by bringing together researchers from several diverse communities to discuss recent developments in relevant research areas such as instruction following and embodied language learning, and to identify the most important challenges and promising research avenues.
Artificial Intelligence (AI), and Machine Learning systems in particular, often depend on the information provided by multiple agents. The best-known example is federated learning, but others include sensor data, crowdsourced human computation, and human trajectory inputs for inverse reinforcement learning. However, eliciting accurate data can be costly, either due to the effort invested in obtaining it, as in crowdsourcing, or due to the need to maintain automated systems, as in distributed sensor systems. Low-quality data not only degrades the performance of AI systems, but may also pose safety concerns. Thus, it becomes important to verify the correctness of data, to be smart in how data is aggregated, and to provide incentives that promote effort and high-quality data. During the recent workshop on Federated Learning at NeurIPS 2019, 4 of 6 panel members mentioned incentives as the most important open issue.
This workshop is proposed to understand this aspect of Machine Learning, both theoretically and empirically. We particularly encourage contributions on the following aspects:
- How to collect high quality and credible data for machine learning systems from self-interested and possibly malicious agents, considering the game-theoretical properties of the problem?
- How to evaluate the quality of data supplied by self-interested …
The ever-increasing size and accessibility of vast media libraries has created a greater demand than ever for AI-based systems that are capable of organizing, recommending, and understanding such complex data.
While this topic has received only limited attention within the core machine learning community, it has been an area of intense focus within the applied communities such as the Recommender Systems (RecSys), Music Information Retrieval (MIR), and Computer Vision communities. At the same time, these domains have surfaced nebulous problem spaces and rich datasets that are of tremendous potential value to machine learning and the AI communities at large.
This year's Machine Learning for Media Discovery (ML4MD) aims to build upon the five previous Machine Learning for Music Discovery editions at ICML, broadening the topic area from music discovery to media discovery. The added topic diversity is aimed at fostering a broader conversation with the machine learning community and at offering cross-pollination across the various media domains.
One of the largest areas of focus in the media discovery space is on the side of content understanding. The recommender systems community has made great advances in terms of collaborative feedback recommenders, but these approaches suffer strongly from the cold-start problem.  As …
Recent years have witnessed the rising need for learning agents that can interact with humans. Such agents usually involve applications in computer vision, natural language processing, human computer interaction, and robotics. Creating and running such agents calls for interdisciplinary research in artificial intelligence, machine learning, and software engineering design, which we abstract as Human in the Loop Learning (HILL). HILL is a modern machine learning paradigm of significant practical and theoretical interest. For HILL, models and humans engage in a two-way dialog to facilitate more accurate and interpretable learning. The workshop aims to bring together researchers and practitioners working on the broad areas of human in the loop learning, ranging from interactive/active learning algorithm designs for real-world decision making systems (e.g., autonomous driving vehicles, robotic systems, etc.) to models with strong explainability and interactive system designs (e.g., data visualization, annotation systems, etc.). In particular, we aim to elicit new connections among these diverse fields, identifying theory, tools and design principles tailored to practical machine learning workflows. The target audience for the workshop includes people who are interested in using machines to solve problems by having a human be an integral part of the learning process. In this year’s …