The model for drug discovery and development is failing patients. It is expensive and high-risk, with long research and development cycles. This has a societal cost: some 9,000 diseases remain untreated, and the top ten best-selling drugs are effective in only 30-50% of patients. Tackling this challenge is complex. While many companies focus on a single component of the drug discovery process, BenevolentAI applies data- and machine-learning-driven methods across drug discovery, from the processing of scientific literature, to knowledge completion, to precision medicine, to chemistry optimization, each leveraging domain expert knowledge and state-of-the-art research.
In this talk, we will discuss the peculiarities of machine learning for the drug discovery domain. The field poses many unique challenges, including trade-offs between novelty and accuracy; questions of quality and reliability, both in extracted data and in the underlying ground truth; how best to learn from small volumes of data; and how best to combine human experts with ML methods. As we discuss the tools and methods that BenevolentAI has developed, we will explore these themes and walk through the approaches we have taken.
Finally, to give a real example of how we apply machine learning and AI in our day-to-day work, we will showcase the application of our technology, together with our internal clinical experts, to repurposing existing drugs as potential treatments for COVID-19. Baricitinib, the top drug we identified, is currently being investigated in a Phase 3 clinical trial.
Presenters: Daniel Neil, VP Artificial Intelligence; Sia Togia, AI Lead for Knowledge (NLP & Knowledge Graph); Olly Oechsle, Lead Application Engineer; and Aylin Cakiroglu, Senior AI Scientist, BenevolentAI.
This year has been an extraordinary one, emphasizing both the world's vulnerability to health risks and the rapid generation and availability of data. Within weeks of the outbreak of COVID-19, the genome sequence of the virus was available, full-document datasets covering the disease were released, and online citizen-science communities sprang up to help identify possible new approaches to treating the disease. What questions should machine learning researchers, new to the field, pursue to best further the identification of new therapies? What are some of the hard, unsolved problems in discovering new treatments? How has the rapid development of new data modalities affected the kinds of research needed?
This panel will explore these themes.
Presenters: Daniel Neil, VP Artificial Intelligence; Alix Lacoste, VP Data Science; JB Michel, Scientific Advisor and Founder at Patch Biosciences (https://www.patch.bio); Páidí Creed, Director AI Science; and Ana Leite, Lead Bioinformatics Data Scientist, BenevolentAI.
The advent of the SARS-CoV-2 pandemic has highlighted yet again the need (and opportunity) for the use of modern machine learning to accelerate the drug discovery process. Many in the machine learning community have been inspired by recent events. Yet all too often, advances in machine learning theory and practice have failed to translate into genuinely useful applications in drug discovery. There are many reasons for this, including poor alignment between the communities. This panel brings together distinguished ML researchers who have made genuine advances in AI for health, biology and drug discovery. The discussion will be structured, walking through the drug discovery process: from identifying targets and the problems posed by high dimensionality in genomics data, through the challenges of molecular discovery, to clinical trial design and analysis. The panel will discuss successes, promising research directions and outstanding challenges. The aim of the panel is to catalyze a more nuanced conversation between the drug discovery and machine learning communities and to highlight new opportunities for collaboration. Panel members will be available immediately after both events on RocketChat to answer questions.
Panel: Prof. Mihaela van der Schaar (Cambridge), Dr James Zou (Stanford University), Dr Andreas Bender (AstraZeneca). Chaired by: Dr Lindsay Edwards (AstraZeneca).
Neural Architecture Search (NAS) has become an important direction in automated machine learning. In this talk, we will introduce several of our latest works on differentiable NAS and NAS for detection, and, building on these, share our views on the future of NAS, including exploring sufficiently large search spaces, designing fast and stable search methods, and introducing hardware constraints to improve the practical value of search results. We will also introduce applications of these NAS techniques in smart devices and self-driving scenarios. The talk will be of interest to anyone who wants to apply NAS in research and development.
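As a toy illustration of the differentiable-NAS idea mentioned above (not Huawei's implementation; the operation set and weights are made up), the continuous relaxation that makes an architecture choice differentiable can be sketched as:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Toy candidate operations standing in for conv / pooling / skip connections.
ops = [
    lambda x: x,                  # identity (skip connection)
    lambda x: np.maximum(x, 0),   # ReLU-like nonlinearity
    lambda x: 0.5 * x,            # scaling operation
]

def mixed_op(x, alpha):
    """Continuous relaxation: blend candidate ops with softmax(alpha) weights,
    so the choice of operation itself becomes differentiable."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, ops))

alpha = np.zeros(3)   # architecture parameters, trainable by gradient descent
y = mixed_op(np.array([-1.0, 2.0]), alpha)   # equal weights -> average of op outputs
```

After search, the highest-weighted operation on each edge is kept, turning the relaxed architecture back into a discrete one.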
Presenters: Fabio Maria Carlucci, Hang Xu, Lingxi Xie, Huawei Technologies Co., Ltd.
There are thousands of data repositories on the Web, providing access to millions of datasets. National and regional governments, scientific publishers and consortia, commercial data providers, and others publish data for fields ranging from social science to life science to high-energy physics to climate science and more. Access to this data is critical to facilitating reproducibility of research results, enabling scientists to build on others’ work, and providing data journalists easier access to information and its provenance. In this talk, we will discuss recently launched Dataset Search by Google, which provides search capabilities over potentially all dataset repositories on the Web. We will talk about the open ecosystem for describing datasets that we hope to encourage.
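Dataset Search indexes datasets through open, structured metadata such as the schema.org Dataset vocabulary; a minimal, entirely hypothetical description of this kind might look like:

```json
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Example Daily Temperature Observations",
  "description": "A hypothetical dataset of daily temperature readings, 2000-2020.",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "creator": {
    "@type": "Organization",
    "name": "Example Research Institute"
  }
}
```

Embedding markup like this in a dataset's landing page lets any crawler, not just a single search provider, discover and describe the dataset.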
Presenter: Natasha Noy, Google Research.
TensorFlow Probability (TFP) is an open-source Python library for probabilistic reasoning and statistical analysis. TFP implements a suite of distributions and bijectors which are accurate, efficient, and differentiable. TFP also provides building blocks for modern inference algorithms like gradient-based Markov chain Monte Carlo and variational inference. We present a case-study of end-to-end Bayesian modeling - from writing down a generative model and reasoning about the prior predictive distribution to performing Hamiltonian Monte Carlo and diagnosing the quality of the fit. Along the way, we highlight TFP’s unique features. Specifically we cover the JointDistribution abstraction - a declarative representation of graphical models. We also showcase the performance benefits when fitting models using specialized hardware such as GPUs and TPUs.
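The end-to-end workflow described above (specify a generative model, check the prior predictive, then sample the posterior) can be sketched without TFP itself; the toy model, data, and random-walk Metropolis sampler below are illustrative stand-ins for TFP's JointDistribution and MCMC machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative model: mu ~ Normal(0, 2);  y ~ Normal(mu, 1).
def sample_prior_predictive(n):
    mu = rng.normal(0.0, 2.0, size=n)
    return rng.normal(mu, 1.0)

def log_joint(mu, y):
    log_prior = -0.5 * (mu / 2.0) ** 2       # Normal(0, 2) prior, up to a constant
    log_lik = -0.5 * np.sum((y - mu) ** 2)   # Normal(mu, 1) likelihood
    return log_prior + log_lik

prior_draws = sample_prior_predictive(1000)  # sanity-check the model before fitting

# Observed data and a minimal random-walk Metropolis sampler for the posterior.
y_obs = np.array([1.8, 2.2, 1.9])
mu, samples = 0.0, []
for _ in range(2000):
    proposal = mu + rng.normal(0.0, 0.5)
    if np.log(rng.uniform()) < log_joint(proposal, y_obs) - log_joint(mu, y_obs):
        mu = proposal
    samples.append(mu)

posterior_mean = np.mean(samples[500:])      # discard burn-in before summarizing
```

TFP replaces each of these hand-written pieces with accurate, differentiable, hardware-accelerated components, which is what makes gradient-based samplers like Hamiltonian Monte Carlo practical.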
Presenter: Colin Carroll, Google Research.
Apple is dedicated to advancing state-of-the-art machine learning technologies. Deep integration between our hardware, software, and tools provides a unique ML ecosystem in which it is easy for researchers and developers to infuse intelligence into our products. On-device machine learning capabilities (CoreML, CreateML), combined with hardware acceleration, make model training and inference fast and efficient. To further improve the ML lifecycle, Apple simplifies the process of training models at scale on cloud compute infrastructure, deploying these models to devices reliably, and evaluating their performance. Another fundamental principle of Apple's dedication to advancing the state-of-the-art in machine learning is to do so responsibly, in a manner which protects the privacy of our users. We achieve this by combining the opportunities afforded by on-device ML with our powerful server-side platform and tools, enabling federated learning with differential privacy at scale.
In this talk, we will provide more details of Apple's unique ML ecosystem, showcasing how easy it is to use, and how we do so in a way which protects the privacy of our users.
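As a rough, generic sketch of how federated learning can be combined with differential privacy (not Apple's implementation; the clipping bound and noise scale are arbitrary), a server-side aggregation step might look like:

```python
import numpy as np

rng = np.random.default_rng(42)

def clip(update, max_norm=1.0):
    """Bound each client's influence -- this fixes the DP sensitivity."""
    norm = np.linalg.norm(update)
    return update if norm == 0 else update * min(1.0, max_norm / norm)

def dp_federated_average(client_updates, max_norm=1.0, noise_mult=0.1):
    clipped = [clip(u, max_norm) for u in client_updates]
    avg = np.mean(clipped, axis=0)
    # Gaussian noise calibrated to the clipping bound yields differential privacy.
    noise = rng.normal(0.0, noise_mult * max_norm / len(clipped), size=avg.shape)
    return avg + noise

updates = [np.array([0.2, -0.1]), np.array([3.0, 4.0]), np.array([0.0, 0.5])]
new_update = dp_federated_average(updates)   # noisy average of clipped updates
```

The key point is that raw client updates never need to be stored: only the clipped, noised aggregate leaves the aggregation step.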
Presenter: Gaurav Kapoor, Apple.
AI, specifically deep learning, is revolutionizing industries, products, and core capabilities by delivering dramatically enhanced experiences. However, the deep neural networks of today use too much memory, compute, and energy. At Qualcomm Technologies, we’ve been actively researching and developing AI solutions with the goal to make artificial intelligence ubiquitous across devices, machines, vehicles, and things. To this end, Qualcomm Innovation Center (QuIC) has open sourced the AI Model Efficiency Toolkit (AIMET) on GitHub to collaborate with other leading AI researchers and to provide a simple library plugin for AI developers to utilize for state-of-the-art model efficiency performance. The open source project is meant to help migrate the ecosystem toward integer inference because we believe this is an effective way to increase performance per watt.
In this talk, we will discuss why model efficiency is important and the challenges associated with running models on low-precision hardware. We will then introduce AIMET and its features.
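To make the low-precision challenge concrete, here is a generic sketch of affine 8-bit post-training quantization (not AIMET's API; the tensor and rounding scheme are illustrative):

```python
import numpy as np

def quantize_8bit(w):
    """Affine (asymmetric) quantization of a float tensor to unsigned 8-bit."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0
    zero_point = np.round(-lo / scale)
    q = np.clip(np.round(w / scale + zero_point), 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

w = np.array([-1.0, -0.25, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zero_point = quantize_8bit(w)
w_hat = dequantize(q, scale, zero_point)   # within half a quantization step of w
```

Integer inference replaces float multiply-accumulates with 8-bit ones, which is where the performance-per-watt gains come from; the hard part, which toolkits like AIMET address, is keeping accuracy despite the rounding error introduced above.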
Presenter: Tijmen Blankevoort, Qualcomm.
In this talk, we will discuss how Baidu is applying the state-of-the-art data-driven deep learning technology to address the keyword matching problem, which is of great importance in sponsored search. Keyword matching deals with linking users' queries and advertisers' keywords under the restriction of different match types (exact match, phrase match, and smart match). Three challenges exist in this problem: the semantic gap between queries and keywords, the matching type restriction, and the scalability problem induced by large volumes of queries and keywords.
Our talk will consist of three parts: a) how to use data-driven DNN models to mitigate the semantic gap; b) how to use BERT to judge the matching type of a query-keyword pair; and c) how to use knowledge distillation, synonymous-keyword compression, and an online-offline mixed architecture to deploy the BERT model in a real industrial environment.
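As a toy illustration of part (a), bridging the semantic gap typically reduces to comparing learned embeddings of queries and keywords; the vectors below are made-up stand-ins for DNN/BERT encodings:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Made-up embeddings standing in for learned DNN/BERT encodings.
embed = {
    "cheap flights london": np.array([0.9, 0.1, 0.2]),
    "low cost airfare london": np.array([0.85, 0.15, 0.25]),
    "london hotels": np.array([0.1, 0.9, 0.3]),
}

query = "cheap flights london"
keywords = ["low cost airfare london", "london hotels"]
scores = {k: cosine(embed[query], embed[k]) for k in keywords}
best = max(scores, key=scores.get)   # semantically closest keyword wins
```

Note how the top-scoring keyword shares no tokens with the query beyond "london" -- exactly the gap that lexical matching misses and embedding-based matching closes.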
These data-driven deep learning approaches have been successfully applied in Baidu's sponsored search, yielding a significant increase in commercial revenue without degrading the user experience. We hope our methods will inform the further design of industrial sponsored search systems.
Presenter: Yijiang Liang, Baidu.
Reinforcement learning is a natural paradigm for automating the design of financial trading policies. Training trading policies on historical financial data is challenging because the data is limited to a few values per trading day (e.g. the daily close price of a stock), so the amount of training data is relatively small. Federated learning offers a potential solution by training on many parties' data, thereby increasing the overall amount of training data. Recent work by this team shows how to convert an RL strategy for training a portfolio optimization policy on a set of assets into a multi-task learning problem that benefits tremendously from federated learning. We implement the method on the federated reinforcement learning capability of the IBM Federated Learning (IFL) platform.
The session includes three sections: 1) a mini-tutorial on using the IBM Federated Learning (IFL) platform for any federated reinforcement learning problem, illustrated on the OpenAI Gym pendulum example; 2) a demo showing how IFL works on the financial portfolio optimization problem; and 3) an accompanying talk providing more details on the method and results.
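A generic sketch of the federated training loop (not the IFL API; the portfolio objective and party data here are synthetic) might look like:

```python
import numpy as np

rng = np.random.default_rng(1)

# Each party holds private (synthetic) daily returns for its own assets.
party_returns = [rng.normal(0.001, 0.01, size=(50, 3)) for _ in range(4)]

def local_update(theta, returns, lr=0.5, steps=20):
    """One party's local training: adjust softmax portfolio weights to
    increase mean return (a simple stand-in for the RL objective)."""
    theta = theta.copy()
    mu = returns.mean(axis=0)
    for _ in range(steps):
        w = np.exp(theta)
        w /= w.sum()
        theta += lr * w * (mu - mu @ w)   # gradient of mu @ softmax(theta)
    return theta

theta_global = np.zeros(3)
for _ in range(5):                        # federated rounds
    local_thetas = [local_update(theta_global, R) for R in party_returns]
    theta_global = np.mean(local_thetas, axis=0)   # FedAvg-style aggregation
```

Each party sees only its own returns; only the policy parameters travel to the aggregator, which is what lets the parties pool training signal without sharing data.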
Presenters: Peng Qian Yu, Hifaz Hassan, Laura Wynter, IBM.
Big data, deep learning, and large-scale computing are shaping AI and transforming our society. In particular, learning with deep neural networks has achieved great success across a wide variety of tasks. To increase speed to solution and reduce duplication of effort, automated model construction is of great interest for providing architecture-effective and domain-adaptive deep models. On the other hand, understanding model behaviors and building trust in model predictions are of particular importance in applications such as autonomous driving, medicine, and fintech. Research into automated and interpretable deep learning should include at least the following key components: (1) neural architecture search, (2) model construction in a changing environment, such as transfer learning, and (3) understanding and interpretation of deep learning models.
In this panel we focus on timely topics in the areas above. The panel will include a comprehensive survey of state-of-the-art algorithms and systems, a detailed description of the presenters' research experience, and a live demonstration of platforms built by the Baidu AutoDL team. Through this panel, attendees will gain an understanding of how to efficiently build automated deep learning models and enhance their trustworthiness. Our panel will also speed up the process of turning deep learning research results into industrial products by introducing Baidu AutoDL, a tool that facilitates automated and interpretable deep learning.
Presenters: Bolei Zhou, Yi Yang, Quanshi Zhang, Dejing Dou, Haoyi Xiong, Jiahui Yu, Humphrey Shi, Linchao Zhu, Xingjian Li
Talk(Haoyi Xiong): 6:30-7:00 (Los Angeles) / 21:30-22:00 (Beijing), 12/07/2020
Panel: 7:00-7:40 (Los Angeles) / 22:00-22:40 (Beijing), 12/07/2020
Live Zoom Room:
Password: 157673
Format: A talk consisting of 3 parts, each with a demo.
Automated AI/ML makes it easier for data scientists to develop pipelines by searching over hyperparameters, algorithms, data preparation steps, and even pipeline topologies.
1) Lale: Type-Driven Auto-ML with Scikit-Learn (includes a 10-minute demo). Lale (https://github.com/ibm/lale) is an open-source library of high-level Python interfaces that simplifies and unifies the syntax of automated ML to be consistent with manual ML, with other automated tools, and with error checks. It also supports advanced features such as topology search and higher-order operators.
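Lale's combinators compose scikit-learn operators; the dependency-free sketch below only illustrates the `>>` pipeline-composition style with hypothetical operators, not Lale's actual classes:

```python
class Op:
    """Minimal stand-in for a pipeline operator with Lale-style combinators."""
    def __init__(self, name, fn):
        self.name, self.fn = name, fn

    def __rshift__(self, other):
        # a >> b : feed a's output into b, like Lale's pipeline combinator
        return Op(f"{self.name} >> {other.name}",
                  lambda x: other.fn(self.fn(x)))

    def __call__(self, x):
        return self.fn(x)

Scale = Op("Scale", lambda xs: [x / max(xs) for x in xs])
Square = Op("Square", lambda xs: [x * x for x in xs])

pipe = Scale >> Square          # pipelines read left to right
result = pipe([1.0, 2.0, 4.0])  # scaled to [0.25, 0.5, 1.0], then squared
```

Because pipelines are ordinary expressions, the same syntax can describe both a hand-written pipeline and a search space handed to an optimizer.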
2) AutoMLPipeline: Symbolic ML Pipeline Composition and Parallel Evaluation (~15-minute demo). AutoMLPipeline (https://github.com/IBM/AutoMLPipeline.jl) is a Julia toolkit that makes it trivial to create complex ML pipeline structures using simple expressions and evaluate them in parallel. It leverages the built-in macro programming features of Julia to symbolically process and manipulate pipeline expressions.
3) AutoAI with Stakeholder Constraints (10-minute demo). Common applications of AI involve multiple stakeholders with requirements beyond a single objective of predictive performance. This toolkit automatically generates pipelines with favorable predictive performance while satisfying stakeholder constraints related to deployment (inference time and pipeline size) and fairness. It also provides an API to specify custom constraints.
Presenters: Martin Hirzel, Paulito Palmes, Parikshit Ram, Dakuo Wang, IBM.
In this demo, we show the workflow of structural-to-modular NAS (SM-NAS) and the performance of models designed by SM-NAS on object detection tasks.
Presenter: Hang Xu, Huawei Technologies Co., Ltd.
PaddlePaddle is a deep learning platform developed at Baidu. We will demonstrate the core technology behind PaddlePaddle and a range of applications built on top of it. The main features to be demonstrated include its easy-to-use APIs, its powerful inference engine and tool chain for fast inference and deployment, its one-stop learning environment, and EZDL Pro, an integrated user interface for industrial deep learning applications. We will also demonstrate our semantic representation model ERNIE and simultaneous translation developed using PaddlePaddle.
Presenter: Daxiang Dong, Baidu.
1) RXNMapper: Unsupervised attention-guided atom-mapping. Explore the attentions of a Transformer model that has learned, with no supervision or human guidance, to solve the NP-hard problem of determining how atoms rearrange in chemical reactions.
2) AI Explainability 360 (AIX360) AIX360 is an open-source Python toolkit for explaining data and machine learning models in diverse and state-of-the-art ways to address the needs of different stakeholders. This demo provides a glimpse of its capabilities, algorithms, and industry domains.
3) Command Line AI (CLAI) Explore and interact with the future of the Command Line with CLAI - Command Line AI. CLAI is an open-source project from IBM Research that brings the latest in AI and ML technologies to the command line as “skills”, and seeks to make the command line user’s daily life more efficient and productive.
4) COVID-19 Molecule Explorer. The traditional drug discovery pipeline is time- and cost-intensive. To deal with new viral outbreaks and epidemics, such as COVID-19, we need more rapid drug discovery processes. We have developed robust generative frameworks that overcome the inherent challenges in creating novel peptides, proteins, drug candidates, and materials. We are working with several partners to validate the AI-generated molecules using in-silico simulations and wet-lab experiments, and will add those validation results to the exploration tool as they arrive.
Presenters: Ben Hoover, Hendrik Strobelt, Teodoro Laino, Vijay Arya, Amit Dhurandhar, Tathagata Chakraborti, Kartik Talamadupula, Mayank Agarwal, Payel Das, Enara Vijil, IBM.