ICML 2017 Events with Videos

Invited Talks

Causal Learning
Genomics, Big Data, and Machine Learning: Understanding the Human Wiring Diagram and Driving the Healthcare Revolution
Towards Reinforcement Learning in the Real World
How AI Designers will Dictate Our Civic Future

Talks

Multi-objective Bandits: Optimizing the Generalized Gini Index
Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis
Robust Adversarial Reinforcement Learning
Enumerating Distinct Decision Trees
The loss surface of deep and wide neural networks
Robust Probabilistic Modeling with Bayesian Data Reweighting
PixelCNN Models with Auxiliary Variables for Natural Image Modeling
Tight Bounds for Approximate Carathéodory and Beyond
Online Learning with Local Permutations and Delayed Feedback
SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling
Minimax Regret Bounds for Reinforcement Learning
Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks
Post-Inference Prior Swapping
Understanding Synthetic Gradients and Decoupled Neural Interfaces
Parallel Multiscale Autoregressive Density Estimation
Oracle Complexity of Second-Order Methods for Finite-Sum Problems
Model-Independent Online Learning for Influence Maximization
Latent Feature Lasso
Fairness in Reinforcement Learning
Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
Sharp Minima Can Generalize For Deep Nets
Evaluating Bayesian Models with Posterior Dispersion Indices
meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting
Global optimization of Lipschitz functions
Online Learning to Rank in Stochastic Click Models
Online Partial Least Square Optimization: Dropping Convexity for Better Efficiency and Scalability
Boosted Fitted Q-Iteration
Multi-Class Optimal Margin Distribution Machine
Geometry of Neural Network Loss Surfaces via Random Matrix Theory
Automatic Discovery of the Statistical Types of Variables in a Dataset
Learning Important Features Through Propagating Activation Differences
Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks
Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions
The Sample Complexity of Online One-Class Collaborative Filtering
Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
Kernelized Support Tensor Machines
The Shattered Gradients Problem: If resnets are the answer, then what is the question?
Bayesian Models of Data Streams with Hierarchical Power Priors
Evaluating the Variance of Likelihood-Ratio Gradient Estimators
Learning Texture Manifolds with the Periodic Spatial GAN
Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence
Efficient Regret Minimization in Non-Convex Games
Coresets for Vector Summarization with Applications to Network Graphs
Constrained Policy Optimization
Dual Supervised Learning
Recovery Guarantees for One-hidden-layer Neural Networks
Ordinal Graphical Models: A Tale of Two Approaches
Equivariance Through Parameter-Sharing
Generalization and Equilibrium in Generative Adversarial Nets (GANs)
GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization
Identify the Nash Equilibrium in Static Games with Random Payoffs
Partitioned Tensor Factorizations for Learning Mixed Membership Models
Reinforcement Learning with Deep Energy-Based Policies
Learning Infinite Layer Networks without the Kernel Trick
Failures of Gradient-Based Deep Learning
Scalable Bayesian Rule Lists
Warped Convolutions: Efficient Invariance to Spatial Transformations
McGan: Mean and Covariance Feature Matching GAN
Breaking Locality Accelerates Block Gauss-Seidel
Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU
On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations
Prediction and Control with Temporal Segment Models
Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees
Analytical Guarantees on Numerical Precision of Deep Neural Networks
Learning Determinantal Point Processes with Moments and Cycles
Graph-based Isometry Invariant Representation Learning
Conditional Image Synthesis with Auxiliary Classifier GANs
Stochastic DCA for the Large-sum of Non-convex Functions Problem and its Application to Group Variable Selection in Classification
On Kernelized Multi-armed Bandits
Nonnegative Matrix Factorization for Time Series Recovery From a Few Temporal Aggregates
An Alternative Softmax Operator for Reinforcement Learning
Logarithmic Time One-Against-Some
Follow the Moving Leader in Deep Learning
Deep Bayesian Active Learning with Image Data
Deriving Neural Architectures from Sequence and Graph Kernels
Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
Gradient Projection Iterative Sketch for Large-Scale Constrained Least-Squares
Second-Order Kernel Online Convex Optimization with Adaptive Sketching
Frame-based Data Factorizations
Fake News Mitigation via Point Process Based Intervention
Understanding Black-box Predictions via Influence Functions
Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
Bayesian Boolean Matrix Factorisation
Wasserstein Generative Adversarial Networks
Connected Subgraph Detection with Mirror Descent on SDPs
Dueling Bandits with Weak Regret
Nearly Optimal Robust Matrix Completion
Curiosity-driven Exploration by Self-supervised Prediction
Re-revisiting Learning on Hypergraphs: Confidence Interval and Subgradient Method
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Learning the Structure of Generative Models without Labeled Data
Deep Transfer Learning with Joint Adaptation Networks
Learning Hierarchical Features from Deep Generative Models
Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks
On Context-Dependent Clustering of Bandits
Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
Interactive Learning from Policy-Dependent Human Feedback
Self-Paced Co-training
Convexified Convolutional Neural Networks
Learning to Discover Sparse Graphical Models
Meta Networks
Bottleneck Conditional Density Estimation
Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms
Provably Optimal Algorithms for Generalized Linear Contextual Bandits
No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis
End-to-End Differentiable Adversarial Imitation Learning
Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data
On the Expressive Power of Deep Neural Networks
Local-to-Global Bayesian Network Structure Learning
SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo
Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization
Safety-Aware Algorithms for Adversarial Contextual Bandit
Coherence Pursuit: Fast, Simple, and Robust Subspace Recovery
Learning in POMDPs with Monte Carlo Tree Search
Iterative Machine Teaching
Depth-Width Tradeoffs in Approximating Natural Functions With Neural Networks
Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Zero-Inflated Exponential Family Embeddings
A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates
Adaptive Multiple-Arm Identification
Tensor Decomposition with Smoothness
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
Automated Curriculum Learning for Neural Networks
On Relaxing Determinism in Arithmetic Circuits
AdaNet: Adaptive Structural Learning of Artificial Neural Networks
Convex Phase Retrieval without Lifting via PhaseMax
Efficient Online Bandit Multiclass Learning with $O(\sqrt{T})$ Regret
Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use
Unifying task specification in reinforcement learning
Asymmetric Tri-training for Unsupervised Domain Adaptation
Efficient Nonmyopic Active Search
An Infinite Hidden Markov Model With Similarity-Biased Transitions
Learning to Learn without Gradient Descent by Gradient Descent
Attentive Recurrent Comparators
A Semismooth Newton Method for Fast, Generic Convex Programming
Active Learning for Accurate Estimation of Linear Models
Tensor Decomposition via Simultaneous Power Iteration
A Distributional Perspective on Reinforcement Learning
Source-Target Similarity Modelings for Multi-Source Transfer Gaussian Process Regression
Leveraging Union of Subspace Structure to Improve Constrained Clustering
Batched High-dimensional Bayesian Optimization via Structural Kernel Learning
Learned Optimizers that Scale and Generalize
State-Frequency Memory Recurrent Neural Networks
Approximate Newton Methods and Their Local Convergence
Adaptive Feature Selection: Computationally Efficient Online Sparse Linear Regression under RIP
A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery
Hierarchy Through Composition with Multitask LMDPs
Multi-task Learning with Labeled and Unlabeled Tasks
Active Heteroscedastic Regression
From Patches to Images: A Nonparametric Generative Model
Learning Gradient Descent: Better Generalization and Longer Horizons
Delta Networks for Optimized Recurrent Network Computation
Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values
Emulating the Expert: Inverse Optimization through Online Learning
An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation
A Laplacian Framework for Option Discovery in Reinforcement Learning
Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
Active Learning for Cost-Sensitive Classification
Fast Bayesian Intensity Estimation for the Permanental Process
Learning Algorithms for Active Learning
Recurrent Highway Networks
Practical Gauss-Newton Optimisation for Deep Learning
Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
Algorithms for $\ell_p$ Low-Rank Approximation
Modular Multitask Reinforcement Learning with Policy Sketches
Risk Bounds for Transferring Representations With and Without Fine-Tuning
Diameter-Based Active Learning
A Birth-Death Process for Feature Allocation
Tensor Balancing on Statistical Manifold
Test of Time Award
Leveraging Node Attributes for Incomplete Relational Data
How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices?
Data-Efficient Policy Evaluation Through Behavior Policy Search
Distributed and Provably Good Seedings for k-Means in Constant Rounds
Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging
Exact MAP Inference by Avoiding Fractional Vertices
Relative Fisher Information and Natural Gradient for Learning Large Modular Models
Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
Lazifying Conditional Gradient Algorithms
Bayesian inference on random simple graphs with power law degree distributions
Faster Principal Component Regression and Stable Matrix Chebyshev Approximation
Stochastic Variance Reduction Methods for Policy Evaluation
Consistent k-Clustering
Estimating the unseen from multiple populations
Exact Inference for Integer Latent-Variable Models
Learning Deep Architectures via Generalized Whitened Neural Networks
On orthogonality and learning RNNs with long term dependencies
Conditional Accelerated Lazy Stochastic Gradient Descent
Analogical Inference for Multi-relational Embeddings
Spectral Learning from a Single Trajectory under Finite-State Policies
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering
Meritocratic Fairness for Cross-Population Selection
Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms
Continual Learning Through Synaptic Intelligence
Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs
Capacity Releasing Diffusion for Speed and Locality
Consistent On-Line Off-Policy Evaluation
Hyperplane Clustering Via Dual Principal Component Pursuit
Neural networks and rational functions
Variational Inference for Sparse and Undirected Models
Adaptive Neural Networks for Efficient Inference
The Statistical Recurrent Unit
Approximate Steepest Coordinate Descent
Deep Generative Models for Relational Data with Side Information
Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition
Contextual Decision Processes with low Bellman rank are PAC-Learnable
Multilevel Clustering via Wasserstein Means
Tensor Belief Propagation
Combined Group and Exclusive Sparsity for Deep Neural Networks
Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent
On the Iteration Complexity of Support Recovery via Hard Thresholding Pursuit
A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency
Co-clustering through Optimal Transport
Uniform Deviation Bounds for k-Means Clustering
Faster Greedy MAP Inference for Determinantal Point Processes
Input Convex Neural Networks
Online and Linear-Time Attention by Enforcing Monotonic Alignments
Stochastic modified equations and adaptive stochastic gradient algorithms
Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening
Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization
Gradient Boosted Decision Trees for High Dimensional Sparse Output
Multiple Clustering Views from Multiple Uncertain Experts
Uniform Convergence Rates for Kernel Density Estimation
Zonotope hit-and-run for efficient sampling from projection DPPs
OptNet: Differentiable Optimization as a Layer in Neural Networks
Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
Dissipativity Theory for Nesterov's Accelerated Method
Just Sort It! A Simple and Effective Approach to Active Preference Learning
On The Projection Operator to A Three-view Cardinality Constrained Set
Globally Induced Forest: A Prepruning Compression Scheme
Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery
Density Level Set Estimation on Manifolds with DBSCAN
Parseval Networks: Improving Robustness to Adversarial Examples
Deep Voice: Real-time Neural Text-to-Speech
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
Maximum Selection and Ranking under Noisy Comparisons
Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity
Forest-type Regression with General Losses and Robust Forest
Clustering High Dimensional Dynamic Data Streams
Algorithmic Stability and Hypothesis Complexity
On the Sampling Problem for Kernel Quadrature
Regularising Non-linear Models Using Feature Side-information
DeepBach: a Steerable Model for Bach Chorales Generation
Forward and Reverse Gradient-Based Hyperparameter Optimization
Active Learning for Top-$K$ Rank Aggregation from Noisy Comparisons
Compressed Sensing using Generative Models
Confident Multiple Choice Learning
Consistency Analysis for Binary Classification Revisited
Measuring Sample Quality with Kernels
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Adaptive Sampling Probabilities for Non-Smooth Optimization
Learning to Align the Source Code to the Compiled Object Code
Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction
Regret Minimization in Behaviorally-Constrained Zero-Sum Games
Fast k-Nearest Neighbour Search via Prioritized DCI
Distributed Mean Estimation with Limited Communication
Variational Boosting: Iteratively Refining Posterior Approximations
A Closer Look at Memorization in Deep Networks
Learning to Generate Long-term Future via Hierarchical Prediction
Sub-sampled Cubic Regularization for Non-convex Optimization
RobustFill: Neural Program Learning under Noisy I/O
Efficient Distributed Learning with Sparsity
Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning
Deep Spectral Clustering Learning
Nonparanormal Information Estimation
Lost Relatives of the Gumbel Trick
Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
Sequence to Better Sequence: Continuous Revision of Combinatorial Structures
Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
Programming with a Differentiable Forth Interpreter
Innovation Pursuit: A New Approach to the Subspace Clustering Problem
Strongly-Typed Agents are Guaranteed to Interact Safely
Joint Dimensionality Reduction and Metric Learning: A Geometric Take
A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions
Learning to Aggregate Ordinal Labels by Maximizing Separating Width
Visualizing and Understanding Multilayer Perceptron Models: A Case Study in Speech Processing
Tensor-Train Recurrent Neural Networks for Video Classification
“Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions
Differentiable Programs with Neural Libraries
Selective Inference for Sparse High-Order Interaction Models
Coordinated Multi-Agent Imitation Learning
ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices
Gradient Coding: Avoiding Stragglers in Distributed Learning
Uncorrelation and Evenness: a New Diversity-Promoting Regularizer
Axiomatic Attribution for Deep Networks
Sequence Modeling via Segmentations
Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization
Developing Bug-Free Machine Learning Systems With Formal Mathematics
Dictionary Learning Based on Sparse Distribution Tomography
Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability
Learning Discrete Representations via Information Maximizing Self-Augmented Training
Learning Latent Space Models with Angular Constraints
On Calibration of Modern Neural Networks
Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data
How to Escape Saddle Points Efficiently
Preferential Bayesian Optimization
Being Robust (in High Dimensions) Can Be Practical
Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution
When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications
Differentially Private Ordinary Least Squares
Fractional Langevin Monte Carlo: Exploring Levy Driven Stochastic Differential Equations for MCMC
Device Placement Optimization with Reinforcement Learning
Dynamic Word Embeddings
Asynchronous Stochastic Gradient Descent with Delay Compensation
Max-value Entropy Search for Efficient Bayesian Optimization
Multilabel Classification with Group Testing and Codes
Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis
Priv’IT: Private and Sample Efficient Identity Testing
Stochastic Bouncy Particle Sampler
Deep Tensor Convolution on Multicores
Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling
Adaptive Consensus ADMM for Distributed Optimization
Bayesian Optimization with Tree-structured Dependencies
High-Dimensional Structured Quantile Regression
Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control
Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier
Differentially Private Submodular Maximization: Data Summarization in Disguise
Canopy --- Fast Sampling with Cover Trees
MEC: Memory-efficient Convolution for Deep Neural Network
Coupling Distributed and Symbolic Execution for Natural Language Queries
Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks
Multi-fidelity Bayesian Optimisation with Continuous Approximations
High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
Learning Stable Stochastic Nonlinear Dynamical Systems
iSurvive: An Interpretable, Event-time Prediction Model for mHealth
Differentially Private Learning of Graphical Models using CGMs
A Simulated Annealing Based Inexact Oracle for Wasserstein Loss Minimization
Beyond Filters: Compact Feature Map for Portable Deep Model
Image-to-Markup Generation with Coarse-to-Fine Attention
Projection-free Distributed Online Learning in Networks
Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space
Robust Structured Estimation with Single-Index Models
Local Bayesian Optimization of Motor Skills
Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture
Minimizing Trust Leaks for Robust Sybil Detection
Improving Gibbs Sampler Scan Quality with DoGS
Efficient softmax approximation for GPUs
Multichannel End-to-end Speech Recognition
Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference
Toward Efficient and Accurate Covariance Matrix Estimation on Compressed Data
Count-Based Exploration with Neural Density Models
Bidirectional learning for time-series models with hidden units
The Price of Differential Privacy For Online Learning
Magnetic Hamiltonian Monte Carlo
Dropout Inference in Bayesian Neural Networks with Alpha-divergences
Latent Intention Dialogue Models
Robust Guarantees of Stochastic Greedy Algorithms
Uncovering Causality from Multivariate Hawkes Integrated Cumulants
Robust Gaussian Graphical Model Estimation with Arbitrary Corruption
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
Learning Hawkes Processes from Short Doubly-Censored Event Sequences
Pain-Free Random Differential Privacy with Sensitivity Sampling
Probabilistic Path Hamiltonian Monte Carlo
Multiplicative Normalizing Flows for Variational Bayesian Neural Networks
Discovering Discrete Latent Topics with Neural Variational Inference
Guarantees for Greedy Maximization of Non-submodular Functions with Applications
Cost-Optimal Learning of Causal Graphs
Algebraic Variety Models for High-Rank Matrix Completion
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Coherent probabilistic forecasts for hierarchical time series
Differentially Private Clustering in High-Dimensional Euclidean Spaces
Stochastic Gradient Monomial Gamma Sampler
Variational Dropout Sparsifies Deep Neural Networks
Toward Controlled Generation of Text
Robust Submodular Maximization: A Non-Uniform Partitioning Approach
Identification and Model Testing in Linear Structural Equation Models using Auxiliary Variables
High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm
The Predictron: End-To-End Learning and Planning
Soft-DTW: a Differentiable Loss Function for Time-Series
Differentially Private Chi-squared Test by Unit Circle Mechanism
Stochastic Gradient MCMC Methods for Hidden Markov Models
Unimodal Probability Distributions for Deep Ordinal Classification
Learning Continuous Semantic Representations of Symbolic Expressions
Probabilistic Submodular Maximization in Sub-Linear Time
Estimating individual treatment effect: generalization bounds and algorithms
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning
Variational Policy for Guiding Point Processes
Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible
Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC
Adversarial Feature Matching for Text Generation
On Approximation Guarantees for Greedy Low Rank Optimization
Recursive Partitioning for Personalization using Observational Data
Optimal Densification for Fast and Accurate Minwise Hashing
FeUdal Networks for Hierarchical Reinforcement Learning
Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
An Adaptive Test of Independence with Analytic Kernel Embeddings
Distributed Batch Gaussian Process Optimization
Dance Dance Convolution
Language Modeling with Gated Convolutional Networks
Deletion-Robust Submodular Maximization: Data Summarization with "the Right to be Forgotten"
Identifying Best Interventions through Online Importance Sampling
Stochastic Generative Hashing
Deciding How to Decide: Dynamic Routing in Artificial Neural Networks
Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
Sliced Wasserstein Kernel for Persistence Diagrams
Scalable Multi-Class Gaussian Process Classification using Expectation Propagation
World of Bits: An Open-Domain Platform for Web-Based Agents
Convolutional Sequence to Sequence Learning
Analysis and Optimization of Graph Decompositions by Lifted Multicuts
Deep IV: A Flexible Approach for Counterfactual Prediction
ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning
Neural Episodic Control
End-to-End Learning for Structured Prediction Energy Networks
Adapting Kernel Representations Online Using Submodular Maximization
Random Feature Expansions for Deep Gaussian Processes
Real-Time Adaptive Image Compression
Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
Near-Optimal Design of Experiments via Regret Minimization
Counterfactual Data-Fusion for Online Reinforcement Learners
Large-Scale Evolution of Image Classifiers
Neural Optimizer Search using Reinforcement Learning
A Unified View of Multi-Label Performance Measures
Spherical Structured Feature Maps for Kernel Approximation
Asynchronous Distributed Variational Gaussian Processes for Regression
Neural Message Passing for Quantum Chemistry
Grammar Variational Autoencoder
Robust Budget Allocation via Continuous Submodular Functions
Scalable Generative Models for Multi-label Learning with Missing Labels
Nyström Method with Kernel K-means++ Samples as Landmarks
High Dimensional Bayesian Optimization with Elastic Gaussian Process
Accelerating Eulerian Fluid Simulation With Convolutional Networks
Rule-Enhanced Penalized Regression by Column Generation using Rectangular Maximum Agreement

Tutorials

Distributed Deep Learning with MXNet Gluon
Interpretable Machine Learning
Machine Learning for Autonomous Vehicles
Recent Advances in Stochastic Convex and Non-Convex Optimization
Deep Reinforcement Learning, Decision Making, and Control
Deep Learning for Health Care Applications: Challenges and Solutions
Real World Interactive Learning
Sequence-To-Sequence Modeling with Neural Networks
Robustness Meets Algorithms (and Vice-Versa)
