# Downloads

Number of events: 470

- A Birth-Death Process for Feature Allocation
- Accelerating Eulerian Fluid Simulation With Convolutional Networks
- A Closer Look at Memorization in Deep Networks
- Active Heteroscedastic Regression
- Active Learning for Accurate Estimation of Linear Models
- Active Learning for Cost-Sensitive Classification
- Active Learning for Top-$K$ Rank Aggregation from Noisy Comparisons
- AdaNet: Adaptive Structural Learning of Artificial Neural Networks
- Adapting Kernel Representations Online Using Submodular Maximization
- Adaptive Consensus ADMM for Distributed Optimization
- Adaptive Feature Selection: Computationally Efficient Online Sparse Linear Regression under RIP
- Adaptive Multiple-Arm Identification
- Adaptive Neural Networks for Efficient Inference
- Adaptive Sampling Probabilities for Non-Smooth Optimization
- A Distributional Perspective on Reinforcement Learning
- A Divergence Bound for Hybrids of MCMC and Variational Inference and an Application to Langevin Dynamics and SGVI
- Adversarial Feature Matching for Text Generation
- Adversarial Variational Bayes: Unifying Variational Autoencoders and Generative Adversarial Networks
- A Laplacian Framework for Option Discovery in Reinforcement Learning
- Algebraic Variety Models for High-Rank Matrix Completion
- Algorithmic Stability and Hypothesis Complexity
- Algorithms for $\ell_p$ Low-Rank Approximation
- An Adaptive Test of Independence with Analytic Kernel Embeddings
- Analogical Inference for Multi-relational Embeddings
- An Alternative Softmax Operator for Reinforcement Learning
- Analysis and Optimization of Graph Decompositions by Lifted Multicuts
- Analytical Guarantees on Numerical Precision of Deep Neural Networks
- An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
- An Efficient, Sparsity-Preserving, Online Algorithm for Low-Rank Approximation
- An Infinite Hidden Markov Model With Similarity-Biased Transitions
- Approximate Newton Methods and Their Local Convergence
- Approximate Steepest Coordinate Descent
- A Richer Theory of Convex Constrained Optimization with Reduced Projections and Improved Rates
- A Semismooth Newton Method for Fast, Generic Convex Programming
- A Simple Multi-Class Boosting Framework with Theoretical Guarantees and Empirical Proficiency
- A Simulated Annealing Based Inexact Oracle for Wasserstein Loss Minimization
- Asymmetric Tri-training for Unsupervised Domain Adaptation
- Asynchronous Distributed Variational Gaussian Processes for Regression
- Asynchronous Distributed Variational Gaussian Processes for Regression
- Asynchronous Stochastic Gradient Descent with Delay Compensation
- Attentive Recurrent Comparators
- A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions
- A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery
- A Unified View of Multi-Label Performance Measures
- Automated Curriculum Learning for Neural Networks
- Automatic Discovery of the Statistical Types of Variables in a Dataset
- Automatic Machine Learning (AutoML 2017)
- Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning
- Axiomatic Attribution for Deep Networks
- Batched High-dimensional Bayesian Optimization via Structural Kernel Learning
- Bayesian Boolean Matrix Factorisation
- Bayesian inference on random simple graphs with power law degree distributions
- Bayesian Models of Data Streams with Hierarchical Power Priors
- Bayesian Optimization with Tree-structured Dependencies
- Being Robust (in High Dimensions) Can Be Practical
- Beyond Filters: Compact Feature Map for Portable Deep Model
- Bidirectional learning for time-series models with hidden units
- Boosted Fitted Q-Iteration
- Bottleneck Conditional Density Estimation
- Breaking Locality Accelerates Block Gauss-Seidel
- Canopy --- Fast Sampling with Cover Trees
- Capacity Releasing Diffusion for Speed and Locality
- Causal Learning
- ChoiceRank: Identifying Preferences from Node Traffic in Networks
- Clustering by Sum of Norms: Stochastic Incremental Algorithm, Convergence and Cluster Recovery
- Clustering High Dimensional Dynamic Data Streams
- Co-clustering through Optimal Transport
- Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
- Coherence Pursuit: Fast, Simple, and Robust Subspace Recovery
- Coherent probabilistic forecasts for hierarchical time series
- Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible
- Combined Group and Exclusive Sparsity for Deep Neural Networks
- Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning
- Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis
- Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data
- Compressed Sensing using Generative Models
- Conditional Accelerated Lazy Stochastic Gradient Descent
- Conditional Image Synthesis with Auxiliary Classifier GANs
- Confident Multiple Choice Learning
- Connected Subgraph Detection with Mirror Descent on SDPs
- Consistency Analysis for Binary Classification Revisited
- Consistent k-Clustering
- Consistent On-Line Off-Policy Evaluation
- Constrained Policy Optimization
- Contextual Decision Processes with low Bellman rank are PAC-Learnable
- Continual Learning Through Synaptic Intelligence
- Convergence Analysis of Proximal Gradient with Momentum for Nonconvex Optimization
- Convexified Convolutional Neural Networks
- Convex Phase Retrieval without Lifting via PhaseMax
- “Convex Until Proven Guilty”: Dimension-Free Acceleration of Gradient Descent on Non-Convex Functions
- Convolutional Sequence to Sequence Learning
- Coordinated Multi-Agent Imitation Learning
- Coresets for Vector Summarization with Applications to Network Graphs
- Cost-Optimal Learning of Causal Graphs
- Count-Based Exploration with Neural Density Models
- Counterfactual Data-Fusion for Online Reinforcement Learners
- Coupling Distributed and Symbolic Execution for Natural Language Queries
- Curiosity-driven Exploration by Self-supervised Prediction
- Dance Dance Convolution
- DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
- Data-Efficient Policy Evaluation Through Behavior Policy Search
- Deciding How to Decide: Dynamic Routing in Artificial Neural Networks
- Decoupled Neural Interfaces using Synthetic Gradients
- DeepBach: a Steerable Model for Bach Chorales Generation
- Deep Bayesian Active Learning with Image Data
- Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability
- Deep Generative Models for Relational Data with Side Information
- Deep IV: A Flexible Approach for Counterfactual Prediction
- Deep Latent Dirichlet Allocation with Topic-Layer-Adaptive Stochastic Gradient Riemannian MCMC
- Deep Learning for Health Care Applications: Challenges and Solutions
- Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction
- Deep Reinforcement Learning, Decision Making, and Control
- Deep Spectral Clustering Learning
- Deep Structured Prediction
- Deep Tensor Convolution on Multicores
- Deep Transfer Learning with Joint Adaptation Networks
- Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs
- Deep Voice: Real-time Neural Text-to-Speech
- Deletion-Robust Submodular Maximization: Data Summarization with "the Right to be Forgotten"
- Delta Networks for Optimized Recurrent Network Computation
- Density Level Set Estimation on Manifolds with DBSCAN
- Depth-Width Tradeoffs in Approximating Natural Functions With Neural Networks
- Deriving Neural Architectures from Sequence and Graph Kernels
- Developing Bug-Free Machine Learning Systems With Formal Mathematics
- Device Placement Optimization with Reinforcement Learning
- Diameter-Based Active Learning
- Dictionary Learning Based on Sparse Distribution Tomography
- Differentiable Programs with Neural Libraries
- Differentially Private Chi-squared Test by Unit Circle Mechanism
- Differentially Private Clustering in High-Dimensional Euclidean Spaces
- Differentially Private Learning of Graphical Models using CGMs
- Differentially Private Ordinary Least Squares
- Differentially Private Submodular Maximization: Data Summarization in Disguise
- Discovering Discrete Latent Topics with Neural Variational Inference
- Dissipativity Theory for Nesterov's Accelerated Method
- Distributed and Provably Good Seedings for k-Means in Constant Rounds
- Distributed Batch Gaussian Process Optimization
- Distributed Deep Learning with MxNet Gluon
- Distributed Mean Estimation with Limited Communication
- Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition
- Doubly Greedy Primal-Dual Coordinate Descent for Sparse Empirical Risk Minimization
- Dropout Inference in Bayesian Neural Networks with Alpha-divergences
- Dual Iterative Hard Thresholding: From Non-convex Sparse Minimization to Non-smooth Concave Maximization
- Dual Supervised Learning
- Dueling Bandits with Weak Regret
- Dynamic Word Embeddings
- Efficient Distributed Learning with Sparsity
- Efficient Nonmyopic Active Search
- Efficient Online Bandit Multiclass Learning with $O(\sqrt{T})$ Regret
- Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
- Efficient Regret Minimization in Non-Convex Games
- Efficient softmax approximation for GPUs
- Emulating the Expert: Inverse Optimization through Online Learning
- End-to-End Differentiable Adversarial Imitation Learning
- End-to-End Learning for Structured Prediction Energy Networks
- Enumerating Distinct Decision Trees
- Equivariance Through Parameter-Sharing
- Estimating individual treatment effect: generalization bounds and algorithms
- Estimating the unseen from multiple populations
- Evaluating Bayesian Models with Posterior Dispersion Indices
- Evaluating the Variance of Likelihood-Ratio Gradient Estimators
- Exact Inference for Integer Latent-Variable Models
- Exact MAP Inference by Avoiding Fractional Vertices
- Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms
- Failures of Gradient-Based Deep Learning
- Fairness in Reinforcement Learning
- Fake News Mitigation via Point Process Based Intervention
- Fast Bayesian Intensity Estimation for the Permanental Process
- Faster Greedy MAP Inference for Determinantal Point Processes
- Faster Principal Component Regression and Stable Matrix Chebyshev Approximation
- Fast k-Nearest Neighbour Search via Prioritized DCI
- FeUdal Networks for Hierarchical Reinforcement Learning
- Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU
- Follow the Moving Leader in Deep Learning
- Forest-type Regression with General Losses and Robust Forest
- Forward and Reverse Gradient-Based Hyperparameter Optimization
- Fractional Langevin Monte Carlo: Exploring Levy Driven Stochastic Differential Equations for MCMC
- Frame-based Data Factorizations
- From Patches to Images: A Nonparametric Generative Model
- Generalization and Equilibrium in Generative Adversarial Nets (GANs)
- Genomics, Big Data, and Machine Learning: Understanding the Human Wiring Diagram and Driving the Healthcare Revolution
- Geometry of Neural Network Loss Surfaces via Random Matrix Theory
- Globally Induced Forest: A Prepruning Compression Scheme
- Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
- Global optimization of Lipschitz functions
- Gradient Boosted Decision Trees for High Dimensional Sparse Output
- Gradient Coding: Avoiding Stragglers in Distributed Learning
- Gradient Projection Iterative Sketch for Large-Scale Constrained Least-Squares
- Gram-CTC: Automatic Unit Selection and Target Decomposition for Sequence Labelling
- Grammar Variational Autoencoder
- Graph-based Isometry Invariant Representation Learning
- GSOS: Gauss-Seidel Operator Splitting Algorithm for Multi-Term Nonsmooth Convex Composite Optimization
- Guarantees for Greedy Maximization of Non-submodular Functions with Applications
- Hierarchy Through Composition with Multitask LMDPs
- High Dimensional Bayesian Optimization with Elastic Gaussian Process
- High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
- High-Dimensional Structured Quantile Regression
- High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm
- How AI Designers will Dictate Our Civic Future
- How Close Are the Eigenvectors of the Sample and Actual Covariance Matrices?
- How to Escape Saddle Points Efficiently
- Human in the Loop Machine Learning
- Hyperplane Clustering Via Dual Principal Component Pursuit
- ICML Workshop on Machine Learning for Autonomous Vehicles 2017
- Identification and Model Testing in Linear Structural Equation Models using Auxiliary Variables
- Identifying Best Interventions through Online Importance Sampling
- Identify the Nash Equilibrium in Static Games with Random Payoffs
- Image-to-Markup Generation with Coarse-to-Fine Attention
- Implicit Generative Models
- Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
- Improving Gibbs Sampler Scan Quality with DoGS
- Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution
- Improving Viterbi is Hard: Better Runtimes Imply Faster Clique Algorithms
- Innovation Pursuit: A New Approach to the Subspace Clustering Problem
- Input Convex Neural Networks
- Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
- Interactive Learning from Policy-Dependent Human Feedback
- Interactive Machine Learning and Semantic Information Retrieval
- Interpretable Machine Learning
- iSurvive: An Interpretable, Event-time Prediction Model for mHealth
- Iterative Machine Teaching
- Joint Dimensionality Reduction and Metric Learning: A Geometric Take
- Just Sort It! A Simple and Effective Approach to Active Preference Learning
- Kernelized Support Tensor Machines
- Know-Evolve: Deep Temporal Reasoning for Dynamic Knowledge Graphs
- Language Modeling with Gated Convolutional Networks
- Large-Scale Evolution of Image Classifiers
- Latent Feature Lasso
- Latent Intention Dialogue Models
- Latent LSTM Allocation: Joint clustering and non-linear dynamic modeling of sequence data
- Lazifying Conditional Gradient Algorithms
- Learned Optimizers that Scale and Generalize
- Learning Algorithms for Active Learning
- Learning Continuous Semantic Representations of Symbolic Expressions
- Learning Deep Architectures via Generalized Whitened Neural Networks
- Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo
- Learning Determinantal Point Processes with Moments and Cycles
- Learning Discrete Representations via Information Maximizing Self-Augmented Training
- Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis
- Learning Gradient Descent: Better Generalization and Longer Horizons
- Learning Hawkes Processes from Short Doubly-Censored Event Sequences
- Learning Hierarchical Features from Deep Generative Models
- Learning Important Features Through Propagating Activation Differences
- Learning Infinite Layer Networks without the Kernel Trick
- Learning in POMDPs with Monte Carlo Tree Search
- Learning Latent Space Models with Angular Constraints
- Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture
- Learning Stable Stochastic Nonlinear Dynamical Systems
- Learning Texture Manifolds with the Periodic Spatial GAN
- Learning the Structure of Generative Models without Labeled Data
- Learning to Aggregate Ordinal Labels by Maximizing Separating Width
- Learning to Align the Source Code to the Compiled Object Code
- Learning to Detect Sepsis with a Multitask Gaussian Process RNN Classifier
- Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
- Learning to Discover Sparse Graphical Models
- Learning to Generate Long-term Future via Hierarchical Prediction
- Learning to Generate Natural Language
- Learning to Learn without Gradient Descent by Gradient Descent
- Leveraging Node Attributes for Incomplete Relational Data
- Leveraging Union of Subspace Structure to Improve Constrained Clustering
- Lifelong Learning: A Reinforcement Learning Approach
- Local Bayesian Optimization of Motor Skills
- Local-to-Global Bayesian Network Structure Learning
- Logarithmic Time One-Against-Some
- Lost Relatives of the Gumbel Trick
- Machine Learning for Autonomous Vehicles
- Machine Learning for Music Discovery
- Machine Learning in Speech and Language Processing
- Magnetic Hamiltonian Monte Carlo
- Maximum Selection and Ranking under Noisy Comparisons
- Max-value Entropy Search for Efficient Bayesian Optimization
- McGan: Mean and Covariance Feature Matching GAN
- Measuring Sample Quality with Kernels
- MEC: Memory-efficient Convolution for Deep Neural Network
- meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting
- Meritocratic Fairness for Cross-Population Selection
- Meta Networks
- Minimax Regret Bounds for Reinforcement Learning
- Minimizing Trust Leaks for Robust Sybil Detection
- ML on a budget: IoT, Mobile and other tiny-ML applications
- Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
- Model-Independent Online Learning for Influence Maximization
- Modular Multitask Reinforcement Learning with Policy Sketches
- Multichannel End-to-end Speech Recognition
- Multi-Class Optimal Margin Distribution Machine
- Multi-fidelity Bayesian Optimisation with Continuous Approximations
- Multilabel Classification with Group Testing and Codes
- Multilevel Clustering via Wasserstein Means
- Multi-objective Bandits: Optimizing the Generalized Gini Index
- Multiple Clustering Views from Multiple Uncertain Experts
- Multiplicative Normalizing Flows for Variational Bayesian Neural Networks
- Multi-task Learning with Labeled and Unlabeled Tasks
- Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
- Nearly Optimal Robust Matrix Completion
- Near-Optimal Design of Experiments via Regret Minimization
- Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
- Neural Episodic Control
- Neural Message Passing for Quantum Chemistry
- Neural networks and rational functions
- Neural Optimizer Search using Reinforcement Learning
- Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks
- Nonnegative Matrix Factorization for Time Series Recovery From a Few Temporal Aggregates
- Nonparanormal Information Estimation
- No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis
- Nyström Method with Kernel K-means++ Samples as Landmarks
- On Approximation Guarantees for Greedy Low Rank Optimization
- On Calibration of Modern Neural Networks
- On Context-Dependent Clustering of Bandits
- On Kernelized Multi-armed Bandits
- Online and Linear-Time Attention by Enforcing Monotonic Alignments
- Online Learning to Rank in Stochastic Click Models
- Online Learning with Local Permutations and Delayed Feedback
- Online Partial Least Square Optimization: Dropping Convexity for Better Efficiency and Scalability
- On Mixed Memberships and Symmetric Nonnegative Matrix Factorizations
- On orthogonality and learning RNNs with long term dependencies
- On Relaxing Determinism in Arithmetic Circuits
- On the Expressive Power of Deep Neural Networks
- On the Iteration Complexity of Support Recovery via Hard Thresholding Pursuit
- On The Projection Operator to A Three-view Cardinality Constrained Set
- On the Sampling Problem for Kernel Quadrature
- Optimal Algorithms for Smooth and Strongly Convex Distributed Optimization in Networks
- Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
- Optimal Densification for Fast and Accurate Minwise Hashing
- OptNet: Differentiable Optimization as a Layer in Neural Networks
- Oracle Complexity of Second-Order Methods for Finite-Sum Problems
- Ordinal Graphical Models: A Tale of Two Approaches
- Orthogonalized ALS: A Theoretically Principled Tensor Decomposition Algorithm for Practical Use
- Pain-Free Random Differential Privacy with Sensitivity Sampling
- Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space
- Parallel Multiscale Autoregressive Density Estimation
- Parseval Networks: Improving Robustness to Adversarial Examples
- Partitioned Tensor Factorizations for Learning Mixed Membership Models
- Picky Learners: Choosing Alternative Ways to Process Data
- PixelCNN Models with Auxiliary Variables for Natural Image Modeling
- Post-Inference Prior Swapping
- Practical Gauss-Newton Optimisation for Deep Learning
- Prediction and Control with Temporal Segment Models
- Prediction under Uncertainty in Sparse Spectrum Gaussian Processes with Applications to Filtering and Control
- Preferential Bayesian Optimization
- Principled Approaches to Deep Learning
- Private and Secure Machine Learning
- Priv’IT: Private and Sample Efficient Identity Testing
- Probabilistic Path Hamiltonian Monte Carlo
- Probabilistic Submodular Maximization in Sub-Linear Time
- Programming with a Differentiable Forth Interpreter
- Projection-free Distributed Online Learning in Networks
- ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices
- Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
- Provably Optimal Algorithms for Generalized Linear Contextual Bandits
- Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks
- Random Feature Expansions for Deep Gaussian Processes
- Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees
- Real-Time Adaptive Image Compression
- Real World Interactive Learning
- Recent Advances in Stochastic Convex and Non-Convex Optimization
- Recovery Guarantees for One-hidden-layer Neural Networks
- Recurrent Highway Networks
- Recursive Partitioning for Personalization using Observational Data
- Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning
- Regret Minimization in Behaviorally-Constrained Zero-Sum Games
- Regularising Non-linear Models Using Feature Side-information
- Reinforcement Learning with Deep Energy-Based Policies
- Reinforcement Learning Workshop
- Relative Fisher Information and Natural Gradient for Learning Large Modular Models
- Reliable Machine Learning in the Wild
- Reproducibility in Machine Learning Research
- Re-revisiting Learning on Hypergraphs: Confidence Interval and Subgradient Method
- Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things
- Risk Bounds for Transferring Representations With and Without Fine-Tuning
- Robust Adversarial Reinforcement Learning
- Robust Budget Allocation via Continuous Submodular Functions
- RobustFill: Neural Program Learning under Noisy I/O
- Robust Gaussian Graphical Model Estimation with Arbitrary Corruption
- Robust Guarantees of Stochastic Greedy Algorithms
- Robustness Meets Algorithms (and Vice-Versa)
- Robust Probabilistic Modeling with Bayesian Data Reweighting
- Robust Structured Estimation with Single-Index Models
- Robust Submodular Maximization: A Non-Uniform Partitioning Approach
- Rule-Enhanced Penalized Regression by Column Generation using Rectangular Maximum Agreement
- Safety-Aware Algorithms for Adversarial Contextual Bandit
- SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient
- Scalable Bayesian Rule Lists
- Scalable Generative Models for Multi-label Learning with Missing Labels
- Scalable Multi-Class Gaussian Process Classification using Expectation Propagation
- Scaling Up Sparse Support Vector Machines by Simultaneous Feature and Sample Reduction
- Schema Networks: Zero-shot Transfer with a Generative Causal Model of Intuitive Physics
- Second-Order Kernel Online Convex Optimization with Adaptive Sketching
- Selective Inference for Sparse High-Order Interaction Models
- Self-Paced Co-training
- Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data
- Sequence Modeling via Segmentations
- Sequence to Better Sequence: Continuous Revision of Combinatorial Structures
- Sequence-To-Sequence Modeling with Neural Networks
- Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
- Sharp Minima Can Generalize For Deep Nets
- Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
- Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging
- Sliced Wasserstein Kernel for Persistence Diagrams
- Soft-DTW: a Differentiable Loss Function for Time-Series
- Source-Target Similarity Modelings for Multi-Source Transfer Gaussian Process Regression
- Sparse + Group-Sparse Dirty Models: Statistical Guarantees without Unreasonable Conditions and a Case for Non-Convexity
- Spectral Learning from a Single Trajectory under Finite-State Policies
- Spherical Structured Feature Maps for Kernel Approximation
- SPLICE: Fully Tractable Hierarchical Extension of ICA with Pooling
- SplitNet: Learning to Semantically Split Deep Networks for Parameter Reduction and Model Parallelization
- Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
- State-Frequency Memory Recurrent Neural Networks
- Statistical Inference for Incomplete Ranking Data: The Case of Rank-Dependent Coarsening
- StingyCD: Safely Avoiding Wasteful Updates in Coordinate Descent
- Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values
- Stochastic Bouncy Particle Sampler
- Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence
- Stochastic DCA for the Large-sum of Non-convex Functions Problem and its Application to Group Variable Selection in Classification
- Stochastic Generative Hashing
- Stochastic Gradient MCMC Methods for Hidden Markov Models
- Stochastic Gradient Monomial Gamma Sampler
- Stochastic modified equations and adaptive stochastic gradient algorithms
- Stochastic Variance Reduction Methods for Policy Evaluation
- Strongly-Typed Agents are Guaranteed to Interact Safely
- Strong NP-Hardness for Sparse Optimization with Concave Penalty Functions
- Sub-sampled Cubic Regularization for Non-convex Optimization
- Tensor Balancing on Statistical Manifold
- Tensor Belief Propagation
- Tensor Decomposition via Simultaneous Power Iteration
- Tensor Decomposition with Smoothness
- Tensor-Train Recurrent Neural Networks for Video Classification
- The loss surface of deep and wide neural networks
- Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank
- The Predictron: End-To-End Learning and Planning
- The Price of Differential Privacy For Online Learning
- The Sample Complexity of Online One-Class Collaborative Filtering
- The Shattered Gradients Problem: If resnets are the answer, then what is the question?
- The Statistical Recurrent Unit
- Tight Bounds for Approximate Carathéodory and Beyond
- Time Series Workshop
- Toward Controlled Generation of Text
- Toward Efficient and Accurate Covariance Matrix Estimation on Compressed Data
- Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering
- Towards Reinforcement Learning in the Real World
- Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
- Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference
- Uncorrelation and Evenness: a New Diversity-Promoting Regularizer
- Uncovering Causality from Multivariate Hawkes Integrated Cumulants
- Understanding Black-box Predictions via Influence Functions
- Understanding Synthetic Gradients and Decoupled Neural Interfaces
- Uniform Convergence Rates for Kernel Density Estimation
- Uniform Deviation Bounds for k-Means Clustering
- Unifying task specification in reinforcement learning
- Unimodal Probability Distributions for Deep Ordinal Classification
- Unsupervised Learning by Predicting Noise
- Variants of RMSProp and Adagrad with Logarithmic Regret Bounds
- Variational Boosting: Iteratively Refining Posterior Approximations
- Variational Dropout Sparsifies Deep Neural Networks
- Variational Inference for Sparse and Undirected Models
- Variational Policy for Guiding Point Processes
- Video Games and Machine Learning
- Video Pixel Networks
- Visualizing and Understanding Multilayer Perceptron Models: A Case Study in Speech Processing
- Warped Convolutions: Efficient Invariance to Spatial Transformations
- Wasserstein Generative Adversarial Networks
- When can Multi-Site Datasets be Pooled for Regression? Hypothesis Tests, $\ell_2$-consistency and Neuroscience Applications
- Why is Posterior Sampling Better than Optimism for Reinforcement Learning?
- Workshop on Computational Biology
- Workshop on Human Interpretability in Machine Learning (WHI)
- Workshop on Visualization for Deep Learning
- World of Bits: An Open-Domain Platform for Web-Based Agents
- Zero-Inflated Exponential Family Embeddings
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
- ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning
- Zonotope hit-and-run for efficient sampling from projection DPPs