Cycle I proceedings are available at http://jmlr.org/proceedings/papers/v32/.

- A Discriminative Latent Variable Model for Online Clustering
- Rajhans Samdani, Kai-Wei Chang, Dan Roth

[abs][pdf][supplementary]

- Kernel Mean Estimation and Stein Effect
- Krikamol Muandet, Kenji Fukumizu, Bharath Sriperumbudur, Arthur Gretton, Bernhard Schoelkopf

[abs][pdf][supplementary]

- Demystifying Information-Theoretic Clustering
- Greg Ver Steeg, Aram Galstyan, Fei Sha, Simon DeDeo

[abs][pdf][supplementary]

- Covering Number for Efficient Heuristic-based POMDP Planning
- Zongzhang Zhang, David Hsu, Wee Sun Lee

[abs][pdf][supplementary]

- The Coherent Loss Function for Classification
- Wenzhuo Yang, Melvyn Sim, Huan Xu

[abs][pdf][supplementary]

- Active Detection via Adaptive Submodularity
- Yuxin Chen, Hiroaki Shioi, Cesar Fuentes Montesinos, Lian Pin Koh, Serge Wich, Andreas Krause

[abs][pdf][supplementary]

- Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization
- Shai Shalev-Shwartz, Tong Zhang

- An Adaptive Accelerated Proximal Gradient Method and its Homotopy Continuation for Sparse Optimization
- Qihang Lin, Lin Xiao

[abs][pdf][supplementary]

- Recurrent Convolutional Neural Networks for Scene Labeling
- Pedro Pinheiro, Ronan Collobert

- Thompson Sampling for Complex Online Problems
- Aditya Gopalan, Shie Mannor, Yishay Mansour

[abs][pdf][supplementary]

- Boosting multi-step autoregressive forecasts
- Souhaib Ben Taieb, Rob Hyndman

[abs][pdf][supplementary]

- A Statistical Convergence Perspective of Algorithms for Rank Aggregation from Pairwise Data
- Arun Rajkumar, Shivani Agarwal

[abs][pdf][supplementary]

- Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations
- Timothy Mann, Shie Mannor

[abs][pdf][supplementary]

- Von Mises-Fisher Clustering Models
- Siddharth Gopal, Yiming Yang

[abs][pdf][supplementary]

- Convergence rates for persistence diagram estimation in Topological Data Analysis
- Frédéric Chazal, Marc Glisse, Catherine Labruère, Bertrand Michel

- Buffer k-d Trees: Processing Massive Nearest Neighbor Queries on GPUs
- Fabian Gieseke, Justin Heinermann, Cosmin Oancea, Christian Igel

- Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget
- Anoop Korattikara, Yutian Chen, Max Welling

[abs][pdf][supplementary]

- Understanding the Limiting Factors of Topic Modeling via Posterior Contraction Analysis
- Jian Tang, Zhaoshi Meng, Xuanlong Nguyen, Qiaozhu Mei, Ming Zhang

- The Inverse Regression Topic Model
- Maxim Rabinovich, David Blei

[abs][pdf][supplementary]

- A Consistent Histogram Estimator for Exchangeable Graph Models
- Stanley Chan, Edoardo Airoldi

- Latent Variable Copula Inference for Bundle Pricing from Retail Transaction Data
- Benjamin Letham, Wei Sun, Anshul Sheopuri

- Towards Minimax Online Learning with Unknown Time Horizon
- Haipeng Luo, Robert Schapire

[abs][pdf][supplementary]

- Factorized Point Process Intensities: A Spatial Analysis of Professional Basketball
- Andrew Miller, Luke Bornn, Ryan Adams, Kirk Goldsberry

[abs][pdf][supplementary]

- Margins, Kernels and Non-linear Smoothed Perceptrons
- Aaditya Ramdas, Javier Peña

[abs][pdf][supplementary]

- Robust RegBayes: Selectively Incorporating First-Order Logic Domain Knowledge into Bayesian Models
- Shike Mei, Jun Zhu, Jerry Zhu

- Learning Theory and Algorithms for revenue optimization in second price auctions with reserve
- Mehryar Mohri, Andres Munoz Medina

[abs][pdf][supplementary]

- Low-density Parity Constraints for Hashing-Based Discrete Integration
- Stefano Ermon, Carla Gomes, Ashish Sabharwal, Bart Selman

[abs][pdf][supplementary]

- Prediction with Limited Advice and Multiarmed Bandits with Paid Observations
- Yevgeny Seldin, Peter Bartlett, Koby Crammer, Yasin Abbasi-Yadkori

[abs][pdf][supplementary]

- Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts
- Tien Vu Nguyen, Dinh Phung, Xuanlong Nguyen, Svetha Venkatesh, Hung Bui

[abs][pdf][supplementary]

- Large-Margin Metric Learning for Constrained Partitioning Problems
- Rémi Lajugie, Francis Bach, Sylvain Arlot

- Wasserstein Propagation for Semi-Supervised Learning
- Justin Solomon, Raif Rustamov, Leonidas Guibas, Adrian Butscher

- Efficient Approximation of Cross-Validation for Kernel Methods using Bouligand Influence Function
- Yong Liu, Shali Jiang, Shizhong Liao

- Generalized Exponential Concentration Inequality for Renyi Divergence Estimation
- Shashank Singh, Barnabas Poczos

- Boosting with Online Binary Learners for the Multiclass Bandit Problem
- Shang-Tse Chen, Hsuan-Tien Lin, Chi-Jen Lu

- Optimal Budget Allocation: Theoretical Guarantee and Efficient Algorithm
- Tasuku Soma, Naonori Kakimura, Kazuhiro Inaba, Ken-ichi Kawarabayashi

[abs][pdf][supplementary]

- Computing Parametric Ranking Models via Rank-Breaking
- Hossein Azari Soufiani, David Parkes, Lirong Xia

[abs][pdf][supplementary]

- Tracking Adversarial Targets
- Yasin Abbasi-Yadkori, Peter Bartlett, Varun Kanade

[abs][pdf][supplementary]

- Online Bayesian Passive-Aggressive Learning
- Tianlin Shi, Jun Zhu

[abs][pdf][supplementary]

- Deterministic Policy Gradient Algorithms
- David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller

[abs][pdf][supplementary]

- Modeling Correlated Arrival Events with Latent Semi-Markov Processes
- Wenzhao Lian, Vinayak Rao, Brian Eriksson, Lawrence Carin

[abs][pdf][supplementary]

- Towards scaling up Markov chain Monte Carlo: an adaptive subsampling approach
- Rémi Bardenet, Arnaud Doucet, Chris Holmes

[abs][pdf][supplementary]

- Diagnosis determination: decision trees optimizing simultaneously worst and expected testing cost
- Ferdinando Cicalese, Eduardo Laber, Aline Medeiros Saettler

- Condensed Filter Tree for Cost-Sensitive Multi-Label Classification
- Chun-Liang Li, Hsuan-Tien Lin

[abs][pdf][supplementary]

- On Measure Concentration of Random Maximum A-Posteriori Perturbations
- Francesco Orabona, Tamir Hazan, Anand Sarwate, Tommi Jaakkola

[abs][pdf][supplementary]

- Dimension-free Concentration Bounds on Hankel Matrices for Spectral Learning
- François Denis, Mattias Gybels, Amaury Habrard

- On Modelling Non-linear Topical Dependencies
- Zhixing Li, Siqiang Wen, Juanzi Li, Peng Zhang, Jie Tang

- (Near) Dimension Independent Risk Bounds for Differentially Private Learning
- Prateek Jain, Abhradeep Guha Thakurta

[abs][pdf][supplementary]

- Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels
- Jiyan Yang, Vikas Sindhwani, Haim Avron, Michael Mahoney

[abs][pdf][supplementary]

- Forward-Backward Greedy Algorithms for General Convex Smooth Functions over A Cardinality Constraint
- Ji Liu, Jieping Ye, Ryohei Fujimaki

- Online Learning in Markov Decision Processes with Changing Cost Sequences
- Travis Dick, Andras Gyorgy, Csaba Szepesvari

[abs][pdf][supplementary]

- Unimodal Bandits: Regret Lower Bounds and Optimal Algorithms
- Richard Combes, Alexandre Proutiere

- Maximum Mean Discrepancy for Class Ratio Estimation: Convergence Bounds and Kernel Selection
- Arun Iyer, Saketha Nath, Sunita Sarawagi

[abs][pdf][supplementary]

- Asymptotically consistent estimation of the number of change points in highly dependent time series
- Azadeh Khaleghi, Daniil Ryabko

- Coordinate-descent for learning orthogonal matrices through Givens rotations
- Uri Shalit, Gal Chechik

[abs][pdf][supplementary]

- Densifying One Permutation Hashing via Rotation for Fast Near Neighbor Search
- Anshumali Shrivastava, Ping Li

- A Divide-and-Conquer Solver for Kernel Support Vector Machines
- Cho-Jui Hsieh, Si Si, Inderjit Dhillon

[abs][pdf][supplementary]

- Nuclear Norm Minimization via Active Subspace Selection
- Cho-Jui Hsieh, Peder Olsen

[abs][pdf][supplementary]

- Provable Bounds for Learning Some Deep Representations
- Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma

- Large-scale Multi-label Learning with Missing Labels
- Hsiang-Fu Yu, Prateek Jain, Purushottam Kar, Inderjit Dhillon

[abs][pdf][supplementary]

- Learning Graphs with a Few Hubs
- Rashish Tandon, Pradeep Ravikumar

[abs][pdf][supplementary]

- Agnostic Bayesian Learning of Ensembles
- Alexandre Lacoste, Mario Marchand, François Laviolette, Hugo Larochelle

[abs][pdf][supplementary]

- Towards an optimal stochastic alternating direction method of multipliers
- Samaneh Azadi, Suvrit Sra

[abs][pdf][supplementary]

- Spherical Hamiltonian Monte Carlo for Constrained Target Distributions
- Shiwei Lan, Bo Zhou, Babak Shahbaba

- Efficient Continuous-Time Markov Chain Estimation
- Monir Hajiaghayi, Bonnie Kirkpatrick, Liangliang Wang, Alexandre Bouchard-Côté

[abs][pdf][supplementary]

- DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
- Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell

- Making the Most of Bag of Words: Sentence Regularization with Alternating Direction Method of Multipliers
- Dani Yogatama, Noah Smith

- Narrowing the Gap: Random Forests In Theory and In Practice
- Misha Denil, David Matheson, Nando De Freitas

[abs][pdf][supplementary]

- Coherent Matrix Completion
- Yudong Chen, Srinadh Bhojanapalli, Sujay Sanghavi, Rachel Ward

[abs][pdf][supplementary]

- Admixture of Poisson MRFs: A Topic Model with Word Dependencies
- David Inouye, Pradeep Ravikumar, Inderjit Dhillon

[abs][pdf][supplementary]

- Memory Efficient Kernel Approximation
- Si Si, Cho-Jui Hsieh, Inderjit Dhillon

[abs][pdf][supplementary]

- Learning Sum-Product Networks with Direct and Indirect Variable Interactions
- Amirmohammad Rooshenas, Daniel Lowd

[abs][pdf][supplementary]

- Hamiltonian Monte Carlo Without Detailed Balance
- Jascha Sohl-Dickstein, Mayur Mudigonda, Michael DeWeese

- Filtering with Abstract Particles
- Jacob Steinhardt, Percy Liang

[abs][pdf][supplementary]

- Stochastic Dual Coordinate Ascent with Alternating Direction Method of Multipliers
- Taiji Suzuki

[abs][pdf][supplementary]

- Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction
- Jian Zhou, Olga Troyanskaya

- An Efficient Approach for Assessing Hyperparameter Importance
- Frank Hutter, Holger Hoos, Kevin Leyton-Brown

[abs][pdf][supplementary]

- Global Graph Kernels Using Geometric Embeddings
- Fredrik Johansson, Vinay Jethava, Devdatt Dubhashi, Chiranjib Bhattacharyya

[abs] [pdf] [supplementary]

- Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data
- Zhiyuan Chen, Bing Liu

[abs] [pdf] [supplementary]

- K-means Recovers ICA Filters when Independent Components are Sparse
- Alon Vinnikov, Shai Shalev-Shwartz

[abs] [pdf] [supplementary]

- Learning Mixtures of Linear Classifiers
- Yuekai Sun, Stratis Ioannidis, Andrea Montanari

[abs] [pdf] [supplementary]

- The Falling Factorial Basis and Its Statistical Applications
- Yu-Xiang Wang, Ryan Tibshirani, Alex Smola

[abs] [pdf] [supplementary]

- Nonmyopic $\epsilon$-Bayes-Optimal Active Learning of Gaussian Processes
- Trong Nghia Hoang, Bryan Kian Hsiang Low, Patrick Jaillet, Mohan Kankanhalli

[abs] [pdf] [supplementary]

- A Unifying View of Representer Theorems
- Andreas Argyriou, Francesco Dinuzzo

[abs] [pdf] [supplementary]

- Online Clustering of Bandits
- Claudio Gentile, Shuai Li, Giovanni Zappella

[abs] [pdf] [supplementary]

- Cold-start Active Learning with Robust Ordinal Matrix Factorization
- Neil Houlsby, Jose Miguel Hernandez-Lobato, Zoubin Ghahramani

[abs] [pdf] [supplementary]

- Multivariate Maximal Correlation Analysis
- Hoang Vu Nguyen, Emmanuel Müller, Jilles Vreeken, Pavel Efros, Klemens Böhm

- Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm
- Hadi Daneshmand, Manuel Gomez-Rodriguez, Le Song, Bernhard Schoelkopf

[abs] [pdf] [supplementary]

- Coupled Group Lasso for Web-Scale CTR Prediction in Display Advertising
- Ling Yan, Wu-Jun Li, Gui-Rong Xue, Dingyi Han

- Putting MRFs on a Tensor Train
- Alexander Novikov, Anton Rodomanov, Anton Osokin, Dmitry Vetrov

[abs] [pdf] [supplementary]

- Efficient Algorithms for Robust One-bit Compressive Sensing
- Lijun Zhang, Jinfeng Yi, Rong Jin

[abs] [pdf] [supplementary]

- Learning Complex Neural Network Policies with Trajectory Optimization
- Sergey Levine, Vladlen Koltun

[abs] [pdf] [supplementary]

- Composite Quantization for Approximate Nearest Neighbor Search
- Ting Zhang, Chao Du, Jingdong Wang

[abs] [pdf] [supplementary]

- Local Ordinal Embedding
- Yoshikazu Terada, Ulrike von Luxburg

[abs] [pdf] [supplementary]

- Reducing Dueling Bandits to Cardinal Bandits
- Nir Ailon, Zohar Karnin, Thorsten Joachims

[abs] [pdf] [supplementary]

- Large-margin Weakly Supervised Dimensionality Reduction
- Chang Xu, Dacheng Tao, Chao Xu, Yong Rui

[abs] [pdf] [supplementary]

- Joint Inference of Multiple Label Types in Large Networks
- Deepayan Chakrabarti, Stanislav Funiak, Jonathan Chang, Sofus Macskassy

- Maximum Margin Multiclass Nearest Neighbors
- Aryeh Kontorovich, Roi Weiss

[abs] [pdf] [supplementary]

- Combinatorial Partial Monitoring Game with Linear Feedback and Its Applications
- Tian Lin, Bruno Abrahao, Robert Kleinberg, John Lui, Wei Chen

[abs] [pdf] [supplementary]

- Sparse Meta-Gaussian Information Bottleneck
- Mélanie Rey, Volker Roth, Thomas Fuchs

[abs] [pdf] [supplementary]

- Nonparametric Estimation of Renyi Divergence and Friends
- Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

[abs] [pdf] [supplementary]

- Robust Inverse Covariance Estimation under Noisy Measurements
- Jun-Kun Wang, Ting-Wei Lin, Shou-de Lin

- Bayesian Optimization with Inequality Constraints
- Jacob Gardner, Matt Kusner, Kilian Weinberger, John Cunningham, Zhixiang (Eddie) Xu

[abs] [pdf] [supplementary]

- Multiple Testing under Dependence via Semiparametric Graphical Models
- Jie Liu, Chunming Zhang, Elizabeth Burnside, David Page

[abs] [pdf] [supplementary]

- Making Fisher Discriminant Analysis Scalable
- Bojun Tu, Hui Qian, Zhihua Zhang

[abs] [pdf] [supplementary]

- Hierarchical Dirichlet Scaling Process
- Dongwoo Kim, Alice Oh

[abs] [pdf] [supplementary]

- Approximation Analysis of Stochastic Gradient Langevin Dynamics by using Fokker-Planck Equation and Ito Process
- Issei Sato, Hiroshi Nakagawa

- Communication-Efficient Distributed Optimization using an Approximate Newton-type Method
- Ohad Shamir, Nati Srebro, Tong Zhang

[abs] [pdf] [supplementary]

- Concept Drift Detection Through Resampling
- Maayan Harel, Shie Mannor, Ran El-Yaniv, Koby Crammer

[abs] [pdf] [supplementary]

- Anti-differentiating Approximation Algorithms: A case study with Min-cuts, Spectral, and Flow
- David Gleich, Michael Mahoney

- A Bayesian Wilcoxon Signed-rank Test Based on the Dirichlet Process
- Alessio Benavoli, Giorgio Corani, Francesca Mangili, Marco Zaffalon, Fabrizio Ruggeri

[abs] [pdf] [supplementary]

- Min-Max Problems on Factor Graphs
- Siamak Ravanbakhsh, Christopher Srinivasa, Brendan Frey, Russell Greiner

[abs] [pdf] [supplementary]

- Distributed Stochastic Gradient MCMC
- Sungjin Ahn, Babak Shahbaba, Max Welling

[abs] [pdf] [supplementary]

- Preference-Based Rank Elicitation using Statistical Models: The Case of Mallows
- Robert Busa-Fekete, Balázs Szörényi, Eyke Huellermeier

[abs] [pdf] [supplementary]

- Hierarchical Conditional Random Fields for Outlier Detection: An Application to Detecting Epileptogenic Cortical Malformations
- Bilal Ahmed, Karen Blackmon, Thomas Thesen, Ruben Kuzniecky, Chad Carlson, Jacqueline French, Werner Doyle, Carla Brodley

- A Physics-Based Model Prior for Object-Oriented MDPs
- Jonathan Scholz, Martin Levihn, Charles Isbell

- Outlier Path: A Homotopy Algorithm for Robust SVM
- Shinya Suzumura, Kohei Ogawa, Masashi Sugiyama, Ichiro Takeuchi

[abs] [pdf] [supplementary]

- Ensemble-Based Tracking: Aggregating Crowdsourced Structured Time Series Data
- Naiyan Wang, Dit-Yan Yeung

[abs] [pdf] [supplementary]

- Latent Confusion Analysis by Normalized Gamma Construction
- Issei Sato, Hisashi Kashima, Hiroshi Nakagawa

[abs] [pdf] [supplementary]

- Finito: A Faster, Permutable Incremental Gradient Method for Big Data Problems
- Aaron Defazio, Justin Domke, Tiberio Caetano

[abs] [pdf] [supplementary]

- Ensemble Methods for Structured Prediction
- Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri

[abs] [pdf] [supplementary]

- Standardized Mutual Information for Clustering Comparisons: One Step Further in Adjustment for Chance
- Simone Romano, James Bailey, Vinh Nguyen, Karin Verspoor

[abs] [pdf] [supplementary]

- Preserving Modes and Messages via Diverse Particle Selection
- Jason Pacheco, Silvia Zuffi, Michael Black, Erik Sudderth

[abs] [pdf] [supplementary]

- Nonlinear Information-Theoretic Compressive Measurement Design
- Liming Wang, Abolfazl Razi, Miguel Rodrigues, Robert Calderbank, Lawrence Carin

[abs] [pdf] [supplementary]

- Dual Query: Practical Private Query Release for High Dimensional Data
- Marco Gaboardi, Emilio Jesus Gallego Arias, Justin Hsu, Aaron Roth, Zhiwei Steven Wu

- Deep Boosting
- Corinna Cortes, Mehryar Mohri, Umar Syed

[abs] [pdf] [supplementary]

- Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models
- Robert McGibbon, Bharath Ramsundar, Mohammad Sultan, Gert Kiss, Vijay Pande

- Online Multi-Task Learning for Policy Gradient Methods
- Haitham Bou Ammar, Eric Eaton, Paul Ruvolo, Matthew Taylor

- Learning the Parameters of Determinantal Point Process Kernels
- Raja Hafiz Affandi, Emily Fox, Ryan Adams, Ben Taskar

[abs] [pdf] [supplementary]

- Deep AutoRegressive Networks
- Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra

[abs] [pdf] [supplementary]

- A Convergence Rate Analysis for LogitBoost, MART and Their Variant
- Peng Sun, Tong Zhang, Jie Zhou

[abs] [pdf] [supplementary]

- Inferning with High Girth Graphical Models
- Uri Heinemann, Amir Globerson

[abs] [pdf] [supplementary]

- Learning Latent Variable Gaussian Graphical Models
- Zhaoshi Meng, Brian Eriksson, Al Hero

[abs] [pdf] [supplementary]

- Stochastic Backpropagation and Approximate Inference in Deep Generative Models
- Danilo Jimenez Rezende, Shakir Mohamed, Daan Wierstra

[abs] [pdf] [supplementary]

- One Practical Algorithm for Both Stochastic and Adversarial Bandits
- Yevgeny Seldin, Aleksandrs Slivkins

[abs] [pdf] [supplementary]

- Robust and Efficient Kernel Hyperparameter Paths with Guarantees
- Joachim Giesen, Soeren Laue, Patrick Wieschollek

- Active Transfer Learning under Model Shift
- Xuezhi Wang, Tzu-Kuo Huang, Jeff Schneider

[abs] [pdf] [supplementary]

- Approximate Policy Iteration Schemes: A Comparison
- Bruno Scherrer

[abs] [pdf] [supplementary]

- Robust and Efficient Representation Learning with Nonnegativity Constraints
- Tsung-Han Lin

- Sample Efficient Reinforcement Learning with Gaussian Processes
- Robert Grande, Thomas Walsh, Jonathan How

[abs] [pdf] [supplementary]

- Memory and Computation Efficient PCA via Very Sparse Random Projections
- Farhad Pourkamali Anaraki, Shannon Hughes

[abs] [pdf] [supplementary]

- Time-Regularized Interrupting Options (TRIO)
- Timothy Mann, Daniel Mankowitz, Shie Mannor

[abs] [pdf] [supplementary]

- Randomized Nonlinear Component Analysis
- David Lopez-Paz, Suvrit Sra, Alex Smola, Zoubin Ghahramani, Bernhard Schoelkopf

- High Order Regularization for Semi-Supervised Learning of Structured Output Problems
- Yujia Li, Rich Zemel

[abs] [pdf] [supplementary]

- Transductive Learning with Multi-class Volume Approximation
- Gang Niu, Bo Dai, Christoffel du Plessis, Masashi Sugiyama

[abs] [pdf] [supplementary]

- Methods of Moments for Learning Stochastic Languages: Unified Presentation and Empirical Comparison
- Borja Balle, William Hamilton, Joelle Pineau

[abs] [pdf] [supplementary]

- Effective Bayesian Modeling of Groups of Related Count Time Series
- Nicolas Chapados

[abs] [pdf] [supplementary]

- Variational Inference for Sequential Distance Dependent Chinese Restaurant Process
- Sergey Bartunov, Dmitry Vetrov

- Discovering Latent Network Structure in Point Process Data
- Scott Linderman, Ryan Adams

[abs] [pdf] [supplementary]

- Learning Representations for Interacting Manifolds with Higher-order Boltzmann Machines
- Scott Reed, Kihyuk Sohn, Yuting Zhang, Honglak Lee

- Learning Modular Structures from Network Data and Node Variables
- Elham Azizi, Edoardo Airoldi

- Probabilistic Partial Canonical Correlation Analysis
- Yusuke Mukuta, Tatsuya Harada

[abs] [pdf] [supplementary]

- Skip Context Tree Switching
- Marc Bellemare, Joel Veness, Erik Talvitie, Alex Graves

[abs] [pdf] [supplementary]

- Lower Bounds for the Gibbs Sampler over Mixtures of Gaussians
- Christopher Tosh, Sanjoy Dasgupta

- Marginalized Denoising Auto-encoders for Nonlinear Representations
- Minmin Chen, Kilian Weinberger, Fei Sha, Yoshua Bengio

- Gaussian Processes for Bayesian Estimation in Ordinary Differential Equations
- David Barber, Yali Wang

- Fast Multi-stage Submodular Maximization
- Kai Wei, Rishabh Iyer, Jeff Bilmes

[abs] [pdf] [supplementary]

- Programming by Feedback
- Marc Schoenauer, Riad Akrour, Michele Sebag, Jean-Christophe Souplet

- Probabilistic Matrix Factorization with Non-random Missing Data
- Jose Miguel Hernandez-Lobato, Neil Houlsby, Zoubin Ghahramani

[abs] [pdf] [supplementary]

- Pursuit-Evasion Without Regrets, with an Application to Trading
- Lili Dworkin, Michael Kearns, Yuriy Nevmyvaka

- The f-Adjusted Laplacian: a Diagonal Perturbation with a Geometric Interpretation
- Sven Kurras, Ulrike von Luxburg, Gilles Blanchard

[abs] [pdf] [supplementary]

- Riemannian Pursuit for Big Matrix Recovery
- Mingkui Tan, Ivor W. Tsang, Li Wang, Jialin Pan, Bart Vandereycken

[abs] [pdf] [supplementary]

- Dynamic Programming Boosting for Discriminative Macro-Action Discovery
- Leonidas Lefakis, Francois Fleuret

- Resource-Efficient Stochastic Optimization of a Locally Smooth Function under Correlated Bandit Feedback
- Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

[abs] [pdf] [supplementary]

- Weighted Graph Clustering with Non-Uniform Uncertainties
- Yudong Chen, Shiau Hong Lim, Huan Xu

[abs] [pdf] [supplementary]

- GeNGA: A Generalization of Natural Gradient Ascent with Positive and Negative Convergence Results
- Philip Thomas

- A Bayesian Framework for Online Classifier Ensemble
- Qinxun Bai, Henry Lam, Stan Sclaroff

[abs] [pdf] [supplementary]

- Adaptivity and Optimism: An Improved Exponentiated Gradient Algorithm
- Jacob Steinhardt, Percy Liang

[abs] [pdf] [supplementary]

- Gaussian Approximation of Collective Graphical Models
- Liping Liu, Daniel Sheldon, Thomas Dietterich

[abs] [pdf] [supplementary]

- One-Bit Object Detection: On Learning to Localize Objects with Minimal Supervision
- Hyun Oh Song, Ross Girshick, Stefanie Jegelka, Julien Mairal, Zaid Harchaoui, Trevor Darrell

- Multiresolution Matrix Factorization
- Risi Kondor, Nedelina Teneva, Vikas Garg

[abs] [pdf] [supplementary]

- Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits
- Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford, Lihong Li, Robert Schapire

- Structured Recurrent Temporal Restricted Boltzmann Machines
- Roni Mittelman, Benjamin Kuipers, Silvio Savarese, Honglak Lee

- Scalable and Robust Bayesian Inference via the Median Posterior
- Stanislav Minsker, Sanvesh Srivastava, Lizhen Lin, David Dunson

[abs] [pdf] [supplementary]

- Kernel Adaptive Metropolis-Hastings
- Dino Sejdinovic, Heiko Strathmann, Maria Lomeli Garcia, Christophe Andrieu, Arthur Gretton

[abs] [pdf] [supplementary]

- Input Warping for Bayesian Optimization of Non-stationary Functions
- Jasper Snoek, Kevin Swersky, Rich Zemel, Ryan Adams

- Stochastic Gradient Hamiltonian Monte Carlo
- Tianqi Chen, Emily Fox, Carlos Guestrin

[abs] [pdf] [supplementary]

- A Deep Semi-NMF Model for Learning Hidden Representations
- George Trigeorgis, Konstantinos Bousmalis, Stefanos Zafeiriou, Bjoern Schuller

[abs] [pdf] [supplementary]

- Asynchronous Distributed ADMM Algorithm for Global Variable Consensus Optimization
- Ruiliang Zhang, James Kwok

- Spectral Regularization for Max-Margin Sequence Tagging
- Ariadna Quattoni, Borja Balle, Xavier Carreras, Amir Globerson

- Learning by Stretching Deep Networks
- Gaurav Pandey, Ambedkar Dukkipati

[abs] [pdf] [supplementary]

- Nonnegative Sparse PCA with Provable Guarantees
- Megasthenis Asteris, Alexandros Dimakis, Dimitris Papailiopoulos

[abs] [pdf] [supplementary]

- Active Learning of Parameterized Skills
- Bruno Da Silva, George Konidaris, Andrew Barto

[abs] [pdf] [supplementary]

- Learning Ordered Representations with Nested Dropout
- Oren Rippel, Michael Gelbart, Ryan Adams

[abs] [pdf] [supplementary]

- Learning the Irreducible Representations of Commutative Lie Groups
- Taco Cohen, Max Welling

[abs] [pdf] [supplementary]

- Towards End-To-End Speech Recognition with Recurrent Neural Networks
- Alex Graves, Navdeep Jaitly

- Multi-period Trading Prediction Markets with Connections to Machine Learning
- Jinli Hu, Amos Storkey

[abs] [pdf] [supplementary]

- Efficient Gradient-Based Inference through Transformations between Bayes Nets and Neural Nets
- Diederik Kingma, Max Welling

- Neural Variational Inference and Learning in Belief Networks
- Andriy Mnih, Karol Gregor

[abs] [pdf] [supplementary]

- Scalable Nonparametric Bayesian Analysis of Incomplete Multiway Data
- Piyush Rai, Yingjian Wang, Lawrence Carin

[abs] [pdf] [supplementary]

- Learning Character-level Representations for Part-of-Speech Tagging
- Cicero Dos Santos, Bianca Zadrozny

- Saddle Points and Accelerated Perceptron Algorithms
- Adams Wei Yu, Fatma Kilinc-Karzan, Jaime Carbonell

[abs] [pdf] [supplementary]

- Robust Distance Metric Learning via Simultaneous L1-Norm Minimization and Maximization
- Hua Wang, Feiping Nie, Heng Huang

- Learning from Contagion (Without Timestamps)
- Kareem Amin, Hoda Heidari, Michael Kearns

[abs] [pdf] [supplementary]

- Stochastic Variational Inference for Bayesian Time Series Models
- Matthew Johnson, Alan Willsky

- Estimating Latent-Variable Graphical Models using Moments and Likelihoods
- Arun Tejasvi Chaganty, Percy Liang

[abs] [pdf] [supplementary]

- Universal Matrix Completion
- Srinadh Bhojanapalli, Prateek Jain

[abs] [pdf] [supplementary]

- Finding Dense Subgraphs via Low-Rank Bilinear Optimization
- Dimitris Papailiopoulos, Ioannis Mitliagkas, Alexandros Dimakis, Constantine Caramanis

[abs] [pdf] [supplementary]

- Compositional Morphology for Word Representations and Language Modelling
- Jan Botha, Phil Blunsom

- Learning Polynomials with Neural Networks
- Alexandr Andoni, Rina Panigrahy, Gregory Valiant, Li Zhang

- Exponential Family Matrix Completion under Structural Constraints
- Suriya Gunasekar, Pradeep Ravikumar, Joydeep Ghosh

[abs] [pdf] [supplementary]

- Sample-based Approximate Regularization
- Philip Bachman, Amir-Massoud Farahmand, Doina Precup

[abs] [pdf] [supplementary]

- Adaptive Monte-Carlo via Bandit Allocation
- James Neufeld, Andras Gyorgy, Csaba Szepesvari, Dale Schuurmans

[abs] [pdf] [supplementary]

- Efficient Dimensionality Reduction for High-Dimensional Network Estimation
- Safiye Celik, Benjamin Logsdon, Su-In Lee

- Deterministic Anytime Inference for Stochastic Continuous-Time Markov Processes
- E. Busra Celikkaya, Christian Shelton

- Doubly Stochastic Variational Bayes for non-Conjugate Inference
- Michalis Titsias, Miguel Lázaro-Gredilla

[abs] [pdf] [supplementary]

- Efficient Learning of Mahalanobis Metrics for Ranking
- Daryl Lim, Gert Lanckriet

[abs] [pdf] [supplementary]

- GEV-Canonical Regression for Accurate Binary Class Probability Estimation when One Class is Rare
- Arpit Agarwal, Harikrishna Narasimhan, Shivaram Kalyanakrishnan, Shivani Agarwal

[abs] [pdf] [supplementary]

- A Reversible Infinite HMM using Normalised Random Measures
- Konstantina Palla, David Knowles, Zoubin Ghahramani

[abs] [pdf] [supplementary]

- Structured Low-Rank Matrix Factorization: Optimality, Algorithm, and Applications to Image Processing
- Benjamin Haeffele, Rene Vidal, Eric Young

[abs] [pdf] [supplementary]

- Influence Function Learning in Information Diffusion Networks
- Nan Du, Yingyu Liang, Le Song, Maria Balcan

[abs] [pdf] [supplementary]

- An Information Geometry of Statistical Manifold Learning
- Ke Sun, Stéphane Marchand-Maillet

[abs] [pdf] [supplementary]

- Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem
- Masrour Zoghi, Shimon Whiteson, Remi Munos, Maarten de Rijke

[abs] [pdf] [supplementary]

- Concentration in Unbounded Metric Spaces and Algorithmic Stability
- Aryeh Kontorovich

[abs] [pdf] [supplementary]

- Spectral Bandits for Smooth Graph Functions
- Remi Munos, Michal Valko, Branislav Kveton, Tomas Kocak

[abs] [pdf] [supplementary]

- Robust Principal Component Analysis with Complex Noise
- Qian Zhao, Deyu Meng, Lei Zhang, Wangmeng Zuo, Zongben Xu

[abs] [pdf] [supplementary]

- Scalable Semidefinite Relaxation for Maximum A Posteriori Estimation
- Qixing Huang, Yuxin Chen, Leonidas Guibas

[abs] [pdf] [supplementary]

- Square Deal: Lower Bounds and Improved Relaxations for Tensor Recovery
- Cun Mu, Bo Huang, John Wright, Donald Goldfarb

[abs] [pdf] [supplementary]

- Automated Inference of Point of View from User Interactions in Collective Intelligence Venues
- Sanmay Das, Allen Lavoie

- Orthogonal Rank-One Matrix Pursuit for Matrix Completion
- Zheng Wang, Ming-Jun Lai, Zhaosong Lu, Wei Fan, Hasan Davulcu, Jieping Ye

- Near-Optimal Joint Object Matching via Convex Relaxation
- Yuxin Chen, Leonidas Guibas, Qixing Huang

[abs] [pdf] [supplementary]

- On p-norm Path Following in Multiple Kernel Learning for Non-linear Feature Selection
- Pratik Jawanpuria, Manik Varma, Saketha Nath

[abs] [pdf] [supplementary]

- Gradient Hard Thresholding Pursuit for Sparsity-Constrained Optimization
- Xiaotong Yuan, Ping Li, Tong Zhang

- Learning With Priors
- Jean Honorio, Tommi Jaakkola

[abs] [pdf] [supplementary]

- Geodesic Distance Function Learning via Heat Flows on Vector Fields
- Binbin Lin, Ji Yang, Xiaofei He, Jieping Ye

- Active Teaching for Crowdsourcing Classification
- Adish Singla, Ilija Bogunovic, Gabor Bartok, Amin Karbasi, Andreas Krause

[abs] [pdf] [supplementary]

- On the Convergence of No-regret Learning in Selfish Routing
- Benjamin Drighès, Walid Krichene, Alexandre Bayen

[abs] [pdf] [supplementary]

- Offline Evaluation of Recommendation Systems
- Olivier Nicol, Jérémie Mary, Philippe Preux

[abs] [pdf] [supplementary]

- Scaling Up Robust MDPs by Reinforcement Learning
- Aviv Tamar, Huan Xu, Shie Mannor

[abs] [pdf] [supplementary]

- Marginal Structured SVM with Hidden Variables
- Wei Ping, Qiang Liu, Alex Ihler

[abs] [pdf] [supplementary]

- From Exponential to Linear Complexity When Learning Practical Markov Random Fields
- Yariv Mizrahi, Nando De Freitas, Luis Tenorio

- Pitfalls in the Use of Parallel Inference for the Dirichlet Process
- Yarin Gal, Zoubin Ghahramani

- Optimal PAC Multiple Arm Identification with Applications to Crowdsourcing
- Yuan Zhou, Xi Chen, Jian Li

[abs] [pdf] [supplementary]

- Deep Generative Stochastic Networks Trainable by Backprop
- Yoshua Bengio, Eric Laufer, Jason Yosinski

[abs] [pdf] [supplementary]

- A Highly Scalable Parallel Algorithm for Isotropic Total Variation Models
- Jie Wang, Qingyang Li, Sen Yang, Wei Fan, Jieping Ye

[abs] [pdf] [supplementary]

- Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting
- Yudong Chen, Jiaming Xu

[abs] [pdf] [supplementary]

- Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy
- Dengyong Zhou, Qiang Liu, John Platt, Christopher Meek

- Exchangeable Variable Models
- Mathias Niepert, Pedro Domingos

[abs] [pdf] [supplementary]

- Clustering in the Presence of Background Noise
- Nika Haghtalab, Shai Ben-David

[abs] [pdf] [supplementary]

- Safe Screening with Variational Inequalities and Its Application to Lasso
- Jun Liu, Zheng Zhao, Jie Wang, Jieping Ye

[abs] [pdf] [supplementary]

- Learning the Consistent Behavior of Common Users for Target Node Prediction across Social Networks
- Shan-Hung Wu, Hao-Heng Chien, Kuan-Hua Lin, Philip Yu

[abs] [pdf] [supplementary]

- Signal Recovery from $\ell_p$ Pooling Representations
- Joan Bruna Estrach, Arthur Szlam, Yann LeCun

[abs] [pdf] [supplementary]

- PAC-inspired Option Discovery in Lifelong Reinforcement Learning
- Emma Brunskill, Lihong Li

[abs] [pdf] [supplementary]

- Multi-label Classification via Feature-aware Implicit Label Space Encoding
- Zijia Lin, Guiguang Ding, Mingqing Hu, Jianmin Wang

[abs] [pdf] [supplementary]

- Scalable Gaussian Process Structured Prediction for Grid Factor Graph Applications
- Sebastien Bratieres, Novi Quadrianto, Sebastian Nowozin, Zoubin Ghahramani

[abs] [pdf] [supplementary]

- Hierarchical Quasi-Clustering Methods for Asymmetric Networks
- Gunnar Carlsson, Facundo Mémoli, Alejandro Ribeiro, Santiago Segarra

[abs] [pdf] [supplementary]

- Rectangular Tiling Process
- Masahiro Nakano, Katsuhiko Ishiguro, Akisato Kimura, Takeshi Yamada, Naonori Ueda

[abs] [pdf] [supplementary]

- Two-Stage Metric Learning
- Jun Wang, Ke Sun, Fei Sha, Stéphane Marchand-Maillet, Alexandros Kalousis

[abs] [pdf] [supplementary]

- Stochastic Inference for Scalable Probabilistic Modeling of Binary Matrices
- Jose Miguel Hernandez-Lobato, Neil Houlsby, Zoubin Ghahramani

[abs] [pdf] [supplementary]

- Elementary Estimators for High-Dimensional Linear Regression
- Eunho Yang, Aurelie Lozano, Pradeep Ravikumar

[abs] [pdf] [supplementary]

- Elementary Estimators for Sparse Covariance Matrices and other Structured Moments
- Eunho Yang, Aurelie Lozano, Pradeep Ravikumar

[abs] [pdf] [supplementary]

- Learning with Smoothness: Pointwise, Graph-based, Probabilistic
- Yuan Fang, Kevin Chang, Hady Lauw

[abs] [pdf] [supplementary]

- Bayesian Max-margin Multi-Task Learning with Data Augmentation
- Chengtao Li, Jun Zhu, Jianfei Chen

- Sparse Reinforcement Learning via Convex Optimization
- Zhiwei Qin, Weichang Li

[abs] [pdf] [supplementary]

- Gaussian Process Classification and Active Learning with Multiple Annotators
- Filipe Rodrigues, Francisco Pereira, Bernardete Ribeiro

[abs] [pdf] [supplementary]

- Structured Prediction of Network Response
- Hongyu Su, Aristides Gionis, Juho Rousu

[abs] [pdf] [supplementary]

- An Analysis of State-Relevance Weights and Sampling Distributions on L1-Regularized Approximate Linear Programming Approximation Accuracy
- Gavin Taylor, Connor Geer, David Piekut

- Optimization Equivalence of Divergences Improves Neighbor Embedding
- Zhirong Yang, Jaakko Peltonen, Samuel Kaski

[abs] [pdf] [supplementary]

- An Asynchronous Parallel Stochastic Coordinate Descent Algorithm
- Ji Liu, Steve Wright, Christopher Re, Srikrishna Sridhar, Victor Bittorf

- Consistency of Causal Inference under the Additive Noise Model
- Samory Kpotufe, Eleni Sgouritsa, Dominik Janzing, Bernhard Schoelkopf

- Globally Convergent Parallel MAP LP Relaxation Solver using the Frank-Wolfe Algorithm
- Alexander Schwing, Tamir Hazan, Marc Pollefeys, Raquel Urtasun

- Linear Programming for Large-Scale Markov Decision Problems
- Alan Malek, Yasin Abbasi-Yadkori, Peter Bartlett

[abs] [pdf] [supplementary]

- Implicit Particle Sequential Monte Carlo
- Seong-Hwan Jun, Alexandre Bouchard-Côté

[abs] [pdf] [supplementary]

- Scaling SVM and Least Absolute Deviations via Exact Data Reduction
- Jie Wang, Jieping Ye

[abs] [pdf] [supplementary]

- Least Squares Revisited: Scalable Approaches for Multi-class Prediction
- Alekh Agarwal, Sham Kakade, Nikos Karampatziakis, Le Song, Gregory Valiant

- Local Algorithms for Interactive Clustering
- Pranjal Awasthi, Konstantin Voevodski, Maria Balcan

- Learning and Planning with Relational Uncertainty Predicates over the Existence of Objects
- Vien Ngo, Marc Toussaint

- A New Q($\lambda$)
- Rich Sutton, Ashique Rupam Mahmood, Doina Precup, Hado van Hasselt

[abs] [pdf] [supplementary]

- On Robustness and Regularization of Structural Support Vector Machines
- Mohamad Ali Torkamani, Daniel Lowd

[abs] [pdf] [supplementary]

- Guess-Averse Loss Functions for Cost-Sensitive Multiclass Boosting
- Oscar Beijbom, Mohammad Saberian, Nuno Vasconcelos, David Kriegman

[abs] [pdf] [supplementary]

- Multimodal Neural Language Models
- Ryan Kiros, Ruslan Salakhutdinov, Rich Zemel

[abs] [pdf] [supplementary]

- An Adaptive Low Dimensional quasi-Newton Sum of Functions Optimizer
- Jascha Sohl-Dickstein, Ben Poole, Surya Ganguli

- Alternating Minimization for Mixed Linear Regression
- Xinyang Yi, Constantine Caramanis, Sujay Sanghavi

[abs] [pdf] [supplementary]

- Stochastic Neighbor Compression
- Matt Kusner, Stephen Tyree, Kilian Weinberger, Kunal Agrawal

- Robust Learning under Uncertain Test Distributions: Relating Covariate Shift to Model Misspecification
- Junfeng Wen, Chun-Nam Yu, Russell Greiner

[abs] [pdf] [supplementary]

- Nonparametric Estimation of Multi-View Latent Variable Models
- Le Song, Animashree Anandkumar, Bo Dai, Bo Xie

[abs] [pdf] [supplementary]

- Structured Generative Models of Natural Source Code
- Chris Maddison, Daniel Tarlow

[abs] [pdf] [supplementary]

- A Single-Pass Algorithm for Efficiently Recovering Sparse Cluster Centers for High-dimensional Data
- Jinfeng Yi, Lijun Zhang, Jun Wang, Rong Jin, Anil Jain

[abs] [pdf] [supplementary]

- Stochastic Approximation with Implicit Updates. Applications in Robust Online Learning of GLMs
- Panagiotis Toulis, Edoardo Airoldi, Jason Rennie