30 Results

Oral
Tue 17:00 Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Shariq Iqbal, Christian Schroeder, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha
Oral
Tue 17:00 A Tale of Two Efficient and Informative Negative Sampling Distributions
Shabnam Daghaghi, Tharun Medini, Nicholas Meisburger, Beidi Chen, Mengnan Zhao, Anshumali Shrivastava
Spotlight
Tue 17:20 TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Dawn Song, Ion Stoica
Spotlight
Tue 17:25 Quantization Algorithms for Random Fourier Features
Xiaoyun Li, Ping Li
Spotlight
Tue 17:40 Partially Observed Exchangeable Modeling
Yang Li, Junier Oliva
Spotlight
Tue 17:40 Heterogeneity for the Win: One-Shot Federated Clustering
Don Kurian Dennis, Tian Li, Virginia Smith
Spotlight
Tue 18:25 A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention
Tomoki Watanabe, Paolo Favaro
Spotlight
Tue 18:40 Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
Vivek Jayaram, John Thickstun
Oral
Tue 19:00 ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael Mahoney, Joseph E Gonzalez
Spotlight
Tue 19:30 Training Graph Neural Networks with 1000 Layers
Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun
Spotlight
Tue 19:35 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He
Spotlight
Tue 19:45 Ditto: Fair and Robust Federated Learning Through Personalization
Tian Li, Shengyuan Hu, Ahmad Beirami, Virginia Smith
Poster
Tue 21:00 Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
Vivek Jayaram, John Thickstun
Poster
Tue 21:00 A Tale of Two Efficient and Informative Negative Sampling Distributions
Shabnam Daghaghi, Tharun Medini, Nicholas Meisburger, Beidi Chen, Mengnan Zhao, Anshumali Shrivastava
Poster
Tue 21:00 1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He
Poster
Tue 21:00 Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning
Shariq Iqbal, Christian Schroeder, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha
Poster
Tue 21:00 ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Jianfei Chen, Lianmin Zheng, Zhewei Yao, Dequan Wang, Ion Stoica, Michael Mahoney, Joseph E Gonzalez
Poster
Tue 21:00 Partially Observed Exchangeable Modeling
Yang Li, Junier Oliva
Poster
Tue 21:00 TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Dawn Song, Ion Stoica
Poster
Tue 21:00 Ditto: Fair and Robust Federated Learning Through Personalization
Tian Li, Shengyuan Hu, Ahmad Beirami, Virginia Smith
Poster
Tue 21:00 Quantization Algorithms for Random Fourier Features
Xiaoyun Li, Ping Li
Poster
Tue 21:00 Heterogeneity for the Win: One-Shot Federated Clustering
Don Kurian Dennis, Tian Li, Virginia Smith
Poster
Tue 21:00 Training Graph Neural Networks with 1000 Layers
Guohao Li, Matthias Müller, Bernard Ghanem, Vladlen Koltun
Poster
Tue 21:00 A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention
Tomoki Watanabe, Paolo Favaro
Spotlight
Wed 18:30 Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances
Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clement Hongler, Wulfram Gerstner, Johanni Brea
Poster
Wed 21:00 Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances
Berfin Simsek, François Ged, Arthur Jacot, Francesco Spadaro, Clement Hongler, Wulfram Gerstner, Johanni Brea
Spotlight
Thu 6:40 Efficient Training of Robust Decision Trees Against Adversarial Examples
Daniël Vos, Sicco Verwer
Spotlight
Thu 7:40 Bayesian Quadrature on Riemannian Data Manifolds
Christian Fröhlich, Alexandra Gessner, Philipp Hennig, Bernhard Schölkopf, Georgios Arvanitidis
Poster
Thu 9:00 Bayesian Quadrature on Riemannian Data Manifolds
Christian Fröhlich, Alexandra Gessner, Philipp Hennig, Bernhard Schölkopf, Georgios Arvanitidis
Poster
Thu 9:00 Efficient Training of Robust Decision Trees Against Adversarial Examples
Daniël Vos, Sicco Verwer