Poster
A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
Pan Xu · Quanquan Gu
Wed Jul 15 11:00 AM -- 11:45 AM & Wed Jul 15 10:00 PM -- 10:45 PM (PDT)
Q-learning with neural network function approximation (neural Q-learning for short) is among the most prevalent deep reinforcement learning algorithms. Despite its empirical success, the non-asymptotic convergence rate of neural Q-learning remains virtually unknown. In this paper, we present a finite-time analysis of a neural Q-learning algorithm, where the data are generated from a Markov decision process and the action-value function is approximated by a deep ReLU neural network. We prove that neural Q-learning finds the optimal policy with an $O(1/\sqrt{T})$ convergence rate if the neural function approximator is sufficiently overparameterized, where $T$ is the number of iterations. To the best of our knowledge, our result is the first finite-time analysis of neural Q-learning under a non-i.i.d. data assumption.
Author Information
Pan Xu (University of California, Los Angeles)
Quanquan Gu (University of California, Los Angeles)
More from the Same Authors
-
2021 : Benign Overfitting in Adversarially Robust Linear Classification »
Jinghui Chen · Yuan Cao · Quanquan Gu -
2021 : Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures »
Yuan Cao · Quanquan Gu · Mikhail Belkin -
2021 : Nearly Minimax Optimal Regret for Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation »
Yue Wu · Dongruo Zhou · Quanquan Gu -
2021 : Nearly Optimal Regret for Learning Adversarial MDPs with Linear Function Approximation »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 : Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 : Almost Optimal Algorithms for Two-player Markov Games with Linear Function Approximation »
Zixiang Chen · Dongruo Zhou · Quanquan Gu -
2022 : The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift »
Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2023 : Robust Learning with Progressive Data Expansion Against Spurious Correlation »
Yihe Deng · Yu Yang · Baharan Mirzasoleiman · Quanquan Gu -
2023 : DiffMol: 3D Structured Molecule Generation with Discrete Denoising Diffusion Probabilistic Models »
Weitong Zhang · Xiaoyun Wang · Justin Smith · Joe Eaton · Brad Rees · Quanquan Gu -
2023 : Borda Regret Minimization for Generalized Linear Dueling Bandits »
Yue Wu · Tao Jin · Qiwei Di · Hao Lou · Farzad Farnoud · Quanquan Gu -
2023 Poster: DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design »
Jiaqi Guan · Xiangxin Zhou · Yuwei Yang · Yu Bao · Jian Peng · Jianzhu Ma · Qiang Liu · Liang Wang · Quanquan Gu -
2023 Poster: Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes »
Jiafan He · Heyang Zhao · Dongruo Zhou · Quanquan Gu -
2023 Poster: Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation »
Yifei Min · Jiafan He · Tianhao Wang · Quanquan Gu -
2023 Poster: Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron »
Jingfeng Wu · Difan Zou · Zixiang Chen · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2023 Poster: Benign Overfitting in Two-layer ReLU Convolutional Neural Networks »
Yiwen Kou · Zixiang Chen · Yuanzhou Chen · Quanquan Gu -
2023 Poster: Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization »
Chris Junchi Li · Huizhuo Yuan · Gauthier Gidel · Quanquan Gu · Michael Jordan -
2023 Oral: Structure-informed Language Models Are Protein Designers »
Zaixiang Zheng · Yifan Deng · Dongyu Xue · Yi Zhou · Fei YE · Quanquan Gu -
2023 Poster: Personalized Federated Learning under Mixture of Distributions »
Yue Wu · Shuaicheng Zhang · Wenchao Yu · Yanchi Liu · Quanquan Gu · Dawei Zhou · Haifeng Chen · Wei Cheng -
2023 Poster: Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits »
Heyang Zhao · Dongruo Zhou · Jiafan He · Quanquan Gu -
2023 Poster: Structure-informed Language Models Are Protein Designers »
Zaixiang Zheng · Yifan Deng · Dongyu Xue · Yi Zhou · Fei YE · Quanquan Gu -
2023 Poster: The Benefits of Mixup for Feature Learning »
Difan Zou · Yuan Cao · Yuanzhi Li · Quanquan Gu -
2023 Poster: Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs »
Junkai Zhang · Weitong Zhang · Quanquan Gu -
2023 Poster: Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path »
Qiwei Di · Jiafan He · Dongruo Zhou · Quanquan Gu -
2023 Poster: On the Interplay Between Misspecification and Sub-optimality Gap in Linear Contextual Bandits »
Weitong Zhang · Jiafan He · Zhiyuan Fan · Quanquan Gu -
2023 Poster: Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes »
Chenlu Ye · Wei Xiong · Quanquan Gu · Tong Zhang -
2022 Poster: Learning Stochastic Shortest Path with Linear Function Approximation »
Yifei Min · Jiafan He · Tianhao Wang · Quanquan Gu -
2022 Spotlight: Learning Stochastic Shortest Path with Linear Function Approximation »
Yifei Min · Jiafan He · Tianhao Wang · Quanquan Gu -
2022 Poster: Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression »
Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2022 Poster: On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs »
Yuanzhou Chen · Jiafan He · Quanquan Gu -
2022 Oral: Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression »
Jingfeng Wu · Difan Zou · Vladimir Braverman · Quanquan Gu · Sham Kakade -
2022 Spotlight: On the Sample Complexity of Learning Infinite-horizon Discounted Linear Kernel MDPs »
Yuanzhou Chen · Jiafan He · Quanquan Gu -
2022 Poster: Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization »
Dongruo Zhou · Quanquan Gu -
2022 Poster: Langevin Monte Carlo for Contextual Bandits »
Pan Xu · Hongkai Zheng · Eric Mazumdar · Kamyar Azizzadenesheli · Animashree Anandkumar -
2022 Spotlight: Dimension-free Complexity Bounds for High-order Nonconvex Finite-sum Optimization »
Dongruo Zhou · Quanquan Gu -
2022 Spotlight: Langevin Monte Carlo for Contextual Bandits »
Pan Xu · Hongkai Zheng · Eric Mazumdar · Kamyar Azizzadenesheli · Animashree Anandkumar -
2021 : Stochastic Variance-Reduced High-order Optimization for Nonconvex Optimization »
Quanquan Gu -
2021 Workshop: Over-parameterization: Pitfalls and Opportunities »
Yasaman Bahri · Quanquan Gu · Amin Karbasi · Hanie Sedghi -
2021 Poster: On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients »
Difan Zou · Quanquan Gu -
2021 Spotlight: On the Convergence of Hamiltonian Monte Carlo with Stochastic Gradients »
Difan Zou · Quanquan Gu -
2021 Poster: Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits »
Tianyuan Jin · Jing Tang · Pan Xu · Keke Huang · Xiaokui Xiao · Quanquan Gu -
2021 Poster: MOTS: Minimax Optimal Thompson Sampling »
Tianyuan Jin · Pan Xu · Jieming Shi · Xiaokui Xiao · Quanquan Gu -
2021 Poster: Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping »
Dongruo Zhou · Jiafan He · Quanquan Gu -
2021 Poster: Logarithmic Regret for Reinforcement Learning with Linear Function Approximation »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 Spotlight: Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits »
Tianyuan Jin · Jing Tang · Pan Xu · Keke Huang · Xiaokui Xiao · Quanquan Gu -
2021 Spotlight: Logarithmic Regret for Reinforcement Learning with Linear Function Approximation »
Jiafan He · Dongruo Zhou · Quanquan Gu -
2021 Spotlight: MOTS: Minimax Optimal Thompson Sampling »
Tianyuan Jin · Pan Xu · Jieming Shi · Xiaokui Xiao · Quanquan Gu -
2021 Spotlight: Provably Efficient Reinforcement Learning for Discounted MDPs with Feature Mapping »
Dongruo Zhou · Jiafan He · Quanquan Gu -
2021 Poster: Provable Robustness of Adversarial Training for Learning Halfspaces with Noise »
Difan Zou · Spencer Frei · Quanquan Gu -
2021 Poster: Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins »
Spencer Frei · Yuan Cao · Quanquan Gu -
2021 Poster: Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise »
Spencer Frei · Yuan Cao · Quanquan Gu -
2021 Spotlight: Provable Robustness of Adversarial Training for Learning Halfspaces with Noise »
Difan Zou · Spencer Frei · Quanquan Gu -
2021 Oral: Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins »
Spencer Frei · Yuan Cao · Quanquan Gu -
2021 Spotlight: Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise »
Spencer Frei · Yuan Cao · Quanquan Gu -
2020 Poster: Optimization Theory for ReLU Neural Networks Trained with Normalization Layers »
Yonatan Dukler · Quanquan Gu · Guido Montufar -
2020 Poster: Neural Contextual Bandits with UCB-based Exploration »
Dongruo Zhou · Lihong Li · Quanquan Gu -
2019 Poster: On the Convergence and Robustness of Adversarial Training »
Yisen Wang · Xingjun Ma · James Bailey · Jinfeng Yi · Bowen Zhou · Quanquan Gu -
2019 Oral: On the Convergence and Robustness of Adversarial Training »
Yisen Wang · Xingjun Ma · James Bailey · Jinfeng Yi · Bowen Zhou · Quanquan Gu -
2019 Poster: Lower Bounds for Smooth Nonconvex Finite-Sum Optimization »
Dongruo Zhou · Quanquan Gu -
2019 Oral: Lower Bounds for Smooth Nonconvex Finite-Sum Optimization »
Dongruo Zhou · Quanquan Gu -
2018 Poster: Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow »
Xiao Zhang · Simon Du · Quanquan Gu -
2018 Poster: Continuous and Discrete-time Accelerated Stochastic Mirror Descent for Strongly Convex Functions »
Pan Xu · Tianhao Wang · Quanquan Gu -
2018 Oral: Fast and Sample Efficient Inductive Matrix Completion via Multi-Phase Procrustes Flow »
Xiao Zhang · Simon Du · Quanquan Gu -
2018 Oral: Continuous and Discrete-time Accelerated Stochastic Mirror Descent for Strongly Convex Functions »
Pan Xu · Tianhao Wang · Quanquan Gu -
2018 Poster: A Primal-Dual Analysis of Global Optimality in Nonconvex Low-Rank Matrix Recovery »
Xiao Zhang · Lingxiao Wang · Yaodong Yu · Quanquan Gu -
2018 Poster: Stochastic Variance-Reduced Hamilton Monte Carlo Methods »
Difan Zou · Pan Xu · Quanquan Gu -
2018 Oral: Stochastic Variance-Reduced Hamilton Monte Carlo Methods »
Difan Zou · Pan Xu · Quanquan Gu -
2018 Oral: A Primal-Dual Analysis of Global Optimality in Nonconvex Low-Rank Matrix Recovery »
Xiao Zhang · Lingxiao Wang · Yaodong Yu · Quanquan Gu -
2018 Poster: Stochastic Variance-Reduced Cubic Regularized Newton Method »
Dongruo Zhou · Pan Xu · Quanquan Gu -
2018 Poster: Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization »
Jinghui Chen · Pan Xu · Lingxiao Wang · Jian Ma · Quanquan Gu -
2018 Oral: Stochastic Variance-Reduced Cubic Regularized Newton Method »
Dongruo Zhou · Pan Xu · Quanquan Gu -
2018 Oral: Covariate Adjusted Precision Matrix Estimation via Nonconvex Optimization »
Jinghui Chen · Pan Xu · Lingxiao Wang · Jian Ma · Quanquan Gu -
2017 Poster: Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference »
Aditya Chaudhry · Pan Xu · Quanquan Gu -
2017 Poster: High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm »
Rongda Zhu · Lingxiao Wang · Chengxiang Zhai · Quanquan Gu -
2017 Poster: Robust Gaussian Graphical Model Estimation with Arbitrary Corruption »
Lingxiao Wang · Quanquan Gu -
2017 Talk: High-Dimensional Variance-Reduced Stochastic Gradient Expectation-Maximization Algorithm »
Rongda Zhu · Lingxiao Wang · Chengxiang Zhai · Quanquan Gu -
2017 Talk: Robust Gaussian Graphical Model Estimation with Arbitrary Corruption »
Lingxiao Wang · Quanquan Gu -
2017 Talk: Uncertainty Assessment and False Discovery Rate Control in High-Dimensional Granger Causal Inference »
Aditya Chaudhry · Pan Xu · Quanquan Gu -
2017 Poster: A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery »
Lingxiao Wang · Xiao Zhang · Quanquan Gu -
2017 Talk: A Unified Variance Reduction-Based Framework for Nonconvex Low-Rank Matrix Recovery »
Lingxiao Wang · Xiao Zhang · Quanquan Gu