Timezone: »
Spotlight
Improved Regret Bounds of Bilinear Bandits using Action Space Analysis
Kyoungseok Jang · Kwang-Sung Jun · Se-Young Yun · Wanmo Kang
We consider the bilinear bandit problem where the learner chooses a pair of arms, each from two different action spaces of dimension $d_1$ and $d_2$, respectively. The learner then receives a reward whose expectation is a bilinear function of the two chosen arms with an unknown matrix parameter $\Theta^*\in\mathbb{R}^{d_1 \times d_2}$ with rank $r$. Despite abundant applications such as drug discovery, the optimal regret rate is unknown for this problem, though it was conjectured to be $\tilde O(\sqrt{d_1d_2(d_1+d_2)r T})$ by Jun et al. (2019) where $\tilde O$ ignores polylogarithmic factors in $T$. In this paper, we make progress towards closing the gap between the upper and lower bound on the optimal regret. First, we reject the conjecture above by proposing algorithms that achieve the regret $\tilde O(\sqrt{d_1 d_2 (d_1+d_2) T})$ using the fact that the action space dimension $O(d_1+d_2)$ is significantly lower than the matrix parameter dimension $O(d_1 d_2)$. Second, we additionally devise an algorithm with better empirical performance than previous algorithms.
Author Information
Kyoungseok Jang (KAIST)
Kwang-Sung Jun (University of Arizona)
Se-Young Yun (KAIST)
Wanmo Kang (KAIST)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Poster: Improved Regret Bounds of Bilinear Bandits using Action Space Analysis »
Thu. Jul 22nd 04:00 -- 06:00 AM Room Virtual
More from the Same Authors
-
2022 : ALASCA: Rethinking Label Smoothing for Deep Learning Under Label Noise »
Jongwoo Ko · Bongsoo Yi · Se-Young Yun -
2023 : Improved Time-Uniform PAC-Bayes Bounds using Coin Betting »
Kyoungseok Jang · Kwang-Sung Jun · Ilja Kuzborskij · Francesco Orabona -
2023 : An Optimal Clustering Algorithm for the Labeled Stochastic Block Model »
Kaito Ariu · Se-Young Yun · Alexandre Proutiere -
2023 : Improved Time-Uniform PAC-Bayes Bounds using Coin Betting »
Kyoungseok Jang · Kwang-Sung Jun · Ilja Kuzborskij · Francesco Orabona -
2023 Poster: Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models »
Dongjun Kim · Yeongmin Kim · Se Jung Kwon · Wanmo Kang · IL CHUL MOON -
2023 Poster: Revisiting Simple Regret: Fast Rates for Returning a Good Arm »
Yao Zhao · Connor J Stephens · Csaba Szepesvari · Kwang-Sung Jun -
2023 Poster: SAAL: Sharpness-Aware Active Learning »
Yoon-Yeong Kim · Youngjae Cho · JoonHo Jang · Byeonghu Na · Yeongmin Kim · Kyungwoo Song · Wanmo Kang · IL CHUL MOON -
2023 Oral: Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models »
Dongjun Kim · Yeongmin Kim · Se Jung Kwon · Wanmo Kang · IL CHUL MOON -
2022 : Revisiting Architecture-aware Knowledge Distillation: Smaller Models and Faster Search »
Taehyeon Kim · Heesoo Myeong · Se-Young Yun -
2022 : Supernet Training for Federated Image Classification »
Taehyeon Kim · Se-Young Yun -
2022 Poster: Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation »
Dongjun Kim · Seungjae Shin · Kyungwoo Song · Wanmo Kang · IL CHUL MOON -
2022 Poster: Rotting Infinitely Many-Armed Bandits »
Jung-hun Kim · Milan Vojnovic · Se-Young Yun -
2022 Spotlight: Soft Truncation: A Universal Training Technique of Score-based Diffusion Model for High Precision Score Estimation »
Dongjun Kim · Seungjae Shin · Kyungwoo Song · Wanmo Kang · IL CHUL MOON -
2022 Spotlight: Rotting Infinitely Many-Armed Bandits »
Jung-hun Kim · Milan Vojnovic · Se-Young Yun -
2021 Poster: Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits »
Kwang-Sung Jun · Lalit Jain · Blake Mason · Houssam Nassif -
2021 Spotlight: Improved Confidence Bounds for the Linear Logistic Model and Applications to Bandits »
Kwang-Sung Jun · Lalit Jain · Blake Mason · Houssam Nassif -
2020 : Short Talk 1 - Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality »
Kwang-Sung Jun -
2019 Poster: Spectral Approximate Inference »
Sejun Park · Eunho Yang · Se-Young Yun · Jinwoo Shin -
2019 Oral: Spectral Approximate Inference »
Sejun Park · Eunho Yang · Se-Young Yun · Jinwoo Shin