We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle. Sampling labels in batches is highly desirable in practice, since it reduces the number of interactive rounds with the labeling oracle (often human annotators). However, batch active learning typically pays the price of reduced adaptivity, leading to suboptimal results. In this paper we propose a solution that carefully trades off the informativeness of the queried points against their diversity. We theoretically investigate batch active learning in the practically relevant scenario where the unlabeled pool of data is available beforehand (pool-based active learning). We analyze a novel stage-wise greedy algorithm and show that, as a function of the label complexity, the excess risk of this algorithm matches the known minimax rates of standard statistical learning settings. Our results also exhibit a mild dependence on the batch size. These are the first theoretical results that employ careful trade-offs between informativeness and diversity to rigorously quantify the statistical performance of batch active learning in the pool-based scenario.
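The abstract describes the approach only at a high level. As a purely illustrative sketch of the informativeness/diversity trade-off it refers to (not the paper's actual stage-wise greedy algorithm), the snippet below greedily assembles a batch by scoring each unlabeled candidate as an assumed informativeness score plus a weighted distance to the points already selected. All names (`select_batch`, `lam`) and the specific scoring rule are assumptions made for illustration.

```python
import numpy as np

def select_batch(pool_X, labeled_X, scores, batch_size, lam=0.5):
    """Greedily pick a batch trading off informativeness and diversity.

    pool_X:     (n, d) array of unlabeled pool points
    labeled_X:  (m, d) array of already-labeled points
    scores:     (n,) informativeness scores (e.g., model uncertainty)
    lam:        weight balancing informativeness against diversity
    Returns the indices of the selected batch within pool_X.
    """
    selected = []
    chosen_X = list(labeled_X)  # points the batch must stay diverse from
    for _ in range(batch_size):
        best_i, best_val = None, -np.inf
        for i in range(len(pool_X)):
            if i in selected:
                continue
            # Diversity term: distance to the nearest already-chosen point.
            div = min((np.linalg.norm(pool_X[i] - x) for x in chosen_X),
                      default=0.0)
            val = scores[i] + lam * div
            if val > best_val:
                best_i, best_val = i, val
        selected.append(best_i)
        chosen_X.append(pool_X[best_i])
    return selected

# Minimal usage example on synthetic data.
rng = np.random.default_rng(0)
pool = rng.normal(size=(100, 5))
uncertainty = rng.uniform(size=100)  # stand-in for model uncertainty
batch = select_batch(pool, np.empty((0, 5)), uncertainty, batch_size=10)
print(batch)
```

With `lam = 0` this degenerates to plain uncertainty sampling, which tends to pick near-duplicate points within a batch; the diversity term is what makes batch queries complement one another.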
Author Information
Claudio Gentile (Google Research)
Zhilei Wang (Citadel Securities)
Tong Zhang (Google)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Achieving Minimax Rates in Pool-Based Batch Active Learning
  Wed. Jul 20th through Thu. Jul 21st, Room Hall E #1218
More from the Same Authors
- 2021 Poster: Hierarchical Clustering of Data Streams: Scalable Algorithms and Approximation Guarantees
  Anand Rajagopalan · Fabio Vitale · Danny Vainstein · Gui Citovsky · Cecilia Procopiuc · Claudio Gentile
- 2021 Spotlight: Hierarchical Clustering of Data Streams: Scalable Algorithms and Approximation Guarantees
  Anand Rajagopalan · Fabio Vitale · Danny Vainstein · Gui Citovsky · Cecilia Procopiuc · Claudio Gentile
- 2021 Poster: Dynamic Balancing for Model Selection in Bandits and RL
  Ashok Cutkosky · Christoph Dann · Abhimanyu Das · Claudio Gentile · Aldo Pacchiano · Manish Purohit
- 2021 Spotlight: Dynamic Balancing for Model Selection in Bandits and RL
  Ashok Cutkosky · Christoph Dann · Abhimanyu Das · Claudio Gentile · Aldo Pacchiano · Manish Purohit
- 2021 Poster: Best Model Identification: A Rested Bandit Formulation
  Leonardo Cella · Massimiliano Pontil · Claudio Gentile
- 2021 Spotlight: Best Model Identification: A Rested Bandit Formulation
  Leonardo Cella · Massimiliano Pontil · Claudio Gentile
- 2020 Poster: Adaptive Region-Based Active Learning
  Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Ningshan Zhang
- 2020 Poster: Online Learning with Dependent Stochastic Feedback Graphs
  Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Ningshan Zhang
- 2019 Poster: Online Learning with Sleeping Experts and Feedback Graphs
  Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Scott Yang
- 2019 Oral: Online Learning with Sleeping Experts and Feedback Graphs
  Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Scott Yang
- 2019 Poster: Active Learning with Disagreement Graphs
  Corinna Cortes · Giulia DeSalvo · Mehryar Mohri · Ningshan Zhang · Claudio Gentile
- 2019 Oral: Active Learning with Disagreement Graphs
  Corinna Cortes · Giulia DeSalvo · Mehryar Mohri · Ningshan Zhang · Claudio Gentile
- 2018 Poster: Online Learning with Abstention
  Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Scott Yang
- 2018 Oral: Online Learning with Abstention
  Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Scott Yang