We study the problem of learning from multiple untrusted data sources, a scenario of increasing practical relevance given the recent emergence of crowdsourcing and collaborative learning paradigms. Specifically, we analyze the situation in which a learning system obtains datasets from multiple sources, some of which might be biased or even adversarially perturbed. It is known that in the single-source case, an adversary with the power to corrupt a fixed fraction of the training data can prevent PAC-learnability; that is, even in the limit of infinite training data, no learning system can approach the optimal test error. In this work we show that, surprisingly, the same is not true in the multi-source setting, where the adversary can arbitrarily corrupt a fixed fraction of the data sources. Our main results are a generalization bound that provides finite-sample guarantees for this learning setting, as well as corresponding lower bounds. Besides establishing PAC-learnability, our results also show that in a cooperative learning setting, sharing data with other parties has provable benefits, even if some participants are malicious.
Author Information
Nikola Konstantinov (IST Austria)
Elias Frantar (TU Vienna)
Dan Alistarh (IST Austria & NeuralMagic)
Christoph H. Lampert (IST Austria)
More from the Same Authors
- 2023: Generating Efficient Kernels for Quantized Inference on Large Language Models
  Tommaso Pegolotti · Elias Frantar · Dan Alistarh · Markus Püschel
- 2023: ZipLM: Inference-Aware Structured Pruning of Language Models
  Eldar Kurtic · Elias Frantar · Dan Alistarh
- 2023 Poster: Quantized Distributed Training of Large Models with Convergence Guarantees
  Ilia Markov · Adrian Vladu · Qi Guo · Dan Alistarh
- 2023 Oral: SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
  Elias Frantar · Dan Alistarh
- 2023 Poster: SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
  Elias Frantar · Dan Alistarh
- 2023 Oral: SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
  Mahdi Nikdan · Tommaso Pegolotti · Eugenia Iofinova · Eldar Kurtic · Dan Alistarh
- 2023 Poster: SparseProp: Efficient Sparse Backpropagation for Faster Training of Neural Networks at the Edge
  Mahdi Nikdan · Tommaso Pegolotti · Eugenia Iofinova · Eldar Kurtic · Dan Alistarh
- 2022 Poster: SPDY: Accurate Pruning with Speedup Guarantees
  Elias Frantar · Dan Alistarh
- 2022 Spotlight: SPDY: Accurate Pruning with Speedup Guarantees
  Elias Frantar · Dan Alistarh
- 2021: Invited Talk 1: Q&A
  Christoph H. Lampert
- 2021 Poster: Communication-Efficient Distributed Optimization with Quantized Preconditioners
  Foivos Alimisis · Peter Davies · Dan Alistarh
- 2021 Spotlight: Communication-Efficient Distributed Optimization with Quantized Preconditioners
  Foivos Alimisis · Peter Davies · Dan Alistarh
- 2021: Part 3: Case study on CNN sparsification on ImageNet
  Dan Alistarh
- 2021: Part 2: Mathematical Background
  Dan Alistarh
- 2021 Tutorial: Sparsity in Deep Learning: Pruning and growth for efficient inference and training
  Torsten Hoefler · Dan Alistarh
- 2020: Invited Talk: Christoph H. Lampert "Learning Theory for Continual and Meta-Learning"
  Christoph H. Lampert
- 2020 Poster: Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks
  Mark Kurtz · Justin Kopinsky · Rati Gelashvili · Alexander Matveev · John Carr · Michael Goin · William Leiserson · Sage Moore · Nir Shavit · Dan Alistarh
- 2019 Poster: Robust Learning from Untrusted Sources
  Nikola Konstantinov · Christoph H. Lampert
- 2019 Poster: Towards Understanding Knowledge Distillation
  Mary Phuong · Christoph H. Lampert
- 2019 Oral: Towards Understanding Knowledge Distillation
  Mary Phuong · Christoph H. Lampert
- 2019 Oral: Robust Learning from Untrusted Sources
  Nikola Konstantinov · Christoph H. Lampert
- 2018 Poster: Learning equations for extrapolation and control
  Subham S Sahoo · Christoph H. Lampert · Georg Martius
- 2018 Oral: Learning equations for extrapolation and control
  Subham S Sahoo · Christoph H. Lampert · Georg Martius
- 2018 Poster: Data-Dependent Stability of Stochastic Gradient Descent
  Ilja Kuzborskij · Christoph H. Lampert
- 2018 Oral: Data-Dependent Stability of Stochastic Gradient Descent
  Ilja Kuzborskij · Christoph H. Lampert
- 2017 Poster: PixelCNN Models with Auxiliary Variables for Natural Image Modeling
  Alexander Kolesnikov · Christoph H. Lampert
- 2017 Poster: Multi-task Learning with Labeled and Unlabeled Tasks
  Anastasia Pentina · Christoph H. Lampert
- 2017 Talk: Multi-task Learning with Labeled and Unlabeled Tasks
  Anastasia Pentina · Christoph H. Lampert
- 2017 Talk: PixelCNN Models with Auxiliary Variables for Natural Image Modeling
  Alexander Kolesnikov · Christoph H. Lampert