Workshop
PAC-Bayes Meets Interactive Learning
Hamish Flynn · Maxime Heuillet · Audrey Durand · Melih Kandemir · Benjamin Guedj
Meeting Room 314
Interactive learning encompasses online learning, continual learning, active learning, bandits, reinforcement learning, and other settings where an algorithm must learn while interacting with a continual stream of data. Such problems often involve exploration-exploitation dilemmas, which can be elegantly handled with probabilistic and Bayesian methods. Deep interactive learning methods leveraging neural networks are typically used when the setting involves rich observations, such as images. As a result, both probabilistic and deep interactive learning methods are growing in popularity. However, acquiring observations through interaction with the environment can be costly, so there is great interest in understanding when sample-efficient learning can be expected or guaranteed for probabilistic and deep interactive learning methods. Within statistical learning theory, PAC-Bayesian theory is designed for the analysis of probabilistic learning methods, and it has recently been shown to be well suited to the analysis of deep learning methods. This workshop aims to bring together researchers from the broad Bayesian and interactive learning communities in order to foster the emergence of new ideas that could contribute to both theoretical and empirical advances in PAC-Bayesian theory for interactive learning settings.
Schedule
Fri 12:00 p.m. - 12:15 p.m. | Welcome (Opening Remarks) | SlidesLive Video
Fri 12:15 p.m. - 1:00 p.m. | PAC-Bayes Tutorial (Invited Talk) | SlidesLive Video
Fri 1:00 p.m. - 1:30 p.m. | Poster Session 1 (Poster Session)
Fri 1:30 p.m. - 2:15 p.m. | Invited Talk 1: Lessons Learned from Studying PAC-Bayes and Generalization (Invited Talk) | SlidesLive Video
Gintare Karolina Dziugaite
Fri 2:15 p.m. - 3:00 p.m. | Invited Talk 2: A unified recipe for deriving (time-uniform) PAC-Bayes bounds (Invited Talk) | SlidesLive Video
We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. Our main result is a PAC-Bayes theorem which holds for a wide class of discrete stochastic processes. We show how this result implies time-uniform versions of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds.
Aaditya Ramdas
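As a rough, illustrative aside (not part of the talk), Ville's inequality, step (d) of the recipe above, can be checked empirically: a nonnegative martingale started at 1 exceeds the level 1/δ at any time with probability at most δ. A minimal Python sketch under that setup:

```python
import random

def ville_crossing_fraction(delta=0.1, n_paths=2000, horizon=200, seed=0):
    """Fraction of paths of a nonnegative martingale (started at 1) whose
    running value ever exceeds 1/delta within the horizon.

    Ville's inequality guarantees P(sup_t M_t >= 1/delta) <= delta, so the
    returned fraction should be (up to sampling noise) at most delta.
    """
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_paths):
        m = 1.0
        for _ in range(horizon):
            # i.i.d. multiplicative factors with mean 1 keep M a martingale
            m *= 0.5 if rng.random() < 0.5 else 1.5
            if m >= 1.0 / delta:
                crossings += 1
                break
    return crossings / n_paths

print(ville_crossing_fraction())
```

The talk's construction wraps an exponential of the loss process in the same way; the simulation only demonstrates the crossing probability that Ville's inequality controls.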
Fri 3:00 p.m. - 4:00 p.m. | Lunch break (Break)
Fri 4:00 p.m. - 4:10 p.m. | Improved Time-Uniform PAC-Bayes Bounds using Coin Betting (Contributed Talk) | SlidesLive Video
We propose a novel PAC-Bayes concentration inequality that implies a number of existing results, including the Bernoulli-KL (Maurer's) and empirical Bernstein inequalities. Our approach is based on the coin-betting framework, which derives one of the numerically tightest known time-uniform concentration inequalities through the regret analysis of online gambling algorithms. In particular, we derive the first PAC-Bayes concentration inequality based on the coin-betting approach that holds simultaneously for all sample sizes. Finally, we propose an efficient algorithm to numerically calculate confidence sequences from our bound, which often generates nonvacuous confidence sequences even with few samples, unlike state-of-the-art PAC-Bayes bounds.
Kyoungseok Jang · Kwang-Sung Jun · Ilja Kuzborskij · Francesco Orabona
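The Bernoulli-KL (Maurer) bound mentioned in the abstract controls the true risk by the "kl-inverse" of the empirical risk. As an illustrative sketch (the standard inversion, not the authors' coin-betting construction), the inverse can be computed by bisection:

```python
import math

def binary_kl(q, p):
    """kl(q || p) for Bernoulli parameters, with 0*log(0) taken as 0."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    out = 0.0
    if q > 0:
        out += q * math.log(q / p)
    if q < 1:
        out += (1 - q) * math.log((1 - q) / (1 - p))
    return out

def kl_inverse(q, c, iters=60):
    """Largest p in [q, 1] with kl(q || p) <= c, found by bisection.

    In Maurer-style PAC-Bayes bounds one takes q = empirical risk and
    c = (KL(rho || pi) + log(2*sqrt(n)/delta)) / n; the returned p then
    upper-bounds the true risk with probability at least 1 - delta.
    """
    lo, hi = q, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if binary_kl(q, mid) <= c:
            lo = mid
        else:
            hi = mid
    return lo

# Empirical risk 0.1 with n = 1000 and a complexity term of 5.0:
print(round(kl_inverse(0.1, 5.0 / 1000), 3))
```

The coin-betting bound of the talk replaces this fixed-n inversion with one that is valid at all sample sizes simultaneously.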
Fri 4:10 p.m. - 4:20 p.m. | PAC-Bayesian Offline Contextual Bandits with Guarantees (Contributed Talk) | SlidesLive Video
This paper introduces a new principled approach for off-policy learning in contextual bandits. Unlike previous work, our approach does not derive learning principles from intractable or loose bounds. We analyse the problem through the PAC-Bayesian lens, interpreting policies as mixtures of decision rules. This allows us to propose novel generalization bounds and provide tractable algorithms to optimize them. We prove that the derived bounds are tighter than their competitors, and can be optimized directly to confidently improve upon the logging policy offline. Our approach learns policies with guarantees, uses all available data and does not require tuning additional hyperparameters on held-out sets. We demonstrate through extensive experiments the effectiveness of our approach in providing performance guarantees in practical scenarios.
Otmane Sakhi · Pierre Alquier · Nicolas Chopin
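As a generic illustration of the off-policy setting described above (and not the paper's PAC-Bayesian bound), one can estimate a target policy's value from logged bandit data with clipped importance weights and attach a crude Hoeffding-style lower confidence bound. All names and numbers below are hypothetical:

```python
import math
import random

def ips_lower_bound(logged, target_probs, clip=10.0, delta=0.05):
    """Clipped inverse-propensity-score (IPS) estimate of a target policy's
    value from logged bandit data, with a crude Hoeffding lower bound.

    logged: list of (action, logging_prob, reward) with rewards in [0, 1].
    target_probs: action -> probability under the policy being evaluated.
    This is a generic off-policy sketch, not the paper's PAC-Bayes bound.
    """
    n = len(logged)
    vals = [min(target_probs[a] / p, clip) * r for a, p, r in logged]
    estimate = sum(vals) / n
    # Clipped weighted rewards lie in [0, clip]; Hoeffding's inequality
    # then gives a (1 - delta) lower confidence bound with this radius.
    radius = clip * math.sqrt(math.log(1 / delta) / (2 * n))
    return estimate, max(0.0, estimate - radius)

# Toy log: uniform logging over two actions; action 1 pays off more often.
rng = random.Random(0)
logged = []
for _ in range(5000):
    a = rng.randrange(2)
    r = 1.0 if rng.random() < (0.8 if a == 1 else 0.2) else 0.0
    logged.append((a, 0.5, r))

# Evaluate a policy that plays the better action 90% of the time.
est, lb = ips_lower_bound(logged, {0: 0.1, 1: 0.9})
print(round(est, 3), round(lb, 3))
```

The paper's contribution is precisely to replace this kind of loose, generic lower bound with tighter, directly optimizable PAC-Bayesian ones.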
Fri 4:20 p.m. - 4:30 p.m. | PAC-Bayes bounds’ parameter optimization via events’ space discretization: new bounds for losses with general tail behaviors (Contributed Talk) | SlidesLive Video
In this paper, we present new parameter-free high-probability PAC-Bayes bounds for losses with different tail behaviors: a PAC-Bayes Chernoff analogue when the loss’ cumulant generating function is bounded, and a bound when the loss’ second moment is bounded. These two bounds are obtained using a new technique based on a discretization of the space of possible events for the “in probability” parameter optimization problem. Finally, we extend all previous results to anytime-valid bounds using a simple technique applicable to any existing bound.
Borja Rodríguez Gálvez · Ragnar Thobaben · Mikael Skoglund
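The "in probability" parameter optimization above can be illustrated with a toy version of the idea: optimize the free parameter of a Chernoff-style bound over a discretized grid, paying for the optimization with a union bound over grid points. A hedged sketch under a subgaussian assumption (a naive grid, not the paper's event-space discretization):

```python
import math

def best_grid_bound(mean_hat, sigma, n, delta, lambdas):
    """Tightest Chernoff-style deviation bound over a grid of lambdas.

    For sigma-subgaussian losses, each fixed lambda > 0 yields, with
    probability 1 - delta', mu <= mean_hat + lambda*sigma^2/2
    + log(1/delta')/(lambda*n). Optimizing lambda "in probability"
    requires a union bound, so each grid point gets
    delta' = delta / len(lambdas). Illustrative only.
    """
    d = delta / len(lambdas)
    return min(mean_hat + lam * sigma ** 2 / 2 + math.log(1 / d) / (lam * n)
               for lam in lambdas)

grid = [2 ** k / 100 for k in range(12)]  # geometric grid of lambdas
bound = best_grid_bound(0.3, 1.0, 500, 0.05, grid)
print(round(bound, 4))
```

The geometric grid keeps the union-bound penalty logarithmic in the covered range while staying within a constant factor of the best lambda.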
Fri 4:30 p.m. - 5:15 p.m. | Invited Talk 3: Meta-Learning Reliable Priors for Interactive Learning (Invited Talk) | SlidesLive Video
To navigate the exploration-exploitation trade-off in interactive learning, we often rely on the uncertainty estimates of a probabilistic or Bayesian model. A key challenge is to correctly specify the prior of our model so that its epistemic uncertainty estimates are reliable. In this talk, we explore how we can harness related datasets or previous experience to meta-learn priors in a data-driven way. We study this problem through the lens of PAC-Bayesian theory and derive practical and scalable meta-learning algorithms. In particular, we discuss how to make sure that the meta-learned priors yield confidence intervals that are not overconfident, so that our interactive learners explore sufficiently. Overall, the proposed meta-learning framework allows us to significantly speed up interactive learning through transfer from previous tasks/runs.
Jonas Rothfuss
Fri 5:15 p.m. - 6:00 p.m. | Invited Talk 4: Generalization Theory for Robot Learning (Invited Talk) | SlidesLive Video
The ability of machine learning techniques to process rich sensory inputs such as vision makes them highly appealing for use in robotic systems (e.g., micro aerial vehicles and robotic manipulators). However, the increasing adoption of learning-based components in the robotics perception and control pipeline poses an important challenge: how can we guarantee the safety and performance of such systems? As an example, consider a micro aerial vehicle that learns to navigate using a thousand different obstacle environments or a robotic manipulator that learns to grasp using a million objects in a dataset. How likely are these systems to remain safe and perform well on a novel (i.e., previously unseen) environment or object? How can we learn control policies for robotic systems that provably generalize to environments that our robot has not previously encountered? Unfortunately, existing approaches either do not provide such guarantees or do so only under very restrictive assumptions. In this talk, I will present our group’s work on developing a principled theoretical and algorithmic framework for learning control policies for robotic systems with formal guarantees on generalization to novel environments. The key technical insight is to leverage and extend powerful techniques from PAC-Bayes theory. We apply our techniques on problems including vision-based navigation and manipulation in order to demonstrate the ability to provide strong generalization guarantees on robotic systems with nonlinear or hybrid dynamics, rich sensory inputs, and neural network-based control policies.
Anirudha Majumdar
Fri 6:00 p.m. - 6:30 p.m. | Poster Session 2 (Poster Session)
Fri 6:30 p.m. - 6:40 p.m. | Anytime Model Selection in Linear Bandits (Contributed Talk) | SlidesLive Video
Model selection in the context of bandit optimization is a challenging problem, as it requires balancing exploration and exploitation not only for action selection, but also for model selection. One natural approach is to rely on online learning algorithms that treat different models as experts. Existing methods, however, scale poorly ($\mathrm{poly}(M)$) with the number of models $M$ in terms of their regret. We develop ALEXP, an anytime algorithm, which has an exponentially improved ($\log M$) dependence on $M$ for its regret. We neither require knowledge of the horizon $n$, nor rely on an initial purely exploratory stage. Our approach utilizes a novel time-uniform analysis of the Lasso, by defining a self-normalized martingale sequence based on the empirical process error, establishing a new connection between interactive learning and high-dimensional statistics.
Parnian Kassraie · Aldo Pacchiano · Nicolas Emmenegger · Andreas Krause
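The $\log M$ dependence comes from exponential-weights aggregation over models treated as experts. As a simplified full-information illustration (plain Hedge, not ALEXP itself, and with hypothetical toy losses):

```python
import math

def hedge_regret(losses, eta):
    """Run Hedge (exponential weights) on a full-information loss matrix.

    losses[t][i] is expert i's loss in [0, 1] at round t. Returns the
    learner's cumulative expected loss minus that of the best single
    expert. Classic bound: regret <= log(M)/eta + eta*n/8.
    """
    n, m = len(losses), len(losses[0])
    log_w = [0.0] * m
    learner = 0.0
    for t in range(n):
        mx = max(log_w)
        w = [math.exp(x - mx) for x in log_w]  # stabilized weights
        z = sum(w)
        probs = [x / z for x in w]
        learner += sum(p * l for p, l in zip(probs, losses[t]))
        log_w = [x - eta * l for x, l in zip(log_w, losses[t])]
    best = min(sum(row[i] for row in losses) for i in range(m))
    return learner - best

# 50 experts over 400 rounds; expert 7 is best (loss 0.2 vs 0.6).
n, m = 400, 50
losses = [[0.2 if i == 7 else 0.6 for i in range(m)] for _ in range(n)]
eta = math.sqrt(8 * math.log(m) / n)
print(hedge_regret(losses, eta))
```

In the bandit model-selection setting of the paper, the learner additionally only sees partial feedback, which is where the anytime, time-uniform analysis comes in.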
Fri 6:40 p.m. - 6:50 p.m. | PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models (Contributed Talk) | SlidesLive Video
In this paper we derive a PAC-Bayesian error bound for a class of stochastic dynamical systems with inputs, namely, for linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics; in particular, such systems represent a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian error bound for such systems, and 3) discuss various consequences of this error bound.
Deividas Eringis · John Leth · Rafal Wisniewski · Zheng-Hua Tan · Mihaly Petreczky
Fri 6:50 p.m. - 7:00 p.m. | Experiment Planning with Function Approximation (Contributed Talk) | SlidesLive Video
We study the problem of experiment planning with function approximation in contextual bandit problems. In settings where there is a significant overhead to deploying adaptive algorithms (for example, when the execution of the data collection policies must be distributed, or a human in the loop is needed to implement them), producing a set of data collection policies in advance is paramount. We study the setting where a large dataset of contexts, but not rewards, is available and may be used by the learner to design an effective data collection strategy. Although this problem has been well studied when rewards are linear, results are still missing for more complex reward models. In this work we propose two experiment planning strategies compatible with function approximation: first, an eluder planning and sampling procedure that recovers optimality guarantees depending on the eluder dimension of the reward function class; and second, a uniform sampler that achieves competitive rates when the number of actions is small.
Aldo Pacchiano · Jonathan Lee · Emma Brunskill
Fri 7:00 p.m. - 7:45 p.m. | Panel Discussion (Panel Discussion)
Improved Time-Uniform PAC-Bayes Bounds using Coin Betting (Poster)
We propose a novel PAC-Bayes concentration inequality that implies a number of existing results, including the Bernoulli-KL (Maurer's) and empirical Bernstein inequalities. Our approach is based on the coin-betting framework, which derives one of the numerically tightest known time-uniform concentration inequalities through the regret analysis of online gambling algorithms. In particular, we derive the first PAC-Bayes concentration inequality based on the coin-betting approach that holds simultaneously for all sample sizes. Finally, we propose an efficient algorithm to numerically calculate confidence sequences from our bound, which often generates nonvacuous confidence sequences even with few samples, unlike state-of-the-art PAC-Bayes bounds.
Kyoungseok Jang · Kwang-Sung Jun · Ilja Kuzborskij · Francesco Orabona
Flat minima can fail to transfer to downstream tasks (Poster)
Large neural networks trained on one task are often finetuned and reused on different but related downstream tasks. The prospect of general principles that might lead to improved transferability is very enticing, as pretraining is exceptionally resource intensive. In recent work, Liu et al. (2022) propose to use flatness as a metric to judge the transferability of pretrained neural networks, based on the observation that, on a suite of benchmarks, flatter minima led to better transfer. Is this a general principle? In this extended abstract, we show that flatness is not a reliable indicator of transferability, despite flatness having been linked to generalization via PAC-Bayes bounds and empirical analysis. We demonstrate that whether flatness helps or hurts depends on the relationship between the source and target tasks.
Deepansha Singh · Ekansh Sharma · Daniel Roy · Gintare Karolina Dziugaite
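As an illustrative toy (not the experimental setup of the poster), "flatness" can be probed by measuring the average loss increase under random weight perturbations; flatter minima show a smaller increase for the same perturbation radius:

```python
import random

def sharpness(loss, w, radius=0.1, n=200, rng=None):
    """Average loss increase under random perturbations of the weights w.

    A crude stand-in for the flatness metrics discussed above. loss is a
    function of a weight list; w is the (local) minimizer being probed.
    """
    rng = rng or random.Random(0)
    base = loss(w)
    total = 0.0
    for _ in range(n):
        d = [rng.uniform(-radius, radius) for _ in w]
        total += loss([wi + di for wi, di in zip(w, d)]) - base
    return total / n

# Two 1-D minima of equal depth: a flat one and a sharp one.
flat = sharpness(lambda w: 0.5 * w[0] ** 2, [0.0])
sharp = sharpness(lambda w: 50 * w[0] ** 2, [0.0])
print(flat < sharp)
```

The poster's point is that a small value of such a metric on the source task need not predict performance after finetuning on a target task.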
Computing non-vacuous PAC-Bayes generalization bounds for Models under Adversarial Corruptions (Poster)
PAC-Bayes generalization bounds have been shown to provide non-vacuous performance certificates for several machine learning models. However, under adversarial corruptions, these bounds often fail to maintain their non-vacuous nature due to the increased empirical risk. In this work, we address this limitation by deriving and computing the first non-vacuous generalization bounds for models operating under adversarial conditions. Our approach combines the PAC-Bayes and adversarial smoothing frameworks to derive generalization bounds for randomly smoothed models. We empirically demonstrate the efficacy of our bounds in providing robust population risk certificates for stochastic convolutional neural networks (CNNs) operating under $L_2$-bounded adversarial corruptions on both MNIST and CIFAR-10.
Waleed Mustafa · Philipp Liznerski · Dennis Wagner · Puyu Wang · Marius Kloft
XLDA: Linear Discriminant Analysis for Scaling Continual Learning to Extreme Classification Settings at the Edge (Poster)
Streaming Linear Discriminant Analysis (LDA), while proven in class-incremental learning deployments at the edge with a limited number of classes (up to 1,000), has not been proven for deployment in extreme classification scenarios. In this paper, we present: (a) XLDA, a framework for class-incremental learning in edge deployments where an LDA classifier is shown to be equivalent to a fully connected (FC) layer, including in extreme classification scenarios, and (b) optimizations to enable XLDA-based training and inference for edge deployments where available compute resources are constrained. We show up to a 42x training speedup using a batched training approach and up to a 5x inference speedup with nearest-neighbor search on extreme datasets such as AliProducts (50k classes) and Google Landmarks V2 (81k classes).
Karan Shah · Vishruth Veerendranath · Anushka Hebbar · Raghavendra Bhat
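As a simplified illustration of the streaming-classifier idea behind the poster (a nearest-class-mean classifier, i.e., streaming LDA with an identity shared covariance; not the XLDA implementation), per-class statistics are updated one example at a time, so adding a new class never requires retraining:

```python
class StreamingNCM:
    """Streaming nearest-class-mean classifier.

    A simplification of streaming LDA with an identity shared covariance:
    per-class running means are updated online, which suits the
    class-incremental setting where new classes keep arriving.
    """
    def __init__(self, dim):
        self.dim = dim
        self.counts = {}
        self.means = {}

    def partial_fit(self, x, y):
        if y not in self.means:
            self.counts[y] = 0
            self.means[y] = [0.0] * self.dim
        self.counts[y] += 1
        c = self.counts[y]
        # Incremental mean update: m += (x - m) / count
        self.means[y] = [m + (xi - m) / c for m, xi in zip(self.means[y], x)]

    def predict(self, x):
        def dist2(mu):
            return sum((a - b) ** 2 for a, b in zip(x, mu))
        return min(self.means, key=lambda y: dist2(self.means[y]))

clf = StreamingNCM(2)
for x, y in [([0, 0], "a"), ([0.2, 0.1], "a"), ([5, 5], "b"), ([4.8, 5.2], "b")]:
    clf.partial_fit(x, y)
print(clf.predict([0.1, 0.1]), clf.predict([5.1, 4.9]))  # a b
```

With tens of thousands of classes, the `predict` loop over class means is exactly where the poster's batched-training and nearest-neighbor-search optimizations pay off.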
Information-Theoretic Generalization Bounds for the Subtask Problem (Poster)
This paper focuses on the generalization error of the subtask problem, which is a specific instance of distribution shift in supervised learning. Here, the training data generated from the source domain consists of multiple classes or labels, while the data for the target domain only encompasses a specific subset of the classes encountered during training. We study the generalization error for this problem using information-theoretic tools. We start by considering conditional mutual information (CMI) based bounds, and we show that tighter results can be obtained for the subtask problem compared to the bounds designed for general domain shifts. Then, we consider the PAC-Bayesian setting and provide a high-probability bound. Furthermore, we focus on the Gibbs algorithm, which achieves the tightest PAC-Bayesian bound, and we obtain an exact characterization of the generalization error of the subtask problem for this algorithm.
Firas Laakom · Yuheng Bu · Moncef Gabbouj
Experiment Planning with Function Approximation (Poster)
We study the problem of experiment planning with function approximation in contextual bandit problems. In settings where there is a significant overhead to deploying adaptive algorithms (for example, when the execution of the data collection policies must be distributed, or a human in the loop is needed to implement them), producing a set of data collection policies in advance is paramount. We study the setting where a large dataset of contexts, but not rewards, is available and may be used by the learner to design an effective data collection strategy. Although this problem has been well studied when rewards are linear, results are still missing for more complex reward models. In this work we propose two experiment planning strategies compatible with function approximation: first, an eluder planning and sampling procedure that recovers optimality guarantees depending on the eluder dimension of the reward function class; and second, a uniform sampler that achieves competitive rates when the number of actions is small.
Aldo Pacchiano · Jonathan Lee · Emma Brunskill
PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models (Poster)
In this paper we derive a PAC-Bayesian error bound for a class of stochastic dynamical systems with inputs, namely, for linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics; in particular, such systems represent a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian error bound for such systems, and 3) discuss various consequences of this error bound.
Deividas Eringis · John Leth · Rafal Wisniewski · Zheng-Hua Tan · Mihaly Petreczky
PAC-Bayesian Offline Contextual Bandits with Guarantees (Poster)
This paper introduces a new principled approach for off-policy learning in contextual bandits. Unlike previous work, our approach does not derive learning principles from intractable or loose bounds. We analyse the problem through the PAC-Bayesian lens, interpreting policies as mixtures of decision rules. This allows us to propose novel generalization bounds and provide tractable algorithms to optimize them. We prove that the derived bounds are tighter than their competitors, and can be optimized directly to confidently improve upon the logging policy offline. Our approach learns policies with guarantees, uses all available data and does not require tuning additional hyperparameters on held-out sets. We demonstrate through extensive experiments the effectiveness of our approach in providing performance guarantees in practical scenarios.
Otmane Sakhi · Pierre Alquier · Nicolas Chopin
Bayesian Feasibility Determination with Multiple Constraints (Poster)
We aim to efficiently determine a feasible region of an unknown black box function that assigns real numbers to discrete alternatives. Unlike existing binary classification methods that primarily focus on developing a classifier based on a given number of training data, we optimize the order of alternative sampling to achieve high accuracy in feasibility determination with a small number of observations. To achieve this, we utilize the Gaussian process as the surrogate model and introduce a novel value-of-information acquisition function to perform adaptive sampling under multiple constraints. We thoroughly analyze the convergence of our proposed scheme and demonstrate its effectiveness through numerical experiments.
Tingnan Gong · Di Liu · Yao Xie · Seong-Hee Kim
Anytime Model Selection in Linear Bandits (Poster)
Model selection in the context of bandit optimization is a challenging problem, as it requires balancing exploration and exploitation not only for action selection, but also for model selection. One natural approach is to rely on online learning algorithms that treat different models as experts. Existing methods, however, scale poorly ($\mathrm{poly}(M)$) with the number of models $M$ in terms of their regret. We develop ALEXP, an anytime algorithm, which has an exponentially improved ($\log M$) dependence on $M$ for its regret. We neither require knowledge of the horizon $n$, nor rely on an initial purely exploratory stage. Our approach utilizes a novel time-uniform analysis of the Lasso, by defining a self-normalized martingale sequence based on the empirical process error, establishing a new connection between interactive learning and high-dimensional statistics.
Parnian Kassraie · Aldo Pacchiano · Nicolas Emmenegger · Andreas Krause
PAC-Bayesian Domain Adaptation Bounds for Multi-view learning (Poster)
This paper presents a series of new results for domain adaptation in the multi-view learning setting. The incorporation of multiple views into domain adaptation has received little attention in previous studies (Germain et al., 2013; 2015). We propose an analysis of generalization bounds based on PAC-Bayesian theory to consolidate the two paradigms, which are currently treated separately.
Mehdi Hennequin · Khalid Benabdeslem · Haytham Elghazel
Tighter fast and mixed rate PAC-Bayes bounds (Poster)
In this paper, we present new high-probability PAC-Bayes bounds for losses with a bounded range. First, we develop a strengthened version of Catoni’s bound that holds simultaneously for all parameter values. This leads to new fast rate and mixed rate bounds that are interpretable and tighter than previous bounds in the literature. All the presented bounds are easily extended to anytime-valid bounds thanks to the recent developments from Jang et al. (2023).
Borja Rodríguez Gálvez · Ragnar Thobaben · Mikael Skoglund
PAC-Bayes bounds’ parameter optimization via events’ space discretization: new bounds for losses with general tail behaviors (Poster)
In this paper, we present new parameter-free high-probability PAC-Bayes bounds for losses with different tail behaviors: a PAC-Bayes Chernoff analogue when the loss’ cumulant generating function is bounded, and a bound when the loss’ second moment is bounded. These two bounds are obtained using a new technique based on a discretization of the space of possible events for the “in probability” parameter optimization problem. Finally, we extend all previous results to anytime-valid bounds using a simple technique applicable to any existing bound.
Borja Rodríguez Gálvez · Ragnar Thobaben · Mikael Skoglund
Bayesian Risk-Averse Q-Learning with Streaming Data (Poster)
We consider a robust reinforcement learning problem, where a learning agent learns from a simulated training environment. We adopt a formulation of the Bayesian risk MDP (BRMDP) with infinite horizon, which uses the Bayesian posterior to estimate the transition model and imposes a risk functional to account for model uncertainty. Observations from the real environment, which is out of the agent's control, arrive periodically and are utilized by the agent to update the Bayesian posterior to reduce model uncertainty. We theoretically demonstrate that the BRMDP balances the trade-off between robustness and conservativeness, and we further develop a multi-stage Bayesian risk-averse Q-learning algorithm with a provable performance guarantee to solve the BRMDP with streaming observations from the real environment.
Yuhao Wang · Enlu Zhou
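As a toy illustration of the risk-averse idea above (a CVaR-style risk functional applied over a Dirichlet posterior on transition probabilities; a simplified stand-in, not the paper's algorithm), the backed-up value of a state shrinks below the empirical mean when data are scarce and the posterior is wide:

```python
import random

def risk_averse_value(counts, values, n_samples=50, alpha=0.2, rng=None):
    """Risk-averse expected next-state value under a Dirichlet posterior.

    counts[s]: observed transition counts to each next state.
    values[s]: current value estimates of those states.
    Transition vectors are sampled from Dirichlet(counts + 1), and the
    mean of the worst alpha-fraction of sampled expectations is returned
    (a CVaR-style risk functional over model uncertainty).
    """
    rng = rng or random.Random(0)
    samples = []
    for _ in range(n_samples):
        # Dirichlet sampling via normalized Gamma draws.
        g = [rng.gammavariate(c + 1, 1.0) for c in counts]
        z = sum(g)
        p = [x / z for x in g]
        samples.append(sum(pi * v for pi, v in zip(p, values)))
    samples.sort()
    k = max(1, int(alpha * n_samples))
    return sum(samples[:k]) / k

# Two next states with values 0 and 1. Sparse data -> wide posterior ->
# a more conservative (lower) risk-averse value than with dense data.
sparse = risk_averse_value([1, 1], [0.0, 1.0])
dense = risk_averse_value([100, 100], [0.0, 1.0])
print(round(sparse, 3), round(dense, 3))
```

As the real-environment observations accumulate, the posterior concentrates and the risk-averse value approaches the risk-neutral one, which is the robustness-conservativeness trade-off the abstract refers to.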