Non-Parametric Optimization for Scalable Learning in Stochastic Decision Problems
Abstract
Stochastic optimization (SO) plays a central role in addressing decision-making problems under uncertainty. Within this class, time-varying stochastic optimization (TV-SO) is particularly important due to its applications in adaptive control and machine learning. Non-parametric approaches have been proposed for time-varying deterministic optimization; however, they have not yet been devised for its stochastic counterpart. This work specifically addresses non-parametric optimality by developing a stochastic variational framework based on Malliavin calculus. The framework enables the derivation of non-parametric optimality conditions for SO problems with stochastic decision variables and supports the design of a scalable deep-learning algorithm that is insensitive to the parameterization dimension. The resulting algorithm, called the stochastic path follower (SPF), is applied to two key problems under distribution drift: least-squares recovery and logistic regression. Experimental results demonstrate the merit of the proposed approach over state-of-the-art learning-based and gradient-based methods in terms of both performance and scalability.