Timezone: »

Adaptive Sampling Probabilities for Non-Smooth Optimization
Hongseok Namkoong · Aman Sinha · Steven Yadlowsky · John Duchi

Tue Aug 08 01:30 AM -- 05:00 AM (PDT) @ Gallery #99

Standard forms of coordinate and stochastic gradient methods do not adapt to structure in data; their good behavior under random sampling is predicated on uniformity in data. When gradients in certain blocks of features (for coordinate descent) or examples (for SGD) are larger than others, there is a natural structure that can be exploited for quicker convergence. Yet adaptive variants often suffer nontrivial computational overhead. We present a framework that discovers and leverages such structural properties at a low computational cost. We employ a bandit optimization procedure that "learns" probabilities for sampling coordinates or examples in (non-smooth) optimization problems, allowing us to guarantee performance close to that of the optimal stationary sampling distribution. When such structures exist, our algorithms achieve tighter convergence guarantees than their non-adaptive counterparts, and we complement our analysis with experiments on several datasets.

Author Information

Hongseok Namkoong (Stanford University)
Aman Sinha (Stanford University)
Steven Yadlowsky (Stanford University)
John Duchi (Stanford University)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors