Linear prediction methods, such as least squares for regression,
logistic regression and support vector machines for classification,
have been extensively used in statistics and machine learning.
In this paper, we study stochastic gradient descent (SGD) algorithms
on regularized forms of linear prediction methods.
This class of methods, related to online algorithms such as the perceptron,
is both efficient and very simple to implement.
We obtain numerical rates of convergence for such algorithms and
discuss their implications.
Experiments on text data are provided to demonstrate
the numerical and statistical consequences of our theoretical findings.