Simple, Robust, Scalable Semi-supervised Learning via Expectation Regularization

Gideon S. Mann - University of Massachusetts, USA
Andrew McCallum - University of Massachusetts, USA
Although semi-supervised learning has been an active area of research, its use in deployed applications is still relatively rare because the methods are often difficult to implement, fragile in tuning, or lacking in scalability. This paper presents expectation regularization, a semi-supervised learning method for exponential family parametric models that augments the traditional conditional label-likelihood objective function with an additional term that encourages model predictions on unlabeled data to match certain expectations, such as label priors. The method is extremely easy to implement, scales as well as logistic regression, and can handle non-independent features. We present experiments on five different data sets, showing accuracy improvements over other semi-supervised methods.
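
To make the shape of the augmented objective concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a multiclass logistic regression model and an extra term equal to the KL divergence from a given label prior to the model's average predicted label distribution on unlabeled data. The abstract does not specify the form of the extra term, so the KL penalty, the weight `lam`, and the names `predict` and `xr_objective` are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    """Class distribution p(y | x) for a linear model; W[k] is the weight vector of class k."""
    return softmax([sum(w * v for w, v in zip(w_k, x)) for w_k in W])


def xr_objective(W, labeled, unlabeled, prior, lam=1.0):
    """Labeled negative log-likelihood plus a weighted expectation penalty (to be minimized).

    labeled:   list of (feature_vector, class_index) pairs
    unlabeled: list of feature vectors
    prior:     target label distribution (hypothetical input, e.g. estimated label priors)
    lam:       weight of the penalty term (assumed hyperparameter)
    """
    # Supervised term: negative conditional log-likelihood of the labeled data.
    nll = -sum(math.log(predict(W, x)[y]) for x, y in labeled)

    # Model's expected label distribution, averaged over the unlabeled data.
    p_hat = [0.0] * len(W)
    for x in unlabeled:
        for k, p in enumerate(predict(W, x)):
            p_hat[k] += p / len(unlabeled)

    # Assumed penalty: KL(prior || p_hat); classes with zero prior mass contribute nothing.
    kl = sum(q * math.log(q / p) for q, p in zip(prior, p_hat) if q > 0)
    return nll + lam * kl
```

With all-zero weights the model predicts a uniform distribution, so the penalty vanishes for a uniform prior and is positive for a skewed one. Because the penalty depends only on averaged predictions, evaluating it costs one pass over the unlabeled data, comparable to a logistic regression likelihood pass.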
