icml2008@helsinki.fi

ICML 2008 abstracts

paper ID: 111

Preconditioned Temporal Difference Learning

Hengshuai Yao and Zhi-Qiang Liu

This paper extends many recent popular reinforcement learning (RL) algorithms to a generalized framework that includes least-squares temporal difference (LSTD) learning, least-squares policy evaluation (LSPE), and a variant of incremental LSTD (iLSTD). The extension is based on a preconditioning technique aimed at solving a stochastic model equation. The paper also studies three significant issues of the new framework: it presents a new step-size rule that can be computed online, provides an iterative way to apply preconditioning, and reduces the complexity of related algorithms to near that of temporal difference (TD) learning.
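To make the idea of a preconditioned update concrete, here is a minimal sketch. It is not the paper's algorithm. Everything in it is an assumption of this illustration: the model equation is written as A theta + b = 0 with exact (expected) quantities rather than stochastic estimates, the update is theta <- theta + alpha * C^{-1} (A theta + b), and the three choices of the preconditioner C (identity, the feature covariance, and -A) are labeled TD-like, LSPE-like, and LSTD-like only by analogy. The toy chain, features, and the name `preconditioned_iteration` are hypothetical.

```python
import numpy as np

def preconditioned_iteration(A, b, C, alpha, n_iters):
    """Iterate theta <- theta + alpha * C^{-1} (A theta + b).

    Assumed form of a preconditioned update for the model equation
    A theta + b = 0; C is the preconditioner, alpha the step-size.
    """
    theta = np.zeros(b.shape[0])
    for _ in range(n_iters):
        residual = A @ theta + b                      # model-equation residual
        theta = theta + alpha * np.linalg.solve(C, residual)
    return theta

# Hypothetical 3-state Markov chain (doubly stochastic, so the
# stationary distribution is uniform), rewards, and 2-d features.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])
D = np.diag(np.full(3, 1.0 / 3.0))

# Expected quantities of linear TD(0) under the stationary distribution.
A = Phi.T @ D @ (gamma * P - np.eye(3)) @ Phi
b = Phi.T @ D @ r
theta_star = np.linalg.solve(-A, b)                   # direct solution

# Three preconditioner choices (assumptions of this sketch).
theta_td = preconditioned_iteration(A, b, np.eye(2), alpha=0.5, n_iters=5000)
theta_lspe = preconditioned_iteration(A, b, Phi.T @ D @ Phi, alpha=1.0, n_iters=500)
theta_lstd = preconditioned_iteration(A, b, -A, alpha=1.0, n_iters=1)

print(theta_star, theta_td, theta_lspe, theta_lstd)
```

In this toy setting all three choices reach the same fixed point; they differ in how many iterations they need and in the cost of applying C^{-1}, which is the trade-off a preconditioned framework would expose.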

ICML 2008 abstracts

Preconditioned Temporal Difference Learning

The GroupLASSO for Generalized Linear Models: Uniqueness of Solutions and Efficient Algorithms

Autonomous Geometric Precision Error Estimation in Low-level Computer Vision Tasks

A Worst-Case Comparison Between Temporal Difference and Residual Gradient with Linear Function Approximation

Dirichlet Component Analysis: Feature Extraction for Compositional Data

Adaptive p-Posterior Mixture-Model Kernels for Multiple Instance Learning

Pairwise Constraint Propagation by Semidefinite Programming for Semi-Supervised Classification

Cost-Sensitive Multi-class Classification from Probability Estimates

Fast Gaussian Process Methods for Point Process Intensity Estimation

Localized Multiple Kernel Learning

Causal Modelling Combining Instantaneous and Lagged Effects: an Identifiable Model Based on Non-Gaussianity

Uncorrelated Multilinear Principal Component Analysis through Successive Variance Maximization

A Dual Coordinate Descent Method for Large-scale Linear SVM

Listwise Approach to Learning to Rank - Theory and Algorithm

Efficient MultiClass Maximum Margin Clustering

Spectral Clustering with Inconsistent Advice

Nearest Hyperdisk Methods for High-Dimensional Classification

Query-Level Stability and Generalization in Learning to Rank

Local Likelihood Modeling of Temporal Text Streams

Inverting the Viterbi Algorithm: an Abstract Framework for Structure Design

Estimating Local Optimums in EM Algorithm over Gaussian Mixture Model

Efficiently Learning Linear-Linear Exponential Family Predictive Representations of State

Learning to Classify with Missing and Corrupted Features

Multi-Task Compressive Sensing with Dirichlet Process Priors

Fast Solvers and Efficient Implementations for Distance Metric Learning

Nu-Support Vector Machine as Conditional Value-at-Risk Minimization

Manifold Alignment using Procrustes Analysis

A Decoupled Approach to Exemplar-based Unsupervised Learning

Laplace Maximum Margin Markov Networks

Gaussian Process Product Models for Nonparametric Nonstationarity

Prediction with Expert Advice for the Brier Game

Stability of Transductive Regression Algorithms

Learning All Optimal Policies with Multiple Criteria

Random Classification Noise Defeats All Convex Potential Boosters

Non-Parametric Policy Gradients: A Unified Treatment of Propositional and Relational Domains

On Partial Optimality in Multi-label MRFs

Learning Diverse Rankings with Multi-Armed Bandits

SVM Optimization: Inverse Dependence on Training Set Size

A Least Squares Formulation for Canonical Correlation Analysis

Learning from Incomplete Data with Infinite Imputations

Nonextensive Entropic Kernels

A Distance Model for Rhythms

Training Structural SVMs when Exact Inference is Intractable

Active Reinforcement Learning

Graph Transduction via Alternating Minimization

Learning to Sportscast: A Test of Grounded Language Acquisition

An HDP-HMM for Systems with State Persistence

Fully Distributed EM for Very Large Datasets

Grassmann Discriminant Analysis: a Unifying View on Subspace-Based Learning

On-line Discovery of Temporal-Difference Networks

A Reproducing Kernel Hilbert Space Framework for Pairwise Time Series Distances

Confidence-Weighted Linear Classification

On the Chance Accuracies of Large Collections of Classifiers

Hierarchical Sampling for Active Learning

Efficiently Solving Convex Relaxations for MAP Estimation

Boosting with Incomplete Information

Privacy-Preserving Reinforcement Learning

Estimating Labels from Label Proportions

Deep Learning via Semi-Supervised Embedding

Online Kernel Selection for Bayesian Reinforcement Learning

Unsupervised Rank Aggregation with Distance-Based Models

The Projectron: a Bounded Kernel-Based Perceptron

Efficient Projections onto the L1-Ball for Learning in High Dimensions

Maximum Likelihood Rule Ensembles

Rank Minimization via Online Learning

Topologically-Constrained Latent Variable Models

Tailoring Density Estimation via Reproducing Kernel Moment Matching

Graph Kernels Between Point Clouds

Large Scale Manifold Transduction

On Multi-View Active Learning and the Combination with Semi-Supervised Learning

Bolasso: Model Consistent Lasso Estimation through the Bootstrap

A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning

Learning Dissimilarities by Ranking: From SDP to QP

The Skew Spectrum of Graphs

Modified MMI/MPE: a Direct Evaluation of the Margin in Speech Recognition

Bi-Level Path Following for Cross Validated Solution of Kernel Quantile Regression

Fast Nearest Neighbor Retrieval for Bregman Divergences

Accurate Max-margin Training for Structured Output Spaces

Optimized Cutting Plane Algorithm for Support Vector Machines