Talk in Workshop: 4th Lifelong Learning Workshop
The FOAK Cycle for Model-based Life-long Learning by Rich Sutton
Richard Sutton
A life-long learning agent should learn not only to solve problems, but also to pose new problems for itself. In reinforcement learning, the starting problems are maximizing reward and predicting value, and the natural new problems are achieving subgoals and predicting what will happen next. There has been a lot of work that provides a language for learning new problems (e.g., on auxiliary tasks and general value functions), but precious little that actually learns them (e.g., McGovern on learning subgoal states). In this talk I present a general strategy for learning new problems and, moreover, for learning an endless cycle of problems and solutions, each leading to the other. I call this cycle the FOAK cycle, because it is based on Features, Options, And Knowledge, where “options” are temporally extended ways of behaving, and “knowledge” refers to an agent’s option-conditional model of the transition dynamics of the world. The new problems in the FOAK cycle are 1) to find options that attain state features and 2) to model the consequences of those options. As these problems are solved and the models are used in planning, more abstract features are formed and made the basis for new options and models, continuing the cycle. The FOAK cycle is intended to produce a model-based reinforcement learning agent with successively more abstract representations and knowledge of its world, in other words, a life-long learning agent.
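To make the two new problems concrete, here is a minimal Python sketch of one pass through them on a toy corridor world. Everything in it is an illustrative assumption rather than the method presented in the talk: the corridor environment, the names `step`, `Option`, `learn_option`, and `learn_model`, the use of value iteration to solve the subproblem, and the rollout-based model (which relies on knowing the deterministic dynamics, whereas an agent would have to learn its model from experience).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

N_STATES = 6        # hypothetical corridor with states 0..5
ACTIONS = (-1, +1)  # step left or step right


def step(state: int, action: int) -> int:
    """Deterministic corridor dynamics, clipped at both ends."""
    return min(max(state + action, 0), N_STATES - 1)


@dataclass
class Option:
    """A temporally extended way of behaving: a policy plus a termination test."""
    policy: Dict[int, int]             # state -> action
    terminates: Callable[[int], bool]  # True where the feature is attained


def learn_option(feature: Callable[[int], bool],
                 gamma: float = 0.9, sweeps: int = 50) -> Option:
    """Problem 1: find an option that attains a state feature.

    Assumption: value iteration stands in for whatever learner is used.
    """
    value = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES):
            if feature(s):
                value[s] = 1.0  # subgoal reached
            else:
                value[s] = max(gamma * value[step(s, a)] for a in ACTIONS)
    # Act greedily with respect to the subproblem's value function.
    policy = {s: max(ACTIONS, key=lambda a: value[step(s, a)])
              for s in range(N_STATES)}
    return Option(policy=policy, terminates=feature)


def learn_model(option: Option,
                max_steps: int = 100) -> Dict[int, Tuple[int, int]]:
    """Problem 2: model the option's consequences.

    Maps each start state to (state at termination, number of steps taken).
    Assumption: computed by rollout in the known deterministic world.
    """
    model = {}
    for start in range(N_STATES):
        s, k = start, 0
        while not option.terminates(s) and k < max_steps:
            s = step(s, option.policy[s])
            k += 1
        model[start] = (s, k)
    return model


# Feature: "the agent is in state 4".
option = learn_option(lambda s: s == 4)
model = learn_model(option)
print(model[0])  # (4, 4): from state 0 the option ends in state 4 after 4 steps
```

A planner can treat `model` as a single jump from any state to the feature state, which is the sense in which an option-conditional model is more abstract than a one-step model. Forming new features from the results of such planning, the step that closes the cycle, is not sketched here.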