Abstract:
Please join us if you are interested in continuing reinforcement learning problems, where the agent has a single, non-episodic stream of experience. In many cases, and most importantly for natural intelligence, the agent is never reset to a state that it has visited before. What is the right objective for these problems? How do they differ from episodic problems? Join this social if you are curious about these questions too!