

Poster in Workshop: ICML 2024 Workshop on Foundation Models in the Wild

Recursive Introspection: Teaching LLM Agents How to Self-Improve

Yuxiao Qu · Tianjun Zhang · Naman Garg · Aviral Kumar

Keywords: [ Self-Improvement ] [ Reinforcement Learning ] [ Large Language Model ]


Abstract: A central piece in enabling intelligent agentic behavior in foundation models is to make them capable of introspecting on their behavior, reasoning about their mistakes, and correcting them. In this paper, we introduce $\textbf{RISE}$: $\textbf{R}$ecursive $\textbf{I}$ntro$\textbf{S}$p$\textbf{E}$ction, an approach for fine-tuning large language models (LLMs) to enable introspection and self-correction. $\textbf{RISE}$ prescribes an iterative fine-tuning procedure that teaches the model to alter its response after seeing its previously unsuccessful attempts at a problem, along with additional environment feedback. Inspired by online imitation learning, we derive strategies for multi-turn data collection and training that imbue an LLM with the capability to recursively detect and correct its mistakes in subsequent iterations. Experiments show that $\textbf{RISE}$ enables 7B Llama2 and Mistral models to improve their responses over successive turns on math reasoning tasks, outperforming single-turn strategies given equal inference-time computation, without degrading their one-turn abilities.
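The abstract describes multi-turn data collection (sample an attempt, record environment feedback, obtain an improved response) followed by supervised fine-tuning on the resulting sequences. The sketch below is a minimal, hypothetical illustration of that loop, not the paper's implementation: the helpers `generate`, `check_answer`, and `improve` are stand-ins for an LLM sampler, a math-answer verifier, and a supervision source (e.g. a stronger model or best-of-N resampling), and all names are assumptions made for illustration.

```python
# Hypothetical sketch of RISE-style multi-turn data collection for fine-tuning.
# All helper functions are toy stand-ins, not the paper's code.

from dataclasses import dataclass, field

@dataclass
class Turn:
    attempt: str
    feedback: str  # environment feedback, e.g. "incorrect, try again"

@dataclass
class Episode:
    problem: str
    turns: list = field(default_factory=list)
    final_answer: str = ""

def generate(model, problem, history):
    """Stand-in for sampling a response from the current model."""
    return f"attempt {len(history)}: answer to '{problem}'"

def check_answer(problem, attempt):
    """Stand-in for an external verifier (e.g. exact match on the math answer)."""
    return not attempt.startswith("attempt 0")  # toy rule: first try always fails

def improve(problem, history):
    """Stand-in for the supervision source: a stronger model or best-of-N resample."""
    return f"corrected answer to '{problem}'"

def collect_episodes(model, problems, max_turns=3):
    """Roll out up to max_turns attempts per problem, recording feedback on failures."""
    episodes = []
    for problem in problems:
        ep = Episode(problem)
        for _ in range(max_turns):
            attempt = generate(model, problem, ep.turns)
            if check_answer(problem, attempt):
                ep.final_answer = attempt
                break
            # Unsuccessful attempt: keep it, with feedback, as context for the next turn.
            ep.turns.append(Turn(attempt, "Your answer was incorrect. Revise it."))
        else:
            # No turn succeeded: fall back to an externally improved response.
            ep.final_answer = improve(problem, ep.turns)
        episodes.append(ep)
    return episodes

def build_sft_examples(episodes):
    """Each example conditions on the problem plus prior failed attempts and
    feedback, and targets the improved response (online-imitation style)."""
    examples = []
    for ep in episodes:
        context = ep.problem
        for turn in ep.turns:
            context += f"\n[attempt] {turn.attempt}\n[feedback] {turn.feedback}"
        examples.append({"prompt": context, "target": ep.final_answer})
    return examples

if __name__ == "__main__":
    eps = collect_episodes("llm-7b", ["What is 12 * 7?"])
    for ex in build_sft_examples(eps):
        print(ex["prompt"], "->", ex["target"])
```

Fine-tuning on such examples teaches the model to map (problem, failed attempts, feedback) to a better response, which is what allows it to self-correct over multiple turns at inference time.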
