Curriculum Reinforcement Learning for Black-Box Prompt Tuning via Large Language Models
Abstract
Black-box prompt tuning (BBPT) aims to optimize input prompts for large models whose internal parameters and gradients are inaccessible. However, existing methods fail to address the dual challenges of prompt interpretability and query efficiency simultaneously. To tackle both, we propose CRL-BPT, a curriculum reinforcement learning framework that employs a large language model as an agent to generate human-readable prompts. Specifically, CRL-BPT applies a dynamic curriculum schedule to two auxiliary objectives, an imitation loss and an innovation loss: by adjusting their weights over the course of training, it regularizes the RL process and guides the agent from mimicking reference prompts to discovering novel patterns. In addition, we introduce tailored stabilization mechanisms, comprising historical loss normalization and relative reward calibration, to ensure robust training. Extensive experiments demonstrate that CRL-BPT establishes a new state of the art and generates highly interpretable prompts under a strict budget of API calls. Code is available at https://anonymous.4open.science/r/CRL-BPT.
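To make the curriculum mechanism concrete, the following is a minimal Python sketch of dynamic weighting over the two auxiliary objectives combined with historical loss normalization. All names (curriculum_weight, HistoricalNormalizer, total_loss) and the linear annealing schedule are illustrative assumptions, not the paper's exact formulation.

```python
from collections import deque

def curriculum_weight(step: int, total_steps: int) -> float:
    """Anneal linearly from pure imitation (1.0) toward pure innovation (0.0).

    The linear schedule is an assumption; the paper's schedule may differ.
    """
    return max(0.0, 1.0 - step / total_steps)

class HistoricalNormalizer:
    """Normalize a loss by the running mean of its recent history."""

    def __init__(self, window: int = 100, eps: float = 1e-8):
        self.history = deque(maxlen=window)  # sliding window of past losses
        self.eps = eps

    def __call__(self, loss: float) -> float:
        self.history.append(loss)
        mean = sum(self.history) / len(self.history)
        return loss / (mean + self.eps)

imitation_norm = HistoricalNormalizer()
innovation_norm = HistoricalNormalizer()

def total_loss(rl_loss: float, imitation_loss: float, innovation_loss: float,
               step: int, total_steps: int) -> float:
    """Combine the RL objective with the curriculum-weighted auxiliary losses."""
    w = curriculum_weight(step, total_steps)
    return (rl_loss
            + w * imitation_norm(imitation_loss)            # mimic references early
            + (1.0 - w) * innovation_norm(innovation_loss)) # explore novelty late
```

Early in training the imitation term dominates, anchoring the agent to reference prompts; as the weight decays, the innovation term takes over and encourages departure from them, matching the curriculum described in the abstract.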