Focus-Then-Contact: Speeding Up Robotic Contact-Rich Task Learning with Affordance-Guided Real-World Residual Reinforcement Learning
Abstract
Real-world reinforcement learning (RL) has shown significant potential for robotic manipulation. However, many methods still require substantial human-in-the-loop intervention to complete contact-rich tasks, especially under disturbances such as changes in the visual background or in position. To address this, we propose Focus-Then-Contact (FTC), a lightweight and low-cost method that accelerates the convergence of human-in-the-loop real-world RL for contact-rich tasks. FTC adopts a residual RL formulation in which base actions help the system quickly reach the target regions, improving sample efficiency. In addition, FTC integrates an affordance-guided reward that drives the real-world RL system to focus quickly on key regions of interest, allowing the robotic arm to continuously engage with these goal areas through force-control feedback. We also optimize the human-in-the-loop implementation so that human intervention does not conflict with RL over control of the robotic arm. We demonstrate the effectiveness of FTC on six contact-rich tasks, where it achieves higher success rates than baseline methods and speeds up contact-rich task learning in a real-world RL setting. Video materials are available at \url{https://anonymous.4open.science/api/repo/FTC-website-BB5E/file/index.html}.