PISA: Privacy-Preserving Split Adaptation with Model IP Protection
Abstract
Fine-tuning Large Language Models (LLMs) enables data holders to construct proprietary, task-specific models by leveraging external high-performance computing infrastructure. However, existing paradigms typically address data privacy and model intellectual property (IP) in isolation, and thus fail to uphold both constraints simultaneously. Privacy-prioritized methods compromise model IP by hosting parameters remotely, while IP-oriented collaborative schemes that rely on end-to-end gradient flows inherently violate strict data privacy standards. To address these challenges, we present PISA (Privacy-preserving and IP-protected Split Adaptation), a split fine-tuning framework designed to preserve both data privacy and model IP while maintaining high utility. PISA introduces three methods: a Manifold Rectification Pre-training (MRP) method that equips the server-side model with intrinsic robustness against privacy-induced distribution shifts; a Dual-Stream Semantic Compensation (DSC) method that recovers feature utility using local clean data as priors; and a Utility-Aware Gradient Rectification (UGR) method that adaptively maximizes the performance of the parameter-constrained local model. Experiments on the GLUE benchmark show that PISA ensures dual protection and delivers a substantial 23.0\% performance gain over the privacy-prioritized baseline under strict privacy budgets.