
CodeIPPrompt: Intellectual Property Infringement Assessment of Code Language Models
Zhiyuan Yu · Yuhao Wu · Ning Zhang · Chenguang Wang · Yevgeniy Vorobeychik · Chaowei Xiao

Wed Jul 26 05:00 PM -- 06:30 PM (PDT) @ Exhibit Hall 1 #510

Recent advances in large language models (LMs) have improved their ability to synthesize programming code. However, they have also raised concerns about intellectual property (IP) rights violations. Despite its significance, this issue remains relatively unexplored. In this paper, we aim to bridge the gap by presenting CodeIPPrompt, a platform for automatically evaluating the extent to which code language models may reproduce licensed programs. It comprises two key components: prompts constructed from a licensed code database to elicit IP-violating code from LMs, and a measurement tool to evaluate the extent of IP violation by code LMs. We conducted an extensive evaluation of existing open-source code LMs and commercial products and found IP violations to be prevalent in all of these models. We further identified the root cause as the substantial proportion of the training corpus that is subject to restrictive licenses, resulting from both intentional inclusion and inconsistent license practices in the real world. To address this issue, we also explored potential mitigation strategies, including fine-tuning and dynamic token filtering. Our study provides a testbed for evaluating IP violation issues in existing code generation platforms and stresses the need for better mitigation strategies.

Author Information

Zhiyuan Yu (Washington University, Saint Louis)

I am a Ph.D. candidate in the Department of Computer Science and Engineering at Washington University in St. Louis [(CV)](https://batyu.github.io/zhiyuanyu/files/CV_ZhiyuanYu.pdf). I currently work in the Computer Security and Privacy Lab (CSPL), supervised by Professor Ning Zhang. My research interests include cyber-physical security, adversarial machine learning, and usable privacy. Prior to my Ph.D. journey, I received a B.S. degree in Electrical Engineering from Huazhong University of Science and Technology in 2019.

Yuhao Wu (Washington University, Saint Louis)
Ning Zhang (Washington University, Saint Louis)
Chenguang Wang (University of California Berkeley)
Yevgeniy Vorobeychik (Washington University, St. Louis)
Chaowei Xiao (Umich)

More from the Same Authors