Poster Wed, Jul 8, 2026 • 10:30 PM – 12:15 AM PDT HALL A #4408

CLAM-Bench: Benchmarking LLM Agents for Library-Scale Cross-Architecture Migration

Weijia Li ⋅ KE GAO ⋅ Jiajie Li ⋅ Han Sun ⋅ Yuhe Ding ⋅ Yongdong Mai ⋅ Yiran Le ⋅ Yongjie Qian ⋅ Zhibin Zhang ⋅ Xinyu Wang ⋅ Limin Cheng ⋅ Shouxu Kuang ⋅ Pengfei Chen ⋅ Ling Li

Abstract

Cross-architecture migration of high-performance libraries dictates ecosystem readiness on emerging hardware. The challenge is twofold: disentangling library-scale dependencies, and migrating performance-critical kernels that rely on ISA-specific SIMD intrinsics, which often means trading migration speed for peak performance. While LLM-based agents offer a promising approach, existing benchmarks are confined to function-level tasks or scalar code and therefore fail to assess agents’ capabilities and limitations in realistic, library-scale migration. We present CLAM-Bench (Cross-architecture Library-scale Agent Migration benchmark), featuring 85 critical kernels from widely used libraries, including OpenCV, libjpeg, and NCNN. It supports comprehensive evaluation of compilability, correctness, and performance across three major transitions: ARM→RISC-V, x86→ARM, and ARM→LoongArch. Evaluating 12 state-of-the-art agent-LLM combinations on CLAM-Bench reveals that, lacking library-level navigation and hardware-aware optimization, agents regress to superficial pattern matching, yielding only 20.88% correctness and a 0.83x speedup for libjpeg. Motivated by these findings, we further propose FSCM, a multi-agent framework incorporating hardware-aware global reconfiguration and performance optimization. FSCM improves OpenCV correctness to 71%. The benchmark and code are available at https://anonymous.4open.science/r/clam_bench-D8EB/.