Poster in Affinity Event: GlobalSouthML @ ICML 2026

Efficient AI Deployment on Legacy Data Centers in the Global South

Kedir Yassin Hussen

Abstract

Data centers in the Global South face a triple constraint: legacy CPU-centric infrastructure, limited accelerators, and heterogeneous hardware acquired from multiple donors and vendors as a result of fragmented economic aid. This paper proposes a scheduling framework for such environments that handles training and inference separately. For training, we dynamically select among data, model, and pipeline parallelism based on workload characteristics and the heterogeneous GPUs available. For inference, we employ speculative decoding, continuous batching, and KV caching on CPU–GPU hybrids. We then show how FlagOS provides a unified execution layer that abstracts vendor differences (NVIDIA, AMD, Intel, Huawei Ascend, and edge TPUs), enabling seamless integration of donated hardware. In a realistic simulation of a Nigerian university data center (320 CPU cores plus mixed donated GPUs), our approach improves training throughput by 2.3× and reduces inference latency by 58% compared to a naive GPU-only deployment, while cutting hardware acquisition costs by 70%.
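
To make the training-side selection concrete, the following is a minimal sketch of how a scheduler might choose among data, model, and pipeline parallelism on a mixed GPU pool. The abstract does not state the actual selection criteria, so the memory-fit rule used here is an assumption, and every name (`Gpu`, `choose_parallelism`, the `"offload"` and `"cpu_only"` fallbacks) is hypothetical rather than taken from the paper or from FlagOS.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gpu:
    name: str      # hypothetical label for a donated card
    mem_gb: float  # usable device memory in GB


def choose_parallelism(model_mem_gb: float,
                       largest_layer_mem_gb: float,
                       gpus: List[Gpu]) -> str:
    """Pick a strategy from memory fit alone (illustrative assumption)."""
    if not gpus:
        # No accelerator available: fall back to the CPU cores.
        return "cpu_only"

    smallest = min(g.mem_gb for g in gpus)
    total = sum(g.mem_gb for g in gpus)

    # Every GPU can hold a full replica: replicate the model, split the batch.
    if model_mem_gb <= smallest:
        return "data"

    # The model exceeds the combined GPU memory: spill to host memory.
    if model_mem_gb > total:
        return "offload"

    # Each layer fits on even the smallest GPU: assign consecutive layers
    # to devices as pipeline stages.
    if largest_layer_mem_gb <= smallest:
        return "pipeline"

    # Some layer is too large for the smallest GPU: split inside layers.
    return "model"


# A hypothetical pool of three donated cards with different memory sizes.
pool = [Gpu("card_a", 16.0), Gpu("card_b", 8.0), Gpu("card_c", 24.0)]
small_job = choose_parallelism(6.0, 1.0, pool)
large_job = choose_parallelism(30.0, 2.0, pool)
wide_job = choose_parallelism(30.0, 10.0, pool)
```

With this pool, a 6 GB model fits on every card and is replicated, while a 30 GB model is pipelined when its layers are small and split within layers when a single layer exceeds the 8 GB card. A real scheduler would presumably also weigh interconnect bandwidth, compute throughput per device, and batch size, which this sketch ignores.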