

Poster

Ladder-Residual: Parallelism-Aware Architecture for Accelerating Large Model Inference with Communication Overlapping

Muru Zhang ⋅ Mayank Mishra ⋅ Zhongzhu Zhou ⋅ William Brandon ⋅ Jue Wang ⋅ Yoon Kim ⋅ Jonathan Ragan-Kelley ⋅ Shuaiwen Song ⋅ Ben Athiwaratkun ⋅ Tri Dao
2025 Poster

Abstract
