Poster

Identifying and Mitigating Errors in Gradient Aggregation of Distributed Data Parallel Training

Zhenheng Tang ⋅ Junlin Huang ⋅ Zichen TANG ⋅ Xueze Kang ⋅ Yuxin Wang ⋅ Peijie Dong ⋅ Shaohuai Shi ⋅ Xiaowen Chu ⋅ Bo Li

Abstract
