Inter-Operator Parallelism
Zhuohan Li
Mon Jul 18 01:15 PM -- 01:45 PM (PDT)
Author Information
Zhuohan Li (UC Berkeley)
More from the Same Authors
- 2023 Poster: FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
  Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Christopher Re · Ion Stoica · Ce Zhang
- 2023 Oral: FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
  Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Christopher Re · Ion Stoica · Ce Zhang
- 2022 Tutorial: Welcome to the "Big Model" Era: Techniques and Systems to Train and Serve Bigger Models
  Hao Zhang · Lianmin Zheng · Zhuohan Li · Ion Stoica
- 2021 Poster: TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
  Zhuohan Li · Siyuan Zhuang · Shiyuan Guo · Danyang Zhuo · Hao Zhang · Dawn Song · Ion Stoica
- 2021 Spotlight: TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
  Zhuohan Li · Siyuan Zhuang · Shiyuan Guo · Danyang Zhuo · Hao Zhang · Dawn Song · Ion Stoica
- 2020 Poster: Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
  Zhuohan Li · Eric Wallace · Sheng Shen · Kevin Lin · Kurt Keutzer · Dan Klein · Joseph Gonzalez