Poster Wed, Jul 16, 2025 • 4:30 PM – 7:00 PM PDT

MMInference: Accelerating Pre-filling for Long-Context Visual Language Models via Modality-Aware Permutation Sparse Attention

Yucheng Li · Huiqiang Jiang · Chengruidong Zhang · Qianhui Wu · Xufang Luo · Surin Ahn · Amir Abdi · Dongsheng Li · Jianfeng Gao · Yuqing Yang · Lili Qiu

