SpeedDETR: Speed-aware Transformers for End-to-end Object Detection
Peiyan Dong · Zhenglun Kong · Xin Meng · Peng Zhang · Hao Tang · Yanzhi Wang · Chih-Hsien Chou

Wed Jul 26 02:00 PM -- 03:30 PM (PDT) @ Exhibit Hall 1 #524

Vision Transformers (ViTs) have continuously achieved new milestones in object detection. However, their considerable computation and memory burden compromises efficiency and hampers deployment on resource-constrained devices. Moreover, the efficient transformer-based detectors designed in existing work can hardly achieve a realistic speedup, especially on multi-core processors (e.g., GPUs). The main issue is that the current literature concentrates solely on building algorithms with minimal computation, overlooking the fact that practical latency is also affected by memory access cost and degree of parallelism. Therefore, we propose SpeedDETR, a novel speed-aware transformer for end-to-end object detection that achieves high-speed inference on multiple devices. Specifically, we design a latency prediction model that directly and accurately estimates network latency by analyzing network properties, hardware memory access patterns, and degree of parallelism. Following the effective local-to-global visual modeling process and the guidance of the latency prediction model, we build a hardware-oriented architecture design and develop a new family of SpeedDETR models. Experiments on the MS COCO dataset show that SpeedDETR outperforms current DETR-based methods on a Tesla V100, and acceptable inference speed can even be achieved on edge GPUs.
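The abstract's central point is that real latency depends not only on FLOPs but also on memory access cost and parallelism. The paper's actual latency prediction model is not given here; the sketch below is a purely illustrative roofline-style estimate (all names and parameters are hypothetical) showing how per-layer latency might combine those three factors:

```python
from dataclasses import dataclass

@dataclass
class Layer:
    flops: float          # floating-point operations performed by the layer
    mem_bytes: float      # bytes read + written (memory access cost)
    parallel_frac: float  # fraction of work that parallelizes across cores (0..1)

def layer_latency(layer, peak_flops, mem_bw, n_cores):
    # Hypothetical roofline-style estimate, NOT the paper's model:
    # compute time shrinks with usable parallelism (Amdahl-style scaling),
    # memory time is bounded by bandwidth; the slower of the two dominates.
    speedup = 1.0 / ((1 - layer.parallel_frac) + layer.parallel_frac / n_cores)
    compute_t = layer.flops / peak_flops / speedup
    memory_t = layer.mem_bytes / mem_bw
    return max(compute_t, memory_t)

def network_latency(layers, peak_flops, mem_bw, n_cores):
    # Sum per-layer estimates for a sequential network.
    return sum(layer_latency(l, peak_flops, mem_bw, n_cores) for l in layers)
```

Under such a model, two layers with identical FLOP counts can have very different predicted latencies if one is memory-bound or poorly parallelized, which is exactly the gap between theoretical computation and realistic speedup that the abstract highlights.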

Author Information

Peiyan Dong (Northeastern University)
Zhenglun Kong (Northeastern University)
Xin Meng (Peking University)
Peng Zhang (Tsinghua University)
Hao Tang (ETH Zurich)
Yanzhi Wang (Northeastern University)
Chih-Hsien Chou (Futurewei Technologies, Inc.)