This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of the human skeleton: a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch by assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and to enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes so that the grid patch is filled in full. To resolve the problem that a one-step UPT can be overly aggressive, and to further exploit the representation capability of grid patches with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them with multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.
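The interplay of the three designs can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration rather than the released implementation: the class name Ske2GridSketch, the 6x6 grid size, the 25-joint input, and the choice to model GIT as a fixed node-to-cell permutation (the paper learns this mapping, and learns the UPT progressively via PLS) are all hypothetical, chosen only to show how the UPT expands the skeleton nodes to fill a grid patch on which a regular 2D convolution can then operate.

```python
import torch
import torch.nn as nn

class Ske2GridSketch(nn.Module):
    """Illustrative sketch of the Ske2Grid idea (not the authors' code)."""

    def __init__(self, num_nodes=25, grid_h=6, grid_w=6, channels=64):
        super().__init__()
        num_cells = grid_h * grid_w
        # UPT (up-sampling transform): a learned interpolation that expands the
        # original graph nodes into as many "virtual" nodes as there are grid cells.
        self.upt = nn.Linear(num_nodes, num_cells, bias=False)
        # GIT (graph-node index transform): modeled here as a fixed one-to-one
        # assignment of up-sampled nodes to grid cells; the paper learns this mapping.
        self.register_buffer("git_index", torch.randperm(num_cells))
        self.grid_h, self.grid_w = grid_h, grid_w
        # Regular 2D convolution applied on the resulting image-like grid patch.
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # x: (batch, channels, num_nodes) skeleton features for one frame
        x = self.upt(x)                # -> (batch, channels, num_cells)
        x = x[:, :, self.git_index]    # place each node into its grid cell
        b, c, _ = x.shape
        grid = x.view(b, c, self.grid_h, self.grid_w)
        return self.conv(grid)

# Toy usage: 2 samples, 64-channel features on 25 skeleton joints.
feat = torch.randn(2, 64, 25)
out = Ske2GridSketch()(feat)
print(out.shape)  # torch.Size([2, 64, 6, 6])
```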
Author Information
Dongqi Cai (Intel Labs China)
Yangyuxuan Kang (Chinese Academy of Sciences)
Anbang Yao (Intel Labs China)
Anbang Yao is currently a Principal Research Scientist and Principal Engineer (PE) at Intel Labs China, where he leads research on developing omni-scale, high-performance intelligent vision systems. He received his Ph.D. from Tsinghua University in January 2010. He has over 100 PCT/US/EP patent applications granted or filed, which are broadly adopted in Intel AI hardware accelerators (the Intel® VPU product line and Intel® Arria® Series FPGAs), hardware core usages (the Intel® ultra-low-power companion die CVF paired with TGL, MTL and ADL Core processors, and Intel® Xe GPUs), and software development kits (the Intel® Distribution of OpenVINO™ Toolkit and the Intel® RealSense™ SDK). As first or corresponding author, he has published about 40 top-tier research papers in venues including ICLR, NeurIPS, ICML, AAAI, CVPR, ICCV, ECCV, and TPAMI. His works, such as INQ (Incremental Network Quantization) for convnet quantization, DNS (Dynamic Network Surgery) for sparse convnets, and HyperNet for efficient object detection, are among the Most Influential ICLR/NeurIPS/CVPR Papers in Google Scholar Metrics 2021/2022. He has received numerous awards at Intel, including Intel Innovator (the first and so far only employee winner from China), three annual Intel Labs Gordy Awards (the highest annual research award, named after Intel co-founder Gordon Earle Moore), and two annual Intel China Awards. He also led the teams that won the prestigious EmotiW Challenges (held at ACM ICMI) in 2015 and 2017, beating out 74 and 100+ teams from across the world, respectively. He has a strong record of mentoring interns, many of whom have grown into top young researchers in the field.
Yurong Chen (Intel)
More from the Same Authors
- 2019 Workshop: Joint Workshop on On-Device Machine Learning & Compact Deep Neural Network Representations (ODML-CDNNR)
Sujith Ravi · Zornitsa Kozareva · Lixin Fan · Max Welling · Yurong Chen · Werner Bailer · Brian Kulis · Haoji Hu · Jonathan Dekhtiar · Yingyan Lin · Diana Marculescu