Journal article

【Abstract】

Recent studies have shown that the spatial-temporal graph convolutional network (ST-GCN) is effective for skeleton-based action recognition. However, existing ST-GCN-based methods usually fix the temporal kernel size across all layers, which prevents them from fully exploiting the temporal dependencies between non-adjacent frames and across different sequence lengths. In addition, most of these methods use average pooling to obtain a global graph feature from vertex features, which discards much of the fine-grained information needed for action classification. To address these issues, the authors propose a novel spatial attentive and temporal dilated graph convolutional network (SATD-GCN). It contains two key components: a spatial attention pooling (SAP) module and a temporal dilated graph convolution (TDGC) module. Specifically, the SAP module uses a self-attention mechanism to select the human body joints that are beneficial for action recognition, and alleviates the influence of data redundancy and noise. The TDGC module extracts temporal features at different time scales, which enlarges the temporal receptive field and makes the model more robust to different motion speeds and sequence lengths. Importantly, both the SAP module and the TDGC module can be easily integrated into ST-GCN-based models and significantly improve their performance. Extensive experiments on two large-scale benchmark datasets, NTU RGB+D and Kinetics-Skeleton, demonstrate that the authors' method achieves state-of-the-art performance for skeleton-based action recognition.
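
The two ideas named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not give the scoring function or the layer layout, so the function names, the `score_weights` vector, and the toy inputs below are all assumptions made for illustration. The sketch contrasts average pooling with softmax-weighted (attention) pooling over joints, and shows a 1D temporal convolution whose taps are spaced `dilation` frames apart.

```python
import math


def average_pool(joint_feats):
    """Mean over joints: every joint contributes equally."""
    n = len(joint_feats)
    dim = len(joint_feats[0])
    return [sum(f[d] for f in joint_feats) / n for d in range(dim)]


def attention_pool(joint_feats, score_weights):
    """Softmax-weighted sum over joints.

    score_weights is a hypothetical learned vector (an assumption);
    the abstract does not specify how joints are scored.
    """
    scores = [sum(w * x for w, x in zip(score_weights, f)) for f in joint_feats]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(joint_feats[0])
    pooled = [sum(a * f[d] for a, f in zip(alphas, joint_feats)) for d in range(dim)]
    return pooled, alphas


def dilated_temporal_conv(seq, kernel, dilation):
    """1D convolution along time with `dilation` frames between taps (no padding)."""
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * seq[t + i * dilation] for i, k in enumerate(kernel))
        for t in range(len(seq) - span)
    ]


# Toy data: 3 joints with 2-dimensional features, and an 8-frame sequence.
joints = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
pooled, alphas = attention_pool(joints, [1.0, 1.0])
frames = list(range(8))
print(average_pool(joints), pooled, alphas)
print(dilated_temporal_conv(frames, [1, 1, 1], 1))
print(dilated_temporal_conv(frames, [1, 1, 1], 2))
```

With a 3-tap kernel, dilation 1 covers 3 consecutive frames per output, while dilation 2 covers a span of 5 frames with the same number of weights, which is the sense in which dilation widens temporal coverage.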

【License】

CC BY|CC BY-ND|CC BY-NC|CC BY-NC-ND

【Preview】

Attachments
Files	Size	Format	View
RO202302050004873ZK.pdf	874KB	PDF	download

CAAI Transactions on Intelligence Technology
A spatial attentive and temporal dilated (SATD) GCN for skeleton-based action recognition
article
Jiaxu Zhang¹ Gaoxiang Ye² Zhigang Tu¹ Yongtao Qin³ Qianqing Qin¹ Jinlu Zhang¹ Jun Liu⁴
[1] State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University; [2] State Grid Wuhan Power Supply Company; [3] Shenzhen Infinova Ltd. Company; [4] Information Systems Technology and Design Pillar, Singapore University of Technology and Design
Keywords: feature extraction; image representation; image recognition; image motion analysis; learning (artificial intelligence); image classification; graph theory; video signal processing; object recognition
DOI: 10.1049/cit2.12012
Subject classification: Mathematics (General)
Source: Wiley


	Article metrics
	Downloads: 11	Views: 2
