Conference paper

【Abstract】

Action recognition, which aims to classify human actions in videos, has become an active research topic. Current mainstream methods generally use an ImageNet-pretrained model as the feature extractor; however, pretraining a video classification model on a large still-image dataset is not the optimal choice. Moreover, few works have noted that a 3D convolutional neural network (3D CNN) is better suited to extracting low-level spatial-temporal features, while a recurrent neural network (RNN) is better suited to modelling high-level temporal feature sequences. We therefore propose a novel model that addresses both problems. First, we pretrain a 3D CNN on the large-scale video action recognition dataset Kinetics to improve the generality of the model. Then, long short-term memory (LSTM) is introduced to model the high-level temporal features produced by the Kinetics-pretrained 3D CNN. Our experimental results show that the Kinetics-pretrained model generally outperforms the ImageNet-pretrained model, and the proposed network achieves leading performance on the UCF-101 dataset.
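The data flow described in the abstract (clip-level features from a 3D CNN, then an LSTM over the resulting feature sequence, then a classifier) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: `clip_features` is a hypothetical stand-in that merely averages pixels where a Kinetics-pretrained 3D CNN would be, the weights are random and untrained, and every name and size below (`LSTMCell`, `classify`, clip length 4, 3 classes) is an assumption chosen for illustration.

```python
import math
import random


def clip_features(clip, dim):
    """Hypothetical stand-in for a Kinetics-pretrained 3D CNN.

    clip: list of frames, each frame a flat list of pixel values.
    Pools over time and space into a vector of length `dim`.
    """
    sums = [0.0] * dim
    counts = [0] * dim
    for frame in clip:
        for k, px in enumerate(frame):
            sums[k % dim] += px
            counts[k % dim] += 1
    return [s / n if n else 0.0 for s, n in zip(sums, counts)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Standard LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        n = input_size + hidden_size
        self.hidden_size = hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {name: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                         for _ in range(hidden_size)] for name in "ifgo"}
        self.b = {name: [0.0] * hidden_size for name in "ifgo"}

    def _affine(self, name, z):
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.W[name], self.b[name])]

    def step(self, x, h, c):
        z = x + h  # concatenate the clip feature with the previous hidden state
        i = [sigmoid(a) for a in self._affine("i", z)]
        f = [sigmoid(a) for a in self._affine("f", z)]
        cand = [math.tanh(a) for a in self._affine("g", z)]
        o = [sigmoid(a) for a in self._affine("o", z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def classify(video, clip_len, feat_dim, cell, W_out):
    """Split a video into non-overlapping clips, run the LSTM over the
    clip features, and return softmax class probabilities from the last
    hidden state."""
    clips = [video[t:t + clip_len]
             for t in range(0, len(video) - clip_len + 1, clip_len)]
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for clip in clips:
        h, c = cell.step(clip_features(clip, feat_dim), h, c)
    logits = [sum(w * v for w, v in zip(row, h)) for row in W_out]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy usage: a random 16-frame "video" with 12 pixels per frame, 3 classes.
rng = random.Random(0)
video = [[rng.random() for _ in range(12)] for _ in range(16)]
cell = LSTMCell(input_size=6, hidden_size=5, rng=rng)
W_out = [[rng.uniform(-0.1, 0.1) for _ in range(5)] for _ in range(3)]
probs = classify(video, clip_len=4, feat_dim=6, cell=cell, W_out=W_out)
```

In a real system the stub would be replaced by a pretrained 3D CNN and the LSTM and output weights would be trained on labelled videos; the sketch only shows how the clip-level features feed the recurrent stage.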

【Preview】

Attachments
Files	Size	Format	View
I3D-LSTM: A New Model for Human Action Recognition	629KB	PDF	download

2019 2nd International Conference on Advanced Materials, Intelligent Manufacturing and Automation
I3D-LSTM: A New Model for Human Action Recognition

Wang, Xianyuan^1 ; Miao, Zhenjiang^1 ; Zhang, Ruyi^1 ; Hao, Shanshan^1
School of Computer and Information Technology, Beijing Jiaotong University, Haidian District, Beijing 100044, China^1
Keywords: Action recognition; Convolution neural network; Human-action recognition; Optimal choice; Recurrent neural network (RNN); Research topics; Spatial-temporal features; Temporal features
Others: https://iopscience.iop.org/article/10.1088/1757-899X/569/3/032035/pdf
DOI: 10.1088/1757-899X/569/3/032035

Source: IOP


