Sensors
A Generalized Pyramid Matching Kernel for Human Action Recognition in Realistic Videos
Jun Zhu [1], Quan Zhou [2], Weijia Zou [1], Rui Zhang [1]
[1] Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
[2] College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Keywords: video analysis; human action recognition; pyramid matching kernel; kernel-based classification method
DOI: 10.3390/s131114398
Source: MDPI
【 Abstract 】
Human action recognition is an increasingly important research topic in video sensing, analysis and understanding. Unconstrained sensing conditions give rise to large intra-class variations and inter-class ambiguities in realistic videos, which limit the recognition performance of recent vision-based action recognition systems. In this paper, we propose a generalized pyramid matching kernel (GPMK) for recognizing human actions in realistic videos, based on a multi-channel “bag of words” representation constructed from local spatial-temporal features of video clips. As an extension of the spatial-temporal pyramid matching (STPM) kernel, the GPMK leverages heterogeneous visual cues across multiple feature descriptor types and spatial-temporal grid granularity levels to build a valid similarity metric between two video clips for kernel-based classification. Instead of the predefined, fixed weights used in STPM, we present a simple yet effective method that computes adaptive channel weights for the GPMK from the kernel target alignment on training data. The method incorporates prior knowledge and the data-driven information of different channels in a principled way. The experimental results on three challenging video datasets (
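
The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the GPMK formulas, so everything here is an assumption: the code uses the textbook kernel target alignment (the normalized Frobenius inner product between a channel's Gram matrix and an ideal label kernel), clips negative scores to zero, and normalizes the scores to sum to one. The function names `target_kernel`, `alignment`, `channel_weights` and `combine` are hypothetical, and the paper's use of prior knowledge in the weights is not modeled.

```python
import numpy as np

def target_kernel(labels):
    """Ideal kernel: +1 for same-class pairs, -1 otherwise (an assumed convention)."""
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    return np.where(same, 1.0, -1.0)

def alignment(K, T):
    """Standard kernel target alignment: <K, T>_F / (||K||_F * ||T||_F)."""
    return np.sum(K * T) / (np.linalg.norm(K) * np.linalg.norm(T))

def channel_weights(kernels, labels):
    """Assumed weighting rule: alignment scores clipped at zero, normalized to sum to 1."""
    T = target_kernel(labels)
    scores = np.array([max(alignment(K, T), 0.0) for K in kernels])
    total = scores.sum()
    if total == 0.0:
        # No channel aligns with the labels: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / total

def combine(kernels, weights):
    """Non-negative weighted sum of channel kernels.

    A non-negative combination of positive semidefinite Gram matrices
    is itself positive semidefinite, so the result is a valid kernel.
    """
    return sum(w * K for w, K in zip(weights, kernels))

# Toy data: four clips, two classes, two channels (all values made up).
labels = [0, 0, 1, 1]
K_informative = np.array([[1.0, 1.0, 0.0, 0.0],
                          [1.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 1.0],
                          [0.0, 0.0, 1.0, 1.0]])
K_uninformative = np.eye(4)

weights = channel_weights([K_informative, K_uninformative], labels)
K_combined = combine([K_informative, K_uninformative], weights)
```

In this toy setup the channel whose Gram matrix mirrors the class structure receives the larger weight. The combined matrix could then be passed to any classifier that accepts a precomputed kernel.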
【 License 】
CC BY
© 2013 by the authors; licensee MDPI, Basel, Switzerland.