EURASIP Journal on Image and Video Processing
Hands-on: deformable pose and motion models for spatiotemporal localization of fine-grained dyadic interactions
Coert van Gemeren1  Remco C. Veltkamp1  Ronald Poppe1
[1] Department of Information and Computing Sciences, Utrecht University
Keywords: Interaction detection; Dyadic interactions; Spatiotemporal localization; Social behavior; Video analysis
DOI: 10.1186/s13640-018-0255-0
Source: DOAJ
Abstract
We introduce a novel spatiotemporal deformable part model for the localization of fine-grained human interactions between two persons in unsegmented videos. Our approach is the first to classify interactions and additionally provide the temporal and spatial extent of the interaction in the video. To this end, our models contain part detectors that support different scales as well as different types of feature descriptors, combined in a single graph. This allows us to model the detailed coordination between people in terms of body pose and motion. We demonstrate that this helps to avoid confusion between visually similar interactions. We show that robust results can be obtained when training on small numbers of training sequences (5–15) per interaction class. We achieve AUC scores of 0.82 at an IoU of 0.3 on the publicly available ShakeFive2 dataset, which contains interactions that differ only slightly in their coordination. To further test the generalization of our models, we perform cross-dataset experiments on two other publicly available datasets: UT-Interaction and SBU Kinect. These experiments show that our models generalize well to different environments.
License: Unknown