EURASIP Journal on Image and Video Processing
Hands-on: deformable pose and motion models for spatiotemporal localization of fine-grained dyadic interactions
Coert van Gemeren1  Remco C. Veltkamp1  Ronald Poppe1
[1] Department of Information and Computing Sciences, Utrecht University
Keywords: Interaction detection; Dyadic interactions; Spatiotemporal localization; Social behavior; Video analysis
DOI: 10.1186/s13640-018-0255-0
Source: DOAJ
Abstract
We introduce a novel spatiotemporal deformable part model for the localization of fine-grained human interactions between two persons in unsegmented videos. Our approach is the first to classify interactions and additionally provide the temporal and spatial extent of the interaction in the video. To this end, our models contain part detectors that support different scales as well as different types of feature descriptors, combined in a single graph. This allows us to model the detailed coordination between people in terms of body pose and motion. We demonstrate that this helps to avoid confusion between visually similar interactions. We show that robust results can be obtained when training on small numbers of training sequences (5–15) per interaction class. We achieve AUC scores of 0.82 at an IoU of 0.3 on the publicly available ShakeFive2 dataset, which contains interactions that differ only slightly in their coordination. To further test the generalization of our models, we perform cross-dataset experiments on two other publicly available datasets: UT-Interaction and SBU Kinect. These experiments show that our models generalize well to different environments.
License: Unknown