Thesis details
Structure prediction for human parsing
Tran, Duan
Keywords: Image Parsing; Human Pose Parsing; People Parsing; Object Detection; Structure Prediction
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/29473/Tran_Duan.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

This thesis shows that structure prediction is well suited to detecting and parsing people in images and videos, because it learns local part appearance models jointly with the relationships between body parts. In detection, the method handles hard cases, such as a person mounting a bicycle, that are uncommon in the training data and can cause current person detectors to fail. The thesis demonstrates a pedestrian finder that first finds the most likely human pose in the window using a discriminative procedure trained with structure learning on a small dataset, then presents features based on that configuration to an SVM classifier. Using the INRIA Person dataset, it shows that estimates of configuration significantly improve the accuracy of a discriminative pedestrian finder.

The thesis also presents quantitative evidence that a full relational model of the body performs better at upper-body parsing than the standard tree model, despite the need to adopt approximate inference and learning procedures. The method uses an approximate search for inference and an approximate structure learning method for training. It is compared with state-of-the-art methods on a dataset prepared at UIUC (which depicts a wide range of poses), on the standard Buffy dataset, and on the recently published reduced PASCAL dataset. Results suggest that the Buffy dataset overemphasizes poses where the arms hang down, which leads to generalization problems.

Although a full relational model outperforms a tree-structured model, its practical use is still limited by the high complexity of inference. The thesis then shows a method to boost a parser with poselet pruners. The method first develops a cascade of hierarchical poselet pruners to reduce the search space to a small set of part states, and then builds a hierarchical poselet parser to find part locations on the pruned set. Experiments on the UIUC Sport dataset show that the poselet pruners can effectively prune away more than 99.6% of unlikely part states, leaving about 500 states per part. This small set of part states allows the use of advanced appearance models for better parsers. The method achieves performance comparable to that of state-of-the-art methods while finding part locations several times faster.
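
The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function `cascade_prune`, the two toy scorers, the grid of candidate states, and the per-stage keep counts are all hypothetical, chosen only so that the toy numbers end at 500 surviving states. Each stage scores the surviving candidate states and keeps only the best ones, so that a later, more expensive model runs on a much smaller set.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, int]  # hypothetical part state: an (x, y) location


def cascade_prune(
    states: Sequence[State],
    scorers: Sequence[Callable[[State], float]],
    keep_counts: Sequence[int],
) -> List[State]:
    """Run the scorers in order, keeping the top-scoring states at each stage."""
    survivors = list(states)
    for scorer, n_keep in zip(scorers, keep_counts):
        # Higher score means a more plausible state; keep the n_keep best.
        survivors = sorted(survivors, key=scorer, reverse=True)[:n_keep]
    return survivors


def coarse_score(state: State) -> float:
    """Toy stand-in for a cheap first-stage pruner (assumed, not from the thesis)."""
    x, y = state
    return -(abs(x - 100) + abs(y - 100))


def fine_score(state: State) -> float:
    """Toy stand-in for a more selective second-stage pruner (assumed)."""
    x, y = state
    return -((x - 100) ** 2 + (y - 100) ** 2)


# Toy search space: 125,000 candidate states for one part.
states = [(x, y) for x in range(500) for y in range(250)]

pruned = cascade_prune(states, [coarse_score, fine_score], [5000, 500])
fraction_pruned = 1 - len(pruned) / len(states)
print(f"kept {len(pruned)} of {len(states)} states; pruned {fraction_pruned:.1%}")
```

In this toy setup the cheap scorer discards most states before the finer scorer runs, which is the general motivation for a cascade: the cost of the later stages depends on the size of the pruned set rather than on the full search space.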

【 Preview 】
Attachments
Files | Size | Format | View
Structure prediction for human parsing | 5855 KB | PDF | download