IEEE Access

A Newly Developed Ground Truth Dataset for Visual Saliency in Videos

Muhammad Khurram Khan [1], Muhammad Zeeshan [2], Muhammad Majid [2], Imran Fareed Nizami [3], Syed Muhammad Anwar [4], Ikram Ud Din [5]

[1] Center of Excellence in Information Assurance, King Saud University, Riyadh, Saudi Arabia
[2] Department of Computer Engineering, University of Engineering and Technology at Taxila, Taxila, Pakistan
[3] Department of Electrical Engineering, Bahria University, Islamabad, Pakistan
[4] Department of Software Engineering, University of Engineering and Technology at Taxila, Taxila, Pakistan
[5] Information Technology Department, University of Haripur, Haripur, Pakistan
Keywords: Ground truth; saliency map; saliency models; video coding; visual attention
DOI: 10.1109/ACCESS.2018.2826562
Source: DOAJ
【 Abstract 】
Visual saliency models aim to detect important and eye-catching portions of a scene by exploiting characteristics of the human visual system. The effectiveness of a visual saliency model is evaluated by comparing its saliency maps with a ground truth data set. In recent years, several visual saliency computation algorithms and ground truth data sets have been proposed for images; however, there is a lack of ground truth data sets for videos. A new human-labeled ground truth is prepared for video sequences that are commonly used in video coding. The selected videos come from different genres, including conversational, sports, outdoor, and indoor, and exhibit low, medium, and high motion. A saliency mask is obtained for each video from nine different subjects, who are asked to label the salient region in each frame in the form of a rectangular bounding box. A majority voting criterion is used to construct a final ground truth saliency mask for each frame. Sixteen state-of-the-art visual saliency algorithms are selected for comparison, and their effectiveness is computed quantitatively on the newly developed ground truth. The results show that multiple kernel learning and spectral residual-based saliency algorithms perform best across the different genres and motion types in terms of F-measure and execution time, respectively.
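As an illustration of how such a ground truth and evaluation might be computed, the following Python sketch combines nine per-subject bounding boxes into a majority-vote mask and scores a binarised saliency map with the F-measure. The function names, the 5-of-9 voting threshold, and the beta^2 = 0.3 weighting are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def majority_vote_mask(boxes, frame_shape, min_votes=5):
    """Combine per-subject bounding boxes into one binary ground truth mask.

    boxes       -- list of (x, y, w, h) tuples, one per subject, for a single frame
    frame_shape -- (height, width) of the video frame
    min_votes   -- votes needed for a pixel to count as salient
                   (5 of 9 subjects is an assumed majority threshold)
    """
    votes = np.zeros(frame_shape, dtype=np.uint8)
    for (x, y, w, h) in boxes:
        votes[y:y + h, x:x + w] += 1              # each subject's box casts one vote per pixel
    return (votes >= min_votes).astype(np.uint8)  # 1 = salient, 0 = non-salient

def f_measure(pred_mask, gt_mask, beta2=0.3):
    """F-measure between a binarised saliency map and the ground truth mask.

    beta2 = 0.3 is a weighting commonly used in saliency evaluation (assumed here).
    """
    tp = np.logical_and(pred_mask, gt_mask).sum()
    precision = tp / max(pred_mask.sum(), 1)
    recall = tp / max(gt_mask.sum(), 1)
    return (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-8)

# Example: nine subjects annotate roughly the same region in a 288x352 frame
subject_boxes = [(100 + i, 80 + i, 120, 90) for i in range(9)]
gt_mask = majority_vote_mask(subject_boxes, (288, 352))
pred_mask = np.zeros((288, 352), dtype=np.uint8)
pred_mask[90:170, 110:220] = 1                    # a hypothetical binarised saliency map
print(f_measure(pred_mask, gt_mask))
```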
【 License 】
Unknown