Electronics
Human Detection in Aerial Thermal Images Using Faster R-CNN and SSD Algorithms
Satish B. Shenoy [1], Abhilash K. Pai [2], A. Kotegar Karunakar [2], K. R. Akshatha [3], Nikhil Hunjanal Nagaraj [4], Sambhav Singh Rohatgi [4]
[1] Department of Aeronautical and Automobile Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
[2] Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
[3] Department of Electronics and Communication Engineering, Centre for Avionics, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
[4] Department of Mechatronics Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
Keywords: human detection; thermal camera; aerial images; convolutional neural network; object detection; Faster R-CNN
DOI: 10.3390/electronics11071151
Source: DOAJ
Abstract
The automatic detection of humans in aerial thermal imagery plays a significant role in real-time applications such as surveillance, search and rescue, and border monitoring. Small target size, low resolution, occlusion, and variations in pose and scale are the main challenges in aerial thermal images, and they cause many state-of-the-art object detection algorithms to perform poorly. Although many deep-learning-based object detection algorithms have shown impressive performance on generic object detection tasks, this study analyzes their ability to detect small objects in aerial thermal images. It evaluates the performance of the Faster R-CNN and single-shot multi-box detector (SSD) algorithms with different backbone networks for detecting human targets in aerial-view thermal images. For this purpose, two standard aerial thermal datasets containing human objects of varying scale are considered, together with backbone networks such as ResNet50, Inception-v2, and MobileNet-v1. The evaluation results show that the Faster R-CNN model trained with the ResNet50 backbone achieved the highest detection accuracy, with a mean average precision (mAP at 0.5 IoU) of 100% on the test data of the OSU thermal dataset and 55.7% on the test data of the AAU PD T dataset. SSD with MobileNet-v1 achieved the highest detection speed, 44 frames per second (FPS), on an NVIDIA GeForce GTX 1080 GPU. Fine-tuning the anchor parameters of the Faster R-CNN ResNet50 and SSD Inception-v2 models improved mAP by 10% and 3.5%, respectively, on the challenging AAU PD T dataset. The experimental results demonstrate the applicability of Faster R-CNN and SSD to human detection in aerial-view thermal images, and show how the choice of backbone network and anchor parameters affects the performance of these algorithms.
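The "mAP at 0.5 IoU" figure quoted above counts a predicted box as a correct detection only when its intersection over union (IoU) with a ground-truth box reaches 0.5. The following is a minimal sketch of that overlap test, not the authors' evaluation code. The function names `iou` and `is_true_positive` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples;
    this layout is an illustrative choice, not taken from the paper.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: does the prediction match at the given IoU threshold?"""
    return iou(pred_box, gt_box) >= threshold
```

For small targets such as distant humans, a shift of only a few pixels can push the IoU below 0.5, which is one reason the small object sizes mentioned in the abstract make this criterion demanding.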
License
Unknown