| Applied Sciences | |
| Transformers in Pedestrian Image Retrieval and Person Re-Identification in a Multi-Camera Surveillance System | |
| Muhammad Tahir [1], Saeed Anwar [2] | |
| [1] College of Computing and Informatics, Saudi Electronic University, Riyadh 11673, Saudi Arabia; [2] Data61, Commonwealth Scientific and Industrial Research Organization (CSIRO), Clayton South, VIC 3169, Australia | |
| Keywords: vision transformers; deep learning; re-ID; image retrieval; multi-camera surveillance system; pedestrian identification | |
| DOI : 10.3390/app11199197 | |
| Source: DOAJ | |
【 Abstract 】
Person re-identification (re-ID) is an essential task in computer vision, particularly in surveillance applications. The aim is to identify a person in surveillance photographs, captured in various scenarios, based on an input image. Most person re-ID techniques utilize Convolutional Neural Networks (CNNs); however, vision transformers are replacing pure CNNs in various computer vision tasks such as object recognition and classification. Vision transformers contain information about local regions of the image, and current techniques exploit this property to improve accuracy on the task at hand. We propose to use vision transformers in conjunction with vanilla CNN models to investigate the true strength of transformers in person re-identification. We employ three backbones with different combinations of vision transformers on two benchmark datasets. The overall performance of the backbones increased, showing the importance of vision transformers. We also provide ablation studies that show the importance of the various components of vision transformers in re-identification tasks.
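The retrieval step that re-ID relies on can be illustrated with a minimal sketch. It assumes that some backbone (CNN, vision transformer, or a combination) has already mapped each image to a fixed-length embedding; the abstract does not specify the matching metric, so cosine similarity is used here as a common choice, and the function names, identifiers, and toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return (image_id, score) pairs sorted from best to worst match."""
    scored = [(image_id, cosine_similarity(query, emb)) for image_id, emb in gallery]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy embeddings (assumed values, not taken from the paper): two images of
# person A seen by different cameras, and one image of person B.
gallery = [
    ("person_A_cam1", [0.9, 0.1, 0.0]),
    ("person_B_cam2", [0.0, 1.0, 0.2]),
    ("person_A_cam3", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.1, 0.0]  # embedding of the input (query) image

ranking = rank_gallery(query, gallery)
for image_id, score in ranking:
    print(f"{image_id}: {score:.3f}")
```

In this toy setup both gallery images of person A rank above the image of person B, which is the behaviour a re-ID system aims for across cameras; the quality of the ranking in practice depends entirely on the embeddings the backbone produces.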
【 License 】
Unknown