【 Abstract 】
Scale variation is a persistent challenge in general object detection. Many current methods employ feature pyramid networks to mitigate the problems caused by the wide range of object scales: multi-level features extracted from the backbone are upsampled top-down and fused to obtain a set of multi-scale deep image features. However, the feature pyramid network proposed by Lin et al. adopts a simple fusion method that does not consider the context of the features being fused, which makes it difficult to obtain good features. In addition, directly fusing multi-scale features with traditional upsampling is prone to feature misalignment and loss of detail. To alleviate these problems, this paper proposes an adaptive feature pyramid network, built on the feature pyramid network, with two major designs: adaptive feature upsampling and adaptive feature fusion. Adaptive feature upsampling uses a model to predict a group of sampling points for each pixel and forms the pixel's feature representation by combining the features at those sampling points. Adaptive feature fusion uses an attention mechanism to construct pixel-level fusion weights between the features being fused. Experimental results verify the effectiveness of the proposed method. On the public object detection dataset MS-COCO test-dev, the adaptive feature pyramid network improves the Faster R-CNN model by 1.2 AP and the FCOS model by 1.0 AP. The experiments also show that the proposed adaptive feature pyramid network localizes objects more accurately.
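The two designs can be illustrated with a minimal pure-Python sketch on single-channel maps. This is not the paper's implementation: the function names, the list-of-rows map layout, and the choice of bilinear sampling and a two-way softmax are assumptions made for illustration. The sampling offsets, combination weights, and attention scores are passed in as plain inputs here, whereas in the described method they would be predicted by learned layers.

```python
import math


def bilinear_sample(feat, y, x):
    """Sample a 2D map (list of rows) at fractional (y, x), clamped to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def adaptive_upsample(coarse, offsets, weights, scale=2):
    """Upsample `coarse` by `scale`, combining K sampling points per output pixel.

    offsets[i][j] holds K (dy, dx) pairs in coarse-map units and weights[i][j]
    holds K combination weights (assumed inputs; a network would predict them).
    """
    h, w = len(coarse) * scale, len(coarse[0]) * scale
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Centre of output pixel (i, j) mapped into coarse coordinates.
            cy = (i + 0.5) / scale - 0.5
            cx = (j + 0.5) / scale - 0.5
            out[i][j] = sum(
                wk * bilinear_sample(coarse, cy + dy, cx + dx)
                for (dy, dx), wk in zip(offsets[i][j], weights[i][j])
            )
    return out


def adaptive_fuse(a, b, score_a, score_b):
    """Fuse two same-sized maps with per-pixel softmax weights over their scores."""
    fused = []
    for i in range(len(a)):
        row = []
        for j in range(len(a[0])):
            m = max(score_a[i][j], score_b[i][j])  # subtract max for stability
            ea = math.exp(score_a[i][j] - m)
            eb = math.exp(score_b[i][j] - m)
            wa = ea / (ea + eb)
            row.append(wa * a[i][j] + (1 - wa) * b[i][j])
        fused.append(row)
    return fused


coarse = [[1.0, 3.0], [5.0, 7.0]]
# One sampling point with zero offset reduces to plain bilinear upsampling.
offsets = [[[(0.0, 0.0)] for _ in range(4)] for _ in range(4)]
weights = [[[1.0] for _ in range(4)] for _ in range(4)]
up = adaptive_upsample(coarse, offsets, weights)

fine = [[0.0] * 4 for _ in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
fused = adaptive_fuse(up, fine, zeros, zeros)  # equal scores give a plain average
```

With a single zero-offset sampling point the upsampling step falls back to ordinary bilinear interpolation, and with equal scores the fusion step falls back to averaging. The adaptive behaviour comes entirely from letting the offsets, weights, and scores vary per pixel.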
【 License 】
Unknown