IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
On the Effectiveness of Weakly Supervised Semantic Segmentation for Building Extraction From High-Resolution Remote Sensing Imagery
Xueliang Zhang [1], Zixian Zheng [1], Zhenshi Li [1], Pengfeng Xiao [1]
[1] Jiangsu Provincial Key Laboratory of Geographic Information Science and Technology, Key Laboratory for Land Satellite Remote Sensing Applications of Ministry of Natural Resources, School of Geography and Ocean Science, Nanjing University, Nanjing, China
Keywords: Building extraction; fully convolutional network; high-resolution remote sensing imagery; weakly supervised semantic segmentation (WSSS)
DOI: 10.1109/JSTARS.2021.3063788
Source: DOAJ
【Abstract】
A critical obstacle to semantic segmentation of remote sensing images with deep convolutional neural networks is the requirement for a huge number of pixel-level labels. Taking building extraction as an example, this study focuses on how to effectively apply weakly supervised semantic segmentation (WSSS) with image-level labels to high-resolution remote sensing (HR) images, a prominent solution to the labeling challenge. The widely used two-step WSSS framework is adopted: pseudo-masks are first produced from image-level labels, and a segmentation network is then trained on those pseudo-masks. In addition, a fully connected conditional random field (CRF) is used to exploit spatial context in both the training and prediction stages. Detailed analyses of applying WSSS to HR images are conducted in terms of producing pseudo-masks, training the segmentation network, and optimizing predictions. We show that the tradeoff between precision and recall of the pseudo-masks, as well as the boundary accuracy and the background, needs to be carefully considered. The benefits of the segmentation network in the two-step framework are demonstrated in comparison with using a classification network alone for WSSS, and CRF-loss is found to be powerful for improving the segmentation network, although it is not appropriate for dense buildings. An overlapping strategy and CRF postprocessing are further shown to be effective for optimizing the segmentation results during inference. With deliberate settings, we can generate results comparable to fully supervised ones on the ISPRS Potsdam and Vaihingen datasets, which is meaningful for promoting WSSS applications for extracting geographic information from HR images.
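The precision/recall tradeoff of pseudo-masks mentioned in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that a pseudo-mask is obtained by thresholding a per-pixel building activation map in [0, 1]; the abstract does not state how the pseudo-masks are actually produced. The function names `pseudo_mask` and `precision_recall` and the toy 4x4 values are hypothetical and not taken from the paper.

```python
import numpy as np


def pseudo_mask(activation, threshold):
    """Binarize an activation map into a boolean pseudo-mask (assumed procedure)."""
    return activation >= threshold


def precision_recall(pred, truth):
    """Pixel-wise precision and recall of a boolean mask against a reference mask."""
    tp = np.logical_and(pred, truth).sum()
    precision = tp / pred.sum() if pred.sum() else 0.0
    recall = tp / truth.sum() if truth.sum() else 0.0
    return float(precision), float(recall)


# Synthetic reference: a 2x2 building in the middle of a 4x4 tile.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True

# Synthetic activation: strong on the building, plus one spurious background pixel.
activation = np.full((4, 4), 0.1)
activation[1:3, 1:3] = [[0.9, 0.7], [0.7, 0.5]]
activation[0, 1] = 0.6

for t in (0.4, 0.65, 0.8):
    p, r = precision_recall(pseudo_mask(activation, t), truth)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

In this toy case, raising the threshold removes the spurious background pixel (higher precision) but also drops weakly activated building pixels (lower recall), which is the kind of tradeoff the abstract says must be weighed when producing pseudo-masks.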
【License】
Unknown