Frontiers in Marine Science
MSGNet: multi-source guidance network for fish segmentation in underwater videos
Yuanshan Lin1, Sixue Wei1, Junfeng Wu1, Peng Zhang1, Xin Zhang1, Hong Yu1, Haiqing Li1, Wan Tu1, Zongyi Yang1
[1] College of Information Engineering, Dalian Ocean University, Dalian, China; Dalian Key Laboratory of Smart Fisheries, Dalian Ocean University, Dalian, China; Key Laboratory of Facility Fisheries, Ministry of Education (Dalian Ocean University), Dalian, China; Liaoning Provincial Key Laboratory of Marine Information Technology, Dalian Ocean University, Dalian, China
Keywords: computer vision; underwater video processing; MSGNet; fish segmentation; optical flow; co-attention
DOI: 10.3389/fmars.2023.1256594
Received: 2023-07-11; accepted: 2023-09-04; published: 2023
Source: Frontiers
【 Abstract 】
Fish segmentation in underwater videos provides basic data for fish measurements, which is vital information supporting fish habitat monitoring and fishery resource surveys. However, because of water turbidity and insufficient lighting, fish segmentation in underwater videos suffers from low accuracy and poor robustness. Most previous work has used static fish appearance information while ignoring fish motion in underwater videos. Considering that motion contains additional detail, this paper proposes a method that combines appearance and motion information to jointly guide fish segmentation in underwater videos. First, underwater videos are preprocessed to highlight fish in motion and to obtain high-quality underwater optical flow. Then, a multi-source guidance network (MSGNet) is presented to segment fish in complex underwater videos with degraded visual features. To enhance both fish appearance and motion information, a non-local-based multiple co-attention guidance module (M-CAGM) is applied in the encoder stage, in which the appearance and motion features from the intra-frame salient fish and the moving fish in video sequences are reciprocally enhanced. In addition, a feature adaptive fusion module (FAFM) is introduced in the decoder stage to avoid errors accumulating across video sequences due to blurred fish or inaccurate optical flow. Experiments on three publicly available datasets were designed to test the performance of the proposed model. The mean pixel accuracy (mPA) and mean intersection over union (mIoU) of MSGNet were 91.89% and 88.91%, respectively, on the mixed dataset. Compared with advanced underwater fish segmentation and video object segmentation models, the proposed model significantly improved mPA and mIoU. The results show that MSGNet achieves excellent segmentation performance in complex underwater videos and can provide an effective segmentation solution for fisheries resource assessment and ocean observation.
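The mutual enhancement of appearance and motion features described for the M-CAGM follows the standard non-local co-attention pattern: an affinity matrix between all positions of the two feature maps, with each stream re-weighted by attention over the other. A minimal NumPy sketch of that idea (function names, shapes, and the residual connection are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(appearance, motion, w):
    """Reciprocally enhance appearance and motion features (co-attention sketch).

    appearance, motion: (C, N) feature matrices, N = H*W flattened positions.
    w: (C, C) learnable weight matrix of the affinity computation.
    Returns the enhanced (appearance, motion) pair, each still (C, N).
    """
    # Affinity between every appearance position and every motion position.
    s = appearance.T @ w @ motion                       # (N, N)
    # Each motion position aggregates appearance features, and vice versa.
    motion_from_app = appearance @ softmax(s, axis=0)   # (C, N)
    app_from_motion = motion @ softmax(s, axis=1).T     # (C, N)
    # Residual connections keep the original signal in each stream.
    return appearance + app_from_motion, motion + motion_from_app
```

In a trained network `w` would be learned and the features would come from encoder stages; here it only illustrates how the two sources guide each other.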
The proposed model and code are available on GitHub1.
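The reported metrics, mPA and mIoU, are per-class averages of pixel accuracy and intersection over union. A self-contained sketch of how they are conventionally computed (the paper does not publish its evaluation code, so this is the standard definition, not the authors' script):

```python
import numpy as np

def mpa_miou(pred, gt, num_classes):
    """Mean pixel accuracy and mean IoU over classes present in the ground truth.

    pred, gt: integer label maps of identical shape.
    """
    pa, iou = [], []
    for c in range(num_classes):
        pred_c, gt_c = (pred == c), (gt == c)
        if not gt_c.any():
            continue  # skip classes absent from the ground truth
        inter = (pred_c & gt_c).sum()
        pa.append(inter / gt_c.sum())            # per-class pixel accuracy
        iou.append(inter / (pred_c | gt_c).sum())  # per-class IoU
    return float(np.mean(pa)), float(np.mean(iou))
```

For binary fish segmentation, `num_classes` is 2 (background and fish), and the two returned values correspond to the mPA and mIoU figures quoted in the abstract.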
【 License 】
Unknown
Copyright © 2023 Zhang, Yu, Li, Zhang, Wei, Tu, Yang, Wu and Lin
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| RO202310126041861ZK.pdf | 10279 KB | PDF | download |