Acta Optica Sinica, 2023, 43(24): 2428010; published online 2023-12-08

Remote Sensing Image Segmentation Based on Attention Guidance and Multi-Feature Fusion
Author affiliations
1 Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
2 College of Missile Engineering, Rocket Force University of Engineering, Xi'an 710025, Shaanxi, China
Abstract

Remote sensing images provide high-precision, wide-coverage ground object information and are therefore widely used in fields such as high-altitude reconnaissance and precision guidance. To address the difficulty of accurately segmenting ground objects in remote sensing images caused by blurred target edges and variable scales, we propose a segmentation method, named AMSNet, that uses a deep residual network as the backbone and combines attention guidance with multi-feature fusion. First, a category-guided channel attention module improves the model's sensitivity to hard-to-distinguish regions. Second, an embedded feature reuse module reduces edge loss and the disappearance of small-scale targets during feature extraction. Finally, a cross-region feature fusion module strengthens the acquisition of multi-scale feature information and is coupled with a multi-scale loss fusion module that optimizes the loss function, jointly improving the model's segmentation of multi-scale remote sensing targets. Comparative experiments on three remote sensing image datasets show that AMSNet effectively segments ground object edges and multi-scale targets.
Objective

Remote sensing images offer a large detection range, long-term dynamic monitoring, and rich information content, making the ground feature information they capture comprehensive. By extracting ground object targets from remote sensing images, detailed and accurate ground object information in the imaged area can be obtained, providing data support for high-altitude reconnaissance, precision guidance, and terrain matching. However, with the rapid growth of data volume, current target extraction methods, with their low levels of intelligence and automation, struggle to meet demand. Traditional image extraction techniques include edge detection, threshold segmentation, and region segmentation. These methods segment remote sensing targets with distinct contour boundaries well but cannot adaptively adjust to complex and highly variable targets. Convolutional neural networks offer stronger representation ability, scalability, and robustness than traditional methods by capturing multi-level semantic information in images. However, because ground objects in remote sensing images are unevenly distributed, have blurred edges, and vary in scale, convolutional neural networks are prone to losing edge information and multi-scale feature information during feature extraction. In addition, cloud cover over targets in complex scenes exacerbates the loss of edge and multi-scale information, making accurate segmentation of remote sensing ground objects even harder. To solve these problems, we propose a segmentation method that uses a deep residual network as the backbone and combines attention guidance with multi-feature fusion to enhance the network's ability to segment ground object edges and multi-scale objects in remote sensing images.

Methods

We propose AMSNet, a remote sensing image semantic segmentation network that combines attention guidance and multi-feature fusion. In the encoder, D_Resnet50 serves as the backbone network to extract the main feature information from remote sensing images, enhancing the acquisition of detailed information such as edges and small-scale targets. A category-guided channel attention module is inserted into the backbone to strengthen segmentation of hard-to-distinguish and irregularly shaped areas. A feature reuse module is added to the backbone to counter the loss of edge detail and the disappearance of scattered small-scale targets during feature extraction. In the decoder, a cross-region feature fusion module fuses multi-level feature information, improving the acquisition of multi-scale target information, and a multi-scale loss fusion module further enhances segmentation of multi-scale targets.
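Channel attention modules of this kind generally squeeze each channel to a scalar, pass the scalars through a small bottleneck, and rescale the channels by the resulting gates. The dependency-free sketch below shows a generic squeeze-and-excitation-style gate; the toy weight shapes are illustrative assumptions, and the paper's category-guidance mechanism is not reproduced here.

```python
import math

def channel_attention(feature_maps, w1, w2):
    """Generic SE-style channel gating (illustrative, not the paper's module).

    feature_maps: list of C channels, each an HxW list of lists of floats.
    w1: hidden-layer weights, one length-C column per hidden unit.
    w2: output-layer weights, one column per output channel gate.
    """
    # Squeeze: global average pool, one scalar per channel
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_maps]
    # Excite: C -> hidden (ReLU) -> C (sigmoid), producing gates in (0, 1)
    hidden = [max(0.0, sum(s * w for s, w in zip(squeezed, col))) for col in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(h * w for h, w in zip(hidden, col))))
             for col in w2]
    # Rescale: multiply every pixel of each channel by that channel's gate
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feature_maps, gates)]
```

In a trained network the two weight matrices are learned, so informative channels receive gates near 1 and noisy channels are suppressed toward 0.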

Results and Discussions

Experimental results on the remote sensing image dataset of the plateau region, with and without cloud interference, show that the proposed network outperforms other semantic segmentation networks in either condition (Table 6 and Table 7) and that its performance is less affected by cloud interference. Even under cloud interference, the segmentation accuracy of ground targets drops by only 1.10 percentage points in mIoU, 0.58 percentage points in mPA, and 0.71 percentage points in mF1 relative to the cloud-free case, smaller drops than those suffered by the other semantic segmentation networks under the same cloud conditions. In addition, to verify the generalization of AMSNet, the International Society for Photogrammetry and Remote Sensing (ISPRS) dataset of the Vaihingen region of Germany is selected. To better fit the image size, the number of grouped convolutions in the feature reuse module of AMSNet is reduced to four groups. The experimental results in Table 8 show that the network still outperforms the others: relative to PspNet and OCNet, mIoU increases by 5.09 and 5.57 percentage points, respectively, and relative to Deeplabv3+, mIoU, mPA, and mF1 increase by 3.47, 3.56, and 2.78 percentage points, respectively. The segmentation maps in Fig. 8 show that the network has a lower error rate, fewer omissions, and more accurate boundaries for building edges and small-scale cars than the other networks.
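The scores above are reported as mIoU, mPA, and mF1. For reference, mean IoU is computed from a class confusion matrix as the average of per-class intersection-over-union; the helper below is a generic metric sketch, not code from the paper.

```python
def mean_iou(conf):
    """Mean intersection-over-union from a confusion matrix.

    conf[i][j] = number of pixels with true class i predicted as class j.
    IoU per class c = TP / (TP + FP + FN); classes absent from both
    prediction and ground truth are skipped.
    """
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true other
        fn = sum(conf[c]) - tp                       # true c, predicted other
        denom = tp + fp + fn
        if denom:
            ious.append(tp / denom)
    return sum(ious) / len(ious)
```

mPA (mean pixel accuracy) and mF1 are built from the same confusion matrix, averaging per-class recall and per-class F1 score, respectively.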

Conclusions

We propose AMSNet, a network model based on an encoder-decoder structure. In the encoding part, the D_Resnet50 network serves as the backbone to extract the main feature information of remote sensing images. A category-guided channel attention module reduces the interference of channel noise on segmented objects and improves segmentation of targets in hard-to-distinguish areas, and an embedded feature reuse module compensates for target edge loss and small-scale target loss during feature extraction. In the decoding part, a cross-region feature fusion module integrates multi-level features, combined with a multi-scale loss fusion module that computes the feature loss at different scales, to improve segmentation of multi-scale targets. Experiments are conducted on the remote sensing image dataset of the plateau region, the same dataset under cloud interference, and a public dataset. Compared with semantic segmentation networks such as BiseNetv2, PspNet, and Deeplabv3+, the proposed network achieves better results in the mIoU, mPA, and mF1 evaluation indicators. The visualization results show that the proposed network effectively segments ground object targets in interlaced, hard-to-distinguish areas and scattered multi-scale targets in remote sensing images, with good segmentation performance and robustness under cloud interference.
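The multi-scale loss fusion described above amounts to supervising predictions at several resolutions and combining the per-scale losses into one training signal. In the toy sketch below, the nearest-neighbor label downsampling, the pixel-error loss, and the fusion weights are all illustrative assumptions rather than details from the paper.

```python
def downsample_labels(labels, factor):
    """Nearest-neighbor downsample of a 2D label map by an integer factor."""
    return [row[::factor] for row in labels[::factor]]

def pixel_error(pred, target):
    """Fraction of mismatched pixels between two equally sized label maps."""
    total = sum(len(row) for row in target)
    wrong = sum(p != t
                for pr, tr in zip(pred, target)
                for p, t in zip(pr, tr))
    return wrong / total

def fused_multiscale_loss(preds, labels, factors, weights):
    """Weighted sum of per-scale losses: each prediction is compared with
    the label map downsampled to its resolution, then the losses are fused."""
    losses = [pixel_error(p, downsample_labels(labels, f))
              for p, f in zip(preds, factors)]
    return sum(w * l for w, l in zip(weights, losses))
```

In practice the per-scale loss would be a cross-entropy over class logits and the weights would be tuned or learned; the fusion structure, however, is the same weighted sum.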

Yinhui Zhang, Feng Zhang, Zifen He, Xiaogang Yang, Ruitao Lu, Guangchen Chen. Remote Sensing Image Segmentation Based on Attention Guidance and Multi-Feature Fusion[J]. Acta Optica Sinica, 2023, 43(24): 2428010.
