Real-Time Semantic Segmentation of UAV Urban Street Scenes with Attention Permutation and Channel Reconstruction

doi:10.13195/j.kzyjc.2024.0493

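Affiliation: College of Measurement and Control Technology and Communication Engineering, Harbin University of Science and Technology
CLC number: TP391.4
Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)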


Abstract:

Lightweight algorithms for real-time semantic segmentation of UAV urban street scenes lack global information interaction, which leads to misclassified pixels. To address this, a real-time semantic segmentation network with attention permutation and channel reconstruction is proposed; the network adopts an encoder-decoder structure. In the encoder, a lightweight permutation self-attention mechanism is used to construct an attention branch that extracts global context information while maintaining high computational efficiency. Following a split-transform-merge strategy, a channel reconstruction module is designed to fuse and compress the input of the attention branch, reducing both the computation spent on irrelevant features and their influence on the segmentation results. In the decoder, a spatial feature fusion block is constructed using spatial weighting to make the fullest possible use of effective features, and a global information perception module built from permutation self-attention and asymmetric convolution counters the interference of complex backgrounds in UAV aerial images. Experimental results show that the proposed model reaches a mean intersection over union (mIoU) of 72.3% on the UAVid validation set, an improvement of 2.3% over UNetFormer, at a segmentation speed of 105.8 frames per second. The model thus achieves good segmentation accuracy while preserving segmentation speed.
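The mean intersection over union (mIoU) cited in the abstract is conventionally obtained by computing the IoU of each class from a confusion matrix and averaging over classes. The following is a minimal sketch of that standard definition, assuming flat sequences of integer class labels; the function names `confusion_matrix` and `mean_iou` and the toy labels are hypothetical illustrations and are not taken from the authors' evaluation code.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    matrix = confusion_matrix(pred, target, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # class c pixels predicted as something else
        fp = sum(row[c] for row in matrix) - tp   # other pixels predicted as class c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    if not ious:
        raise ValueError("no class occurs in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Toy example with three classes and six "pixels" (assumed data).
    target = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    print(f"mIoU = {mean_iou(pred, target, 3):.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is 0.5. In practice the confusion matrix would be accumulated over every pixel of every validation image before the per-class IoUs are averaged.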

History

Received: 2024-04-27
Last revised: 2024-10-05
Accepted: 2024-10-06
Published online: 2024-10-15
