Guided feature fusion algorithm for small object detection based on cross-domain dual-stream network

Affiliation:

1. Engineering Research Center of the Ministry of Education for Intelligent Control System and Intelligent Equipment, Yanshan University; 2. Key Laboratory of Industrial Computer Control Engineering of Hebei Province, Yanshan University; 2. School of Electrical Engineering, Yanshan University

CLC number:

TP391.4

Fund Project:

National Natural Science Foundation of China (61573305); Natural Science Foundation of Hebei Province (F2022203038, F2019203511); Provincial Key Laboratory Performance-Based Subsidy Funding Program (22567612H)

    Abstract:

    Objects in UAV aerial images are typically small, densely distributed, and lacking in clear texture detail. To address these problems, this paper proposes a guided feature fusion algorithm for small object detection based on a cross-domain dual-stream network. First, a spatial–frequency collaborative dual-stream architecture is introduced in the backbone, with spatial-domain and frequency-domain feature extraction pathways built in parallel. The spatial stream focuses on capturing local detail features, while the frequency stream incorporates an edge and frequency enhancement module. This module divides features into three frequency bands via a frequency transform and dynamic Gaussian masks, and uses a context-aware gating mechanism to adaptively enhance the features of each band, thereby improving the network's global context perception. Next, for cross-domain feature fusion, an adaptive spatial–frequency collaborative fusion module integrates the two streams through dynamic weight allocation. Finally, a guided three-branch fusion module is adopted in the neck network: guided by the main-branch features, it adaptively integrates semantic and detail information from the upsampled, main-branch, and cross-layer features, alleviating semantic discrepancies between multi-scale features. Experiments on two public datasets, VisDrone2019 and TinyPerson, verify the effectiveness of the proposed method.
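    The abstract does not give the internals of the edge and frequency enhancement module, so the following is only a minimal NumPy sketch of the general idea of a three-band split with Gaussian masks followed by per-band reweighting. Everything specific here is an assumption made for illustration: the function names (`gaussian_band_masks`, `split_three_bands`, `reweight`), the use of a 2-D FFT as the frequency transform, the fixed widths `sigma_low` and `sigma_high` (the paper describes the masks as dynamic), and the plain scalar gates that stand in for the context-aware gating mechanism.

```python
import numpy as np


def gaussian_band_masks(h, w, sigma_low, sigma_high):
    """Build low/mid/high masks on a centred 2-D spectrum.

    The three masks sum to 1 at every frequency, so the bands form an
    exact decomposition. Requires sigma_high > sigma_low.
    """
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    r2 = fx ** 2 + fy ** 2                      # squared radial frequency
    low = np.exp(-r2 / (2 * sigma_low ** 2))    # narrow Gaussian: low band
    low_plus_mid = np.exp(-r2 / (2 * sigma_high ** 2))
    mid = low_plus_mid - low                    # ring between the two Gaussians
    high = 1.0 - low_plus_mid                   # everything that remains
    return low, mid, high


def split_three_bands(feat, sigma_low=0.05, sigma_high=0.2):
    """Split one 2-D feature map into low, mid, and high frequency parts."""
    h, w = feat.shape
    spec = np.fft.fftshift(np.fft.fft2(feat))
    bands = []
    for mask in gaussian_band_masks(h, w, sigma_low, sigma_high):
        # The masks are real and radially symmetric, so the inverse
        # transform is real up to rounding error.
        bands.append(np.fft.ifft2(np.fft.ifftshift(spec * mask)).real)
    return bands


def reweight(bands, gates):
    """Recombine the bands with one scalar gate per band (hypothetical)."""
    return sum(g * b for g, b in zip(gates, bands))


rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 16))
low, mid, high = split_three_bands(feat)
enhanced = reweight([low, mid, high], gates=[1.0, 1.5, 2.0])
```

    With all gates set to 1 the bands add back up to the input feature map, so gates above 1 amplify a band and gates below 1 suppress it. In a trained network the gates and mask widths would presumably be predicted from the features rather than fixed by hand.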

History
  • Received: 2025-11-25
  • Last revised: 2026-04-03
  • Accepted: 2026-04-05
  • Published online: 2026-04-14