EMS-YOLO: UAV image object detection based on efficient multi-scale fusion

Affiliation:

Harbin University of Science and Technology

CLC number:

TP391.41
    Abstract:

    With the rapid development of unmanned aerial vehicle (UAV) technology in urban security, traffic monitoring, and emergency response, object detection and recognition in UAV imagery provide reliable technical support for applications across many industries. However, object detection from low-altitude viewpoints remains challenging because small objects are densely distributed, object scale varies widely, and backgrounds are complex. To address these issues, this paper proposes an improved YOLO11-based object detection algorithm for UAV imagery. First, a CSP-MS module is designed to express multi-scale features through hierarchical fusion and heterogeneous convolution structures. Second, a feature-enhanced multi-scale feature aggregation pyramid module is designed, which uses dilated convolutions and a cross-layer fusion mechanism to strengthen the model’s perception of complex scenes. Finally, a lightweight dynamic task-aligned detection head is introduced to reduce the number of model parameters while improving detection accuracy for small objects. On the VisDrone dataset, the model improves mAP0.5 and mAP0.5:0.95 by 10.2% and 6.7%, respectively; on the CODrone dataset, the gains are 5.4% and 3.7%, respectively. The experimental results show that the improved model has clear performance advantages in scenes with small objects, complex backgrounds, and multi-scale targets, indicating strong generalization capability and practical value.
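As a rough illustration of why dilated convolutions can widen a model's view of a scene, the sketch below computes the effective kernel extent of a dilated convolution with the standard relation k_eff = k + (k - 1)(d - 1), and the receptive field of several stride-1 layers applied in sequence. The function names and the sample dilation rates are hypothetical; the abstract does not state which rates or layer arrangement the proposed pyramid module uses.

```python
from typing import Iterable, Tuple


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a single dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    # A dilation of d inserts (d - 1) gaps between adjacent kernel taps.
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers: Iterable[Tuple[int, int]]) -> int:
    """Receptive field of stride-1 convolutions applied one after another.

    `layers` is a sequence of (kernel_size, dilation) pairs. These values
    are illustrative assumptions, not settings taken from the paper.
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer grows the receptive field by (k_eff - 1).
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(f"3x3 kernel, dilation {d}: covers {effective_kernel_size(3, d)} pixels per side")
    print("stacked receptive field:", stacked_receptive_field([(3, 1), (3, 2), (3, 3)]))
```

The parameter count of each 3x3 layer is the same at every dilation rate, so the wider coverage comes at no extra parameter cost, which is one common motivation for pairing dilated convolutions with multi-scale fusion.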

History
  • Received: 2025-09-09
  • Last revised: 2026-01-11
  • Accepted: 2026-01-12