Abstract: Detection precision plays a critical role in computer vision tasks. To address the limited precision of the one-stage object detection model YOLOv5, this paper proposes an enhanced YOLOv5 model with self-adaptive loss weights. The weights are derived from clusters of predicted bounding boxes that represent individual targets across multi-resolution feature maps, and they are used to optimize the multi-task loss. The enhanced model consists of a ground-truth (GT) target bounding box UID distributor, a GT target bounding box UID matcher, a bounding box position loss weight algorithm, and a classification loss weight algorithm. Overall detection precision improves because both the position precision and the classification precision of YOLOv5 are enhanced. Experimental results show that, compared with YOLOv5.6, the enhanced model raises the mean average precision (mAP) by a relative 5.23% on average, and that it achieves a relative improvement of 8.02% over the more complex YOLOv5x6 model.