2025, Vol. 33, No. 2: 34-47
Detection Method for Complex Traffic Scenes Based on YOLO-NPDL
Foundation: Graduate Science and Technology Innovation Project of Shandong Jiaotong University (2024YK001)
Email: qcxzhang@126.com
Abstract:

To improve the detection accuracy of vehicle object detection models in complex traffic scenes, a model is built on the YOLOv8n (you only look once version 8 nano) baseline. A Neck-ARW neck structure with a composite backbone (comprising an Auxiliary detection branch, RepBlock modules, and Weighted skip feature connections) is designed to reduce the information loss along the network depth caused by the information bottleneck. The RepBlock structural re-parameterization module uses a multi-branch structure during training to strengthen feature extraction. A P2 detection layer is added to capture more fine-grained features of small objects and enrich the small-object feature information flow within the network. A Dynamic Head detection head fuses scale-aware, spatial-aware, and task-aware self-attention into a unified framework to improve detection performance. Finally, the layer-adaptive magnitude-based pruning (LAMP) algorithm removes redundant parameters, yielding the YOLO-NPDL (Neck-ARW, P2, Dynamic Head, LAMP) vehicle object detection model. On the UA-DETRAC (University at Albany detection and tracking) dataset, experiments on RepBlock embedding position, neck-structure comparison, pruning, ablation, and overall model comparison verify the mean average precision of YOLO-NPDL.

The experimental results show that: embedding RepBlock modules in both the auxiliary detection branch and the neck trunk yields the best multi-scale feature extraction and retains more detail during training, at the cost of higher parameter count and computation; with the Neck-ARW neck, mAP50 and mAP50-95 rise by 1.1% and 1.7% respectively while the parameter count falls by about 17.9%; at a pruning rate of 1.3, parameters and computation shrink by about 38.0% and 24.0% respectively, leaving fewer redundant channels and a more compact structure; compared with YOLOv8n, and with essentially the same parameter count, YOLO-NPDL improves recall by 2.7%, mAP50 by 2.7% to 94.7%, and mAP50-95 by 6.4% to 79.7%; compared with widely used YOLO-series models, YOLO-NPDL attains higher accuracy with fewer parameters. In real complex traffic scenes, including distant targets, rain, and night, YOLO-NPDL shows no obvious false or missed detections and picks up more distant small vehicles.
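The weighted skip feature connections in Neck-ARW are described only at the level of the abstract. A common realization of weighted feature fusion, the BiFPN-style "fast normalized fusion", learns one non-negative scalar per input branch and normalizes the weights before summing. The following NumPy sketch is illustrative and is not the paper's exact implementation:

```python
import numpy as np

def weighted_fusion(features, raw_weights, eps=1e-4):
    """BiFPN-style fast normalized fusion: each input feature map gets a
    learnable scalar weight, clipped to be non-negative and normalized so
    the weights sum to (approximately) 1."""
    w = np.maximum(raw_weights, 0.0)   # ReLU keeps weights non-negative
    w = w / (w.sum() + eps)            # normalize so contributions are comparable
    return sum(wi * f for wi, f in zip(w, features))

# Two same-shape feature maps, e.g. a skip connection and an upsampled map.
a = np.ones((1, 8, 8))
b = np.full((1, 8, 8), 3.0)
fused = weighted_fusion([a, b], np.array([1.0, 1.0]))  # ~0.5/0.5 blend
```

In a network the raw weights would be trainable parameters updated by backpropagation; here they are fixed scalars for demonstration.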
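RepBlock's structural re-parameterization follows the RepVGG idea: train with parallel 3x3, 1x1, and identity branches, then fold all three into a single 3x3 convolution for inference, so deployment pays no multi-branch cost. A single-channel NumPy sketch (batch-norm folding omitted for brevity) showing that the folded kernel is numerically equivalent:

```python
import numpy as np

def conv3x3(x, k):
    """Naive single-channel 3x3 convolution, zero padding 1, stride 1."""
    xp = np.pad(x, 1)
    out = np.zeros_like(x, dtype=float)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + 3, j:j + 3] * k)
    return out

def merge_branches(k3, k1, with_identity=True):
    """Fold a 1x1 branch and an identity branch into the 3x3 kernel."""
    merged = k3.copy()
    merged[1, 1] += k1       # a 1x1 kernel sits at the centre of a 3x3
    if with_identity:
        merged[1, 1] += 1.0  # identity = 3x3 kernel with centre weight 1
    return merged

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6))
k3 = rng.standard_normal((3, 3))
k1 = rng.standard_normal()  # scalar 1x1 kernel

train_out = conv3x3(x, k3) + k1 * x + x         # three-branch (training) form
deploy_out = conv3x3(x, merge_branches(k3, k1))  # single-conv (deploy) form
```

Because convolution is linear, the two forms agree exactly; the real RepVGG/RepBlock merge additionally folds each branch's batch normalization into the kernel and bias before summing.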
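LAMP scores each weight by its squared magnitude divided by the sum of squared magnitudes of all weights in the same layer that are at least as large (Lee et al., 2021); pruning with a single global threshold on these scores then produces layer-adaptive per-layer sparsity instead of a uniform cut. A NumPy sketch of the scoring and global masking, illustrative rather than the authors' code:

```python
import numpy as np

def lamp_scores(weights):
    """LAMP score: w_i^2 / sum of w_j^2 over all weights in the layer
    with |w_j| >= |w_i| (the weights that survive magnitude pruning of w_i)."""
    flat_sq = weights.flatten() ** 2
    sq = np.sort(flat_sq)                     # ascending squared magnitudes
    suffix = np.cumsum(sq[::-1])[::-1]        # suffix[i] = sum(sq[i:])
    scores_sorted = sq / suffix
    scores = np.empty_like(scores_sorted)
    scores[np.argsort(flat_sq)] = scores_sorted  # back to original order
    return scores.reshape(weights.shape)

def global_prune_mask(layers, sparsity):
    """Keep the (1 - sparsity) fraction of weights with the highest LAMP
    scores across ALL layers; per-layer sparsity falls out adaptively."""
    all_scores = np.concatenate([lamp_scores(w).flatten() for w in layers])
    k = int(len(all_scores) * sparsity)
    thresh = np.partition(all_scores, k)[k] if k > 0 else -np.inf
    return [lamp_scores(w) >= thresh for w in layers]

rng = np.random.default_rng(1)
layers = [rng.standard_normal(20), rng.standard_normal(20)]
masks = global_prune_mask(layers, 0.5)  # prune half the weights globally
```

Note that the "pruning rate of 1.3" reported in the abstract is a speed-up target used by the pruning tool, not a sparsity fraction; this sketch parameterizes by sparsity for simplicity.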

Article information:

CLC classification: U495; TP183; TP391.41

Citation:

ZHANG Haochen, ZHANG Zhulin, SHI Ruiyan, et al. Detection method for complex traffic scenes based on YOLO-NPDL[J]. Journal of Shandong Jiaotong University, 2025, 33(2): 34-47.

