Review of autonomous driving behavior decision-making based on deep reinforcement learning

doi:10.13195/j.kzyjc.2025.0441

Control and Decision (控制与决策), 2026, Vol. 41, No. 2, pp. 305-328.

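CLC number: U471.1
Fund projects: National Key Research and Development Program of China (2023YFB2504704); 2024 Hebei Provincial Social Science Development Research Project (202402302); Hebei Provincial Science and Technology Program, Soft Science Research Special Project (25350801D); Natural Science Foundation of Hebei Province (E2024210032).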



Abstract:

Behavior decision-making is a core technology for vehicle intelligence, and deep reinforcement learning (DRL), which learns through interaction with the environment and supports end-to-end decision-making, has shown considerable potential in this field. This paper systematically reviews the research content and development trends of DRL-based behavior decision-making for autonomous driving from multiple dimensions. First, the development of behavior decision-making is reviewed, and the application trends of DRL in autonomous driving are analyzed. Second, a five-dimensional “state-action-reward-policy-evaluation” framework is proposed to analyze the mapping between algorithmic elements and driving tasks such as car-following and lane-changing. Third, DRL application schemes in uncertain environments are examined through typical scenarios, including ramp merging, intersections, and construction zones. Finally, challenges such as multi-vehicle coordination, long-tail events, and interpretability are identified, and future research directions are proposed. The review shows that, technically, the selection and optimization of DRL algorithms are becoming increasingly diverse, and models are evolving toward multimodal and lightweight designs. In terms of application, the decision-making paradigm is shifting from single-vehicle intelligence to vehicle-road-cloud collaboration, and from functional implementation to human-centered interaction. Overcoming current technical bottlenecks calls for a co-evolution path of algorithm innovation, hardware acceleration, and regulatory adaptation.


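Cite this article

WANG Yunze (王云泽), SUN Yu (孙宇), LUO Zhongbin (骆中斌), et al. Review of autonomous driving behavior decision-making based on deep reinforcement learning[J]. Control and Decision (控制与决策), 2026, 41(2): 305-328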


History

Received: 2025-04-25
Published online: 2026-01-17
Publication date: 2026-02-10
