Reinforcement learning-based energy consumption prediction model with multi-source environmental perception

Control and Decision, 2026, Vol. 41, No. 6, pp. 1731-1742. DOI: 10.13195/j.kzyjc.2025.1052

CLC number: TM715

Reinforcement learning-based energy consumption prediction model with multi-source environmental perception


Abstract:

Existing real-time energy consumption prediction models for electric vehicles suffer from insufficient environmental perception and lack a dynamic calibration mechanism. To address these limitations, this study proposes an energy consumption prediction model that integrates environmental perception with reinforcement learning. First, to strengthen the model's perception and understanding of complex driving conditions, a road-condition perception algorithm is designed in which contrastive learning and coupled reinforcement learning are trained jointly, and a multi-scale image feature fusion mechanism is introduced. This design extracts environmental features highly correlated with vehicle energy efficiency, thereby improving perception accuracy under non-stationary operating conditions. Second, a Markov-based real-time energy efficiency estimation model is constructed and mapped into a reinforcement learning framework. A temporal consistency regularization term based on discounted future energy consumption is introduced, in which the $Q$-function serves solely to evaluate the energy response, to achieve self-calibrating optimization. This significantly enhances prediction robustness and adaptability in dynamic scenarios; the model does not generate control outputs. Meanwhile, a scenario-aware prioritized experience replay mechanism strengthens the model's ability to recognize and learn from key driving conditions such as abrupt slope changes and rapid acceleration or deceleration, further improving feature extraction and generalization in complex environments. Finally, a scenario-aware prioritized sampling strategy optimizes the distribution of training samples, improving the convergence speed and training efficiency of reinforcement learning.
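The abstract does not give the form of the temporal consistency regularizer, so the following is only a minimal sketch under stated assumptions: per-step energy consumption `step_energy`, a discount factor `gamma`, and a one-step squared residual between consecutive $Q$ estimates. The function names and the zero terminal value are hypothetical choices for illustration, not the authors' formulation.

```python
from typing import List


def discounted_future_energy(step_energy: List[float], gamma: float) -> List[float]:
    """Return G_t = e_t + gamma * G_{t+1} for every step, computed backwards."""
    returns = [0.0] * len(step_energy)
    running = 0.0
    for t in range(len(step_energy) - 1, -1, -1):
        running = step_energy[t] + gamma * running
        returns[t] = running
    return returns


def temporal_consistency_penalty(q_values: List[float],
                                 step_energy: List[float],
                                 gamma: float) -> float:
    """Mean squared one-step residual Q_t - (e_t + gamma * Q_{t+1}).

    Assumption: the value after the final step is taken to be 0.
    """
    if len(q_values) != len(step_energy):
        raise ValueError("q_values and step_energy must have the same length")
    if not q_values:
        return 0.0
    total = 0.0
    for t, (q_t, e_t) in enumerate(zip(q_values, step_energy)):
        q_next = q_values[t + 1] if t + 1 < len(q_values) else 0.0
        residual = q_t - (e_t + gamma * q_next)
        total += residual * residual
    return total / len(q_values)


if __name__ == "__main__":
    energy = [1.0, 2.0, 3.0]  # hypothetical per-step consumption values
    targets = discounted_future_energy(energy, gamma=0.5)
    # Estimates equal to the discounted targets are perfectly consistent.
    print(targets, temporal_consistency_penalty(targets, energy, gamma=0.5))
```

In this sketch the penalty is zero exactly when the $Q$ estimates match the discounted future energy consumption, which is the sense in which such a term could act as a self-calibration signal without producing any control output.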
Experimental results show that the proposed method is robust and stable across the two vehicle models tested and multiple simulated driving conditions, achieving an MAE below 0.2%, an RMSE below 0.3%, and an $R^2$ above 99.5%. Compared with existing Transformer, Informer, Mamba, and LSTM models, the proposed approach reduces the average error by approximately 40% to 70% and improves convergence speed by about 30%, yielding significantly higher energy consumption prediction accuracy under complex driving conditions.
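For reference, the three reported metrics can be computed as in the sketch below, which uses the standard definitions of MAE, RMSE, and $R^2$. The abstract does not state how MAE and RMSE are normalized to percentages, so this sketch returns the raw values; the function name `regression_metrics` and the sample data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Tuple[float, float, float]:
    """Return (MAE, RMSE, R^2) using the standard definitions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


if __name__ == "__main__":
    # Hypothetical measured vs. predicted consumption values.
    mae, rmse, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
    print(mae, rmse, r2)
```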

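Cite this article

Peng Ziran, Yang Xiaoyang, Shu Zhongbin. Reinforcement learning-based energy consumption prediction model with multi-source environmental perception[J]. Control and Decision, 2026, 41(6): 1731-1742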

History

Received: 2025-10-11
Published online: 2026-05-13
Publication date: 2026-06-10
