Adaptive RBF Network Q-Learning Control
DOI:
CSTR:
Author:
Affiliation:

Jiangnan University

Author bio:

徐明亮

Corresponding author:

CLC number:

Fund project:


Q-learning Control Based on Self-organizing RBF Network

Abstract:

An RBF network is used to approximate the Q-value function in reinforcement learning, realizing Q-learning over continuous states and continuous actions. The input of the RBF network is the state-action pair, and the output is the corresponding Q-value. The state in the network input is determined by the state-transition characteristic of the system, while the action in the input is the sum of the greedy action recommended by the network and a noise action following a Gaussian distribution. The greedy action is obtained by optimizing the Q-value function output by the network. The structure and parameters of the network are adjusted adaptively using the RNA algorithm and gradient descent. Simulation results on the balance control of an inverted pendulum verify the effectiveness of the method.

    Abstract:

The Q-value function is approximated with an RBF (radial basis function) neural network to generalize the information learned by the agent in continuous state and action spaces. The input of the RBF network is the state-action pair, and the output is the Q-value of that pair. The state is determined by the transfer characteristic of the system. The action component of the input is the sum of the greedy action, which is obtained by optimizing the Q-value output of the RBF network, and a noise action following a normal distribution. The structure and parameters of the network are adjusted with the RNA algorithm and the gradient descent algorithm. The effectiveness of the proposed Q-learning method is shown through simulation of the balancing control of a cart-pole system.
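The scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the RNA structure-learning step is omitted, the RBF centers, widths, learning rates, and the discretized action search are illustrative assumptions, and a simple grid search stands in for the paper's continuous optimization of Q over the action.

```python
import numpy as np

class RBFQ:
    """Hypothetical sketch: an RBF network that maps a state-action pair to a Q-value."""

    def __init__(self, centers, sigma):
        self.centers = centers           # (n_units, state_dim + action_dim), assumed fixed here
        self.sigma = sigma               # common width for all Gaussian units (assumption)
        self.w = np.zeros(len(centers))  # output weights, tuned by gradient descent

    def phi(self, s, a):
        # Feature vector: Gaussian activations of the joint state-action input.
        x = np.concatenate([s, np.atleast_1d(a)])
        d2 = ((self.centers - x) ** 2).sum(axis=1)
        return np.exp(-d2 / (2 * self.sigma ** 2))

    def q(self, s, a):
        return self.w @ self.phi(s, a)

    def greedy_action(self, s, action_grid):
        # Crude stand-in for optimizing Q over a continuous action.
        return max(action_grid, key=lambda a: self.q(s, a))

    def td_update(self, s, a, r, s_next, action_grid, alpha=0.1, gamma=0.95):
        # One gradient-descent step on the output weights toward the TD target.
        target = r + gamma * max(self.q(s_next, b) for b in action_grid)
        delta = target - self.q(s, a)
        self.w += alpha * delta * self.phi(s, a)
        return delta

rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, size=(20, 3))   # 2-D state + 1-D action (toy dimensions)
net = RBFQ(centers, sigma=0.5)
grid = np.linspace(-1, 1, 11)

s = np.array([0.1, -0.2])
# Executed action = greedy action recommended by the network + Gaussian exploration noise.
a = net.greedy_action(s, grid) + rng.normal(0.0, 0.1)
delta = net.td_update(s, a, r=1.0, s_next=np.array([0.05, -0.1]), action_grid=grid)
```

Because the weights start at zero, the first TD error equals the immediate reward, and the update raises the Q-value of the visited state-action pair, as expected for this kind of approximator.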

Cite this article:

徐明亮, 须文波. 自适应RBF网络Q学习控制[J]. 控制与决策, 2010, 25(2): 303-306.

History
  • Received: 2009-03-24
  • Revised: 2009-06-01
  • Accepted:
  • Online: 2010-02-20
  • Published: