Feature selection using greedy-like forest optimization algorithm based on scoring mechanism

Affiliation: Yunnan Minzu University

CLC number: TP18

    Abstract:

    Feature selection using forest optimization algorithm (FSFOA) offers good classification performance and dimensionality reduction, but it has three weaknesses: the quality of the initial forest is uneven, local seeding and global seeding are highly random, and expensive fitness evaluation lowers computational efficiency. To address these problems, this paper proposes a feature selection method using a greedy-like forest optimization algorithm based on a scoring mechanism (FSGLFOA-SM). First, a scoring mechanism is built by taking the classification accuracy of each decision variable (feature dimension) as its score. On this basis, a greedy-like initialization strategy generates a higher-quality initial forest, and a greedy-like local seeding strategy based on score comparison gives decision variables with relatively higher scores a larger probability of local seeding. Second, a greedy-like genetic operator seeding strategy is proposed for the global seeding stage: the candidate forest is rebuilt by selecting its better individuals, and genetic inheritance, greedy-like crossover, and mutation are then applied to it, so that feature dimensions with higher scores are more likely to be retained and the classification accuracy of the global seeding stage improves. Finally, to reduce the cost of expensive fitness evaluation, a historical database is established and searched before each fitness evaluation, which avoids recomputing the fitness of duplicate solutions. Experimental results on 13 UCI datasets show that FSGLFOA-SM achieves better classification accuracy and dimension reduction rate than six comparison algorithms.
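
    The abstract names two ideas that are easy to illustrate in isolation: biasing the initial forest toward higher-scored features, and checking a history of evaluated solutions before running the expensive fitness function. The sketch below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not give the selection rule or the database layout, so the probability rule (score divided by the maximum score), the names `greedy_like_init`, `FitnessHistory`, and `toy_fitness`, and the made-up scores are all hypothetical. In the paper the score is a per-feature classification accuracy and the fitness involves training a classifier; here a cheap toy function stands in for both.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Tree = Tuple[int, ...]  # one 0/1 flag per feature; 1 means the feature is selected


def greedy_like_init(scores: Sequence[float], n_trees: int,
                     rng: random.Random) -> List[Tree]:
    """Build an initial forest in which higher-scored features are selected more often.

    Assumption: a feature is selected with probability score / max(score).
    The abstract does not state the rule actually used in the paper.
    """
    top = max(scores)
    probs = [s / top if top > 0 else 0.5 for s in scores]
    return [tuple(1 if rng.random() < p else 0 for p in probs)
            for _ in range(n_trees)]


class FitnessHistory:
    """Look a tree up in the history before paying for a fitness evaluation."""

    def __init__(self, evaluate: Callable[[Tree], float]) -> None:
        self._evaluate = evaluate
        self._seen: Dict[Tree, float] = {}
        self.evaluations = 0  # number of real (expensive) evaluations performed

    def fitness(self, tree: Tree) -> float:
        if tree not in self._seen:  # only unseen trees are evaluated
            self._seen[tree] = self._evaluate(tree)
            self.evaluations += 1
        return self._seen[tree]


def toy_fitness(tree: Tree) -> float:
    """Hypothetical stand-in for classifier accuracy.

    Rewards features 0 and 2 and applies a small penalty per selected feature.
    """
    return tree[0] + tree[2] - 0.1 * sum(tree)


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [0.9, 0.2, 0.8, 0.1]  # made-up per-feature scores
    forest = greedy_like_init(scores, n_trees=20, rng=rng)
    history = FitnessHistory(toy_fitness)
    best = max(forest, key=history.fitness)
    print("best tree:", best)
    print("trees:", len(forest), "real evaluations:", history.evaluations)
```

    Keying the history on the tuple of 0/1 flags means two trees that select the same feature subset share one evaluation, which is the saving the abstract attributes to the historical database. With only four features there are at most 16 distinct subsets, so a forest of 20 trees necessarily contains duplicates and the number of real evaluations stays below the number of trees.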

History
  • Received: 2023-11-07
  • Last revised: 2024-03-15
  • Accepted: 2024-03-18
  • Published online: 2024-04-10