An itemset reduction based multi-objective evolutionary algorithm for mining high-dimensional frequent and high utility itemsets
Author:

ZHANG Lei, LI Liu, YANG Haipeng, et al.

Affiliation:

1. School of Computer Science and Technology, Anhui University, Hefei 230039, China; 2. School of Artificial Intelligence, Anhui University, Hefei 230039, China; 3. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China; 4. School of Computer, Hefei Normal University, Hefei 230001, China; 5. Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei 230071, China

Corresponding author:

E-mail: chengfan@mail.ustc.edu.cn

CLC number:

TP273

Fund Project:

National Natural Science Foundation of China (61976001, 62076001, 61876184); Key Project of the Outstanding Talent Support Program for Universities of the Anhui Provincial Department of Education (gxyqZD2021089); Natural Science Foundation of Anhui Province (2008085QF309); University Synergy Innovation Program of Anhui Province (GXXT-2020-050).

    Abstract:

    Frequent and high utility itemset mining is an important data mining task in which the mined itemsets are measured by two metrics, support and utility. Among the methods proposed for this problem, evolutionary multi-objective methods provide a set of high-quality solutions that meet the needs of different users, and they avoid the difficulty of choosing support and utility thresholds that traditional algorithms face. However, most existing multi-objective algorithms use 0-1 encoding, so the dimensionality of the decision space is proportional to the number of items in the dataset, and the curse of dimensionality arises on high-dimensional datasets. To address this, this paper designs an itemset reduction strategy that shrinks the search space by continually reducing unimportant items during evolution. Based on this strategy, the paper proposes an itemset reduction based multi-objective evolutionary algorithm for mining high-dimensional frequent and high utility itemsets (IR-MOEA), together with a learning-based population restoration strategy that adjusts the evolutionary direction of individuals that may be over-reduced or under-reduced. In addition, an initialization strategy based on itemset fitness is proposed so that, early in the evolution, the algorithm generates sparse solutions that benefit later evolution. Experimental results on multiple datasets show that the proposed algorithm outperforms existing state-of-the-art multi-objective optimization algorithms for mining frequent and high utility itemsets, especially on high-dimensional datasets.
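
    To make the two objectives and the 0-1 encoding concrete, the following is a minimal sketch and not the IR-MOEA algorithm itself. It assumes the usual definitions: support is the fraction of transactions that contain the itemset, and utility is the sum of quantity times unit profit over those transactions. The toy database, the profit table, and the names `decode`, `evaluate`, `dominates`, and `pareto_front` are all hypothetical.

```python
from itertools import product

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
ITEMS = ["a", "b", "c", "d"]
PROFIT = {"a": 5, "b": 2, "c": 3, "d": 12}  # assumed unit profits
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3, "d": 1},
    {"b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 2},
]


def decode(bits):
    """Turn a 0-1 vector (one bit per item) into the itemset it encodes."""
    return {item for item, bit in zip(ITEMS, bits) if bit}


def evaluate(bits):
    """Return the (support, utility) pair of the encoded itemset."""
    itemset = decode(bits)
    if not itemset:
        return 0.0, 0
    # Transactions that contain every item of the itemset.
    covering = [t for t in TRANSACTIONS if all(i in t for i in itemset)]
    support = len(covering) / len(TRANSACTIONS)
    utility = sum(t[i] * PROFIT[i] for t in covering for i in itemset)
    return support, utility


def dominates(p, q):
    """p dominates q if it is no worse in both objectives and not identical."""
    return p[0] >= q[0] and p[1] >= q[1] and p != q


def pareto_front(candidates):
    """Keep the candidates whose (support, utility) pair is non-dominated."""
    scored = {bits: evaluate(bits) for bits in candidates}
    return {
        bits: score
        for bits, score in scored.items()
        if not any(dominates(other, score) for other in scored.values())
    }


# Exhaustive search is only feasible here because 4 items give 2**4 = 16 vectors.
front = pareto_front(product((0, 1), repeat=len(ITEMS)))
for bits, (support, utility) in sorted(front.items(), reverse=True):
    print(sorted(decode(bits)), support, utility)
```

    With n items the encoding has n bits and 2**n candidate vectors, so exhaustive enumeration like this stops being practical very quickly. That growth is the reason removing unimportant items from the encoding reduces the search space.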

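Cite this article

ZHANG Lei, LI Liu, YANG Haipeng, et al. An itemset reduction based multi-objective evolutionary algorithm for mining high-dimensional frequent and high utility itemsets[J]. Control and Decision (控制与决策), 2023, 38(10): 2832-2840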

History
  • Online publication date: 2023-09-19
  • Publication date: 2023-10-20