Density peaks clustering algorithm with nearest neighbor optimization for data with uneven density distribution
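Affiliations:

1. School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China; 2. School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; 3. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China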

Corresponding author:

E-mail: zhaojia925@163.com

Chinese Library Classification (CLC) number:

TP301.6

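Funding:

National Natural Science Foundation of China (52069014, 51669014); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0101200).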


Abstract:

Data with uneven density distribution are data in which the samples of different clusters are distributed with different degrees of sparsity. When processing such data, the density peaks clustering (DPC) algorithm tends to find cluster centers in higher-density regions and is prone to assigning samples from sparse clusters to dense clusters. To avoid these defects, a density peaks clustering algorithm with nearest neighbor optimization (DPC-NNO) is proposed for data with uneven density distribution. DPC-NNO combines reverse nearest neighbors and k-nearest neighbors to define a new local density that raises the local density of sparse samples, so that the algorithm finds cluster centers more accurately. In defining the assignment strategy, shared nearest neighbors are introduced to compute the similarity between samples and construct a similarity matrix, which ties samples of the same cluster more closely together and avoids misassignment. DPC-NNO is compared with the IDPC-FA, DPCSA, FNDPC, FKNN-DPC, and DPC algorithms. The experimental results show that DPC-NNO achieves excellent clustering results on data with uneven density distribution, and that its overall performance on complex datasets and UCI datasets is better than that of the comparison algorithms.
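The abstract names three neighborhood notions (k-nearest neighbors, reverse nearest neighbors, and shared nearest neighbors) without giving formulas. The sketch below only illustrates how these three building blocks can be computed with NumPy; it is not the DPC-NNO local density or assignment strategy, whose exact definitions are in the full paper. The function name `neighbor_sets`, the use of Euclidean distance, and the choice to count shared neighbors as the size of the intersection of two k-neighbor lists are assumptions made for illustration.

```python
import numpy as np

def neighbor_sets(X, k):
    """Illustrative helper (hypothetical, not from the paper).

    Returns the kNN index lists, the reverse-neighbor count of each
    sample, and the matrix of shared-neighbor counts.
    """
    n = X.shape[0]
    # Pairwise Euclidean distances (assumed metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each sample from its own neighbor list.
    np.fill_diagonal(dist, np.inf)
    knn = np.argsort(dist, axis=1)[:, :k]
    # member[i, j] is True when j is among the k nearest neighbors of i.
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], knn] = True
    # Reverse-neighbor count: how many samples list j among their kNN.
    rnn_count = member.sum(axis=0)
    # Shared-neighbor count: size of kNN(i) intersected with kNN(j).
    m = member.astype(int)
    snn = m @ m.T
    return knn, rnn_count, snn

# Three close samples and one isolated sample on a line.
X = np.array([[0.0], [1.0], [2.0], [10.0]])
knn, rnn_count, snn = neighbor_sets(X, k=2)
print(rnn_count)  # the isolated sample is nobody's neighbor
print(snn)
```

In this toy input the isolated sample has a reverse-neighbor count of zero, which shows why reverse neighbors carry information that a plain k-neighbor list does not: every sample always has k nearest neighbors, but not every sample is chosen by others.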

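Cite this article

Chen Weichang, Zhao Jia, Xiao Renbin, et al. Density peaks clustering algorithm with nearest neighbor optimization for data with uneven density distribution [J]. Control and Decision (控制与决策), 2024, 39(3): 919-928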

History
  • Published online: 2024-02-25
  • Publication date: 2024-03-20