Density peaks clustering based on relative density estimating and multi cluster merging

Affiliation:

School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China

Corresponding author:

E-mail: zhaojia925@163.com

CLC number:

TP301.6

Fund Project:

National Natural Science Foundation of China (52069014); Jiangxi Provincial Social Science Foundation (21JY26); Science and Technology Program of the Jiangxi Provincial Department of Education (GJJ180940).

    Abstract:

    Density peaks clustering (DPC) is a density-based clustering algorithm that is simple in principle and efficient to run. However, its local density considers only the distances between samples and ignores the environment in which each sample lies, so the algorithm clusters data with uneven density distribution poorly. In addition, its sample allocation process is prone to a chain reaction of allocation errors, in which one misassigned sample causes the samples assigned after it to be misassigned as well. To address these problems, this paper proposes a density peaks clustering algorithm based on relative density estimation and multi-cluster merging (DPC-RD-MCM). DPC-RD-MCM combines the K-nearest-neighbor idea with relative density to define a relative K-nearest-neighbor local density, which reduces the influence of cluster sparsity on the selection of cluster centers and prevents sparse regions from being left without a cluster center. It also redefines the similarity measure between micro-clusters and obtains the final clustering result through a multi-cluster merging strategy, which avoids the chain reaction of allocation errors. DPC-RD-MCM is compared with DPC and its improved variants on datasets with uneven density distribution, datasets with complex shapes, and UCI datasets. The experimental results show that DPC-RD-MCM achieves excellent clustering results on the uneven-density datasets and outperforms the comparison algorithms on the complex-shape and UCI datasets.
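The abstract does not give the formula for the relative K-nearest-neighbor local density, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's definition. It assumes an absolute density equal to the inverse mean distance to the K nearest neighbors, and a relative density equal to a sample's own density divided by the mean density of those neighbors. The function names (`knn_density`, `relative_knn_density`, `delta_distance`, `peak_ratio`, `demo`) and the synthetic two-cluster data are hypothetical. `delta_distance` is the standard DPC relative distance (distance to the nearest sample of higher density), included because DPC picks centers among samples where both density and this distance are large.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def knn_indices(D, k):
    """Indices of each sample's k nearest neighbours, excluding itself."""
    masked = D.copy()
    np.fill_diagonal(masked, np.inf)
    return np.argsort(masked, axis=1)[:, :k]

def knn_density(D, k):
    """ASSUMED absolute density: inverse mean distance to the k nearest neighbours."""
    idx = knn_indices(D, k)
    mean_dist = np.take_along_axis(D, idx, axis=1).mean(axis=1)
    return 1.0 / (mean_dist + 1e-12), idx

def relative_knn_density(D, k):
    """ASSUMED relative density: own density over the mean density of the neighbours."""
    rho, idx = knn_density(D, k)
    return rho / rho[idx].mean(axis=1)

def delta_distance(D, rho):
    """Standard DPC relative distance: distance to the nearest sample of higher density."""
    delta = np.empty(len(rho))
    for i in range(len(rho)):
        higher = np.flatnonzero(rho > rho[i])
        # The globally densest sample has no denser neighbour; use its largest distance.
        delta[i] = D[i, higher].min() if higher.size else D[i].max()
    return delta

def peak_ratio(values, sparse_mask):
    """Highest value in the sparse cluster divided by the highest in the dense cluster."""
    return values[sparse_mask].max() / values[~sparse_mask].max()

def demo(seed=0, k=8):
    """Compare both densities on one dense and one sparse synthetic Gaussian cluster."""
    rng = np.random.default_rng(seed)
    dense = rng.normal(loc=0.0, scale=0.2, size=(200, 2))
    sparse = rng.normal(loc=10.0, scale=1.5, size=(50, 2))
    X = np.vstack([dense, sparse])
    sparse_mask = np.arange(len(X)) >= len(dense)
    D = pairwise_distances(X)
    rho_abs, _ = knn_density(D, k)
    rho_rel = relative_knn_density(D, k)
    return peak_ratio(rho_abs, sparse_mask), peak_ratio(rho_rel, sparse_mask)

if __name__ == "__main__":
    abs_ratio, rel_ratio = demo()
    print(f"sparse/dense peak ratio, absolute density: {abs_ratio:.3f}")
    print(f"sparse/dense peak ratio, relative density: {rel_ratio:.3f}")
```

Under these assumptions the relative density is unchanged when all distances inside a cluster are scaled by the same factor, so the densest sample of a sparse cluster scores on a scale comparable to that of a dense cluster instead of being dwarfed by it. This is one way to read the abstract's claim that sparse regions are not left without a cluster center. The multi-cluster merging step is not sketched here because the abstract does not describe its similarity measure.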

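Cite this article

WU Runxiu, YIN Shihao, ZHAO Jia, et al. Density peaks clustering based on relative density estimating and multi cluster merging[J]. Control and Decision (控制与决策), 2023, 38(4): 1047-1055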

History
  • Published online: 2023-03-22
  • Publication date: 2023-04-20