Density peak clustering algorithm based on bidirectional representative points and mutual K-nearest neighbors

CLC number: TP301.6

Funding: Yibin University High-Level Talent Qihang Program (2023QH02); Sichuan Science and Technology Program (2024ZYD0089).

    Abstract:

    The density peak clustering (DPC) algorithm can identify clusters of arbitrary shape, but it has two notable shortcomings. First, on datasets with uneven density distribution it fails to correctly find the cluster centers of sparse clusters. Second, its strategy for assigning the remaining points is prone to a chain reaction that leads to misclassified data points. To address these issues, this paper proposes a density peak clustering algorithm based on bidirectional representative points (BRP) and mutual $K$-nearest neighbors (MKNN), called BRPMK-DPC. First, a local density calculation method is designed that uses forward $K$-nearest-neighbor representative points and backward reverse $K$-nearest-neighbor representative points, which allows the correct cluster centers to be identified efficiently in datasets with uneven density distribution. Second, a remaining-point assignment method based on mutual $K$-nearest neighbors is proposed; it is adaptive during assignment and avoids the drawback of the fixed $K$ value used by derivative DPC algorithms. Finally, BRPMK-DPC is tested on synthetic and real-world datasets. The experimental results show that the proposed algorithm not only efficiently identifies the cluster centers of clusters with uneven density but also outperforms the seven compared algorithms on most datasets.
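    The neighborhood relations named in the abstract (forward $K$-nearest neighbors, reverse $K$-nearest neighbors, and mutual $K$-nearest neighbors) can be illustrated with a minimal Python sketch. This is not the BRPMK-DPC implementation: the abstract does not give the representative-point density formula or the assignment rule, so the code only computes the three neighbor sets. The function names `knn_indices`, `reverse_knn`, and `mutual_knn`, the Euclidean distance, and the brute-force distance matrix are all assumptions made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    """Return an (n, k) array; row i lists the k nearest neighbors of point i.

    Assumption: Euclidean distance, brute-force O(n^2) distance matrix.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    return np.argsort(dist, axis=1, kind="stable")[:, :k]

def reverse_knn(knn):
    """For each point j, collect the points i that list j among their k nearest neighbors."""
    n = knn.shape[0]
    rknn = [set() for _ in range(n)]
    for i in range(n):
        for j in knn[i]:
            rknn[int(j)].add(i)
    return rknn

def mutual_knn(knn):
    """Points i and j are mutual neighbors when each is in the other's k-nearest-neighbor list."""
    rknn = reverse_knn(knn)
    return [set(int(j) for j in knn[i]) & rknn[i] for i in range(len(rknn))]

# Four points on a line; the last one is far from the rest.
X = np.array([[0.0], [1.0], [2.5], [10.0]])
knn = knn_indices(X, k=1)
print(knn.tolist())      # [[1], [0], [1], [2]]
print(reverse_knn(knn))  # [{1}, {0, 2}, {3}, set()]
print(mutual_knn(knn))   # [{1}, {0}, set(), set()]
```

    In this toy example the distant point has a forward neighbor but no reverse or mutual neighbors, so the number of mutual neighbors varies from point to point even though $K$ is fixed. That may be one way to read the adaptivity the abstract attributes to mutual $K$-nearest neighbors, but this reading is an interpretation rather than something the abstract states.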

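Cite this article

Ren Chunhua (任春华), Li Chaorong (李朝荣), Yu Yang (余洋). Density peak clustering algorithm based on bidirectional representative points and mutual K-nearest neighbors[J]. Control and Decision (控制与决策), 2025, 40(8): 2491-2502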

History
  • Received: 2024-08-12
  • Published online: 2025-07-11
  • Publication date: 2025-08-20