Density peaks clustering based on mutual neighbor degree

Affiliations:

(1. School of Information Engineering, Nanchang Institute of Technology, Nanchang 330099, China; 2. Jiangxi Province Key Laboratory of Water Information Cooperative Sensing and Intelligent Processing, Nanchang 330099, China; 3. National-Local Engineering Laboratory of Water Engineering Safety and Effective Utilization of Resources in Poyang Lake Area, Nanchang 330099, China)

Corresponding author e-mail:

zhaojia925@163.com

CLC number:

TP301.6

Funding:

National Natural Science Foundation of China (51669014, 61663029, 62069014, 62066030); Jiangxi Province Outstanding Youth Fund (2018ACB21029); Natural Science Foundation of Jiangxi Province (20192BAB207031).
    Abstract:

    Density peaks clustering (DPC) clusters poorly when the data contain regions of differing density, and its sample allocation process is prone to knock-on (ripple) errors. To address both problems, a density peaks clustering algorithm based on mutual neighbor degree is proposed. First, the algorithm uses the k-nearest-neighbor idea to compute local density, which keeps the density relative. Next, it defines a mutual neighbor degree between samples, a measure that combines global and local features of the data, and derives a new sample allocation strategy from this measure. The strategy uses k nearest neighbors to find the density peaks and assigns the k nearest neighbors of each peak to that peak's cluster. It then searches, for the allocated data points, for the unallocated data point with the highest mutual neighbor degree and assigns that point to the cluster of the corresponding allocated point. Experiments on synthetic and UCI datasets compare the proposed algorithm with DPC, DBSCAN, OPTICS, AP, K-Means, and improved variants of DPC; the results show that the proposed algorithm performs best.
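The abstract names the steps of the method but not its formulas, so the following Python sketch only illustrates the overall flow: a k-nearest-neighbor local density, peak selection, and allocation driven by a pairwise closeness score. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `knn_density_peaks_sketch`, the density formula (exponential of the negative mean distance to the k nearest neighbors), the closeness score `mutual` (a distance term divided by a mutual rank term), and the choice to assign one point per iteration. The `delta` step follows the standard DPC definition (distance to the nearest point of higher density).

```python
import numpy as np

def knn_density_peaks_sketch(X, n_clusters, k):
    """Illustrative sketch; the density and closeness formulas are assumptions."""
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Pairwise Euclidean distances and, per row, neighbors sorted by distance.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    order = np.argsort(D, axis=1)
    knn = order[:, 1:k + 1]  # skip column 0, which is the point itself

    # Assumed kNN local density: higher when the k nearest neighbors are closer.
    rho = np.exp(-np.take_along_axis(D, knn, axis=1).mean(axis=1))

    # Standard DPC delta: distance to the nearest point with higher density.
    delta = np.empty(n)
    for i in range(n):
        higher = np.flatnonzero(rho > rho[i])
        delta[i] = D[i, higher].min() if higher.size else D[i].max()

    # Take the points with the largest rho * delta as density peaks.
    peaks = np.argsort(-(rho * delta))[:n_clusters]

    # Assumed closeness score mixing a global term (scaled distance) with a
    # local term (how highly each point ranks in the other's neighbor list).
    rank = np.argsort(order, axis=1)  # rank[i, j] = position of j in i's list
    mutual = np.exp(-D / D.max()) / (1.0 + rank + rank.T)

    # Step 1: each peak and its k nearest neighbors join the peak's cluster.
    labels = np.full(n, -1)
    labels[peaks] = np.arange(n_clusters)
    for c, p in enumerate(peaks):
        for j in knn[p]:
            if labels[j] == -1:
                labels[j] = c

    # Step 2: repeatedly take the (allocated, unallocated) pair with the
    # highest score and give the unallocated point its partner's label.
    while (labels == -1).any():
        done = np.flatnonzero(labels >= 0)
        todo = np.flatnonzero(labels == -1)
        sub = mutual[np.ix_(done, todo)]
        a, u = np.unravel_index(sub.argmax(), sub.shape)
        labels[todo[u]] = labels[done[a]]
    return labels
```

Calling `knn_density_peaks_sketch(X, n_clusters=2, k=5)` on an array of shape `(n_samples, n_features)` returns one integer label per sample. The allocation loop rescans all pairs on every iteration, which is fine for a small demonstration but too slow for large datasets.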

Cite this article

Zhao Jia, Yao Zhanfeng, Lü Li, et al. Density peaks clustering based on mutual neighbor degree[J]. Control and Decision, 2021, 36(3): 543-552

History
  • Published online: 2021-03-01
  • Publication date: 2021-03-20