%0 Journal Article
%T 基于目标特征选择和去除的改进K-means聚类算法
%T Improved K-means clustering algorithm based on feature selection and removal on target point
%A 杨华晖
%A 孟晨
%A 王成
%A 姚运志
%A YANG,Hua hui
%A MENG,Chen
%A WANG,Cheng
%A YAO,Yun zhi
%J 控制与决策
%J Control and Decision
%@ 1001-0920
%V 34
%N 6
%D 2019
%P 1219-1226
%K K-均值算法;特征选择;高维数据聚类;特征赋权;数据去噪
%K K-means algorithm;feature selection;high-dimensional data clustering;feature weighting;data denoising
%X 针对高维数据聚类中K-means算法无法有效抑制噪声特征、实现不规则形状聚类的缺点,提出一种基于目标点特征选择和去除的改进K-均值聚类算法.该算法使用闵可夫斯基规度作为评价距离进行目标点的分类,增设权重调节参数a、重置权重系数α进行特征选择和去除,可有效减小非聚类指标特征带来的噪声影响.算法验证实验选取UCI真实数据集和人工数据集进行聚类分析,验证改进算法对抑制噪声特征的有效性,与WK-means、iMWK-means算法进行实验对比,分析聚类学习时特征选择的适用性,同时寻找最优的距离系数beta和权重系数α.
%X To address the weakness that the K-means algorithm cannot effectively suppress noise features or form irregularly shaped clusters on high-dimensional data, an improved K-means clustering algorithm based on feature selection and removal on target points is proposed. The algorithm adopts the Minkowski metric as the distance measure for classifying target points. A weight adjustment parameter a is added and the weighting coefficient α is reset to perform feature selection and removal, which reduces the noise introduced by features that are not clustering indicators. The validation experiments run clustering analysis on real UCI datasets and artificial datasets to verify that the improved algorithm suppresses noise features. The algorithm is also compared experimentally with the WK-means and iMWK-means algorithms to analyze the applicability of feature selection during clustering learning and to find the optimal distance coefficient beta and weighting coefficient α.
%R 10.13195/j.kzyjc.2017.1548
%U http://kzyjc.alljournals.cn/kzyjc/home
%1 JIS Version 3.0.0