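Cite this article: YANG Hua-hui, MENG Chen, WANG Cheng, et al. Improved K-means clustering algorithm based on feature selection and removal on target point [J]. Control and Decision (控制与决策), 2019, 34(6): 1219-1226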
Improved K-means clustering algorithm based on feature selection and removal on target point
YANG Hua-hui, MENG Chen, WANG Cheng, YAO Yun-zhi
(Department of Missile Engineering, Army Engineering University, Shijiazhuang 050003, China)
Abstract:
To address the inability of the K-means algorithm to suppress noise features effectively and to cluster irregularly shaped groups in high-dimensional data, an improved K-means clustering algorithm based on feature selection and removal on target points is proposed. The algorithm uses the Minkowski metric as the distance measure for classifying target points. It adds a weight adjustment parameter a and resets the weighting coefficient α to select and remove features, which reduces the noise introduced by features that are not clustering indicators. The validation experiments run clustering analysis on real UCI datasets and artificial datasets to verify that the improved algorithm suppresses noise features. The algorithm is also compared with the WK-means and iMWK-means algorithms to analyze the applicability of feature selection during clustering, and the optimal distance coefficient β and weighting coefficient α are sought.
Key words:  K-means algorithm  feature selection  high-dimensional data clustering  feature weighting  data denoising
DOI: 10.13195/j.kzyjc.2017.1548
CLC number: N945
Funding: National Natural Science Foundation of China (61501493).
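
The abstract names the ingredients (a Minkowski distance, feature weights, and removal of low-value features) but does not give the update rules, so the following is only a rough sketch of that general family of methods and not the authors' algorithm. It assumes a single global weight per feature computed with the WK-means-style dispersion formula, uses one exponent `p` for both the distance and the weights, and applies a hypothetical removal rule (`remove_features` with a made-up `threshold`) that zeroes small weights. All function names and the toy data are illustrative assumptions.

```python
from typing import List, Sequence


def weighted_minkowski(x: Sequence[float], c: Sequence[float],
                       w: Sequence[float], p: float) -> float:
    """Weighted Minkowski distance raised to the power p (no p-th root):
    sum over features v of w_v^p * |x_v - c_v|^p."""
    return sum((wv ** p) * (abs(xv - cv) ** p) for xv, cv, wv in zip(x, c, w))


def assign(points, centers, w, p) -> List[int]:
    """Assign every point to its nearest center under the weighted distance."""
    return [min(range(len(centers)),
                key=lambda k: weighted_minkowski(x, centers[k], w, p))
            for x in points]


def feature_weights(points, centers, labels, p, eps=1e-9) -> List[float]:
    """Global feature weights, WK-means style (requires p > 1):
    w_v = 1 / sum_u (D_v / D_u)^(1 / (p - 1)),
    where D_v is the within-cluster dispersion of feature v.
    The weights sum to 1; low-dispersion features get large weights."""
    n_feat = len(points[0])
    disp = [eps] * n_feat  # eps avoids division by zero
    for x, k in zip(points, labels):
        for v in range(n_feat):
            disp[v] += abs(x[v] - centers[k][v]) ** p
    expo = 1.0 / (p - 1.0)
    return [1.0 / sum((disp[v] / disp[u]) ** expo for u in range(n_feat))
            for v in range(n_feat)]


def remove_features(w, threshold) -> List[float]:
    """Hypothetical removal rule (an assumption, not from the paper):
    zero every weight below `threshold`, then renormalize to sum 1."""
    kept = [wv if wv >= threshold else 0.0 for wv in w]
    total = sum(kept)
    if total == 0.0:  # nothing survived; leave the weights unchanged
        return list(w)
    return [wv / total for wv in kept]


# Toy data: feature 0 separates two groups, feature 1 is pure noise.
points = [[0.0, 5.0], [0.2, -4.0], [0.1, 1.0],
          [10.0, -5.0], [10.2, 4.0], [9.9, 0.5]]
labels = [0, 0, 0, 1, 1, 1]
centers = [[sum(x[v] for x, k in zip(points, labels) if k == c) / labels.count(c)
            for v in range(2)] for c in range(2)]

w = feature_weights(points, centers, labels, p=2.0)
w_pruned = remove_features(w, threshold=0.1)
new_labels = assign(points, centers, w_pruned, p=2.0)
```

On the toy data the noisy feature receives a weight close to zero and is dropped by the threshold, so the final assignment depends on the informative feature alone. A full algorithm would alternate assignment, center update, and weight update until the labels stop changing.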
