Journal of Terahertz Science and Electronic Information Technology, 2020, 18 (3): 497, published online: 2020-07-16
An improved spectral Ensemble Clustering algorithm in data mining
Keywords: Ensemble Clustering (EC); Basic Partitions (BPs); Low-Rank Representation (LRR); co-association matrix; augmented Lagrange multiplier method
Abstract
Ensemble Clustering (EC) is a key technique for solving data mining problems, but existing EC methods rarely account for the various kinds of noise that can corrupt the clustering structure, which degrades clustering performance. To address this, an Improved Spectral Ensemble Clustering (ISEC) method is proposed. First, the clustering problem is modeled as a graph partitioning problem on the co-association matrix derived from multiple input Basic Partitions (BPs). Then, the ISEC method learns a low-rank representation of the co-association matrix and performs spectral clustering on the co-association matrix to improve clustering performance. Finally, the resulting optimization problem is solved with the augmented Lagrange multiplier method to obtain the final clustering result. Simulation experiments on several real-world data sets show that ISEC achieves better clustering performance than most existing clustering methods.
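The first and last stages named in the abstract (building a co-association matrix from the BPs, then partitioning it spectrally) can be illustrated with a minimal sketch. This is not the ISEC method itself: the low-rank representation step and the augmented Lagrange multiplier solver are omitted, and the function names (co_association, spectral_embedding, kmeans), the normalized-affinity embedding, the farthest-first k-means initialization, and the toy partitions are all assumptions made for illustration.

```python
import numpy as np


def co_association(partitions):
    """Co-association matrix of r basic partitions over n points.

    partitions: (r, n) integer array, one row of cluster labels per BP.
    S[i, j] is the fraction of BPs that place points i and j together.
    """
    P = np.asarray(partitions)
    return (P[:, :, None] == P[:, None, :]).mean(axis=0)


def spectral_embedding(S, k):
    """Rows of the top-k eigenvectors of D^{-1/2} S D^{-1/2}, unit-normalized."""
    d = S.sum(axis=1)                      # degrees; > 0 because S[i, i] = 1
    d_inv_sqrt = 1.0 / np.sqrt(d)
    A = d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(A)            # eigenvalues in ascending order
    U = vecs[:, -k:]                       # eigenvectors of the k largest
    norms = np.linalg.norm(U, axis=1, keepdims=True)
    return U / np.maximum(norms, 1e-12)


def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations with deterministic farthest-first seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy input (hypothetical): three BPs over six points. The third BP
# misassigns point 2, standing in for a noisy basic partition.
partitions = np.array([
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
])
S = co_association(partitions)
labels = kmeans(spectral_embedding(S, k=2), k=2)
print(S.round(2))
print(labels)
```

In this toy case the consensus groups points 0 to 2 together and points 3 to 5 together even though one BP disagrees about point 2. Heavier noise would blur the co-association matrix more, which is the situation the low-rank representation step described in the abstract is meant to handle.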
TONG Xujun, WU Yichun. An improved spectral Ensemble Clustering algorithm in data mining[J]. Journal of Terahertz Science and Electronic Information Technology, 2020, 18(3): 497.