
鲁棒的极大熵聚类算法RMEC及其例外点标识
邓赵红1,2、王士同1,2,4、吴锡生1,2、胡德文3
Robust Maximum Entropy Clustering Algorithm RMEC and Its Outlier Labeling
Deng Zhaohong1,2, Wang Shitong1,2,4, Wu Xisheng1,2, Hu Dewen3
针对极大熵聚类算法MEC(maximum entropy clustering)对例外点(outliers)较敏感和不能标识例外点的缺陷,提出了一种改进的极大熵聚类算法RMEC(robust maximum entropy clustering)。该算法的基本思想是通过引入Vapnik's ε-不敏感损失函数和权重因子重新构建目标函数,并利用优化理论推导出新的学习公式。RMEC算法不但对例外点较之MEC算法有更好的鲁棒性,而且还能有效地利用学习后的权重因子标识出数据集中存在的例外点。仿真试验结果亦表明了RMEC算法的上述优点。
This paper presents a robust maximum entropy clustering algorithm, RMEC, as an improved version of the maximum entropy clustering algorithm MEC. It addresses two drawbacks of MEC: its sensitivity to outliers and its inability to label them. The objective function is reconstructed by introducing Vapnik's ε-insensitive loss function and weight factors, and new update rules are then derived using Lagrangian optimization theory. Compared with MEC, RMEC is more robust to outliers and can effectively label the outliers in a dataset using the learned weight factors. Experimental results confirm both advantages.
熵 / 聚类 / 鲁棒性 / 例外点 / ε-不敏感损失函数 / 权重因子
entropy / clustering / robustness / outliers / ε-insensitive loss function / weight factors
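
For readers unfamiliar with the ε-insensitive loss named in the abstract, the following is a minimal sketch of that loss alone, not of the RMEC objective or its update rules, which this record does not reproduce. The function name, the 1-D data, the prototype position and the value of ε are all hypothetical and chosen only for illustration.

```python
import numpy as np


def eps_insensitive(residual, eps):
    """Vapnik's epsilon-insensitive loss: zero inside the eps-tube, linear outside."""
    return np.maximum(0.0, np.abs(residual) - eps)


# Hypothetical 1-D data: three points near a prototype at 0 and one distant outlier.
x = np.array([-0.2, 0.1, 0.3, 8.0])
prototype = 0.0

# Loss under the epsilon-insensitive criterion (eps is an assumed value).
loss = eps_insensitive(x - prototype, eps=0.5)

# Squared-error loss for comparison.
squared = (x - prototype) ** 2
```

In this toy setting the outlier's penalty grows linearly with its distance rather than quadratically, which is the usual intuition for why such a loss reduces the influence of outliers.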