In the modal approach to clustering, clusters are defined as the local maxima of the underlying probability density function, which can be estimated either non-parametrically or using finite mixture models. Clusters are thus closely tied to the regions around the density modes, and every cluster corresponds to a bump of the density. The Modal EM algorithm is an iterative procedure that can identify the local maxima of any density function. In this contribution, we propose a fast and efficient Modal EM algorithm for use when the density function is estimated through a finite mixture of Gaussian distributions with parsimonious component-covariance structures. After describing the procedure, we apply the proposed Modal EM algorithm to both simulated and real data examples, showing its high flexibility in several contexts.
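To make the iteration concrete, the following is a minimal sketch of the standard Modal EM fixed-point update for a Gaussian mixture density; it is not the accelerated variant proposed in this work. The function names `gaussian_log_pdf` and `modal_em`, the tolerance, the iteration cap, and the two-component example are all assumptions introduced for illustration. Starting from a point x, the E-step computes the posterior probability of each component at x, and the M-step moves x to the maximizer of the posterior-weighted sum of component log-densities, which has a closed form for Gaussian components.

```python
import numpy as np


def gaussian_log_pdf(x, mean, cov):
    """Log-density of a multivariate Gaussian at a single point x."""
    d = mean.shape[0]
    chol = np.linalg.cholesky(cov)            # cov = chol @ chol.T
    z = np.linalg.solve(chol, x - mean)       # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)


def modal_em(x0, weights, means, covs, tol=1e-8, max_iter=1000):
    """Climb from x0 to a local maximum of a Gaussian mixture density.

    Illustrative sketch of the standard Modal EM update; tol and
    max_iter are arbitrary choices, not values from the paper.
    """
    x = np.asarray(x0, dtype=float)
    precisions = [np.linalg.inv(c) for c in covs]
    for _ in range(max_iter):
        # E-step: posterior probability of each component at the current x.
        log_p = np.array([
            np.log(w) + gaussian_log_pdf(x, m, c)
            for w, m, c in zip(weights, means, covs)
        ])
        p = np.exp(log_p - log_p.max())        # subtract max for stability
        p /= p.sum()
        # M-step: maximize sum_k p_k * log N(x | mean_k, cov_k) over x,
        # i.e. solve (sum_k p_k P_k) x = sum_k p_k P_k mean_k.
        A = sum(pk * P for pk, P in zip(p, precisions))
        b = sum(pk * (P @ m) for pk, P, m in zip(p, precisions, means))
        x_new = np.linalg.solve(A, b)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x


# Hypothetical example: two well-separated spherical components.
weights = np.array([0.5, 0.5])
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
covs = [np.eye(2), np.eye(2)]

mode_a = modal_em([0.5, 0.5], weights, means, covs)
mode_b = modal_em([4.0, 6.0], weights, means, covs)
```

Running the update from every observation and grouping points that converge to the same mode yields the modal clustering. The sketch takes full covariance matrices, so a parsimonious structure (spherical, diagonal, or shared across components) can be used simply by passing matrices of that form.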