2009 IEEE International Conference on
Systems, Man, and Cybernetics
Abstract
Mixture-model-based clustering has become a popular
approach in many data analysis problems owing to its statistical
properties and the ease of implementing the EM algorithm.
However, the computation time of the EM algorithm and
its variants increases significantly with the sample size. For large
data sets, performing clustering on grouped data constitutes an
efficient way to reduce the algorithms' execution time. This paper
therefore proposes a fast and effective algorithm dedicated to
grouped-data clustering. Inspired by the Classification EM
algorithm (CEM), the proposed approach estimates the missing
sample at each iteration. An experimental study on simulated
data and on real acoustic emission data from a flaw detection
application on gas tanks shows that the proposed approach performs
well in terms of partitioning precision and computing time.