2009 IEEE International Conference on
Systems, Man, and Cybernetics
Abstract
This paper presents a novel multi-label classification framework for domains with large numbers of labels. Automatic image annotation is one such domain, as the available semantic concepts typically number in the hundreds. The proposed framework comprises an initial clustering phase that breaks the original training set into several disjoint clusters of data. It then trains a multi-label classifier from the data of each cluster. Given a new test instance, the framework first finds the nearest cluster and then applies the corresponding model. Empirical results using two clustering algorithms, four multi-label classification algorithms, and three image annotation data sets suggest that the proposed approach can improve the performance and reduce the training time of standard multi-label classification algorithms, particularly when the number of labels is large.
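The cluster-then-classify pipeline described above can be illustrated with a minimal sketch. The abstract does not name the clustering or multi-label algorithms, so everything concrete here is an assumption made for illustration: plain k-means stands in for the clustering phase, a majority vote over the nearest neighbours inside a cluster stands in for the per-cluster multi-label classifier, and the names `ClusterThenClassify`, `kmeans`, `nearest` and `sq_dist` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: sq_dist(centroids[j], x))


def kmeans(X, k, iters=20, seed=0):
    """Plain k-means (assumed stand-in for the paper's clustering phase)."""
    rng = random.Random(seed)
    centroids = [list(x) for x in rng.sample(list(X), k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest(centroids, x)].append(x)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


class ClusterThenClassify:
    """Hypothetical wrapper: one multi-label model per disjoint cluster."""

    def __init__(self, n_clusters=2, n_neighbors=3, seed=0):
        self.n_clusters = n_clusters
        self.n_neighbors = n_neighbors
        self.seed = seed

    def fit(self, X, Y):
        # Phase 1: split the training set into disjoint clusters.
        self.centroids = kmeans(X, self.n_clusters, seed=self.seed)
        # Phase 2: "train" one model per cluster. The stand-in model is
        # lazy, so training only stores that cluster's labelled examples.
        self.members = [[] for _ in self.centroids]
        for x, labels in zip(X, Y):
            self.members[nearest(self.centroids, x)].append((x, set(labels)))
        return self

    def predict(self, x):
        # Route the test instance to its nearest cluster, then apply
        # only that cluster's model.
        cluster = self.members[nearest(self.centroids, x)]
        neighbors = sorted(cluster, key=lambda p: sq_dist(p[0], x))
        neighbors = neighbors[: self.n_neighbors]
        votes = {}
        for _, labels in neighbors:
            for label in labels:
                votes[label] = votes.get(label, 0) + 1
        # Keep labels carried by a strict majority of the neighbours.
        return {label for label, v in votes.items() if 2 * v > len(neighbors)}


# Toy data (invented for illustration): two well-separated groups of "images".
X_train = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
Y_train = [{"sky", "sea"}, {"sky"}, {"sky", "sea"},
           {"car"}, {"car", "road"}, {"car", "road"}]

model = ClusterThenClassify(n_clusters=2, n_neighbors=3).fit(X_train, Y_train)
print(sorted(model.predict((0.2, 0.2))))
print(sorted(model.predict((10.5, 10.5))))
```

Because each per-cluster model sees only a subset of the training data, and typically only the labels present in that subset, each model solves a smaller problem than one trained on the full set. This is one plausible reading of why the abstract reports reduced training time.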