ConceptMap: Mining Noisy Web Data for Concept Learning

Eren Golge and Pinar Duygulu

Bilkent University, 06800, Cankaya, Turkey

Abstract. We attack the problem of learning concepts automatically from noisy Web image search results. The idea is to discover common characteristics shared among subsets of images, using a method that organises the data while eliminating irrelevant instances. We propose a novel clustering and outlier detection method, namely Concept Map (CMAP). Given an image collection returned for a concept query, CMAP provides clusters pruned of outliers. Each cluster is used to train a model representing a different characteristic of the concept. The proposed method outperforms state-of-the-art studies on the task of learning from noisy web data for low-level attributes as well as high-level object categories. It is also competitive with supervised methods in learning scene concepts. Moreover, results on naming faces support the generalisation capability of the CMAP framework to different domains. CMAP is capable of working at large scale with no supervision by exploiting the available sources.

Keywords: Weakly-labelled data, Clustering and outlier detection, Semi-supervised model learning, ConceptMap, Attributes, Object detection, Scene classification
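
The pipeline outlined in the abstract (prune outliers from a noisy collection, group what remains into clusters, then derive one model per cluster) can be illustrated with a minimal sketch. This is not the CMAP algorithm, whose details are not given on this page. It swaps in a simple density-based filter and a connected-components grouping, uses 2-D points in place of image features, and takes each cluster's centroid as a stand-in "model". The function names, the radius, and the neighbour threshold are all assumptions made for illustration.

```python
import math

def prune_outliers(points, radius, min_neighbors):
    """Split points into (kept, outliers) by counting neighbours within radius.

    Illustrative stand-in for outlier detection; not the CMAP criterion.
    """
    kept, outliers = [], []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) <= radius
        )
        (kept if neighbors >= min_neighbors else outliers).append(p)
    return kept, outliers

def group(points, radius):
    """Group points into connected components linked by distances <= radius."""
    unvisited = list(points)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = []
        while stack:
            p = stack.pop()
            cluster.append(p)
            near = [q for q in unvisited if math.dist(p, q) <= radius]
            unvisited = [q for q in unvisited if math.dist(p, q) > radius]
            stack.extend(near)
        clusters.append(cluster)
    return clusters

def centroid(cluster):
    """Mean of a cluster, used here as a trivial per-cluster 'model'."""
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))

# Toy data (hypothetical): two tight groups plus one irrelevant instance.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
    (5.0, 20.0),
]

kept, outliers = prune_outliers(points, radius=1.5, min_neighbors=2)
clusters = group(kept, radius=1.5)
models = sorted(centroid(c) for c in clusters)
```

In this toy setting the isolated point is discarded and each of the two remaining groups yields its own model, mirroring the idea that each cluster captures a different characteristic of the queried concept.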

LNCS 8695, p. 439 ff.



© Springer International Publishing Switzerland 2014