ECCV 2014 - LNCS 8689-8695

Part-Based R-CNNs for Fine-Grained Category Detection

Ning Zhang, Jeff Donahue, Ross Girshick, and Trevor Darrell

University of California, Berkeley, USA
nzhang@eecs.berkeley.edu
jdonahue@eecs.berkeley.edu
rbg@eecs.berkeley.edu
trevor@eecs.berkeley.edu

Abstract. Semantic part localization can facilitate fine-grained categorization by explicitly isolating subtle appearance differences associated with specific object parts. Methods for pose-normalized representations have been proposed, but generally presume bounding box annotations at test time due to the difficulty of object detection. We propose a model for fine-grained categorization that overcomes these limitations by leveraging deep convolutional features computed on bottom-up region proposals. Our method learns whole-object and part detectors, enforces learned geometric constraints between them, and predicts a fine-grained category from a pose-normalized representation. Experiments on the Caltech-UCSD bird dataset confirm that our method outperforms state-of-the-art fine-grained categorization methods in an end-to-end evaluation without requiring a bounding box at test time.

Keywords: Fine-grained recognition, object detection, convolutional models
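
As an illustration of the detection step summarized in the abstract, the following minimal sketch jointly selects one whole-object box and one box per part from scored candidates. It rests on several assumptions that are not taken from the paper: a simple containment rule stands in for the learned geometric constraints, the detector scores are hand-written numbers rather than outputs of detectors on deep convolutional features, and the names `inside` and `select_configuration` are hypothetical.

```python
# Hypothetical sketch: choose an object box and part boxes jointly.
# Boxes are (x1, y1, x2, y2); scores are made-up detector outputs.


def inside(part, obj, margin=0.0):
    """Return True if the part box lies within the object box grown by margin."""
    px1, py1, px2, py2 = part
    ox1, oy1, ox2, oy2 = obj
    return (px1 >= ox1 - margin and py1 >= oy1 - margin
            and px2 <= ox2 + margin and py2 <= oy2 + margin)


def select_configuration(object_cands, part_cands, margin=0.0):
    """Pick the highest-scoring feasible (object, parts) configuration.

    object_cands: list of (box, score) pairs for the whole object.
    part_cands:   dict mapping part name to a list of (box, score) pairs.
    Returns (total_score, object_box, {part_name: part_box}), or None if
    no object candidate admits a valid box for every part.
    """
    best = None
    for obj_box, obj_score in object_cands:
        total = obj_score
        chosen = {}
        feasible = True
        for name, cands in part_cands.items():
            # Assumed constraint: a part must fall inside the object box.
            valid = [(box, score) for box, score in cands
                     if inside(box, obj_box, margin)]
            if not valid:
                feasible = False
                break
            box, score = max(valid, key=lambda pair: pair[1])
            chosen[name] = box
            total += score
        if feasible and (best is None or total > best[0]):
            best = (total, obj_box, chosen)
    return best


object_cands = [((0, 0, 100, 100), 0.9), ((50, 50, 200, 200), 0.8)]
part_cands = {
    "head": [((10, 10, 30, 30), 0.7), ((150, 150, 170, 170), 0.95)],
    "body": [((20, 40, 80, 90), 0.6)],
}
result = select_configuration(object_cands, part_cands)
```

In this toy input the head candidate with the highest individual score is discarded, because the only object box that contains it cannot also contain the body candidate. That is the intended effect of scoring object and parts together instead of taking each detector's top response independently.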

LNCS 8689, p. 834 ff.
