ECCV 2014 - LNCS 8689-8695

Human Detection Using Learned Part Alphabet and Pose Dictionary

Cong Yao¹, Xiang Bai¹, Wenyu Liu¹, and Longin Jan Latecki²

¹Department of Electronics and Information Engineering, Huazhong University of Science and Technology, China
yaocong2010@gmail.com
xbai@hust.edu.cn
liuwy@hust.edu.cn

²Department of Computer and Information Sciences, Temple University, USA
latecki@temple.edu

Abstract. As structured data, the human body and text are similar in many respects. In this paper, we exploit the analogy between the human body and text to build a compositional model for human detection in natural scenes. Basic concepts and mature techniques from text recognition are introduced into this model. A discriminative alphabet, each grapheme of which is a mid-level element representing a body part, is automatically learned from bounding box labels. Based on this alphabet, the flexible structure of the human body is expressed by means of symbolic sequences, which correspond to various human poses and allow for robust, efficient matching. A pose dictionary, constructed from training examples, is used to verify hypotheses at runtime. Experiments on standard benchmarks demonstrate that the proposed algorithm achieves state-of-the-art or competitive performance.

Keywords: Human detection, mid-level elements, part alphabet, pose dictionary, matching
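
To make the dictionary-verification idea concrete, the sketch below treats a detection hypothesis as a string of part symbols and accepts it only if it lies within a small edit distance of some dictionary entry. This is an illustration only: the abstract does not specify the matching measure, so the use of Levenshtein distance, the single-letter part symbols, the function names `edit_distance` and `verify`, and the `max_distance` threshold are all assumptions rather than the authors' method.

```python
def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (assumed measure)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,         # delete x
                            curr[j - 1] + 1,     # insert y
                            prev[j - 1] + cost)) # substitute or match
        prev = curr
    return prev[-1]


def verify(hypothesis, pose_dictionary, max_distance=1):
    """Return (closest pose, distance), or (None, distance) if too far.

    hypothesis and dictionary entries are strings of hypothetical part
    symbols, e.g. 'H' = head, 'T' = torso, 'L' = legs, 'A' = arm.
    """
    best = min(pose_dictionary, key=lambda p: edit_distance(hypothesis, p))
    d = edit_distance(hypothesis, best)
    return (best, d) if d <= max_distance else (None, d)


# Hypothetical dictionary of pose "words" built from training examples.
poses = ["HTL", "HAL"]
match, dist = verify("HTA", poses)
```

Under these assumptions, a hypothesis whose symbol sequence differs from every dictionary pose by more than `max_distance` edits is rejected, which mirrors how a lexicon is used to filter candidates in text recognition.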

LNCS 8693, p. 251 ff.
