
Unsupervised Video Adaptation for Parsing Human Motion

Haoquan Shen1, Shoou-I Yu2, Yi Yang3, Deyu Meng4, and Alexander Hauptmann2

1School of Computer Science, Zhejiang University, China
shenhaoquan@gmail.com

2School of Computer Science, Carnegie Mellon University, USA
iyu@cs.cmu.edu
alex@cs.cmu.edu

3ITEE, The University of Queensland, Australia
yee.i.yang@gmail.com

4School of Mathematics and Statistics, Xi’an Jiaotong University, China
dymeng@mail.xjtu.edu.cn

Abstract. In this paper, we propose a method to parse human motion in unconstrained Internet videos without labeling any videos for training. We use training samples from a public image pose dataset to avoid the tedium of labeling video streams. Two main problems arise. First, the distributions of images and videos differ. Second, no temporal information is available in the training images. To smooth the inconsistency between the labeled images and unlabeled videos, our algorithm iteratively incorporates the pose knowledge harvested from the testing videos into the image pose detector via an adjust-and-refine method. During this process, continuity and tracking constraints are imposed to leverage the spatio-temporal information that is available only in videos. For our experiments, we collected two datasets from YouTube, and the results show that our method achieves good performance for parsing human motion. Furthermore, we found that our method achieves better performance by using unlabeled videos than by adding more labeled pose images to the training set.
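At a high level, the adjust-and-refine procedure described in the abstract can be read as a self-training loop: detect poses on the test video frames, keep only the confident and temporally consistent ones, and retrain the detector with them. The Python sketch below illustrates this reading under our own assumptions; the names detector.estimate, temporal_smooth, and detector.retrain are hypothetical placeholders, not the authors' implementation.

    def adapt_detector(detector, videos, n_iters=5, conf_thresh=0.8):
        """Iteratively adapt an image-trained pose detector to unlabeled videos.

        A minimal sketch of an adjust-and-refine loop; all helper names
        (estimate, temporal_smooth, retrain) are assumed, not from the paper.
        """
        for _ in range(n_iters):
            pseudo_labels = []
            for video in videos:
                # Run the current detector on every frame of the test video.
                poses = [detector.estimate(frame) for frame in video]
                # Adjust: enforce continuity and tracking constraints across
                # frames, which only video (not still images) can provide.
                poses = temporal_smooth(poses)
                # Harvest only confident, temporally consistent detections.
                pseudo_labels += [p for p in poses if p.score > conf_thresh]
            # Refine: retrain the detector with the harvested video poses
            # added to the original image training set.
            detector = detector.retrain(extra_samples=pseudo_labels)
        return detector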

Keywords: Unsupervised Video Pose Estimation, Image to Video Adaptation, Unconstrained Internet Videos

LNCS 8693, p. 347 ff.



© Springer International Publishing Switzerland 2014