ECCV 2014 - LNCS 8689-8695

A Hierarchical Representation for Future Action Prediction

Tian Lan, Tsung-Chuan Chen, and Silvio Savarese

Stanford University, USA

Abstract. We consider inferring the future actions of people from a still image or a short video clip. Predicting future actions before they are actually executed is a critical ingredient for enabling us to effectively interact with other humans on a daily basis. However, the challenges are twofold: first, we need to capture the subtle details inherent in human movements that may imply a future action; second, predictions usually should be carried out as quickly as possible in the social world, when only limited prior observations are available.

In this paper, we propose hierarchical movemes, a new representation that describes human movements at multiple levels of granularity, ranging from atomic movements (e.g. an open arm) to coarser movements that cover a larger temporal extent. We develop a max-margin learning framework for future action prediction that integrates a collection of moveme detectors in a hierarchical way. We validate our method on two publicly available datasets and show that it achieves promising performance.

LNCS 8691, p. 689 ff.
