
Physically Grounded Spatio-temporal Object Affordances

Hema S. Koppula and Ashutosh Saxena

Department of Computer Science, Cornell University, USA
hema@cs.cornell.edu
asaxena@cs.cornell.edu

Abstract. Objects in human environments support various functionalities that govern how people interact with their environments to perform tasks. In this work, we discuss how to represent and learn a functional understanding of an environment in terms of object affordances. Such an understanding is useful for many applications, such as activity detection and assistive robotics. Starting with a semantic notion of affordances, we present a generative model that takes a given environment and human intention into account, and grounds the affordances in the form of spatial locations on the object and temporal trajectories in the 3D environment. The probabilistic model also captures uncertainties and variations in the grounded affordances. We apply our approach to RGB-D videos from the Cornell Activity Dataset, where we first show that we can successfully ground the affordances, and we then show that learning such affordances improves performance on labeling tasks.
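To make the grounding idea concrete, the following is a minimal sketch (in Python with NumPy) of what a grounded affordance could look like as a data structure: a Gaussian over contact points on the object plus a noisy mean trajectory, selected by an (object, intention) pair. The `GroundedAffordance` class, its parameters, and the cup/drinking numbers are hypothetical illustrations, not the authors' model, which is a richer generative graphical model learned from RGB-D data.

```python
# Illustrative sketch only: grounding an affordance as (a) a spatial
# distribution over contact points on the object and (b) a temporal
# trajectory distribution, conditioned on the human intention.
# All classes, parameters, and numbers below are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

class GroundedAffordance:
    """Hypothetical grounded affordance: where to touch and how to move."""

    def __init__(self, contact_mean, contact_cov, waypoints, waypoint_noise):
        self.contact_mean = np.asarray(contact_mean)  # mean contact point (object frame)
        self.contact_cov = np.asarray(contact_cov)    # spatial uncertainty (3x3)
        self.waypoints = np.asarray(waypoints)        # mean trajectory, shape (T, 3)
        self.waypoint_noise = waypoint_noise          # per-waypoint std. dev.

    def sample(self):
        """Draw one grounded instantiation: a contact point and a trajectory."""
        contact = rng.multivariate_normal(self.contact_mean, self.contact_cov)
        traj = self.waypoints + rng.normal(0.0, self.waypoint_noise,
                                           self.waypoints.shape)
        return contact, traj

# Toy example: the intention selects which affordance of a cup is grounded.
# "Drinking" grounds a handle grasp and a lift-to-mouth trajectory; "moving"
# grounds a side grasp and a lateral displacement.
affordances = {
    ("cup", "drinking"): GroundedAffordance(
        contact_mean=[0.05, 0.0, 0.08],              # near the handle
        contact_cov=0.0004 * np.eye(3),
        waypoints=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.2], [0.1, 0.0, 0.35]],
        waypoint_noise=0.01,
    ),
    ("cup", "moving"): GroundedAffordance(
        contact_mean=[0.0, 0.06, 0.05],              # side grasp
        contact_cov=0.0009 * np.eye(3),
        waypoints=[[0.0, 0.0, 0.0], [0.15, 0.0, 0.05], [0.3, 0.0, 0.0]],
        waypoint_noise=0.02,
    ),
}

contact, traj = affordances[("cup", "drinking")].sample()
print("sampled contact point:", np.round(contact, 3))
print("sampled trajectory:\n", np.round(traj, 3))
```

Sampling rather than returning point estimates mirrors the role of the probabilistic model in the abstract: the same semantic affordance admits many physical instantiations, and the distribution encodes that variation.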

Keywords: Object Affordances, 3D Object Models, Functional Representation of Environment, Generative Graphical Model, Trajectory Modeling, Human Activity Detection, RGB-D Videos

LNCS 8691, p. 831 ff.



© Springer International Publishing Switzerland 2014