Stixmantics: A Medium-Level Model for Real-Time Semantic Scene Understanding

Timo Scharwächter¹,², Markus Enzweiler¹, Uwe Franke¹, and Stefan Roth²

¹ Environment Perception, Daimler R&D, Sindelfingen, Germany
² Department of Computer Science, TU Darmstadt, Germany

Abstract. In this paper we present Stixmantics, a novel medium-level scene representation for real-time visual semantic scene understanding. Relevant scene structure, motion and object class information is encoded using so-called Stixels as primitive elements. Sparse feature-point trajectories are used to estimate the 3D motion field and to enforce temporal consistency of semantic labels. Spatial label coherency is obtained by using a CRF framework. The proposed model abstracts and aggregates low-level pixel information to gain robustness and efficiency. Yet, enough flexibility is retained to adequately model complex scenes, such as urban traffic. Our experimental evaluation focuses on semantic scene segmentation using a recently introduced dataset for urban traffic scenes. In comparison to our best baseline approach, we demonstrate state-of-the-art performance but reduce inference time by a factor of more than 2,000, requiring only 50 ms per image.

Keywords: semantic scene understanding, bag-of-features, region classification, real-time, stereo vision, stixels

LNCS 8693, p. 533 ff.
lncs@springer.com