Real Static Scene Image Sequences for the Evaluation of Structure From Motion Methods in an Automotive Context

Abstract

This data set contains five real-life sequences recorded by a stereo camera setup mounted in a car moving through (almost) static everyday scenes. Their purpose is to evaluate structure from motion approaches in an automotive application. The recorded image data and rectified stereo image pairs are provided.

The following sequences were selected from the stereo rig database introduced in [1] to evaluate structure from motion approaches in an automotive application. All sequences were recorded using a high-quality stereo camera setup mounted in a car moving through different environments at up to 100 km/h. Because the camera moves into the scene, the observed optical flow provides only little structure information, especially in contrast to stereo setups. This makes the data set very challenging. The visible scenes are almost static and thus meet an assumption usually made by 3D reconstruction approaches: any optical flow observed in the sequences is caused by camera motion only, with few exceptions such as trees moving slowly in the wind and walking pedestrians.
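The limited structure information from forward motion can be illustrated with a small sketch. It assumes an ideal pinhole camera that moves straight along its optical axis through a static scene; the function names, the 30 m example depth and the focal length and baseline values are hypothetical and not taken from the recording setup. The per-frame travel uses the 100 km/h top speed and the 25 Hz sampling rate of the provided data.

```python
def metres_per_frame(speed_kmh, rate_hz):
    """Distance travelled by the car between two consecutive frames."""
    return speed_kmh / 3.6 / rate_hz


def forward_flow(x, y, depth_m, step_m):
    """Displacement (in pixels) of a static scene point between two frames.

    Assumes an ideal pinhole camera that moves step_m metres straight along
    its optical axis. (x, y) is the pixel position relative to the principal
    point, which is the focus of expansion for this motion. depth_m is the
    distance of the point along the optical axis in the first frame.
    """
    if depth_m <= step_m:
        raise ValueError("point must stay in front of the camera")
    # x = f*X/Z before the move and f*X/(Z - step) after it, so the shift
    # is x * step / (Z - step); the focal length cancels out.
    return (x * step_m / (depth_m - step_m),
            y * step_m / (depth_m - step_m))


def stereo_disparity(focal_px, baseline_m, depth_m):
    """Disparity (in pixels) of the same point in a rectified stereo pair."""
    return focal_px * baseline_m / depth_m


if __name__ == "__main__":
    step = metres_per_frame(100.0, 25.0)  # top speed and frame rate
    depth = 30.0                          # hypothetical scene depth
    for x in (0.0, 10.0, 300.0):          # pixels from the focus of expansion
        u, _ = forward_flow(x, 0.0, depth, step)
        print(f"x = {x:5.1f} px: flow = {u:.3f} px")
    # Hypothetical focal length (pixels) and baseline (metres).
    print(f"stereo disparity = {stereo_disparity(800.0, 0.25, depth):.3f} px")
```

Under these assumptions the flow vanishes at the focus of expansion and grows only with the distance from it, whereas the stereo disparity of a point at the same depth does not depend on its image position.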

The data provided by the cameras of the high-quality stereo camera system (see [1] for details) was downscaled to a spatial resolution of 656 x 541 pixels and a sampling rate of 25 Hz. The gray values have a resolution of 12 bits. In addition to the recorded images, we provide the rectified stereo image pairs to allow verification of the results using stereo depth estimation approaches.
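As a rough illustration of how the rectified pairs could serve for verification, the following sketch scales 12-bit gray values to the unit interval and converts a stereo disparity into a depth that a structure from motion estimate can be compared against. The helper names, the focal length and the baseline are assumptions made for the example; the actual calibration is described in [1] and is not reproduced here.

```python
GRAY_BITS = 12
MAX_GRAY = (1 << GRAY_BITS) - 1  # largest 12-bit gray value: 4095


def normalize_gray(value):
    """Map a 12-bit gray value (0..4095) to the range [0.0, 1.0]."""
    if not 0 <= value <= MAX_GRAY:
        raise ValueError("gray value outside the 12-bit range")
    return value / MAX_GRAY


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (metres) of a point in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px


def relative_depth_error(estimated_m, stereo_m):
    """Relative deviation of an estimated depth from the stereo depth."""
    return abs(estimated_m - stereo_m) / stereo_m


if __name__ == "__main__":
    # Hypothetical calibration values, for illustration only.
    focal_px, baseline_m = 800.0, 0.25
    stereo_depth = depth_from_disparity(10.0, focal_px, baseline_m)
    print("stereo depth [m]:", stereo_depth)
    print("relative error:", relative_depth_error(22.0, stereo_depth))
```

The relative error is only one possible measure; any comparison of this kind inherits the accuracy limits of the stereo depth estimate itself.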

This data set was used in [2] to evaluate a variational approach for the estimation of dense scene structure and egomotion from monocular image sequences.


Videos

For each of the five sequences (Avenue, Bend, City, Village and Parking), a crossview and a parallelview video is provided.

References

[1] S. Meister, B. Jähne, D. Kondermann: An Outdoor Stereo Camera System for the Generation of Real-World Benchmark Datasets with Ground Truth, 2011, Technical Report, IWR, University of Heidelberg

[2] F. Becker, F. Lenzen, J. H. Kappes, C. Schnörr: Variational Recursive Joint Estimation of Dense Scene Structure and Camera Motion from Monocular High Speed Traffic Sequences, 2011, Proceedings of ICCV 2011, to appear

Acknowledgements

The present data was acquired and processed by Daniel Kondermann, Stephan Meister and Paul-Sebastian Lauer in close cooperation with Robert Bosch GmbH. Additional scenes were recorded by Frank Lenzen and Stefan Meister. The authors thank Bernd Jähne, Wolfgang Niehsen and Jochen Wingbermühle for making this research possible. The authors also thank Annika Berger, Julian Coordts, Tobias Praetsch and Christoph Koke who spent countless hours supporting our efforts.

Dataset

Please fill out the form below and we will send you a download link.