Reconstructing and Understanding the 3D World

The course runs every day from 17.9 to 28.9, from 9.30 to 18.00 at the latest. There is a lunch break from 12.00 to 13.30 every day. Please be on time.

This lecture covers the areas of computer vision that deal with 3D reconstruction and scene understanding, that is, recovering a 3D scene from a set of photographs and extracting and tracking objects in the scene. We will discuss the underlying principles and the methods for solving these tasks. We will also discuss the latest state-of-the-art techniques, which often combine traditional approaches with neural networks. Additionally, we cover the necessary background knowledge, e.g. image formation models, camera models, and machine learning.

- Brief introduction to the necessary Machine Learning concepts (Markov Random Fields, Neural Networks, etc.)
- Image formation process (BRDFs, rendering equation, intrinsic images)
- Camera models (RGB, ToF, LightField)
- Sparse feature detection and description (points, edges, LIFT) w/ and w/o Neural Networks
- Projective Geometry, Epipolar Geometry
- Sparse reconstruction (image matching, image descriptors, RANSAC) w/ and w/o Neural Networks
- Dense Reconstruction, SLAM and Camera Localization
- 3D Object detection (End-to-End Trainable Pipelines) w/ and w/o Neural Networks
- 3D Object tracking (6D Pose estimation, Kalman Filter, Particle Filter)
- Stereo and Optical flow (supervised, unsupervised, Graphical Models) w/ and w/o Neural Networks
- Training data generation and Instance Segmentation
- Interactive paper discussion and ranking (simulation of paper-selection procedure at an international conference on computer vision)

Teaching assistant (main point of contact): Siva Karthik Mustikovela. Also involved in mini project organization: Eric Brachmann, Hassan Abu AlHaija, and Omid Hosseini Jafari. Please send all emails in English.

Prerequisites: Fundamentals of Machine Learning and Advanced Machine Learning, or equivalent, are recommended

Exam: graded final report (about 10 pages). The final report must be submitted by 15.12.2018. If you need an extension of this deadline, please send an email to Omid Hosseini Jafari and explain why you need the extension.

Credit points: 6 LP

Form: block course 10 days

Amount of work: 180h, of which 30h lectures, 30h exercises, 20h revision and home exercises, 70h programming a mini research project, and 30h preparation of the final report.
(Note: the mini research project will start at the end of the second week.)

Applicable to: MSc Angewandte Informatik, MSc Scientific Computing

Teaching goals:
The students
- Understand the principles behind estimating 3D Point Clouds and Motion from two or more images. They are able to apply this knowledge to new tasks in the field of 3D reconstruction.
- Understand the principles of the image formation process and the corresponding geometry. This can be used to design new algorithms, e.g. for 3D motion estimation in autonomous driving.
- Understand and implement methods that combine machine-learning-based methods with classical computer vision techniques.
- Have studied various state-of-the-art computer vision systems and approaches, and are then able to evaluate and classify new systems and approaches.
- Understand and implement different approaches for object tracking and object-instance recognition.

Information about when and where the course takes place can be found here: