Computer Vision: Foundations

Summer Semester 2020

Lecturer: Prof. Fred Hamprecht; Teaching Assistant: Alberto Bailoni

Overview

Computer Vision is used to make key decisions based on single images or video. This entails processing millions of pixels in images and billions of pixels in video, but only a small number of efficient algorithms are up to the challenge of processing such large inputs.

As a consequence, all extant computer vision pipelines are constructed from a small number of building blocks, which are the focus of this lecture.

In this course, we will both see how these building blocks are used to model practical computer vision tasks such as segmentation, tracking and model fitting; and deconstruct and understand these algorithms at a fundamental level, introducing and using notions from linear system theory, polyhedral combinatorics and algebraic graph theory.

In summary, computer vision is both an art and a science. This lecture will emphasize the "science" part, while drawing on practical examples.

Curriculum

(Approximate) schedule of topics, with applications in [brackets]:

  • Color spaces, convolution filters, Discrete Fourier Transform (DFT), impulse response [blob detection, edge detection, semantic segmentation]
  • Downsampling, aliasing, interpolation [multiresolution maps and images]
  • Convolutional neural networks (CNNs), U-Net [image classification, semantic segmentation]
  • Metric learning, clustering [instance segmentation]
  • Efficient algorithms: union-find for connected components; greedy algorithms for minimum spanning tree / single linkage clustering / watershed [clustering, superpixels]
  • Dynamic programming: (all-pairs) shortest paths, distance transform / infimal convolution, widest bottleneck paths [seeded segmentation]
  • Dynamic programming: inference on trees [pose estimation]
  • Algebraic graph theory [path problems in networks]
  • Min-cost flow [tracking]
  • Linear programming, total unimodularity, polyhedral combinatorics [matching]
  • Either incompressible flows or optimal transport or computational photography or unsupervised side losses [plenty :-)]

Format

Two hours of lectures per week, plus Python programming exercises and some pen-and-paper questions.

I will give the lecture as a Zoom meeting, meaning that you, the participants, will have the chance to ask live questions much as in a standard lecture hall. We will use a mixture of best-in-class tools for this virtual class; see below for registration details.

Venue and Registration

Lecture

The lecture takes place here: https://zoom.us/j/96167627843 on Fridays from 09:15 -- 11:00, starting on April 24th, 2020.

To join the lecture:

  1. Please install the Zoom client and, importantly, register in Zoom with a uni-heidelberg.de email address. Participation in the lecture is limited to participants who have a Zoom account registered under a uni-heidelberg.de email address. In view of the interactive format, this is unfortunately necessary to protect us from trolls.
  2. If this is the first time you are using Zoom, please try it out with friends to familiarize yourself with the interface.
  3. Please enroll for the lecture at https://uebungen.physik.uni-heidelberg.de/v/1176 . Students from all subjects can register as long as they have a Uni-ID. Your registration at this point is not binding, but it will help us organize our exercise groups.
  4. If you don't use Zoom routinely, log on around 09:00 on April 24 to make sure that your setup works.

Exercise

The exercise group will take place here: https://us02web.zoom.us/j/82687847095?pwd=RUZnMElwWEhndWpHellncFRBbC9hdz09 on Tuesdays, 15:15 -- 17:00, starting on April 28th, 2020.

Chat

To discuss during or outside the lecture, or to collaborate on exercise sheets, you are welcome (but not required) to use a text / audio / video chat server. If you want to use this service,

  1. Create a Discord account here: https://discordapp.com/register
  2. Join the "Computer Vision: Foundations" server on Discord: https://discord.gg/BF9j4Za. You can either use the platform via browser or download the Discord smartphone/desktop app here: https://discordapp.com/download
  3. In the "new-members" channel you will find more information about how to be approved as a course member in the server.

Prerequisites

Must have: Linear algebra, multivariate calculus, programming experience in at least one language (preferably Python).

Good to have: Machine learning, algorithms and data structures

If you have no prior experience in Python and machine learning, you should still be able to take part, but expect a steep learning curve and extra hours of work.

Credit and eligibility

To earn credit for this lecture, you need to pass a written exam at the end of the semester. The lecture is worth 5 CP and can be chosen as part of the MSc exam in the "Vertiefungsrichtung Computational Physics".

Any questions?

Please reach out to Alberto at cvflecture.s20@gmail.com