Jointly with other teams in Heidelberg, we are organizing a Computer Vision and Machine Learning Talk Series. External and internal speakers are invited to give talks about their recent work. Below you find the list of talks in this series. We try to keep a strict time limit of one hour (45 min presentation, 15 min questions).
|16.12.2016 11:00 2024 or 2026||Subbu Veeravasarapu (Goethe Universität Frankfurt)
Transfer Learning from Virtual Reality
There is a growing interest in utilizing Computer Graphics (CG) renderings to generate large-scale annotated data in order to train machine learning systems, such as deep convolutional neural networks, for Computer Vision (CV). However, the impact of computational rendering approximations, due to choices in the rendering pipeline, on the generalization performance of trained CV systems is still not clear. I address this space with a case study involving traffic scenes. I also try to quantify the performance gains due to unsupervised learning of virtual scene priors from the target (real-world) domain's data. I finally conclude with some interesting insights from our experimentation about the impact of rendering choices and scene models on the issue of 'domain shift' and various ways to correct it.
|15.12.2016 14:00 2024 or 2026||Pierre Baqué (EPFL)
Multi-Modal Mean-Fields via Cardinality-Based Clamping
Mean Field inference is central to statistical physics. It has attracted much interest in the Computer Vision community as an efficient way to solve problems expressible in terms of large Conditional Random Fields. However, since it models the posterior probability distribution as a product of marginal probabilities, it may fail to properly account for important dependencies between variables.
We therefore replace the fully factorized distribution of Mean Field by a weighted mixture of such distributions that similarly minimizes the KL-divergence to the true posterior. We can perform this minimization efficiently by introducing two new ideas: conditioning on groups of variables instead of single ones, and selecting such groups using a parameter of the conditional random field potentials that we identify with the temperature in the sense of statistical physics. Our extension of the clamping method proposed in previous works allows us both to produce a more descriptive approximation of the true posterior and, inspired by the diverse-MAP paradigm, to fit a mixture of Mean Field approximations. We demonstrate that this positively impacts real-world algorithms that initially relied on Mean Fields.
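The fully factorized baseline the abstract starts from can be sketched for a tiny binary pairwise MRF. This is a generic naive mean-field coordinate-ascent loop for illustration only, not the paper's mixture method; the model, function name and parameters are hypothetical.

```python
import math

def mean_field_binary(theta, w, iters=100):
    """Naive mean-field for a binary pairwise MRF with x_i in {0, 1}:
    p(x) proportional to exp(sum_i theta[i]*x_i + sum_{i<j} w[i][j]*x_i*x_j).
    Returns approximate marginals q[i] ~ p(x_i = 1) via the fixed-point
    updates q_i <- sigmoid(theta_i + sum_j w_ij * q_j)."""
    n = len(theta)
    q = [0.5] * n  # start from the uniform product distribution
    for _ in range(iters):
        for i in range(n):
            field = theta[i] + sum(w[i][j] * q[j] for j in range(n) if j != i)
            q[i] = 1.0 / (1.0 + math.exp(-field))
    return q

# Two attractively coupled variables with opposing unary potentials:
# the coupling pulls the marginal of the second variable above 0.5.
q = mean_field_binary(theta=[2.0, -1.0], w=[[0.0, 1.5], [1.5, 0.0]])
```

Because the product distribution has a single mode, exactly this kind of toy model shows why a weighted mixture of such factorized distributions, as in the talk, can capture multi-modal posteriors that a single mean-field solution cannot.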
|30.11.2016 9:30 2024 or 2026||Vincent Lepetit (TU Graz)
Computer Vision Pipelines and Deep Learning
Many traditional Computer Vision problems can now be solved by a single Deep Network. In this talk, I will use 3D hand tracking and interest point detection and description as examples to show that we can also tackle more complex problems by building full computer vision pipelines from multiple deep networks working together. By designing these pipelines carefully, we can keep the advantages of single Deep Networks: end-to-end learning by minimizing only one clear cost function for the full pipeline, and high performance.
|27.10.2016 15:15 2026||Oliver Zendel (Austrian Institute of Technology)
Ingredients For Good CV Test Data
The quality and robustness of a CV solution directly correlates with the quality and completeness of the test data used to validate it. CV-HAZOP is a tool for determining the quality of existing test data sets. The talk will give a short introduction to the topic and insights into how to apply this tool to a CV use case.
|26.04.2016 13:00 2026||Oliver Lange, Holger Janssen (Bosch)
Computer Vision Research at Bosch
|22.04.2016 10:00 2026||Andreas Geiger (MPI Tübingen)
Robust Visual Perception for Intelligent Systems
Perception is a key component of every intelligent system as it enables actions within a changing environment. While humans perceive their environment with seemingly little effort, computers first need to be trained for these tasks. One of the biggest challenges in computer vision are the ambiguities which arise due to the complex nature of our environment and the information loss caused by observing two-dimensional projections of our three-dimensional world. In this talk, I will present several recent results in stereo estimation, 3D reconstruction and motion estimation which integrate high-level non-local prior knowledge to resolve ambiguities that cannot be resolved using local assumptions alone. Furthermore, I will discuss the "curse of dataset annotation" and present a method for efficiently augmenting video sequences with semantic information.
|07.04.2016 15:00 2026||Björn Andres (MPI Saarbrücken)
Lifting of Multicuts and the Decomposition of Graphs
Decompositions of graphs are a fundamental structure in discrete mathematics, with important applications in computer vision. This talk introduces the minimum cost lifted multicut problem and discusses properties of lifted multicut polytopes. It shows how applications of this problem in computer vision have advanced image segmentation, multi-object tracking and human body pose estimation.
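The basic correspondence the talk builds on is that a decomposition of a graph is encoded by its multicut: the set of edges whose endpoints lie in different components. A minimal sketch of that encoding (names and the toy graph are illustrative; the talk's contribution, lifted multicuts and their polytopes, goes well beyond this):

```python
def multicut_from_partition(edges, labels):
    """Given an edge list and a node -> component labeling, return the
    multicut induced by the decomposition: exactly those edges whose
    endpoints belong to different components."""
    return {(u, v) for (u, v) in edges if labels[u] != labels[v]}

# A 4-node graph split into components {0, 1} and {2, 3}:
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
labels = {0: "a", 1: "a", 2: "b", 3: "b"}
cut = multicut_from_partition(edges, labels)  # edges crossing components
```

The minimum cost multicut problem then asks for the labeling whose induced cut minimizes a sum of (possibly negative) edge costs, subject to the constraint that the cut is consistent, i.e. no cycle contains exactly one cut edge.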
|23.02.2015 15:00 2024||Roberto Martín-Martín (Robotics and Biology Laboratory, TU Berlin)
Interactive Perception of Articulated Objects
Interactive perception leverages the capabilities of the robot to interact with the environment to reveal and understand the effects of its own actions. Knowledge about the correlations between actions and their effects is a strong prior that can be exploited to continuously interpret sensor data and enable real-time perception. We applied this insight to the estimation of the kinematic structure and state of articulated objects, and the reconstruction of their shape. Perceiving degrees of freedom is a crucial skill for robots that aim to manipulate the environment. In my talk I will present our work in interactive perception of articulated objects and other research projects from our lab that explore how interactions simplify perception and manipulation.
|10.02.2015 14:00 2024||Sebastian Ramos (Daimler Research)
A Vision of Self-Driving Cars
Self-driving cars started as a distant dream just a few decades ago. However, thanks to the great recent progress made in many fields of computer science and engineering, this dream is now becoming reality. Self-driving cars will revolutionize our society by improving our current mobility models and, most importantly, by radically reducing the number of road fatalities. In particular, computer vision and machine learning play a central role in the perception and understanding capabilities that these vehicles require to operate correctly, not only under standard conditions but also in the most unexpected situations. This talk will present the research work on road scene understanding done by the Daimler R&D Image Understanding Group during the last years, including some recent deep learning-based advancements within the context of environment perception for self-driving cars.
|26.11.2015 15:00 2024||Jiri Matas (Prague University)
Beyond Vanilla Image Retrieval
After introducing the classical formulation of the problem, I will focus on retrieval methods based on the bag-of-words image representation that exploit geometric constraints. Novel formulations of the image retrieval problem will be discussed, showing that the standard ranking of images based on similarity addresses only one of the possible user requirements. Retrieval methods efficiently solving the new formulations by exploiting geometric constraints will be used in different scenarios. These include online browsing of image collections, image analysis based on large collections of photographs, and 3D model construction. For online browsing, I will show queries that try to answer questions such as: "What is this?" (zoom in on a detail), "Where is that?" (zoom out to a larger visual context), or "What is to the left / right of this?". For image analysis, two novel problems straddling the boundary between image retrieval and data mining are formulated: for every pixel in the query image, (i) find the database image with the maximum resolution depicting the pixel and (ii) find the frequency with which it is photographed in detail.
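The classical bag-of-words formulation the talk departs from can be sketched as tf-idf weighted cosine ranking over visual words. This is a toy illustration with hypothetical names; real retrieval systems use large quantized vocabularies, inverted files and the geometric verification the talk focuses on.

```python
import math
from collections import Counter

def bow_similarity(query_words, db, idf):
    """Rank database images by cosine similarity of tf-idf weighted
    bag-of-(visual-)words vectors, the classical retrieval baseline.
    db maps image name -> list of visual-word ids; idf maps word -> weight."""
    def vec(words):
        tf = Counter(words)
        return {w: tf[w] * idf.get(w, 0.0) for w in tf}

    def cos(a, b):
        num = sum(a[w] * b.get(w, 0.0) for w in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return num / (na * nb) if na and nb else 0.0

    q = vec(query_words)
    return sorted(db, key=lambda name: -cos(q, vec(db[name])))
```

Ranking purely by this similarity is exactly the "standard ranking" the abstract argues is only one of several useful user requirements; zoom-in/zoom-out queries need geometric relations on top of it.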
|20.11.2015 11:10 2026||Markus Wacker (HTW Dresden)
New methods for acquisition and analysis of parametrized surfaces
In this talk I will present recent research of the CG group at the HTW Dresden. In our Motion Capture Lab we have developed new methods to capture detailed surface movements of skin or apparel. With a new decomposition method (SPLOCS: sparse localized deformation components), we are able to extract meaningful parameters to model, edit and control animations of highly detailed surface meshes.
|02.09 15:00 2026||Ulrike Thomas (TU Chemnitz)
Challenges for Robots in Manufacturing and Service Domains
The talk will first introduce the newly established robotics and human-machine interaction lab at TU Chemnitz. It will then outline current research topics in robotics, such as human-robot interaction, which is useful in particular for flexible assembly and service robotics. One key issue is compliant robotics: the talk will explain what compliance means, where it is helpful, and how it can be implemented and used in robotic applications. Moreover, robot vision is essential for bringing robots into our future lives and into real scenarios; therefore, solutions for the bin-picking and object localization problems will be shown.
|12.05 16:00 2026||Paul Swoboda (Universität Heidelberg)
Convergent Message Passing for Structured Linear Programs
For many applications (e.g. MAP-MRF, linear assignment, b-matching, perfect matching), message passing solvers belong to the state of the art, since they are fast, have a low memory footprint and are easy to implement. In the field of MAP-MRF, variants of message passing (SRMP, MPLP, TRW-S) have been developed in recent years that are monotonically convergent and guaranteed to converge to a fixed point, which is usually a very good one in practice. Our message passing algorithm transfers these desirable characteristics to a more general class of structured linear programs. Experiments on graph matching, MAP-MRF and multicut problems confirm the benefits of our approach.
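For intuition on the min-sum message passing underlying MAP-MRF solvers like those the abstract names, here is the one case where it is exact: a chain MRF, where forward messages plus backtracking (the Viterbi algorithm) recover the MAP labeling. A minimal sketch with hypothetical names; the talk's contribution concerns general structured LPs, not chains.

```python
def chain_map(unaries, pairwise):
    """Exact MAP on a chain MRF by min-sum message passing (Viterbi).
    unaries[i][s] is the cost of node i taking state s; pairwise(s, t) is
    the cost between states of consecutive nodes. Returns the MAP labeling."""
    n, k = len(unaries), len(unaries[0])
    msg = list(unaries[0])  # forward message: best cost ending in each state
    back = []               # backpointers for recovering the argmin
    for i in range(1, n):
        new, ptr = [], []
        for t in range(k):
            best_s = min(range(k), key=lambda s: msg[s] + pairwise(s, t))
            new.append(msg[best_s] + pairwise(best_s, t) + unaries[i][t])
            ptr.append(best_s)
        msg, back = new, back + [ptr]
    state = min(range(k), key=lambda t: msg[t])
    labeling = [state]
    for ptr in reversed(back):
        state = ptr[state]
        labeling.append(state)
    return labeling[::-1]

# Three binary nodes with a Potts smoothness term of weight 1:
unaries = [[0, 2], [2, 0], [2, 0]]
potts = lambda s, t: 0 if s == t else 1
labeling = chain_map(unaries, potts)  # -> [0, 1, 1]
```

On loopy graphs the same message updates are no longer exact, which is why monotonically convergent variants such as TRW-S, SRMP and MPLP, and their generalization in this talk, matter.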
|23.02 14:00 MPI-CBC, Galleria||Jan Stühmer (TU Munich)
Connectivity Constraints in Image Segmentation and 3D Reconstruction
A particular problem in image segmentation is the segmentation of thin structures. Especially in biomedical images, the objects of interest often contain small-scale elongated features that are very challenging to extract from noisy data. To solve these image segmentation tasks, connectivity constraints are a powerful extension to existing segmentation methods and make it possible to preserve these thin structures. Application areas are, among others, blood vessel segmentation in angiography, quantification of cell protrusions in microscopy and also the three-dimensional reconstruction of objects. I will present my recent work on connectivity constraints that are formulated on a tree that spans the whole image. The feasible set of this special instance of topological constraints forms a convex cone, and the resulting segmentation problem can be solved efficiently with a recent primal-dual algorithm for continuous convex optimization.
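The tree-based connectivity constraint itself is easy to state: a pixel may be foreground only if its parent on the spanning tree is foreground, so every foreground pixel stays connected to the root. The sketch below is only a greedy illustration of that constraint on a toy rooted tree (all names are hypothetical); the talk solves the constrained problem properly via a convex-cone formulation and a primal-dual algorithm.

```python
def connected_segmentation(parent, score, root=0, thresh=0.0):
    """Greedy illustration of tree connectivity: starting from the root
    (taken as a foreground seed), a node is labeled foreground only if its
    score exceeds the threshold AND its tree-parent is already foreground.
    parent[v] is the parent of node v on the spanning tree."""
    children = {}
    for v, p in enumerate(parent):
        if v != root:
            children.setdefault(p, []).append(v)
    fg = set()
    stack = [root]
    while stack:
        v = stack.pop()
        if v == root or (score[v] > thresh and parent[v] in fg):
            fg.add(v)
            stack.extend(children.get(v, []))  # descend only inside foreground
    return fg

# Tree rooted at 0 with parents [0, 0, 1, 1, 0]; node 2 has a low score,
# so it and anything below it are pruned, keeping the region connected.
fg = connected_segmentation(parent=[0, 0, 1, 1, 0],
                            score=[1.0, 0.5, -1.0, 0.8, 0.7])
```

A greedy pass like this can be far from optimal; the point of the convex formulation in the talk is that the same feasible set (a convex cone) admits efficient, globally convergent optimization.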
|04.09 10:00 2026||Dmitry Vetrov (Moscow State University) Learning deep shape models from weakly annotated data|
|28.08 11:00 MPI-CBG||Alexander Krull (CVLD) The Informed Sampler: A Discriminative Approach to Bayesian Inference in Generative Computer Vision Models, paper|
|21.08 11:00 MPI-CBG||Eric Brachmann (CVLD) Conditional regression forests for human pose estimation|
|24.07 11:00 MPI-CBG, SR2||Diverse speakers, CVPR2014 Overview. There will be a couple of (more or less) informal presentations, discussions etc. Note: the meeting takes place in "seminar room 2, second floor".|
|17.07 11:00 MPI-CBG||Eric Brachmann, Alexander Krull (CVLD), Object Instance Recognition and Tracking in 6D using 3D Object Coordinates, paper|
|19.06 11:00 2026||Anita Sellent (University of Bern), Pixel Correspondences in Supportive Video-Camera Setups|
|17.06 14:50 1004||Ralf Herbrich (Director Amazon Research), Machine Learning @ Amazon|
|03.06 11:10 2026||Hassan Abu Alhaija (HCI Heidelberg), Interactive Edge-based Optical Flow using graph matching|
|26.05 11:10 2026||Bogdan Savchynskyy (HCI Heidelberg), Global MAP-Optimality by Shrinking the Combinatorial Search Area with Convex Relaxation, paper|
|05.05 11:00 2026||Oswald Aldrian (University of Applied Sciences, Stuttgart), Model-based Inverse Rendering of Faces|
|17.04 11:00 MPI-CBG||Dmitrij Schlesinger (CVLD), Two-Way MRFs: a new modeling paradigm|
|18.03 2026||Shuai Zheng (Kyle) (Robotics Research Group, University of Oxford), Beyond sliding window: Fast object detection with proposals|
|14.03 11:00 2026||Vladimir Kolmogorov (IST, Austria), Extensions of submodularity and their applications in computer vision|
|12.03 13:00-18:00 2026||Uwe Schmidt (Darmstadt University), About generative and discriminative models for image restoration, Uwe Schmidt (Darmstadt University), Transformation-aware feature learning, abstract|
|07.02 11:10||Sören König (CGV), Quality and Material Aware 3d Scanning|
|21.01||Alexander Zouhar (CVLD), Joint Shape Classification and Labeling of 3-D Objects Using the Energy Minimization Framework, paper|
|20.12 11:00||Michael Hornacek (TU Vienna), Dense RGB-D Correspondence Search at Large Displacement, page|
|17.12||2x25 min Alexander Krull (CGV, TUD), A divide and conquer strategy for the maximum likelihood localization of low intensity objects, Eric Brachmann (CGV, TUD), Dense 3D Object Coordinates for 6D Pose Estimation of Multiple Objects: A Discriminative-Generative Approach|
|10.12||Shuai Zheng (Kyle) (Robotics Research Group, University of Oxford), Reconstruction meets recognition: A brief survey on the state-of-the-art methods in 3D scene understanding, page|
|06.12 11:00||Rahul Nair (HCI Heidelberg), Reflections on stereo, page|
|04.12 16:30||Daniel Kondermann (HCI Heidelberg), Ground Truth Generation And Performance Analysis, page|
|03.12||Carsten Rother's (CVLD) external research activity|
|26.11||Dmitrij Schlesinger (CVLD), Shape priors for MRF-based segmentation, paper1 paper2 paper3 slides|