Workshop on Probabilistic Graphical Models, October 22-23, 2015


Heidelberg Collaboratory for Image Processing (Speyerer Str. 6, 2nd floor, room H 2.22)


There is no registration fee but the number of participants is limited.
To apply, please send an email with your name, institution and poster abstract to


Thursday, Oct 22, 2015

12:00 Light lunch (Social room, 3rd floor)
13:00 Welcome (2nd floor, room H 2.22)
13:05 Max Welling: Deep Learning when N is Small
13:45 Discussion
14:05 Melih Kandemir: Deep Gaussian Processes, Their Inference, and Using Them for Transfer Learning
14:25 Discussion
14:45 Poster session & coffee break (Social corner, close by main entrance door 2nd floor)
15:45 Mario A. T. Figueiredo: Learning with Strongly Correlated Features
16:25 Discussion
16:45 Johannes Berger: Second-Order Recursive Filtering on the Rigid-Motion Group SE(3) Based on Nonlinear Observations from Monocular Videos
17:05 Discussion
17:25 Walk to old town
20:00 Dinner

Friday, Oct 23, 2015

9:30 Raquel Urtasun: Learning Deep Structured Models
10:10 Discussion
10:30 Coffee break
11:00 Victor Lempitsky: New quantization methods for extreme compression of high-dimensional vectors
11:40 Discussion
12:00 Light lunch
13:00 End


  • Deep Learning when N is Small
    Max Welling, University of Amsterdam

    Deep neural networks have been amazingly successful in a number of problems where the datasets are large. But it is well known that deep learning can overfit when the capacity of the model is too large relative to the number of labeled examples. In application areas such as healthcare, where the amount of measurement data per patient can reach a terabyte but the number of patients in a single dataset is usually no larger than a few thousand, this may become a real issue. In this talk we will explore a number of approaches to deal with the "p>>N" problem, among which 1) Bayesian approaches to deep learning as alternatives to dropout, 2) semi-supervised extensions to deep learning to exploit unlabeled data, 3) techniques to remove known and observed nuisance factors (such as hospital ID), and 4) a principled handling of symmetries in the data.

  • Learning with Strongly Correlated Features
    Mario A. T. Figueiredo, University of Lisbon

    In high-dimensional regression/learning problems, it is common to have several features (also referred to as covariates, predictors, or variables) that are highly correlated. Using standard sparsity-inducing regularization (namely, the famous LASSO) in such scenarios is unsatisfactory, as it leads to the selection of arbitrary convex combinations of those features, maybe even of only a subset thereof. However, especially in scientific applications, it is desirable to explicitly identify all the covariates that are relevant, as well as explicitly identify groups/clusters of such highly correlated covariates. This talk addresses the recently introduced sorted weighted L1 regularizer, which has been proposed precisely for this purpose. We review several convex optimization aspects concerning this regularizer, namely efficient methods to compute the corresponding proximity operator and Euclidean projection and, on the analysis front, we give sufficient conditions for exact feature clustering and characterize its statistical performance.

  • Learning Deep Structured Models
    Raquel Urtasun, University of Toronto

    Deep learning algorithms attempt to model high-level abstractions of the data using architectures composed of multiple non-linear transformations. A multiplicity of variants have been proposed and shown to be extremely successful in a wide variety of applications including computer vision, speech recognition as well as natural language processing. Deep neural networks can, however, be even more powerful when combined with graphical models in order to capture the statistical dependencies between the variables of interest. It is, however, an open problem how to develop scalable deep learning algorithms that can learn higher-order knowledge taking into account the output variables’ dependencies. Existing approaches often rely on a two-step process where a non-linear classifier that employs deep features is trained first, and its output is used to generate potentials for the structured predictor. This piece-wise training is, however, suboptimal as the deep features are learned while ignoring the dependencies between the variables of interest. In this talk I’ll show a wide variety of new deep learning algorithms that can learn complex representations taking into account the dependencies between the output random variables.

  • New quantization methods for extreme compression of high-dimensional vectors
    Victor Lempitsky, Skoltech Computer Vision, Moscow

    I will discuss the problem of encoding high-dimensional vectors (for instance, image descriptors) into a small number of bytes, typically between 4 and 32. In recent years, quantization-based methods such as product quantization of Jegou et al. have emerged as state-of-the-art, as they provide very good compression accuracy and on top of that allow fast evaluation of distances and scalar products between a compressed dataset and an uncompressed query vector. In the talk, I will present two new quantization approaches called additive quantization and tree quantization that surpass product quantization in accuracy, especially for "deep descriptors", while retaining some of its speed characteristics. Curiously, the encoding process within additive quantization gives rise to a fully-connected pairwise Markov Random Field (MRF) optimization problem, which has proven hard to solve efficiently and forced us to resort to a beam-search-type heuristic. Tree quantization bypasses this difficulty by reducing the fully-connected MRF to a tree-shaped MRF, while the shape of the tree is optimized jointly with the quantization codebooks using integer linear programming. Overall, the talk will illustrate an interesting connection between quantization algorithms and the MRF world, and will be based on joint works (CVPR14, CVPR15) with Artem Babenko.

  • Deep Gaussian Processes, Their Inference, and Using Them for Transfer Learning
    Melih Kandemir, Research Training Group, Heidelberg University

    Deep Gaussian processes (GPs) have recently been introduced as an alternative way of building deep architectures. While a conventional neural net is constructed by connecting many logistic regressors, or an approximation to them, a Deep GP consists of a network of neurons, each of which is a Gaussian process. This talk will be about how Deep GP networks can be built in a feed-forward or backward manner, how their inference can be performed, how they can be rescued from overparameterization, and how they can scale up to large data sets. With their kernelized neurons, Deep GPs provide a promising direction for future research on devising powerful learners with shallow architectures. As an example of this, I will show how state-of-the-art level transfer learning can be performed with an intuitive modification of a two-layer Deep GP.

  • Second-Order Recursive Filtering on the Rigid-Motion Group SE(3) Based on Nonlinear Observations from Monocular Videos
    Johannes Berger, Research Training Group, Heidelberg University

    Joint camera motion and depth map estimation from observed scene features is a key task in order to reconstruct 3D scene structure using low-cost monocular video sensors. Due to the nonlinear measurement equations that connect ego-motion with the high-dimensional depth map and optical flow, the task of stochastic state-space filtering is intractable. After introducing the overall problem, the talk focuses on a novel second-order minimum energy approximation that exploits the geometry of SE(3) and recursively estimates the state based on a higher-order kinematic model and the nonlinear measurements. Experimental results for synthetic and real sequences (e.g. KITTI benchmark) demonstrate that our approach achieves the accuracy of modern visual odometry methods.
    This is joint work with Florian Becker, Frank Lenzen, Andreas Neufeld and Christoph Schnörr.
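
As a supplementary illustration for Victor Lempitsky's abstract above: the product quantization baseline it mentions splits each vector into sub-vectors, stores only the index of the nearest codeword for each sub-vector, and evaluates distances to an uncompressed query through small lookup tables. The following is a minimal Python sketch under stated assumptions, not the speaker's method. The function names and the tiny hand-written codebooks are hypothetical, codebook training (typically k-means per subspace) is omitted, and the additive and tree quantization methods of the talk are not shown.

```python
# Illustrative sketch of product quantization (PQ) with asymmetric distances.
# All names and the toy codebooks below are hypothetical; real codebooks are
# learned from data (typically by k-means per subspace), which is omitted.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Split vec into m contiguous sub-vectors of equal length."""
    d = len(vec) // m
    return [vec[i * d:(i + 1) * d] for i in range(m)]

def encode(vec, codebooks):
    """Return one codeword index per subspace (the compressed code)."""
    parts = split(vec, len(codebooks))
    return [min(range(len(cb)), key=lambda k: sq_dist(part, cb[k]))
            for part, cb in zip(parts, codebooks)]

def decode(code, codebooks):
    """Reconstruct the approximation by concatenating the chosen codewords."""
    out = []
    for k, cb in zip(code, codebooks):
        out.extend(cb[k])
    return out

def distance_tables(query, codebooks):
    """Per-subspace squared distances from the query to every codeword."""
    parts = split(query, len(codebooks))
    return [[sq_dist(part, c) for c in cb]
            for part, cb in zip(parts, codebooks)]

def asymmetric_distance(code, tables):
    """Squared distance from the query to the decoded vector, by lookups only."""
    return sum(table[k] for k, table in zip(code, tables))

# Toy setup: 4-dimensional vectors, 2 subspaces, 2 codewords per subspace.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
code = encode([0.9, 1.2, 0.1, 0.8], codebooks)              # compressed vector
tables = distance_tables([1.0, 1.0, 0.0, 0.0], codebooks)   # built once per query
approx = asymmetric_distance(code, tables)                   # no decoding needed
```

Because the subspaces are disjoint, the sum of looked-up entries equals the squared distance between the query and the decoded vector, so a compressed dataset can be scanned with one table lookup per subspace and per item.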