Optimization is central to all of machine learning and can broadly be classified as non-convex or convex. In deep learning, the former is always needed at train time, when searching for parameters that minimize some empirical loss. The latter becomes crucial for advanced deep learning problems in which basic loss functions such as mean squared error or cross-entropy are not adequate.

This seminar will look at advanced loss functions for deep learning and at the convex optimization problems these entail.

Contents:

- Convex functions and convex conjugates
- Linear Programming
- Duality
- f-divergences
- Donsker-Varadhan representation
- Integral probability metrics
- Kantorovich-Rubinstein representation
- Maximum mean discrepancy, energy distance and their equivalence

We will also study practical applications of all of these concepts in modern deep learning.
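As a small taste of the practical side, the following sketch estimates the squared maximum mean discrepancy (one of the topics above) between two samples. It is a minimal illustration, not seminar material: the Gaussian kernel, the bandwidth `sigma=1.0`, and the biased (V-statistic) estimator are all assumptions made for brevity.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), computed pairwise.
    d = x[:, None, :] - y[None, :, :]
    return np.exp(-np.sum(d**2, axis=-1) / (2 * sigma**2))

def mmd2(x, y, sigma=1.0):
    # Biased estimator of the squared MMD between the samples x and y:
    # mean k(x, x') + mean k(y, y') - 2 mean k(x, y).
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    return kxx.mean() + kyy.mean() - 2 * kxy.mean()

rng = np.random.default_rng(0)
# Two samples from the same distribution vs. two from shifted distributions.
same = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2(rng.normal(size=(200, 2)), rng.normal(loc=3.0, size=(200, 2)))
# diff should be clearly larger than same.
```

With the biased estimator, the result is always non-negative; samples from the same distribution give a value near zero, while the shifted pair yields a clearly larger discrepancy.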

Main textbooks:

- Boyd, Vandenberghe: Convex Optimization
- Peyré, Cuturi: Computational Optimal Transport

## Prerequisites

Participants should be familiar with the basics of deep learning and be interested in understanding and presenting papers that range from somewhat to fairly theoretical.

## Eligibility for your degree

This is an "MSc Pflichtseminar" in the Physics MSc program. Participants are expected to give a presentation and prepare a written report to earn 6 CP. Students from other courses or programs who do not want to write a report can earn 2 CP by giving a presentation only. Participants are required to attend all presentations.

## Registering

Just drop in for an overview of the topics at 16h15 on Monday, April 17th, 2023 in seminar room 11 of Mathematikon, INF 205.

Future time slots will be agreed upon among the participants.