Dr. Ulrich Köthe
(The lecture's official but outdated entry in the LSF)
The lecture belongs to the Master in Physics program (specialisation Computational Physics, code "MVSpec"), but is also open to students pursuing a Master of Applied Informatics or a Master of Scientific Computing (code 130000201421901), and to anyone else interested.
Solid basic knowledge in linear algebra, analysis (multi-dimensional differentiation and integration) and probability theory is required.
Summary
Machine learning is one of the most promising approaches to difficult decision and regression problems under uncertainty. The general idea is very simple: instead of modeling a solution explicitly, a domain expert provides example data that demonstrate the desired behavior on representative problem instances. A suitable machine learning algorithm is then trained on these examples to reproduce the expert's solutions as well as possible and to generalize them to new, unseen data. The last two decades have seen tremendous progress towards ever more powerful algorithms, and the course will cover the fundamental ideas from this field.
Dates
| Lecture | Wednesdays | 9:15-10:45 | HCI (Speyerer Str. 6, 2nd floor), seminar room H2.22 | 
| Lecture | Fridays | 11:15-12:45 | HCI, seminar room H2.22 | 
| Exercises | Fridays | 9:30-11:00 | HCI, seminar room H2.22 | 
Please register for the lecture via MÜSLI.
Exercise Assignments
- Exercise 0 (Oct 23, 2015): Introduction to Python and sklearn
- Exercise 1 (deadline: Oct 30, 2015): Nearest-neighbor classification and cross validation
- Exercise 2 (deadline: Nov 03, 2014): QDA and LDA
- Exercise 3 (deadline: Nov 10, 2014): Least-squares derivation of LDA
- Exercise 4 (deadline: Dec 01, 2014): Naive Bayes and density trees
- Exercise 5 (deadline: Dec 08, 2014): Tomographic reconstruction via sparse linear regression
- Exercise 6 (deadline: Dec 15, 2014): Kernel (ridge) regression, RANSAC
- Exercise 7 (deadline: Jan 12, 2015): Covariance penalty, model selection
- Exercise 8 (deadline: Feb 24, 2015): Mini research projects (graded)
Textbooks:
General
- Trevor Hastie, Robert Tibshirani, Jerome Friedman: "The Elements of Statistical Learning" (2nd edition), 745 pages, Springer, 2009 (recommended)
- Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani: "An Introduction to Statistical Learning", 426 pages, Springer, 2013 (example-based, less mathematical version of "The Elements of Statistical Learning")
- Richard O. Duda, Peter E. Hart, David G. Stork: "Pattern Classification" (2nd edition), 680 pages, Wiley, 2000
- David Barber: "Bayesian Reasoning and Machine Learning", 720 pages, Cambridge University Press, 2012
- Christopher M. Bishop: "Pattern Recognition and Machine Learning", 738 pages, Springer, 2006
- Kevin P. Murphy: "Machine Learning - A Probabilistic Perspective", 1105 pages, The MIT Press, 2012
- Charles L. Lawson, Richard J. Hanson: "Solving Least Squares Problems", 350 pages, Society for Industrial and Applied Mathematics, 1987 (best introduction to ordinary and non-negative least-squares, QR decomposition etc.)
- Sabine Van Huffel, Joos Vandewalle: "The Total Least Squares Problem: Computational Aspects and Analysis", 288 pages, Society for Industrial and Applied Mathematics, 1991 (total least squares)
- George A. F. Seber, Christopher J. Wild: "Nonlinear Regression", 792 pages, Wiley, 2003
- Ricardo A. Maronna, R. Douglas Martin, Victor J. Yohai: "Robust Statistics: Theory and Methods", 403 pages, Wiley, 2006 (good introduction to robust loss functions and robust regression)
- Bernhard Schölkopf, Alexander Smola: "Learning with Kernels", 648 pages, The MIT Press, 2001 (everything about support vector machines)
Contents (29 lectures):
- 16. October 2015: Introduction
  - What is "machine learning" and why do we care?
  - Learning problems (classification, regression, forecasting, ...)
  - Types of training data (supervised, unsupervised, weakly supervised)
  - Modeling via joint probabilities and their factorization into tractable form
  - Sources of uncertainty
 
- 21. October 2015: Classification Basics (cf. Duda/Hart/Stork, sections 2.2 to 2.4)
  - Problem statement, confusion matrix
  - Upper bound on the error: pure guessing
  - Lower bound on the error: Bayesian decision rule
  - Discriminative vs. generative models
  - Example calculations of Bayesian error rates
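
To illustrate what a Bayesian error rate calculation can look like, here is a minimal sketch. It is not taken from the lecture: the two-class setup (one-dimensional Gaussian class-conditional densities with equal variance), the parameter values and all function names are assumptions chosen for illustration. The Bayes error is the integral of the smaller of the two joint densities, and for equal priors and equal variances it has a closed form that serves as a cross-check.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a 1D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a 1D Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bayes_error(prior0, mu0, mu1, sigma, lo=-12.0, hi=12.0, steps=24000):
    """Integrate min(p(x, y=0), p(x, y=1)) over x with the midpoint rule.

    The Bayesian decision rule picks the class with the larger joint
    density, so the smaller one is the probability mass that is lost.
    """
    prior1 = 1.0 - prior0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        joint0 = prior0 * normal_pdf(x, mu0, sigma)
        joint1 = prior1 * normal_pdf(x, mu1, sigma)
        total += min(joint0, joint1)
    return total * h


# Assumed toy problem: equal priors, class means at -1 and +1, sigma = 1.
numeric = bayes_error(prior0=0.5, mu0=-1.0, mu1=1.0, sigma=1.0)

# With equal priors and variances the decision threshold is x = 0, and
# both classes contribute the same tail mass, so the error is Phi(-1).
closed_form = normal_cdf(-1.0)

# Uniform random guessing among two classes is wrong half of the time.
guessing = 1.0 - 1.0 / 2.0

print(f"Bayes error (numeric):     {numeric:.6f}")
print(f"Bayes error (closed form): {closed_form:.6f}")
print(f"Error of pure guessing:    {guessing:.6f}")
```

Under these assumptions, the error rate of any classifier worth using lies between the two bounds: it cannot be lower than the Bayes error, and it should not exceed the error of pure guessing.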
 
- 23. and 28. October 2015: Nearest-Neighbor Classification (cf. Duda/Hart/Stork, section 4.5)
  - Nearest neighbor decision rule
  - Empirical error analysis via cross validation
  - Asymptotic error analysis
  - Finite sample error analysis (with Mathematica software demo)
  - Drawbacks of NN classification and how to address them
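
The nearest neighbor decision rule and its empirical error analysis via cross validation can be sketched in a few lines. The following is a hedged illustration rather than the course's reference solution: it uses only the Python standard library, a synthetic two-class data set, and hypothetical helper names (`nearest_neighbor_predict`, `cross_validation_error`, `make_toy_data`) that do not come from the lecture.

```python
import random


def nearest_neighbor_predict(train_x, train_y, query):
    """Return the label of the training point closest to `query`.

    Closeness is measured by the squared Euclidean distance.
    """
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_dist, best_label = dist, y
    return best_label


def cross_validation_error(xs, ys, n_folds=5, seed=0):
    """Estimate the 1-NN error rate by n-fold cross validation."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Every index lands in exactly one fold.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train_x = [xs[i] for i in indices if i not in held_out]
        train_y = [ys[i] for i in indices if i not in held_out]
        # Classify each held-out point using the remaining folds only.
        for i in fold:
            if nearest_neighbor_predict(train_x, train_y, xs[i]) != ys[i]:
                errors += 1
    return errors / len(xs)


def make_toy_data(n_per_class=100, seed=1):
    """Assumed toy data: two 2D Gaussian blobs centered at (-1,-1) and (1,1)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            xs.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
            ys.append(label)
    return xs, ys


xs, ys = make_toy_data()
cv_error = cross_validation_error(xs, ys)
print(f"5-fold cross-validation error of 1-NN: {cv_error:.3f}")
```

The brute-force search compares every query with every training point, which is one of the practical drawbacks of nearest neighbor classification on larger data sets.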
 


