🪴 Anil's Garden

Statistical Learning Theory

18 Jul 2025

Tags: topic, statistics, learning

Building Bridges between Regression, Clustering, and Classification
XGBoost: A Scalable Tree Boosting System
Learnability and the Vapnik-Chervonenkis dimension

Statistical Learning Theory (Spring 2001; CS 281B / Stat 241B) by Michael Jordan

Tree Models
Cross Validation, Regularization, and Information Criteria
TIC/AIC
Bayesian Model Selection
MDL Introduction and Source Coding
Minimum Description Length
More on Marginal Likelihood
Approximation of Marginal Likelihood
Reversible Jump MCMC and Introduction to Kernel Methods (version 1)
Reversible Jump MCMC and Introduction to Kernel Methods (version 2)
Introduction to Support Vector Machines
Lagrangian Duality
Optimal Margin Classifiers
Introduction to Kernels
Support Vector Machines: Non-Separable Classification and Regression
Kernel Principal Component Analysis
Reproducing Kernel Hilbert Spaces
Reproducing Kernel Hilbert Spaces II
The Representer Theorem
Regularization and RKHS
Fourier Perspective on Regularization
Gaussian Processes I
Gaussian Processes II
Gaussian Processes and Reproducing Kernels
Background on Uniform Convergence Bounds
Statistical Learning Theory: Finite Case I
Statistical Learning Theory: Finite Case II
Statistical Learning Theory: Symmetrization Lemma
Annealed Entropy and Growth Function
Vapnik-Chervonenkis Dimension
Structural Risk Minimization
Boosting I
Boosting II

Statistical Learning Theory (Spring 2004; CS 281B / Stat 241B) by Michael Jordan

Course Readings

Introduction
Maximal margin classification
Introduction to kernels
Ridge regression and kernels
Properties of kernels
Soft-margin SVM, sparseness
Regression, the SVD and PCA
Kernel PCA and kernel CCA
Incomplete Cholesky decomposition
ANOVA kernels and diffusion kernels
String kernels and marginalized kernels
Fisher kernels and semidefinite programming
Multiple kernels and RKHS introduction
Reproducing kernel Hilbert spaces I
Reproducing kernel Hilbert spaces II
The Representer Theorem
Gaussian processes I
Gaussian processes II
Gaussian processes and reproducing kernels
Spectral clustering
Spectral clustering, introduction to Bayesian methods
Conjugacy and exponential family
Importance sampling and MCMC
Properties of Dirichlet distribution
Dirichlet processes I
Dirichlet processes II
Dirichlet process mixtures I
Dirichlet process mixtures II
Probabilistic formulation of prediction problems
Risk bounds, concentration inequalities
Glivenko-Cantelli classes and Rademacher averages
Growth function and VC-dimension
Applications of Rademacher averages in large margin classification
Growth function estimates for parameterized binary classes
Covering numbers and metric entropy
Chaining, Dudley’s entropy integral
Covering numbers of VC classes
Bernstein’s inequality, and generalizations

Books and Articles - Multivariate Statistics and Machine Learning by Michael Jordan (1998)

T. S. Jaakkola and M. I. Jordan
Variational probabilistic inference and the QMR-DT database
October, 1998

M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul
An introduction to variational methods for graphical models
December, 1997

R. D. Shachter, S. K. Andersen, and P. Szolovits
Global conditioning for probabilistic inference in belief networks
December, 1997

Kevin P. Murphy
Fitting a conditional Gaussian distribution

Michael I. Jordan
Notes on recursive least squares

M. I. Jordan and R. A. Jacobs
Hierarchical mixtures of experts and the EM algorithm.

M. I. Jordan and R. A. Jacobs
Learning in modular and hierarchical systems.

Michael I. Jordan
Slides from a tutorial on clustering

Michael I. Jordan
Why the logistic function?

Robert Cowell
Introduction to Inference in Bayesian Networks

Michael I. Jordan (Ed.)
Learning in graphical models. MIT Press, Cambridge, MA 1999.

Christopher M. Bishop
Neural Networks for Pattern Recognition. Oxford University Press, 1995.

David Heckerman
Tutorial on Learning With Bayesian Networks, updated November 1996.

