Statistical Learning Theory (Spring 2001; CS 281B / Stat 241B) by Michael Jordan

Statistical Learning Theory (Spring 2004; CS 281B / Stat 241B) by Michael Jordan

  • Introduction [ps] [pdf]
  • Maximal margin classification [ps] [pdf]
  • Introduction to kernels [ps] [pdf]
  • Ridge regression and kernels [ps] [pdf]
  • Properties of kernels [ps] [pdf]
  • Soft-margin SVM, sparseness [ps] [pdf]
  • Regression, the SVD and PCA [ps] [pdf]
  • Kernel PCA and kernel CCA
  • Incomplete Cholesky decomposition [ps] [pdf]
  • ANOVA kernels and diffusion kernels [ps] [pdf]
  • String kernels and marginalized kernels [ps] [pdf]
  • Fisher kernels and semidefinite programming [ps] [pdf]
  • Multiple kernels and RKHS introduction [ps] [pdf]
  • Reproducing kernel Hilbert spaces I [ps] [pdf]
  • Reproducing kernel Hilbert spaces II [ps] [pdf]
  • The Representer Theorem [ps] [pdf]
  • Gaussian processes I [ps] [pdf]
  • Gaussian processes II [ps] [pdf]
  • Gaussian processes and reproducing kernels [ps] [pdf]
  • Spectral clustering [ps] [pdf]
  • Spectral clustering, introduction to Bayesian methods [ps] [pdf]
  • Conjugacy and exponential family [ps] [pdf]
  • Importance sampling and MCMC
  • Properties of Dirichlet distribution [ps] [pdf]
  • Dirichlet processes I [ps] [pdf]
  • Dirichlet processes II [ps] [pdf]
  • Dirichlet process mixtures I [ps] [pdf]
  • Dirichlet process mixtures II [ps] [pdf]
  • Probabilistic formulation of prediction problems [ps]
  • Risk bounds, concentration inequalities [ps]
  • Glivenko-Cantelli classes and Rademacher averages [ps]
  • Growth function and VC-dimension [ps]
  • Applications of Rademacher averages in large margin classification [ps]
  • Growth function estimates for parameterized binary classes [ps]
  • Covering numbers and metric entropy [ps]
  • Chaining, Dudley’s entropy integral [ps]
  • Covering numbers of VC classes [ps]
  • Bernstein’s inequality and generalizations [ps]

Books and Articles - Multivariate Statistics and Machine Learning by Michael Jordan (1998)

T. S. Jaakkola and M. I. Jordan
Variational probabilistic inference and the QMR-DT database
October, 1998

M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul
An introduction to variational methods for graphical models
December, 1997

R. D. Shachter, S. K. Andersen, and P. Szolovits
Global conditioning for probabilistic inference in belief networks
December, 1997

Kevin P. Murphy
Fitting a conditional Gaussian distribution

Michael I. Jordan
Notes on recursive least squares

M. I. Jordan and R. A. Jacobs
Hierarchical mixtures of experts and the EM algorithm

M. I. Jordan and R. A. Jacobs
Learning in modular and hierarchical systems

Michael I. Jordan
Slides from a tutorial on clustering

Michael I. Jordan
Why the logistic function?

Robert Cowell
Introduction to Inference in Bayesian Networks

Michael I. Jordan (Ed.)
Learning in graphical models. MIT Press, Cambridge, MA, 1999.

Christopher M. Bishop
Neural Networks for Pattern Recognition. Oxford University Press, 1995.

David Heckerman
Tutorial on Learning With Bayesian Networks, updated November 1996.