Date

Topics

Readings

Notes

Introduction

Th 9/3

Introduction

Overview, applications, history

Bishop 1


Tu 9/8

Math Review

Probability, stats, linear algebra, optimization

You do NOT need to read all
of these. However, you should
be familiar with the material.

Probability:

Bishop Chapter 2

Bishop Appendix B

Andrew Moore Tutorial

Tom Minka's nuances of probability (advanced)

Wolfram Probability and stats

Linear algebra:

Bishop Appendix C

Sam Roweis' Matrix Identities

Tom Minka's Matrix Optimization


Th 9/10

Foundations from Learning Theory

Learning definitions, settings

Bishop 1


Tu 9/15

Decision Trees

Construction, pruning, over-fitting

Nilsson chapter 6


Supervised Learning: Linear Methods

Th 9/17

Regression

Least squares and regression

Bishop 3


Tu 9/22

Classification

Logistic Regression

Bishop 4


Th 9/24

Generative vs. discriminative

Naive Bayes and Logistic Regression

Ng and Jordan, 2001


Tu 9/29

Online methods

Perceptron

Blum. On-Line Algorithms in Machine Learning. 1996


Supervised Learning: Non-Linear Methods

Th 10/1

Support Vector Machines

Max-margin classification and optimization

Bishop 7.1

Chris Burges SVM Tutorial

Bishop Appendix E


Tu 10/6

Kernel Methods

Dual optimization, kernel trick

Bishop 6.1, 6.2


Th 10/8

Instance-based learning

Nearest-neighbors

Bishop 2.5

Mitchell 8-8.4


Tu 10/13

Neural Networks 1

Neural Network models

Bishop 5.1, 5.2


Th 10/15

Neural Networks 2

Learning neural networks

Bishop 5.3, 5.5


Unsupervised Learning

Tu 10/20

EM and Clustering 1

Expectation-Maximization and k-means

Bishop 9


Th 10/22

EM and Clustering 2

Gaussian mixture models

Bishop 9


Tu 10/27

Graphical models 1

Bayesian networks and conditional independence

Bishop 8.1, 8.2


Th 10/29

Graphical models 2

MRFs and exact inference

Bishop 8.3, 8.4


Complex Output

Tu 11/3

Sequential graphical models 1

Max Sum and Max Product

Bishop 13.1, 13.2


Th 11/5

Sequential graphical models 2

HMMs and CRFs

Sutton, McCallum CRF tutorial


Tu 11/10

Dimensionality reduction

PCA, probabilistic PCA

Bishop 12.1, 12.2, 12.3

Max Welling's PCA Tutorial


Th 11/12

Ensemble Methods

Boosting and ensembles

Bishop 14.1, 14.2, 14.3

A Short Introduction to Boosting


Tu 11/17

Multi-class

Reductions, 1-of-K encoding, structured

Bishop 4.1.2

Solving Multiclass Learning Problems via Error-Correcting Output Codes (Sections 1, 2.3, 2.4)

Reducing Multiclass to Binary (Sections 1, 2, 3)


Learning Settings

Th 11/19

Learning settings 1

Unsupervised Prediction Aggregation


Tu 11/24

Learning settings 2

Active learning

Burr Settles Active Learning Tutorial


Th 11/26

Thanksgiving Break

No class



Tu 12/1

Learning settings 3

Multi-task learning, transfer learning and domain adaptation

Rich Caruana. Multi-Task Learning

Domain adaptation tutorial


Th 12/3

Learning settings 4

Semi-supervised learning

Jerry Zhu Semi-Supervised Learning Tutorial


Th 12/17

Final Exam Time

Project presentations

6-9pm