Statistical Machine Learning

600.675 Spring 2015

Raman Arora


Description: This is a second graduate-level course in machine learning. It provides formal, in-depth coverage of topics at the interface of statistical theory and computational sciences. We will revisit popular machine learning algorithms and analyze their performance in terms of the amount of data required (sample complexity), the memory needed (space complexity), and the overall computational runtime (computation or iteration complexity). We will cover topics including nonparametric methods, kernel methods, online learning, and reinforcement learning, and will introduce students to current topics in large-scale machine learning and randomized projections. Topics will vary from year to year, but the general focus will be on combining methodology with theoretical and computational foundations. A tentative list of the topics we will cover:

1. Intro [Motivation, definitions, terminology, probability tools, concentration inequalities]

2. Foundations [PAC learning, finite class, realizable case, unrealizable case, Bayes error, estimation and approximation errors, model selection, regularization, Rademacher complexity, VC dimension]

3. Nonparametric methods [density estimation, nonparametric Bayes, clustering]

4. Support Vector Machines [linear classification, margin bounds, kernel methods]

5. Ensemble methods [Weak and strong learning, margin interpretation, AdaBoost, analysis]

6. Online learning [Perceptron, online gradient descent, experts setting, Winnow rule, Bregman divergence, mirror descent, online-to-batch conversion, sketching]

7. Regression [Generalization bounds, linear regression, kernel ridge regression, support vector regression, sparsity]

8. Large-scale machine learning [Big Data, parallel and distributed learning, mini-batching, stochastic approximation algorithms, SGD methods, Pegasos]

9. Other topics [Active learning, reinforcement learning, ranking, randomized projections]

Prerequisites: You should have taken CS 600.475, CS 600.476, or an equivalent introductory machine learning class.

Textbook: We will use the following texts for this course:

1. Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. 2014.

2. Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of Machine Learning. 2012.

3. Shai Shalev-Shwartz. Online Learning and Online Convex Optimization. 2012. Available at:

Other useful references are:

4. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning. 2001. Available at

5. Luc Devroye, László Györfi, and Gábor Lugosi. A Probabilistic Theory of Pattern Recognition.

6. Vladimir N. Vapnik. Statistical Learning Theory. 1998.

7. Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. 2012.

8. Christopher M. Bishop. Pattern Recognition and Machine Learning. 2009.

9. Tom M. Mitchell. Machine Learning. 1997.

Grading: Grades will be based on four homework assignments (30%), a final project (40%), and an in-class midterm exam (30%).

Forums: This term we will be using Piazza for class discussion. Rather than emailing questions to the teaching staff, please post them on Piazza. Find our class page at:

We will also be aiming for an interactive class experience, with collaboration and real-time polls. Please sign up for a free account and watch your email for a passphrase to enroll.

A more detailed course description is available here:

Instructor: Raman Arora

Time: Thursdays (3:00PM-4:15PM)

Location: Malone 228