Nonparametric Learning in High Dimensions

Han Liu, Carnegie Mellon University

Despite the high dimensionality and complexity of many modern datasets, some problems have hidden structure that makes efficient statistical inference feasible. Examples of such hidden structure include additivity, sparsity, low-dimensional manifold structure, smoothness, copula structure, and conditional independence relations.

In this talk, I will describe efficient nonparametric learning algorithms that exploit such hidden structures to overcome the curse of dimensionality. These algorithms have strong theoretical guarantees and provide practical methods for many fundamentally important learning problems, ranging from unsupervised exploratory data analysis to supervised predictive modeling.

I will use two examples, high-dimensional graph estimation and multi-task regression, to illustrate the principles of developing high-dimensional nonparametric methods. The theoretical results are presented in terms of risk consistency, estimation consistency, and model selection consistency. The practical performance of the algorithms is illustrated on genomics and cognitive neuroscience examples and compared to state-of-the-art parametric competitors.

This work is joint with John Lafferty and Larry Wasserman.

Speaker Biography

Han Liu is a fifth-year PhD student in the Machine Learning Department within the School of Computer Science at Carnegie Mellon University. He is in the Joint PhD program in Machine Learning and Statistics. His dissertation, directed by John Lafferty and Larry Wasserman, is entitled “High Dimensional Nonparametric Learning and Massive-Data Analysis”. This study investigates fundamental theory and methods for high-dimensional nonparametric inference and demonstrates their applicability to areas such as computational biology and cognitive neuroscience. Over the past two years, Han Liu has won the Google PhD fellowship award in Statistics, the best student paper award at the 26th International Conference on Machine Learning (ICML 2009), and the 2010 best paper award in the ASA (American Statistical Association) student paper competition in Statistical Computing and Graphics. Han Liu obtained his MSc in Statistics and Machine Learning at Carnegie Mellon University in 2007 and another MSc in Computer Science at the University of Toronto in 2005.