Tera-scale deep learning

Quoc V. Le, Stanford University
Hosts: Jason Eisner and Suchi Saria

Deep learning and unsupervised feature learning offer the potential to transform many domains such as vision, speech, and NLP. However, these methods have been fundamentally limited by our computational abilities, and have typically been applied to small problems. In this talk, I describe the key ideas that enabled scaling deep learning algorithms to train a very large model on a cluster of 16,000 CPU cores (2000 machines). This network has 1.15 billion parameters, which is more than 100x larger than the next largest network reported in the literature. Such a network, when applied at this scale, is able to learn abstract concepts in a much more general manner than previously demonstrated. Specifically, we find that by training on 10 million unlabeled images, the network produces features that are very selective for high-level concepts such as human faces and cats. Using these features, we also obtain significant leaps in recognition performance on several large-scale computer vision tasks.

Speaker Biography

Quoc Le is a PhD student at Stanford and a software engineer at Google. At Stanford and Google, Quoc works on large-scale brain simulation using unsupervised feature learning and deep learning. His recent work was widely distributed and discussed on various technology blogs and news sites. Quoc obtained his undergraduate degree at the Australian National University, and was a research visitor at National ICT Australia, Microsoft Research and Max Planck Institute of Biological Cybernetics. Quoc won the best paper award at ECML 2007.