Department of Computer Science
Johns Hopkins University
3400 N. Charles Street, Hackerman 226
Baltimore, MD 21218-2680 U.S.A.
G+: profile page
Office: Hackerman 324C
Phone: (410) 516-8438 (dial 516-THETA)
Fax: (410) 516-6134
Department of Computer Science (my primary appointment)
Center for Language and Speech Processing (my major multi-departmental center at JHU)
Machine Learning Group (large community of ML researchers at JHU)
Human Language Technology Center of Excellence (another large group I'm involved with at JHU)
Department of Cognitive Science (my joint appointment)
All kinds of novel methods for natural language processing:
New machine learning, combinatorial algorithms, probabilistic models of linguistic structure, and declarative specification of knowledge and algorithms.
The question: How can we appropriately formalize linguistic structure and discover it automatically?
The engineering motivation: Computers must learn to understand human language. A huge portion of human communication, thought, and culture now passes through computers. Ultimately, we want our devices to help us by understanding text and speech as a human would—both at the small scale of intelligent user interfaces and at the large scale of the entire multilingual Internet.
The scientific motivation: Human language is fascinatingly complex and ambiguous. Yet babies are born with the incredible ability to discover the structure of the language around them. Soon they are able to rapidly comprehend and produce that language and relate it to events and concepts in the world. Figuring out how this is possible is a grand challenge for both cognitive science and machine learning.
The disciplines: My research program combines computer science with statistics and linguistics. The challenge is to fashion statistical models that are nuanced enough to capture good intuitions about linguistic structure, and especially, to develop efficient algorithms to apply these models to data (including training them with as little supervision as possible).
Models: I've developed significant modeling approaches for a wide variety of domains in natural language processing—syntax, phonology, morphology, and machine translation, as well as semantic preferences, name variation, and even database-backed websites. The goal is to capture not just the structure of sentences, but also deep regularities within the grammar and lexicon of a language (and across languages). My students and I are always thinking about new problems and better models. For example, latent variables and nonparametric Bayesian methods let us construct a linguistically plausible account of how the data arose. Our latest models continue to include linguistic ideas, but they also include deep neural networks in order to fit unanticipated regularities.
Algorithms: A good mathematical model will define the best analysis of the data, but can we compute that analysis? My students and I are constantly developing new algorithms to cope with the tricky structured prediction and learning problems posed by increasingly sophisticated models. Unlike researchers in many other areas of machine learning, we have to deal with probability distributions over unboundedly large structured variables such as strings, trees, alignments, and grammars. My favorite tools include dynamic programming, Markov chain Monte Carlo (MCMC), belief propagation and other variational approximations, automatic differentiation, deterministic annealing, stochastic local search, coarse-to-fine search, integer linear programming, and relaxation methods. I especially enjoy connecting disparate techniques in fruitful new ways.
General paradigms: My students and I also work to pioneer general statistical and algorithmic paradigms that cut across problems (not limited to NLP). We are developing a high-level declarative programming language, Dyna, which allows startlingly short programs, backed up by many interesting general efficiency tricks so that these don't have to be reinvented and reimplemented in new settings all the time. We are also showing how to learn execution strategies that do fast and accurate approximate statistical inference, and how to properly train these essentially discriminative strategies in a Bayesian way. We have also developed other machine learning techniques and modeling frameworks of general interest, primarily for structured prediction and temporal sequence modeling.
Measuring success: We implement our new methods and evaluate them carefully on collections of naturally occurring language. We have repeatedly improved the state of the art. While our work can certainly be used within today's end-user applications, such as machine translation and information extraction, we ourselves are generally focused on building up the long-term fundamentals of the field.
In general, I have broad interests and have worked on a wide range of fundamental topics in NLP, drawing on varied areas of computer science. See my papers, CV, and research summary for more information; see also notes on my advising style.
See also other tutorial material.
Undergraduates are often curious about their teachers' secret lives. In the name of encouraging curiosity-driven research, here are a few photos:
And some non-photos:
If I had a geek code, it would be GCS/O/M/MU d-(+) s:- a+ C++$ ULS+(++) L++ P++ E++>+++ W++ N++ o+ K++ w@ !O V- PS++ PE- Y+ PGP b++>+++ !tv G e++++ h- r+++ y+++, but I disapprove of the feeping creaturism of these things.