Adam Lopez
The biggest barrier to global communication is the fact that we speak many different languages. My research and teaching focus on technology that will break this barrier, in particular systems that learn how to translate from vast amounts of data (like Google Translate). Improvements to these systems depend on the extension and application of fundamental ideas from diverse fields such as algorithms, machine learning, formal language and automata theory, and computational linguistics. I am interested in a variety of problems in these fields.
I am an assistant research professor at Johns Hopkins University, where I am primarily affiliated with the Human Language Technology Center of Excellence (HLTCOE). I am also affiliated with the Center for Language and Speech Processing and the computer science department. Previously I was a research fellow in the machine translation research group at the University of Edinburgh, where I moved after earning my Ph.D. in computer science from the University of Maryland. I've had the good fortune to collaborate with many excellent researchers, including Abhishek Arun, Michael Auli, Phil Blunsom, Chris Callison-Burch, David Chiang, Trevor Cohn, Chris Dyer, Juri Ganitkevitch, Barry Haddow, Rebecca Hwa, Philipp Koehn, Jimmy Lin, Nitin Madnani, Christof Monz, Matt Post, Philip Resnik, Jason Smith, and Jonathan Weese.
News
- My Ph.D. student Jason Smith will graduate in 2013. You should hire him!
- My Ph.D. student Michael Auli successfully defended his thesis and joined Microsoft Research in fall 2012.
- I taught Machine Translation in spring 2012 with Chris Callison-Burch and Matt Post. If you're teaching a class that includes topics in machine translation, consider using our open-ended homework assignments (Instructions: 1, 2, 3, 4). Feel free to email me for additional information.
- I co-chaired the machine translation track at ACL 2012.
Publications
- Learning to translate with products of novices: Teaching MT with open-ended challenge problems. With Matt Post, Chris Callison-Burch, Jonathan Weese, Juri Ganitkevitch, Narges Ahmidi, Olivia Buzek, Leah Hanson, Beenish Jamil, Matthias Lee, Ya-Ting Lin, Henry Pao, Fatima Rivera, Leili Shahriyari, Debu Sinha, Adam Teichert, Stephen Wampler, Michael Weinberger, Daguang Xu, Lin Yang, and Shang Zhao. To appear in Transactions of the ACL, June 2013.
- Massively Parallel Suffix Array Queries and On-Demand Phrase Extraction for Statistical Machine Translation Using GPUs. Hua He, Jimmy Lin, and Adam Lopez. To appear in Proceedings of NAACL HLT, June 2013.
- Putting Human Assessments of Machine Translation Systems in Order. In Proceedings of WMT, June 2012. [slides] [code]
Human assessment is often considered the gold standard in evaluation of translation systems. But in order for the evaluation to be meaningful, the rankings obtained from human assessment must be consistent and repeatable, and recent analysis by Bojar et al. (2011) raised several concerns about the rankings derived from human assessments of English-Czech translation systems in the 2010 Workshop on Machine Translation. We extend their analysis to all of the ranking tasks from 2010 and 2011, and show through an extension of their reasoning that the ranking is naturally cast as an instance of finding the minimum feedback arc set in a tournament, a well-known NP-complete problem. All instances of this problem in the workshop data are efficiently solvable, but in some cases the rankings it produces are surprisingly different from the ones previously published. This leads to strong caveats and recommendations for both producers and consumers of these rankings.
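To make the reduction concrete, here is a minimal Python sketch (judgment counts invented for illustration) that ranks a handful of systems by brute-force search for the minimum feedback arc set: the ordering that contradicts the least total weight of pairwise human judgments. Brute force over permutations is only feasible for small instances; the point is the formulation, not the algorithm.

    from itertools import permutations

    # wins[a][b]: number of human judgments preferring system a over system b
    # (invented numbers, for illustration only).
    wins = {
        "sysA": {"sysB": 7, "sysC": 2},
        "sysB": {"sysA": 3, "sysC": 6},
        "sysC": {"sysA": 8, "sysB": 4},
    }

    def feedback_weight(order):
        # Total weight of "backward" arcs: judgments in which a system
        # beat another that we nonetheless ranked above it.
        return sum(wins[low].get(high, 0)
                   for i, low in enumerate(order)
                   for high in order[:i])

    best = min(permutations(wins), key=feedback_weight)
    print(best, feedback_weight(best))  # ('sysC', 'sysA', 'sysB') 11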
- Using Categorial Grammar to Label Translation Rules. Jonathan Weese, Chris Callison-Burch, and Adam Lopez. In Proceedings of WMT, June 2012.
Adding syntactic labels to synchronous context-free translation rules can improve performance, but labeling with phrase structure constituents, as in GHKM (Galley et al., 2004), excludes potentially useful translation rules. SAMT (Zollmann and Venugopal, 2006) introduces heuristics to create new non-constituent labels, but these heuristics introduce many complex labels and tend to add rarely-applicable rules to the translation grammar. We introduce a new labeling scheme based on categorial grammar, which allows syntactic labeling of many rules with a minimal, well-motivated label set. We show that our labeling scheme performs comparably to SAMT on an Urdu–English translation task, yet the label set is an order of magnitude smaller, and translation is twice as fast.
- Training a Log-Linear Parser with Loss Functions via Softmax-Margin. Michael Auli and Adam Lopez. In Proceedings of EMNLP, July 2011.
Log-linear parsing models are often trained by optimizing likelihood, but we would prefer to optimize for a task-specific metric like F-measure. Softmax-margin is a convex objective for such models that minimizes a bound on expected risk for a given loss function, but its naïve application requires the loss to decompose over the predicted structure, which is not true of F-measure. We use softmax-margin to optimize a log-linear CCG parser for a variety of loss functions, and demonstrate a novel dynamic programming algorithm that enables us to use it with F-measure, leading to substantial gains in accuracy on CCGbank. When we embed our loss-trained parser into a larger model that includes supertagging features incorporated via belief propagation, we obtain further improvements and achieve a labelled/unlabelled dependency F-measure of 89.3%/94.0% on gold part-of-speech tags, and 87.2%/92.8% on automatic part-of-speech tags, the best reported results for this task.
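For readers unfamiliar with the objective, here is a toy sketch of softmax-margin over an explicitly enumerated candidate list; the paper instead computes the same quantity by dynamic programming over the parse forest, and the scores and losses below are invented.

    import numpy as np

    scores = np.array([2.0, 1.5, 0.2])  # model scores of candidate parses (invented)
    losses = np.array([0.0, 0.8, 1.0])  # task loss of each candidate vs. gold (invented)
    gold = 0                            # index of the correct parse

    # Softmax-margin: the loss-augmented log-partition minus the gold score.
    # Adding the loss inside the exponent makes high-loss candidates costlier,
    # so minimizing this objective pushes probability mass toward low-loss parses.
    objective = np.logaddexp.reduce(scores + losses) - scores[gold]
    print(objective)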
- Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor. Jonathan Weese, Juri Ganitkevitch, Chris Callison-Burch, Matt Post, and Adam Lopez. In Proceedings of WMT, July 2011.
We present progress on Joshua, an open-source decoder for hierarchical and syntax-based machine translation. The main focus is describing Thrax, a flexible, open-source synchronous context-free grammar extractor. Thrax extracts both hierarchical (Chiang, 2007) and syntax-augmented machine translation (Zollmann and Venugopal, 2006) grammars. It is built on Apache Hadoop for efficient distributed performance, and can easily be extended with support for new grammars, feature functions, and output formats.
- A Comparison of Loopy Belief Propagation and Dual Decomposition for Integrated CCG Supertagging and Parsing. Michael Auli and Adam Lopez. In Proceedings of ACL, June 2011.
Via an oracle experiment, we show that the upper bound on accuracy of a CCG parser is significantly lowered when its search space is pruned using a supertagger, though the supertagger also prunes many bad parses. Inspired by this analysis, we design a single model with both supertagging and parsing features, rather than separating them into distinct models chained together in a pipeline. To overcome the resulting increase in complexity, we experiment with both belief propagation and dual decomposition approaches to inference, the first empirical comparison of these algorithms that we are aware of on a structured natural language processing problem. On CCGbank we achieve a labelled dependency F-measure of 88.8% on gold part-of-speech tags, and 86.7% on automatic part-of-speech tags, the best reported results for this task.
- Efficient CCG Parsing: A* versus Adaptive Supertagging. Michael Auli and Adam Lopez. In Proceedings of ACL, June 2011.
We present a systematic comparison and combination of two orthogonal techniques for efficient parsing of Combinatory Categorial Grammar (CCG). First we consider adaptive supertagging, a widely used approximate search technique that prunes most lexical categories from the parser's search space using a separate sequence model. Next we consider several variants on A*, a classic exact search technique which to our knowledge has not been applied to more expressive grammar formalisms like CCG. In addition to standard hardware-independent measures of parser effort we also present what we believe is the first evaluation of A* parsing on the more realistic but more stringent metric of CPU time. By itself, A* substantially reduces parser effort as measured by the number of edges considered during parsing, but we show that for CCG this does not always correspond to improvements in CPU time over a CKY baseline. Combining A* with adaptive supertagging decreases CPU time by 15% for our best model.
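As a reminder of the exact-search idea at play, here is a generic A* sketch on a toy weighted graph (graph and heuristic invented): with an admissible heuristic, one that never overestimates the remaining cost, the first time the goal is popped from the agenda its cost is optimal. The paper applies this machinery to CCG chart items rather than graph nodes.

    import heapq

    graph = {"s": [("a", 1), ("b", 4)], "a": [("g", 5)], "b": [("g", 1)], "g": []}
    h = {"s": 2, "a": 3, "b": 1, "g": 0}  # admissible: never overestimates

    def astar(start, goal):
        agenda = [(h[start], 0, start)]  # (cost so far + heuristic, cost so far, node)
        seen = set()
        while agenda:
            _, cost, node = heapq.heappop(agenda)
            if node == goal:
                return cost
            if node in seen:
                continue
            seen.add(node)
            for nxt, weight in graph[node]:
                heapq.heappush(agenda, (cost + weight + h[nxt], cost + weight, nxt))

    print(astar("s", "g"))  # 5, via s-b-g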
- Final Report of the 2010 CLSP Workshop on Models for Synchronous Grammar Induction. Phil Blunsom, Chris Callison-Burch, Trevor Cohn, Chris Dyer, Jonathan Graehl, Adam Lopez, Jan Botha, Vladimir Eidelman, ThuyLinh Nguyen, Ziyuan Wang, Jonathan Weese, Olivia Buzek, and Desai Chen. August 2010.
- Monte Carlo Techniques for Phrase-Based Translation. Abhishek Arun, Barry Haddow, Philipp Koehn, Adam Lopez, Phil Blunsom, and Chris Dyer. In Machine Translation 24(2), August 2010.
Recent advances in statistical machine translation have used approximate beam search for NP-complete inference within probabilistic translation models. We present an alternative approach of sampling from the posterior distribution defined by a translation model. We define a novel Gibbs sampler for sampling translations given a source sentence and show that it effectively explores this posterior distribution. In doing so we overcome the limitations of heuristic beam search and obtain theoretically sound solutions to inference problems such as finding the maximum probability translation and minimum risk training and decoding.
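For intuition about the approach, the toy sketch below runs the generic Gibbs sampling scheme on an invented two-variable distribution: repeatedly resample each variable from its conditional given the rest, and the sample frequencies converge to the target distribution. The paper's sampler resamples pieces of a translation derivation in exactly this spirit.

    import random

    def joint(x, y):
        # Unnormalized joint over two binary variables (invented).
        return {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}[(x, y)]

    def sample_x_given_y(y):
        w0, w1 = joint(0, y), joint(1, y)
        return 1 if random.random() < w1 / (w0 + w1) else 0

    def sample_y_given_x(x):
        w0, w1 = joint(x, 0), joint(x, 1)
        return 1 if random.random() < w1 / (w0 + w1) else 0

    x, y = 0, 0
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for _ in range(10000):
        x = sample_x_given_y(y)
        y = sample_y_given_x(x)
        counts[(x, y)] += 1
    print(counts)  # frequencies approach the normalized joint: 0.1, 0.2, 0.3, 0.4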
- cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models. Chris Dyer, Adam Lopez, Juri Ganitkevitch, Jonny Weese, Ferhan Ture, Phil Blunsom, Hendra Setiawan, Vlad Eidelman, and Philip Resnik. In Proceedings of ACL (Demonstration track), pages 7–12, July 2010.
We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models based on synchronous context-free grammars. Using a single unified internal representation for translation forests, the decoder strictly separates model-specific translation logic from general rescoring, pruning, and inference algorithms. From this unified representation, the decoder can extract not only the 1- or k-best translations, but also alignments to a reference, or the quantities necessary to drive discriminative training using gradient-based or gradient-free optimization techniques. Its efficient C++ implementation means that memory use and runtime performance are significantly better than comparable decoders.
- A Unified Framework for Phrase-Based, Hierarchical, and Syntax-Based Statistical Machine Translation. Hieu Hoang, Philipp Koehn, and Adam Lopez. In Proceedings of IWSLT, December 2009.
Despite many differences between phrase-based, hierarchical, and syntax-based translation models, their training and testing pipelines are strikingly similar. Drawing on this fact, we extend the Moses toolkit to implement hierarchical and syntactic models, making it the first open source toolkit with end-to-end support for all three of these popular models in a single package. This extension substantially lowers the barrier to entry for machine translation research across multiple models.
- Monte Carlo inference and maximization for phrase-based translation. Abhishek Arun, Chris Dyer, Barry Haddow, Phil Blunsom, Adam Lopez, and Philipp Koehn. In Proceedings of CoNLL, June 2009.
Recent advances in statistical machine translation have used beam search for approximate NP-complete inference within probabilistic translation models. We present an alternative approach of sampling from the posterior distribution defined by a translation model. We define a novel Gibbs sampler for sampling translations given a source sentence and show that it effectively explores this posterior distribution. In doing so we overcome the limitations of heuristic beam search and obtain theoretically sound solutions to inference problems such as finding the maximum probability translation and minimum expected risk training and decoding.
- Translation as Weighted Deduction. In Proceedings of EACL, March 2009. [slides]
We present a unified view of many translation algorithms that synthesizes work on deductive parsing, semiring parsing, and efficient approximate search algorithms. This gives rise to clean analyses and compact descriptions that can serve as the basis for modular implementations. We illustrate this with several examples, showing how to mechanically develop search spaces using non-local features, novel models, and a variety of disparate phrase-based strategies. Although the framework is drawn from parsing and applied to translation, it is applicable to many dynamic programming problems arising in natural language processing and other areas.
Errata: This draft corrects errors that appeared in the goal item of the Monotone-Generate logic (Section 5; in particular, the goal item should have no words left to generate) and in the deductive rules of Monotone-Generate + Ngram (Figure 2.2; the indexes of the n-gram context were incorrect, and the consequent of the second rule should start with i rather than i+1). Thanks to Shay Cohen for pointing these out.
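To illustrate the weighted-deduction view, here is a sketch of CKY with a pluggable semiring: the same item-based logic computes the total inside score or the best derivation score depending on which plus and times are supplied. The grammar and weights are a toy example, not from the paper.

    from collections import defaultdict

    # Binary rules (lhs, rhs1, rhs2, weight) and lexical rules (lhs, word, weight).
    binary = [("S", "NP", "VP", 1.0), ("VP", "V", "NP", 1.0)]
    lexical = [("NP", "we", 0.5), ("V", "translate", 1.0), ("NP", "text", 0.5)]

    def cky(words, plus, times, zero):
        chart = defaultdict(lambda: zero)
        for i, w in enumerate(words):
            for lhs, word, wt in lexical:
                if word == w:
                    chart[lhs, i, i + 1] = plus(chart[lhs, i, i + 1], wt)
        n = len(words)
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                k = i + span
                for j in range(i + 1, k):
                    for lhs, r1, r2, wt in binary:
                        score = times(times(chart[r1, i, j], chart[r2, j, k]), wt)
                        chart[lhs, i, k] = plus(chart[lhs, i, k], score)
        return chart["S", 0, n]

    sent = "we translate text".split()
    # Inside semiring (total probability) and Viterbi semiring (best derivation):
    print(cky(sent, plus=lambda a, b: a + b, times=lambda a, b: a * b, zero=0.0))
    print(cky(sent, plus=max, times=lambda a, b: a * b, zero=0.0))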
- A Systematic Analysis of Translation Model Search Spaces. Michael Auli, Adam Lopez, Hieu Hoang, and Philipp Koehn. In Proceedings of the Fourth Workshop on Statistical Machine Translation, March 2009.
Translation systems are complex, and most metrics do little to pinpoint causes of error or isolate system differences. We use a simple technique to discover induction errors, which occur when good translations are absent from model search spaces. Our results show that a common pruning heuristic drastically increases induction error, and also strongly suggest that the search spaces of phrase-based and hierarchical phrase-based models are highly overlapping despite the well known structural differences.
- Tera-Scale Translation Models via Pattern Matching. In Proceedings of COLING, pages 505–512, August 2008. [slides]
Translation model size is growing at a pace that outstrips improvements in computing power, and this hinders research on many interesting models. We show how an algorithmic scaling technique can be used to easily handle very large models. Using this technique, we explore several large model variants and show an improvement of 1.4 BLEU on the NIST 2006 Chinese-English task. This opens the door for work on a variety of models that are much less constrained by computational limitations.
- Statistical Machine Translation. In ACM Computing Surveys 40(3), Article 8, pages 1–49, August 2008.
Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and new ideas are constantly introduced. This survey presents a tutorial overview of the state of the art. We describe the context of the current research and then move to a formal problem description and an overview of the main subproblems: translation modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and a discussion of future directions.
Errata: The reference for Banerjee and Lavie (2005) on p. 39 is missing. It should be:
- S. Banerjee and A. Lavie. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In Proceedings of the ACL 2005 Workshop on Intrinsic and Extrinsic Evaluation Measures for MT and/or Summarization, 2005.
There is also a typo in rule S3 on page 11: the English and Chinese sides of the rule are swapped. It should read:
- NPB → JJ1 NPB2 / NPB2 JJ1
- Machine Translation by Pattern Matching. Dissertation, University of Maryland, March 2008. [slides] [LaTeX source]
The best systems for machine translation of natural language are based on statistical models learned from data. Conventional representation of a statistical translation model requires substantial offline computation and representation in main memory. Therefore, the principal bottlenecks to the amount of data we can exploit and the complexity of models we can use are available memory and CPU time, and current state of the art already pushes these limits. With data size and model complexity continually increasing, a scalable solution to this problem is central to future improvement.
Callison-Burch et al. (2005) and Zhang and Vogel (2005) proposed a solution that we call "translation by pattern matching", which we bring to fruition in this dissertation. The training data itself serves as a proxy to the model; rules and parameters are computed on demand. This achieves our desiderata of minimal offline computation and compact representation, but depends on fast pattern matching algorithms over text. They demonstrated its application to a common model based on the translation of contiguous substrings, but left some open problems, among them this question: can the approach match the performance of conventional methods despite the unavoidable differences it induces in the model? We show how to answer this question affirmatively.
The main open problem we address is much harder. Many translation models are based on the translation of discontiguous substrings. The best pattern matching algorithm for these models is much too slow, taking several minutes per sentence. We develop new algorithms that reduce empirical computation time by two orders of magnitude for these models, making translation by pattern matching widely applicable. We use these algorithms to build a model that is two orders of magnitude larger than the current state of the art and substantially outperforms a strong competitor in Chinese-English translation. We show that a conventional representation of this model would be impractical. Our experiments shed light on some interesting properties of the underlying model. The dissertation also includes the most comprehensive contemporary survey of statistical machine translation.
- Hierarchical Phrase-Based Translation with Suffix Arrays. In Proceedings of EMNLP-CoNLL, pages 976–985, June 2007. [slides]
A major engineering challenge in statistical machine translation systems is the efficient representation of extremely large translation rulesets. In phrase-based models, this problem can be addressed by storing the training data in memory and using a suffix array as an efficient index to quickly look up and extract rules on the fly. Hierarchical phrase-based translation introduces the added wrinkle of source phrases with gaps. Lookup algorithms used for contiguous phrases no longer apply and the best approximate pattern matching algorithms are much too slow, taking several minutes per sentence. We describe new lookup algorithms for hierarchical phrase-based translation that reduce the empirical computation time by nearly two orders of magnitude, making on-the-fly lookup feasible for source phrases with gaps.
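For the contiguous-phrase case that this work builds on, here is a minimal sketch of suffix-array lookup (toy corpus, deliberately naive construction): sort the suffix start positions once, then find every occurrence of a query phrase by binary search over that ordering. The gapped-phrase algorithms that are the paper's contribution require considerably more machinery than this.

    from bisect import bisect_left, bisect_right

    corpus = "we translate text and we translate speech".split()
    # Suffix array: start positions sorted by the suffix beginning there.
    sa = sorted(range(len(corpus)), key=lambda i: corpus[i:])

    def occurrences(phrase):
        # Prefixes of the sorted suffixes are themselves sorted, so the
        # occurrences of a phrase form a contiguous block of the array.
        keys = [corpus[i:i + len(phrase)] for i in sa]
        return sorted(sa[bisect_left(keys, phrase):bisect_right(keys, phrase)])

    print(occurrences(["we", "translate"]))  # [0, 4]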
- Word-Based Alignment, Phrase-Based Translation: What's the Link? With Philip Resnik. In Proceedings of AMTA, pages 90–99, August 2006. [slides]
State-of-the-art statistical machine translation is based on alignments between phrases—sequences of words in the source and target sentences. The learning step in these systems often relies on alignments between words. It is often assumed that the quality of this word alignment is critical for translation. However, recent results suggest that the relationship between alignment quality and translation quality is weaker than previously thought. We investigate this question directly, comparing the impact of high-quality alignments with a carefully constructed set of degraded alignments. In order to tease apart various interactions, we report experiments investigating the impact of alignments on different aspects of the system. Our results confirm a weak correlation, but they also illustrate that more data and better feature engineering may be more beneficial than better alignment.
- The Hiero Machine Translation System: Extensions, Evaluation, and Analysis. David Chiang, Adam Lopez, Nitin Madnani, Christof Monz, Philip Resnik, and Michael Subotin. In Proceedings of HLT/EMNLP, pages 779–786, October 2005. [slides]
Hierarchical organization is a well known property of language, and yet the notion of hierarchical structure has been largely absent from the best performing machine translation systems in recent community-wide evaluations. In this paper, we discuss a new hierarchical phrase-based statistical machine translation system (Chiang, 2005), presenting recent extensions to the original proposal, new evaluation results in a community-wide evaluation, and a novel technique for fine-grained comparative analysis of MT systems.
- Pattern Visualization for Machine Translation Output. With Philip Resnik. In Proceedings of HLT/EMNLP Demonstrations, pages 12–13, October 2005. [slides]
We describe a method for identifying systematic patterns in translation data using part-of-speech tag sequences. We incorporate this analysis into a diagnostic tool intended for developers of machine translation systems, and demonstrate how our application can be used by developers to explore patterns in machine translation output.
- Improved HMM Alignment Models for Languages with Scarce Resources. With Philip Resnik. In Proceedings of the ACL 2005 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, pages 83–86, June 2005. [slides] [code]
We introduce improvements to statistical word alignment based on the Hidden Markov Model. One improvement incorporates syntactic knowledge. Results on the workshop data show that alignment performance exceeds that of a state-of-the-art system based on more complex models, resulting in over a 5.5% absolute reduction in error on Romanian-English.
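As background, here is a minimal Viterbi sketch of the underlying HMM alignment model, with hidden states ranging over source positions, target words as observations, and transitions that favor small jumps. All probabilities are invented, and the syntactic knowledge that is this paper's contribution is not shown.

    import numpy as np

    src, tgt = ["the", "house"], ["la", "maison"]
    # t[f][e]: probability of target word f given source word e (invented).
    t = {"la": {"the": 0.9, "house": 0.1}, "maison": {"the": 0.1, "house": 0.9}}

    def jump(i, j, n):
        # Transition probability between source positions, favoring small jumps.
        return np.exp(-abs(j - i)) / sum(np.exp(-abs(k - i)) for k in range(n))

    n = len(src)
    best = np.array([t[tgt[0]][src[i]] / n for i in range(n)])  # uniform start
    back = []
    for f in tgt[1:]:
        scores = np.array([[best[i] * jump(i, j, n) * t[f][src[j]]
                            for i in range(n)] for j in range(n)])
        back.append(scores.argmax(axis=1))
        best = scores.max(axis=1)
    state = int(best.argmax())
    alignment = [state]
    for ptrs in reversed(back):
        state = int(ptrs[state])
        alignment.append(state)
    print(list(zip(tgt, (src[i] for i in reversed(alignment)))))
    # [('la', 'the'), ('maison', 'house')]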
- Word-Level Alignment for Multilingual Resource Acquisition. With Michael Nossal, Rebecca Hwa, and Philip Resnik. In Proceedings of the LREC Workshop on Linguistic Knowledge Acquisition and Representation—Bootstrapping Annotated Language Data, pages 34–42, June 2002. [slides]
We present a simple, one-pass word alignment algorithm for parallel text. Our algorithm utilizes synchronous parsing and takes advantage of existing syntactic annotations. In our experiments the performance of this model is comparable to more complicated iterative methods. We discuss the challenges and potential benefits of using this model to train syntactic parsers for new languages.
Talks and Tutorials
My philosophy is that slides are visual aids for talks, and I make no representation that they stand on their own. If you still find them useful, you're free to do what you like with them. I'd appreciate an acknowledgement if you use them in your work.
- Statistical Machine Translation. One-week course at NASSLLI (and previously at ESSLLI 2010), June 2012. I've given many introductory lectures and short courses on statistical MT over the past several years, usually aimed at students with no prior knowledge of the area, but hopefully fun for everyone.
- Integrated Parsing and Tagging. Talk at IBM Research, April 2012.
- Semiring Parsing without Parsing. Talk at Cambridge and Oxford Universities, November 2009.
- Translation Model Search Spaces. Talk at Saarland University, July 2009. Also given at Dublin City University.
- Translation by Pattern Matching. Talk at the Second Machine Translation Marathon, May 2008. I've given various versions of this talk at Amsterdam, Carnegie Mellon, Edinburgh, Microsoft Research, MITRE, and Pittsburgh.
- Syntax-based Machine Translation. Tutorial at the Second Machine Translation Marathon, May 2008.
- Inside the Hiero Decoder. Tutorial given at the University of Maryland, September 2006.