Department of Computer Science, Johns Hopkins University

April 5, 2010 - Zhifei Li

Title: Novel Inference, Training, and Decoding Methods over Translation Forests

Abstract:
A hypergraph or “packed forest” is a compact data structure that can represent exponentially many trees generated (for a given input sentence) by a tree-based machine translation system.  This talk presents novel inference, training, and decoding methods that work with hypergraphs.
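To make the "packed forest" idea concrete, here is a minimal sketch of a hypergraph data structure (our own illustration, not the talk's system): each node packs alternative derivations of the same item, each hyperedge points to its antecedent (tail) nodes, and shared subtrees are stored once, which is how exponentially many trees fit in polynomial space.

```python
from dataclasses import dataclass, field

@dataclass
class Hyperedge:
    head: "Node"
    tails: list          # antecedent nodes (empty for terminal edges)
    weight: float = 0.0  # e.g., a negative log-probability

@dataclass
class Node:
    label: str
    incoming: list = field(default_factory=list)  # alternative hyperedges

def num_trees(node):
    """Count the trees packed under a node: a sum over the node's
    alternative hyperedges of the product over each edge's tails.
    (Memoization would be added for large forests.)"""
    if not node.incoming:
        return 1
    total = 0
    for edge in node.incoming:
        product = 1
        for tail in edge.tails:
            product *= num_trees(tail)
        total += product
    return total
```

For example, a node with two alternative incoming edges packs two trees; a parent whose single edge reuses that node twice as a tail packs four, without storing any tree twice.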

The hypergraph is weighted, with the weights (often negative log-probabilities) learned by a discriminative training method. One common drawback of such methods is that they rely on the existence of high-quality supervised data (i.e., bilingual data), which may be expensive to obtain. We present two unsupervised discriminative training methods, minimum imputed-risk training and contrastive language model estimation, both of which exploit monolingual English data to perform discriminative training.

Decoding in this context refers to finding the translation with maximum posterior probability (i.e., MAP decoding) in the hypergraph. However, this is intractable due to spurious ambiguity, a situation where the probability of a translation string is split among many distinct derivations (e.g., trees or segmentations).  We present a novel technique, variational decoding, which uses variational inference to effectively approximate the intractable MAP decoding.
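The effect of spurious ambiguity can be seen in a toy example (hypothetical strings and probabilities, chosen only to illustrate the point): the single best derivation can favor one string while the total probability, summed over derivations, favors another. In a real hypergraph that sum ranges over exponentially many derivations, which is what makes exact MAP decoding intractable and motivates the variational approximation.

```python
from collections import defaultdict

# (translation string, derivation probability) for each distinct derivation;
# the second string's mass is split across three derivations.
derivations = [
    ("the cat sleeps", 0.20),
    ("a cat sleeps",   0.15),
    ("a cat sleeps",   0.15),
    ("a cat sleeps",   0.15),
]

# Viterbi decoding: pick the string of the single best derivation.
viterbi_string = max(derivations, key=lambda d: d[1])[0]

# MAP decoding over strings: sum probability over derivations per string.
string_prob = defaultdict(float)
for string, prob in derivations:
    string_prob[string] += prob
map_string = max(string_prob, key=string_prob.get)
```

Here `viterbi_string` is "the cat sleeps" (best single derivation, 0.20) while `map_string` is "a cat sleeps" (total mass 0.45), so the two decoders disagree.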

Finally, many atomic inference operations on hypergraphs, such as finding the one-best or k-best trees or computing expectations, have wider applicability beyond machine translation.  A general framework for achieving these operations is semiring-weighted logic programming.  Within this framework, we first extend the expectation semiring, originally proposed for a finite state automaton, to a hypergraph.  We then propose a novel second-order expectation semiring. These semirings can be used to efficiently compute a variety of frequently used expectations (e.g., entropy and its gradient) over the exponentially many trees in a hypergraph.
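The appeal of the semiring view is that one traversal serves many queries. Below is a sketch of an inside computation parameterized by a semiring, over a toy hypergraph encoded as a plain dictionary (names and encoding are our own, not the talk's code): with sum-product it yields the total weight of all trees, and with max-product the Viterbi score. The talk's expectation semirings fit the same interface, with pair-valued weights in place of floats.

```python
import operator

def inside(graph, node, plus, times, one, memo=None):
    """Inside value of `node` under a semiring (plus, times, one).
    `graph` maps a node to its list of (weight, tail_nodes) hyperedges;
    nodes absent from `graph` are leaves with inside value `one`."""
    if memo is None:
        memo = {}
    if node in memo:
        return memo[node]
    edges = graph.get(node, [])
    if not edges:
        value = one
    else:
        value = None
        for weight, tails in edges:
            term = weight
            for tail in tails:
                term = times(term, inside(graph, tail, plus, times, one, memo))
            value = term if value is None else plus(value, term)
    memo[node] = value
    return value

# Toy forest: S rewrites to A (weight 0.6) or B (weight 0.4);
# A has a terminal edge of weight 0.5, B of weight 1.0.
graph = {
    "S": [(0.6, ["A"]), (0.4, ["B"])],
    "A": [(0.5, [])],
    "B": [(1.0, [])],
}

total   = inside(graph, "S", operator.add, operator.mul, 1.0)  # sum-product
viterbi = inside(graph, "S", max,          operator.mul, 1.0)  # max-product
```

Swapping only the `plus` operator changes the computed quantity: `total` is 0.6*0.5 + 0.4*1.0 = 0.7, while `viterbi` is max(0.3, 0.4) = 0.4, with no change to the traversal itself.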

The modeling, decoding, and algorithmic advances are presented in the context of a machine translation task, but we believe that they will also find applications in other areas of natural language processing (e.g., parsing and speech recognition) and, more generally, in pattern recognition.
