Summer 2019

Computer Science Student Defense

June 17, 2019

In this dissertation, we examine applications of neural machine translation to computer-aided translation, building tools for human translators. We present a neural approach to interactive translation prediction (a form of “auto-complete” for human translators) and demonstrate its effectiveness through both simulation studies, where it outperforms a phrase-based statistical machine translation approach, and a user study. We find that about half of the translators in the study are faster using neural interactive translation prediction than they are when post-editing output of the same underlying machine translation system, and most translators express positive reactions to the tool. We analyze challenges that neural machine translation systems face, particularly with respect to novel words and consistency, and experiment with methods of improving translation quality at a fine-grained level to address those challenges. Finally, we bring these two areas – interactive and adaptive neural machine translation – together in a simulation that shows that their combination has a positive impact on novel word translation and other metrics.

Speaker Biography: Rebecca Knowles is a PhD candidate in the Center for Language and Speech Processing and the Computer Science Department at Johns Hopkins University, where she is advised by Philipp Koehn. She received an NSF Graduate Research Fellowship in 2013. Her research focuses on machine translation and computer-aided translation (building tools for human translators). She received her B.S. in mathematics and linguistics from Haverford College.

Computer Science Student Defense

July 22, 2019

Language suggests information about entities and events—real or imagined. We are interested in inferring such information or meaning, such semantics, from text. In this dissertation, we build upon and contribute to a decompositional view of semantic prediction which is inherently (1) structured—multi-dimensional, with correlations and possibly constraints among the possible semantic questions, (2) graded—predicted quantities represent magnitudes or probabilities rather than binary or categorical values, and (3) subjective. Combining these aspects leads to interesting opportunities for modeling and annotation and raises important questions about the impact of these practices. Specifically, we propose the first structured model for the task of Semantic Proto-Role Labeling, casting the structured problem as a multi-label prediction task which we relate empirically to semantic role labeling. We subsequently propose mathematical models of structured ordinal prediction that allow us to incorporate graded annotation and to jointly model multiple annotators. We investigate the decompositional semantic prediction task of Situation Frame Identification (a flavor of topic identification) and propose a graded model for the binary task. Finally, we address issues in efficient scalar annotation.

Speaker Biography: Adam Teichert is a PhD candidate in the Center for Language and Speech Processing and an Assistant Professor of Software Engineering at Snow College in Ephraim, UT. Before coming to Johns Hopkins, he received a B.S. in Computer Science from Brigham Young University and an M.S. in Computing from the University of Utah. His research has explored methods for efficient learning and inference in natural language processing, with a recent focus on structured models and related methods for decompositional semantic labeling and topic identification.

Computer Science Student Defense

August 16, 2019

Computer Science Student Defense

August 30, 2019