Travis Wolfe

travis@cs.jhu.edu

Hackerman 321
Center for Language and Speech Processing
Johns Hopkins University

I'm a PhD student at Johns Hopkins University. I'm affiliated with the Center for Language and Speech Processing (CLSP) and the Human Language Technology Center of Excellence (HLTCOE). I'm advised by Mark Dredze and Benjamin Van Durme. I did my undergraduate work in statistics at Carnegie Mellon University.

My research interests are in statistical learning for understanding language. I have worked on entity-centric (entity linking, cross-document coreference resolution, and author attribute prediction) as well as event-centric (predicate linking) aspects of natural language inference. I'm also interested in topic models and minimally supervised models of text.

I have been a teaching assistant for both of the machine learning courses here at JHU, and I helped run the NLP Reading Group from 2013 to 2016.

I spent Summer 2014 at Google's Machine Intelligence group working with Marius Pasca on unsupervised relation extraction over the web.


Predicate Argument Alignment

Semantic predicates express literal meaning in a discourse. Collections of discourses may discuss a common theme, like a string of news stories about an event, repeating predicates and arguments across multiple documents. In this line of work we align the predicate-argument structures that appear in a pair of documents. Detecting these links is useful for a variety of tasks, such as summarization (removing duplicate content) and question answering (aligning the question to its answer).

In my ACL 2013 paper, we show how to use a wide range of lexical resources as features for a max-margin discriminative model for predicate argument alignment.
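Here is a rough sketch of what such a model looks like, not the PARMA implementation: candidate mention pairs are scored by a linear model over lexical-resource features. The feature names and the wordNetSim stub are placeholders of mine.

    // Sketch only: score a candidate predicate alignment with a linear model
    // whose weights would be learned with a max-margin objective (updates not shown).
    case class Mention(headword: String, lemma: String)

    object AlignmentScorer {
      // Hypothetical lexical-resource features for a candidate pair.
      def features(p1: Mention, p2: Mention): Map[String, Double] = Map(
        "same-lemma"   -> (if (p1.lemma == p2.lemma) 1.0 else 0.0),
        "wordnet-sim"  -> wordNetSim(p1.lemma, p2.lemma),
        "string-match" -> (if (p1.headword.equalsIgnoreCase(p2.headword)) 1.0 else 0.0)
      )

      // Linear score: dot product of learned weights with the feature vector.
      def score(weights: Map[String, Double], p1: Mention, p2: Mention): Double =
        features(p1, p2).map { case (name, value) => weights.getOrElse(name, 0.0) * value }.sum

      // Stand-in for a real lexical-resource similarity lookup.
      def wordNetSim(a: String, b: String): Double = if (a == b) 1.0 else 0.0
    }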

There are also regularities among the predicates and arguments within a given discourse which should be preserved by an alignment, and we exploit these regularities in the model proposed in my NAACL 2015 paper. For example, an alignment should not reverse the temporal ordering of events across the two documents.
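For illustration, here is a minimal sketch of one such coherence signal: counting aligned pairs whose order disagrees across the two documents. Using document position as a proxy for temporal order is an assumption of the sketch, not the paper's exact formulation.

    object CoherenceSketch {
      // srcIndex / tgtIndex: position of each aligned predicate in its document,
      // used here as a stand-in for temporal order.
      case class Aligned(srcIndex: Int, tgtIndex: Int)

      // Count "crossing" pairs: aligned predicates whose relative order disagrees
      // across the two documents. Such a count can enter the model as a negatively
      // weighted global feature, so joint inference prefers coherent alignments.
      def orderViolations(alignment: Seq[Aligned]): Int =
        alignment.combinations(2).count {
          case Seq(a, b) => (a.srcIndex < b.srcIndex) != (a.tgtIndex < b.tgtIndex)
        }
    }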


Semantic Role Labeling and Frame Parsing

Semantic Role Labeling (SRL) is the task of finding and labeling all of the semantic arguments of a given predicate. Each argument is assigned a role, which helps answer the "who", "what", "when", "where", and "why" questions one might ask about a given situation.
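A toy example with PropBank-style labels (not the output of any particular system):

    object SrlToyExample {
      // For "Mary sold the book to John yesterday.", the predicate "sold"
      // (PropBank sell.01) gets these labeled arguments:
      val predicate = "sold"
      val arguments = Map(
        "ARG0"     -> "Mary",        // who: the seller
        "ARG1"     -> "the book",    // what: the thing sold
        "ARG2"     -> "John",        // to whom: the buyer
        "ARGM-TMP" -> "yesterday"    // when
      )
    }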

My most recent work on SRL has focused on training models that combine global features (expressive) with greedy inference (fast). See our poster for more information.


Transition system used for SRL.

In that work we studied two imitation/reinforcement learning algorithms, Violation Fixing Perceptron (VFP) and Locally Optimal Learning to Search (LOLS), as applied to our SRL models with global features. We discuss some failure modes that each algorithm can fall into and explain how to fix them. Our final model is more accurate without sacrificing speed.
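Here is a minimal sketch of greedy transition-based decoding with global features. State, Action, and the single stand-in feature are hypothetical placeholders of mine, not the fnparse transition system.

    object GreedySrlSketch {
      trait Action
      // A decoder state: actions committed so far plus the actions still available.
      case class State(committed: List[Action], remaining: List[Action]) {
        def isFinal: Boolean = remaining.isEmpty
        def candidates: Seq[Action] = remaining
        def step(a: Action): State = State(a :: committed, remaining.filterNot(_ == a))
      }

      // Global features may inspect everything committed so far; this is what
      // rules out exact dynamic-programming inference and motivates greedy search.
      def globalFeatures(s: State, a: Action): Map[String, Double] =
        Map("num-committed" -> s.committed.size.toDouble)  // stand-in feature

      def score(w: Map[String, Double], s: State, a: Action): Double =
        globalFeatures(s, a).map { case (k, v) => w.getOrElse(k, 0.0) * v }.sum

      // Greedy decoding: repeatedly take the highest-scoring next action.
      def greedyDecode(w: Map[String, Double], init: State): State = {
        var s = init
        while (!s.isFinal) {
          val best = s.candidates.maxBy(a => score(w, s, a))
          s = s.step(best)
        }
        s
      }
    }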

In another work we proposed methods for automatic feature generation for SRL. Feature selection has fallen out of favor since the resurgence of neural methods, but a fundamental question in any machine learning problem is what to feed the model as input (the bias vs. variance trade-off). Features are a flexible, expressive, and interpretable way to specify model structure. The novel part of this work is feature generation: building feature functions up from featlets, which are a richer class of function(al)s than feature templates.
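A rough sketch of the featlet idea under my own simplifying assumptions (all names here are illustrative, not the paper's inventory of featlets):

    object FeatletSketch {
      // An extraction context: a sentence, its POS tags, a target (predicate)
      // index, and a candidate argument head index.
      case class Ctx(tokens: Vector[String], pos: Vector[String], target: Int, argHead: Int)

      // A featlet: a small function of the context producing a piece of a feature string.
      type Featlet = Ctx => Option[String]

      val headWord: Featlet = c => c.tokens.lift(c.argHead)
      val headPos:  Featlet = c => c.pos.lift(c.argHead)
      val leftWord: Featlet = c => c.tokens.lift(c.argHead - 1)

      // Compose two featlets into a conjunction, yielding a new feature function.
      def conj(a: Featlet, b: Featlet): Featlet =
        c => for (x <- a(c); y <- b(c)) yield s"$x&$y"

      // Feature generation then amounts to searching over such compositions.
      val generated: Seq[Featlet] =
        Seq(headWord, headPos, conj(headWord, headPos), conj(leftWord, headPos))
    }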

Graphs representing roll-outs used for training by VFP (top) and LOLS (bottom).

Earlier, I worked on the problem of extending the coverage of FrameNet in my ACL 2015 paper. FrameNet is an excellent resource, unique in its schematic structure, which includes frame-frame relations and a universal role set. However, the mapping between trigger words and frames lacks coverage, so in this work we use paraphrasing techniques to enlarge the lexicon and validate the new entries with Amazon Mechanical Turk annotators.
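A schematic sketch of that pipeline, with paraphrasesOf and acceptedByAnnotators standing in for the paraphrase lookups and Mechanical Turk judgments (both placeholders of mine):

    object FrameNetPlusSketch {
      // Expand a trigger -> frames lexicon by paraphrasing existing triggers and
      // keeping only the human-validated proposals.
      def expandLexicon(
          lexicon: Map[String, Set[String]],
          paraphrasesOf: String => Seq[String],
          acceptedByAnnotators: (String, String) => Boolean
      ): Map[String, Set[String]] = {
        val proposals = for {
          (trigger, frames) <- lexicon.toSeq
          frame             <- frames
          candidate         <- paraphrasesOf(trigger)
          if acceptedByAnnotators(candidate, frame)
        } yield candidate -> frame
        // Merge the validated proposals into the original lexicon.
        proposals.foldLeft(lexicon) { case (lex, (t, f)) =>
          lex.updated(t, lex.getOrElse(t, Set.empty[String]) + f)
        }
      }
    }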


Publications

My Google Scholar Page

Pocket Knowledge Base Population. Travis Wolfe, Mark Dredze, and Benjamin Van Durme. ACL 2017. [pdf] [bibtex]

A Study of Imitation Learning Methods for Semantic Role Labeling. Travis Wolfe, Mark Dredze, and Benjamin Van Durme. EMNLP 2016 Workshop on Structured Prediction for NLP. [pdf] [poster]

Feature Generation for Robust Semantic Role Labeling. Travis Wolfe, Mark Dredze, and Benjamin Van Durme. arXiv 2016. [pdf]

FrameNet+: Fast Paraphrastic Tripling of FrameNet. Ellie Pavlick, Travis Wolfe, Pushpendre Rastogi, Chris Callison-Burch, Mark Dredze, and Benjamin Van Durme. ACL 2015. [pdf]

Interactive Knowledge Base Population. Travis Wolfe, Mark Dredze, James Mayfield, Paul McNamee, Craig Harman, Tim Finin, and Benjamin Van Durme. arXiv 2015. [pdf]

Predicate Argument Alignment using a Global Coherence Model. Travis Wolfe, Mark Dredze, and Benjamin Van Durme. NAACL 2015. [pdf] [bibtex] [code]

Concretely Annotated Corpora. Francis Ferraro, Max Thomas, Matthew R. Gormley, Travis Wolfe, Craig Harman, and Benjamin Van Durme. AKBC Workshop at NIPS 2014. [pdf] [bibtex]

PARMA: A Predicate Argument Aligner. Travis Wolfe, Benjamin Van Durme, Mark Dredze, Nicholas Andrews, Charley Beller, Chris Callison-Burch, Jay DeYoung, Justin Snyder, Jonathan Weese, Tan Xu, and Xuchen Yao. ACL 2013. [pdf] [bibtex] [code]

Topic Models and Metadata for Visualizing Text Corpora. Justin Snyder, Rebecca Knowles, Mark Dredze, Matthew R. Gormley, and Travis Wolfe. NAACL 2013. [pdf]

Cross-Document Coreference Resolution and Entity Linking using a Dirichlet Process. Travis Wolfe, Nicholas Andrews, Matt Gormley, and Mark Dredze. MASC 2012. [poster]

News Personalization using Support Vector Machines. Anatole Gershman, Travis Wolfe, Eugene Fink and Jaime Carbonell. Enriching Information Retrieval Workshop at ACM SIGIR 2011. [pdf] [bibtex]


Things I Hack On

parma - a sweet open-source Scala library for predicate-argument alignment.

fnparse - a FrameNet and Propbank frame parser and SRL system with global features trained with imitation/reinforcement learning.

A high-performance NLP utility library called tutils for Java. You might be interested in this if you 1) need to read Concrete data, 2) think all Java NLP libraries are slow and/or use too much memory, or 3) can't use GPL software.

Concrete and other cool stuff that is going on at the HLTCOE


updated March 19, 2015