Zhifei Li

Ph.D. Student

Department of Computer Science

Natural Language Processing Lab

Center for Language and Speech Processing

The Johns Hopkins University

3400 N. Charles Street, Baltimore, MD

Office: CSEB 322

Email: zhifei.work@gmail.com

 

 
News:
I finished my studies at Hopkins in May 2010 and am currently working as a research scientist on the translation team at Google Research.
Research Interests
Natural Language Processing (current focus)

I am interested in applying machine learning methods (mainly sequence and structured models) to large-scale NLP applications such as Machine Translation and Language Modeling. Because both the training data and the model search space are large, compact data structures (for model representation) and efficient algorithms (for model inference and learning) are essential. Also, because of the complex dependencies among the model parameters, a discriminative feature-based framework is more flexible than a generative model. Currently, I am working on large-scale discriminative training for machine translation systems.

Networking and Distributed Systems

I am also interested in system-oriented research on computer networks. Before coming to JHU, I did some work on wireless networks. In particular, during 1999-2002, I worked at MobileSoft on several commercial projects related to WAP and Bluetooth. During 2002-2004 at NTU, I worked on Medium Access Control issues in IEEE 802.11-based wireless networks. Here are some of my publications.

Applied Algorithms

I was very interested in applying algorithmic/theoretical methods to some fascinating applications. I found the following theoretical areas of great interest: Distributed Algorithms, Game Theory, Graph Algorithms, and Online Algorithms. The application problems that I was interested in are Wireless Networks, P2P, Web Search Ranking, and Reputation Systems.

 
Talk: Training and Inference Methods over Translation Forests.
Slides of the talk I gave during my job search and at CWMT-09.
Publications (on Natural Language Processing):
Machine Translation:

Zhifei Li, Ziyuan Wang, Sanjeev Khudanpur, and Jason Eisner. Unsupervised Discriminative Language Model Training for Machine Translation using Simulated Confusion Sets. In Proceedings of COLING 2010.

Zhifei Li and Jason Eisner. First- and Second-order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests. In Proceedings of EMNLP 2009.
Slides: animated, non-animated

Zhifei Li, Jason Eisner and Sanjeev Khudanpur. Variational Decoding for Statistical Machine Translation. In Proceedings of ACL 2009. Nominated for Best Paper Award.
Slides: animated, non-animated

Zhifei Li and Sanjeev Khudanpur. Efficient Extraction of Oracle-best Translations from Hypergraphs. In Proceedings of NAACL 2009 (short paper).
Slides: animated

Zhifei Li and Sanjeev Khudanpur. Forest Reranking for Machine Translation with the Perceptron Algorithm. To appear in the GALE book chapter on "MT from text", 2009.

Jason Smith, Damianos Karakos, Zhifei Li, Jason Eisner and Sanjeev Khudanpur. Novel System Combination Approaches for MT. To appear in the GALE book chapter on "MT from text", 2009.

Zhifei Li and Sanjeev Khudanpur. Large-scale Discriminative n-gram Language Models for Statistical Machine Translation. In Proceedings of AMTA 2008.
Slides: animated, non-animated

Zhifei Li and Sanjeev Khudanpur. A Scalable Decoder for Parsing-based Machine Translation with Equivalent Language Model State Maintenance. In Proceedings of ACL SSST 2008.
Slides: PowerPoint

Zhifei Li and David Yarowsky. Unsupervised Translation Induction for Chinese Abbreviations using Monolingual Corpora. In Proceedings of ACL 2008.
Information Extraction:
Zhifei Li and David Yarowsky. Mining and Modeling Relations between Formal and Informal Chinese Phrases from Web Corpora. In Proceedings of EMNLP 2008.
Training and test data: download
Slides: animated
Spoken Dialogue Management:
Zhifei Li, Patrick Nguyen, and Geoffrey Zweig. Optimal Dialog in Consumer-Rating Systems using a POMDP Framework. In Proceedings of ACL SIGdial 2008.
Joshua Decoder Related:
Software: download

Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren Thornton, Jonathan Weese and Omar Zaidan, 2009.
Demonstration of Joshua: An Open Source Toolkit for Parsing-based Machine Translation. To appear in ACL 2009 (demo).

Zhifei Li, Chris Callison-Burch, Chris Dyer, Juri Ganitkevitch, Sanjeev Khudanpur, Lane Schwartz, Wren Thornton, Jonathan Weese and Omar Zaidan, 2009.
Joshua: An Open Source Toolkit for Parsing-based Machine Translation. In Proceedings of the Workshop on Statistical Machine Translation (WMT09).

 
Publications (on wireless networking)
 
Teaching (TA)
Introduction to Algorithms (Fall of 2005, Prof. Baruch Awerbuch)
Computer Networks Fundamentals (Spring of 2005 and 2006, Prof. Gerald M. Masson)
Computer Organization Fundamentals (Fall of 2004, Prof. Gerald M. Masson)

 


Revised: 01/29/06.