Juri Ganitkevitch
PhD Student
Center for Language and Speech Processing
Johns Hopkins University
About Me
I'm a second-year PhD student in the Computer Science Department at Johns Hopkins University. More precisely, I work at the Center for Language and Speech Processing.
My advisor is Chris Callison-Burch. I also frequently consult with Ben Van Durme, Alexandre Klementiev and Adam Lopez.
My main research interest is in statistical machine translation, particularly in automatic (and possibly non-parametric) induction and training of grammars with non-trivial non-terminals. I'm also curious about efficient processing of vast amounts of data, particularly randomized and approximate algorithms, probabilistic data structures and online methods. I'm convinced that semi-supervised learning is a pretty good idea.
I hold a Master's degree in Computer Science from JHU as well as a Diplom (equivalent to a Master's) in Computer Science from RWTH Aachen University. My Diplom thesis project, as well as some prior research assistant work, was done at the Human Language Technology and Pattern Recognition Group, advised by Sasa Hasan and Hermann Ney.
I also spent a year as a visiting Master's student at the ENST/Télécom Paris and took a year off school to work with the Voice Technology Group at IBM Germany Research & Development as a full-time intern. While working on my Diplom thesis I held a part-time software engineering position with Nuance Communications in Aachen, Germany.
I am currently a returning intern (Summers 2010 and 2011) with the Google Translate team in Mountain View, where I work with Ashish Venugopal.
My legal name, as per my passport, is Jurij Ganitkevic. It's the result of an unfortunate transliteration accident and I much prefer the old spelling of my name that you see above. I continue to use it in publications, and generally wherever I can get away with it.
Projects
- I'm involved in the Joshua decoder, an open-source statistical machine translation system developed at JHU and written in Java. We're trying to make it easily accessible. Have a go.
- I also do some work on the cdec decoder, another open-source statistical machine translation system. This one is written by Chris Dyer at UMD College Park.
- Another neat project I'm involved in is Jonny Weese's Thrax, a MapReduce grammar extractor for SCFGs (it does both Hiero and SAMT). It's open-source as well, so come and lend a hand.
- Feel free to check on my most recent misadventures on GitHub.
Publications
- Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation
J. Ganitkevitch, C. Callison-Burch, C. Napoles, and B. Van Durme
In Proceedings of EMNLP; Edinburgh, United Kingdom, July 2011.
- Watermarking the Outputs of Structured Prediction with an Application in Statistical Machine Translation
A. Venugopal, J. Uszkoreit, D. Talbot, F. Och, and J. Ganitkevitch
In Proceedings of EMNLP; Edinburgh, United Kingdom, July 2011.
- Paraphrastic Sentence Compression with a Character-based Metric: Tightening without Deletion
C. Napoles, C. Callison-Burch, J. Ganitkevitch, and B. Van Durme
In Proceedings of Workshop on Monolingual Text-To-Text Generation; Portland, USA, June 2011.
- Joshua 3.0: Syntax-based Machine Translation with the Thrax Grammar Extractor
J. Weese, J. Ganitkevitch, C. Callison-Burch, M. Post, and A. Lopez
In Proceedings of the Sixth Workshop on Statistical Machine Translation; Edinburgh, United Kingdom, July 2011.
- cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models
C. Dyer, A. Lopez, J. Ganitkevitch, J. Weese, F. Ture, P. Blunsom, H. Setiawan, V. Eidelman, and P. Resnik
In Proceedings of ACL, Software Demonstrations; Uppsala, Sweden, July 2010.
- Joshua 2.0: A Toolkit for Parsing-Based Machine Translation with Syntax, Semirings, Discriminative Training and Other Goodies
Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, A. Irvine, L. Schwartz, W. Thornton, Z. Wang, J. Weese, and O. Zaidan
In Proceedings of the Fifth Workshop on Statistical Machine Translation; Uppsala, Sweden, July 2010.
- An Enriched MT Grammar for Under $100
O. Zaidan and J. Ganitkevitch
In Proceedings of the Workshop on Creating Speech and Language Data With Amazon's Mechanical Turk; Los Angeles, USA, June 2010.
- Demonstration of Joshua: An Open Source Toolkit for Parsing-Based Machine Translation
Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, S. Khudanpur, L. Schwartz, W. Thornton, J. Weese, and O. Zaidan
In Proceedings of ACL/IJCNLP, Software Demonstrations; Suntec, Singapore, August 2009.
- Joshua: An Open Source Toolkit for Parsing-Based Machine Translation
Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, S. Khudanpur, L. Schwartz, W. Thornton, J. Weese, and O. Zaidan
In Proceedings of the Fourth Workshop on Statistical Machine Translation; Athens, Greece, March 2009.
- Triplet Lexicon Models for Statistical Machine Translation
S. Hasan, J. Ganitkevitch, H. Ney, and J. Andrés-Ferrer
In Proceedings of EMNLP; Honolulu, Hawaii, October 2008.
Contact
My email address is juri at CS dot JHU dot edu. You can also follow my rather unprofessional musings on Twitter.