A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems


Omar F. Zaidan

Johns Hopkins University


Department of Computer Science


The Center for Language and Speech Processing


Latest Release: February 14th, 2011 (v1.50).

Latest Webpage Update: February 14th, 2011.


1. Overview

Z-MERT is a software tool for minimum error rate training of machine translation systems (Och, 2003). It is:

  • open source, extremely easy to run, and platform-independent.
  • fully modular regarding the evaluation metric, easily supporting any new evaluation metric (demo video!) that has decomposable sufficient statistics.
  • fully modular regarding the decoder, requiring no modification to function with any decoder.
  • fully configurable, allowing the user to specify any subset of its 20-some parameters.
  • highly optimized, and demonstrably time- and space-efficient.
  • bug-free :-)


2. Description

State-of-the-art machine translation (MT) systems rely on several models to evaluate the "goodness" of a given candidate translation in the target language. Each model corresponds to a feature that is a function of a <candidate translation, foreign sentence> pair. When the features are combined in a log-linear model, each feature must be assigned a weight. Och (2003) provides empirical evidence that those weights should be set with the evaluation metric by which the MT system will eventually be judged in mind (i.e. they should maximize performance on a development set, as measured by that evaluation metric). The other insight of Och's work is that there exists an efficient algorithm to find such weights. This process is known as the MERT phase, for Minimum Error Rate Training.
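
To make the role of the weights concrete, here is a minimal sketch of how a weight vector picks a translation out of a list of candidates: each candidate gets a score equal to the weighted sum of its feature values, and the highest-scoring candidate wins. The class name, method names, and all numbers are made up for illustration; this is not code from Z-MERT or from any decoder.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Z-MERT's code base.
final class LogLinearRerank {

    // Model score of one candidate: the dot product of weights and feature values.
    static double score(double[] weights, double[] features) {
        if (weights.length != features.length) {
            throw new IllegalArgumentException("weights and features must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * features[i];
        }
        return sum;
    }

    // Index of the highest-scoring candidate (the earliest candidate wins ties).
    static int best(double[] weights, List<double[]> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("need at least one candidate");
        }
        int bestIndex = 0;
        double bestScore = score(weights, candidates.get(0));
        for (int i = 1; i < candidates.size(); i++) {
            double s = score(weights, candidates.get(i));
            if (s > bestScore) {
                bestScore = s;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        // Two made-up candidates, each with two made-up feature values.
        List<double[]> candidates = Arrays.asList(
                new double[] {-2.0, -5.0},
                new double[] {-3.0, -1.0});
        // Different weights select different candidates.
        System.out.println(best(new double[] {1.0, 0.1}, candidates));
        System.out.println(best(new double[] {0.1, 1.0}, candidates));
    }
}
```

The two calls in main select different candidates from the same list, which is the whole point: changing the weights changes the system's output, so the weights can be tuned against whatever metric scores that output.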


The existence of a MERT module that can be integrated with minimal effort with an existing MT system would be beneficial for the research community. For maximum benefit, this tool should be easy to set up and use and should have a demonstrably efficient implementation. Z-MERT (Zaidan, 2009) is a tool developed with these goals in mind. Great care has been taken to ensure that Z-MERT can be used with any MT system without modification to the code, and without the need for an elaborate web of scripts, which is a situation that unfortunately exists in practice in current training pipelines.


3. Why You Should Use Z-MERT

  • Z-MERT is completely modular regarding the decoder.
  • Z-MERT supports any evaluation metric (with decomposable sufficient statistics), and a minimal amount of code is needed to implement any new evaluation metric.
  • Z-MERT is written in Java, making it usable by virtually everybody, since the Java interpreter is freely available (and for all common platforms).
  • Z-MERT is highly optimized, making it both time- and space-efficient, and orders of magnitude faster than implementations written in interpreted languages, and apparently even faster than Moses' C++ MERT implementation.
  • Z-MERT requires no monitoring from the user, and launches the decoder and processes its output automatically.
  • Z-MERT is fully configurable, allowing the user to specify any subset of 20-some MERT parameters.
  • Z-MERT is fully documented, complete with usage instructions as well as a tutorial.
  • Z-MERT produces human-readable, (optionally) verbose, and useful output.
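
What does "decomposable sufficient statistics" mean in practice? Roughly: each sentence contributes a small vector of counts, the counts are summed over the corpus, and the final score is computed from the sums alone. The sketch below illustrates the idea with a toy metric (clipped unigram precision). It is a standalone example with hypothetical names, and it does not use Z-MERT's actual metric interface; see the documentation in the distribution for that.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT Z-MERT's metric interface.
final class UnigramPrecisionSketch {

    // Splits on whitespace; an empty or blank string yields no tokens.
    static String[] tokens(String s) {
        String t = s.trim();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    // Per-sentence sufficient statistics: {clipped unigram matches, candidate length}.
    static int[] suffStats(String candidate, String reference) {
        Map<String, Integer> refCounts = new HashMap<>();
        for (String w : tokens(reference)) {
            refCounts.merge(w, 1, Integer::sum);
        }
        String[] cand = tokens(candidate);
        int matches = 0;
        for (String w : cand) {
            int left = refCounts.getOrDefault(w, 0);
            if (left > 0) {
                // Each reference word can be matched at most once (clipping).
                matches++;
                refCounts.put(w, left - 1);
            }
        }
        return new int[] {matches, cand.length};
    }

    // Corpus-level score computed from the summed statistics only.
    static double corpusScore(int[][] perSentence) {
        int matches = 0;
        int length = 0;
        for (int[] stats : perSentence) {
            matches += stats[0];
            length += stats[1];
        }
        return length == 0 ? 0.0 : (double) matches / length;
    }

    public static void main(String[] args) {
        int[][] stats = {
            suffStats("the cat sat", "the cat sat down"),  // {3, 3}
            suffStats("a dog", "the dog barked")           // {1, 2}
        };
        System.out.println(corpusScore(stats));  // 4 matches over 5 words
    }
}
```

Because the corpus score depends only on the summed counts, swapping one sentence's candidate for another only requires subtracting one small vector and adding another, rather than rescoring the whole corpus.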


4. Download, Licensing, and Citation

Z-MERT's source code, instructions, documentation, and a tutorial are all included in the distribution. Unless you have a good reason to do otherwise, you should download the most recent version (v1.50):

  • Z-MERT v1.50, released 2/14/11.
  • Z-MERT v1.41, released 10/28/09.
  • Z-MERT v1.40, released 10/27/09.
  • Z-MERT v1.30, released 5/4/09.
  • Z-MERT v1.20, released 3/10/09.
  • Z-MERT v1.10, released 2/9/09.
  • Z-MERT v1.00, released 1/20/09.


Z-MERT is an open-source tool, licensed under the terms of the GNU Lesser General Public License (LGPL). Therefore, it is free for personal and scientific use by individuals and/or research groups. It may not be modified or redistributed, publicly or privately, unless the licensing terms are observed. If in doubt, contact the author for clarification and/or an explicit permission.


If you use Z-MERT in your work, please cite the Z-MERT paper (Zaidan, 2009).


5. The Mechanics of Z-MERT (Abbreviated Version)

Z-MERT is quite easy to use. There is no need to compile or install any files. You simply edit Z-MERT's configuration files to suit whichever experimental setup you want. Detailed instructions are included in Z-MERT's documentation, but basically Z-MERT expects a configuration file as its main argument, and some limit on how much memory it can use:


            java -cp zmert.jar ZMERT -maxMem 500 ZMERT_config.txt


The -maxMem argument tells Z-MERT not to hold on to memory while the decoder is running, a period during which Z-MERT would be idle anyway. The 500 tells Z-MERT that it may use at most 500 MB when it is active. (Java's -Xmx option will not do that for you. The documentation explains exactly what maxMem is. It's pretty simple really, so don't worry too much about it for now.)


How does Z-MERT interact with the decoder? The configuration file tells Z-MERT how the decoder is launched. Z-MERT uses that information to launch the decoder as an external process to produce translations, and then uses the resulting output file in its parameter tuning. In so doing, Z-MERT treats the decoder as a black box, and knows nothing about its internals.
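
For the curious, the black-box pattern looks roughly like the sketch below: launch a command as an external process, wait for it to finish, then read the output file. This is a simplified illustration using Java's standard ProcessBuilder, not Z-MERT's actual source. The class and method names are hypothetical, and the stand-in "decoder" is just the java launcher printing its version, with its output captured into the file (a real decoder would typically write its own output file).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Z-MERT's actual decoder-launching code.
final class BlackBoxDecoderSketch {

    // Runs the command, captures its stdout and stderr into outputFile,
    // waits for it to exit, and returns the lines of the output file.
    static List<String> runAndRead(List<String> command, Path outputFile)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);            // merge stderr into stdout
        builder.redirectOutput(outputFile.toFile());  // send the merged stream to the file
        Process decoder = builder.start();
        int exitCode = decoder.waitFor();             // block until the process finishes
        if (exitCode != 0) {
            throw new IOException("command exited with status " + exitCode);
        }
        return Files.readAllLines(outputFile);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a decoder: the java launcher of the running JVM.
        String javaBin = Paths.get(System.getProperty("java.home"), "bin", "java").toString();
        Path out = Files.createTempFile("decoder", ".out");
        for (String line : runAndRead(Arrays.asList(javaBin, "-version"), out)) {
            System.out.println(line);
        }
        Files.delete(out);
    }
}
```

Nothing in runAndRead depends on what the command does internally, which is what makes it possible to swap in any decoder by changing only the launch command.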


Hopefully this gives you an idea of just how simple it is to run Z-MERT. The full details (as well as a tutorial) can be found in the distribution, so download it and start MERTing already.


6. FAQ

Q: Why did you develop Z-MERT?

A: Z-MERT is part of a larger effort at JHU to develop Joshua (Li et al., 2009), an open source package that includes the components of a complete MT training pipeline, including a MERT module. We considered existing implementations to use as Joshua's MERT module, and found that they were not suitable for our needs, and did not meet our standards of flexibility and ease of use. So we wrote our own implementation, and Z-MERT was born.


Q: Does that mean I need to have Joshua to use Z-MERT?

A: Not at all. Z-MERT functions completely independently from the decoder (in fact, that is one of its features), and so it does not even know that Joshua exists.


Q: Why is it called Z-MERT?

A: We had used an implementation by David Chiang called C-MERT. So, I interpolated. Also, Z, being the last letter of the alphabet, signifies that this is the last implementation of MERT the world will ever need! Oh, and everybody knows that the letter Z is pretty awesome.


Q: I have some questions about the MERT algorithm itself. Can you help me?

A: Ideally, referring you to Och's paper (Och, 2003) would be enough. Unfortunately, key ideas of MERT are not explained well in Och's paper. My Z-MERT paper (Zaidan, 2009) contains a section that explains MERT's optimization algorithm, so it could be pretty helpful.


Q: What else is in your paper?

A: The paper (included in the Z-MERT distribution) contains Z-MERT's pseudocode, contrasts my implementation to two existing implementations, and discusses some of Z-MERT's features. It also reports on a number of experiments that illustrate Z-MERT's runtime efficiency.


Q: I have a feeling that you'd like to thank some people. Am I right or am I right?

A: Right you are. For starters, this research was supported in part by the Defense Advanced Research Projects Agency's GALE program under Contract No. HR0011-06-2-0001. More importantly, I would like to thank some members of the Joshua development team at JHU who offered continuing and helpful discussion, feedback, and ideas: Zhifei Li, Lane Schwartz, Wren Thornton, and our team leader Chris Callison-Burch.


Q: I have an unanswered question. What do I do now?

A: Ask it! Put that series of tubes to good use: ozaidan@cs.jhu.edu.


Q: Wow, you sure have a lot of patience to answer all those questions.

A: Is that a question or a qompliment?


7. History

Note: Version changes in the first decimal place (e.g. v1.05 to v1.10) reflect significant changes, such as changes in functionality or use. Changes in the second decimal place (e.g. v1.23 to v1.24) reflect minor changes in the documentation, instructions, output, etc.


v1.50 (2/14/11)

       Fixed bug that was causing code to attempt accessing the -1th element of an array. :-(
       Fixed flag passing to tercom.
       Added fourth option to TER (number of threads for TER scoring).
       Added fifth option to TER (location of tercom jar file).
       Better batch evaluation for TER (and TER-BLEU), with less overhead.
       Minor documentation changes (fixing typos, etc).

v1.41 (10/28/09)

       Added -Dfile.encoding=utf8 flag to the launch command for tercom-0.7.25.

v1.40 (10/27/09)

       Added the TER-BLEU metric, which can be used in conjunction with tercom-0.7.25.
       Added the -txtNrm parameter for text normalization.
       Minor documentation changes (fixing typos, etc).

v1.30 (5/4/09)

       Better TER documentation.
       Bug fix: fixed BLEU's handling of very short sentences and empty lines.
       Minor documentation changes (fixing typos, etc).

v1.20 (3/10/09)

       Added TER.java, which can be used in conjunction with tercom-0.7.25.
       Bug fixes: BLEU's verbose output in EvalTool; prevMERTIteration behavior.
       Exception handling; eliminated unnecessary use of static; code style improvements.
       Slightly more efficient file handling.
       Eliminated some redundant sufficient statistics calculations, and switched to batch processing.
       Minor documentation changes (fixing typos, etc).

v1.10 (2/9/09)

       Full documentation
       .jar packaging
       Video demonstration (implementing a new metric in Z-MERT)

v1.00 (1/20/09)

       Initial release!


8. References

Li et al., 2009:

Z. Li, C. Callison-Burch, C. Dyer, J. Ganitkevitch, S. Khudanpur, L. Schwartz, W. Thornton, J. Weese, and O. Zaidan. 2009. Joshua: Open Source Toolkit for Parsing-based Machine Translation. In Proceedings of EACL 2009 Fourth Workshop on Statistical Machine Translation, pages 135-139.


Och, 2003:

F. Och. 2003. Minimum Error Rate Training in Statistical Machine Translation. In Proceedings of ACL, pages 160-167.


Zaidan, 2009:

O. Zaidan. 2009. Z-MERT: A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems. The Prague Bulletin of Mathematical Linguistics, No. 91:79-88.