Joshua
open source statistical hierarchical phrase-based machine translation system
Static Public Member Functions
static void main (String[] args)
Static Protected Member Functions
static void extractOneBest (IndexedReader<String> nbestReader, BufferedWriter onebestWriter) throws IOException
This program extracts the 1-best output translations from the n-best output translations generated by joshua.decoder.JoshuaDecoder.
static void joshua.util.ExtractTopCand.extractOneBest (IndexedReader<String> nbestReader, BufferedWriter onebestWriter) throws IOException [static, protected]
Prints the one-best translation for each segment ID from the reader as a line on the writer, and closes both the reader and the writer before returning. The translations are printed in the order in which each segment ID first occurs. No information about the segment other than the translation is printed to the writer; in particular, the segment ID is omitted.
This implementation assumes:
We will need to alter the implementation if these assumptions no longer hold for the output of JoshuaDecoder (or any sensible n-best format passed to this method).
We should switch to using an n-best joshua.decoder.segment_file.SegmentFileParser so that this method stays compatible if the decoder's output format becomes configurable. The MERT code needs such a SegmentFileParser anyway, so sharing it would reduce the code duplication between these two classes.
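The behavior described above can be illustrated with a minimal sketch. This is not the Joshua implementation: it uses a plain BufferedReader in place of IndexedReader so that it stands alone, and the class name OneBestSketch is hypothetical. It also assumes that each n-best line has the form "segmentID ||| translation ||| ..." and that the first line seen for a segment ID carries its best translation; neither assumption is confirmed by the text above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch; not part of joshua.util.
class OneBestSketch {
    // Assumed field delimiter: "|||" with optional surrounding whitespace.
    private static final Pattern DELIM = Pattern.compile("\\s*\\|\\|\\|\\s*");

    static void extractOneBest(BufferedReader nbestReader,
                               BufferedWriter onebestWriter) throws IOException {
        // try-with-resources closes both streams, even if an exception is thrown.
        try (BufferedReader in = nbestReader; BufferedWriter out = onebestWriter) {
            Set<String> seenIds = new HashSet<>();
            String line;
            while ((line = in.readLine()) != null) {
                String[] fields = DELIM.split(line.trim());
                if (fields.length < 2) {
                    continue; // skip blank or malformed lines
                }
                // Set.add returns true only the first time an ID is seen,
                // so only the first candidate per segment is written.
                if (seenIds.add(fields[0])) {
                    out.write(fields[1]); // translation only, no segment ID
                    out.newLine();
                }
            }
        }
    }
}
```

Tracking seen IDs in a set keeps the output in first-occurrence order even if the candidates for a segment are not contiguous. If contiguity can be relied on, remembering only the previous ID would use less memory.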
static void joshua.util.ExtractTopCand.main (String[] args) [static]
Usage: java ExtractTopCand nbestInputFile 1bestOutputFile.
If the input file name is "-" then input is read from System.in. If the output file name is "-" then output is directed to System.out. If a file already exists with the output file name, it is truncated before writing. The bulk of this program is implemented by extractOneBest(IndexedReader,BufferedWriter).
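The "-" convention and the truncation behavior can be sketched roughly as follows. The class StreamArgsSketch and its helpers openInput and openOutput are hypothetical names, not part of Joshua, and the UTF-8 encoding is an assumption that the text above does not state.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; names and encoding are illustrative assumptions.
class StreamArgsSketch {
    // "-" selects standard input; any other name is opened as a file.
    static BufferedReader openInput(String name) throws IOException {
        InputStream in = "-".equals(name) ? System.in : new FileInputStream(name);
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // "-" selects standard output; otherwise the file is opened with
    // append=false, which truncates an existing file before writing.
    static BufferedWriter openOutput(String name) throws IOException {
        OutputStream out = "-".equals(name) ? System.out : new FileOutputStream(name, false);
        return new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }
}
```

A main method built on these helpers would check that exactly two arguments were given, open args[0] with openInput and args[1] with openOutput, and hand both streams to the extraction routine.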