Joshua
open source statistical hierarchical phrase-based machine translation system
joshua.decoder.InputHandler Class Reference

Public Member Functions

synchronized boolean hasNext ()
Sentence next ()
void remove ()
void register (Translation translation)
String oracleSentence (int id)

Package Functions

 InputHandler (String corpusFile, String oracleFile)

Package Attributes

String corpusFile = null
int sentenceNo = -1
Sentence nextSentence = null
BufferedReader lineReader = null
String nextOracleSentence = null
BufferedReader oracleReader = null
List< Sentence > issued
List< Translation > completed
List< String > oracles
int lastCompletedId = -1

Private Member Functions

void prepareNextLine ()

Static Private Attributes

static final Logger logger = Logger.getLogger(InputHandler.class.getName())
static final Charset FILE_ENCODING = Charset.forName("UTF-8")

Detailed Description

This class represents input to the decoder. It currently supports three kinds of input: (1) plain sentences, (2) sentences wrapped in a <seg> tag (via the Sentence class), and (3) lattices (in Python Lattice Format, via the Lattice class). Format (2) is used to specify the sentence number of each sentence.
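
The difference between formats (1) and (2) can be illustrated with a minimal sketch. The SegLine class, its parse() method, and the exact wrapper shape <seg id="N">...</seg> are assumptions made for illustration; they are not taken from the Sentence class, whose actual parsing rules may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Joshua: splits an input line into an id and its text.
final class SegLine {
    // Assumed wrapper shape: <seg id="N">text</seg>
    private static final Pattern SEG =
        Pattern.compile("\\s*<seg\\s+id=\"?(\\d+)\"?[^>]*>(.*?)</seg>\\s*");

    final int id;
    final String text;

    private SegLine(int id, String text) {
        this.id = id;
        this.text = text;
    }

    // fallbackId is used for plain sentences, which carry no number of their own.
    static SegLine parse(String line, int fallbackId) {
        Matcher m = SEG.matcher(line);
        if (m.matches()) {
            return new SegLine(Integer.parseInt(m.group(1)), m.group(2).trim());
        }
        return new SegLine(fallbackId, line.trim());
    }

    public static void main(String[] args) {
        SegLine tagged = parse("<seg id=\"17\">das ist ein Haus</seg>", 0);
        SegLine plain = parse("das ist ein Haus", 3);
        System.out.println(tagged.id + "\t" + tagged.text);
        System.out.println(plain.id + "\t" + plain.text);
    }
}
```

In this sketch a plain sentence takes whatever number the caller supplies (for example a running line counter), while a tagged sentence carries its own.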

The input handler provides thread-safe sequential access to the input sentences. It also manages receiving and assembling decoded sentences in order (via calls to register()).

Ideally, InputHandler objects could also represent complicated constraints and restrictions on the object being decoded. The chart-parsing code would then need to be aware of those restrictions; this object would parse them from the input and provide them to the parser.

Author:
Matt Post post@jhu.edu

Constructor & Destructor Documentation

joshua.decoder.InputHandler.InputHandler ( String  corpusFile, String  oracleFile ) [package]

Member Function Documentation

synchronized boolean joshua.decoder.InputHandler.hasNext ( )

String joshua.decoder.InputHandler.oracleSentence ( int  id )

When the ability to handle oracle sentences is added back in, this function should return the parallel oracle sentence.

void joshua.decoder.InputHandler.prepareNextLine ( ) [private]

This is called only from (a) the constructor and (b) the next() function. Since the constructor is called only once, and the call to prepareNextLine() in next() happens within a lock, this function does not require synchronization.
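
The locking argument above can be sketched as follows. LookaheadLines is a hypothetical stand-in written for illustration, not the InputHandler source: it buffers one line ahead and refills the buffer from a private, unsynchronized helper whose only callers are the constructor and a synchronized next().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the pattern; not the Joshua implementation.
final class LookaheadLines {
    private final BufferedReader reader;
    private String nextLine; // buffered one line ahead; null once input is exhausted

    LookaheadLines(BufferedReader reader) {
        this.reader = reader;
        prepareNextLine(); // safe: the object is not yet visible to other threads
    }

    synchronized boolean hasNext() {
        return nextLine != null;
    }

    synchronized String next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        String current = nextLine;
        prepareNextLine(); // runs while this object's monitor is held
        return current;
    }

    // Unsynchronized on purpose: every caller is the constructor or already holds the lock.
    private void prepareNextLine() {
        try {
            nextLine = reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        LookaheadLines in =
            new LookaheadLines(new BufferedReader(new StringReader("a\nb\n")));
        while (in.hasNext()) {
            System.out.println(in.next());
        }
    }
}
```

One caveat of this sketch: with several consumer threads, hasNext() followed by next() is a check-then-act sequence, so another thread may take the last line in between. Callers of the sketch would need to handle NoSuchElementException or coordinate in some other way.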

void joshua.decoder.InputHandler.register ( Translation  translation )

Receives a sentence from a thread that has finished translating it.
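
The class description says that register() is also how decoded sentences are assembled in order. One way such in-order assembly could work is sketched below. OrderedCollector and its methods are hypothetical, translations are reduced to plain strings, and sentence ids are assumed to be contiguous and to start at 0; only the lastCompletedId name mirrors a field listed above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of in-order assembly; not the Joshua implementation.
final class OrderedCollector {
    private final Map<Integer, String> pending = new HashMap<>();
    private final List<String> emitted = new ArrayList<>();
    private int lastCompletedId = -1; // id of the last result released in order

    // Called by worker threads in any order; results are released strictly by id.
    synchronized void register(int id, String translation) {
        pending.put(id, translation);
        // Release every result that is now contiguous with what was already emitted.
        while (pending.containsKey(lastCompletedId + 1)) {
            lastCompletedId++;
            emitted.add(pending.remove(lastCompletedId));
        }
    }

    // Returns a snapshot of the results released so far.
    synchronized List<String> emitted() {
        return new ArrayList<>(emitted);
    }

    public static void main(String[] args) {
        OrderedCollector collector = new OrderedCollector();
        collector.register(1, "second"); // held back until id 0 arrives
        collector.register(0, "first");  // releases ids 0 and 1
        collector.register(2, "third");
        System.out.println(collector.emitted());
    }
}
```

A result that arrives early simply waits in the pending map, so fast threads never reorder the output.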

Member Data Documentation

String joshua.decoder.InputHandler.corpusFile = null [package]
final Charset joshua.decoder.InputHandler.FILE_ENCODING = Charset.forName("UTF-8") [static, private]
BufferedReader joshua.decoder.InputHandler.lineReader = null [package]
final Logger joshua.decoder.InputHandler.logger = Logger.getLogger(InputHandler.class.getName()) [static, private]
BufferedReader joshua.decoder.InputHandler.oracleReader = null [package]
List<String> joshua.decoder.InputHandler.oracles [package]