Joshua
open source statistical hierarchical phrase-based machine translation system
joshua.decoder.ff.lm.ArpaFile Class Reference

Public Member Functions

 ArpaFile (String arpaFileName) throws IOException
int size ()
int getOrder () throws FileNotFoundException
Iterator< ArpaNgram > iterator ()

Static Public Attributes

static final Regex BLANK_LINE = new Regex("^\\s*$")
static final Regex NGRAM_HEADER = new Regex("^\\\\\\d-grams:\\s*$")
static final Regex NGRAM_END = new Regex("^\\\\end\\\\s*$")

Private Attributes

final File arpaFile

Static Private Attributes

static final Logger logger = Logger.getLogger(ArpaFile.class.getName())

Detailed Description

Utility class for reading ARPA language model files.

Author:
Lane Schwartz

Constructor & Destructor Documentation

joshua.decoder.ff.lm.ArpaFile.ArpaFile ( String  arpaFileName) throws IOException

Constructs an object that represents an ARPA language model file.

Parameters:
arpaFileName    File name of an ARPA language model file
vocab           Symbol table to be used by this object


Member Function Documentation

int joshua.decoder.ff.lm.ArpaFile.getOrder ( ) throws FileNotFoundException

Iterator< ArpaNgram > joshua.decoder.ff.lm.ArpaFile.iterator ( )

Gets an iterator capable of iterating over all n-grams in the ARPA file.

Returns:
an iterator capable of iterating over all n-grams in the ARPA file
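
The iteration described above can be illustrated with a small, self-contained sketch. It is not Joshua's implementation: the class `NgramLines` is hypothetical, it yields raw tab-separated fields rather than `ArpaNgram` objects, and it assumes the common ARPA layout in which each entry line holds a log probability, the words, and an optional backoff weight separated by tabs. It also assumes the header and end patterns behave as they would under `java.util.regex`.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Pattern;

/** Hypothetical sketch; not Joshua's ArpaNgram iterator. */
final class NgramLines implements Iterable<String[]> {
    // Same pattern strings as the documented NGRAM_HEADER and NGRAM_END constants.
    private static final Pattern NGRAM_HEADER = Pattern.compile("^\\\\\\d-grams:\\s*$");
    private static final Pattern NGRAM_END = Pattern.compile("^\\\\end\\\\s*$");

    private final List<String> lines;

    NgramLines(List<String> lines) {
        this.lines = lines;
    }

    @Override
    public Iterator<String[]> iterator() {
        return new Iterator<String[]>() {
            private int pos = 0;
            private boolean inSection = false;
            // Look-ahead buffer: the next entry to hand out, or null when exhausted.
            private String[] pending = advance();

            private String[] advance() {
                while (pos < lines.size()) {
                    String line = lines.get(pos++);
                    if (NGRAM_END.matcher(line).find()) {
                        pos = lines.size(); // stop at the end marker
                        return null;
                    }
                    if (NGRAM_HEADER.matcher(line).find()) {
                        inSection = true; // entries follow a "\N-grams:" header
                        continue;
                    }
                    if (inSection && !line.trim().isEmpty()) {
                        return line.trim().split("\t");
                    }
                }
                return null;
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public String[] next() {
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                String[] current = pending;
                pending = advance();
                return current;
            }
        };
    }
}
```

Lines before the first n-gram section header (the `\data\` block and its counts) are skipped, and blank lines between sections are ignored.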

int joshua.decoder.ff.lm.ArpaFile.size ( )

Gets the total number of n-grams in this ARPA language model file.

Returns:
total number of n-grams in this ARPA language model file
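
How `size()` arrives at this total is not documented on this page. One plausible approach, assuming the standard ARPA `\data\` header with one `ngram N=count` line per order, is to sum those declared counts. The class `ArpaCounts` and its `totalNgrams` method below are hypothetical and only illustrate that idea.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not Joshua's implementation of size(). */
final class ArpaCounts {
    // Matches header lines such as "ngram 2=1500"; group 2 is the declared count.
    private static final Pattern COUNT_LINE =
            Pattern.compile("^ngram\\s+(\\d+)\\s*=\\s*(\\d+)\\s*$");

    /** Sums the per-order counts declared in the ARPA header lines. */
    static int totalNgrams(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            Matcher m = COUNT_LINE.matcher(line);
            if (m.matches()) {
                total += Integer.parseInt(m.group(2));
            }
        }
        return total;
    }
}
```

This relies on the header being accurate. Counting the entry lines in each section instead would be slower but would not depend on the declared counts.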

Member Data Documentation

final File joshua.decoder.ff.lm.ArpaFile.arpaFile [private]

ARPA file for this object.

final Regex joshua.decoder.ff.lm.ArpaFile.BLANK_LINE = new Regex("^\\s*$") [static]

Regular expression representing a blank line.

final Logger joshua.decoder.ff.lm.ArpaFile.logger = Logger.getLogger(ArpaFile.class.getName()) [static, private]

Logger for this class.

final Regex joshua.decoder.ff.lm.ArpaFile.NGRAM_END = new Regex("^\\\\end\\\\s*$") [static]

Regular expression representing a line ending an ARPA language model file.

final Regex joshua.decoder.ff.lm.ArpaFile.NGRAM_HEADER = new Regex("^\\\\\\d-grams:\\s*$") [static]

Regular expression representing a line starting a new section of n-grams in an ARPA language model file.
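
As a rough illustration of how the three patterns partition the lines of an ARPA file, the sketch below compiles the same pattern strings with the standard `java.util.regex.Pattern` class. It assumes that Joshua's `Regex` wrapper follows `java.util.regex` matching semantics; the class `ArpaLineKinds` and its `classify` helper are hypothetical and not part of Joshua.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; not part of Joshua. */
final class ArpaLineKinds {
    // Same pattern strings as the documented constants.
    static final Pattern BLANK_LINE = Pattern.compile("^\\s*$");
    static final Pattern NGRAM_HEADER = Pattern.compile("^\\\\\\d-grams:\\s*$");
    static final Pattern NGRAM_END = Pattern.compile("^\\\\end\\\\s*$");

    /** Returns "BLANK", "HEADER", "END", or "OTHER" for a single line. */
    static String classify(String line) {
        if (BLANK_LINE.matcher(line).find()) return "BLANK";
        if (NGRAM_HEADER.matcher(line).find()) return "HEADER";
        if (NGRAM_END.matcher(line).find()) return "END";
        return "OTHER";
    }

    public static void main(String[] args) {
        String[] lines = {
            "", "\\data\\", "ngram 1=3", "\\1-grams:", "-0.5\tthe\t-0.3", "\\end\\"
        };
        for (String line : lines) {
            System.out.println(classify(line) + "\t" + line);
        }
    }
}
```

Under this assumption, `NGRAM_HEADER` accepts a single digit before `-grams:` and tolerates trailing whitespace. Read literally, the `NGRAM_END` string escapes the backslash before `s`, so it matches `\end\` followed by zero or more `s` characters rather than trailing whitespace; a line such as `\end\ ` with a trailing space would therefore not be recognized as the end marker.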