Joshua
open source statistical hierarchical phrase-based machine translation system
joshua.decoder.ff.tm.Grammar Interface Reference

Public Member Functions

Trie getTrieRoot ()
void sortGrammar (List< FeatureFunction > models)
boolean isSorted ()
boolean hasRuleForSpan (int startIndex, int endIndex, int pathLength)
int getNumRules ()
Rule constructOOVRule (int num_feats, int source_word, int target_word, boolean use_max_lm_cost)
Rule constructLabeledOOVRule (int num_feats, int source_word, int target_word, int lhs, boolean use_max_lm_cost)
int getOOVRuleID ()
Rule constructManualRule (int lhs, int[] sourceWords, int[] targetWords, float[] scores, int aritity)
void writeGrammarOnDisk (String file)
void changeGrammarCosts (Map< String, Double > weightTbl, HashMap< String, Integer > featureMap, double[] scores, String prefix, int column, boolean negate)
void obtainRulesIDTable (Map< String, Integer > rulesIDTable)

Detailed Description

Grammar is an interface for wrapping a trie of TrieGrammar nodes in order to store metadata that applies to the grammar as a whole.

Author:
wren ng thornton, wren@users.sourceforge.net
Zhifei Li, zhifei.work@gmail.com

Member Function Documentation

void joshua.decoder.ff.tm.Grammar.changeGrammarCosts ( Map< String, Double >  weightTbl,
HashMap< String, Integer >  featureMap,
double[]  scores,
String  prefix,
int  column,
boolean  negate 
)
Rule joshua.decoder.ff.tm.Grammar.constructLabeledOOVRule ( int  num_feats,
int  source_word,
int  target_word,
int  lhs,
boolean  use_max_lm_cost 
)

Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.

Rule joshua.decoder.ff.tm.Grammar.constructManualRule ( int  lhs,
int[]  sourceWords,
int[]  targetWords,
float[]  scores,
int  aritity 
)

Constructs a manual rule supplied from outside the grammar. The owner of the rule should be the same as the owner of the grammar. The rule ID will be the same as the OOV rule ID, and the rule has no lattice cost.

Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.

Rule joshua.decoder.ff.tm.Grammar.constructOOVRule ( int  num_feats,
int  source_word,
int  target_word,
boolean  use_max_lm_cost 
)

Constructs an out-of-vocabulary (OOV) rule for the given source word. This method is only called when creating an OOV rule in Chart or DiskHypergraph. The transition costs for the phrase model, arity penalty, and word penalty are all zero; the only exceptions are the LM cost, or the first feature if use_max_lm_cost == false.

TODO: will try to get rid of owner, have_lm_model, and num_feats

Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.

int joshua.decoder.ff.tm.Grammar.getNumRules ()

Gets the number of rules stored in the grammar.

Returns:
the number of rules stored in the grammar

Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.

int joshua.decoder.ff.tm.Grammar.getOOVRuleID ()

Gets the integer identifier of this grammar's out-of-vocabulary (OOV) rule.

Returns:
the integer identifier of this grammar's out-of-vocabulary (OOV) rule

Implemented in joshua.decoder.ff.tm.AbstractGrammar.

Trie joshua.decoder.ff.tm.Grammar.getTrieRoot ()

Gets the root of the Trie backing this grammar.

Note: This method should run as a small constant-time function.

Returns:
the root of the Trie backing this grammar

Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
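The constant-time note above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: TrieNodeSketch, TrieGrammarSketch, child, childOrCreate, addRule and hasRuleFor are hypothetical names, not Joshua classes or methods. The only point is that getTrieRoot hands back a stored reference, and that any lookup work happens while walking down from that root.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trie node keyed by integer word IDs (not Joshua's Trie interface).
final class TrieNodeSketch {
    private final Map<Integer, TrieNodeSketch> children = new HashMap<>();
    boolean hasRules = false;

    // Returns the child for wordId, or null if there is none.
    TrieNodeSketch child(int wordId) {
        return children.get(wordId);
    }

    // Returns the child for wordId, creating it if it does not exist yet.
    TrieNodeSketch childOrCreate(int wordId) {
        return children.computeIfAbsent(wordId, k -> new TrieNodeSketch());
    }
}

// Hypothetical grammar wrapper holding the root of the trie.
final class TrieGrammarSketch {
    private final TrieNodeSketch root = new TrieNodeSketch();

    // Constant time: only returns a stored reference, no traversal or allocation.
    TrieNodeSketch getTrieRoot() {
        return root;
    }

    // Registers a rule under the given source-side word IDs.
    void addRule(int[] sourceWords) {
        TrieNodeSketch node = root;
        for (int w : sourceWords) {
            node = node.childOrCreate(w);
        }
        node.hasRules = true;
    }

    // Walks from the root and reports whether a rule ends exactly at this path.
    boolean hasRuleFor(int[] sourceWords) {
        TrieNodeSketch node = getTrieRoot();
        for (int w : sourceWords) {
            node = node.child(w);
            if (node == null) {
                return false;
            }
        }
        return node.hasRules;
    }

    public static void main(String[] args) {
        TrieGrammarSketch grammar = new TrieGrammarSketch();
        grammar.addRule(new int[] {1, 2, 3});
        System.out.println(grammar.hasRuleFor(new int[] {1, 2, 3}));
        System.out.println(grammar.hasRuleFor(new int[] {1, 2}));
    }
}
```

Keeping the accessor trivial means callers can fetch the root freely, for example once per span, without worrying about its cost.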

boolean joshua.decoder.ff.tm.Grammar.hasRuleForSpan ( int  startIndex,
int  endIndex,
int  pathLength 
)

Returns whether this grammar has any valid rules for covering a particular span of a sentence. Hiero's "glue" grammar returns true only if the span is longer than the span limit and is anchored at startIndex == 0. Hiero's "regular" grammar returns true only if the span is shorter than the span limit. Other grammars, e.g. for rule-based systems, may behave differently.

Parameters:
startIndex: Indicates the starting index of a phrase in a source input phrase, or a starting node identifier in a source input lattice
endIndex: Indicates the ending index of a phrase in a source input phrase, or an ending node identifier in a source input lattice
pathLength: Length of the input path in a source input lattice. If a source input phrase is used instead of a lattice, this value will likely be ignored by the underlying implementation, but would normally be defined as endIndex-startIndex

Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
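The two Hiero behaviors described for hasRuleForSpan can be sketched as a pair of predicates. This is an illustrative sketch only, not the code in MemoryBasedBatchGrammar or PackedGrammar: the class name SpanLimitSketch, the explicit spanLimit parameter, and the choice to treat a span exactly equal to the limit as within the limit are all assumptions.

```java
// Hypothetical sketch of the span checks described above; not Joshua's implementation.
final class SpanLimitSketch {

    // "Regular" grammar: applies only to spans within the span limit.
    // Assumption: a span whose length equals the limit counts as within it.
    static boolean regularHasRuleForSpan(int startIndex, int endIndex,
                                         int pathLength, int spanLimit) {
        return pathLength <= spanLimit;
    }

    // "Glue" grammar: applies only to spans longer than the limit
    // that are anchored at the start of the sentence.
    static boolean glueHasRuleForSpan(int startIndex, int endIndex,
                                      int pathLength, int spanLimit) {
        return startIndex == 0 && pathLength > spanLimit;
    }

    public static void main(String[] args) {
        int spanLimit = 10; // hypothetical limit chosen for the demonstration
        System.out.println(regularHasRuleForSpan(2, 5, 3, spanLimit));
        System.out.println(glueHasRuleForSpan(0, 15, 15, spanLimit));
        System.out.println(glueHasRuleForSpan(3, 18, 15, spanLimit));
    }
}
```

For plain sentence input the sketch expects the caller to pass pathLength as endIndex - startIndex, matching the parameter description above.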

boolean joshua.decoder.ff.tm.Grammar.isSorted ()

Determines whether the rules in this grammar have been sorted based on the latest feature function values.

This method is needed for the cube-pruning algorithm.

Returns:
true if the rules in this grammar have been sorted based on the latest feature function values, false otherwise

Implemented in joshua.decoder.ff.tm.AbstractGrammar.

void joshua.decoder.ff.tm.Grammar.obtainRulesIDTable ( Map< String, Integer >  rulesIDTable)

void joshua.decoder.ff.tm.Grammar.sortGrammar ( List< FeatureFunction >  models)

After calling this method, the rules in this grammar are guaranteed to be sorted based on the latest feature function values.

Cube-pruning requires that the grammar be sorted based on the latest feature functions.

Parameters:
models: List of feature functions

Implemented in joshua.decoder.ff.tm.AbstractGrammar.
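The contract between sortGrammar and isSorted can be sketched as a sorted flag that is set by sorting and cleared whenever the rule set changes. The sketch below is an assumption-laden illustration, not the code in AbstractGrammar: rules are plain strings, each "model" is a java.util.function.ToDoubleFunction standing in for a FeatureFunction, lower summed cost sorts first, and SortedGrammarSketch, addRule and rules are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of the sortGrammar/isSorted contract; not Joshua's AbstractGrammar.
final class SortedGrammarSketch {
    private final List<String> rules = new ArrayList<>();
    private boolean sorted = false;

    // Adding a rule invalidates any previous ordering.
    void addRule(String rule) {
        rules.add(rule);
        sorted = false;
    }

    // True only if sortGrammar has run since the last modification.
    boolean isSorted() {
        return sorted;
    }

    // Orders rules by their summed model cost (assumption: lower cost first).
    void sortGrammar(List<ToDoubleFunction<String>> models) {
        rules.sort(Comparator.comparingDouble((String r) -> estimate(r, models)));
        sorted = true;
    }

    // Returns a copy so callers cannot disturb the sorted order.
    List<String> rules() {
        return new ArrayList<>(rules);
    }

    // Sums the scores that every model assigns to one rule.
    private static double estimate(String rule, List<ToDoubleFunction<String>> models) {
        double total = 0.0;
        for (ToDoubleFunction<String> model : models) {
            total += model.applyAsDouble(rule);
        }
        return total;
    }

    public static void main(String[] args) {
        SortedGrammarSketch grammar = new SortedGrammarSketch();
        grammar.addRule("ccc");
        grammar.addRule("a");
        List<ToDoubleFunction<String>> models = new ArrayList<>();
        models.add(rule -> rule.length()); // toy cost: longer rules cost more
        grammar.sortGrammar(models);
        System.out.println(grammar.rules() + " sorted=" + grammar.isSorted());
    }
}
```

A decoder using cube pruning could check isSorted before reading rules and call sortGrammar only when the flag is false, so the sort cost is paid once per set of feature weights.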
