Joshua
open source statistical hierarchical phrase-based machine translation system
Public Member Functions
| Trie | getTrieRoot () |
| void | sortGrammar (List< FeatureFunction > models) |
| boolean | isSorted () |
| boolean | hasRuleForSpan (int startIndex, int endIndex, int pathLength) |
| int | getNumRules () |
| Rule | constructOOVRule (int num_feats, int source_word, int target_word, boolean use_max_lm_cost) |
| Rule | constructLabeledOOVRule (int num_feats, int source_word, int target_word, int lhs, boolean use_max_lm_cost) |
| int | getOOVRuleID () |
| Rule | constructManualRule (int lhs, int[] sourceWords, int[] targetWords, float[] scores, int aritity) |
| void | writeGrammarOnDisk (String file) |
| void | changeGrammarCosts (Map< String, Double > weightTbl, HashMap< String, Integer > featureMap, double[] scores, String prefix, int column, boolean negate) |
| void | obtainRulesIDTable (Map< String, Integer > rulesIDTable) |
Grammar is an interface for wrapping a trie of TrieGrammar in order to store holistic metadata.
| void joshua.decoder.ff.tm.Grammar.changeGrammarCosts | ( | Map< String, Double > | weightTbl, |
| HashMap< String, Integer > | featureMap, | ||
| double[] | scores, | ||
| String | prefix, | ||
| int | column, | ||
| boolean | negate | ||
| ) |
Implemented in joshua.decoder.ff.tm.AbstractGrammar.
| Rule joshua.decoder.ff.tm.Grammar.constructLabeledOOVRule | ( | int | num_feats, |
| int | source_word, | ||
| int | target_word, | ||
| int | lhs, | ||
| boolean | use_max_lm_cost | ||
| ) |
Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
| Rule joshua.decoder.ff.tm.Grammar.constructManualRule | ( | int | lhs, |
| int[] | sourceWords, | ||
| int[] | targetWords, | ||
| float[] | scores, | ||
| int | aritity | ||
| ) |
Constructs a manual rule supplied from outside the grammar; the rule's owner should be the same as the grammar's. The rule ID will be the same as the OOV rule ID (see getOOVRuleID), and the rule carries no lattice cost.
Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
| Rule joshua.decoder.ff.tm.Grammar.constructOOVRule | ( | int | num_feats, |
| int | source_word, | ||
| int | target_word, | ||
| boolean | use_max_lm_cost | ||
| ) |
Constructs an out-of-vocabulary (OOV) rule for the source word. This method is only called when creating an OOV rule in Chart or DiskHypergraph. All transition costs (phrase model, arity penalty, word penalty) are zero, except for the LM cost, or for the first feature if use_max_lm_cost==false.
TODO: will try to get rid of owner, have_lm_model, and num_feats
Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
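The cost layout described for constructOOVRule can be illustrated with a minimal sketch. It follows one reading of the description: every feature score starts at zero, and the first feature receives a penalty only when use_max_lm_cost is false. The class name, the helper method, and the penalty value are all hypothetical; the text does not state the actual value, and this is not the library's implementation.

```java
import java.util.Arrays;

// Hypothetical sketch of the OOV feature-score layout; not Joshua's actual code.
final class OovScoreSketch {
    /** Assumed penalty value; the documentation does not specify it. */
    static final float ASSUMED_OOV_PENALTY = 100.0f;

    /**
     * Builds the feature-score vector for an OOV rule.
     * All scores are zero; if no max LM cost is applied (useMaxLmCost == false),
     * the first feature carries the penalty instead.
     */
    static float[] oovFeatureScores(int numFeats, boolean useMaxLmCost) {
        float[] scores = new float[numFeats]; // Java zero-initializes the array
        if (!useMaxLmCost && numFeats > 0) {
            scores[0] = ASSUMED_OOV_PENALTY;
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(oovFeatureScores(3, true)));  // [0.0, 0.0, 0.0]
        System.out.println(Arrays.toString(oovFeatureScores(3, false))); // [100.0, 0.0, 0.0]
    }
}
```

The guard on numFeats keeps the sketch safe when no features are configured.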
| int joshua.decoder.ff.tm.Grammar.getNumRules | ( | ) |
Gets the number of rules stored in the grammar.
Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
| int joshua.decoder.ff.tm.Grammar.getOOVRuleID | ( | ) |
Gets the integer identifier of this grammar's out-of-vocabulary (OOV) rule.
Implemented in joshua.decoder.ff.tm.AbstractGrammar.
| Trie joshua.decoder.ff.tm.Grammar.getTrieRoot | ( | ) |
Gets the root of the Trie backing this grammar.
Note: This method should run as a small constant-time function.
Returns: the Trie backing this grammar
Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
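To illustrate why getTrieRoot should be a small constant-time call, here is a minimal sketch in which the root is a stored field and lookups walk down from it one source symbol at a time. ToyTrie, match, addRule, and countRulesFor are hypothetical names invented for this illustration; they are not the Joshua Trie API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a trie-backed grammar; not Joshua's actual classes.
final class TrieRootSketch {
    /** Toy trie node keyed by integer symbol IDs. */
    static final class ToyTrie {
        private final Map<Integer, ToyTrie> children = new HashMap<>();
        private int ruleCount = 0;

        /** Returns the child reached by symbolId, or null if there is none. */
        ToyTrie match(int symbolId) { return children.get(symbolId); }

        int ruleCount() { return ruleCount; }
    }

    private final ToyTrie root = new ToyTrie();

    /** Constant time: returns the stored root without building anything. */
    ToyTrie getTrieRoot() { return root; }

    /** Registers one rule whose source side is the given symbol sequence. */
    void addRule(int... sourceSymbols) {
        ToyTrie node = root;
        for (int s : sourceSymbols) {
            node = node.children.computeIfAbsent(s, k -> new ToyTrie());
        }
        node.ruleCount++;
    }

    /** Walks from the root and counts the rules stored at the matched node. */
    int countRulesFor(int... sourceSymbols) {
        ToyTrie node = getTrieRoot();
        for (int s : sourceSymbols) {
            node = node.match(s);
            if (node == null) return 0; // no rule has this source prefix
        }
        return node.ruleCount();
    }
}
```

Because every lookup starts with a call to getTrieRoot, any non-trivial work inside that method would be repeated once per lookup.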
| boolean joshua.decoder.ff.tm.Grammar.hasRuleForSpan | ( | int | startIndex, |
| int | endIndex, | ||
| int | pathLength | ||
| ) |
Returns whether this grammar has any valid rules for covering a particular span of a sentence. Hiero's "glue" grammar returns true only if the span is longer than the span limit and is anchored at startIndex==0. Hiero's "regular" grammar returns true only if the span is shorter than the span limit. Other grammars, e.g. for rule-based systems, may behave differently.
| startIndex | Indicates the starting index of a phrase in a source input phrase, or a starting node identifier in a source input lattice |
| endIndex | Indicates the ending index of a phrase in a source input phrase, or an ending node identifier in a source input lattice |
| pathLength | Length of the input path in a source input lattice. If a source input phrase is used instead of a lattice, this value will likely be ignored by the underlying implementation, but would normally be defined as endIndex-startIndex |
Implemented in joshua.decoder.ff.tm.hash_based.MemoryBasedBatchGrammar, and joshua.decoder.ff.tm.packed.PackedGrammar.
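The two Hiero policies described for hasRuleForSpan can be sketched as follows. This is a literal reading of the prose, not the library's code: the class name, the method names, and the explicit spanLimit parameter are hypothetical. The text does not say how a span exactly equal to the limit is treated, so this sketch assumes it is rejected by both policies.

```java
// Hypothetical sketch of the span policies described above; not Joshua's actual code.
final class SpanCheckSketch {
    /** "Regular" grammar policy: accept only spans shorter than the span limit. */
    static boolean regularHasRuleForSpan(int startIndex, int endIndex, int pathLength, int spanLimit) {
        return pathLength < spanLimit;
    }

    /** "Glue" grammar policy: accept only spans longer than the limit that start at index 0. */
    static boolean glueHasRuleForSpan(int startIndex, int endIndex, int pathLength, int spanLimit) {
        return startIndex == 0 && pathLength > spanLimit;
    }

    public static void main(String[] args) {
        int spanLimit = 10; // assumed value for illustration
        // Plain sentence input: pathLength is endIndex - startIndex.
        System.out.println(regularHasRuleForSpan(2, 5, 3, spanLimit)); // true
        System.out.println(glueHasRuleForSpan(2, 5, 3, spanLimit));    // false
        System.out.println(glueHasRuleForSpan(0, 15, 15, spanLimit));  // true
        System.out.println(glueHasRuleForSpan(1, 16, 15, spanLimit));  // false
    }
}
```

The sketch uses pathLength rather than endIndex-startIndex so that the same check applies to lattice input, where the two can differ.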
| boolean joshua.decoder.ff.tm.Grammar.isSorted | ( | ) |
Determines whether the rules in this grammar have been sorted based on the latest feature function values.
This method is needed for the cube-pruning algorithm.
Returns: true if the rules in this grammar have been sorted based on the latest feature function values, false otherwise
Implemented in joshua.decoder.ff.tm.AbstractGrammar.
| void joshua.decoder.ff.tm.Grammar.obtainRulesIDTable | ( | Map< String, Integer > | rulesIDTable | ) |
Implemented in joshua.decoder.ff.tm.AbstractGrammar.
| void joshua.decoder.ff.tm.Grammar.sortGrammar | ( | List< FeatureFunction > | models | ) |
After calling this method, the rules in this grammar are guaranteed to be sorted based on the latest feature function values.
Cube-pruning requires that the grammar be sorted based on the latest feature functions.
| models | List of feature functions |
Implemented in joshua.decoder.ff.tm.AbstractGrammar.
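The contract between sortGrammar and isSorted can be sketched with a toy rule collection that tracks a "sorted" flag. Everything here is hypothetical: ToyRule, the use of ToDoubleFunction as a stand-in for FeatureFunction, and the choice to sort by ascending summed cost are assumptions made for illustration, not the behavior of AbstractGrammar.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of the sortGrammar/isSorted contract; not Joshua's actual code.
final class SortSketch {
    /** Toy rule: a name plus a vector of feature scores. */
    static final class ToyRule {
        final String name;
        final double[] scores;
        ToyRule(String name, double... scores) { this.name = name; this.scores = scores; }
    }

    private final List<ToyRule> rules = new ArrayList<>();
    private boolean sorted = false;

    /** Adding a rule invalidates any earlier sort. */
    void addRule(ToyRule rule) { rules.add(rule); sorted = false; }

    boolean isSorted() { return sorted; }

    /** Sums the cost that each model assigns to the rule. */
    static double totalCost(ToyRule rule, List<ToDoubleFunction<ToyRule>> models) {
        double sum = 0.0;
        for (ToDoubleFunction<ToyRule> m : models) sum += m.applyAsDouble(rule);
        return sum;
    }

    /** Sorts rules by ascending total cost (assumed ordering) and marks the grammar sorted. */
    void sortGrammar(List<ToDoubleFunction<ToyRule>> models) {
        rules.sort(Comparator.comparingDouble((ToyRule r) -> totalCost(r, models)));
        sorted = true;
    }

    /** Rule names in their current order. */
    List<String> ruleNames() {
        List<String> names = new ArrayList<>();
        for (ToyRule r : rules) names.add(r.name);
        return names;
    }
}
```

A caller such as a cube-pruning routine would check isSorted first and call sortGrammar only when the flag is false, so the sort cost is paid once per set of feature weights.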
| void joshua.decoder.ff.tm.Grammar.writeGrammarOnDisk | ( | String | file | ) |
Implemented in joshua.decoder.ff.tm.AbstractGrammar.