Joshua
open source statistical hierarchical phrase-based machine translation system
Public Member Functions | |
| Chart (Sentence sentence, List< FeatureFunction > featureFunctions, List< StateComputer > stateComputers, Grammar[] grammars, boolean useMaxLMCostForOOV, String goalSymbol) | |
| void | setGoalSymbolID (int i) |
| HyperGraph | expand () |
| Cell | getCell (int i, int j) |
| void | addAxiom (int i, int j, Rule rule, SourcePath srcPath) |
Package Attributes | |
| int | nPreprunedEdges = 0 |
| int | nPreprunedFuzz1 = 0 |
| int | nPreprunedFuzz2 = 0 |
| int | nPrunedItems = 0 |
| int | nMerged = 0 |
| int | nAdded = 0 |
| int | nDotitemAdded = 0 |
| int | nCalledComputeNode = 0 |
| int | segmentID |
Private Member Functions | |
| void | completeSpan (int i, int j) |
| void | logStatistics (Level level) |
| int | addUnaryNodes (Grammar[] grs, int i, int j) |
| void | completeCell (int i, int j, DotNode dotNode, List< Rule > sortedRules, int arity, SourcePath srcPath) |
Private Attributes | |
| Cell[][] | cells |
| int | sourceLength |
| List< FeatureFunction > | featureFunctions |
| List< StateComputer > | stateComputers |
| Grammar[] | grammars |
| DotChart[] | dotcharts |
| Cell | goalBin |
| int | goalSymbolID = -1 |
| Lattice< Integer > | inputLattice |
| SyntaxTree | parseTree |
| Combiner | combiner = null |
| ManualConstraintsHandler | manualConstraintsHandler |
Static Private Attributes | |
| static final Logger | logger = Logger.getLogger(Chart.class.getName()) |
The Chart class implements chart parsing in three steps: (1) seeding the chart, (2) running the main CKY loop over bins, and (3) identifying the applicable rules in each bin.
Note: the combination operation is done in Cell.
Signatures of the classes:
Cell: i, j
SuperNode (used for the CKY check): i, j, lhs
HGNode ("or" node): i, j, lhs, edge ngrams
HyperEdge ("and" node)
Indexing: sentence positions start from zero. Cell (i, j) represents the span of words indexed [i, j-1], where i is in [0, n-1] and j is in [1, n].
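The indexing convention above can be illustrated with a small standalone sketch. It does not use the Joshua classes: `SpanIndexDemo`, `wordsInCell`, and `cellsInCkyOrder` are hypothetical names introduced here, and the narrowest-first visiting order is an assumption about how a CKY-style loop typically walks the cells, not a statement about Chart's actual loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Joshua: shows which words cell (i, j) covers.
final class SpanIndexDemo {

    /** Returns the words covered by cell (i, j), i.e. indices i..j-1 inclusive. */
    static String[] wordsInCell(String[] sentence, int i, int j) {
        return Arrays.copyOfRange(sentence, i, j);
    }

    /** Enumerates all cells (i, j) of an n-word sentence, narrowest spans first (assumed order). */
    static List<int[]> cellsInCkyOrder(int n) {
        List<int[]> order = new ArrayList<>();
        for (int width = 1; width <= n; width++) {
            for (int i = 0; i + width <= n; i++) {
                order.add(new int[] {i, i + width}); // i in [0, n-1], j in [1, n]
            }
        }
        return order;
    }

    public static void main(String[] args) {
        String[] sentence = {"the", "cat", "sat"};
        for (int[] cell : cellsInCkyOrder(sentence.length)) {
            String words = String.join(" ", wordsInCell(sentence, cell[0], cell[1]));
            System.out.println("(" + cell[0] + "," + cell[1] + ") -> " + words);
        }
    }
}
```

With this convention, cell (0, n) always covers the whole sentence, and each single word k sits in cell (k, k+1).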
| joshua.decoder.chart_parser.Chart.Chart | ( | Sentence | sentence, |
| List< FeatureFunction > | featureFunctions, | ||
| List< StateComputer > | stateComputers, | ||
| Grammar[] | grammars, | ||
| boolean | useMaxLMCostForOOV, | ||
| String | goalSymbol | ||
| ) |
| void joshua.decoder.chart_parser.Chart.addAxiom | ( | int | i, |
| int | j, | ||
| Rule | rule, | ||
| SourcePath | srcPath | ||
| ) |
An axiom is for a rule with zero arity.
| int joshua.decoder.chart_parser.Chart.addUnaryNodes | ( | Grammar[] | grs, |
| int | i, | ||
| int | j | ||
| ) | [private] |
Agenda-based extension: this is necessary in case several unary rules can be applied in topological order, for example s->x; ss->s. For a unary rule like s->x, once x is complete, s is also complete.
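The agenda idea can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the body of addUnaryNodes: symbols are plain strings, the class `UnaryClosureDemo` and its `close` method are hypothetical, and the rule table maps each child symbol to the parents that rewrite to it through a unary rule.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not part of Joshua: agenda-based closure over unary rules.
final class UnaryClosureDemo {

    /**
     * Returns every symbol that is complete once the unary rules have been applied.
     * parentsOfChild maps a child symbol to the left-hand sides of the unary rules
     * whose right-hand side is that child (for s->x, the entry is x -> [s]).
     */
    static Set<String> close(Set<String> complete, Map<String, List<String>> parentsOfChild) {
        Set<String> done = new LinkedHashSet<>(complete);
        Deque<String> agenda = new ArrayDeque<>(complete);
        while (!agenda.isEmpty()) {
            String child = agenda.poll();
            List<String> parents = parentsOfChild.getOrDefault(child, Collections.<String>emptyList());
            for (String parent : parents) {
                // Only newly completed symbols go back on the agenda,
                // so each symbol is processed at most once.
                if (done.add(parent)) {
                    agenda.add(parent);
                }
            }
        }
        return done;
    }
}
```

For the rules s->x and ss->s, starting from a complete x, the agenda first yields s and then, because s was put back on the agenda, ss. A single pass over the rules without an agenda could miss ss if the rules happened to be visited in the wrong order.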
| void joshua.decoder.chart_parser.Chart.completeCell | ( | int | i, |
| int | j, | ||
| DotNode | dotNode, | ||
| List< Rule > | sortedRules, | ||
| int | arity, | ||
| SourcePath | srcPath | ||
| ) | [private] |
| void joshua.decoder.chart_parser.Chart.completeSpan | ( | int | i, |
| int | j | ||
| ) | [private] |
Construct the hypergraph with the help of DotChart.
This is a parser that can handle multiple grammars, on-the-fly binarization, and unary rules (without cycles).
Each DotChart can act individually (without consulting other DotCharts) because it consumes either the source input or the complete nonterminals, both of which are grammar-independent.
Cube pruning requires the nodes to be sorted when pruning for later (wider) cells. Cube pruning sees a SuperNode, which contains a list of nodes; getSortedNodes() sorts the nodes in the SuperNode.
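The sorting requirement can be pictured with a small stand-in. The classes below are hypothetical and only borrow the method name getSortedNodes() from the text above; the real SuperNode and HGNode carry more state. The sketch also assumes that a lower cost is better and that sorting is deferred until the list is first requested.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the Joshua implementation.
final class SortedNodesDemo {

    /** Stand-in for a chart node; only the cost matters for this illustration. */
    static final class Node {
        final String label;
        final double cost;

        Node(String label, double cost) {
            this.label = label;
            this.cost = cost;
        }
    }

    /** Stand-in for a SuperNode: a list of nodes that is sorted on demand. */
    static final class SuperNodeSketch {
        private final List<Node> nodes = new ArrayList<>();
        private boolean sorted = true;

        void add(Node node) {
            nodes.add(node);
            sorted = false; // a new node invalidates the previous ordering
        }

        /** Sorts the nodes cheapest first (assumed ordering) if needed, then returns them. */
        List<Node> getSortedNodes() {
            if (!sorted) {
                nodes.sort(Comparator.comparingDouble((Node n) -> n.cost));
                sorted = true;
            }
            return nodes;
        }
    }
}
```

Keeping the list sorted lets a cube-pruning step start from the best node of each SuperNode and move outward to worse ones, which is why the nodes of narrower cells need to be in order before wider cells are built from them.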
| Cell joshua.decoder.chart_parser.Chart.getCell | ( | int | i, |
| int | j | ||
| ) |
| void joshua.decoder.chart_parser.Chart.logStatistics | ( | Level | level | ) | [private] |
| void joshua.decoder.chart_parser.Chart.setGoalSymbolID | ( | int | i | ) |
Manually set the goal symbol ID. The constructor expects a String representing the goal symbol, but there may be times (for example, in the second pass of a synchronous parse) when we want to set the goal symbol to a particular ID, regardless of its String representation.
This method should be called before expanding the chart, as chart expansion depends on the goal symbol ID.
| i | the id of the goal symbol to use |
Cell [][] joshua.decoder.chart_parser.Chart.cells [private] |
Combiner joshua.decoder.chart_parser.Chart.combiner = null [private] |
DotChart [] joshua.decoder.chart_parser.Chart.dotcharts [private] |
List<FeatureFunction> joshua.decoder.chart_parser.Chart.featureFunctions [private] |
int joshua.decoder.chart_parser.Chart.goalSymbolID = -1 [private] |
Grammar [] joshua.decoder.chart_parser.Chart.grammars [private] |
Lattice<Integer> joshua.decoder.chart_parser.Chart.inputLattice [private] |
final Logger joshua.decoder.chart_parser.Chart.logger = Logger.getLogger(Chart.class.getName()) [static, private] |
int joshua.decoder.chart_parser.Chart.nAdded = 0 [package] |
int joshua.decoder.chart_parser.Chart.nCalledComputeNode = 0 [package] |
int joshua.decoder.chart_parser.Chart.nDotitemAdded = 0 [package] |
int joshua.decoder.chart_parser.Chart.nMerged = 0 [package] |
int joshua.decoder.chart_parser.Chart.nPreprunedEdges = 0 [package] |
The number of items that have been pruned away because their cost is greater than the cutoff when calling chart.add_deduction_in_chart().
int joshua.decoder.chart_parser.Chart.nPreprunedFuzz1 = 0 [package] |
int joshua.decoder.chart_parser.Chart.nPreprunedFuzz2 = 0 [package] |
int joshua.decoder.chart_parser.Chart.nPrunedItems = 0 [package] |
int joshua.decoder.chart_parser.Chart.segmentID [package] |
int joshua.decoder.chart_parser.Chart.sourceLength [private] |
List<StateComputer> joshua.decoder.chart_parser.Chart.stateComputers [private] |