Joshua
open source statistical hierarchical phrase-based machine translation system
Classes | |
| class | RankerResult |
| class | RankerTask |
Public Member Functions | |
| NbestMinRiskReranker (boolean produceRerankedNbest, double scalingFactor) | |
| String | processOneSent (List< String > nbest, int sentID) |
| double | computeExpectedGain (int curHypLen, HashMap< String, Integer > curHypNgramTbl, List< HashMap< String, Integer >> ngramTbls, List< Integer > sentLens, List< Double > nbestProbs) |
Static Public Member Functions | |
| static void | computeNormalizedProbs (List< Double > nbestLogProbs, double scalingFactor) |
| static double | computeExpectedGain (String curHyp, List< String > nbestHyps, List< Double > nbestProbs) |
| static void | main (String[] args) throws IOException |
Package Functions | |
| void | getGooglePosteriorCounts (List< HashMap< String, Integer >> ngramTbls, List< Double > normalizedProbs, HashMap< String, Double > posteriorCountsTbl) |
| double | computeExpectedLinearCorpusGain (int curHypLen, HashMap< String, Integer > curHypNgramTbl, HashMap< String, Double > posteriorCountsTbl) |
Package Attributes | |
| boolean | produceRerankedNbest = false |
| double | scalingFactor = 1.0 |
| final PriorityBlockingQueue < RankerResult > | resultsQueue |
Static Package Attributes | |
| static int | bleuOrder = 4 |
| static boolean | doNgramClip = true |
| static boolean | useGoogleLinearCorpusGain = false |
Static Private Member Functions | |
| static double | addInLogSemiring (double x, double y, int addMode) |
This class implements n-best minimum Bayes risk (MBR) reranking using BLEU as the gain function.
This assumes that each string is unique in the n-best list. In Hiero, due to spurious ambiguity, a string may correspond to many possible derivations, and ideally the probability of a string should be the sum over all the derivations leading to that string. In practice, however, one normally uses a Viterbi approximation: the probability of a string is the probability of its best derivation. So anyone who wants to deal with spurious ambiguity should do so before calling this class.
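The reranking idea described above can be sketched as follows. This is a minimal illustration, not Joshua's implementation: the class name MbrSketch, its method names, and the simple clipped n-gram precision used as the gain are assumptions made for brevity (the class itself uses BLEU). Each hypothesis is scored by its expected gain against every hypothesis in the list, weighted by the normalized probabilities.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of n-best MBR reranking; not the Joshua source.
class MbrSketch {
    // Count all n-grams up to maxOrder in a whitespace-tokenized sentence.
    static Map<String, Integer> ngramCounts(String sent, int maxOrder) {
        Map<String, Integer> tbl = new HashMap<>();
        if (sent.trim().isEmpty()) return tbl;
        String[] words = sent.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            StringBuilder ngram = new StringBuilder();
            for (int n = 0; n < maxOrder && i + n < words.length; n++) {
                if (n > 0) ngram.append(' ');
                ngram.append(words[i + n]);
                tbl.merge(ngram.toString(), 1, Integer::sum);
            }
        }
        return tbl;
    }

    // Stand-in gain (an assumption, NOT BLEU): the fraction of the
    // hypothesis n-grams that also occur in the reference, with clipping.
    static double gain(Map<String, Integer> hyp, Map<String, Integer> ref) {
        int total = 0;
        int matched = 0;
        for (Map.Entry<String, Integer> e : hyp.entrySet()) {
            total += e.getValue();
            matched += Math.min(e.getValue(), ref.getOrDefault(e.getKey(), 0));
        }
        return total == 0 ? 0.0 : (double) matched / total;
    }

    // Return the index of the hypothesis with the highest expected gain.
    // probs must already be normalized. Cost is O(n^2) in the list size.
    static int bestIndex(List<String> hyps, List<Double> probs, int maxOrder) {
        List<Map<String, Integer>> tbls = new ArrayList<>();
        for (String h : hyps) tbls.add(ngramCounts(h, maxOrder));
        int best = -1;
        double bestGain = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < hyps.size(); i++) {
            double expected = 0.0;
            for (int j = 0; j < hyps.size(); j++) {
                // Every hypothesis j acts as a pseudo-reference with weight p_j.
                expected += probs.get(j) * gain(tbls.get(i), tbls.get(j));
            }
            if (expected > bestGain) {
                bestGain = expected;
                best = i;
            }
        }
        return best;
    }
}
```

Note that the selected hypothesis need not be the most probable one: a hypothesis that shares many n-grams with other likely candidates can collect a higher expected gain than an isolated top-scoring string.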
| joshua.decoder.NbestMinRiskReranker.NbestMinRiskReranker | ( | boolean | produceRerankedNbest, |
| double | scalingFactor | ||
| ) |
| static double joshua.decoder.NbestMinRiskReranker.addInLogSemiring | ( | double | x, |
| double | y, | ||
| int | addMode | ||
| ) | [static, private] |
| double joshua.decoder.NbestMinRiskReranker.computeExpectedGain | ( | int | curHypLen, |
| HashMap< String, Integer > | curHypNgramTbl, | ||
| List< HashMap< String, Integer >> | ngramTbls, | ||
| List< Integer > | sentLens, | ||
| List< Double > | nbestProbs | ||
| ) |
| static double joshua.decoder.NbestMinRiskReranker.computeExpectedGain | ( | String | curHyp, |
| List< String > | nbestHyps, | ||
| List< Double > | nbestProbs | ||
| ) | [static] |
| double joshua.decoder.NbestMinRiskReranker.computeExpectedLinearCorpusGain | ( | int | curHypLen, |
| HashMap< String, Integer > | curHypNgramTbl, | ||
| HashMap< String, Double > | posteriorCountsTbl | ||
| ) | [package] |
| static void joshua.decoder.NbestMinRiskReranker.computeNormalizedProbs | ( | List< Double > | nbestLogProbs, |
| double | scalingFactor | ||
| ) | [static] |
Based on the list of log-probabilities in nbestLogProbs, obtains a normalized distribution and writes each normalized probability (a real value in [0,1]) back into nbestLogProbs.
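One common way to carry out this kind of in-place normalization is sketched below. It is an assumption-laden illustration rather than the method's actual body: the class and method names are hypothetical, and it subtracts the maximum scaled log-probability before exponentiating for numerical stability, whereas the class may rely on addInLogSemiring instead.

```java
import java.util.List;

// Hypothetical sketch of in-place normalization; not the Joshua source.
class NormalizeSketch {
    // Replace each log-probability with exp(scalingFactor * logProb) / Z,
    // where Z is the sum of those exponentials over the whole list.
    // The list must be mutable (for example an ArrayList).
    static void normalizeInPlace(List<Double> logProbs, double scalingFactor) {
        // Find the largest scaled value so the exponentials cannot overflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double lp : logProbs) {
            max = Math.max(max, lp * scalingFactor);
        }
        // Accumulate the normalization constant relative to max.
        double sum = 0.0;
        for (double lp : logProbs) {
            sum += Math.exp(lp * scalingFactor - max);
        }
        // Overwrite each entry with its normalized probability.
        for (int i = 0; i < logProbs.size(); i++) {
            double scaled = logProbs.get(i) * scalingFactor;
            logProbs.set(i, Math.exp(scaled - max) / sum);
        }
    }
}
```

A scaling factor above 1.0 sharpens the distribution toward the best hypothesis, while a value below 1.0 flattens it.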
| void joshua.decoder.NbestMinRiskReranker.getGooglePosteriorCounts | ( | List< HashMap< String, Integer >> | ngramTbls, |
| List< Double > | normalizedProbs, | ||
| HashMap< String, Double > | posteriorCountsTbl | ||
| ) | [package] |
| static void joshua.decoder.NbestMinRiskReranker.main | ( | String[] | args | ) | throws IOException [static] |
| String joshua.decoder.NbestMinRiskReranker.processOneSent | ( | List< String > | nbest, |
| int | sentID | ||
| ) |
The values in baselineScores will be changed to normalized probabilities.
TODO: zhifei: the reranking currently takes O(n^2) time, where n is the size of the n-best list. This can be sped up significantly (to O(n)) by first estimating a model on the n-best list and then reranking the n-best list using the estimated model.
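The two-pass idea in the TODO could look roughly like the sketch below. Everything here is hypothetical: the class and method names are invented, and the unweighted linear score is a placeholder rather than the gain Joshua computes. The first pass builds one shared table of expected n-gram counts; the second pass scores each hypothesis against that table only, so the number of hypothesis-level scoring steps grows linearly with the list size.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of linear-time reranking; not the Joshua source.
class LinearRerankSketch {
    // Pass 1: expected count of every n-gram under the normalized
    // n-best distribution (one sweep over all hypotheses).
    static Map<String, Double> posteriorCounts(
            List<Map<String, Integer>> ngramTbls, List<Double> probs) {
        Map<String, Double> posterior = new HashMap<>();
        for (int j = 0; j < ngramTbls.size(); j++) {
            double p = probs.get(j);
            for (Map.Entry<String, Integer> e : ngramTbls.get(j).entrySet()) {
                posterior.merge(e.getKey(), p * e.getValue(), Double::sum);
            }
        }
        return posterior;
    }

    // Pass 2: placeholder linear score of one hypothesis against the
    // shared table. It touches only the hypothesis's own n-grams.
    static double linearScore(
            Map<String, Integer> hypTbl, Map<String, Double> posterior) {
        double score = 0.0;
        for (Map.Entry<String, Integer> e : hypTbl.entrySet()) {
            score += e.getValue() * posterior.getOrDefault(e.getKey(), 0.0);
        }
        return score;
    }
}
```

A placeholder score like this favors longer hypotheses, so a practical gain would also need per-order weights and a length term.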
int joshua.decoder.NbestMinRiskReranker.bleuOrder = 4 [static, package] |
boolean joshua.decoder.NbestMinRiskReranker.doNgramClip = true [static, package] |
boolean joshua.decoder.NbestMinRiskReranker.produceRerankedNbest = false [package] |
final PriorityBlockingQueue<RankerResult> joshua.decoder.NbestMinRiskReranker.resultsQueue [package] |
Initial value: new PriorityBlockingQueue<RankerResult>()
double joshua.decoder.NbestMinRiskReranker.scalingFactor = 1.0 [package] |
boolean joshua.decoder.NbestMinRiskReranker.useGoogleLinearCorpusGain = false [static, package] |