Joshua
open source statistical hierarchical phrase-based machine translation system
joshua.decoder.NbestMinRiskReranker Class Reference

Classes

class  RankerResult
class  RankerTask

Public Member Functions

 NbestMinRiskReranker (boolean produceRerankedNbest, double scalingFactor)
String processOneSent (List< String > nbest, int sentID)
double computeExpectedGain (int curHypLen, HashMap< String, Integer > curHypNgramTbl, List< HashMap< String, Integer >> ngramTbls, List< Integer > sentLens, List< Double > nbestProbs)

Static Public Member Functions

static void computeNormalizedProbs (List< Double > nbestLogProbs, double scalingFactor)
static double computeExpectedGain (String curHyp, List< String > nbestHyps, List< Double > nbestProbs)
static void main (String[] args) throws IOException

Package Functions

void getGooglePosteriorCounts (List< HashMap< String, Integer >> ngramTbls, List< Double > normalizedProbs, HashMap< String, Double > posteriorCountsTbl)
double computeExpectedLinearCorpusGain (int curHypLen, HashMap< String, Integer > curHypNgramTbl, HashMap< String, Double > posteriorCountsTbl)

Package Attributes

boolean produceRerankedNbest = false
double scalingFactor = 1.0
final PriorityBlockingQueue< RankerResult > resultsQueue

Static Package Attributes

static int bleuOrder = 4
static boolean doNgramClip = true
static boolean useGoogleLinearCorpusGain = false

Static Private Member Functions

static double addInLogSemiring (double x, double y, int addMode)

Detailed Description

This class implements n-best minimum-risk (MBR) reranking using BLEU as the gain function.

This assumes that each string is unique in the n-best list. In Hiero, due to spurious ambiguity, a string may correspond to many possible derivations, and ideally the probability of a string should be the sum over all derivations leading to that string. In practice, however, one normally uses a Viterbi approximation: the probability of a string is the probability of its best derivation. So anyone who wants to deal with spurious ambiguity should do so before calling this class.
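One way to handle spurious ambiguity before calling this class is to collapse repeated strings in the n-best list. The sketch below is illustrative only and is not part of Joshua: the class name NbestDedupSketch and the methods logAdd and collapse are hypothetical. It assumes each entry carries a natural-log probability, and either sums the probabilities of duplicates (the ideal case described above) or keeps the best one (the Viterbi approximation).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Joshua: merges duplicate strings in an n-best list.
class NbestDedupSketch {

    /** Numerically stable log(exp(a) + exp(b)). */
    static double logAdd(double a, double b) {
        if (a == Double.NEGATIVE_INFINITY) return b;
        if (b == Double.NEGATIVE_INFINITY) return a;
        double max = Math.max(a, b);
        return max + Math.log1p(Math.exp(Math.min(a, b) - max));
    }

    /**
     * Collapses repeated strings, preserving first-seen order.
     * sumDerivations = true  : log-probabilities of duplicates are summed in the log domain.
     * sumDerivations = false : only the best (Viterbi) log-probability is kept.
     */
    static Map<String, Double> collapse(List<String> hyps,
                                        List<Double> logProbs,
                                        boolean sumDerivations) {
        Map<String, Double> merged = new LinkedHashMap<>();
        for (int i = 0; i < hyps.size(); i++) {
            String hyp = hyps.get(i);
            double lp = logProbs.get(i);
            Double prev = merged.get(hyp);
            if (prev == null) {
                merged.put(hyp, lp);
            } else {
                merged.put(hyp, sumDerivations ? logAdd(prev, lp) : Math.max(prev, lp));
            }
        }
        return merged;
    }
}
```

The resulting map contains each string once, so its keys and values could then be passed on as a unique n-best list.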

Author:
Zhifei Li, zhife.nosp@m.i.wo.nosp@m.rk@gm.nosp@m.ail..nosp@m.com
Version:
$LastChangedDate$

Constructor & Destructor Documentation

joshua.decoder.NbestMinRiskReranker.NbestMinRiskReranker ( boolean  produceRerankedNbest,
double  scalingFactor 
)

Member Function Documentation

static double joshua.decoder.NbestMinRiskReranker.addInLogSemiring ( double  x,
double  y,
int  addMode 
) [static, private]

double joshua.decoder.NbestMinRiskReranker.computeExpectedGain ( int  curHypLen,
HashMap< String, Integer >  curHypNgramTbl,
List< HashMap< String, Integer >>  ngramTbls,
List< Integer >  sentLens,
List< Double >  nbestProbs 
)

static double joshua.decoder.NbestMinRiskReranker.computeExpectedGain ( String  curHyp,
List< String >  nbestHyps,
List< Double >  nbestProbs 
) [static]

double joshua.decoder.NbestMinRiskReranker.computeExpectedLinearCorpusGain ( int  curHypLen,
HashMap< String, Integer >  curHypNgramTbl,
HashMap< String, Double >  posteriorCountsTbl 
) [package]

static void joshua.decoder.NbestMinRiskReranker.computeNormalizedProbs ( List< Double >  nbestLogProbs,
double  scalingFactor 
) [static]

Based on the list of log-probabilities in nbestLogProbs, computes a normalized distribution and writes each normalized probability (a real value in [0,1]) back into nbestLogProbs.
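The normalization described above can be sketched as follows. This is a minimal illustration, not the Joshua implementation: the class NormalizeSketch and the method normalizeInPlace are hypothetical, and it assumes that scalingFactor multiplies each log-probability before exponentiation (the page does not state how the factor is applied). The largest scaled value is subtracted before exponentiating to avoid underflow, and the list is assumed to contain at least one finite value.

```java
import java.util.List;

// Hypothetical sketch, not the Joshua implementation.
class NormalizeSketch {

    /**
     * Overwrites each log-probability lp_i with
     *   exp(s * lp_i) / sum_j exp(s * lp_j),
     * where s is the scaling factor (assumed to multiply the log-probability).
     */
    static void normalizeInPlace(List<Double> logProbs, double scalingFactor) {
        if (logProbs.isEmpty()) return;

        // Largest scaled log-probability, used to keep exp() in a safe range.
        double max = Double.NEGATIVE_INFINITY;
        for (double lp : logProbs) {
            max = Math.max(max, scalingFactor * lp);
        }

        // Normalization constant, computed relative to max.
        double sum = 0.0;
        for (double lp : logProbs) {
            sum += Math.exp(scalingFactor * lp - max);
        }

        // Replace each entry with its normalized probability.
        for (int i = 0; i < logProbs.size(); i++) {
            logProbs.set(i, Math.exp(scalingFactor * logProbs.get(i) - max) / sum);
        }
    }
}
```

Under this assumption, a scaling factor below 1 flattens the distribution and a factor above 1 sharpens it; a factor of 0 yields a uniform distribution over finite entries.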

void joshua.decoder.NbestMinRiskReranker.getGooglePosteriorCounts ( List< HashMap< String, Integer >>  ngramTbls,
List< Double >  normalizedProbs,
HashMap< String, Double >  posteriorCountsTbl 
) [package]

static void joshua.decoder.NbestMinRiskReranker.main ( String[]  args) throws IOException [static]

String joshua.decoder.NbestMinRiskReranker.processOneSent ( List< String >  nbest,
int  sentID 
)

The values in baselineScores are replaced by normalized probabilities.

TODO (zhifei): re-ranking currently takes O(n^2) time, where n is the size of the n-best list. This could be sped up significantly (to O(n)) by first estimating a model on the n-best list and then reranking the list using the estimated model.
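The quadratic cost comes from scoring every hypothesis against every other hypothesis in the list. The sketch below illustrates that loop under stated assumptions: the class MbrSketch and its methods are hypothetical, the probabilities are assumed to be already normalized, and unigramOverlap is a toy stand-in for the BLEU gain, not the gain this class actually computes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of n-best minimum-risk selection, not the Joshua implementation.
class MbrSketch {

    /** Expected gain of hyp: sum over j of probs[j] * gain(hyp, nbest[j]). */
    static double expectedGain(String hyp, List<String> nbest, List<Double> probs,
                               ToDoubleBiFunction<String, String> gain) {
        double total = 0.0;
        for (int j = 0; j < nbest.size(); j++) {
            total += probs.get(j) * gain.applyAsDouble(hyp, nbest.get(j));
        }
        return total;
    }

    /** Returns the index of the hypothesis with the highest expected gain: O(n^2) gain calls. */
    static int selectBest(List<String> nbest, List<Double> probs,
                          ToDoubleBiFunction<String, String> gain) {
        int best = -1;
        double bestGain = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < nbest.size(); i++) {
            double g = expectedGain(nbest.get(i), nbest, probs, gain);
            if (g > bestGain) {
                bestGain = g;
                best = i;
            }
        }
        return best;
    }

    /** Toy gain (assumption): fraction of hyp tokens that also occur in ref. */
    static double unigramOverlap(String hyp, String ref) {
        Set<String> refTokens = new HashSet<>(Arrays.asList(ref.trim().split("\\s+")));
        String[] hypTokens = hyp.trim().split("\\s+");
        int hits = 0;
        for (String token : hypTokens) {
            if (refTokens.contains(token)) hits++;
        }
        return (double) hits / hypTokens.length;
    }
}
```

With this selection rule, the chosen hypothesis need not be the most probable one: a hypothesis that resembles many other likely hypotheses can win over an isolated hypothesis with a higher individual probability.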

Member Data Documentation

boolean joshua.decoder.NbestMinRiskReranker.doNgramClip = true [static, package]
final PriorityBlockingQueue<RankerResult> joshua.decoder.NbestMinRiskReranker.resultsQueue [package]
Initial value:
      new PriorityBlockingQueue<RankerResult>()