Introduction to Natural Language Processing (600.465): LM Smoothing (The EM Algorithm)

The Zero Problem

Why Do We Need Nonzero Probabilities?

Eliminating the Zero Probabilities: Smoothing

Smoothing by Adding 1

Adding Less than 1

Good-Turing

Good-Turing: An Example

Smoothing by Combination: Linear Interpolation

Typical n-gram LM Smoothing

Held-out Data

The Formulas

The (Smoothing) EM Algorithm

Remark on Linear Interpolation Smoothing

Bucketed Smoothing: The Algorithm

Simple Example

Some More Technical Hints