Introduction to Natural Language Processing (600.465) LM Smoothing (The EM Algorithm)

9/7/00




Table of Contents

Introduction to Natural Language Processing (600.465) LM Smoothing (The EM Algorithm)

The Zero Problem

Why do we need Nonzero Probs?

Eliminating the Zero Probabilities: Smoothing

Smoothing by Adding 1

Adding less than 1

Good-Turing

Good-Turing: An Example

Smoothing by Combination: Linear Interpolation

Typical n-gram LM Smoothing

Held-out Data

The Formulas

The (Smoothing) EM Algorithm

Remark on Linear Interpolation Smoothing

Bucketed Smoothing: The Algorithm

Simple Example

Some More Technical Hints

Author: Jan Hajic

Email: hajic@cs.jhu.edu

Home Page: http://www.cs.jhu.edu/~hajic/courses/cs465/syllabus.html