600.465 Introduction to NLP (Fall 1999)

Midterm Exam Answers

Date: Nov 01 2pm (30 min.)




  If you are asked to compute something for which you have the numbers, that means you should compute the final number, not just write down the formula. If you are asked for a formula, write down the formula.

1. Probability

Let S = { a, b, c } be the sample space, and let p be the joint distribution on a sequence of two events (i.e., on S x S, ordered). You know that p(a,a) [a followed by a] = 0.25, p(a,b) = 0.125, p(b,c) = 0.125, p(c,a) = 0.25, and p(c,c) = 0.25. Is this enough to compute p(b|a) (i.e., the probability of seeing b if we already know that the preceding event generated a)?
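A minimal worked check of question 1 (my own sketch, not the official answer key): the five given joint probabilities already sum to 1.0, so every unlisted pair must have probability 0. That makes the marginal p(a) for the first event computable, and with it p(b|a).

```python
# The bigram distribution given in question 1.
p = {
    ("a", "a"): 0.25,
    ("a", "b"): 0.125,
    ("b", "c"): 0.125,
    ("c", "a"): 0.25,
    ("c", "c"): 0.25,
}

# The listed probabilities sum to 1, so all other pairs have probability 0.
assert abs(sum(p.values()) - 1.0) < 1e-12

# Marginal probability that the first event is a: p(a,a) + p(a,b) + p(a,c).
p_first_a = sum(prob for (first, _), prob in p.items() if first == "a")

# Conditional probability p(b|a) = p(a,b) / p(a).
p_b_given_a = p[("a", "b")] / p_first_a

print(p_first_a)    # 0.375
print(p_b_given_a)  # 1/3
```

So the answer is yes: the missing joint probabilities are forced to be zero, giving p(a) = 0.375 and p(b|a) = 0.125 / 0.375 = 1/3.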

2. Estimation and Cross-entropy

Use the bigram distribution from question 1.
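Since the full wording of question 2 is not reproduced here, the following is only an illustration of the kind of computation involved (an assumption on my part): the entropy of the bigram distribution from question 1 and the corresponding perplexity 2^H, which is the one computation the footnote says may need a calculator.

```python
import math

# The bigram distribution from question 1 (unlisted pairs have probability 0).
p = {
    ("a", "a"): 0.25,
    ("a", "b"): 0.125,
    ("b", "c"): 0.125,
    ("c", "a"): 0.25,
    ("c", "c"): 0.25,
}

# H(p) = -sum_x p(x) * log2 p(x), taken over pairs with nonzero probability.
entropy = -sum(prob * math.log2(prob) for prob in p.values())
perplexity = 2 ** entropy

print(entropy)     # 2.25 (bits)
print(perplexity)  # 2^2.25 ≈ 4.757
```

Three pairs at 0.25 contribute 0.5 bits each and two pairs at 0.125 contribute 0.375 bits each, giving H = 2.25 bits exactly; only the perplexity 2^2.25 is irrational, which is why leaving it as an expression is acceptable.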

3. Mutual information

Use the bigram distribution from question 1.
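Again the full question text is not reproduced, so as an illustrative sketch only (my assumption about what is being exercised): the mutual information I(X;Y) between the first and second event under the bigram distribution from question 1, in bits.

```python
import math
from collections import defaultdict

# The bigram distribution from question 1 (unlisted pairs have probability 0).
p = {
    ("a", "a"): 0.25,
    ("a", "b"): 0.125,
    ("b", "c"): 0.125,
    ("c", "a"): 0.25,
    ("c", "c"): 0.25,
}

# Marginal distributions of the first and second event.
p1 = defaultdict(float)
p2 = defaultdict(float)
for (x, y), prob in p.items():
    p1[x] += prob
    p2[y] += prob

# I(X;Y) = sum_{x,y} p(x,y) * log2[ p(x,y) / (p1(x) * p2(y)) ],
# summed over pairs with nonzero joint probability.
mi = sum(prob * math.log2(prob / (p1[x] * p2[y]))
         for (x, y), prob in p.items())

print(mi)  # 1.75 - 0.75 * log2(3) ≈ 0.561 bits
```

The (c,a) term vanishes because p(c,a) = p1(c) * p2(a) = 0.25, and the rest collapses to 1.75 - 0.75 * log2(3) ≈ 0.561 bits.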

4. Smoothing and the sparse data problem

5. Classes based on Mutual Information

Suppose you have the following data:

Is this question really so easy , or was it rather the previous question , that was so difficult ?

What is the best pair of candidates for the first merge, if you use the greedy algorithm for classes based on bigram mutual information (i.e. the homework #2 algorithm)? Use your judgment, not computation.


6. Hidden Markov Models

Now check that you have filled in your name and SSN. Also, please check your answers carefully and hand in the exam.
1 The perplexity computation is the only computation here for which you might need a calculator; it is OK to leave your answer as an expression (use the appropriate integer numbers, though!).