Johns Hopkins Computer Science Home
Johns Hopkins University The Whiting School of Engineering

Natural Language Processing
Prof. Jason Eisner
Course # 600.465 - Fall 2009

parse trees

Announcements


Vital Statistics [1]

Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to linguistics and statistics. You need to know how to program (e.g., 600.120) and use common data structures (600.226). It might also be nice to have some previous familiarity with automata (600.271) and probabilities (600.475, 550.420, or 550.310). At the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's formal and statistical techniques, and be able to understand research papers in the field.

Course catalog entry: This course is an in-depth overview of techniques for processing human language. How should linguistic structure and meaning be represented? What algorithms can recover them from text? And crucially, how can we build statistical models to choose among the many legal answers? The course covers methods for trees (parsing and semantic interpretation), sequences (finite-state transduction such as morphology), and words (sense and phrase induction), with applications to practical engineering tasks such as information retrieval and extraction, text classification, part-of-speech tagging, speech recognition and machine translation. There are a number of structured but challenging programming assignments. Prerequisite: 600.226. [Eisner, Applications, Fall] 3 credits

Lectures:MWF 3-4 pm (sometimes 3-4:30), CSEB B17
Prof:Jason Eisner - (image of email address) ((image of email address))
TA:Jonathan Weese - (image of email address) (jonny at cs dot jhu dot edu)
CA:Chuan Liu - (image of email address) (chuan at cs dot jhu dot edu)
Office hrs: For Prof: MW 4pm after class (or by appt) in CSEB 324C
For TA: Th 2-4 (or by appt) in CSEB 322
Mailing list: (image of email address) ... public questions, discussion, announcements
Web page:http://cs.jhu.edu/~jason/465
Textbook: Jurafsky & Martin, 2nd ed. (semi-required - P98.J87 2009 in Science Ref section on C-Level)
Manning & Schütze (recommended - online PDF version here!)
Policies: Grading: homework 50%, participation 5%, midterm 15%, final 30%
Submission: TBA
Lateness: floating late days policy
Honesty: here's what it means
Intellectual engagement: much encouraged
Announcements: Read mailing list and this page!
Related
sites:
List of many courses - some may have useful material!

Schedule

This class is in the 3-4:30 time slot, a "flex slot" that is intended to permit either three 50-minute lectures or two 75-minute lectures per week. Usually, we have three 50-minute lectures per week (MWF). However, you will see on the schedule below that I am fairly likely to cancel class on M 9/29, M 10/27, W 11/19, and M 12/8. I may therefore ask you to let me make up the time with a few 75-minute lectures, for example, on M and W in the week before a canceled class.

Warning: For future lectures and assignments, the links below take you to last year's versions, which are subject to change.

Warning: The Jurafsky & Martin chapter numbers refer to the 1st edition, not the 2nd. We will update them shortly.

Warning: I've turned off the PDF links for now because they are not all up to date with the PPT links. Just click on "ppt" instead.

Week Monday Wednesday Friday Suggested Reading
8/31 Introduction (ppt)
  • Why is NLP hard?
  • Levels of language
  • NLP applications
  • Random language via n-grams
  • Questionnaire
  • Assignment 1 given: Designing CFGs
    Chomsky hierarchy (ppt)
  • What's wrong with n-grams?
  • Regular expressions, CFGs, & more
  • Lists, trees, and vectors
  • J&M chapter 1; for assignment, J&M 9 (or M&S 3)
    9/7 No class (Labor Day) Language models (ppt)
  • Language ID
  • Text categorization
  • Spelling correction
  • Segmentation
  • Speech recognition
  • Machine translation
  • Probability concepts (ppt)
  • Joint & conditional prob
  • Entropy, cross-entropy, perplexity
  • Continuous distributions?
    Bayes' Theorem (ppt)
  • J&M chapters 13, 6.2 (see also M's slides or Andrew Moore's slides); M&S chapter 2
    9/14 Smoothing n-grams (ppt)
  • Add-one or add-lambda smoothing
  • Good-Turing discounting
  • Smoothing with backoff
  • Deleted interpolation
  • Assignment 2 given: Using n-Grams
    (lecture postponed till 10/5 or 10/12)
    Human sentence processing (ppt)
  • Methodology
  • Frequency sensitivity
  • Incremental interpretation
  • No class (Rosh Hashanah)
    But could have a Q&A / homework help session with the TA ...
    M&S chapter 6; Rosenfeld (2000) survey
    9/21
  • Assignment 1 due
        (& another sign meant 3 ... ?)
    Limitations of CFG
  • Discussion of Asst. 1
  • Improving CFG with features (ppt)
  • Morphology
  • Lexicalization
  • Tenses
  • Gaps (slashes)
  • Context-free parsing (ppt)
  • What is parsing?
  • Why is it useful?
  • Brute-force algorithm
  • CKY and Earley algorithms
  • J&M 10, 11.1-11.4
    9/28 No class (Yom Kippur)
    But could have a Q&A / homework help session with the TA ...
    Context-free parsing
  • From recognition to parsing
  • Incremental strategy
  • Dotted rules
  • Sparse matrices
  • Earley's algorithm (ppt)
  • Top-down parsing
  • Earley's algorithm
  • J&M 10
    10/5 (lecture skipped this year)
    Extending CFG (summary (ppt))
  • CCG
  • TSG, TAG, TIG
  • Probabilistic parsing (ppt)
  • PCFG parsing
  • Dependency grammar
  • Lexicalized PCFGs
  • Assignment 2 due Assignment 3 given: Parsing
    Parsing tricks (ppt)
  • Pruning; best-first
  • Left-corner strategy
  • Rules as regexps
  • Smoothing
  • Evaluation
    A song about parsing
  • J&M 12 (or M&S 11.1-11.3 and 12.1.1-12.1.5)
    10/12 (lecture skipped this year)
    Learning is impossible? (ppt)
  • Learning in the limit (Gold's theorem)
  • Semantics (ppt)
  • What is understanding?
  • Lambda terms
  • Semantic phenomena and representations
  • Semantics continued
  • Adding semantics to CFG rules
  • Compositional semantics
  • J&M 14-15; also this web page, up to but not including "denotational semantics" section; and you could try the Penn Lambda Calculator; and how about lambda calculus for kids?
    10/19 Assignment 4 given: Semantics
    Finite-state functions (ppt)
  • Regexp review
  • Properties
  • Functions, relations, composition
  • Simple applications
  • Finite-state implementation (ppt)
  • Finite-state operators
  • Uses of composition
  • Implementing the operators
  • Midterm exam
    (in class)
    chap 2 of xfst book draft (only accessible from barley and other Solaris machines at JHU CS; don't distribute)
    10/26 Programming with Regexps (ppt)
  • Analogy to programming
  • Extended finite-state operators
  • Date parsing
  • FASTUS
  • Noisy Channels and FSTs (ppt)
  • Regexps and segmentation
  • The noisy channel generalization
  • Applications of the noisy channel
  • Implementation using FSTs
  • Assignment 3 due (lecture skipped this year)
    Morphology and Phonology (ppt)
  • English, Turkish, Arabic
  • Stemming
  • Compounds, segmentation
  • Two-level morphology
  • Punctuation
  • Rewrite rules
  • OT
  • chap 3 of xfst book draft; perhaps also this paper
    11/2 Finite-state parsing
  • CFG approximation
  • Multipass chunking
  • (Wo)man vs. machine
  • Finite-state tagging (ppt)
  • The task
  • Hidden Markov Models
  • Transformation-based
  • Constraint-based
  • HMMs
  • Tagging continued
  • Chunking
  • Speech
  • J&M 8 or M&S 10
    11/9 Assignment 4 due
    Assignment 5 given: Finite-State Grammars
    Forward-backward algorithm (Excel spreadsheet; Viterbi version; lesson plan)
  • Ice cream, weather, words and tags
  • Forward and backward probabilities
  • Inferring hidden states
  • Controlling the smoothing effect
  • Forward-backward continued
  • Reestimation
  • Likelihood convergence
  • Symmetry breaking
  • Local maxima
  • Uses of states
  • Expectation Maximization (ppt)
  • Generalizing the forward-backward strategy
  • Inside-outside algorithm
  • J&M 6 (2nd ed.) or perhaps Allen pp. 195-208 (handout); M&S 11; John Lafferty's inside-outside notes
    11/16 Final FSM Examples (ppt)
  • Baby talk
  • Edit distance
  • Back-transliteration
  • Machine translation
  • Assignment 5 due
    Assignment 6 given: Training an HMM
    Grouping words (ppt)
  • Words vs. senses
  • Class-based / HMM
  • Clustering words as vectors
  • Splitting words (ppt)
  • WordNet
  • Supervised disambiguation (e.g., naive Bayes)
  • Unsupervised
  • M&S 14, 7, 5, (since J&M 16-17 covers only some of this)
    11/23 Words vs. senses in IR (ppt)
  • Information retrieval
  • Query expansion
  • Disambiguation
  • Latent Semantic Analysis
  • No class (Thanksgiving break) No class
    (Thanksgiving break)
    M&S 15.2, 15.4
    11/30 (may replace this & next lecture with more general overview of machine learning in NLP)
    Text categorization (ppt)
  • Task and its variants
  • Accuracy, precision, recall ...
  • Clusters, k-nearest neighbor
  • Naive Bayes
  • Decision trees
  • Log-linear models (ppt)
  • Maximum entropy perspective
  • Machine Translation
  • Competitive linking [last year]
  • IBM Model 1
  • Phrase-based models and decoding
  • Text cat: M&S 16, Science News article
    Log-linear: J&M 6
    MT: J&M 25, M&S 13, statmt.org;
    tutorial (2003), workbook (1999), introductory essay (1997), technical paper (1993);
    tutorial (2006) focusing on more recent developments (slides, 3-hour video part 1, part 2)
    12/7 Assignment 6 due
    Current and Future Research (ppt)
  • Current state of the art
  • Deploying NLP
  • IE for the masses?
  • Dialogue Systems: Eliza, Loebner Prize, TRIPS
  • Sun 12/13 is absolute deadline for late assignments --->

    Final exam: Thu 12/17, 9am-noon --->