Johns Hopkins University, The Whiting School of Engineering

Natural Language Processing
Prof. Jason Eisner
Course # 600.465 — Fall 2016



Vital Statistics

Course catalog entry: This course is an in-depth overview of techniques for processing human language. How should linguistic structure and meaning be represented? What algorithms can recover them from text? And crucially, how can we build statistical models to choose among the many legal answers?

The course covers methods for trees (parsing and semantic interpretation), sequences (finite-state transduction such as tagging and morphology), and words (sense and phrase induction), with applications to practical engineering tasks such as information retrieval and extraction, text classification, part-of-speech tagging, speech recognition, and machine translation. There are a number of structured but challenging programming assignments. Prerequisite: 600.226 or equivalent. [Applications, 4 credits]

Course objectives: Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to linguistics and statistics. You need to know how to program (e.g., 600.120) and use common data structures (600.226). It might also be nice—though it's not required—to have some previous familiarity with automata (600.271) and probabilities (600.475, 550.420, or 550.310). At the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's formal and statistical techniques, and be able to understand research papers in the field.

Lectures: MWF 3-4 or 3-4:15, Hodson 311.
Recitations: Tue 6-7:30, Hodson 213.
Prof: Jason Eisner
TA: Chu-Cheng Lin
CA: Bo Liu
CA: Vincent Yan
Office hrs: Prof: after class until 4:30, or by appt in Hackerman 324C
Chu-Cheng: Tue 11-12, Fri 10-11, Hackerman 322
Bo: Thu 4:30-5:30, Malone 229
Vincent: Mon 12-1, Malone 239
Discussion site: ... public questions, discussion, announcements
Textbook: Jurafsky & Martin, 2nd ed. (semi-required - P98.J87 2009 in Science Ref section on C-Level)
Roark & Sproat (recommended - P98.R63 2007 in same section)
Manning & Schütze (recommended - free online PDF version here!)
Policies: Grading: homework 50%, participation 5%, midterm 15%, final 30%
Submission: TBA
Lateness: floating late days policy
Honesty: CS integrity code, JHU undergraduate policies, JHU graduate policies
Intellectual engagement: much encouraged
Disabilities: If you need accommodations for a disability, obtain a letter from Student Disability Services, 385 Garland, (410) 516-4720.
Announcements: Read mailing list and this page!


This class is in the "flexible time slot" MWF 3-4:30. Please keep the entire slot open. Class will usually run 3-4, followed by office hours in the classroom from 4-4:30 (stick around to get your money's worth). However, class will sometimes run till 4:15 in order to keep up with the syllabus. I'll try to give advance notice of these "long classes," which among other things make up for no-class days when I'm out of town.

We also run a once-per-week recitation led by the prof or the TA. This session focuses on solving problems together. It's meant as an efficient and cooperative way to study for an hour: it reinforces the past week's class material without adding to your homework load. Also, if you come to the recitation as recommended, you won't be startled by the exam style, since the problems are taken from past exams and are generally interesting.

Warning: The schedule below may change. Links to future lectures and assignments may also change (they currently point to last year's versions).

Warning: Use the PPT slides if possible. The PDF export versions don't have animations and they may be out of date relative to the PPT files (although I do try to update them before exams).

Week Monday Wednesday Friday Suggested Reading
8/29 Introduction (ppt)
  • Questionnaire
  • Why is NLP hard?
  • Levels of language
  • NLP applications
  • Random language via n-grams
  • Intro: J&M chapter 1
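
As a taste of the "random language via n-grams" topic above, here is a minimal sketch (not part of the course materials; the toy corpus and function names are my own) of generating random text from bigram counts:

```python
import random
from collections import defaultdict

def train_bigrams(tokens):
    """Record, for each word, the list of words that followed it
    (with multiplicity), using <s> as a start marker."""
    succ = defaultdict(list)
    prev = "<s>"
    for tok in tokens:
        succ[prev].append(tok)
        prev = tok
    succ[prev].append("</s>")
    return succ

def babble(succ, rng, max_len=10):
    """Generate random text by repeatedly sampling a successor word;
    sampling from the raw lists reproduces the bigram frequencies."""
    out, word = [], "<s>"
    while len(out) < max_len:
        word = rng.choice(succ[word])
        if word == "</s>":
            break
        out.append(word)
    return out

corpus = "the cat sat on the mat".split()
model = train_bigrams(corpus)
print(" ".join(babble(model, random.Random(0))))
```

Even this tiny model illustrates the lecture's point: bigram babble is locally plausible but globally incoherent.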
9/5 No class (Labor Day)
  • Assignment 1 given: Designing CFGs
    Modeling grammaticality (ppt)
  • What's wrong with n-grams?
  • Regular expressions, FSAs, CFGs, ...
  • Language models (ppt)
  • Language ID
  • Text categorization
  • Spelling correction
  • Segmentation
  • Speech recognition
  • Machine translation
  • Chomsky hierarchy: J&M 16
  • Homework: J&M 12, M&S 3
  • Language models: M&S 6 (or R&S 6)
9/12 Probability concepts (ppt; video lecture)
  • Joint & conditional prob
  • Chain rule and backoff
  • Modeling sequences
  • Cross-entropy and perplexity
  • Bayes' Theorem (ppt)
    Smoothing n-grams (ppt)
  • Maximum likelihood estimation
  • Bias and variance
  • Add-one or add-λ smoothing
  • Cross-validation
  • Smoothing with backoff
  • Good-Turing, Witten-Bell
  • Assignment 2 given: Probabilities
    Smoothing continued
  • Conditional log-linear models (interactive visualization)
  • Maximum likelihood, regularization
  • Prob/Bayes: M&S 2; slides by Moore or Martin
  • Smoothing: M&S 6; J&M 4; Rosenfeld (2000)
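
To make the add-λ smoothing idea from this week concrete, a minimal sketch (my own illustration, not the assignment's code) that adds λ pseudo-counts to every vocabulary type so unseen words get nonzero probability:

```python
from collections import Counter

def add_lambda_prob(counts, word, lam, vocab_size):
    """P(word) = (c(word) + lam) / (N + lam * V):
    every type, seen or unseen, receives lam extra pseudo-counts."""
    n = sum(counts.values())
    return (counts[word] + lam) / (n + lam * vocab_size)

counts = Counter("a a a b".split())
V = 3  # assumed vocabulary {a, b, c}, where c is unseen
p_a = add_lambda_prob(counts, "a", 1.0, V)  # (3+1)/(4+3) = 4/7
p_c = add_lambda_prob(counts, "c", 1.0, V)  # (0+1)/(4+3) = 1/7
print(p_a, p_c)
```

Note that the smoothed probabilities still sum to 1 over the vocabulary, which is the property the lecture's cross-validation experiments rely on.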
9/19 Catch-up day
      (we'll be behind schedule by now)
  • Assignment 1 due
  • Discussion of Asst. 1
  • Improving CFG with attributes (ppt)
  • Morphology
  • Lexicalization
  • Tenses
  • Gaps (slashes)
  • Assignment 3 given: Language Models
    Context-free parsing (ppt)
  • What is parsing?
  • Why is it useful?
  • Brute-force algorithm
  • CKY and Earley algorithms
  • Attributes: J&M 15
  • Parsing: J&M 13
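
For a feel of the CKY algorithm listed above, a minimal recognizer sketch (my own toy grammar in CNF; not the course's implementation):

```python
def cky_recognize(words, grammar, start="S"):
    """CKY recognition for a CNF grammar: chart[i][j] holds the set of
    nonterminals that can derive words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Width-1 spans: apply the lexical rules.
    for i, w in enumerate(words):
        for lhs, rhs in grammar:
            if rhs == (w,):
                chart[i][i + 1].add(lhs)
    # Wider spans: combine two adjacent sub-spans with a binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for lhs, rhs in grammar:
                    if (len(rhs) == 2 and rhs[0] in chart[i][k]
                            and rhs[1] in chart[k][j]):
                        chart[i][j].add(lhs)
    return start in chart[0][n]

grammar = [("S", ("NP", "VP")), ("VP", ("V", "NP")),
           ("NP", ("Papa",)), ("NP", ("caviar",)), ("V", ("ate",))]
print(cky_recognize("Papa ate caviar".split(), grammar))  # True
```

The triple loop over (width, start, split point) is the O(n^3) dynamic program discussed in lecture; turning recognition into parsing just means storing backpointers in the chart cells.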
9/26 Context-free parsing
  • From recognition to parsing
  • Incremental strategy
  • Dotted rules
  • Sparse matrices
  • Assignment 2 due
    Earley's algorithm (ppt)
  • Top-down parsing
  • Earley's algorithm
  • Quick in-class quiz: Log-linear models
    Extending CFG (summary (ppt))
  • CCG
  • TSG
  • TAG (by Darcey Riley)
  • CCG: Steedman & Baldridge; more
  • TAG/TSG: Van Noord, Guo, Zhang 1/2/3
10/3 Probabilistic parsing (ppt)
  • PCFG parsing
  • Dependency grammar
  • Lexicalized PCFGs
  • Assignment 4 given: Parsing
    Parsing tricks (ppt)
  • Pruning; best-first
  • Rules as regexps
  • Left-corner strategy
  • Smoothing
  • Evaluation
  • A song about parsing
  • Assignment 3 due
    Human sentence processing (ppt)
  • Methodology
  • Frequency sensitivity
  • Incremental interpretation
  • Unscrambling text (ppt)
  • Prob. parsing: M&S 12, J&M 14
  • Psycholinguistics: Tanenhaus & Trueswell (2006), Human Sentence Processing website
10/10 Semantics (ppt)
  • What is understanding?
  • Lambda terms
  • Semantic phenomena and representations
  • Midterm exam
    (3-4:30 in classroom)
    Semantics continued
  • More semantic phenomena and representations
  • Semantics: J&M 17-18; this web page, up to but not including "denotational semantics" section; try the Penn Lambda Calculator; lambda calculus for kids
10/17 Assignment 5 given: Semantics
    Semantics continued
  • Adding semantics to CFG rules
  • Compositional semantics
  • Learning in the limit (ppt)
  • Gold's theorem
  • Class is on Thursday, not Friday
    Forward-backward algorithm (ppt) (Excel spreadsheet; Viterbi version; lesson plan; video lecture)
  • Ice cream, weather, words and tags
  • Forward and backward probabilities
  • Inferring hidden states
  • Controlling the smoothing effect
  • Forward-backward: J&M 6
  • Inside-outside: John Lafferty's notes; M&S 11; relation to backprop
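
In the spirit of the ice-cream spreadsheet, here is a minimal forward-algorithm sketch (the probabilities below are illustrative numbers I chose, not the spreadsheet's) for computing the likelihood of an observation sequence under an HMM:

```python
def forward(obs, states, start_p, trans_p, emit_p):
    """Forward algorithm: alpha[t][s] = P(obs[0..t], state at time t = s)."""
    alpha = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append({s: sum(prev[r] * trans_p[r][s] for r in states)
                         * emit_p[s][o]
                      for s in states})
    return alpha

# Hidden weather H (hot) / C (cold); observations are ice creams eaten.
states = ("H", "C")
start = {"H": 0.5, "C": 0.5}
trans = {"H": {"H": 0.8, "C": 0.2}, "C": {"H": 0.2, "C": 0.8}}
emit = {"H": {1: 0.1, 2: 0.2, 3: 0.7}, "C": {1: 0.7, 2: 0.2, 3: 0.1}}

alpha = forward([3, 1, 3], states, start, trans, emit)
likelihood = sum(alpha[-1].values())  # P(observation sequence)
print(likelihood)
```

The backward pass is the mirror image of this recurrence; multiplying forward and backward values and renormalizing gives the posterior state probabilities used in reestimation.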
10/24 Forward-backward continued
  • Reestimation
  • Likelihood convergence
  • Symmetry breaking
  • Local maxima
  • Uses of states
  • Assignment 4 due
    Assignment 6 given: Hidden Markov Models
    Expectation Maximization (ppt)
  • Generalizing the forward-backward strategy
  • Inside-outside algorithm
  • Posterior decoding
  • Finite-state algebra (ppt)
  • Regexp review
  • Properties
  • Functions, relations, composition
  • Simple applications
  • Finite-state machines: R&S 1
  • Finite-state operators: chaps 2-3 of XFST book draft
10/31 Finite-state machines
  • Acceptors
  • Expressive power
  • Weights and semirings
  • Lattice parsing
  • Transducers
  • Finite-state implementation (ppt)
  • Finite-state operators
  • Uses of composition
  • Implementing the operators
  • Assignment 5 due
    Assignment 7 given: Finite-State Modeling
    Noisy channels and FSTs (ppt)
  • Segmentation
  • Spelling correction
  • The noisy channel generalization
  • Implementation using FSTs
  • Finite-state NLP: Karttunen (1997)
11/7 Noisy-channel FSTs continued
  • Baby talk
  • Morphology
  • Edit distance
  • Transliteration
  • Speech recognition
  • Finite-state tagging (ppt)
  • The task
  • Hidden Markov Models
  • Transformation-based
  • Constraint-based
  • Programming with regexps (ppt)
  • Analogy to programming
  • Extended finite-state operators
  • Date parsing
  • Tagging: J&M 5 or M&S 10
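
The edit distance topic from this week can be sketched as the classic dynamic program (my own minimal version with unit costs; the lecture's weighted-FST view generalizes the costs):

```python
def edit_distance(a, b):
    """Levenshtein distance: d[i][j] = cheapest way to turn a[:i] into b[:j]
    using unit-cost insertions, deletions, and substitutions."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(len(b) + 1):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            sub = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or copy
    return d[len(a)][len(b)]

print(edit_distance("intention", "execution"))  # 5 with unit costs
```

The same table can be read as the best path through a small transducer composition, which is how the lecture connects edit distance to noisy-channel decoding.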
11/14 Assignment 6 due
  • Morphology and phonology (ppt)
  • English, Turkish, Arabic
  • Stemming
  • Compounds, segmentation
  • Two-level morphology
  • Punctuation
  • Rewrite rules
  • OT
  • Optimal paths in graphs
  • The Dyna perspective
  • Structured prediction (ppt)
  • Perceptrons
  • CRFs
  • Feature engineering
  • Generative vs. discriminative
  • Morphology: R&S 2
  • Dyna tutorial and non-required assignment
11/21 No class all week (Thanksgiving break)
    11/28 Current NLP tasks and competitions (ppt)
  • The NLP research community
  • Text annotation tasks
  • Other types of tasks
  • Applied NLP continued
  • Explore links in "NLP tasks" slides
12/5 Topic models (ppt)
    Graphical models, deep learning, ...
    Assignment 7 due
    Machine translation
  • Guest lecture by Matt Post or Philipp Koehn?

    Final exam: Thu 12/15, 9am-noon
  • Topic models: intro readings/slides from Dave Blei, slides by Jason Eisner (video lecture part 1, part 2)
  • MT: J&M 25, M&S 13; tutorial (2003), workbook (1999), introductory essay (1997), technical paper (1993); tutorial (2006) focusing on more recent developments (slides, 3-hour video part 1, part 2)

Old Materials

    Lectures from past years, some still useful. Old assignments.

    ABET Outcomes

    Course Outcomes

    Program Outcomes