Natural Language Processing
Prof. Jason Eisner
Course # 601.465/665 — Fall 2018
- 11/19/18 HW7 (25 pages) is available. Long homework! It walks you through a
series of exercises, holding your hand along the way.
You may work in pairs. It is due on Friday, December 7, at
11:59pm (as late as we could make it without cutting into reading
period).
- 10/29/18 HW6
is available, with a separate "reading handout" appended to it. It
is due on Thursday, November 15 at 9pm. This isn't conceptually hard
if you understood the HMM lectures. But there are some finicky details ... it's a programming assignment with a long pair of handouts, so start as soon as possible!
- 10/13/18 HW5 is available, with a short
"reading handout" appended to it. It deals with
attaching semantic λ-expressions to grammar rules. It is
due on Friday, 11/2, at 2pm.
- 10/12/18 Midterm will be held on Wed 10/17,
in our usual classroom, from 3-4:30.
- 10/5/18 HW4
is available, with a separate "reading handout" appended to it.
The reading might also help you study parsing for the midterm.
This is the conceptually hardest assignment in the course, with two
major challenges: probabilistic Earley parsing, and making parsing
efficient. It is due on Wednesday, October 24 at 2pm. You
may work with a partner on this one.
- 9/25/18 The log-linear quiz associated with HW2 has been rescheduled to Monday 10/1 (in class).
- 9/24/18 HW3
is now available, with a separate "reading handout" appended to it.
The due date is Friday, 10/5, at 2pm. Start early: This is
a long and detailed assignment that requires you to write
some annoying programs for smoothing and experiment with their
parameters and design to see what happens. I strongly suggest
that you start going through the reading handout now, and then
spread the work out. You may work in pairs.
- 9/12/18 HW2 (9 pages + extra credit) is available. It's due in 2 weeks, Wednesday 9/26 at 2pm. This assignment is mostly a problem set about manipulating probabilities. But it is a long assignment! Most significantly, question 6 asks you to read a separate handout and to work through a series of online lessons, preferably with a partner or two. Question 8 asks you to write a small program. It is okay to work on questions 6 and 8 out of order.
- 8/31/18 HW1
(11 pages) is available. It is due on Friday, 9/14 at 2pm: please
get this one in on time so we can discuss it in class an hour later.
- 8/30/18 Please bookmark this page.
- Start of class is Thursday 8/30, 3pm.
- As explained below, please keep MWF 3-4:30 open to accommodate a variable class schedule as well as office hours after class.
- The main classroom is Olin 305. Olin Hall is northwest of Hodson, across San Martin Drive.
- We will have our weekly problem discussion sessions Tuesdays 6-7:30pm in Shaffer 101.
Vital Statistics 
Course catalog entry: This course is an in-depth overview of
techniques for processing human language. How should linguistic
structure and meaning be represented? What algorithms can recover
them from text? And crucially, how can we build statistical models
to choose among the many legal answers?
The course covers
methods for trees (parsing and semantic interpretation), sequences
(finite-state transduction such as tagging and morphology), and
words (sense and phrase induction), with applications to practical
engineering tasks such as information retrieval and extraction,
text classification, part-of-speech tagging, speech recognition,
and machine translation. There are a number of structured but
challenging programming assignments. Prerequisite: 601.226 or
equivalent. [Applications, 4 credits]
Course objectives: Welcome! This course is designed to
introduce you to some of the problems and solutions of NLP, and their
relation to linguistics and statistics. You need to know how to
program (e.g., 601.120) and use common data structures (601.226). It
might also be nice—though it's not required—to have
some previous familiarity with automata (600.271) and probabilities
(601.475/675, 553.420/620, or 553.310/311). At the end you should agree (I hope!)
that language is subtle and interesting, feel some ownership over some
of NLP's formal and statistical techniques, and be able to understand
research papers in the field.
|Lectures:||MWF 3-4 or 3-4:15, Olin 305|
|Recitations:||Tue 6-7:30, Shaffer 101|
|Prof:||Jason Eisner|
|TAs:||Arya McCarthy, Sebastian|
|CAs:||Lisa Li, Shijia Liu, Molly Pan|
|Office hrs:||Prof: after class until 4:30, or by appt in Hackerman 324C|
|Mailing list:||... public questions, discussion, announcements|
|Textbooks:||Jurafsky & Martin, 2nd ed. (semi-required - P98.J87 2009 in Science Ref section on C-Level); Roark & Sproat (recommended - P98.R63 2007 in same section); Manning & Schütze (recommended - free online PDF version here!)|
|Grading:||homework 50%, participation 5%, midterm 15%, final 30%|
|Lateness:||floating late days policy|
|Honesty:||CS integrity code, JHU undergraduate policies, JHU graduate policies|
|Intellectual engagement:||much encouraged|
|Disabilities:||If you need accommodations for a disability, obtain a letter from Student Disability Services, 385 Garland, (410) 516-4720.|
|Announcements:||Read the mailing list and this page!|
This class is in the "flexible time slot" MWF
3-4:30. Please keep the entire slot open.
Class will usually run 3-4, followed by office hours in the
classroom from 4-4:30 (stick around to get your money's worth).
However, class will sometimes run till 4:15 in order to keep up with
the syllabus. I'll try to give advance notice of these "long
classes," which among other things make up for no-class days when
I'm out of town.
We also run a once-per-week recitation led by the prof or the
TA. This session will focus on solving problems together.
That's meant as an efficient and cooperative way to study for an hour:
it reinforces the past week's class material without adding to your
homework load. Also, if you come to discussion session as
recommended, you won't be startled by the exam style — the
discussion problems are taken from past exams and are generally
representative of what the exams ask.
Warning: The schedule below may change.
Links to future lectures and assignments may also change (they
currently point to last year's versions).
Warning: Use the PPT slides if possible.
The PDF export versions don't have animations and they may be
out of date relative to the PPT files (although I do try to update them before class).
Class is on Thursday, not Wednesday as shown.
Why is NLP hard?
Levels of language
Random language via n-grams
Assignment 1 given: Designing CFGs
What's wrong with n-grams?
Regular expressions, FSAs, CFGs, ...
Intro: J&M chapter 1
Chomsky hierarchy: J&M 16
Homework: J&M 12, M&S 3, Huddleston
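To make "random language via n-grams" concrete, here is a minimal sketch (not from the course materials) that samples sentences from a bigram model; the toy corpus and the `<s>` start symbol are invented for illustration:

```python
# Sample random sentences from a bigram model estimated from a toy corpus.
import random
from collections import defaultdict

corpus = "the cat sat . the dog sat . the cat ran .".split()

# Count bigrams, using "<s>" as a start-of-sentence symbol.
counts = defaultdict(lambda: defaultdict(int))
prev = "<s>"
for word in corpus:
    counts[prev][word] += 1
    prev = "<s>" if word == "." else word

def sample_sentence():
    """Walk the bigram chain until the end-of-sentence token is emitted."""
    word, out = "<s>", []
    while True:
        successors = counts[word]
        words = list(successors)
        weights = [successors[w] for w in words]
        word = random.choices(words, weights=weights)[0]
        out.append(word)
        if word == ".":
            return " ".join(out)

print(sample_sentence())
```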
No class (Labor Day)
(ppt; video lecture)
Joint & conditional prob
Chain rule and backoff
Cross-entropy and perplexity
Language models: M&S 6 (or R&S 6)
Prob/Bayes: M&S 2; slides by Moore or Martin
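As a worked example of cross-entropy and perplexity (with made-up numbers): if a model assigns the N test words probabilities p_i, its cross-entropy is H = -(1/N) Σ log2 p_i bits per word, and its perplexity is 2^H:

```python
# Cross-entropy and perplexity from hypothetical per-word model probabilities.
import math

probs = [0.5, 0.25, 0.125, 0.125]   # invented p(word_i | context) values
H = -sum(math.log2(p) for p in probs) / len(probs)
print(H)        # 2.25 bits per word
print(2 ** H)   # perplexity ~4.76: "as confused as" a uniform ~4.76-way choice
```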
No class (Rosh Hashanah)
Assignment 2 given: Probabilities
Maximum likelihood estimation
Bias and variance
Add-one or add-λ smoothing
Smoothing with backoff
Conditional log-linear models
Maximum likelihood, regularization
Smoothing: M&S 6; J&M 4; Rosenfeld (2000)
Log-linear models: Collins (pp. 1-4) or Smith (section 3.5)
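A minimal sketch of the add-λ smoothing idea above, under assumed toy counts (this is not the HW3 code; the λ and vocabulary size are placeholders):

```python
# Add-lambda smoothing for a bigram model: every count gets lambda added,
# so unseen bigrams receive small but nonzero probability.
from collections import Counter

bigram_count = Counter({("the", "cat"): 2, ("the", "dog"): 1})
context_count = Counter({"the": 3})
V = 10_000     # assumed vocabulary size
lam = 0.5      # smoothing parameter lambda (a tunable hyperparameter)

def p_addlambda(word, context):
    """p(word | context) = (c(context, word) + lambda) / (c(context) + lambda * V)."""
    return (bigram_count[(context, word)] + lam) / (context_count[context] + lam * V)

print(p_addlambda("cat", "the"))   # seen bigram: count boosted by lambda
print(p_addlambda("emu", "the"))   # unseen bigram: small but nonzero
```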
Assignment 1 due
Discussion of Asst. 1
Improving CFG with attributes
No class (Yom Kippur)
Assignment 3 given: Language Models
What is parsing?
Why is it useful?
CKY and Earley algorithms
Attributes: J&M 15
Parsing: J&M 13
From recognition to parsing
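For concreteness, here is a bare-bones CKY recognizer sketch over an invented toy grammar in Chomsky normal form; the assignments use richer grammars and probabilities:

```python
# CKY recognition: chart[(i, j)] holds the nonterminals that derive words[i:j].
from collections import defaultdict

binary = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
lexical = {"the": {"Det"}, "cat": {"N"}, "dog": {"N"}, "saw": {"V"}}

def cky_recognize(words, start="S"):
    n = len(words)
    chart = defaultdict(set)
    for i, w in enumerate(words):                # width-1 spans from the lexicon
        chart[(i, i + 1)] = set(lexical.get(w, ()))
    for width in range(2, n + 1):                # build wider spans from narrower ones
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):            # try every split point
                for B in chart[(i, k)]:
                    for C in chart[(k, j)]:
                        chart[(i, j)] |= binary.get((B, C), set())
    return start in chart[(0, n)]

print(cky_recognize("the cat saw the dog".split()))   # True
```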
Assignment 2 due
Quick in-class quiz: Log-linear models
CCG: Steedman & Baldridge; more
TAG/TSG: Van Noord, Guo, Zhang 1/2/3
Prob. parsing: M&S 12, J&M 14
Assignment 4 given: Parsing
Rules as regexps
A song about parsing
Assignment 3 due
Human sentence processing
Unscrambling text (ppt)
What is understanding?
Semantic phenomena and representations
Psycholinguistics: Tanenhaus & Trueswell (2006), Human Sentence Processing website
Semantics: J&M 17-18;
this web page, up to but not including "denotational semantics" section;
try the Penn Lambda Calculator;
lambda calculus for kids
More semantic phenomena and representations
Assignment 5 given: Semantics
Adding semantics to CFG rules
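A tiny illustration of attaching semantic λ-expressions to rules, using Python closures as stand-in λ-terms (the predicate names are invented; HW5 uses a proper λ-calculus):

```python
# Rule S -> NP VP with semantics VP(NP); rule VP -> V NP with semantics V(NP).
loves = lambda y: lambda x: ("loves", x, y)   # the lambda-term  λy. λx. loves(x, y)
john, mary = "john", "mary"

vp = loves(mary)   # λx. loves(x, mary)   -- semantics of "loves Mary"
s = vp(john)       # loves(john, mary)    -- semantics of "John loves Mary"
print(s)           # ('loves', 'john', 'mary')
```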
Forward-backward algorithm (ppt)
(Excel spreadsheet; Viterbi version; lesson plan; video lecture)
Ice cream, weather, words and tags
Forward and backward probabilities
Inferring hidden states
Controlling the smoothing effect
Forward-backward: J&M 6
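As a sketch of the forward pass in the ice-cream setting (hidden Hot/Cold weather, observed ice-cream counts), with made-up probabilities rather than the spreadsheet's numbers:

```python
# Forward algorithm for a 2-state HMM: alpha[s] = p(observations so far, state = s).
states = ["H", "C"]
start = {"H": 0.5, "C": 0.5}
trans = {"H": {"H": 0.8, "C": 0.2}, "C": {"H": 0.2, "C": 0.8}}
emit = {"H": {1: 0.1, 2: 0.3, 3: 0.6}, "C": {1: 0.6, 2: 0.3, 3: 0.1}}

def forward(obs):
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * trans[r][s] for r in states) * emit[s][o]
                 for s in states}
    return sum(alpha.values())   # total probability of the observation sequence

print(forward([2, 3, 3, 1]))     # p(we ate 2, 3, 3, 1 ice creams)
```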
Uses of states
(3-4:30 in classroom)
No class (fall break)
Learning in the limit
Assignment 4 due
Assignment 6 given: Hidden Markov Models
Generalizing the forward-backward strategy
Functions, relations, composition
Inside-outside and EM: John Lafferty's notes; M&S 11; relation to backprop
Finite-state machines: R&S 1
Weights and semirings
Uses of composition
Implementing the operators
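One way to see why weights and semirings earn their own lecture: the same path-summing recurrence yields total probability under (+, ×) and the best-path score under (max, ×). A sketch over an invented toy automaton:

```python
# Accumulate the semiring value of all paths from state 0 to the last state.
def path_value(arcs, n_states, plus, times, one):
    """arcs: list of (src, dst, weight), assumed topologically ordered."""
    value = [None] * n_states
    value[0] = one
    for src, dst, w in arcs:
        v = times(value[src], w)
        value[dst] = v if value[dst] is None else plus(value[dst], v)
    return value[-1]

arcs = [(0, 1, 0.6), (0, 1, 0.4), (1, 2, 0.5)]
# Probability semiring (+, *): total probability of all paths.
print(path_value(arcs, 3, lambda a, b: a + b, lambda a, b: a * b, 1.0))  # 0.5
# Viterbi semiring (max, *): score of the single best path.
print(path_value(arcs, 3, max, lambda a, b: a * b, 1.0))                 # 0.3
```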
Assignment 5 due
Assignment 7 given: Finite-State Modeling
Probably no class (prof traveling)
Finite-state operators: chaps 2-3 of XFST book draft
Noisy channels and FSTs
The noisy channel generalization
Implementation using FSTs
Noisy-channel FSTs continued
Hidden Markov Models
Tagging: J&M 5 or M&S 10
Finite-state NLP: Karttunen (1997)
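A minimal noisy-channel sketch with toy numbers: choose the source word w maximizing p(w) · p(observed | w), the pattern these lectures implement with weighted FST composition (a spelling-correction toy, not the course's FST code):

```python
# Noisy-channel decoding: argmax over sources w of prior(w) * channel(observed | w).
prior = {"there": 0.6, "their": 0.3, "thier": 0.0}   # language model p(w)
channel = {                                          # typo model p(x | w)
    "thier": {"there": 0.05, "their": 0.20, "thier": 0.75},
}

def decode(observed):
    return max(prior, key=lambda w: prior[w] * channel[observed].get(w, 0.0))

print(decode("thier"))   # "their": 0.3 * 0.20 beats the alternatives
```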
Programming with regexps
Analogy to programming
Extended finite-state operators
Assignment 6 due
Morphology and phonology (ppt)
English, Turkish, Arabic
Generative vs. discriminative
Morphology: R&S 2 and a non-required assignment
Current NLP tasks and competitions
The NLP research community
Text annotation tasks
Other types of tasks
Applied NLP continued
Applied NLP continued
Explore links in the "NLP tasks" slides!
Graphical models, deep learning, ...
Assignment 7 due
Guest lecture by Matt Post or Philipp Koehn?
Final exam: Fri 12/14, 9am-noon
intro readings/slides from Dave Blei,
slides by Jason Eisner (video lecture part 1, part 2)
MT: J&M 25, M&S 13, statmt.org;
introductory essay (1997),
technical paper (1993);
tutorial (2006) focusing on more recent developments
(3-hour video part 1, part 2)
Lectures from past years, some still useful, are also available.
Specific course outcomes: by the end of the semester, students will
- Show sensitivity to linguistic phenomena and an ability to model
them with formal grammars. [program outcomes (a),(c*),(i),(j)]
- Understand and carry out proper experimental methodology for
training and evaluating empirical NLP systems.
[program outcomes (b),(c),(e*)]
- Be able to manipulate probabilities, construct statistical models
over strings and trees, and estimate parameters using supervised and
unsupervised training methods. [program outcomes (a),(i),(j*)]
- Be able to design, implement, and analyze NLP algorithms.
[program outcomes (a),(c),(d),(i*),(j)]
The bracketed letters above refer to these CS program outcomes:
- (a) An ability to apply knowledge of computing and mathematics appropriate to the discipline.
- (b) An ability to analyze a problem, and identify and define the computing requirements appropriate to its solution.
- (c) An ability to design, implement, and evaluate a computer-based system, process, component, or program to meet desired needs.
- (d) An ability to function effectively on teams to accomplish a common goal.
- (e) An understanding of professional, ethical, legal, security, and social issues and responsibilities.
- (i) An ability to use current techniques, skills, and tools necessary for computing practice.
- (j) An ability to apply mathematical foundations, algorithmic principles, and computer science theory in the modeling and design of computer-based systems in a way that demonstrates comprehension of the tradeoffs involved in design choices.