Natural Language Processing
Welcome! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to linguistics and statistics. You need to know how to program (e.g., 600.120) and use common data structures (600.226). It might also be nice to have some previous familiarity with automata (600.271) and probabilities (550.310). At the end you should agree (I hope!) that language is subtle and interesting, feel some ownership over some of NLP's formal and statistical techniques, and be able to understand research papers in the field.
Course catalog entry: This course is an in-depth overview of techniques for processing human language. How should linguistic structure and meaning be represented? What algorithms can recover them from text? And crucially, how can we build statistical models to choose among the many legal answers? The course covers methods for trees (parsing and semantic interpretation), sequences (finite-state transduction such as morphology), and words (sense and phrase induction), with applications to practical engineering tasks such as information retrieval and extraction, text classification, part-of-speech tagging, speech recognition and machine translation. There are a number of structured but challenging programming assignments. Prerequisite: 600.226. [Eisner, Applications, Fall] 3 credits
| Lectures: | MTW 2-3 pm, Hodson 311 | |
| Prof: | Jason Eisner | |
| TA: | Jason Smith | |
| CA: | TBA | |
| Office hrs: | For Prof: Mon/Tue 3-4pm, or by appt, in NEB 324A; for TA: TBA | |
| Mailing list: | probably ... public questions, discussion, announcements | |
| Web page: | http://cs.jhu.edu/~jason/465 | |
| Textbook: | Jurafsky & Martin (required; online partial draft of next edition wants your comments!); Manning & Schütze (recommended; online PDF version is accessible for free from within JHU) | |
| Policies: | Grading: homework 45%, participation 10%, midterm 15%, final 30%; Submission: via this web form; Lateness: floating late days policy; Honesty: here's what it means; Intellectual engagement: much encouraged; Announcements: read the mailing list and this page! | |
Warning: For future lectures and assignments, the links below take you to last year's versions, which are subject to change.
Warning: The Jurafsky & Martin chapter numbers refer to the 1st edition, so be careful about which chapters you download from the 2nd edition website.
| Week | Monday | Tuesday | Wednesday | Readings | |
| 9/10 | Introduction (ppt) | Assignment 1 given: Designing CFGs; Chomsky hierarchy (ppt) | Speech Recognition (guest lecture by Prof. Sanjeev Khudanpur) | J&M chapters 1, 13, 6.2; for assignment, J&M 9 (or M&S 3) | |
| 9/17 | Language models (ppt) | Probability concepts (ppt); Bayes' Theorem (ppt) | Smoothing n-grams (ppt) | M&S chapters 2, 6 | |
| 9/24 | Assignment 2 given: Using n-Grams; Limitations of CFG | Improving CFG with features (ppt) | Skipped this material since we were behind: Extending CFG (summary (ppt)) | J&M 11.1-11.4 | |
| 10/1 | Context-free parsing (ppt) | Context-free parsing | Earley's algorithm (ppt) | J&M 10 | |
| 10/8 | Probabilistic parsing (ppt) | Parsing tricks (ppt); A song about parsing; Assignment 2 due on Friday ---> | Human sentence processing (ppt) | J&M 12 (or M&S 11.1-11.3 and 12.1.1-12.1.5) | |
| 10/15 | No class (fall break) | Semantics (ppt) | Semantics continued | J&M 14-15; also this web page, up to but not including the "denotational semantics" section; you could also try the Penn Lambda Calculator; and how about lambda calculus for kids? | |
| 10/22 | Midterm exam | Assignment 3 given: Parsing and Semantics; Finite-state functions (ppt) | Finite-state implementation (ppt) | chap 2 of xfst book draft (only accessible from barley and other Solaris machines at JHU CS; don't distribute) | |
| 10/29 | Programming with Regexps (ppt) | Noisy Channels and FSTs (ppt) | Morphology and Phonology (ppt) | chap 3 of xfst book draft; perhaps also this paper | |
| 11/5 | Finite-state parsing | Finite-state tagging (ppt) | HMMs | J&M 8 or M&S 10 | |
| 11/12 | Assignment 3 due; Assignment 4 given: Finite-State Grammars; Forward-backward algorithm (Excel spreadsheet; Viterbi version; lesson plan) | Forward-backward continued | Expectation Maximization (ppt) | J&M chapter 6 (2nd ed.) or perhaps Allen pp. 195-208 (handout); M&S 11 | |
| 11/19 | Grouping words (ppt; Excel spreadsheet) | More on learning (ppt); Assignment 4 due; Assignment 5 given: Training an HMM | No class (Thanksgiving coming) | M&S 14 | |
| 11/26 | Splitting words (ppt) | Words vs. senses in IR (ppt) | Final FSM Examples (ppt) | M&S 7, 5, 15.2, 15.4 (since J&M 16-17 covers only some of this) | |
| 12/3 | Machine Translation | Text categorization (ppt) | Maximum entropy (ppt) | Kevin Knight's great MT tutorial and workbook; M&S 13, 16 | |
| 12/10 | Assignment 5 due; Current and Future Research (ppt) | Sun 12/16 is absolute deadline for late assignments ---> | Final exam: Wed 12/19, 9am-noon ---> | | |