Course Info

Welcome! This is a HEART course designed to introduce freshmen to research in multilingual natural language processing. After completing the course, students should gain

  • a high-level understanding of the field of NLP
  • familiarity with standard NLP algorithms and techniques
  • basic knowledge of linguistics, historical linguistics, machine translation, and multilingual techniques for NLP

Natural language processing is just what it sounds like: getting computers to process language. It's a combination of computer science, linguistics, and math. We will be talking about a variety of topics in NLP, with a focus on multilingual applications, including machine translation.

This class has no formal prerequisites. However, programming is an essential part of NLP, so we will be doing some light programming using Julia and Pluto notebooks. Don't worry if you don't know Julia! We'll introduce the language gradually through the homework assignments.

Logistics

Time: Thursdays, 5:00-6:15pm ET

Location: Zoom (check here)

Instructor: Winston Wu

Each week there will be a short homework assignment or reading. For a final project, you will design and run your own NLP experiment.

After every class, you will fill out a short survey. This gives you the opportunity to offer feedback and suggestions for future lectures, and it helps me gauge how well everyone is understanding the material.

Grading is S/U.

Schedule

Date | Topic | Materials
9/3 | Introduction and Language Modeling | Setup; Language Modeling; Links 1 2 3
9/10 | Whirlwind Tour of NLP | Julia Cheat Sheet; Conditional Probability
9/17 | Language and Linguistics | Language in 10 Assignment
9/24 | Language in 10 presentations |
10/1 | ML and Language ID | Language ID
10/8 | Historical Linguistics | Phylogeny Homework
10/15 | Word Embeddings | Embeddings; Embedding Projector
10/22 | Fall Break (no class) |
10/29 | Phrase-Based Machine Translation | Interstellar First Contact; MT Output Analysis Homework
11/5 | Neural Machine Translation | Morphological Inflection Homework
11/12 | Cross-Lingual Embeddings |
11/19 | Low-Resource NLP |

The schedule is flexible, so if there is any topic not on this list, or if there is a topic you want covered in more/less detail, let me know!