Unsupervised Machine Learning for the Humanities

Tom Lippincott, Johns Hopkins University

A recurring task at the intersection of humanities and computational research is pairing data collected by a traditional scholar with an appropriate machine learning technique, ideally in a form that creates minimal burden on the scholar while yielding relevant, interpretable insights.

In this talk, I first introduce myself and explain how my interests, educational background, and previous research have led to a focus on this general task. Next, I describe a specific effort to design a graph-aware autoencoding model of relational data that can be directly applied to a broad range of humanities research and easily extended with improved neural (sub)architectures. I then present results from an ongoing historical study of the post-Atlantic slave trade in Baltimore, illustrating several ways the model benefits traditional scholars. Finally, I briefly mention a few ongoing studies with collaborators from various departments and give a rough outline of a course aimed at a mixture of CS and Krieger students.

Speaker Biography

Dr. Lippincott has been a research scientist in the Johns Hopkins Human Language Technology Center of Excellence since receiving his Ph.D. from the University of Cambridge in 2015 under the supervision of Anna Korhonen. He spent two years prior to graduation as research faculty at Columbia University, working with Owen Rambow and Nizar Habash. His ongoing work at the HLTCOE includes text classification, sentiment analysis, and unsupervised modeling of semi-structured, heterogeneous data.