While rapid advances in technology and data availability have greatly increased the practical usability of natural language processing (NLP) models, current failures to center people in NLP research has contributed to an ethical crisis: models are liable to amplify stereotypes, spread misinformation, and perpetuate discrimination. These potential harms are difficult to identify and mitigate in data and models, because they are often subjective, subtle, dependent on social context, and cannot be reduced to supervised classification tasks. In this talk, I will discuss two projects focused on developing distantly-supervised NLP models to detect and mitigate these potential harms in text. The first exposes subtle media manipulation strategies in a state-influenced Russian newspaper by comparing media coverage with economic indicators, combining algorithms for processing text and economic data with frameworks from political science. The second develops a model to identify systemic differences in social media comments addressed towards men and women by training a model to predict the gender of the addressee and incorporating propensity matching and adversarial training to surface subtle features indicative of bias. This approach allows us to identify comments likely to contain bias without needing explicit bias annotations. Overall, my work aims to develop NLP models that facilitate text processing in diverse hard-to-annotate settings, provide insights into social-oriented questions, and advance the equity and fairness of NLP systems.
Anjalie Field is a PhD candidate at the Language Technologies Institute at Carnegie Mellon University and a visiting student at the University of Washington, where she is advised by Yulia Tsvtekov. Her work focuses on social-oriented natural language processing, specifically identifying and mitigating potential harms in text and text processing systems. This interdisciplinary work involves developing machine learning models to examine social issues like propaganda, stereotypes, and prejudice in complex real-world data sets, as well as exploring their amplification and ethical impacts in AI systems. Anjalie has published her work in NLP and interdisciplinary conferences, receiving a nomination for best paper at SocInfo 2020, and she is also the recipient of a NSF graduate research fellowship and a Google PhD fellowship. Prior to graduate school, Anjalie received her undergraduate degree in computer science, with minors in Latin and ancient Greek, from Princeton University.