As AI-driven language interfaces (such as chatbots) become more integrated into our lives, they need to become more versatile and reliable in their communication with human users. How can we make progress toward building more “general” models that are capable of understanding a broader spectrum of language commands, given practical constraints such as the limited availability of labeled data?
In this talk, I will describe my research toward addressing this question along two dimensions of generality. First, I will discuss progress in “breadth”: models that address a wider variety of tasks and abilities, drawing inspiration from existing statistical learning techniques such as multi-task learning. In particular, I will showcase a system that works well across several QA benchmarks and achieves state-of-the-art results on 10 benchmarks. Furthermore, I will show its extension to tasks beyond QA (such as text generation or classification) that can be “defined” via natural language. In the second part, I will focus on progress in “depth”: models that can handle complex inputs such as compositional questions. I will introduce Text Modular Networks, a general framework that casts problem-solving as natural language communication among simpler “modules.” Applying this framework to compositional questions, by leveraging discrete optimization and existing non-compositional closed-box QA models, yields a model with strong empirical performance on multiple complex QA benchmarks that also provides human-readable reasoning.
I will conclude by discussing future research directions toward broader NLP systems, addressing the limitations of the presented ideas and other missing elements needed to move toward more general-purpose interactive language understanding systems.
Daniel Khashabi is a postdoctoral researcher at the Allen Institute for Artificial Intelligence (AI2), Seattle. Previously, he completed his Ph.D. in Computer and Information Sciences at the University of Pennsylvania in 2019. His interests lie at the intersection of artificial intelligence and natural language processing, with a vision toward more general systems through unified algorithms and theories.