How well do neural NLP systems generalize?

Neural networks have rapidly become central to NLP systems. While such systems perform well on typical test set examples, their generalization abilities are often poorly understood. In this talk, I will discuss new methods to characterize the gaps between the abilities of neural systems and those of humans, by focusing on interpretable axes of generalization from the training set rather than on average test set performance. I will show that recurrent neural network (RNN) language models are able to process syntactic dependencies in typical sentences with considerable success, but when evaluated on more complex syntactically controlled materials, their error rate increases sharply. Likewise, neural systems trained to perform natural language inference generalize much more poorly than their test set performance would suggest. Finally, I will discuss a novel method for measuring compositionality in neural network representations; using this method, we show that the sentence representations acquired by neural natural language inference systems are not fully compositional, in line with their limited generalization abilities.

Speaker Biography

Tal Linzen is an Assistant Professor of Cognitive Science at Johns Hopkins University. Before moving to Johns Hopkins in 2017, he was a postdoctoral researcher at the École Normale Supérieure in Paris, where he worked with Emmanuel Dupoux and Benjamin Spector. He obtained his PhD from the Department of Linguistics at New York University in 2015, under the supervision of Alec Marantz. At JHU, Dr. Linzen directs the Computation and Psycholinguistics Lab, which develops computational models of human language comprehension and acquisition, as well as methods for interpreting, evaluating and extending neural network models for natural language processing. The lab’s work has appeared in venues such as EMNLP, ICLR, NAACL and TACL, as well as in journals such as Cognitive Science and Journal of Neuroscience. Dr. Linzen is one of the co-organizers of the BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (EMNLP 2018, ACL 2019).