AutoML for Natural Language Processing

A tutorial at EACL 2023
Presenters: Kevin Duh and Xuan Zhang (Johns Hopkins University)
Slides: pdf


Automated Machine Learning (AutoML) is an emerging field that has the potential to change how we build models in NLP. An umbrella term covering topics like hyperparameter optimization and neural architecture search, AutoML has recently become mainstream at major conferences such as NeurIPS, ICML, and ICLR. The inaugural AutoML Conference was held in 2022, and with this community effort, we expect that deep learning software frameworks will begin to include AutoML functionality in the near future.

What does this mean for NLP? Currently, models are often built in an ad hoc process: we might borrow default hyperparameters from previous work and try a few variant architectures, but it is never guaranteed that the final trained model is optimal. Automation can introduce rigor into this model-building process. For example, hyperparameter optimization can help NLP researchers find reasonably accurate models under a limited computation budget, leading to fairer comparisons between proposed and baseline methods. Similarly, neural architecture search can help NLP developers discover models with the desired speed-accuracy tradeoffs for deployment.

This tutorial will summarize the main AutoML techniques and illustrate how to apply them to improve the NLP model-building process. The goal is to provide the audience with the necessary background to follow and use AutoML research in their own work.

Target Audience

The tutorial is aimed at NLP researchers and developers who have experience in building deep learning models and are interested in exploring the potential of AutoML in improving their system-building process. Recommended prerequisites are:


This is a 3-hour tutorial. It is divided into two parts:

In Part 1, we will focus on two major sub-areas within AutoML. Hyperparameter optimization is the problem of finding optimal hyperparameters, such as the learning rate of gradient descent or the embedding size of a Transformer, based on past training experience. Neural architecture search is the problem of designing the optimal combination of neural network components in a fine-grained fashion. We will summarize these rapidly developing fields and explain several representative algorithms, including Bayesian Optimization, Evolutionary Strategies, Population-Based Training, Asynchronous Hyperband, and DARTS.
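To give a flavor of how these algorithms allocate compute, here is a toy sketch of the successive-halving loop at the core of Hyperband: evaluate many configurations cheaply, keep the best fraction, and re-evaluate the survivors with a larger budget. The objective function and learning-rate range below are entirely synthetic, chosen only for illustration; they do not come from the tutorial.

```python
import random

def successive_halving(configs, evaluate, budget=1, eta=2, rounds=3):
    """Minimal successive-halving loop (the core idea behind Hyperband):
    score every surviving config at the current budget, keep the top 1/eta,
    multiply the budget by eta, and repeat."""
    survivors = list(configs)
    for _ in range(rounds):
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = scored[: max(1, len(scored) // eta)]  # keep the best fraction
        budget *= eta
    return survivors[0]

# Synthetic "validation loss": best near lr = 0.1, and a larger training
# budget shrinks the loss floor (a stand-in for longer training).
def toy_loss(config, budget):
    return (config["lr"] - 0.1) ** 2 + 1.0 / budget

random.seed(0)
candidates = [{"lr": random.uniform(0.001, 1.0)} for _ in range(16)]
best = successive_halving(candidates, toy_loss)
print(best)
```

In a real NLP setting, `toy_loss` would be replaced by partial training of a model (e.g., for `budget` epochs) followed by validation, which is where the savings over full grid search come from.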

Part 2 will discuss the practical issues of applying AutoML research to NLP. Questions we will seek to answer include: (a) How do we evaluate AutoML methods on NLP tasks? (b) How can we extend AutoML methods to deployment situations that require multiple objectives, such as inference speed and test accuracy? (c) What is the cost (and carbon footprint) of these methods, and when will it be worthwhile? (d) How should we design our model-building software given a specific computing environment, and what existing tools are available?
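As a concrete illustration of question (b), multi-objective methods typically return not one model but a Pareto front: the set of candidates for which no other candidate is both at least as accurate and at least as fast. The sketch below computes such a front over hypothetical (accuracy, latency) pairs; the model names and numbers are invented for illustration.

```python
def pareto_front(models):
    """Return the models not dominated on (accuracy, latency):
    a model is dominated if some other model is at least as accurate
    AND at least as fast, and strictly better on one of the two."""
    front = []
    for m in models:
        dominated = any(
            o["acc"] >= m["acc"] and o["ms"] <= m["ms"]
            and (o["acc"] > m["acc"] or o["ms"] < m["ms"])
            for o in models
        )
        if not dominated:
            front.append(m)
    return front

# Hypothetical architecture-search results: accuracy vs. inference latency.
candidates = [
    {"name": "A", "acc": 0.90, "ms": 50},
    {"name": "B", "acc": 0.88, "ms": 20},
    {"name": "C", "acc": 0.85, "ms": 40},   # dominated by B (more accurate, faster)
    {"name": "D", "acc": 0.92, "ms": 120},
]
front = pareto_front(candidates)
print([m["name"] for m in front])
```

A deployment team would then pick a point on this front that meets its latency budget, rather than simply taking the single most accurate model.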