AutoML for Speech and Language Processing

Presenters: Kevin Duh and Xuan Zhang (Johns Hopkins University)


Description

Automated Machine Learning (AutoML) is an emerging field with the potential to change how we build models in Speech and Language Processing (SLP). An umbrella term covering topics such as hyperparameter optimization and neural architecture search, AutoML has recently become mainstream at major conferences such as NeurIPS, ICML, and ICLR. The inaugural AutoML Conference was held in 2022, and with this growing community effort, we expect that deep learning software frameworks will begin to include AutoML functionality in the near future.

What does this mean for SLP? Currently, models are often built in an ad hoc process: we might borrow default hyperparameters from previous work and try a few variant architectures, but there is no guarantee that the final trained model is optimal. Automation can introduce rigor into this model-building process. For example, hyperparameter optimization can help SLP researchers find reasonably accurate models under a limited computation budget, leading to fairer comparisons between proposed and baseline methods. Similarly, neural architecture search can help SLP developers discover models with the desired speed-accuracy tradeoffs for deployment.

This tutorial will summarize the main AutoML techniques and illustrate how to apply them to improve the SLP model-building process. The goal is to provide the audience with the necessary background to follow and use AutoML research in their own work.

Target Audience

The tutorial is aimed at SLP researchers and developers who have experience in building deep learning models and are interested in exploring the potential of AutoML in improving their system-building process. Recommended prerequisites are:

Outline

This is a 3-hour tutorial. It is divided into two parts:

In Part 1, we will focus on two major sub-areas within AutoML. Hyperparameter optimization is the problem of finding optimal hyperparameters, such as the learning rate of gradient descent or the embedding size of a Transformer, based on past training experience. Neural architecture search is the problem of designing the optimal combination of neural network components in a fine-grained fashion. We will summarize these rapidly developing fields and explain several representative algorithms, including Bayesian Optimization, Evolutionary Strategies, Population-Based Training, Asynchronous Hyperband, and DARTS.
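To make the hyperparameter optimization setting concrete, here is a minimal sketch of successive halving, the core idea behind Hyperband-style methods such as Asynchronous Hyperband. The search space and the evaluate function below are hypothetical stand-ins; in a real SLP experiment, evaluate would train a model for the given budget and return a dev-set metric such as accuracy or BLEU.

```python
import random

def sample_config():
    """Randomly sample one hyperparameter configuration (hypothetical search space)."""
    return {
        "learning_rate": 10 ** random.uniform(-5, -2),
        "embedding_size": random.choice([256, 512, 1024]),
        "dropout": random.uniform(0.0, 0.3),
    }

def evaluate(config, budget):
    """Stand-in for training `config` with `budget` units of compute.
    Here we fake a score; a real run would return a dev-set metric."""
    noise = random.gauss(0, 0.02)
    return budget * 0.001 - abs(config["learning_rate"] - 1e-3) + noise

def successive_halving(n_configs=27, min_budget=1, eta=3, rounds=3):
    """Repeatedly evaluate all surviving configs, keep the top 1/eta,
    and give the survivors eta times more budget."""
    configs = [sample_config() for _ in range(n_configs)]
    budget = min_budget
    for _ in range(rounds):
        scores = [(evaluate(c, budget), c) for c in configs]
        scores.sort(key=lambda sc: sc[0], reverse=True)
        configs = [c for _, c in scores[: max(1, len(scores) // eta)]]
        budget *= eta
    return configs[0]

if __name__ == "__main__":
    print("Best configuration found:", successive_halving())
```

The same skeleton underlies the more sophisticated methods covered in Part 1: Bayesian Optimization replaces random sampling with a model-guided proposal, while Asynchronous Hyperband runs the promotion decisions asynchronously across workers.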

Part 2 will discuss the practical issues of applying AutoML research to SLP. Questions we will seek to answer include: (a) How do we evaluate AutoML methods on SLP tasks? (b) How can we extend AutoML methods to deployment settings that require multiple objectives, such as inference speed and test accuracy? (c) What is the cost (and carbon footprint) of these methods, and when are they worthwhile? (d) How should we design our model-building software for a given computing environment, and what existing tools are available?
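As a small illustration of the multi-objective setting in question (b), the sketch below filters a set of candidate models down to their Pareto front over inference latency and accuracy, i.e. the models for which no other candidate is both at least as fast and at least as accurate. The candidate list and field names are made up for illustration; in practice each entry would come from one trained model's measurements.

```python
def pareto_front(candidates):
    """Return the candidates not dominated by any other candidate.
    One model dominates another if it is at least as fast AND at least as
    accurate, and strictly better on at least one of the two objectives."""
    front = []
    for c in candidates:
        dominated = any(
            o["latency_ms"] <= c["latency_ms"]
            and o["accuracy"] >= c["accuracy"]
            and (o["latency_ms"] < c["latency_ms"] or o["accuracy"] > c["accuracy"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front

# Hypothetical measurements for four candidate models.
models = [
    {"name": "small", "latency_ms": 12, "accuracy": 0.86},
    {"name": "base", "latency_ms": 30, "accuracy": 0.90},
    {"name": "big", "latency_ms": 85, "accuracy": 0.91},
    {"name": "slow-but-worse", "latency_ms": 90, "accuracy": 0.89},
]
print(pareto_front(models))  # "slow-but-worse" is dominated by "big" and dropped
```

Multi-objective AutoML methods produce such a front automatically, leaving the final speed-accuracy tradeoff as a deployment decision rather than a modeling one.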