Xuan Zhang

Data

  • PHOENIX14T-HS [data]
    A continuous sign language recognition and translation dataset with handshape annotations.
    We have enriched the sign language recognition dataset PHOENIX14T by incorporating handshape labels derived from a public dictionary and manual labeling.

  • Tabular Hyperparameter Optimization Dataset for Neural Machine Translation [data]
    A benchmark dataset for comparing HPO methods on NMT models.
    We trained a total of 2,245 Transformers on six different corpora at a cost of approximately 1,547 GPU days,
    and collected all pairs of hyperparameter settings and corresponding performance metrics.

Tools

  • Handshape-Aware Sign Language Recognition Systems [code]
    A sign language recognition system that incorporates handshape information.

  • A Hyperparameter Optimization Toolkit for Neural Machine Translation Research [code]
    A hyperparameter optimization toolkit for neural machine translation to help researchers focus their time on the creative rather than the mundane.
    The toolkit is implemented as a wrapper on top of the open-source Sockeye NMT software using the Asynchronous Successive Halving Algorithm (ASHA).

  • Graph-based Hyperparameter Optimization [code]
    This is an extension of graph-based semi-supervised regression for hyperparameter optimization.

Talks

    Tutorial

    AutoML for Natural Language Processing
    Kevin Duh, Xuan Zhang
    EACL 2023
    [website] [slides] [recording]

    AutoML for Neural Machine Translation
    Kevin Duh, Xuan Zhang
    AMTA 2022
    [slides]

    Others

    Practical Tips on BERT Applications
    Large Language Model Bootcamp, JHU, 2022 [slides]

    Knowledge Base-Based Language Model Pre-training
    CLSP Seminar, JHU, 2020 [slides]

    Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems
    Microsoft Research, 2020 [slides]

    Hyperparameter Optimization of Neural Machine Translation Systems
    CLSP Seminar, JHU, 2020 [slides]

    Train Better Models Faster -- Curriculum Learning and Intelligent Hyperparameter Search for Neural Machine Translation
    CLSP Seminar, JHU, 2018 [slides]

Teaching