About
I'm a research scientist at Amazon AWS AI, working to advance the state of the art in machine translation systems, primarily Amazon Translate. I defended my doctoral dissertation at Johns Hopkins University in April 2022 and am currently finishing some minor corrections for the final submission. My dissertation advisor was Philipp Koehn.
During my graduate studies, I was primarily affiliated with the Center for Language and Speech Processing and did research on neural machine translation. I was also involved in several projects focused on broader problems in natural language processing (both text and speech).
I also spent a few memorable months interning at Microsoft Translator, Salesforce Research, and Amazon, and visiting the University of Edinburgh. Before joining Johns Hopkins, I received my Bachelor's degree from Beijing University of Posts & Telecommunications. During the last year of my undergraduate study, I worked with Weiwei Sun in the Language Computing and Web Mining Group of the Institute of Computer Science & Technology, Peking University, with a focus on semantic parsing and Chinese word segmentation.
News
- Jan 2022: I'll be starting a full-time job with Amazon AWS Translate in May.
- Aug 2021: Our work on using Levenshtein Transformer for word-level quality estimation will appear in EMNLP 2021.
- Aug 2021: I led the JHU-Microsoft team in the WMT21 word-level quality estimation shared task, where we ranked first on the Word-MCC metric for the English-German language pair. The paper describing our method will appear in WMT 2021.
- May 2021: I gave an invited talk at the University of Cambridge NLIP Seminar Series.
- Mar 2021: Our work on evaluating saliency interpretations from neural language models will appear in NAACL 2021.
Courses
- 600.465: Natural Language Processing
- 600.475: Machine Learning
- 600.468: Machine Translation
- 600.676: Machine Learning: Data to Models
- 050.620: Syntax I
- 600.615: Big Data, Small Languages, Scalable Systems
- 550.661: Nonlinear Optimization I
- 600.420: Parallel Programming
Teaching
- Nov 2021: Guest Lecture, EN.600.468/601.668 Machine Translation -- Analysis and Visualization
- Fall 2017: Graduate Teaching Assistant, EN.600.468/601.668 Machine Translation. Check out the neural network and NMT homework I designed.
- Spring 2017: Guest Lecture, EN.600.435 Artificial Intelligence -- Markov Decision Process
- Spring 2016: Guest Lecture, EN.600.468 Machine Translation -- Syntax-Based Models
Publications
The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task
Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Christian Federmann, Philipp Koehn
Sixth Conference on Machine Translation (WMT 2021) [pdf][poster]
Levenshtein Training for Word-level Quality Estimation
Shuoyang Ding, Marcin Junczys-Dowmunt, Matt Post, Philipp Koehn
EMNLP 2021 [pdf][code][slides][poster][talk]
Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding, Philipp Koehn
NAACL 2021 [pdf][code][slides][talk]
Espresso: A Fast End-to-end Neural Speech Recognition Toolkit
Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, Sanjeev Khudanpur
ASRU 2019 [pdf][code]
A Call for Prudent Choice of Subword Merge Operations in Neural Machine Translation
Shuoyang Ding, Adithya Renduchintala, Kevin Duh
MT Summit 2019 [pdf][code][poster]
An Exploration of Masking for Neural Machine Translation
Matt Post, Shuoyang Ding, Marianna Martindale and Winston Wu
MT Summit 2019 [pdf]
Saliency-driven Word Alignment Interpretation for Neural Machine Translation
Shuoyang Ding, Hainan Xu, Philipp Koehn
Fourth Conference on Machine Translation (WMT) 2019 [pdf][code][slides]
Parallelizable Stack Long Short-Term Memory
Shuoyang Ding, Philipp Koehn
NAACL 2019 Workshop on Structured Prediction for NLP [pdf][bib][code][slides]
Improving End-to-end Speech Recognition with Pronunciation-assisted Sub-word Modeling
Hainan Xu, Shuoyang Ding, Shinji Watanabe
ICASSP 2019 [pdf][bib][code]
Multi-Modal Data Augmentation for End-to-end ASR
Adithya Renduchintala, Shuoyang Ding, Matthew Wiesner, Shinji Watanabe
Interspeech 2018 Best Student Paper Award (3/700+) [pdf][bib]
The JHU Machine Translation Systems for WMT 2017
Shuoyang Ding, Huda Khayrallah, Philipp Koehn, Matt Post, Gaurav Kumar, and Kevin Duh
Second Conference on Machine Translation (WMT) 2017 [pdf][bib]
The JHU Machine Translation Systems for WMT 2016
Shuoyang Ding, Kevin Duh, Huda Khayrallah, Philipp Koehn, and Matt Post
First Conference on Machine Translation (WMT) 2016 [pdf][bib]
Grammatical Relations in Chinese: GB-Ground Extraction and Data-Driven Parsing
Weiwei Sun, Yantao Du, Xin Kou, Shuoyang Ding, Xiaojun Wan
Annual Meeting of the Association for Computational Linguistics (ACL) 2014 [pdf][bib]
Preprints
Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation
Weiting Tan, Shuoyang Ding, Huda Khayrallah, Philipp Koehn, 2021 [pdf]
How Do Source-side Monolingual Word Embeddings Impact Neural Machine Translation?
Shuoyang Ding and Kevin Duh, 2018 [pdf]
Backstitch: Counteracting Finite-sample Bias via Negative Steps
Yiming Wang, Hossein Hadian, Shuoyang Ding, Ke Li, Hainan Xu, Xiaohui Zhang, Daniel Povey, Sanjeev Khudanpur, 2017 [pdf]