ACL-2000 Tutorials Schedule
Sunday 1 October and Monday 2 October 2000
| Unification-based Processing Underway to Dot Com |
| 1 October 13:30-17:00 |
| Dan Flickinger, CSLI Stanford & YY Software Corporation |
| Stephan Oepen, Saarland University |
In this tutorial we will review the state of the art in the development and application of
broad-coverage declarative grammars built on sound linguistic foundations
(the `deep' processing paradigm) and present several aspects of an international
research effort---a consortium involving Saarbruecken (Germany), Stanford
(USA) and the University of Tokyo (Japan)---to produce comprehensive,
re-usable grammars and efficient technology for parsing and generating
with such grammars. While statistical methods, often described as `shallow'
processing techniques, can bring real advantages in robustness and efficiency,
they do not provide the precise, reliable representations of meaning that
more conventional symbolic grammars can supply for natural language. We
will illustrate the benefits and viability of the declarative approach
both in multilingual grammar development (for English, German, and Japanese),
and in commercially relevant applications including machine translation,
speech prosthesis, and automated email response. The topics we will discuss
and demonstrate will include: descriptive formalism requirements, linguistic
framework and resources, grammar development tools, diagnostics and measurement,
processing efficiency, semantic engineering, re-usability and exchange
to support collaboration, and practical applications.
| Statistical Machine Translation |
| 1 October 13:30-17:00 |
| Kevin Knight, USC/ISI |
The statistical approach to machine translation (MT) seeks to extract translation
knowledge automatically from online bilingual texts (e.g., publications
of the Canadian or Hong Kong governments). This idea can be traced back
to suggestions made by Warren Weaver in the 1940s. It was pioneered at
IBM in the 1990s and continues to be inspired by relative successes in
statistical speech recognition. We will present a technical, focused tutorial
that will cover the statistical MT literature to date. This tutorial will
not cover MT in the broad sense (transfer and interlingua approaches,
evaluation, commercial products, etc.)---we will instead concentrate on
statistical models proposed for the translation process, using accessible
graphical influence diagrams to explain models used in different research
projects around the world. We will also cover language models and "decoding"
algorithms that perform online translations.
- Introduction
  - History of statistical MT
  - Substitution ciphers, light probability, noisy channel framework
  - Transliteration: a case study of MT as codebreaking
  - Sketch of a complete statistical MT system (training/translation modules)
- Building Blocks
  - Acquisition and cleaning of training data
  - Language modeling and training
  - Translation modeling and training
  - Online translation ("decoding")
- Assessment
  - Empirical results: does it work?
  - Strengths and weaknesses of statistical MT
  - Related applications
  - Immediate and long-term prospects
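For readers unfamiliar with the noisy channel framework named in the outline, it is commonly written as the decomposition below. This is a standard textbook formulation rather than material taken from the tutorial itself; the symbols e (a candidate target-language sentence) and f (the observed source-language sentence) are assumed notation.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(e)\, P(f \mid e)
```

The second equality follows from Bayes' rule, since P(f) does not depend on e. In this reading, P(e) corresponds to the language model, P(f | e) to the translation model, and the search for the maximizing e to "decoding".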
| Morphology for Asian Languages |
| 2 October 08:30-12:00 |
| Kenneth Church (chair), AT&T Labs Research |
The ACL meeting in Hong Kong presents a rare opportunity to bring together a number of well-known experts on NLP issues specific to Asian languages, especially word segmentation (morphology). Unlike in English, splitting a sequence of characters into a sequence of words is a non-trivial problem in many languages because words are not separated by white space. A lot of work on these problems is taking place in many countries, but relatively little of the literature crosses language boundaries. We felt that this ACL meeting would be an excellent chance to bring much of this work together. The tutorial will consist of a half dozen invited talks covering three languages (Chinese, Japanese and Korean) from computational as well as linguistic perspectives. We intend this tutorial to be as inclusive as possible. It should be of interest to both engineers and linguists, and accessible to a diverse audience including experts who have spent considerable time with Asian languages as well as novices like the chair, whose experience is, for the most part, limited to a single Western language, namely English.
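To make the segmentation problem concrete, here is a minimal sketch of greedy forward maximum matching, one simple dictionary-based baseline. It is not the method of any tutorial speaker. The function name max_match, the toy lexicon, and the English example string with its spaces removed are all assumptions chosen for illustration.

```python
def max_match(text, lexicon, max_len=5):
    """Greedy forward maximum matching.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon, for illustration only.
LEXICON = {"the", "theta", "table", "bled", "down", "own", "there"}

# "the table down there" written without white space.
print(max_match("thetabledownthere", LEXICON))
```

With this toy lexicon the greedy strategy yields "theta bled own there" rather than "the table down there": every piece is a valid dictionary word, yet the segmentation is wrong. That kind of ambiguity is why segmentation cannot be reduced to dictionary lookup.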
Speakers:
Keh-Jiann Chen, Academia Sinica, Taiwan
Key-Sun Choi, Korea Advanced Institute of Science and Technology
Kiyong Lee, Korea University
Yuji Matsumoto, Nara Institute of Science and Technology, Japan
Masaaki Nagata, NTT Information and Communication Systems Laboratories, Japan
Benjamin K Tsou, City University of Hong Kong
Committee Members:
Kenneth Church (chair), Key-Sun Choi, Yuji Matsumoto, Sung Hyon Myaeng,
Masaaki Nagata, Keh-Yih Su, Lua Kim Teng
| Multilingual Information Access |
| 2 October 08:30-12:00 |
| Douglas W. Oard, University of Maryland |
This tutorial will address
the application of techniques at the intersection of computational linguistics
and information retrieval to help users search multilingual collections.
The tutorial will draw from several perspectives, examining the contributions
of the computational linguistics and information retrieval communities
to cross-language information retrieval, and augmenting that with a discussion
of related issues from machine translation, text summarization and human-computer
interaction. Alternative techniques for each key component will be explained
and illustrated using working systems and reported experimental results.
Evaluation issues, best present practice, and open research questions
will be highlighted throughout the tutorial. The worldwide series of cross-language
information retrieval evaluation venues will also be introduced, with
particular attention to evaluations that focus on Asian languages. The
tutorial will conclude with an assessment of the prospects for adoption
of this technology for Internet searching, commercial information retrieval
systems, and special-purpose applications.
[The tutorial will be held jointly by ACL and IRAL'2000, the 5th
International Workshop on Information Retrieval with Asian Languages]
ACL-2000 Tutorials Co-Chairs:
John Carroll
University of Sussex
Hemant Darbari
CDAC - Pune University
acl2k-tutorials@cogs.susx.ac.uk