Nguyen Bach
Graduate Student
Johns Hopkins University
Department of Computer Science
3400 N. Charles Street
224 NEB
Baltimore, MD 21218
Phone gif-uks-pihu
Email x@y where x=nguyen; y=cs.jhu.edu
News: I graduated! I have moved to the Language Technologies Institute (LTI),
School of Computer Science, Carnegie Mellon University, where I am doing my
PhD. My new email address is x@y where x=nbach, y=andrew.cmu.edu, and my new
homepage is www.cs.cmu.edu/~nbach. More information about my new home will be
posted soon.
- Information retrieval
- Natural language processing
- Speech recognition and synthesis
- Machine learning
- Computational linguistics
I am also interested in applying speech and language technologies to help
people with speech disorders and hearing impairments.
- H. Mixdorff, N. Bach, et al.: 'Quantitative Analysis and Synthesis of
Syllabic Tones in Vietnamese,' Proceedings of the Eighth European Conference
on Speech Communication and Technology 2003, Sep 2003, pp. 177-180. [PDF]
- N. Bach, M. Luong: 'Application of Dynamic Time Warping Algorithm for the
recognition of Vietnamese isolated words,' Proceedings of the National
Scientific Conference in Hanoi, Vietnam, Dec 2001, pp. 465-473. [PDF in
Vietnamese]
TECHNICAL REPORTS – IMPLEMENTATIONS
- N. Bach, 'MetaShopper - a preliminary study and implementation', May 2004,
Johns Hopkins University.
You can try the implementation here: VeryNaiveBookCrawler
- N. Bach, S. Reddy, 'A preliminary quantitative study on the characteristics
of Vietnamese vowels and English vowels', May 2004, Johns Hopkins University.
- A random sentence generator. Each time you run the generator, it reads a
context-free grammar from a file and prints one or more random sentences.
I wrote this small program in September 2003 and updated it in June 2004.
You can try it here: 10 English sentences or 10 Vietnamese sentences in
Nguyen_Binh's style
- A text classifier. The program is trained on two corpora, for example spam
and not-spam, or English and Spanish. Given an email, it assigns the email to
one of the two training classes: for spam detection it decides whether the
email is spam or not-spam, and for language identification it decides whether
the email is written in English or Spanish. Smoothing sharply reduces the
error rate. I tried uniform, add-lambda, add-lambda backoff, and Witten-Bell
backoff smoothing.
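
The random sentence generator in the list above can be illustrated with a
minimal Python sketch. It is not the original program: the toy grammar, the
"LHS -> RHS" rule format, and the names parse_grammar and generate are all
assumptions made for this example, and the grammar is held in a string
rather than read from a file so that the sketch is self-contained.

```python
import random

# Hypothetical toy grammar (an assumption for illustration only).
# One rule per line, written as "LHS -> symbol symbol ...".
GRAMMAR_TEXT = """
S -> NP VP
NP -> the N
VP -> V NP
N -> student
N -> sentence
V -> reads
V -> writes
"""


def parse_grammar(text):
    """Map each left-hand symbol to a list of right-hand sides."""
    rules = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lhs, rhs = line.split("->")
        rules.setdefault(lhs.strip(), []).append(rhs.split())
    return rules


def generate(rules, symbol="S", rng=random):
    """Expand `symbol` recursively; a symbol with no rule is a terminal."""
    if symbol not in rules:
        return [symbol]
    words = []
    for part in rng.choice(rules[symbol]):
        words.extend(generate(rules, part, rng))
    return words


if __name__ == "__main__":
    rules = parse_grammar(GRAMMAR_TEXT)
    for _ in range(10):
        print(" ".join(generate(rules)))
```

Each run prints ten sentences. A recursive grammar would also work, provided
its rules let the expansion terminate with probability one.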
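
To show why smoothing matters for the text classifier in the list above,
here is a minimal Python sketch of add-lambda smoothing. It rests on several
assumptions that are not taken from the original program: a unigram
(bag-of-words) model, equal prior probability for the two classes, a tiny
made-up training set, and the hypothetical names train, log_prob, and
classify. Backoff and Witten-Bell smoothing are not shown.

```python
import math
from collections import Counter


def train(corpus):
    """Count word occurrences over a list of tokenized documents."""
    counts = Counter()
    for doc in corpus:
        counts.update(doc)
    return counts


def log_prob(doc, counts, vocab_size, lam=1.0):
    """Unigram log-probability with add-lambda smoothing:
    p(w) = (count(w) + lam) / (N + lam * V).
    lam must be > 0, otherwise an unseen word has probability zero."""
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + lam) / (total + lam * vocab_size)) for w in doc
    )


def classify(doc, models, lam=1.0):
    """Return the label whose model gives `doc` the highest probability
    (equal class priors are assumed)."""
    vocab = set()
    for counts in models.values():
        vocab.update(counts)
    vocab_size = len(vocab) + 1  # one extra slot for unseen words
    return max(
        models, key=lambda label: log_prob(doc, models[label], vocab_size, lam)
    )


# Made-up training data, for illustration only.
spam = [["win", "money", "now"], ["free", "money", "offer"]]
not_spam = [["meeting", "at", "noon"], ["see", "you", "at", "lunch"]]
models = {"spam": train(spam), "not-spam": train(not_spam)}

if __name__ == "__main__":
    print(classify(["free", "money"], models))
    print(classify(["meeting", "at", "lunch"], models))
```

Without smoothing (lam = 0), any word missing from a class's training corpus
would drive that class's probability to zero, so a single unseen word could
decide the outcome. Adding lam to every count avoids this.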
Last modified: Friday, June 09, 2005