

Authors: Liu, Fu-Hua; Gu, Liang; Gao, Yuqing; Picheny, Michael
Publisher: Springer Publishing Company
ISSN: 1381-2416
Source: International Journal of Speech Technology, Vol. 7, Iss. 2-3, 2004-04, pp. 221-229
Abstract
This paper describes various language modeling issues in a speech-to-speech translation system. These issues are addressed in the IBM speech-to-speech system we developed for the DARPA Babylon program in the context of two-way translation between English and Mandarin Chinese. First, the language models for the speech recognizer had to be adapted to the specific domain to improve the recognition performance for in-domain utterances, while keeping the domain coverage as broad as possible. This involved considerations of disfluencies and lack of punctuation, as well as domain-specific utterances. Second, we used a hybrid semantic/syntactic representation to minimize the data sparseness problem in a statistical natural language generation framework. Serious inflection and synonym issues arise when words in the target language are to be determined in the translation output. Instead of relying on tedious handcrafted grammar rules, we used