|
Please use this identifier to cite or link to this item:
http://hdl.handle.net/10061/7988
|
| Title: | A Large-Vocabulary Continuous Speech Recognition Algorithm and its Application to a Multi-modal Telephone Directory Assistance System |
| Authors: | Yasuhiro Minami Kiyohiro Shikano Osamu Yoshioka Satoshi Takahashi Tomokazu Yamada Sadaoki Furui |
| Issue Date: | Mar-1997 |
| Start page: | 387 |
| End page: | 392 |
| Abstract: | This paper describes an accurate and efficient algorithm for very-large-vocabulary continuous speech recognition based on an HMM-LR algorithm. The HMM-LR algorithm uses a generalized LR parser as a language model and hidden Markov models (HMMs) as phoneme models. To reduce the search space without pruning the correct candidate, we use forward and backward trellis likelihoods, an adjusting window for choosing only the probable part of the trellis for each predicted phoneme, and an algorithm for merging candidates that have the same allophonic phoneme sequences and the same context-free grammar states. Candidates are also merged at the meaning level. This algorithm is applied to a telephone directory assistance system that recognizes spontaneous speech containing the names and addresses of more than 70,000 subscribers (vocabulary size is about 80,000). The experimental results show that the system performs well in spite of the large perplexity. This algorithm was also applied to a multi-modal telephone directory assistance system, and the system was evaluated from the human-interface point of view. To cope with the problem of background noise, an HMM composition technique which combines a noise-source HMM and a clean phoneme HMM into a noise-added phoneme HMM was investigated and incorporated into the system. |
| Description: | HLT1994: Workshop on Human Language Technology, March 8-11, 1994, Plainsboro, New Jersey, USA. |
| URI: | http://hdl.handle.net/10061/7988 |
| ISBN: | 1558603573 |
| Rights: | Copyright 1994 Association for Computational Linguistics |
| Text Version: | Publisher |
| Publisher DOI: | 10.3115/1075812.1075902 |
| Appears in Collections: | Graduate School of Information Science
|