Please use this identifier to cite or link to this item:
http://hdl.handle.net/10061/8125
| Title: | Speech-to-Lip Movement Synthesis Maximizing Audio-Visual Joint Probability Based on EM Algorithm |
| Authors: | Eli Yamamoto; Satoshi Nakamura; Kiyohiro Shikano |
| Issue Date: | Dec-1998 |
| Publisher: | IEEE |
| Start page: | 53 |
| End page: | 58 |
| Abstract: | We investigate methods that use hidden Markov models (HMMs) to drive a lip movement sequence from input speech. We have previously investigated a mapping method based on the Viterbi decoding algorithm, which converts input speech to a lip movement sequence through the most likely HMM state sequence produced by audio HMMs. However, that method has a substantial problem: it produces errors wherever HMM states are decoded incorrectly. This paper proposes a method that re-estimates the visual parameters using HMMs of the audio-visual joint probability under the expectation-maximization (EM) algorithm. In experiments, the proposed EM-based mapping method reduces error by 26% compared to the Viterbi-based method at incorrectly decoded bilabial consonants. |
| Description: | IEEE Second Workshop on Multimedia Signal Processing, December 7-9, 1998, Redondo Beach, California, USA. |
| URI: | http://hdl.handle.net/10061/8125 |
| ISBN: | 0780349199 |
| Rights: | Copyright 1998 IEEE |
| Text Version: | Publisher |
| Publisher DOI: | 10.1109/MMSP.1998.738912 |
| Appears in Collections: | Graduate School of Information Science |