Advanced Search
Japanese | English

naistar (NAIST Academic Repository) >
学術リポジトリ naistar / NAIST Academic Repository naistar >
国際会議発表論文 / Proceedings >
情報科学研究科 / Graduate School of Information Science >

Please use this identifier to cite or link to this item:

Title: Speech-to-Lip Movement Synthesis Based on the EM Algorithm Using Audio-Visual HMMs
Authors: Eli Yamamoto
Satoshi Nakamura
Kiyohiro Shikano
Issue Date: Nov-1998
Start page: 1275
End page: 1278
Article Number: 0756
Abstract: This paper proposes a method to re-estimate output visual parameters for speech-to-lip movement synthesis using audio-visual Hidden Markov Models (HMMs) under the Expectation-Maximization(EM) algorithm. In the conventional methods for speech-to-lip movement synthesis, there is a synthesis method estimating a visual parameter sequence through the Viterbi alignment of an input acoustic speech signal using audio HMMs. However, the HMM-Viterbi method involves a substantial problem that incorrect HMM state alignment may output incorrect visual parameters. The problem in the HMM-Viterbi method is caused by the deterministic synthesis process to assign a single HMM state for an input audio frame. The proposed method avoids the deterministic process by re-estimating non-deterministic visual parameters while maximizing the likelihood of the audio-visual observation sequence under the EM algorithm.
Description: ICSLP1998: the 5th International Conference on Spoken Language Processing, November 30 - December 4, 1998, Sydney, Australia.
Text Version: Publisher
Appears in Collections:情報科学研究科 / Graduate School of Information Science

Files in This Item:

File SizeFormat
ICSLP_1998_1275.pdf802.96 kBAdobe PDFView/Open

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.


Copyright (c) 2007-2012 Nara Institute of Science and Technology All Rights Reserved.
DSpace Software Copyright © 2002-2010  Duraspace - Feedback