
Title: Speech-to-Lip Movement Synthesis Maximizing Audio-Visual Joint Probability Based on EM Algorithm
Authors: Eli Yamamoto
Satoshi Nakamura
Kiyohiro Shikano
Issue Date: Dec-1998
Publisher: IEEE
Start page: 53
End page: 58
Abstract: We investigate methods that use hidden Markov models (HMMs) to drive a lip movement sequence from input speech. We previously investigated a mapping method based on the Viterbi decoding algorithm, which converts input speech into a lip movement sequence through the most likely HMM state sequence decoded by audio HMMs. However, that method suffers from errors introduced by incorrectly decoded HMM states. This paper proposes a method that re-estimates the visual parameters using HMMs of the audio-visual joint probability under the expectation-maximization (EM) algorithm. In experiments, the proposed EM-based mapping method reduces errors by 26% relative to the Viterbi-based method at incorrectly decoded bilabial consonants.
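The contrast the abstract draws — hard assignment via Viterbi decoding versus soft, posterior-weighted estimation in the EM spirit — can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the HMM parameters (`pi`, `A`, `a_mean`, `a_var`) and the per-state visual means (`v_mean`) are assumed inputs, and `posterior_visual` performs one E-step's worth of posterior weighting over the audio rather than the paper's full joint re-estimation.

```python
# Hypothetical sketch (assumed names, not the authors' code): mapping audio
# frames to visual parameters through a joint audio-visual HMM whose states
# carry an audio Gaussian (a_mean, a_var, diagonal) and a visual mean v_mean.
import numpy as np

def log_gauss(x, mean, var):
    """Log N(x; mean, diag(var)) for a single frame."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def viterbi_visual(audio, pi, A, a_mean, a_var, v_mean):
    """Hard mapping: most likely audio state sequence -> that state's visual mean."""
    T, N = len(audio), len(pi)
    logB = np.array([[log_gauss(audio[t], a_mean[j], a_var[j])
                      for j in range(N)] for t in range(T)])
    delta = np.log(pi) + logB[0]
    psi = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)   # scores[i, j]: best path ending i -> j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + logB[t]
    states = np.zeros(T, dtype=int)
    states[-1] = delta.argmax()
    for t in range(T - 2, -1, -1):            # backtrack
        states[t] = psi[t + 1, states[t + 1]]
    return v_mean[states]

def posterior_visual(audio, pi, A, a_mean, a_var, v_mean):
    """Soft mapping: forward-backward state posteriors weight the visual means,
    avoiding the all-or-nothing error of a wrongly decoded state."""
    T, N = len(audio), len(pi)
    B = np.array([[np.exp(log_gauss(audio[t], a_mean[j], a_var[j]))
                   for j in range(N)] for t in range(T)])
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[0]; alpha[0] /= alpha[0].sum()
    for t in range(1, T):                     # scaled forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):            # scaled backward pass
        beta[t] = A @ (B[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta                      # state posteriors, up to per-t scale
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma @ v_mean                     # (T, visual_dim) soft trajectory
```

Where the Viterbi path picks a wrong state, `viterbi_visual` outputs that state's visual mean outright, whereas the posterior-weighted output degrades gracefully toward a blend of plausible states — the intuition behind the paper's EM-based re-estimation.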
Description: IEEE Second Workshop on Multimedia Signal Processing, December 7-9, 1998, Redondo Beach, California, USA.
ISBN: 0780349199
Rights: Copyright 1998 IEEE
Text Version: Publisher
Publisher DOI: 10.1109/MMSP.1998.738912
Appears in Collections:情報科学研究科 / Graduate School of Information Science

Files in This Item:

File: MMSP_1998_53.pdf | Size: 905.59 kB | Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.


Copyright (c) 2007-2012 Nara Institute of Science and Technology All Rights Reserved.
DSpace Software Copyright © 2002-2010 Duraspace