
Please use this identifier to cite or link to this item: http://hdl.handle.net/10061/7968

Title: A Semi-Blind Source Separation Method for Hands-Free Speech Recognition of Multiple Talkers
Authors: Panikos Heracleous; Satoshi Nakamura; Kiyohiro Shikano
Issue Date: Sep-2003
Start page: 509
End page: 512
Abstract: In this paper, we present a beamforming-based semi-blind source separation technique that can be applied efficiently to hands-free speech recognition of multiple talkers, including moving talkers. The main difference from conventional blind source separation techniques is that the proposed method does not attempt to separate the unknown signals explicitly in a pre-processing pass before speech recognition. Instead, localization of multiple talkers, separation of the signals, and speech recognition are integrated in a single pass. At each time frame, beams formed by a delay-and-sum beamformer are steered in every direction, and speech information is extracted. A modified Viterbi formula provides n-best hypotheses for each direction and word hypotheses. At the final frame, all hypotheses are clustered based on their direction information. The clusters, which correspond to the talkers, include information about the recognized speech of the multiple talkers and about their directions. Experiments on recognition of two and three talkers showed very promising results. In the case of two talkers, using simulated clean data, we achieved a recognition rate of 95.02% on average for the 'top 5' hypotheses.
Description: EUROSPEECH2003: 8th European Conference on Speech Communication and Technology, September 1-4, 2003, Geneva, Switzerland.
URI: http://hdl.handle.net/10061/7968
ISSN: 1018-4074
Rights: Copyright 2003 ISCA
Text Version: Publisher
Appears in Collections: 情報科学研究科 / Graduate School of Information Science
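
The abstract mentions steering delay-and-sum beams in every direction at each time frame. The following is a minimal sketch of that one step only, not the authors' implementation: it assumes a uniform linear microphone array, a far-field source, integer sample delays, and a speed of sound of 343 m/s. All function names and parameters (`steering_delays`, `delay_and_sum`, `scan_directions`) are hypothetical, and the modified Viterbi search and the clustering of hypotheses are not shown.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def steering_delays(mic_positions, angle_deg, sample_rate):
    """Relative arrival delays (in whole samples) of a far-field plane wave.

    mic_positions are coordinates in metres along the array axis;
    angle_deg is measured from that axis (0 to 180 degrees).
    """
    cos_a = math.cos(math.radians(angle_deg))
    # A microphone further along the axis towards the source hears it earlier.
    raw = [-x * cos_a / SPEED_OF_SOUND * sample_rate for x in mic_positions]
    earliest = min(raw)
    return [round(d - earliest) for d in raw]


def delay_and_sum(channels, arrivals):
    """Advance each channel by its arrival delay and average the channels."""
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, arrivals):
        for i in range(n - delay):
            out[i] += signal[i + delay]
    return [v / len(channels) for v in out]


def scan_directions(channels, mic_positions, sample_rate, angles):
    """Steer a beam to each candidate angle and return its output energy."""
    energies = {}
    for angle in angles:
        arrivals = steering_delays(mic_positions, angle, sample_rate)
        beam = delay_and_sum(channels, arrivals)
        energies[angle] = sum(v * v for v in beam)
    return energies


if __name__ == "__main__":
    mics = [0.0, 0.1, 0.2, 0.3]      # four microphones, 10 cm spacing
    rate = 16000
    source = [0.0] * 64
    source[20] = 1.0                 # a single impulse as a toy "talker"
    true_delays = steering_delays(mics, 60, rate)
    # Simulate what each microphone records for a source at 60 degrees.
    recorded = [[0.0] * d + source[:64 - d] for d in true_delays]
    result = scan_directions(recorded, mics, rate, range(0, 181, 30))
    print(max(result, key=result.get))
```

In this toy setup the beam steered towards the simulated source sums the channels coherently, so its energy is the largest among the scanned angles. In the single-pass scheme described in the abstract, features extracted from each steered beam would be passed to the recognizer rather than only compared by energy.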

Files in This Item:

File: EUROSPEECH_2003_509.pdf
Size: 554.68 kB
Format: Adobe PDF
