<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
  <channel>
    <title>DSpace Collection:</title>
    <link>http://hdl.handle.net/10061/4716</link>
    <description />
    <pubDate>Wed, 22 May 2013 07:54:18 GMT</pubDate>
    <dc:date>2013-05-22T07:54:18Z</dc:date>
    <item>
      <title>Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition</title>
      <link>http://hdl.handle.net/10061/8613</link>
      <description>Title: Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition
Authors: Arata Itoh; Sunao Hara; Norihide Kitaoka; Kazuya Takeda
Abstract: A novel speech feature generation-based acoustic model training method for robust speaker-independent speech recognition is proposed. For decades, speaker adaptation methods have been widely used. All of these adaptation methods need adaptation data. However, our proposed method aims to create speaker-independent acoustic models that cover not only known but also unknown speakers. We achieve this by adopting inverse maximum likelihood linear regression (MLLR) transformation-based feature generation, and then we train our models using these features. First, we obtain MLLR transformation matrices from a limited number of existing speakers. Then we extract the bases of the MLLR transformation matrices using PCA. The distribution of the weight parameters to express the transformation matrices for the existing speakers is estimated. Next, we construct pseudo-speaker transformations by sampling the weight parameters from the distribution, and apply the transformation to the normalized features of the existing speaker to generate the features of the pseudo-speakers. Finally, using these features, we train the acoustic models. Evaluation results show that the acoustic models trained using our proposed method are robust for unknown speakers.</description>
      <pubDate>Mon, 01 Oct 2012 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/10061/8613</guid>
      <dc:date>2012-10-01T00:00:00Z</dc:date>
    </item>
    <item>
      <title>Online Detection of Task Incompletion Using Utterance and Action Tag N-grams in Spoken Dialog Systems</title>
      <link>http://hdl.handle.net/10061/8614</link>
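      <description>Title: Online Detection of Task Incompletion Using Utterance and Action Tag N-grams in Spoken Dialog Systems
Authors: Sunao Hara; Norihide Kitaoka; Kazuya Takeda
Abstract: This paper proposes a method that uses N-gram features to detect dialogs in which the user fails to complete the task while using a spoken dialog system. The experiments use dialog data from a spoken dialog system for music retrieval, recorded in a real environment in which users ran the system on their own PCs. All dialog data for the music retrieval task are described with utterance and action tags that abstract the user and system utterances, and the tag sequences are modeled as N-grams. Using the occurrence counts of tag N-grams in the tag sequence as features, we conducted experiments on detecting task-incomplete dialogs with discriminative methods such as Support Vector Machine and C4.5 decision trees. We further improved detection performance by introducing dialog variables that characterize the dialog, such as dialog length and response time. As a result, a classification accuracy of about 76% was obtained using the dialog up to the user's fourth utterance.</description>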
      <pubDate>Tue, 01 Jan 2013 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/10061/8614</guid>
      <dc:date>2013-01-01T00:00:00Z</dc:date>
    </item>
    <item>
      <title>Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator</title>
      <link>http://hdl.handle.net/10061/8290</link>
      <description>Title: Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator
Authors: Ryo Wakisaka; Hiroshi Saruwatari; Kiyohiro Shikano; Tomoya Takatani
Abstract: In this paper, we introduce a generalized minimum mean-square error short-time spectral amplitude estimator with a new prior estimation of the speech probability density function based on moment-cumulant transformation. From the objective and subjective evaluation experiments, we show the improved noise reduction performance of the proposed method.</description>
      <pubDate>Wed, 01 Feb 2012 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/10061/8290</guid>
      <dc:date>2012-02-01T00:00:00Z</dc:date>
    </item>
    <item>
      <title>Theoretical analysis of amounts of musical noise and speech distortion in structure-generalized parametric spatial subtraction array</title>
      <link>http://hdl.handle.net/10061/8289</link>
      <description>Title: Theoretical analysis of amounts of musical noise and speech distortion in structure-generalized parametric spatial subtraction array
Authors: Ryoichi Miyazaki; Hiroshi Saruwatari; Kiyohiro Shikano
Abstract: We propose a structure-generalized blind spatial subtraction array (BSSA) and present a theoretical analysis of the amounts of musical noise and speech distortion. The structure of BSSA should be selected according to the application, i.e., a channelwise BSSA is recommended for listening but a conventional BSSA is suitable for speech recognition.</description>
      <pubDate>Wed, 01 Feb 2012 00:00:00 GMT</pubDate>
      <guid isPermaLink="false">http://hdl.handle.net/10061/8289</guid>
      <dc:date>2012-02-01T00:00:00Z</dc:date>
    </item>
  </channel>
</rss>

