Abstract:
In this paper, we propose many-to-many voice conversion (VC) techniques that convert an arbitrary source speaker's voice into an arbitrary target speaker's voice. We have previously proposed one-to-many eigenvoice conversion (EVC) and many-to-one EVC. In EVC, an eigenvoice Gaussian mixture model (EV-GMM) is trained in advance using multiple parallel data sets, each consisting of utterances from a reference speaker and one of many pre-stored speakers. The EV-GMM can then be flexibly adapted to an arbitrary speaker using a small amount of adaptation data without any linguistic constraints. In this paper, we achieve many-to-many VC by sequentially performing many-to-one EVC and one-to-many EVC through the reference speaker using the same EV-GMM. Experimental results demonstrate the effectiveness of the proposed many-to-many VC.
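The chaining described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: it assumes scalar (1-D) features, frame-wise minimum mean-square error conversion, and speaker weights that are already known rather than estimated from adaptation data. The names `Mixture`, `many_to_one`, `one_to_many`, and `many_to_many` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Mixture:
    """One component of a toy 1-D EV-GMM (illustrative assumption)."""
    weight: float        # mixture weight
    ref_mean: float      # mean of the reference speaker's feature
    ref_var: float       # variance of the reference speaker's feature
    cross_cov: float     # covariance between reference and adapted speaker
    spk_var: float       # variance of the adapted speaker's feature
    bias: float          # bias term of the adapted speaker's mean
    eigen: List[float]   # eigenvoice coefficients, one per speaker weight

    def spk_mean(self, w: Sequence[float]) -> float:
        # Adapted mean = bias + weighted sum of eigenvoice coefficients.
        return self.bias + sum(b * wj for b, wj in zip(self.eigen, w))


def _gauss(x: float, mean: float, var: float) -> float:
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def _posteriors(scores: List[float]) -> List[float]:
    # No underflow protection: acceptable for a toy example only.
    total = sum(scores)
    return [s / total for s in scores]


def many_to_one(y: float, w_src: Sequence[float], gmm: List[Mixture]) -> float:
    """Map a source-speaker frame to the reference speaker."""
    post = _posteriors(
        [m.weight * _gauss(y, m.spk_mean(w_src), m.spk_var) for m in gmm]
    )
    return sum(
        p * (m.ref_mean + m.cross_cov / m.spk_var * (y - m.spk_mean(w_src)))
        for p, m in zip(post, gmm)
    )


def one_to_many(x: float, w_tgt: Sequence[float], gmm: List[Mixture]) -> float:
    """Map a reference-speaker frame to a target speaker."""
    post = _posteriors([m.weight * _gauss(x, m.ref_mean, m.ref_var) for m in gmm])
    return sum(
        p * (m.spk_mean(w_tgt) + m.cross_cov / m.ref_var * (x - m.ref_mean))
        for p, m in zip(post, gmm)
    )


def many_to_many(
    y: float, w_src: Sequence[float], w_tgt: Sequence[float], gmm: List[Mixture]
) -> float:
    """Chain both conversions through the reference speaker, same EV-GMM."""
    return one_to_many(many_to_one(y, w_src, gmm), w_tgt, gmm)
```

In this sketch the source and target speakers differ only in their weight vectors, so a single set of mixture parameters serves both conversion directions, mirroring the use of the same EV-GMM for both steps.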
Description:
INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association, September 6-10, 2009, Brighton, UK.