Abstract:
We propose subband-based blind source separation (BSS) for convolutive mixtures of speech. The approach is motivated by a drawback of frequency-domain BSS: when a long frame with a fixed frame-shift is used for a few seconds of speech, the number of samples in each frequency bin decreases and the separation performance is degraded. In our proposed subband BSS, (1) by using a moderate number of subbands, a sufficient number of samples can be held in each subband, and (2) by using FIR filters in each subband, we can handle long reverberation. Subband BSS achieves better performance than frequency-domain BSS. Moreover, subband BSS allows us to select the separation method suited to each subband. Using this advantage, we propose efficient separation procedures that take the frequency characteristics of room reverberation and speech signals into consideration, (3) by using longer unmixing filters in low frequency bands, and (4) by adopting overlap-blockshift in BSS's batch adaptation in low frequency bands. Consequently, subband processing appropriate for each frequency band is successfully realized with the proposed subband BSS.
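
The sample-count argument above can be illustrated with simple arithmetic. The following is only a rough sketch, not the procedure used in this work: it assumes that the frame-shift is a fixed fraction (one half) of the frame length, that the recording is 3 s long at an 8 kHz sampling rate, and that a subband analysis stage decimates by a factor of 16. The function names and all of these values are hypothetical and chosen for illustration only.

```python
def frames_per_bin(num_samples: int, frame_len: int, shift_ratio: float = 0.5) -> int:
    """Count full analysis frames, i.e. the samples available in each frequency bin.

    Assumption: the frame-shift is a fixed fraction of the frame length.
    """
    shift = max(1, int(frame_len * shift_ratio))
    if num_samples < frame_len:
        return 0
    return (num_samples - frame_len) // shift + 1


def samples_per_subband(num_samples: int, decimation: int) -> int:
    """Approximate samples per subband after decimation (filter edge effects ignored)."""
    return num_samples // decimation


fs = 8000               # assumed sampling rate in Hz
num_samples = 3 * fs    # assumed 3 s of speech

# Frequency-domain BSS: longer frames leave fewer samples in each bin.
for frame_len in (512, 2048, 8192):
    print(f"frame length {frame_len}: {frames_per_bin(num_samples, frame_len)} samples per bin")

# Subband BSS: a moderate decimation factor keeps many samples per subband.
print(f"decimation 16: {samples_per_subband(num_samples, 16)} samples per subband")
```

Under these assumptions, lengthening the frame to cover longer reverberation quickly reduces the data available for estimating statistics in each bin, whereas a moderately decimated subband signal retains far more samples.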