posted on 2013-05-02, 09:02 authored by Muhammad Salman Khan, Mohsen Naqvi, Jonathon Chambers
Humans are skilled in selectively extracting a single sound
source in the presence of multiple simultaneous sounds. They
(individuals with normal hearing) can also robustly adapt to
changing acoustic environments with great ease. A need has
arisen to incorporate such abilities into machines, which would
benefit application areas such as human-computer
interaction, automatic speech recognition, hearing aids and
hands-free telephony. This work addresses the problem of
separating multiple speech sources in realistic reverberant
rooms using two microphones.
Different monaural and binaural cues have previously
been modeled in order to enable separation. The binaural spatial
cues, i.e. the interaural level difference (ILD) and the
interaural phase difference (IPD), have been modeled [1] in the
time-frequency (TF) domain. These cues exploit the differences in
the intensity and the phase of the mixture signals (because of
the different spatial locations) observed by two microphones
(or ears). The method performs well with little or no
reverberation, but as the amount of reverberation increases and the
sources approach each other, the interaural cues become distorted
and indistinct, degrading the separation performance. Thus,
additional cues need to be exploited, and further signal processing
is required at higher levels of reverberation.
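
As a rough illustration of the two interaural cues described above, the following Python sketch computes the raw ILD and IPD at every time-frequency point of a two-microphone recording. It is not the model of [1], which is only cited here and not reproduced; it shows only how the cues themselves could be measured. The function names (`stft`, `interaural_cues`), the frame length, the hop size, the Hann window and the left-over-right sign convention are all assumptions made for this example.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Naive STFT with a Hann window; returns an array of shape (frames, bins).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def interaural_cues(left, right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for every time-frequency point.

    Assumed convention: both cues are measured as left relative to right.
    """
    L = stft(left)
    R = stft(right)
    # Level difference: ratio of the magnitudes, expressed in decibels.
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))
    # Phase difference: angle of the cross-spectrum, wrapped to [-pi, pi].
    ipd = np.angle(L * np.conj(R))
    return ild, ipd
```

For example, if the right channel is simply the left channel scaled by 0.5, the ILD is about 6 dB and the IPD is zero at every TF point. In a reverberant room the same quantities spread out and overlap between sources, which is the degradation the text describes.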
History
School
Mechanical, Electrical and Manufacturing Engineering
Citation
KHAN, M.S., NAQVI, S.M. and CHAMBERS, J., 2013. Speech separation with dereverberation-based pre-processing incorporating visual cues. IN: Proceedings of the 2nd International workshop on machine listening in multisource environments (CHIME), Vancouver, Canada, 1 June 2013, 2pp.
Publisher
CHiME
Version
AM (Accepted Manuscript)
Publication date
2013
Notes
This is a conference paper for the 2nd International Workshop on
Machine Listening in Multisource Environments, held on
1st June 2013 in Vancouver, Canada (in conjunction with ICASSP 2013). The conference website is at: http://spandh.dcs.shef.ac.uk/chime_workshop/index.html