Loughborough University
ABMSS_Accepted_Xiyu_Shi.pdf (15.3 MB)

Adaptive blind moving source separation based on intensity vector statistics

journal contribution
posted on 2019-08-12, 12:30 authored by Areeb Riaz, Xiyu Shi, Ahmet Kondoz
This paper presents a novel approach to blind moving source separation that detects, tracks and separates speakers in real time using intensity vector direction (IVD) statistics. It updates the unmixing system parameters swiftly to cope with time-variant mixing conditions. Denoising is carried out to extract reliable speaker estimates using von Mises modelling of the IVD measurements in space and IIR filtering of the IVD distribution in time. Peaks in the IVD distribution are assigned location expectation values to check for consistency, and peaks with high location expectation are declared active speakers. The location expectation algorithm caters for natural pauses during speech delivery. Speaker movements are tracked by spatially isolating the detected peaks using time-variant regions of interest. As a result, the proposed moving source separation system is capable of blindly detecting, tracking and separating moving speakers. A real-time demonstration has been developed with the proposed system pipeline, allowing users to listen to active speakers in any desired combination. The system has the advantage of using a small coincident microphone array to separate any number of moving sources from the first-order Ambisonics signals, under the assumption that the source signals are W-disjoint orthogonal. Being nearly closed-form, the proposed system does not require parameter initialization or iterative convergence.
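The spatial and temporal stages the abstract describes can be illustrated with a toy sketch: each frame's IVD distribution over azimuth is modelled as a mixture of von Mises lobes (one per active speaker), the distribution is smoothed across frames with a first-order IIR filter, and peaks above a threshold are declared active speakers. All parameters below (bin count, concentration κ, smoothing factor α, detection threshold, source azimuths) are illustrative assumptions, not values from the paper:

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, 25):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def von_mises_pdf(theta, mu, kappa):
    """von Mises density on the circle with mean mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

N_BINS = 72  # 5-degree azimuth bins (illustrative resolution)
CENTERS = [2.0 * math.pi * (i + 0.5) / N_BINS for i in range(N_BINS)]

def frame_histogram(active_azimuths, kappa=8.0):
    """One frame's IVD distribution: a von Mises lobe per active speaker."""
    return [sum(von_mises_pdf(c, mu, kappa) for mu in active_azimuths)
            for c in CENTERS]

def iir_smooth(frames, alpha=0.9):
    """First-order IIR filtering of the IVD distribution over time."""
    state = [0.0] * N_BINS
    for h in frames:
        state = [alpha * s + (1.0 - alpha) * x for s, x in zip(state, h)]
    return state

def peaks(hist, thresh):
    """Local maxima above a threshold, declared as active speakers."""
    return [CENTERS[i] for i in range(N_BINS)
            if hist[i] > thresh
            and hist[i] >= hist[i - 1] and hist[i] >= hist[(i + 1) % N_BINS]]

# Two hypothetical speakers at 62.5 and 202.5 degrees; the second pauses
# for the last few frames, but IIR smoothing keeps its peak alive.
s1, s2 = math.radians(62.5), math.radians(202.5)
frames = [frame_histogram([s1, s2]) for _ in range(20)] + \
         [frame_histogram([s1]) for _ in range(5)]
detected = peaks(iir_smooth(frames), thresh=0.3)  # two peaks, near 62.5 and 202.5 degrees
```

The temporal smoothing is what lets the detector ride out short silences: the paused speaker's peak decays by α per frame rather than vanishing at once, which is the same role the location expectation mechanism plays for natural pauses in the paper.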

Funding

MulSys Limited through the European Commission (Grant Number 287896)

History

School

  • Loughborough University London

Published in

Speech Communication

Volume

113

Pages

1 - 14

Publisher

Elsevier B.V.

Version

  • AM (Accepted Manuscript)

Rights holder

© Elsevier B.V.

Publisher statement

This paper was accepted for publication in the journal Speech Communication and the definitive published version is available at https://doi.org/10.1016/j.specom.2019.08.001.

Acceptance date

2019-08-07

Publication date

2019-08-08

Copyright date

2019

ISSN

0167-6393

Language

  • en

Depositor

Dr Xiyu Shi