Real-time speaker identification for video conferencing

Saravi, Sara; Zafar, Iffat; Edirisinghe, Eran; Kalawsky, Roy

conference contribution

posted on 2010-07-14, 15:15 authored by Sara Saravi, Iffat Zafar, Eran Edirisinghe, Roy Kalawsky

Automatic speaker identification in a videoconferencing environment will allow attendees to focus on the conference itself rather than on manually working out which channel is active and who is speaking within that channel. In this work we present a real-time, audio-coupled, video-based approach to this problem, with the emphasis on the video analysis. The system is driven by the need to detect a talking human using computer vision algorithms. The first stage is a face detector, followed by a lip-localization algorithm that segments the lip region. We propose a novel approach to lip movement detection based on image registration using the Coherent Point Drift (CPD) algorithm, a technique for rigid and non-rigid registration of point sets. We provide experimental results that analyse the performance of the algorithm when monitoring real-life videoconferencing data.
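
To illustrate how point-set registration could be used to detect lip movement, the following is a minimal sketch and not the authors' implementation. It assumes the lip-localization stage supplies a set of 2-D contour points for each frame, registers the previous frame's points to the current frame's points with a simplified rigid CPD (rotation and translation only, no scale and no outlier term), and treats the residual left after alignment as a movement score. The use of the residual as a score, the function names rigid_cpd, lip_motion_score and synthetic_lip_contour, and the elliptical test contours are all assumptions made for illustration.

```python
import numpy as np


def _sq_dists(A, B):
    """Squared Euclidean distances between rows of A (M x D) and B (N x D)."""
    return ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)


def rigid_cpd(X, Y, max_iter=200, tol=1e-10):
    """Register source points Y (M x D) onto target points X (N x D).

    Simplified rigid Coherent Point Drift: rotation R and translation t only
    (assumption: no scale factor and no uniform outlier component).
    Returns the transformed copy of Y, plus R and t.
    """
    D = X.shape[1]
    R, t = np.eye(D), np.zeros(D)
    d2 = _sq_dists(Y, X)                 # M x N squared distances
    sigma2 = d2.mean() / D               # initial isotropic variance
    for _ in range(max_iter):
        # E-step: soft correspondences, each column normalised over sources.
        num = np.exp(-d2 / (2.0 * sigma2))
        den = np.maximum(num.sum(axis=0, keepdims=True), np.finfo(float).tiny)
        P = num / den
        # M-step: weighted centroids, then rotation from an SVD.
        Np = P.sum()
        mu_x = P.sum(axis=0) @ X / Np
        mu_y = P.sum(axis=1) @ Y / Np
        A = (X - mu_x).T @ P.T @ (Y - mu_y)
        U, _, Vt = np.linalg.svd(A)
        C = np.eye(D)
        C[-1, -1] = np.linalg.det(U @ Vt)   # keep a proper rotation
        R = U @ C @ Vt
        t = mu_x - R @ mu_y
        # Variance update: weighted mean squared residual per dimension,
        # floored so the next E-step never divides by zero.
        d2 = _sq_dists(Y @ R.T + t, X)
        new_sigma2 = max((P * d2).sum() / (Np * D), 1e-12)
        converged = abs(sigma2 - new_sigma2) < tol
        sigma2 = new_sigma2
        if converged:
            break
    return Y @ R.T + t, R, t


def lip_motion_score(prev_pts, curr_pts):
    """Mean nearest-neighbour distance left after rigid alignment.

    Head motion (rotation, translation) is absorbed by the registration,
    so a large score suggests the lip shape itself has changed.
    """
    aligned, _, _ = rigid_cpd(curr_pts, prev_pts)
    return float(np.sqrt(_sq_dists(aligned, curr_pts).min(axis=1)).mean())


def synthetic_lip_contour(half_width, half_height, n=24):
    """Hypothetical stand-in for a segmented lip contour: points on an ellipse."""
    a = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.column_stack([half_width * np.cos(a), half_height * np.sin(a)])


if __name__ == "__main__":
    closed = synthetic_lip_contour(2.0, 0.5)
    opened = synthetic_lip_contour(2.0, 1.0)
    print("closed -> opened score:", lip_motion_score(closed, opened))
```

In a sketch like this, a frame pair would be flagged as "talking" when the score exceeds a threshold; the threshold would have to be tuned on real data and is deliberately left unspecified here. A non-rigid CPD variant, which the abstract also mentions, would model the lip deformation directly rather than only measuring what a rigid fit cannot explain.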

History

School

Science

Department

Computer Science

Citation

SARAVI, S. ... et al., 2010. Real-time speaker identification for video conferencing. IN: Kehtarnavaz, N. (ed.), Real-Time Image and Video Processing 2010, Proceedings of SPIE, 7724, 77240D, 10pp.

Publisher

Version

VoR (Version of Record)

Publication date

2010

Notes

Copyright 2010 Society of Photo-Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic electronic or print reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. This paper can also be found at: http://dx.doi.org/10.1117/12.854846