Real-time speaker identification for video conferencing
Sara Saravi
Iffat Zafar
Eran Edirisinghe
Roy Kalawsky
2134/6493
https://repository.lboro.ac.uk/articles/conference_contribution/Real-time_speaker_identification_for_video_conferencing/9405293
Automatic speaker identification in a videoconferencing environment allows conference attendees to focus their
attention on the conference itself, rather than on manually identifying which channel is active and who is speaking
within that channel. In this work we present a real-time, audio-coupled, video-based approach to this problem, with
the focus on the video analysis. The system is driven by the need to detect a talking human using computer vision
algorithms. The initial stage consists of a face detector, followed by a lip-localization algorithm that segments the
lip region. We propose a novel approach to lip movement detection based on image registration using the Coherent
Point Drift (CPD) algorithm, a technique for rigid and non-rigid registration of point sets. We provide experimental
results that analyse the performance of the algorithm on real-life videoconferencing data.
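
The registration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how lip movement is decided from the registration result, so everything below is an assumption made for illustration. The sketch restricts CPD to a translation-only transform (full CPD also covers rotation, scaling and non-rigid motion), and it uses a hypothetical motion score, the mean nearest-neighbour distance left over once head translation has been registered away. The names cpd_translation, lip_motion_score and ellipse are invented here, and the synthetic ellipse stands in for lip contour points that a lip-localization stage would supply.

```python
import math

def cpd_translation(X, Y, w=0.1, max_iter=100, tol=1e-9):
    """Translation-only CPD sketch: find t so that GMM centroids Y + t fit X.

    X and Y are lists of (x, y) tuples; w is an assumed outlier weight, 0 < w < 1.
    """
    N, M, D = len(X), len(Y), 2
    tx, ty = 0.0, 0.0
    # Initial variance: mean squared distance over all point pairs, per dimension.
    sigma2 = sum((a - c) ** 2 + (b - d) ** 2 for a, b in X for c, d in Y) / (D * N * M)
    sigma2 = max(sigma2, tol)
    for _ in range(max_iter):
        # Constant contributed by the uniform outlier component of the mixture.
        outlier = (2 * math.pi * sigma2) ** (D / 2) * (w / (1 - w)) * M / N
        # E-step: P[n][m] is the posterior that centroid m generated point n.
        P = []
        for a, b in X:
            k = [math.exp(-((a - c - tx) ** 2 + (b - d - ty) ** 2) / (2 * sigma2))
                 for c, d in Y]
            denom = sum(k) + outlier
            P.append([v / denom for v in k])
        Np = sum(sum(row) for row in P)
        if Np < 1e-12:  # no usable correspondences left; keep the last estimate
            break
        # M-step: translation is the difference of the posterior-weighted means.
        mx = my = cx = cy = 0.0
        for (a, b), row in zip(X, P):
            for (c, d), p in zip(Y, row):
                mx += p * a
                my += p * b
                cx += p * c
                cy += p * d
        tx, ty = (mx - cx) / Np, (my - cy) / Np
        # Variance update from the weighted residuals under the new translation.
        resid = sum(p * ((a - c - tx) ** 2 + (b - d - ty) ** 2)
                    for (a, b), row in zip(X, P) for (c, d), p in zip(Y, row))
        new_sigma2 = max(resid / (Np * D), tol)
        converged = abs(new_sigma2 - sigma2) < tol
        sigma2 = new_sigma2
        if converged:
            break
    return (tx, ty), sigma2

def lip_motion_score(prev_pts, curr_pts):
    """Hypothetical score: shape change remaining after translation is removed."""
    (tx, ty), _ = cpd_translation(curr_pts, prev_pts)
    moved = [(c + tx, d + ty) for c, d in prev_pts]
    return sum(min(math.hypot(a - c, b - d) for c, d in moved)
               for a, b in curr_pts) / len(curr_pts)

def ellipse(a, b, n=12):
    """Synthetic stand-in for a lip contour: n points on an ellipse."""
    return [(a * math.cos(2 * math.pi * i / n), b * math.sin(2 * math.pi * i / n))
            for i in range(n)]

if __name__ == "__main__":
    closed = ellipse(20, 6)                            # mouth closed
    shifted = [(x + 1.5, y - 1.0) for x, y in closed]  # head moved, lips still
    opened = ellipse(20, 12)                           # mouth opened
    print("head shift only:", lip_motion_score(closed, shifted))
    print("mouth opening:  ", lip_motion_score(closed, opened))
```

In this sketch a pure head shift should leave a score close to zero, while a change in lip shape leaves a clearly larger one. Any threshold between the two would have to be tuned on real data.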
2010-07-14 15:15:53
Speaker identification
Coherent point drift
Lip movement detection
Information and Computing Sciences not elsewhere classified