Geometrical-based lip-reading using template probabilistic multi-dimension dynamic time warping

Ibrahim, M.Z.; Mulvaney, David

File(s) under permanent embargo

Reason: Unsuitable version.

journal contribution

Posted on 2015-08-07, 15:03. Authored by M.Z. Ibrahim and David Mulvaney.

By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. In this paper, we present a geometrical-based automatic lip-reading system. The lip region is extracted from images using conventional techniques, while the lip contour itself is extracted using a novel combination of border following and convex hull approaches. Classification is carried out using an enhanced dynamic time warping technique that operates in multiple dimensions, together with a template probability technique that compensates for differences in the way words are uttered in the training set. The performance of the new system has been assessed on recognition of the English digits 0 to 9 as available in the CUAVE database. The experimental results compared favorably with those of existing lip-reading approaches, achieving a word recognition accuracy of up to 71% with the visual information obtained from estimates of lip height, width and their ratio.
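
As a rough illustration of the classification idea described in the abstract, the sketch below aligns two sequences of per-frame lip features (height, width and their ratio) with a plain multi-dimensional dynamic time warping distance and picks the label of the nearest template. It is a minimal sketch under stated assumptions, not the authors' method: the function names (`lip_features`, `md_dtw_distance`, `classify`), the Euclidean frame distance and the nearest-template rule are all hypothetical choices made for illustration, and the paper's template probability technique is not reproduced here.

```python
import math


def lip_features(height, width):
    """Build one per-frame feature vector: (height, width, height/width).

    Hypothetical helper; the feature scaling used in the paper is not known here.
    """
    return (height, width, height / width)


def md_dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors.

    All dimensions are warped together: the cost of matching two frames is
    the Euclidean distance between their full feature vectors.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i frames of a
    # with the first j frames of b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b
                cost[i][j - 1],      # frame of b repeated against a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label whose template sequence is closest to the query.

    templates maps a word label to a list of example feature sequences.
    """
    best_label, best_distance = None, math.inf
    for label, sequences in templates.items():
        for sequence in sequences:
            distance = md_dtw_distance(query, sequence)
            if distance < best_distance:
                best_label, best_distance = label, distance
    return best_label
```

Because the warping path may repeat frames, an utterance spoken more slowly than its template can still align with zero cost, which is the property that makes dynamic time warping attractive for comparing words uttered at different speeds.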

History

School

Mechanical, Electrical and Manufacturing Engineering

Published in

Journal of Visual Communication and Image Representation

Volume

30

Pages

219 - 233

Citation

IBRAHIM, M. and MULVANEY, D.J., 2015. Geometrical-based lip-reading using template probabilistic multi-dimension dynamic time warping. Journal of Visual Communication and Image Representation, 30, pp.219-233

Publisher

Version

VoR (Version of Record)

Publisher statement

This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at: https://creativecommons.org/licenses/by-nc-nd/4.0/

Acceptance date

2015-04-27

Publication date

2015-05-05

Notes

This paper is in closed access.

DOI

https://doi.org/10.1016/j.jvcir.2015.04.013

ISSN

1047-3203

Publisher version

http://dx.doi.org/10.1016/j.jvcir.2015.04.013

Language

en

Keywords

Lip reading; Lip geometry; Mouth detection; Skin segmentation; Convex hull; Multi dimension dynamic time warping; Template probabilistic; OpenCV; Mechanical Engineering not elsewhere classified; Artificial Intelligence and Image Processing

Licence

CC BY-NC-ND 4.0

