Predicting speech perception in older listeners with sensorineural hearing loss using automatic speech recognition

Fontan, Lionel; Cretin-Maitenaz, Tom; Fullgrabe, Christian

journal contribution

posted on 2021-01-22, 16:27 authored by Lionel Fontan, Tom Cretin-Maitenaz, Christian Fullgrabe

The objective of this study was to provide proof of concept that the speech intelligibility in quiet of unaided older hearing-impaired (OHI) listeners can be predicted by automatic speech recognition (ASR). Twenty-four OHI listeners completed three speech-identification tasks using speech materials of varying linguistic complexity and predictability (i.e., logatoms, words, and sentences). An ASR system was first trained on different speech materials and then used to recognize the same speech stimuli presented to the listeners but processed to mimic some of the perceptual consequences of age-related hearing loss experienced by each of the listeners: the elevation of hearing thresholds (by linear filtering), the loss of frequency selectivity (by spectral smearing), and loudness recruitment (by raising the amplitude envelope to a power). Independently of the size of the lexicon used in the ASR system, strong to very strong correlations were observed between human and machine intelligibility scores. However, large root-mean-square errors (RMSEs) were observed for all conditions. The simulation of frequency selectivity loss had a negative impact on the strength of the correlation and the RMSE. The highest correlations and smallest RMSEs were found for logatoms, suggesting that the prediction system mostly reflects the functioning of the peripheral part of the auditory system. In the case of sentences, the prediction of human intelligibility was significantly improved by taking into account cognitive performance. This study demonstrates for the first time that ASR, even when trained on intact independent speech material, can be used to estimate trends in speech intelligibility of OHI listeners.
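The abstract reports strong correlations together with large RMSEs. The following minimal sketch illustrates how the two measures can diverge; it is not the authors' analysis code, and the function names and the score values are hypothetical, chosen only so that the machine scores track the human scores with a constant offset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical percent-correct scores (illustrative only, not data from the study).
human_scores = [90.0, 80.0, 70.0, 60.0, 50.0]
# Assumed machine scores: same ranking and spacing, but 30 points lower.
machine_scores = [60.0, 50.0, 40.0, 30.0, 20.0]

r = pearson_r(human_scores, machine_scores)
e = rmse(human_scores, machine_scores)
print(f"Pearson r = {r:.3f}, RMSE = {e:.1f} percentage points")
```

In this toy case the correlation is essentially perfect while the RMSE equals the 30-point offset, which is consistent with the abstract's framing that such a system estimates trends in intelligibility rather than absolute scores.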

Funding

Institute of Advanced Studies of Loughborough University (UK)

History

School

Sport, Exercise and Health Sciences

Published in

Trends in Hearing

Volume

24

Pages

1 - 16

Publisher

SAGE Publications

Version

VoR (Version of Record)

Rights holder

Publisher statement

This is an Open Access Article. It is published by SAGE under the Creative Commons Attribution-NonCommercial 4.0 International Licence (CC BY-NC 4.0). Full details of this licence are available at: https://creativecommons.org/licenses/by-nc/4.0/

Acceptance date

2020-03-02

Publication date

2020-04-01

Copyright date

2020

DOI

https://doi.org/10.1177/2331216520914769

ISSN

2331-2165

eISSN

2331-2165

Publisher version

https://doi.org/10.1177/2331216520914769

Language

en

Depositor

Dr Christian Fullgrabe. Deposit date: 20 January 2021

Keywords

automatic speech recognition; speech intelligibility; age-related hearing loss; suprathreshold auditory processing; cognition

Licence

CC BY-NC 4.0
