posted on 2022-08-16, 08:16, authored by Abedal-karim Al-Banna, Eran Edirisinghe, Hui Fang
Stuttering is a neurodevelopmental speech disorder that affects 70 million people worldwide, approximately 1% of the world's population. People who stutter (PWS) share common speech symptoms such as blocks, interjections, repetitions, and prolongations. Speech-language pathologists (SLPs) commonly observe these four groups of symptoms to evaluate stuttering severity. The evaluation process is tedious and time-consuming for both SLPs and PWS. This paper therefore proposes a new model for stuttering event detection that may help SLPs evaluate stuttering severity. Our model is based on a log mel spectrogram and a 2D atrous convolutional network designed to learn spectral and temporal features. We rigorously evaluate the performance of our model on two stuttering datasets (UCLASS and FluencyBank) using common speech metrics, i.e. F1-score, recall, and the area under the curve (AUC). Our experimental results indicate that our model outperforms state-of-the-art methods on prolongation, with F1-scores of 52% and 44.5% on the UCLASS and FluencyBank datasets, respectively. For the fluent class, we also gain margins of 5% and 3% on the UCLASS and FluencyBank datasets, respectively.
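To illustrate the term "2D atrous convolution" used in the abstract, the following is a minimal pure-Python sketch of a dilated 2D convolution applied to a toy time-frequency grid. It is not the authors' network: the function name `atrous_conv2d`, the 3x3 kernel of ones, the dilation rates, and the toy input are all assumptions chosen for illustration, and this record does not state the layer configuration used in the paper. The sketch only shows how dilation widens the span of a kernel along the frequency and time axes without adding weights.

```python
def atrous_conv2d(x, kernel, dilation=(1, 1)):
    """Valid (no padding), stride-1 2D atrous convolution on nested lists.

    x: 2D list, rows = mel bands (frequency), columns = frames (time).
    kernel: 2D list of weights (applied as cross-correlation, as in most
        deep learning libraries).
    dilation: (frequency_dilation, time_dilation), the gap between taps.
    """
    kh, kw = len(kernel), len(kernel[0])
    dh, dw = dilation
    # Effective span of the dilated kernel along each axis.
    span_h = (kh - 1) * dh + 1
    span_w = (kw - 1) * dw + 1
    out_h = len(x) - span_h + 1
    out_w = len(x[0]) - span_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("input is smaller than the dilated kernel")
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    # Taps are spaced dh rows and dw columns apart.
                    acc += kernel[a][b] * x[i + a * dh][j + b * dw]
            row.append(acc)
        out.append(row)
    return out


# Toy stand-in for a log mel spectrogram (hypothetical values):
# 5 mel bands x 7 frames, value = 10 * band + frame.
spec = [[10.0 * f + t for t in range(7)] for f in range(5)]
ones = [[1.0] * 3 for _ in range(3)]

dense = atrous_conv2d(spec, ones, dilation=(1, 1))    # 3x3 span
dilated = atrous_conv2d(spec, ones, dilation=(2, 2))  # 5x5 span, same 9 weights

print(len(dense), len(dense[0]))      # 3 5
print(len(dilated), len(dilated[0]))  # 1 3
```

With the same nine weights, the dilated kernel covers a 5x5 region of the grid instead of 3x3, which is the general reason atrous convolutions are used to capture wider spectral and temporal context at a fixed parameter count.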
School
Science
Department
Computer Science
Published in
2022 13th International Conference on Information and Communication Systems (ICICS)
Pages
252 - 256
Source
2022 13th International Conference on Information and Communication Systems (ICICS)