Automatic speech recognition: from study to practice

educational resource

posted on 2016-09-29, 12:49 authored by Sara Sharifzadeh

Today, automatic speech recognition (ASR) is widely used in fields such as robotics, multimedia, medicine and industry. Although a great deal of research has been carried out in this field over the past decades, much work remains to be done. To start working in this area, a thorough knowledge of ASR systems, including their weak points and open problems, is essential. In addition, practical experience reinforces theoretical understanding. With this in mind, in this master's thesis we first review the principal structure of standard HMM-based ASR systems from a technical point of view, covering feature extraction, acoustic modeling, language modeling and decoding. We then discuss the most significant challenges in ASR systems, which concern either the characteristics of internal components or external factors that affect system performance. Furthermore, we implement a Spanish-language recognizer using the HTK toolkit. Finally, we suggest two open research lines for future work, drawn from a study of different sources in the field of ASR.

History

School

Mechanical, Electrical and Manufacturing Engineering

Rights holder

Sara Sharifzadeh

Publisher statement

This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at: https://creativecommons.org/licenses/by-nc-nd/4.0/

Publication date

2010

Notes

A Master's Thesis. Submitted to the Department of Microelectronics in partial fulfilment of the requirements for the degree of Master of Science in Multimedia Technologies at the Universitat Autònoma de Barcelona.

Language

en

Qualification name

MSc

Qualification level

Masters

Keywords

Automatic speech recognition; Mel-frequency cepstral coefficients; DCT transform; Speech enhancement; Feature extraction; Linear predictive coding; Hidden Markov Models; Mechanical Engineering not elsewhere classified

Licence

CC BY-NC-ND 4.0
