This study examines the efficiency of traditional and modern methods of automatic speech recognition (ASR). The article analyzes the general structure of a machine speech recognition algorithm, in particular its language model, acoustic model and vocabulary data; it outlines the historical development of ASR and presents its most innovative approaches. The authors conducted an experiment in which several ASR application programming interfaces were compared on a fixed set of test cases. Four ASR systems built on different algorithms of acoustic and language modelling were tested: two of them apply a single approach to all elements of their structure, while in the other two the acoustic and language models are based on different algorithms, so the systems in the collection all differ in structure. Each system processes the data set through Python programs; the output is standardized and compared with pre-transcribed reference data using word error rate (WER). The results lead to the conclusion that the efficiency of an ASR system depends on the optimization of its elements and on training with an applicable data set, while neural network and statistical methods are equally useful in acoustic and language modelling tasks.
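The evaluation step described above (standardizing each system's output and scoring it against a reference transcript with WER) can be illustrated with a minimal sketch. The authors' own Python programs are not reproduced here; the function names `normalize` and `wer`, and the normalization rules (lowercasing, stripping punctuation), are assumptions made for illustration only. WER is computed in the usual way, as the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference.

```python
import re


def normalize(text: str) -> list:
    """Standardize a transcript: lowercase, drop punctuation, split into words.

    These rules are an assumption; the original study does not list its own.
    """
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)  # keep letters, digits, apostrophes
    return text.split()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = normalize(reference)
    hyp = normalize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word ("the") is missing from the hypothesis: 1 error / 6 words.
    print(round(wer("The cat sat on the mat.", "the cat sat on mat"), 3))
```

Because the count of errors is divided by the reference length, WER can exceed 1.0 when the hypothesis contains many inserted words, which is worth keeping in mind when comparing systems on short utterances.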
Valuitseva, I.I., & Philatov, I.Y. (2019). Methods of language and acoustic modelling in speech recognition. Issues of Applied Linguistics, 36, 7-31. doi: https://doi.org/10.25076/vpl.36.01