Journal of Jilin University Science Edition ›› 2024, Vol. 62 ›› Issue (2): 320-330.

Speech Recognition Based on Attention Mechanism and Spectrogram Feature Extraction

JIANG Nan1, PANG Yongheng1, GAO Shuang2   

  1. School of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang 110854, China; 2. College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
  • Received: 2023-03-08 Online: 2024-03-26 Published: 2024-03-26

Abstract: To address the problems that the connectionist temporal classification (CTC) model relies on an output independence assumption, depends strongly on a language model, and requires a long training period, we propose a speech recognition method based on the CTC model. First, within the framework of a traditional acoustic model, a spectrogram feature extraction network based on an attention mechanism is trained using prior knowledge, which improves the discriminability and robustness of the speech features. Second, the spectrogram feature extraction network is placed in front of the CTC model, and the number of recurrent neural network layers in the model is reduced for retraining. Test results show that the improved model shortens training time and improves the accuracy of speech recognition.
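
The output independence assumption mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a toy, brute-force view of standard CTC scoring. The blank symbol `-`, the function names, and the two-frame distribution `toy_frames` are all hypothetical choices made for illustration. Under the assumption, the probability of a frame-level path is simply the product of per-frame probabilities, and the probability of a label sequence is the sum over every path that collapses to it.

```python
from itertools import product

BLANK = "-"  # assumed blank symbol for this illustration


def collapse(path):
    """Map a frame-level CTC path to labels: merge repeats, then drop blanks."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != BLANK:
            out.append(sym)
        prev = sym
    return "".join(out)


def path_probability(path, frame_probs):
    """Probability of one path when frames are treated as independent."""
    p = 1.0
    for sym, dist in zip(path, frame_probs):
        p *= dist[sym]
    return p


def label_probability(label, frame_probs):
    """Sum over all paths that collapse to `label` (brute force, tiny inputs only)."""
    symbols = sorted(frame_probs[0])
    total = 0.0
    for path in product(symbols, repeat=len(frame_probs)):
        if collapse(path) == label:
            total += path_probability(path, frame_probs)
    return total


def greedy_decode(frame_probs):
    """Pick the most likely symbol in each frame, then collapse the path."""
    best_path = [max(dist, key=dist.get) for dist in frame_probs]
    return collapse(best_path)


# Hypothetical per-frame output distributions for a two-frame utterance.
toy_frames = [
    {"-": 0.4, "a": 0.6},
    {"-": 0.7, "a": 0.3},
]
```

Because each frame is scored on its own, nothing in this formulation models dependencies between output symbols, which is one reason CTC systems commonly lean on an external language model. Practical systems replace the brute-force sum with a dynamic-programming forward pass.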

Key words: speech recognition, CTC model, recurrent neural network, attention mechanism

CLC Number: 

  • TP391