Journal of Jilin University (Science Edition) ›› 2024, Vol. 62 ›› Issue (1): 122-131.


  • Corresponding author: ZOU Yuanjun, E-mail: zouyj@ccucm.edu.cn

Compression Algorithms for Automatic Speech Recognition Models: A Survey

SHI Xiaohu1, YUAN Yuping2, LV Guilin3, CHANG Zhiyong4, ZOU Yuanjun5   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China; 2. Management Center of Big Data and Network, Jilin University, Changchun 130012, China; 3. Intelligent Network Development Institute, R&D Institute of China FAW Group Co., Ltd., Changchun 130011, China; 4. College of Biological and Agricultural Engineering, Jilin University, Changchun 130022, China; 5. School of Medical Information, Changchun University of Chinese Medicine, Changchun 130117, China
  • Received: 2023-02-23 Online: 2024-01-26 Published: 2024-01-26


Abstract: With the development of deep learning technology, the number of parameters in automatic speech recognition models has become increasingly large, which raises the computing overhead, storage requirements and power consumption of the models and makes them difficult to deploy on resource-constrained devices. It is therefore of great value to compress deep-learning-based automatic speech recognition models, reducing their size while preserving the original performance as much as possible. Aiming at the above problems, this paper comprehensively surveys the main work in this field in recent years, grouping it into knowledge distillation, model quantization, low-rank decomposition, network pruning, parameter sharing and combined models, and provides a systematic review offering alternative solutions for the deployment of models on resource-constrained devices.
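Of the compression techniques enumerated in the abstract, low-rank decomposition is the simplest to illustrate in isolation. The sketch below is not taken from the survey; it is a minimal NumPy example, assuming an arbitrary 512x512 layer weight and a hypothetical rank of 32, showing how factoring a dense weight matrix via truncated SVD into two thin factors shrinks its parameter count:

```python
import numpy as np

# Hypothetical example: approximate a dense 512x512 weight W by a rank-32
# factorization W ≈ U_r @ V_r, cutting parameters from 512*512 to 2*512*32.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512))

U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 32
U_r = U[:, :r] * s[:r]   # fold the singular values into the left factor
V_r = Vt[:r, :]
W_approx = U_r @ V_r     # low-rank reconstruction of W

orig_params = W.size                       # 262144
compressed_params = U_r.size + V_r.size    # 32768, an 8x reduction
```

In a network, the single dense layer would then be replaced by two smaller consecutive linear layers with weights V_r and U_r; the rank r trades accuracy against compression.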

Key words: speech recognition, model compression, knowledge distillation, model quantization, low-rank decomposition, network pruning, parameter sharing
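Model quantization, another of the keyword techniques, can likewise be sketched in a few lines. The example below is illustrative only and not from the survey: it assumes a simple symmetric post-training scheme that maps float32 weights to int8 with a single per-tensor scale, then dequantizes to check the reconstruction error:

```python
import numpy as np

# Hypothetical symmetric post-training int8 quantization of a weight tensor.
rng = np.random.default_rng(1)
w = rng.standard_normal(1000).astype(np.float32)

scale = np.abs(w).max() / 127.0                          # one scale per tensor
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale                     # dequantize

max_err = np.abs(w - w_hat).max()   # bounded by scale / 2 for in-range values
```

Storing `q` instead of `w` reduces weight storage by 4x (int8 vs. float32), at the cost of a rounding error no larger than half the quantization step.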

CLC number: TP391