Adaptive Speech Enhancement Algorithm Based on Improved Multi-metric Optimization

Journal of Jilin University Science Edition ›› 2026, Vol. 64 ›› Issue (2): 403-410.

FU Chunyu, LIU Jun

School of Physics and Electronic Engineering, Northeast Petroleum University, Daqing 163318, Heilongjiang Province, China

Received: 2024-10-15 Online: 2026-03-26 Published: 2026-03-26

Abstract

Multi-metric speech enhancement algorithms are susceptible to outlier interference and to unstable optimization during training. To address these problems, we propose an adaptive speech enhancement algorithm based on a multi-head attention mechanism. First, we introduce a multi-head attention structure into the intermediate layer of the discriminator network, which strengthens the model's joint modeling of local features and the overall structure of the speech spectrum. We combine it with an online knowledge distillation strategy that shares information among multiple generators, thereby improving collaborative optimization under multi-metric conditions. Second, to reduce the impact of outliers on training, we replace the loss function with a logarithmic mean-squared error form, which improves the stability and robustness of the model. Experimental results on the publicly available VoiceBank-DEMAND speech dataset show that the method outperforms existing multi-metric speech enhancement models on speech quality, background noise suppression, and speech intelligibility metrics. Introducing an attention mechanism and a stabilizing loss function can therefore significantly improve the overall performance of multi-metric speech enhancement algorithms.
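The abstract does not state the exact form of the logarithmic mean-squared error loss. As a rough illustration only, the sketch below assumes the common mean squared logarithmic error, computed on non-negative scores such as metric values normalized to [0, 1]. The function names `mse` and `log_mse` and the sample values are hypothetical, not taken from the paper. It compares how a single outlier prediction affects the plain and the logarithmic loss.

```python
import math


def mse(pred, target):
    """Plain mean-squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def log_mse(pred, target):
    """Assumed logarithmic form: mean of (log(1 + p) - log(1 + t))^2.

    Inputs must be non-negative. The logarithm compresses large
    deviations, so one outlier contributes less than in plain MSE.
    """
    return sum(
        (math.log1p(p) - math.log1p(t)) ** 2 for p, t in zip(pred, target)
    ) / len(pred)


# Hypothetical normalized metric scores (illustrative values only).
target = [0.8, 0.7, 0.9, 0.6]
clean_pred = [0.75, 0.72, 0.85, 0.65]
outlier_pred = [0.75, 0.72, 0.85, 5.0]  # last entry is an outlier

for name, pred in (("clean", clean_pred), ("outlier", outlier_pred)):
    print(name, "MSE:", mse(pred, target), "log-MSE:", log_mse(pred, target))
```

Under these assumptions, the outlier inflates the plain MSE far more than the logarithmic variant, which is the kind of robustness the abstract attributes to its loss. The actual loss in the paper may differ in form and in where the logarithm is applied.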

Key words: speech enhancement, frequency domain, multi-head attention mechanism, online knowledge distillation, logarithmic mean-squared error loss

CLC Number: TP391

FU Chunyu, LIU Jun. Adaptive Speech Enhancement Algorithm Based on Improved Multi-metric Optimization [J]. Journal of Jilin University Science Edition, 2026, 64(2): 403-410.

