Journal of Jilin University Science Edition ›› 2026, Vol. 64 ›› Issue (2): 394-402.


Low-Resource Speech Keyword Spotting Based on Joint Knowledge Transfer

HUANG Jinxin1, HE Qianhua1, ZHENG Ruowei1, YANG Mingru1, WANG Wenwu2   

  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510641, China;
    2. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, UK
  • Received:2024-08-14 Online:2026-03-26 Published:2026-03-26

Abstract: Aiming at the problem of low accuracy in speech keyword spotting under low-resource conditions, we proposed a detection method combining unsupervised feature extraction with supervised model parameter transfer. Firstly, a deep feature extraction network was trained on large-scale unlabeled speech data, and the extracted features were fused with acoustic spectrogram features to enhance their robustness to varying acoustic environments. Secondly, the decision network was pre-trained on rich labeled data from the source domain, and decision knowledge was introduced through parameter transfer to solve the model convergence difficulty caused by insufficient training data in the target domain. Finally, the entire network was fine-tuned with a very small amount of target-domain data. Experimental results on Hakka and Cantonese datasets show that this method significantly outperforms single transfer strategies: in the Hakka task, the false rejection rate is reduced to 11.77%, and the maximum term-weighted value is improved to 0.7346. These results demonstrate that the proposed method can effectively alleviate data scarcity and significantly improve detection performance for low-resource languages.
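The transfer scheme described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the feature dimensions, the fusion-by-concatenation operator, and the tiny logistic-regression stand-in for the decision network are hypothetical simplifications, not the paper's actual architecture. The sketch shows the three steps: fuse learned features with spectrogram features, pre-train the decision model on plentiful source-domain labels, then initialize the target model from the source parameters and fine-tune on a very small target-domain set.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse(deep_feats, spec_feats):
    # Feature fusion by concatenation along the feature axis
    # (a simple, common choice; the paper's exact fusion operator
    # is not specified in the abstract).
    return np.concatenate([deep_feats, spec_feats], axis=1)

def train_logreg(X, y, w=None, b=0.0, lr=0.1, epochs=200):
    # Tiny logistic-regression stand-in for the "decision network".
    # Passing in (w, b) implements parameter transfer: training
    # starts from the source-domain solution instead of scratch.
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid scores
        g = p - y                                # gradient of log-loss
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

# Source domain: rich labeled data (synthetic stand-in).
Xs = rng.normal(size=(500, 8))
w_true = rng.normal(size=8)
ys = (Xs @ w_true > 0).astype(float)
w_src, b_src = train_logreg(Xs, ys)

# Target domain: only 20 labeled examples, shifted distribution.
Xt = rng.normal(size=(20, 8)) + 0.3
yt = (Xt @ w_true > 0).astype(float)

# Parameter transfer + fine-tuning: initialize from the source
# parameters, then adapt briefly on the scarce target data.
w_ft, b_ft = train_logreg(Xt, yt, w=w_src.copy(), b=b_src, epochs=50)
```

In this toy setting the fine-tuned model inherits the source-domain decision knowledge through its initialization, so a few epochs on 20 target examples suffice, which is the intuition behind transferring decision-network parameters when target-domain data are too scarce for stable training from scratch.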

Key words: speech keyword spotting, deep learning, low-resource, joint knowledge transfer

CLC Number: 

  • TP391