Journal of Jilin University (Information Science Edition) ›› 2024, Vol. 42 ›› Issue (3): 509-515.
HAN Xi, LIANG Kai, YUE Yu
Abstract: To address low lip-contour detection accuracy and poor visual speech synthesis quality, a Tibetan-driven visual speech synthesis algorithm based on audio matching is proposed. First, the algorithm extracts the short-term energy and short-term zero-crossing rate of the Tibetan driving speech signal, establishes its short-term autocorrelation function, and extracts feature information from the signal to obtain the pitch track of the Tibetan speech. Second, a temporal-spatial analysis model of the lips is established to analyze how the lip contour changes during pronunciation, and lip-contour features are extracted by principal component analysis. Finally, the correlation between the audio features and the lip-contour features is obtained with an input-output hidden Markov model, and Tibetan-driven visual speech is synthesized on the basis of audio matching. Experimental results show that the proposed method achieves high lip-contour detection accuracy and good visual speech synthesis quality.
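The first stage described in the abstract (short-term energy, short-term zero-crossing rate, and a short-term autocorrelation function used for pitch tracking) can be sketched as below. This is a minimal illustration of those standard signal-processing steps, not the paper's implementation; the frame length, hop size, and pitch search range are assumed values.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def short_time_features(x, frame_len=400, hop=160):
    """Short-term energy and zero-crossing rate, one value per frame."""
    frames = frame_signal(x, frame_len, hop)
    energy = np.sum(frames ** 2, axis=1)
    # Fraction of adjacent sample pairs whose signs differ.
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return energy, zcr

def autocorr_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Pitch estimate from the peak of the short-term autocorrelation.

    The autocorrelation of a voiced frame peaks at the lag equal to the
    pitch period; the search is restricted to lags in [fs/fmax, fs/fmin].
    """
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + np.argmax(r[lo:hi])
    return fs / lag
```

Evaluating `autocorr_pitch` frame by frame over the energy-gated (voiced) frames yields the kind of pitch track the abstract refers to; for example, a pure 200 Hz sine sampled at 16 kHz produces an autocorrelation peak at lag 80 and an estimate of 200 Hz.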
Key words: audio matching, short-term autocorrelation function, spatiotemporal analysis model, principal component analysis, visual speech synthesis
HAN Xi, LIANG Kai, YUE Yu. Research on Tibetan Driven Visual Speech Synthesis Algorithm Based on Audio Matching[J]. Journal of Jilin University (Information Science Edition), 2024, 42(3): 509-515.
URL: http://xuebao.jlu.edu.cn/xxb/EN/Y2024/V42/I3/509