Journal of Jilin University (Engineering and Technology Edition), 2012, Vol. 42, Issue (01): 234-239.


Automatic annotation for medical texts based on hidden topic and semantic tree

LI Bo1, WEN Dun-wei2, WANG Ke1, LIU Jing-xin3   

  1. School of Communication Engineering, Jilin University, Changchun 130012, China;
    2. School of Computing and Information Systems, Athabasca University, Athabasca T9S3A3, Canada;
    3. China-Japan Union Hospital, Jilin University, Changchun 130033, China
  • Received: 2011-06-27 Online: 2012-01-01 Published: 2012-01-01

Abstract:

Medical texts lack a quantifiable data structure, so processing methods based on a text keyword model are impractical. Building on a study of the latent semantic associations between words and of keyword tree structures, a semantic analysis model based on a latent semantic tree was constructed for medical text data mining. Hidden topics were then combined with latent semantic analysis, and a text processing method was designed based on latent Dirichlet allocation and the latent semantic tree model; it generates readable automatic annotations for different types of medical texts. These annotations are less subjective, more accurate and more readable than those produced by the keyword model method, and they can assist doctors with text annotation and classification, reducing their workload. Experimental results show that the method can be applied to medical imaging findings to form diagnostic opinions and to patient medical records to produce symptomatic prescriptions. The semantic matching degree of the annotations is 67.7%, and the readability of the text can reach 60.02%.
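The hidden-topic step named in the abstract can be illustrated with a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling. This is not the authors' implementation: the semantic tree model is omitted, and the function names `lda_gibbs` and `annotate`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made for illustration. The annotation here is simply the most probable words of a document's dominant topic.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word, vocab), where doc_topic[d][k] and
    topic_word[k][w] are smoothed probability estimates.
    All names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables: document-topic, topic-word, and per-topic totals.
    n_dk = [[0] * num_topics for _ in docs]
    n_kw = [[0] * V for _ in range(num_topics)]
    n_k = [0] * num_topics

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][wid] -= 1
                n_k[k] -= 1
                # Unnormalized full conditional for each candidate topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wid] + beta) / (n_k[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wid] += 1
                n_k[k] += 1

    doc_topic = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    topic_word = [
        [(n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in range(V)]
        for k in range(num_topics)
    ]
    return doc_topic, topic_word, vocab


def annotate(doc_index, doc_topic, topic_word, vocab, top_n=3):
    """Return the top_n most probable words of the document's dominant topic."""
    dist = doc_topic[doc_index]
    k = max(range(len(dist)), key=dist.__getitem__)
    ranked = sorted(range(len(vocab)), key=lambda w: topic_word[k][w], reverse=True)
    return [vocab[w] for w in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical toy corpus; real input would be tokenized medical texts.
    toy_docs = [
        ["lung", "nodule", "ct", "lung", "shadow"],
        ["fracture", "bone", "xray", "bone"],
        ["lung", "ct", "nodule"],
    ]
    doc_topic, topic_word, vocab = lda_gibbs(toy_docs, num_topics=2)
    print(annotate(0, doc_topic, topic_word, vocab))
```

With a fixed `seed` the sampler is reproducible, but which words end up in which topic depends on the corpus and the hyperparameters, so the printed annotation should be read as an illustration only.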

Key words: information processing, medical texts, automatic annotation, latent Dirichlet allocation, latent semantic analysis, semantic tree

CLC Number: 

  • TN919.8


