Journal of Jilin University (Engineering and Technology Edition) ›› 2015, Vol. 45 ›› Issue (2): 596-599. doi: 10.13229/j.cnki.jdxbgxb201502038

Standard literature language model based on deep learning

LI Di-fei1, TIAN Di1, HU Xiong-wei2

  1. College of Instrumentation & Electrical Engineering, Jilin University, Changchun 130021, China;
  2. Standardization Administration Information Center, Standardization Administration of the People's Republic of China, Beijing 100088, China
  • Received: 2013-11-07 Online: 2015-04-01 Published: 2015-04-01
  • Corresponding author: TIAN Di (born 1958), male, professor and doctoral supervisor. Research interests: measurement and control technology and software for analytical instruments. E-mail: tiandi@jlu.edu.cn
  • About the author: LI Di-fei (born 1986), male, PhD candidate. Research interest: artificial intelligence. E-mail: lidf12@mails.jlu.edu.cn
  • Funding:
    Special Project on Innovation Methods of the Ministry of Science and Technology (2011IM010400).

Abstract: To address natural language processing for Chinese standard literature, the Hierarchical Log-Bilinear statistical language model, originally developed for English, is improved and adapted into a model suited to Chinese. The approach uses deep neural networks and combines unsupervised with supervised learning: stacked restricted Boltzmann machines are trained to learn word vectors (distributed representations of words), and the trained word vectors are then fed into a feedforward neural network for supervised training on the content of Chinese standard literature. The proposed model is evaluated on more than one million bibliographic records of standard literature. The experimental results show that the method effectively improves the language model's ability to learn the probability distribution of words.

Key words: artificial intelligence, natural language processing, statistical language model, deep neural networks, restricted Boltzmann machines, distributed representations
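
The two-stage procedure described in the abstract (unsupervised pretraining with stacked restricted Boltzmann machines, followed by supervised training of a feedforward network) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The class and function names (`RBM`, `pretrain_stack`, `train_softmax`), the toy six-word vocabulary, the layer sizes, the learning rates, and the use of one-step contrastive divergence are all assumptions made for illustration. The supervised stage is reduced to a single softmax output layer, and the Hierarchical Log-Bilinear output structure is not reproduced.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary RBM trained with one-step contrastive divergence (CD-1). Hypothetical helper."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden probabilities given the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        # Mean squared reconstruction error, useful only as a rough progress signal.
        return float(np.mean((v0 - v1) ** 2))

def pretrain_stack(data, layer_sizes, epochs=50, seed=0):
    """Greedy layer-wise pretraining: each RBM is trained on the layer below."""
    rng = np.random.default_rng(seed)
    rbms, x = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(x.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(x)
        rbms.append(rbm)
        x = rbm.hidden_probs(x)  # representation passed to the next layer
    return rbms, x

def train_softmax(features, labels, n_classes, epochs=200, lr=0.5):
    """Supervised stage (simplified): one softmax layer trained by gradient descent."""
    n, d = features.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Toy data (assumption): each word is a one-hot vector and the target is the next word id.
vocab_size = 6
context_ids = np.arange(vocab_size)
next_ids = (context_ids + 1) % vocab_size
one_hot = np.eye(vocab_size)[context_ids]

rbms, features = pretrain_stack(one_hot, layer_sizes=[8, 4])
W_out, b_out = train_softmax(features, next_ids, vocab_size)
```

In this sketch the hidden activations of the top RBM play the role of the learned word vectors, and the pretrained RBM weights are left unchanged during the supervised stage. A fuller treatment would typically fine-tune them jointly with the output layer.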

CLC number:

  • TP183