Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue (5): 1817-1822. doi: 10.13229/j.cnki.jdxbgxb20200370

Large-scale semantic text overlapping region retrieval based on deep learning

Li-li DONG, Dan YANG, Xiang ZHANG

  1. School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, China
  • Received: 2020-05-26 Online: 2021-09-01 Published: 2021-09-16
  • Contact: Xiang ZHANG E-mail: kkkuujm@163.com; xd220210@163.com

Abstract:

Overlapping region recognition is an active topic in natural language processing that still requires further study. To address the low accuracy and recall of traditional text overlapping region retrieval methods, a large-scale semantic text overlapping region retrieval method based on deep learning is proposed. A hybrid model is constructed by combining a sparse autoencoder with a deep belief network, and a text classifier is designed on top of it. The classifier consists of three main components: text preprocessing, feature learning, and classification retrieval. The text is first preprocessed by de-noising, word segmentation, and stop word removal. The learned text features are then fed to a softmax regression classifier, which yields the classification and retrieval results for the overlapping regions. Experimental results show that the method achieves high accuracy and recall, indicating that it is reliable and robust.
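
The preprocessing and softmax regression steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature learning stage (sparse autoencoder plus deep belief network) is replaced here by plain bag-of-words counts, word segmentation is approximated by whitespace splitting, and the stop word list, toy documents, and all function names are assumptions made for illustration.

```python
import math
import re

# Hypothetical stop word list; the paper's actual list is not given.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to"}

def preprocess(text):
    """De-noise (drop non-letters), segment on whitespace, remove stop words."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]

def vectorize(tokens, vocab):
    """Bag-of-words counts; a stand-in for the learned text features."""
    vec = [0.0] * len(vocab)
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(x, weights, bias):
    """Class probabilities for one feature vector."""
    scores = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, bias)]
    return softmax(scores)

def predict(x, weights, bias):
    """Index of the most probable class."""
    probs = predict_proba(x, weights, bias)
    return max(range(len(probs)), key=probs.__getitem__)

def train_softmax(features, labels, n_classes, lr=0.3, epochs=200):
    """Fit softmax regression by batch gradient descent on cross-entropy."""
    dim, n = len(features[0]), len(features)
    weights = [[0.0] * dim for _ in range(n_classes)]
    bias = [0.0] * n_classes
    for _ in range(epochs):
        grad_w = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, y in zip(features, labels):
            probs = predict_proba(x, weights, bias)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                grad_b[k] += err
                for j in range(dim):
                    grad_w[k][j] += err * x[j]
        for k in range(n_classes):
            bias[k] -= lr * grad_b[k] / n
            for j in range(dim):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, bias

# Toy corpus (invented): label 1 = overlapping-region text, 0 = other.
docs = [
    "The model overlaps the text region",
    "Text region overlap retrieval",
    "Stock prices fell in the market",
    "The market and stock index",
]
labels = [1, 1, 0, 0]

token_lists = [preprocess(d) for d in docs]
vocab = {tok: i for i, tok in enumerate(sorted({t for ts in token_lists for t in ts}))}
features = [vectorize(ts, vocab) for ts in token_lists]
weights, bias = train_softmax(features, labels, n_classes=2)
predictions = [predict(x, weights, bias) for x in features]
```

For Chinese text, the whitespace split would have to be replaced by a proper word segmenter, and the bag-of-words vectors by the features produced by the hybrid model.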

Key words: deep learning, semantic text, overlapping region retrieval, deep belief network, feature learning

CLC Number: TP391

Fig.1 Hybrid model structure
Fig.2 Text classifier
Fig.3 Text preprocessing process
Fig.4 Comparison of running process of divergence method
Fig.5 Text feature classification accuracy results
Fig.6 Comparison results of retrieval accuracy
Fig.7 Comparison of retrieval recall rates
