Journal of Jilin University (Information Science Edition) ›› 2022, Vol. 40 ›› Issue (2): 188-197.

Similarity Analysis of Petroleum Drilling Literature Based on Improved Siamese BERT Networks

ZHANG Yan 1a , WANG Bin 1a , YANG Qingchuan 2 , LI Wei 1b   

  1. 1a. School of Computer and Information Technology; 1b. College of Petroleum Engineering, Northeast Petroleum University, Daqing 163318, China; 2. Data Management Center, Anda Qingxin Oilfield Development Company Limited, Anda 151413, China
  • Received: 2021-07-14 Online: 2022-06-11 Published: 2022-06-11

Abstract: Nonstandard keywords and fuzzy semantics in petroleum drilling literature bias retrieval results. To address this problem, an attention pooling method based on the Siamese BERT (Bidirectional Encoder Representations from Transformers) network model is proposed to improve the accuracy of literature similarity evaluation. First, web crawler technology is used to collect and clean petroleum drilling literature. Then, the samples are labeled according to five evaluation indexes of the petroleum drilling literature data set. Finally, in light of the data characteristics of this data set, attention pooling over the Siamese BERT networks is used to represent the overall semantics of multi-feature samples. Experimental results show that, compared with conventional pooling methods, the proposed method improves similarity measurement on petroleum drilling literature and shows some ability to generalize.
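
The abstract does not give the exact formulation of the attention pooling layer, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes token vectors have already been produced by a shared (Siamese) encoder, scores each token against a hypothetical learned vector `query`, and uses cosine similarity to compare the two pooled vectors. The names `attention_pool`, `cosine_similarity`, `siamese_similarity`, and `query` are illustrative assumptions.

```python
import math


def attention_pool(tokens, mask, query):
    """Collapse token vectors into one vector by softmax-weighted averaging.

    tokens: list of token vectors (lists of floats), one per position
    mask:   1 for a real token, 0 for padding (padding is ignored)
    query:  hypothetical learned scoring vector, same length as a token vector
    """
    kept = [tok for tok, m in zip(tokens, mask) if m]
    if not kept:
        raise ValueError("mask excludes every token")
    # Score each real token by its dot product with the query vector.
    scores = [sum(t * q for t, q in zip(tok, query)) for tok in kept]
    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of token vectors, dimension by dimension.
    return [sum(w * tok[d] for w, tok in zip(weights, kept))
            for d in range(len(query))]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def siamese_similarity(tokens_a, mask_a, tokens_b, mask_b, query):
    """Pool both inputs with the same parameters, then compare them."""
    pooled_a = attention_pool(tokens_a, mask_a, query)
    pooled_b = attention_pool(tokens_b, mask_b, query)
    return cosine_similarity(pooled_a, pooled_b)
```

With an all-zero `query`, every real token receives the same weight and the function reduces to conventional mean pooling, the kind of baseline the abstract compares against. A non-zero `query` lets some tokens contribute more than others to the pooled representation.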

Key words: literature similarity; bidirectional encoder representation from transformers (BERT) network; petroleum drilling literature; attention pooling

CLC Number: 

  • TP391