Journal of Jilin University (Information Science Edition) ›› 2021, Vol. 39 ›› Issue (6): 720-725.

Annotation System of File Secret Information for Power Grid Enterprise Based on Transformer

DONG Tian, LI Guang, YANG Zhenyu, ZHANG Bo, YU Bo, WANG Wei   

  1. General Committee Office, State Grid Jilin Electric Power Supply Company, Changchun 130021, China
  • Received: 2021-10-15  Online: 2021-12-01  Published: 2021-12-02

Abstract: Given the large volume of enterprise files, manually labeling secret information is time-consuming and laborious, and the classification standard is affected by human subjectivity. Automatic classification of enterprise documents is therefore an important problem that urgently needs to be solved in enterprise confidentiality management. To address it, a Transformer-based secret information annotation system for the files of power grid enterprises is proposed. The system comprises file preprocessing, Chinese word segmentation, word vector construction, and secret information annotation. The proposed model is trained and tested on a data set constructed from the internal core commercial secret files and ordinary commercial secret files of State Grid Jilin Electric Power Corporation. The accuracy is 97.79% and the recall is 99.08%, indicating that the model recognizes secret information accurately. Only a small amount of secret information is left unmarked, which effectively prevents the leakage of secret information.
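
The abstract reports accuracy and recall but does not say how they are computed. As a minimal sketch, assuming each token carries a binary label (1 = secret, 0 = not secret) and that the standard definitions apply, the two metrics could be computed as follows. The function name `accuracy_and_recall` and the sample labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, recall) for binary labels, where 1 marks secret information.

    Assumption: labels are per token and binary; the paper may use a
    different labeling granularity or metric definition.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("label sequences must not be empty")

    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # secret, marked
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # not secret, unmarked
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # secret, missed

    accuracy = (tp + tn) / len(pairs)
    # Recall is the share of truly secret tokens that were marked.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical example labels, for illustration only.
gold_labels = [1, 0, 1, 1, 0, 0, 1, 0]
pred_labels = [1, 0, 1, 0, 0, 1, 1, 0]
acc, rec = accuracy_and_recall(gold_labels, pred_labels)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")
```

Under these definitions, a high recall corresponds to the abstract's observation that little secret information is left unmarked, since every missed secret token lowers recall.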

Key words: secret information annotation, deep learning, Chinese word segmentation, word embedding, enterprise secrets

CLC Number: TP305