Journal of Jilin University Science Edition ›› 2023, Vol. 61 ›› Issue (3): 651-657.


Binding Prediction Algorithm of HLA-Ⅰ and Polypeptides Based on Pre-trained Model ProtBert

ZHOU Fengfeng1,2, ZHANG Yaqi1   

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China;
  2. Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
  • Received: 2022-01-11 Online: 2023-05-26 Published: 2023-05-26

Abstract: Existing algorithms for predicting the binding affinity between HLA class Ⅰ molecules and polypeptides rely on traditional sequence scoring functions for feature construction. To overcome the limitations of constructing amino acid sequence features with classical machine learning algorithms, we propose ProHLAⅠ, a binding prediction algorithm for HLA-Ⅰ and polypeptides based on the protein pre-trained model ProtBert. The algorithm exploits the compositional similarity between biological sequences and natural-language text, treating an amino acid sequence as a sentence. It extracts features from the HLA-Ⅰ sequence and the polypeptide sequence by combining the pre-trained ProtBert network, BiLSTM encoding, and an attention mechanism, thereby achieving site-independent polypeptide binding prediction for HLA-Ⅰ.
The experimental results show that the model achieves the best performance on two independent test sets.
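
The idea of treating an amino acid sequence as a sentence, and of weighting residues with an attention mechanism, can be illustrated with a minimal Python sketch. The abstract does not give the authors' preprocessing or network details, so everything here is an assumption for illustration: the function names `to_protbert_sentence` and `attention_pool` are hypothetical, the space-separated format with rare residues (U, Z, O, B) mapped to X follows the input convention commonly documented for ProtBert, and the pooling step is a generic softmax attention over per-residue vectors rather than the ProHLAⅠ architecture itself.

```python
import math
import re


def to_protbert_sentence(sequence: str) -> str:
    """Format an amino acid string as a space-separated 'sentence'.

    Assumption: rare residues U, Z, O, B are replaced by X, following the
    input convention commonly used with ProtBert.
    """
    cleaned = re.sub(r"[UZOB]", "X", sequence.strip().upper())
    return " ".join(cleaned)


def attention_pool(vectors, scores):
    """Softmax-weighted average of per-residue feature vectors.

    `vectors` is a list of equal-length float lists (one per residue);
    `scores` holds one unnormalized attention score per residue.
    Returns the pooled vector and the attention weights.
    """
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


# Illustrative 9-residue peptide; each residue becomes one "word".
peptide_sentence = to_protbert_sentence("SLYNTVATL")

# Toy per-residue features with equal scores give a plain average.
pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

In a full model, the per-residue vectors would come from the pre-trained encoder and a BiLSTM layer, and the attention scores would be learned rather than supplied by hand.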

Key words: HLA-Ⅰ binding peptide prediction, natural language processing, attention mechanism, BERT model, bidirectional long short-term memory (BiLSTM) model
