Journal of Jilin University (Information Science Edition) ›› 2023, Vol. 41 ›› Issue (2): 367-373.

Duplicate Data Elimination of Network Single-Channel Based on Minimum Hash

WU Jianfei 1, ZHOU Luming 1, LIU Xiaoqiang 2

  1. Cancer Hospital Affiliated of Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430079, China; 2. College of Applied Engineering, Henan University of Science and Technology, Sanmenxia 472000, China
  • Received: 2022-04-24 Online: 2023-04-13 Published: 2023-04-17

Abstract: Eliminating duplicate data is an indispensable step in ensuring efficient network operation. However, this process is susceptible to interference from factors such as signal strength, network devices, and router performance. Therefore, a minimum-hash-based algorithm for eliminating duplicate data in a single network channel is proposed. First, the hash function of the hashing algorithm is used to cluster the single-channel network data. A supervised discriminant projection algorithm is then applied to reduce the dimensionality of the clustered data. Finally, algebraic signatures are used to estimate the data so that the computing cost is minimized, and a minimum hash tree is constructed to generate check values and update the deduplication tags. The duplicate data in the single channel are completely eliminated by this double-layer elimination mechanism. Experimental results show that the algorithm has a short execution time and low computation and storage costs.
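
As a rough illustration of the minimum-hash idea named in the abstract, the following sketch removes near-duplicate payloads by comparing MinHash signatures. It is a generic textbook MinHash, not the authors' algorithm: the clustering, supervised discriminant projection, algebraic signature, and hash tree stages are not modeled, and the names `shingles`, `minhash_signature`, `estimated_similarity`, and `drop_duplicates`, the 4-byte shingle width, the 64 hash functions, and the 0.9 threshold are all assumptions chosen for the example.

```python
import hashlib


def shingles(payload: bytes, width: int = 4) -> set[bytes]:
    """Split a payload into overlapping byte windows (assumed width: 4)."""
    if len(payload) <= width:
        return {payload}
    return {payload[i:i + width] for i in range(len(payload) - width + 1)}


def minhash_signature(payload: bytes, num_hashes: int = 64) -> tuple[int, ...]:
    """For each salted hash function, keep the minimum hash over all shingles."""
    items = shingles(payload)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item, digest_size=8, salt=salt).digest(), "big")
            for item in items))
    return tuple(signature)


def estimated_similarity(sig_a: tuple[int, ...], sig_b: tuple[int, ...]) -> float:
    """Fraction of matching positions, which estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def drop_duplicates(packets: list[bytes], threshold: float = 0.9) -> list[bytes]:
    """Keep a packet only if no already-kept packet is at least `threshold` similar."""
    kept: list[tuple[tuple[int, ...], bytes]] = []
    for packet in packets:
        sig = minhash_signature(packet)
        if all(estimated_similarity(sig, other) < threshold for other, _ in kept):
            kept.append((sig, packet))
    return [packet for _, packet in kept]
```

Each incoming packet is compared against every kept signature here, which is quadratic in the worst case; a real system would presumably index signatures (for example with a tree or banding scheme) to avoid that cost.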

Key words: hash function, original cluster center, nearby local graph, constrained objective function, algebraic signature, hash tree, network channel

CLC Number: TP391