Journal of Jilin University (Information Science Edition), 2022, Vol. 40, Issue (1): 71-76.

Hierarchical Redundancy Elimination Optimization of Cloud Storage Based on Pearson Correlation Algorithm

YANG Hui

  1. College of Big Data Engineering, Kaili University, Kaili 556000, Guizhou, China
  • Received: 2021-04-28 Online: 2022-01-25 Published: 2022-01-29
  • About the author: YANG Hui (born 1984), female (Miao ethnicity), from Kaili, Guizhou; associate professor at Kaili University; research interests include databases and teaching resource platforms. (Tel) 86-15885802584 (E-mail) hh198412@163.com
  • Funding:
    Smart Tourism Information Service Cloud Platform Research Fund Project (Qiandongnan Kehe J Zi [2018] No. 008)

Abstract: To address the low redundancy elimination efficiency and low recall rate of traditional cloud storage, a hierarchical redundancy elimination optimization method based on the Pearson correlation algorithm is proposed. A distance matrix of the redundant information is constructed from the similarity measures of its attribute distributions in the hierarchical cloud storage structure, and the redundant information is classified by computing the similarity between items. By analyzing the structure of each class of redundant information and applying data dimension reduction constraints and the central limit principle, an objective function for compressing the redundant information feature space is constructed within the hierarchical structure, and the features of the hierarchical redundant information are extracted. Based on the redundancy elimination optimization hyperplane, the distances from the redundant information sample points to the positive-class and negative-class hyperplanes are calculated. The Pearson correlation algorithm is used to define a fuzzy factor, from which the feature effectiveness in the hierarchical structure is defined; an effectiveness function for the redundant information features is then constructed, achieving hierarchical redundancy elimination optimization of cloud storage. Experimental results show that the proposed method improves redundancy elimination efficiency while also performing well in recall rate.

Key words: Pearson correlation algorithm, cloud storage, hierarchical, redundant information, redundancy elimination optimization

CLC Number:

  • TP312
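
The abstract names two building blocks without giving formulas: a distance matrix derived from similarity values, and the Pearson correlation coefficient. The following is a minimal sketch of how those two pieces could fit together, not the paper's method. It assumes each stored item is summarized by a numeric feature vector, uses 1 - r as the distance, and flags pairs below a cutoff as likely redundant. The function names, the 1 - r distance, and the threshold value are all hypothetical choices made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def distance_matrix(vectors):
    """Symmetric matrix of pairwise distances.

    Assumption: distance is defined as 1 - r. The paper's actual
    distance definition is not given in the abstract.
    """
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 1.0 - pearson(vectors[i], vectors[j])
            dist[i][j] = dist[j][i] = d
    return dist


def redundant_pairs(vectors, threshold=0.05):
    """Index pairs (i, j), i < j, whose distance is below the threshold.

    The threshold is a hypothetical tuning parameter, not a value
    reported in the paper.
    """
    dist = distance_matrix(vectors)
    n = len(vectors)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dist[i][j] < threshold]


if __name__ == "__main__":
    # Made-up feature vectors: item 1 is a scaled copy of item 0.
    items = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 5]]
    print(redundant_pairs(items))
```

In this toy input, only the first two vectors are perfectly correlated, so they are the only pair flagged. The later stages the abstract mentions (feature space compression, hyperplane distances, the fuzzy factor, and the effectiveness function) are not sketched here because the abstract does not specify them.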