Journal of Jilin University (Science Edition) ›› 2020, Vol. 58 ›› Issue (2): 364-370.

• Computer Science •

面向大规模数据的特征趋势推理算法

吴春琼   

  1. 厦门大学 信息科学与技术学院, 福建 厦门 361005; 阳光学院 商学院,  福州 350015
  • 收稿日期:2018-11-21 出版日期:2020-03-26 发布日期:2020-03-25
  • 通讯作者: 吴春琼 E-mail:chunqiongwu@163.com

Characteristic Trend Reasoning Algorithm for Large-Scale Data

WU Chunqiong   

  1. School of Information Science and Engineering, Xiamen University, Xiamen 361005, Fujian Province, China; School of Business, Yango University, Fuzhou 350015, China
  • Received: 2018-11-21 Online: 2020-03-26 Published: 2020-03-25
  • Contact: WU Chunqiong E-mail: chunqiongwu@163.com

摘要: 提出一种面向大规模数据的特征趋势推理算法. 首先, 采用Hash函数抽取大规模数据样本, 使用Pam聚类算法和并行Kmeans聚类算法对大规模数据样本进行聚类, 获取最佳聚类结果后, 提取大规模数据聚类的动态特征; 其次, 采用基于特征趋势规则的推理算法, 构建大规模数据特征的趋势规则推理模型, 并通过累计趋势规则方法设计趋势规则算法, 推理大规模数据特征趋势, 解决了推理结果误差较大的问题. 实验结果表明, 该算法对大规模数据特征趋势推理的准确率均值为98.10%, 推理速度增长率为50%, 推理耗时最大均值仅为114.25 s, 能快速准确地完成数据特征趋势推理.

关键词: 大规模数据, 特征, 趋势, 推理, 动态特征, 累计趋势规则

Abstract: A characteristic trend reasoning algorithm for large-scale data is proposed. First, a hash function is used to sample the large-scale data, and the PAM clustering algorithm and a parallel K-means clustering algorithm are used to cluster the samples. After the best clustering result is obtained, the dynamic characteristics of the clustered data are extracted. Second, a reasoning algorithm based on characteristic trend rules is used to construct a trend rule reasoning model for large-scale data characteristics, and a trend rule algorithm is designed using the cumulative trend rule method to infer the trends of these characteristics, which addresses the problem of large errors in reasoning results. Experimental results show that the average accuracy of the proposed algorithm for large-scale data characteristic trend reasoning is 98.10%, the growth rate of reasoning speed is 50%, and the maximum average reasoning time is only 114.25 s, so the algorithm can complete data characteristic trend reasoning quickly and accurately.

Key words: large-scale data, characteristics, trend, reasoning, dynamic characteristics, cumulative trend rule

CLC Number: 

  • TP311