Journal of Jilin University (Information Science Edition), 2024, Vol. 42, Issue (2): 312-317.

Interpolation Algorithm for Missing Values of Incomplete Big Data in Spatial Autoregressive Model

LIU Xiaoyan, ZHAI Jianguo

  1. School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China
  • Received: 2023-03-23 Online: 2024-04-10 Published: 2024-04-12
  • Corresponding author: ZHAI Jianguo (born 1997), male, from Songyuan, Jilin; master's student at Kunming University of Science and Technology; research interest: computer applications. Tel: 86-13179016606; E-mail: 187914780872@qq.com
  • About the author: LIU Xiaoyan (born 1964), female, from Kunming; associate professor at Kunming University of Science and Technology, Ph.D.; research interest: software engineering. Tel: 86-15308715434; E-mail: 1489846182@qq.com
  • Funding:
    Natural Science Foundation of Yunnan Province (202224143456)

Abstract: Because incomplete big data has an irregular structure, imputing its missing values is computationally expensive and yields low accuracy. To address this, a missing-value imputation algorithm for incomplete big data under a spatial autoregressive model is proposed. A transfer learning algorithm with dynamic weights filters redundant data out of the original data, separates abnormal data from normal data, and extracts the incomplete data, which is then repaired by least squares regression. Missing-value imputation is divided into three types: first-order spatial autoregressive model imputation, spatial autoregressive model imputation, and multiple imputation. The repaired data is imputed into the appropriate positions according to the actual situation, completing the missing-value imputation of the incomplete big data. Experimental results show that the proposed method imputes missing values well.

Key words: transfer learning, incomplete big data, imputation of missing values, spatial regression model, data correction

CLC number:

  • TP391
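
To make the first-order spatial autoregressive imputation named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the model y = rho * W y + eps with a row-normalized spatial weight matrix W, and it estimates rho by ordinary least squares on the observed sites. That estimator is a simplification chosen to echo the least squares step in the abstract; maximum likelihood is the more common choice for this model, and the abstract does not say which estimator the paper uses. The function names `row_normalize` and `sar1_impute`, the iteration scheme, and the toy data are all hypothetical.

```python
import numpy as np

def row_normalize(W):
    """Scale each row of a spatial weight matrix to sum to 1 (all-zero rows stay zero)."""
    W = np.asarray(W, dtype=float)
    sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, sums, out=np.zeros_like(W), where=sums != 0)

def sar1_impute(y, W, n_iter=50, tol=1e-8):
    """Fill NaNs in y under an assumed first-order SAR model y = rho * W y + eps.

    rho is estimated by least squares (no intercept) on the observed sites,
    then each missing site is set to rho * (W y). Both steps repeat until
    the imputed values stop changing. Assumes at least one observed site
    has a non-zero spatial lag.
    """
    y = np.asarray(y, dtype=float)
    W = row_normalize(W)
    missing = np.isnan(y)
    filled = y.copy()
    filled[missing] = np.nanmean(y)           # crude starting value
    rho = 0.0
    for _ in range(n_iter):
        lag = W @ filled                       # spatial lag W y
        x, t = lag[~missing], filled[~missing]
        rho = float(x @ t / (x @ x))           # least squares slope through the origin
        new_vals = rho * lag[missing]
        done = np.max(np.abs(new_vals - filled[missing]), initial=0.0) < tol
        filled[missing] = new_vals
        if done:
            break
    return filled, rho

# Toy example: four sites on a line, neighbours are the adjacent sites.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
y = np.array([2.0, np.nan, 2.0, 2.0])
filled, rho = sar1_impute(y, W)
```

The sketch omits an intercept and covariates, so real data would typically be centered first or modeled with additional regressors. The transfer learning filter and the multiple imputation branch described in the abstract are not covered here.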