Journal of Jilin University (Engineering and Technology Edition) ›› 2022, Vol. 52 ›› Issue (3): 626-632. doi: 10.13229/j.cnki.jdxbgxb20200798

• Computer Science and Technology •

Hadoop-based local timing link prediction algorithm across social networks

Su-ming KANG, Ye-e ZHANG

  1. School of Computer and Network Engineering, Shanxi Datong University, Datong 037009, China
  • Received: 2020-10-19 Online: 2022-03-01 Published: 2022-03-08
  • About the author: KANG Su-ming (1975-), male, associate professor, master's degree. Research interests: computer networks, information systems. E-mail: ksmdt@163.com
  • Funding:
    National Natural Science Foundation of China (61672331); Shanxi Province Higher Education Teaching Reform and Innovation Project (J2019160)

Abstract:

To improve the accuracy and stability of local temporal link prediction across social networks, a Hadoop-based local temporal link prediction algorithm is proposed. The algorithm selects six node similarity indices for cross-social-network data and uses MapReduce, the core component of Hadoop, to design a parallel computing model that partitions the massive data within the networks and processes it in parallel, reducing computational complexity. The local temporal link prediction algorithm built on this MapReduce model then applies the selected similarity indices to obtain a prediction score for each node pair in the network, which yields the local temporal link prediction across social networks. Experimental results show that the proposed algorithm achieves high prediction accuracy, keeps its predictions stable, and delivers good overall prediction performance.

Key words: computer application, cross social networks, Hadoop, link prediction, MapReduce operation model, parallel processing, similarity index
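The abstract refers to six node similarity indices, and the tables on this page name them as CN, Salton, Jaccard, RA, AA and LAS. The sketch below computes the five that have widely used textbook definitions (common neighbors, Salton, Jaccard, resource allocation, Adamic-Adar) for one node pair. It is a minimal illustration under assumptions: the function names and the toy graph are hypothetical, the definitions are the standard ones rather than formulas confirmed by this page, and LAS is left out because the page does not define it.

```python
import math

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def similarity_scores(adj, x, y):
    """Return five standard local similarity indices for the distinct nodes x and y."""
    nx, ny = adj[x], adj[y]
    common = nx & ny
    kx, ky = len(nx), len(ny)
    union = len(nx | ny)
    return {
        "CN": len(common),
        "Salton": len(common) / math.sqrt(kx * ky) if kx and ky else 0.0,
        "Jaccard": len(common) / union if union else 0.0,
        # A common neighbor of two distinct nodes has degree >= 2, so log(k) > 0.
        "AA": sum(1.0 / math.log(len(adj[z])) for z in common),
        "RA": sum(1.0 / len(adj[z]) for z in common),
    }

# Toy graph (illustrative data, not taken from the paper).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
scores = similarity_scores(adj, "a", "d")
```

In this toy graph, nodes `a` and `d` are not linked but share the neighbors `b` and `c`, so every index assigns the pair a positive score; ranking all unlinked pairs by such a score is the usual way a similarity index is turned into a link predictor.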

CLC number:

  • TP311

Fig. 1

Execution process of the MapReduce parallel computing model
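The figure itself is not reproduced on this page, so the following is only a sketch of how a map, shuffle and reduce pipeline can produce node-pair scores. It simulates the three phases in a single Python process rather than on Hadoop, and it uses one common decomposition (each node emits every pair of its neighbors) that is assumed here, not confirmed as the paper's design. All function names and the toy adjacency data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(node, neighbors):
    """Mapper: every pair of neighbors of `node` has `node` as a common neighbor."""
    for x, y in combinations(sorted(neighbors), 2):
        yield (x, y), node

def shuffle(mapped):
    """Group mapper output by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups

def reduce_phase(pair, common_neighbors, degree):
    """Reducer: turn the common neighbors of one node pair into CN and RA scores."""
    cn = len(common_neighbors)
    ra = sum(1.0 / degree[z] for z in common_neighbors)
    return pair, {"CN": cn, "RA": ra}

# Toy adjacency lists (illustrative data, not taken from the paper).
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
degree = {node: len(nb) for node, nb in adjacency.items()}
mapped = (kv for node, nb in adjacency.items() for kv in map_phase(node, nb))
results = dict(reduce_phase(pair, zs, degree) for pair, zs in shuffle(mapped).items())
```

Because each mapper call touches only one adjacency list and each reducer call touches only one node pair, both phases can be split across workers. The output also contains pairs that are already linked (for example `("b", "c")`), so a prediction step would filter those out before ranking.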

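Table 1

Topological features of the experimental networks

Network  Edges  Nodes  Clustering coefficient  Diameter  Average degree  Average shortest path
a        2731   187    0.607                   5         27.59           2.13
b        2137   286    0.297                   4         14.36           2.35
c        6247   2364   0.022                   14        9.74            6.34
d        6583   4930   0.101                   45        2.56            18.88
e        1495   78     0.285                   6         33.57           2.05
f        2115   321    0.738                   5         12.70           2.63
g        11682  5011   0.377                   14        2.38            4.99
h        1364   131    0.306                   7         30.16           2.08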

Fig. 2

Comparison of the similarity measurement accuracy of the indices

Table 2

AUC values of the proposed algorithm on the experimental networks

Network  CN     Salton  Jaccard  RA     AA     LAS
a        0.931  0.925   0.926    0.948  0.943  0.895
b        0.814  0.826   0.831    0.811  0.819  0.771
c        0.752  0.745   0.749    0.726  0.754  0.772
d        0.833  0.844   0.844    0.841  0.829  0.811
e        0.781  0.787   0.786    0.782  0.787  0.905
f        0.948  0.925   0.913    0.952  0.936  0.903
g        0.937  0.938   0.936    0.937  0.934  0.939
h        0.852  0.886   0.836    0.815  0.871  0.856
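This page does not say how the AUC values were computed. In link prediction, AUC is commonly read as the probability that a held-out (test) link receives a higher score than a nonexistent link, with ties counted as one half. The sketch below implements that reading by comparing every test score with every nonexistent-link score. The function name and the score lists are hypothetical, and the evaluation protocol is an assumption, not the paper's confirmed procedure.

```python
def auc_exhaustive(test_scores, nonexistent_scores):
    """AUC as the chance that a held-out link outscores a nonexistent link.

    Ties contribute 0.5, so a predictor that cannot separate the two groups
    scores 0.5 and a perfect one scores 1.0.
    """
    higher = ties = 0
    for s_test in test_scores:
        for s_non in nonexistent_scores:
            if s_test > s_non:
                higher += 1
            elif s_test == s_non:
                ties += 1
    total = len(test_scores) * len(nonexistent_scores)
    return (higher + 0.5 * ties) / total

# Illustrative scores only (not taken from the paper).
test_scores = [0.9, 0.6, 0.4]
nonexistent_scores = [0.5, 0.4, 0.1, 0.0]
auc = auc_exhaustive(test_scores, nonexistent_scores)
```

With these toy scores, 10 of the 12 comparisons favor the test link and 1 is a tie, giving (10 + 0.5) / 12 = 0.875. On large networks the exhaustive comparison is usually replaced by random sampling of pairs.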

Table 4

AUC values of the resource allocation algorithm on the experimental networks

Network  CN     Salton  Jaccard  RA     AA     LAS
a        0.711  0.719   0.726    0.758  0.734  0.755
b        0.612  0.633   0.598    0.647  0.667  0.681
c        0.599  0.526   0.568    0.587  0.602  0.631
d        0.627  0.634   0.668    0.706  0.654  0.669
e        0.582  0.581   0.567    0.577  0.561  0.556
f        0.787  0.724   0.776    0.792  0.713  0.706
g        0.711  0.752   0.726    0.734  0.762  0.725
h        0.691  0.654   0.668    0.692  0.603  0.611

Table 5

Standard deviation of the AUC of the proposed algorithm over 40 runs

Network  CN     Salton  Jaccard  RA     AA     LAS
a        0.044  0.038   0.046    0.047  0.045  0.033
b        0.052  0.047   0.051    0.044  0.049  0.036
c        0.019  0.025   0.023    0.028  0.026  0.022
d        0.011  0.015   0.009    0.013  0.018  0.016
e        0.046  0.059   0.062    0.036  0.053  0.096
f        0.038  0.045   0.036    0.033  0.037  0.041
g        0.002  0.004   0.003    0.003  0.005  0.006
h        0.033  0.039   0.029    0.036  0.038  0.041
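A run-to-run standard deviation of this kind can be obtained by repeating a randomized AUC estimate and summarizing the spread. The sketch below repeats a sampled AUC estimate 40 times with different seeds and reports the mean and the sample standard deviation. It rests on several assumptions: the page does not state what varies between the paper's 40 runs (for example, the train/test split), nor whether the sample or population standard deviation was used, and the function name and scores are hypothetical.

```python
import random
import statistics

def auc_sampled(test_scores, nonexistent_scores, n, rng):
    """Estimate AUC from n random (test link, nonexistent link) comparisons."""
    total = 0.0
    for _ in range(n):
        s_test = rng.choice(test_scores)
        s_non = rng.choice(nonexistent_scores)
        if s_test > s_non:
            total += 1.0
        elif s_test == s_non:
            total += 0.5  # ties count as half a win
    return total / n

# Illustrative scores only (not taken from the paper).
test_scores = [0.9, 0.6, 0.4]
nonexistent_scores = [0.5, 0.4, 0.1, 0.0]

# 40 independent runs, each with its own seeded generator for reproducibility.
runs = [
    auc_sampled(test_scores, nonexistent_scores, 1000, random.Random(seed))
    for seed in range(40)
]
mean_auc = statistics.mean(runs)
std_auc = statistics.stdev(runs)  # sample standard deviation across runs
```

A small standard deviation relative to the mean indicates that the estimate barely changes between runs, which is the sense in which the abstract describes the prediction as stable.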