Journal of Jilin University (Science Edition) ›› 2020, Vol. 58 ›› Issue (6): 1399-1406.

A Data-Based Personalized Mixed Recommendation Method for GitHub Projects

HE Kaiqi1, MA Yuxiao2, ZHANG Yan3, LIU Huaxiao3

  1. Graduate School, Jilin University, Changchun 130012, China;
    2. College of Engineering, Northeastern University, Boston 02115, USA;
    3. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Online: 2020-11-18 Published: 2020-11-26
  • Corresponding author: LIU Huaxiao liuhuaxiao@jlu.edu.cn

Abstract: We combine the two traditional memory-based collaborative filtering methods and propose a data-based personalized mixed recommendation method for GitHub projects. The method computes similar users dynamically, which keeps the recommendations personalized, and it achieves recommendation quality close to that of the item-based method using only a small set of similar users. In addition, by building an inverted table and applying K-means clustering, the method alleviates, to some extent, the data sparsity and cold-start problems that the original methods face on GitHub data, where the numbers of users and projects are very large but their overlap is low. Comparative experiments against the traditional methods verify the effectiveness and superiority of the proposed method.

Key words: data analysis, recommendation system, collaborative filtering technology, cold start

CLC Number: 

  • TP311
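
The abstract mentions building an inverted table to cope with sparse user-project data. The sketch below is only an illustration of that general idea, not the authors' algorithm: it assumes each user is represented by a set of starred projects, uses cosine similarity on those sets, and omits the K-means step entirely. The function names (`build_inverted_table`, `similar_users`, `recommend`) and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def build_inverted_table(user_projects):
    """Map each project to the set of users who starred it."""
    inverted = defaultdict(set)
    for user, projects in user_projects.items():
        for project in projects:
            inverted[project].add(user)
    return inverted


def similar_users(target, user_projects, inverted, k=2):
    """Return the k users most similar to `target` (cosine on star sets).

    Only users sharing at least one project with the target are scored,
    which is what the inverted table makes cheap on sparse data.
    """
    overlap = defaultdict(int)
    for project in user_projects[target]:
        for other in inverted[project]:
            if other != target:
                overlap[other] += 1
    n_target = len(user_projects[target])
    scored = [
        (count / sqrt(n_target * len(user_projects[other])), other)
        for other, count in overlap.items()
    ]
    # Highest similarity first; ties broken by user name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(other, score) for score, other in scored[:k]]


def recommend(target, user_projects, inverted, k=2, n=3):
    """Recommend up to n unseen projects, weighted by neighbor similarity."""
    seen = user_projects[target]
    scores = defaultdict(float)
    for other, sim in similar_users(target, user_projects, inverted, k):
        for project in user_projects[other] - seen:
            scores[project] += sim
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [project for project, _ in ranked[:n]]


# Toy data (hypothetical): user -> set of starred projects.
stars = {
    "alice": {"numpy", "pandas"},
    "bob": {"numpy", "pandas", "scipy"},
    "carol": {"numpy", "flask"},
    "dave": {"react"},
}
inverted = build_inverted_table(stars)
print(recommend("alice", stars, inverted))  # ['scipy', 'flask']
```

Because candidates are looked up through the inverted table, users with no project in common with the target (here `dave`) are never compared at all, which is the main saving when overlap between users is low.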