Journal of Jilin University (Science Edition)

• Computer Science

Multi-labeled Social Networks Users Personality Prediction Based on Information Gain and Semantic Features

ZHENG Huizhong, ZUO Wanli

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2015-06-23 Online: 2016-05-26 Published: 2016-05-20
  • Contact: ZUO Wanli E-mail: wanli@jlu.edu.cn

Abstract:

To predict the personality of social network users, we propose a method that combines information gain and semantic features to refine users' text information and applies a multilabel classification algorithm for comprehensive prediction. First, word-based features of the text, including sentiment words, part of speech, and tense, are extracted on the basis of information gain, followed by feature selection and weighting. For semantic features, the text content is mapped to ontology concepts and the semantic relatedness is calculated. Then, based on the combined influence of the word-based features and the semantic features, a multilabel classification algorithm performs the personality prediction. The method handles text information from different perspectives and takes full account of the correlation among class labels. Experimental results verify the effectiveness of the proposed method.

Key words: social network, personality prediction, social computing, multilabel classification
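
As an illustration of the information-gain step mentioned in the abstract, the following is a minimal sketch of how the information gain of a single word feature could be computed against one binary personality label. It is not the authors' implementation: the function names, the binary "word occurs in text" feature, and the toy data are all assumptions made for this example. In a multilabel setting, the same score would presumably be computed once per label.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the text'.

    docs   -- list of sets of tokens, one set per user text (assumed format)
    labels -- list of class labels for one personality trait (assumed binary)
    """
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    total = len(labels)
    # Conditional entropy H(Y | X): entropy of each split, weighted by its size.
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


# Hypothetical toy data: four user texts and one binary trait label.
docs = [{"party", "friends"}, {"party", "fun"}, {"book", "quiet"}, {"book", "home"}]
labels = [1, 1, 0, 0]

ig_party = information_gain(docs, labels, "party")  # splits the labels perfectly
ig_fun = information_gain(docs, labels, "fun")      # only partly informative
```

Words with a higher score separate the label classes better, so ranking words by this score is one plausible way to carry out the feature selection that the abstract describes.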

CLC number:

  • TP391