Journal of Jilin University Science Edition

An Active Semi-supervised Clustering Algorithm Based on Seeds Set and Pairwise Constraints

CHEN Zhiyu1, WANG Huijun1, HU Ming2, LIU Gang1   

  1. College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China; 2. Office of Principal, Jilin Vocational and Technical Institute Communications, Changchun 130012, China
  • Received: 2016-09-09 Online: 2017-05-26 Published: 2017-05-31
  • Contact: LIU Gang E-mail: lg@ccut.edu.cn

Abstract: To address the problem that supervision information in semi-supervised clustering algorithms is often insufficient and carries little information, we propose a semi-supervised clustering algorithm based on active learning. First, we design a semi-supervised clustering algorithm based on a Seeds set and pairwise constraints (SC-Kmeans), which uses labeled data and pairwise constraints to guide the clustering process of the Kmeans algorithm. Second, we introduce active learning into SC-Kmeans in order to select more informative supervision information at low cost and thereby improve the clustering accuracy of the SC-Kmeans algorithm. Finally, simulation experiments were performed on standard data sets from the UCI machine learning repository. The experimental results show that the proposed algorithm achieves better clustering results and effectively improves clustering accuracy.
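
The abstract does not give the details of SC-Kmeans, so the following is only a minimal Python sketch of the general idea it names: K-means whose centres are initialised from a labeled seed set and whose assignment step respects must-link and cannot-link pairs. The function names, the greedy nearest-feasible-centre rule, and the fallback when no cluster is feasible are all assumptions made for illustration, not the authors' method. The active-learning selection step is not shown, and the sketch assumes every cluster has at least one seed.

```python
import numpy as np


def violates(i, c, labels, must_link, cannot_link):
    """True if placing point i in cluster c breaks a constraint
    against a point that already has a label (-1 means unassigned)."""
    for a, b in must_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] != -1 and labels[other] != c:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if i == a else a
            if labels[other] == c:
                return True
    return False


def seeded_constrained_kmeans(X, seeds, must_link, cannot_link, n_iter=50):
    """Hypothetical sketch, not the paper's SC-Kmeans.

    X           : (n, d) float array
    seeds       : dict {point index: cluster label in 0..k-1}
    must_link   : list of (i, j) pairs that should share a cluster
    cannot_link : list of (i, j) pairs that should not share a cluster
    """
    n = X.shape[0]
    k = max(seeds.values()) + 1
    # Initialise each centre as the mean of its seed points.
    centers = np.array([
        X[[i for i, c in seeds.items() if c == j]].mean(axis=0)
        for j in range(k)
    ])
    labels = np.full(n, -1)
    for _ in range(n_iter):
        labels[:] = -1
        for i, c in seeds.items():
            labels[i] = c  # seed labels are kept fixed
        for i in range(n):
            if labels[i] != -1:
                continue
            # Try clusters from nearest to farthest centre.
            order = np.argsort(np.linalg.norm(centers - X[i], axis=1))
            for c in order:
                if not violates(i, c, labels, must_link, cannot_link):
                    labels[i] = c
                    break
            else:
                # Assumption: if no cluster is feasible, relax the constraints.
                labels[i] = order[0]
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

In this sketch a cannot-link pair can push a point away from its nearest centre, which is how pairwise constraints can change the result relative to plain K-means. A call might look like `seeded_constrained_kmeans(X, {0: 0, 3: 1}, [(1, 2)], [(0, 6)])`.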

Key words: semi-supervised clustering, pairwise constraint, Kmeans algorithm, active learning, Seeds set

CLC Number: 

  • TP181