Journal of Jilin University (Engineering and Technology Edition), 2017, Vol. 47, Issue (5): 1604-1611. doi: 10.13229/j.cnki.jdxbgxb201705037


Clustering method for uncertain data based on spectral decomposition

LI Jia-fei (1,2), SUN Xiao-yu (1,2)

  1. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China;
  2. College of Computer Science and Technology, Jilin University, Changchun 130012, China
  • Received: 2016-07-05 Online: 2017-09-20 Published: 2017-09-20

Abstract: A clustering method for uncertain data based on spectral decomposition is proposed. The method exploits the natural latent associations among the data to recover the true covariance structure of the data records behind their uncertain representation. Spectral decomposition of this covariance structure is then used to produce sharpened data, on which cluster analysis is carried out. Comparative experiments show that the proposed method significantly improves clustering quality: both the root mean square error and the CH index are better than those obtained with the traditional method.
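The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that "CH index" refers to the Calinski-Harabasz index and that the root mean square error is taken between estimated records and reference records; the abstract does not spell out either definition, so both are assumptions. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def rmse(estimated, reference):
    """Root mean square error over all coordinates of two equally shaped record lists."""
    sq = [(a - b) ** 2 for x, y in zip(estimated, reference) for a, b in zip(x, y)]
    return math.sqrt(sum(sq) / len(sq))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def ch_index(points, labels):
    """Calinski-Harabasz index: (between / (k - 1)) / (within / (n - k)).

    Requires at least two clusters and nonzero within-cluster dispersion.
    """
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    overall = centroid(points)
    between = 0.0  # dispersion of cluster centroids around the overall centroid
    within = 0.0   # dispersion of points around their own cluster centroid
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))

# Hypothetical toy data: two well-separated clusters of two points each.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]
score = ch_index(points, labels)

# Hypothetical estimated vs. reference records for the error measure.
error = rmse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 6.0]])
```

Under these definitions a larger CH value indicates tighter, better separated clusters, while a smaller root mean square error indicates estimates closer to the reference records.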

Key words: artificial intelligence, uncertain data, spectral decomposition, clustering, data sharpening, covariance structure

CLC Number: TP391