Journal of Jilin University (Engineering and Technology Edition) ›› 2013, Vol. 43 ›› Issue (01): 135-140.
LIU Quan 1,2; YANG Xu-dong 1; JING Ling 3; XIAO Fei 1