A method of deep learning based on distributed memory computing

doi:10.13229/j.cnki.jdxbgxb201503034

Abstract

To improve the efficiency of distributed training of deep neural networks, a new method is proposed that runs the neural network model on a distributed memory computing system. A distributed memory framework is built that provides data partitioning and multi-task scheduling. It avoids the impact of I/O on the training rate and lets the training process run at memory speed across the cluster. Within the framework, multiple model replicas of a deep belief network are trained asynchronously. In addition, the dropout algorithm is employed to prevent over-fitting. The proposed method is evaluated on the CIFAR-10 dataset. Experimental results show that the new method improves the efficiency of training deep neural networks and enables scalability.
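The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it names: the dataset is partitioned in memory, several model replicas train asynchronously against shared parameters, and dropout is applied during training. Everything here is an assumption made for illustration. Threads stand in for cluster nodes, a single-layer logistic model stands in for the deep belief network, a toy two-feature dataset stands in for CIFAR-10, and all names (`ParameterStore`, `train_replica`, `train_async`, `make_toy_data`) are hypothetical rather than taken from the paper.

```python
import math
import random
import threading


def partition(data, n_parts):
    """Split an in-memory dataset into n_parts roughly equal shards."""
    return [data[i::n_parts] for i in range(n_parts)]


class ParameterStore:
    """Shared weights (hypothetical stand-in for the framework's shared model state)."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.lock = threading.Lock()  # protects one read or one update, not a whole step

    def pull(self):
        with self.lock:
            return list(self.w)

    def push(self, grad, lr):
        with self.lock:
            for i, g in enumerate(grad):
                self.w[i] -= lr * g


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_replica(store, shard, epochs, lr, keep_prob, seed):
    """One replica: pull possibly stale weights, compute a gradient, push it."""
    rng = random.Random(seed)
    for _ in range(epochs):
        for x, y in shard:
            w = store.pull()
            # Inverted dropout on the inputs: drop each feature with
            # probability 1 - keep_prob and rescale the survivors.
            xd = [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, xd)))
            grad = [(p - y) * xi for xi in xd]
            store.push(grad, lr)  # no barrier: replicas never wait for each other


def train_async(data, dim, n_replicas=4, epochs=20, lr=0.1, keep_prob=0.8):
    store = ParameterStore(dim)
    shards = partition(data, n_replicas)
    threads = [
        threading.Thread(target=train_replica,
                         args=(store, shard, epochs, lr, keep_prob, k))
        for k, shard in enumerate(shards)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.w


def predict(w, x):
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x))) >= 0.5 else 0


def make_toy_data(n=200, seed=0):
    """Linearly separable toy data; the last feature is a constant bias term."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(([x1, x2, 1.0], 1 if x1 + x2 > 0 else 0))
    return data


if __name__ == "__main__":
    data = make_toy_data()
    w = train_async(data, dim=3)
    acc = sum(predict(w, x) == y for x, y in data) / len(data)
    print("training accuracy:", acc)
```

Because the replicas read and write the shared weights without a synchronization barrier, each gradient may be computed from slightly stale parameters. That is the trade-off asynchronous training accepts in exchange for never idling a worker.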

Key words: artificial intelligence, distributed deep learning, distributed memory computing, deep belief network

CLC Number: TP183

LI Di-fei, TIAN Di, HU Xiong-wei. A method of deep learning based on distributed memory computing[J]. Journal of Jilin University(Engineering and Technology Edition), 2015, 45(3): 921-925.

