Journal of Jilin University (Engineering and Technology Edition), 2021, Vol. 51, Issue (2): 667-676. doi: 10.13229/j.cnki.jdxbgxb20191070

Accelerating CALYPSO structure prediction with machine learning

Xiao-hui WEI1, Chang-bao ZHOU1, Xiao-xian SHEN1, Yuan-yuan LIU1, Qun-chao TONG2

  1. College of Computer Science & Technology, Jilin University, Changchun 130012, China
  2. State Key Lab of Superhard Materials, Jilin University, Changchun 130012, China
  • Received: 2019-11-22 Online: 2021-03-01 Published: 2021-02-09

Abstract:

This study examines whether CALYPSO structure prediction can be accelerated by replacing density functional theory (DFT) calculations with machine learning. Five machine learning methods were evaluated on predicting the potential energy of boron clusters. First, the original data were represented as structural information matrices using the Coulomb matrix. The vector of eigenvalues of each matrix was then extracted and used as the input to the machine learning algorithms, and all five algorithms were trained and tested on the same dataset. Factors affecting prediction performance were also explored. Finally, a method for comparing the similarity of predicted values and the ground truth was proposed based on the characteristics of the potential energy surface (PES), and a confidence model was constructed to validate kernel ridge regression (KRR), the best-performing method. The results suggest that the PES fitted by KRR is similar to the PES computed with DFT, and that the confidence of the algorithm is close to 90% when the permissible error is 1 kcal/mol. Timing tests show that the time complexity of KRR is O(n), an improvement of 1 to 2 orders of magnitude over DFT methods.
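To make the regression step concrete, the following is a minimal sketch of kernel ridge regression in plain Python. It is not the authors' implementation: the Gaussian kernel, the values of `sigma` and `lam`, the helper names, and the one-dimensional toy data are all assumptions chosen for illustration. In the setting described above, each input would be a descriptor vector derived from a Coulomb matrix and each target a DFT energy.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def krr_fit(xs, ys, sigma, lam):
    """Return weights alpha solving (K + lam * I) alpha = y."""
    n = len(xs)
    k = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(k, ys)

def krr_predict(x, xs, alphas, sigma):
    """Prediction is a kernel-weighted sum over the training points."""
    return sum(a * gaussian_kernel(x, xi, sigma) for a, xi in zip(alphas, xs))

# Hypothetical toy data (not from the paper): 1-D inputs, quadratic targets.
train_x = [(0.0,), (1.0,), (2.0,), (3.0,)]
train_y = [0.0, 1.0, 4.0, 9.0]
sigma, lam = 1.0, 1e-8
alphas = krr_fit(train_x, train_y, sigma, lam)
print(krr_predict((1.5,), train_x, alphas, sigma))
```

Once the weights are fitted, each prediction is a single pass over the training set, so the per-structure cost grows linearly with the number of training points.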

Key words: computer application, structure prediction, energy calculation, root mean square error, confidence

CLC Number: TP399

Fig.1 Potential energy surface of 2D

Fig.2 Neural network

Fig.3 Structures of boron clusters

Table 1 Form of boron clusters

Energy: -123.82 eV
X        Y        Z
12.36    13.54    6.50
6.84     6.45     6.50
10.66    10.95    6.49
8.18     9.27     6.49

Fig.4 Coulomb matrix representation of B8
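As an illustration of this representation, the sketch below builds a Coulomb matrix using the commonly cited definition: diagonal entries 0.5·Z_i^2.4 and off-diagonal entries Z_i·Z_j / |R_i - R_j|. Several things here are assumptions: the function name `coulomb_matrix` is hypothetical, the four coordinate rows are the ones listed in Table 1 and are used as given with no unit conversion, and the paper's exact variant of the matrix is not shown on this page.

```python
import math

BORON_Z = 5  # nuclear charge of boron

def coulomb_matrix(coords, charges):
    """Coulomb matrix: 0.5 * Z_i**2.4 on the diagonal,
    Z_i * Z_j / distance(i, j) off the diagonal."""
    n = len(coords)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                m[i][j] = 0.5 * charges[i] ** 2.4
            else:
                m[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return m

# Coordinates taken from the rows of Table 1 (units assumed, not converted).
coords = [
    (12.36, 13.54, 6.50),
    (6.84, 6.45, 6.50),
    (10.66, 10.95, 6.49),
    (8.18, 9.27, 6.49),
]
cm = coulomb_matrix(coords, [BORON_Z] * len(coords))
for row in cm:
    print(["%.3f" % v for v in row])
```

The matrix is symmetric by construction, and for a single-element cluster such as boron all diagonal entries are identical, so the structural information is carried by the off-diagonal distance terms.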

Fig.5 Eigenvalue extraction
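A sketch of how such an eigenvalue descriptor might be computed is given below. It assumes the descriptor is the eigenvalue spectrum of the symmetric Coulomb matrix, sorted by decreasing magnitude and zero-padded to a fixed length so that clusters of different sizes share one input dimension; the sorting and padding conventions are assumptions, not details confirmed by this page. To stay dependency-free the eigenvalues are found with cyclic Jacobi rotations; in practice a library routine such as `numpy.linalg.eigvalsh` would normally be used. The names `jacobi_eigenvalues` and `eigen_descriptor` are hypothetical.

```python
import math

def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations,
    returned sorted by decreasing absolute value."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q]: tan(2*theta) = 2*a_pq / (a_qq - a_pp)
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), key=abs, reverse=True)

def eigen_descriptor(matrix, size):
    """Sorted eigenvalues, zero-padded to a fixed length (assumed convention)."""
    vals = jacobi_eigenvalues(matrix)
    return vals + [0.0] * (size - len(vals))

example = [[4.0, 1.0, 0.0],
           [1.0, 3.0, 1.0],
           [0.0, 1.0, 2.0]]
print(eigen_descriptor(example, 5))
```

Because eigenvalues do not change when atoms are relabelled, a descriptor of this kind does not depend on the order in which the atoms of a cluster are listed.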

Fig.6 Accuracy of prediction for five methods

Fig.7 Distribution of cluster and energy

Table 2 Runtime of algorithms

Item    Training time/ms    Prediction time/ms
KNN     29.02               76.49
KRR     13777.21            2157.36
LRR     10.34               0.5
MNN     4403.28             5.46
SVR     14404.39            2408.4

Fig.8 Result of exploring other factors

Fig.9 Potential energy surface of B20

Table 3 Confidence of KRR for predicting B20

Permissible error    B20      B22      B24      B28      B30      B36      B38      B40
0.50                 0.666    0.655    0.721    0.769    0.732    0.756    0.752    0.767
0.55                 0.709    0.710    0.753    0.793    0.754    0.766    0.773    0.795
0.60                 0.745    0.754    0.780    0.809    0.786    0.797    0.801    0.832
0.65                 0.767    0.782    0.805    0.829    0.808    0.810    0.809    0.852
0.70                 0.790    0.802    0.816    0.849    0.840    0.834    0.835    0.866
0.75                 0.803    0.833    0.836    0.869    0.855    0.844    0.850    0.881
0.80                 0.815    0.845    0.854    0.880    0.874    0.864    0.860    0.895
0.85                 0.825    0.877    0.872    0.884    0.874    0.878    0.868    0.908
0.90                 0.845    0.885    0.877    0.896    0.882    0.885    0.877    0.919
0.95                 0.858    0.893    0.885    0.900    0.887    0.905    0.884    0.924
1.00                 0.862    0.897    0.891    0.900    0.894    0.915    0.888    0.929
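One plausible reading of the confidence values in Table 3 is the fraction of predictions whose absolute error does not exceed the permissible error. The sketch below computes that quantity. This interpretation is an assumption, since the page does not define the confidence model, and the function name `confidence` and the toy energies are hypothetical, not the paper's data.

```python
def confidence(predicted, reference, tolerance):
    """Fraction of predictions with |predicted - reference| <= tolerance."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)
    return hits / len(predicted)

# Hypothetical energies in one consistent unit (not values from the paper).
reference = [0.0, 1.0, 2.0, 3.0]
predicted = [0.2, 2.2, 2.05, 2.95]

for tol in (0.5, 1.0, 1.5):
    print(tol, confidence(predicted, reference, tol))
```

Under this definition the confidence can only stay the same or rise as the permissible error grows, which matches the trend down each column of Table 3.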

Fig.10 Time cost of KRR prediction
