Machine Learning for Chemistry: Basics and Applications

Yun-Fei Shi; Zheng-Xin Yang; Sicong Ma; Pei-Lin Kang; Cheng Shang; P. Hu; Zhi-Pan Liu

doi:10.1016/j.eng.2023.04.013


Engineering, 2023, Vol. 27, Issue 8: 70-83. DOI: 10.1016/j.eng.2023.04.013

Research

Review



Abstract

The past decade has seen a sharp increase in machine learning (ML) applications in scientific research. This review introduces the basic constituents of ML, including databases, features, and algorithms, and highlights several important achievements in chemistry that have been aided by ML techniques. The databases described include some of the most popular chemical databases for molecules and materials, obtained from either experiments or computations. Important two-dimensional (2D) and three-dimensional (3D) features representing the chemical environment of molecules and solids are briefly introduced. Decision tree and deep learning neural network algorithms are reviewed, with emphasis on their frameworks and typical application scenarios. Three important fields of ML in chemistry are discussed: ① retrosynthesis, in which ML predicts likely routes of organic synthesis; ② atomic simulations, which use ML potentials to accelerate potential energy surface sampling; and ③ heterogeneous catalysis, in which ML assists in various aspects of catalytic design, ranging from synthetic condition optimization to reaction mechanism exploration. Finally, an outlook on future ML applications is provided.


Keywords

Machine learning / Atomic simulation / Catalysis / Retrosynthesis / Neural network potential

Cite this article


Yun-Fei Shi, Zheng-Xin Yang, Sicong Ma, Pei-Lin Kang, Cheng Shang, P. Hu, Zhi-Pan Liu. Machine Learning for Chemistry: Basics and Applications. Engineering, 2023, 27(8): 70-83. https://doi.org/10.1016/j.eng.2023.04.013



Received: 31 Aug 2022
Revised: 19 Jan 2023
Accepted: 06 Apr 2023
Published: 31 Aug 2023
Issue Date: 13 Jun 2024
