Engineering >> 2021, Volume 7, Issue 9 doi: 10.1016/j.eng.2021.03.019

Machine Learning in Chemical Engineering: Strengths, Weaknesses, Opportunities, and Threats

a Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Ghent 9052, Belgium
b SynBioC Research Group, Department of Green Chemistry and Technology, Faculty of Bioscience Engineering, Ghent University, Ghent 9000, Belgium

Received: 2020-10-16 Revised: 2021-01-16 Accepted: 2021-03-22 Available online: 2021-07-29

Abstract

Chemical engineers rely on models for design, research, and daily decision-making, often with potentially large financial and safety implications. Efforts a few decades ago to combine artificial intelligence and chemical engineering for modeling failed to meet expectations. In the last five years, the increasing availability of data and computational resources has led to a resurgence in machine learning-based research. Many recent efforts have facilitated the roll-out of machine learning techniques in the research field by developing large databases, benchmarks, and representations for chemical applications, as well as new machine learning frameworks. Machine learning has significant advantages over traditional modeling techniques, including flexibility, accuracy, and execution speed. These strengths come with weaknesses, such as the lack of interpretability of black-box models. The greatest opportunities lie in time-limited applications, such as real-time optimization and planning, that require high accuracy and can build on models with a self-learning ability to recognize patterns, learn from data, and become more intelligent over time. The greatest threat in artificial intelligence research today is inappropriate use, because most chemical engineers have had limited training in computer science and data analysis. Nevertheless, machine learning will definitely become a trustworthy element in the modeling toolbox of chemical engineers.
