Active Machine Learning for Chemical Engineers: A Bright Future Lies Ahead!

Yannick Ureel , Maarten R. Dobbelaere , Yi Ouyang , Kevin De Ras , Maarten K. Sabbe , Guy B. Marin , Kevin M. Van Geem

Engineering ›› 2023, Vol. 27 ›› Issue (8): 23-30. DOI: 10.1016/j.eng.2023.02.019

Research
Perspective



Abstract

By combining machine learning with the design of experiments, thereby achieving so-called active machine learning, more efficient and cheaper research can be conducted. Machine learning algorithms are more flexible and are better than traditional design of experiment algorithms at investigating processes spanning all length scales of chemical engineering. While active machine learning algorithms are maturing, their applications are falling behind. In this article, three types of challenges presented by active machine learning—namely, convincing the experimental researcher, the flexibility of data creation, and the robustness of active machine learning algorithms—are identified, and ways to overcome them are discussed. A bright future lies ahead for active machine learning in chemical engineering, thanks to increasing automation and more efficient algorithms that can drive novel discoveries.

Graphical abstract

Keywords

Active machine learning / Active learning / Bayesian optimization / Chemical engineering / Design of experiments

Cite this article

Yannick Ureel, Maarten R. Dobbelaere, Yi Ouyang, Kevin De Ras, Maarten K. Sabbe, Guy B. Marin, Kevin M. Van Geem. Active Machine Learning for Chemical Engineers: A Bright Future Lies Ahead!. Engineering, 2023, 27(8): 23-30. DOI: 10.1016/j.eng.2023.02.019


1. Introduction

Experiments performed under well-defined conditions and calculations based on first principles constitute the basis of engineering research. In chemical engineering, these activities are aimed at, for example, the development and optimization of catalysts, reaction conditions, and reactor configurations. In the chemical industry, 51 billion USD was spent on research and development in 2017 [1]. This illustrates the importance of high-quality data; however, obtaining accurate data is tedious and error-prone. The design of experiments (DoE) can help in extracting the maximal information with a minimum of effort [2], [3], making sure that time and resources are spent efficiently. By integrating machine learning with DoE, a more flexible and efficient DoE is achieved. This so-called “active machine learning” allows a more effective selection of experimental conditions, particularly for high-dimensional and highly nonlinear phenomena [4].

Machine learning can facilitate the automation of the whole experimental cycle, from experimental selection to model building and data analysis [5]. While the most common field of application in machine learning is model building and data analysis, the focus of this article is on the potential of combining DoE with machine learning for active machine learning. Olsson [6] defined active machine learning as a supervised machine learning technique in which the learner—that is, the machine learning model—is in control of the data from which it learns. In active machine learning, machine learning algorithms are used to iteratively determine new experimental data, the so-called training data, based on uncertainty criteria. It should be noted that “experimental” can also refer to computationally expensive high-level simulations, such as high-level ab initio calculations of molecular properties or large eddy simulations of reactive flow with computational fluid dynamics (CFD) codes [7]. Active machine learning consists of two branches with two different purposes: active learning and Bayesian optimization. Active learning aims to explore and model a process with a minimum number of “experiments” to ensure accurate predictions over the entire design space [8]. Bayesian optimization is essentially a machine learning-based optimization strategy, in which new experimental data are iteratively selected to find an experiment that optimizes the objective [9]. Either active learning or Bayesian optimization can be employed for experimental selection, depending on whether the goal is to model a process and acquire process knowledge or to optimize an objective.

1.1. Basic principles of active machine learning

Fig. 1 [10] illustrates the general workflow of active machine learning algorithms, starting with the initialization followed by an iterative loop consisting of three phases. The critical first step of initialization consists of clearly defining the research problem as either the modeling of an output (active learning) or the optimization of an objective (Bayesian optimization). An example of active learning is the investigation of the effect of reaction conditions, such as temperature and pressure, on the conversion [10], [11]. With Bayesian optimization, the goal is to find the optimal reaction conditions to maximize this conversion [12], [13], [14]. In both cases, a design space is set up that defines the ranges of the studied variables by considering the objectives and the intrinsic limitations of the experimental tools. A machine learning model is then initialized and trained using a small sample of labeled data, which comes from experiments whose outcomes are known, stemming from literature, previous experiments, or newly performed experiments. In general, the amount of preliminary labeled data is very low.

After initial training, the machine learning model is able to make rudimentary predictions in the design space. The model can vaguely estimate where an optimum could be situated for Bayesian optimization, or which experiment—the so-called query—is most informative for active learning. While the definition and initialization of both active learning and Bayesian optimization are essentially the same (and are not even too different from a classic experimental campaign), the main differences and advantages are found in the model training.

Active learning is purely based on exploration, to enable predictions of the design space that are as accurate as possible. Conversely, Bayesian optimization balances both exploration and exploitation in order to find the optimum in the design space, treating every iteration as the potentially final one. Exploitation investigates areas with a high objective value to find an optimum nearby, whereas exploration discovers areas for which the predictions are unknown and therefore uncertain. Exploration requires a measure of uncertainty in the predictions to identify which areas of the design space remain unexplored [15]. Therefore, popular machine learning models for active machine learning are Gaussian processes [16], [17], [18], [19] and Bayesian neural networks [20], [21], [22], as these allow an uncertainty estimation of their predictions. Another advantage of Gaussian processes is that they deal very well with noisy measurements, which are inherent in real-life experiments. By adding a noise term to the Gaussian process kernel, the machine learning model can estimate the experimental uncertainty and allow optimal performance of the active machine learning method [16], [23]. Neural networks can also be employed for active machine learning purposes, but approximate methods such as Monte Carlo dropout or model ensembling are required to estimate the model uncertainty [11], [24], [25].
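To illustrate how such a noise term enters the kernel, the following minimal numpy sketch builds a Gaussian process posterior whose diagonal noise variance absorbs experimental scatter. The kernel choice, data, and hyperparameters here are illustrative assumptions, not taken from the works cited above:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel between two sets of 1-D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(x_train, y_train, x_test, length_scale=1.0, noise=0.1):
    """Posterior mean and standard deviation of a GP with a noise term.

    The `noise` variance added to the kernel diagonal plays the role of
    the noise term described in the text: it lets the model attribute
    part of the scatter in the data to experimental error.
    """
    K = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train, length_scale)
    K_ss = rbf_kernel(x_test, x_test, length_scale)
    K_inv = np.linalg.inv(K)
    mean = K_s @ K_inv @ y_train
    cov = K_ss - K_s @ K_inv @ K_s.T
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Noisy "experiments": an output measured at a few conditions (scaled units)
rng = np.random.default_rng(0)
x_train = np.array([0.0, 1.0, 2.0, 4.0])
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(4)

x_test = np.linspace(0.0, 5.0, 6)
mean, std = gp_posterior(x_train, y_train, x_test)
# The predictive uncertainty is lowest near measured points and grows
# in unexplored regions, which is exactly what exploration exploits.
```

The predictive standard deviation returned here is the uncertainty measure that active machine learning queries against.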

After initialization, the active machine learning procedure consists of three phases: the training of the machine learning model, the selection of new experiments, and the execution and annotation of these experiments (Fig. 1). The active machine learning query (phase 2) is determined through a so-called acquisition function, which is a measure of potential informativeness or optimality. The next query is selected at the point where the acquisition function is maximal, that is, at the most informative subsequent data point. The query is performed and new data is gathered (phase 3), after which the machine learning model is retrained (phase 1) and can now make improved predictions. This loop is sequentially iterated until an optimum (Bayesian optimization) is found or a sufficiently accurate model (active learning) is obtained.
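For the Bayesian optimization case, a common textbook acquisition function is expected improvement. The sketch below is a generic formulation, not the specific choice of any work cited here; the candidate means, uncertainties, and the `xi` exploration parameter are illustrative assumptions:

```python
import numpy as np
from math import erf, sqrt

def expected_improvement(mean, std, best_y, xi=0.01):
    """Expected improvement (EI) acquisition for a maximization problem.

    `mean` and `std` are the surrogate model's predictions at candidate
    points; candidates with a high predicted value (exploitation) or a
    high uncertainty (exploration) both score well.
    """
    std = np.maximum(std, 1e-12)
    z = (mean - best_y - xi) / std
    pdf = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    cdf = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z])
    return (mean - best_y - xi) * cdf + std * pdf

# Three hypothetical candidates predicted by the surrogate:
mean = np.array([0.50, 0.70, 0.40])   # predicted objective values
std = np.array([0.05, 0.05, 0.30])    # predictive uncertainties
ei = expected_improvement(mean, std, best_y=0.65)
# The confidently better candidate and the highly uncertain candidate
# both outscore the confidently worse one.
```

Whichever acquisition function is used, the query of phase 2 is simply the candidate at which it is maximal.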

To further illustrate the workflow, we present the example of a researcher examining the performance of a new catalyst for a chemical process. The researcher either aims to investigate (with active learning) or optimize (with Bayesian optimization) the effect of reaction variables (design space), such as the temperature, pressure, and reactant concentrations, on the desired product yield (objective). First, initial experiments must be performed at a number of random combinations of temperature, pressure, and reactant concentrations. Next, the researcher initiates the active machine learning loop by training the machine learning model on these randomly picked experimental data points, after which the model proposes a new experiment. When using active learning, this experiment is the most informative one; when optimizing with Bayesian optimization, this experiment is the one most likely to improve upon the desired product yield. The researcher performs the experiment and retrains the machine learning model, which now makes improved predictions. The experimental selection continues until the desired number of experiments is performed and an optimal machine learning model or process condition is obtained.
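The three-phase loop of this worked example can be sketched as follows, with a toy one-dimensional "experiment" standing in for the real catalyst tests. The Gaussian process surrogate, kernel settings, and yield function are all illustrative assumptions:

```python
import numpy as np

def k_rbf(a, b, ls=0.8):
    """RBF kernel for 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_std(x_tr, y_tr, x_cand, noise=0.05):
    """Predictive mean and standard deviation of a simple GP surrogate."""
    K = k_rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = k_rbf(x_cand, x_tr)
    Kinv = np.linalg.inv(K)
    mean = Ks @ Kinv @ y_tr
    var = 1.0 - np.sum(Ks @ Kinv * Ks, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_experiment(x):
    # Stand-in for the real (expensive) experiment: yield vs. a condition
    return np.sin(3.0 * x) * np.exp(-x)

design = np.linspace(0.0, 2.0, 41)       # candidate design space
x_tr = np.array([0.1, 1.9])              # small random initial design
y_tr = run_experiment(x_tr)

for _ in range(6):                        # phases 1-3, iterated
    _, std = gp_std(x_tr, y_tr, design)   # phase 1: (re)train surrogate
    query = design[np.argmax(std)]        # phase 2: most informative point
    x_tr = np.append(x_tr, query)         # phase 3: run and annotate
    y_tr = np.append(y_tr, run_experiment(query))

# After a few iterations the queries cover the design space and the
# maximal predictive uncertainty has dropped.
_, std_final = gp_std(x_tr, y_tr, design)
```

Replacing the argmax-uncertainty rule in phase 2 with an acquisition function that also rewards high predicted yield turns this active learning loop into Bayesian optimization.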

1.2. Active machine learning in chemical engineering

The applications of active machine learning span all the length scales of chemical engineering, from ab initio calculations [17], [18], [26] to material, molecule, and catalyst design [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], reaction design [12], [13], [14], [37], [38], [39], [40], [41], [42], and reactor design [43], [44], [45]. For example, the design of catalysts is an important asset in achieving carbon neutrality, as catalysts can enable more sustainable processes and can increase the energy efficiency of chemical processes in general [46]. However, catalyst design is still deemed an art nowadays, as it mainly relies on high-throughput screening and limited theoretical relations, such as the Sabatier principle and linear scaling relations [47], [48], [49], [50]. This makes catalyst design prone to human bias, as researchers tend to exploit catalyst designs that are known to work, which hampers real breakthroughs [51], [52]. With active machine learning, this human bias is removed, and a substantially larger fraction of the catalyst space can be studied. Currently, the applications of active machine learning in catalysis only consider a limited design space, varying only the catalyst composition while maintaining the catalyst structure [53], [54]. For example, Zhong et al. [53] performed Bayesian optimization on density functional theory (DFT) calculations to identify and synthesize promising electrocatalysts for the reduction of CO2, whereas Nugraha et al. [54] determined the optimal composition of the most active PtPdAu catalyst to electrocatalytically oxidize methanol.

In reaction or process design, the goal of Bayesian optimization is to determine the optimal operating conditions in order to maximize the product yields, minimize the emissions per product, achieve the highest energy efficiency, and so forth. Optimization of reaction conditions has been demonstrated multiple times, including multi-objective reaction optimization with both discrete and continuous variables, likely making this the most well-developed field of active machine learning in chemical engineering [12], [13], [14]. Shields et al. [39] applied Bayesian optimization to optimize the reaction conditions for a Mitsunobu reaction and obtained an optimal yield (> 99%) for several non-intuitive reaction conditions after 40 experiments, thereby overcoming the standard reaction yield of 60%. With active learning, the goal is to acquire reaction knowledge that can be used for reactor and catalyst design, process control, or retrosynthesis. Eyke et al. [11] demonstrated the potential of active learning for DoE in reaction design by predicting reaction yields for combinations of catalysts and solvents with a minimum of available data. Recently, a DoE tool for the study of chemical reactions was developed and validated on the catalytic pyrolysis of plastic waste by Ureel et al. [10].

CFD has become an important tool for reactor design, optimization, and troubleshooting. Bayesian optimization makes it possible to find an optimal reactor configuration with a minimum of computationally intensive CFD simulations. Park et al. [44] demonstrated the power of multi-objective Bayesian optimization by maximizing the gas holdup and minimizing the power consumption of a stirred tank reactor. Clearly, integrating active machine learning in CFD allows for a faster and more efficient reactor design.

This survey shows that chemical engineering is a broad and diverse research field with a whole spectrum of possible active machine learning applications. Nevertheless, the use of active machine learning is not yet widespread, and there are some hurdles to overcome before it can become a trusted asset in the chemical engineer’s toolkit. In this perspective article, we focus on active machine learning as a DoE technique for an experimentalist and how to popularize it. We identify three types of thresholds: convincing the experimental researcher, the flexibility of data creation, and the robustness of active machine learning algorithms (Fig. 2). In the following sections, we discuss each of these challenges and how they can be overcome.

2. Convincing the researcher

2.1. Big data misconception

At present, a knowledge gap exists between the experimentalist community and machine learning experts [55]. This knowledge gap is the fundamental reason why active machine learning is not yet being systematically applied by experimentalists. First, there is a misconception that big data is mandatory for active machine learning and that an enormous experimental campaign is required to make it feasible. Nugraha et al. [54] reported an optimal catalyst composition after performing only 47 of a total of 5151 possible experiments, as shown in Fig. 3. In their work, Bayesian optimization was employed to determine the optimal PtPdAu catalyst composition for the electrocatalytic oxidation of methanol. Similarly, Schweidtmann et al. [12] identified their Pareto front after 68 experiments for a four-dimensional reaction optimization. Moreover, Ureel et al. [10] showed that active learning strategies are already beneficial for experimental campaigns consisting of as few as 18 experiments. These examples illustrate that both active learning and Bayesian optimization are already feasible for smaller datasets.

A second issue is related less to the experimental researcher and more to the intrinsic algorithms. Initially, all active machine learning algorithms explore the entire design space, which can result in counterintuitive or trivial queries. Consequently, the experimentalist loses confidence in the machine learning tool. The initial selection of experiments does not rely on any preliminary or physical knowledge within the machine learning models. Therefore, this issue is related both to human bias and the perception of these algorithms by their users, and to the absence of preliminary knowledge within these models. Integrating process knowledge beforehand in the machine learning model is the most powerful methodology to alleviate this problem. Such knowledge can be incorporated via two different approaches: either through the design of the machine learning model, such as a Gaussian process kernel [56], or through training on literature or simulation data [57]. The incorporation of preliminary knowledge into active machine learning models will be discussed in Section 4.1.
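One simple way to encode such preliminary knowledge in a Gaussian process is to regress on the residuals around a prior mean function, so that predictions in unexplored regions revert to a known trend rather than to zero. The sketch below is illustrative only; the linear "literature trend" and all numbers are hypothetical:

```python
import numpy as np

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def prior_mean(x):
    # Hypothetical prior from literature: the output rises linearly
    # with the studied variable.
    return 0.2 * x

def gp_with_prior(x_tr, y_tr, x_te, noise=0.05):
    """GP regression on the residual y - prior_mean(x); far from the
    data, predictions fall back to the literature trend instead of zero."""
    r = y_tr - prior_mean(x_tr)
    K = rbf(x_tr, x_tr) + noise * np.eye(len(x_tr))
    Ks = rbf(x_te, x_tr)
    return prior_mean(x_te) + Ks @ np.linalg.solve(K, r)

x_tr = np.array([1.0, 2.0])
y_tr = np.array([0.25, 0.38])
x_far = np.array([10.0])          # far outside the measured range
pred = gp_with_prior(x_tr, y_tr, x_far)
# Far from the data the prediction reverts to the prior trend (about 2.0),
# not to zero as a zero-mean GP would.
```

Because the initial model already follows the known trend, its first queries are less likely to look trivial or counterintuitive to the experimentalist.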

2.2. Ease of use

In active learning strategies, multiple factors are varied at the same time, whereas regular DoE strategies often vary a single factor at a time. This makes the post-processing of the experiments less trivial, as the effects of the factors are not isolated. As a result, a statistical analysis is required to draw conclusions from an experimental campaign using an active learning strategy [58]. These tools are incorporated in regular DoE software but not in the active machine learning packages that are currently available. This problem is closely related to another issue that limits the applicability of active learning—namely, its ease of use. Many different active machine learning packages exist these days, such as Gryffin [59], Phoenics [60], and BayesianOptimization [61] for Bayesian optimization, and Gaussian N-dimensional active learning framework (GandALF) [10] or general and efficient active learning (GEAL) [62] for active learning. However, most of the current active machine learning packages must be configured with Python, except for GandALF, which uses a CSV file. The use of these active machine learning tools requires programming skills, as they offer no graphical user interface (GUI), hampering the usage of these methodologies. Thus, at present, researchers who wish to use active machine learning must make a substantial time investment. This “activation barrier” is too high for many researchers, particularly because of the required ability to code.

3. Improving the flexibility of data creation

3.1. Constrained active machine learning

Active machine learning algorithms are often developed on simulated data, where there are no practical limitations on the data creation side [32], [36], [63]. However, in real life, experimental units or procedures do not allow this flexibility. For example, even a completely automated experimental unit often needs to heat up or cool down, or requires time to stabilize, which slows down the generation of a new data point when different temperatures are selected by the algorithm. In addition, experiments are often performed in parallel (e.g., in high-throughput units), as opposed to the algorithms, which assume a sequential selection of experiments. Therefore, active machine learning strategies should be constrained to the unit on which they are used, to allow for an optimal experimental efficiency that will make them applicable to real-world applications [64]. In the example above, it is often easier to heat an experimental unit than it is to cool it; therefore, an extra constraint should be added to the algorithm to make it preferable to select experiments that increase rather than decrease in temperature.
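A soft version of such a temperature constraint can be sketched by penalizing the acquisition score of candidates that would require cooling. The penalty weight, candidate grid, and acquisition values below are illustrative assumptions:

```python
import numpy as np

def constrained_query(candidates, scores, current_temp, cooling_penalty=0.5):
    """Pick the next experiment, discouraging slow cool-down steps.

    `scores` can be any acquisition value (informativeness or expected
    improvement); candidates colder than the current setpoint are
    penalized so the algorithm prefers heating over cooling.
    """
    penalty = np.where(candidates < current_temp, cooling_penalty, 0.0)
    return candidates[np.argmax(scores - penalty)]

temps = np.array([300.0, 350.0, 400.0, 450.0])   # candidate temperatures (K)
scores = np.array([0.60, 0.55, 0.50, 0.58])      # hypothetical acquisition values
# Unconstrained, the 300 K candidate would win; with the unit currently at
# 400 K, the penalty steers the selection to a temperature increase instead.
next_T = constrained_query(temps, scores, current_temp=400.0)
```

Hard constraints (excluding infeasible candidates outright) and batch constraints can be enforced in the same place, by masking or grouping the candidate set before the argmax.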

Next to constraints resulting from how the experimental equipment operates, constraints can also be important for simulations [43], [45]. Let us consider a case that involves optimizing a reactor in silico using CFD. When defining the reactor geometry for CFD, it is not guaranteed that every type of geometry is feasible to simulate, that the geometry can be properly meshed, or that the results are mesh independent [65]. When these constraints are non-trivial, a separate machine learning model can be trained to learn the constraints and enforce the viability of the simulations [43].

Another example with constrained experimental units is a high-throughput experimental campaign that is used to screen different catalytic materials. Within these units, several experimental variables, such as temperature and pressure, are often fixed for every type of experiment per batch. This requires another constraint for the batch selection of these experiments, as the variables must be fixed for all selected queries. To tune active machine learning algorithms according to their application, a close collaboration between the machine learning expert and the experimentalist is thus required. In this way, the benefits of applying active machine learning are also available for less flexible experimental units.

Symbiosis between the experimentalist and the machine learning scientist will benefit both parties. First of all, it will extend the fields of application for active machine learning as researchers become more aware of the benefits of active machine learning. This close collaboration will help in identifying useful features within these active machine learning algorithms, such as blocking or automatic post-processing. More practical constraints might be added to the experimental selection, such as the time or cost required for a proposed experiment. Lastly, this collaboration between the experimentalist and the machine learning expert assists in informing experimental researchers and removing the currently existing biases against active machine learning.

3.2. Automation

In an ideal case, active machine learning is coupled with a flexible automated experimental unit or is even operated by a robot [12], [14], [66]. In this way, the execution of the experiments can be controlled and optimized with minimal human intervention, saving valuable time and effort. Automated experimental units are increasingly being applied in molecular synthesis and chemical engineering, although these units are not yet commonplace [67], [68], [69]. One requirement of automated robotic units is that they should be reconfigurable [70]. Moreover, they should have a broad application range and should not be limited to the investigation of a single reaction type or a narrow temperature range. Of course, the use of automated units is not self-evident, as they are often expensive and are currently not well-suited for every problem. For example, despite past efforts [71], the automated synthesis and testing of catalysts is a challenging task, especially when studying a broad design space [72]. By coupling these systems with active machine learning techniques, enormous time savings are expected for experimental campaigns, as this will speed up reaction and catalyst optimization, as well as the acquisition of scientific knowledge. A final barrier for automated units is safety. By expanding the catalyst or reaction design space, safety concerns increase, as doing so increases the probability that undesired reactions will occur. Therefore, good chemical knowledge is still required when employing these units in order to identify and incorporate safety constraints. Here, the definition of safety constraints again requires close collaboration between experimental experts and machine learning scientists.

4. Algorithm robustness

4.1. Data transfer

When performing experiments, it is advantageous for the experiments to be widely applicable and to serve multiple purposes. The information gathered in experiments should be made available according to the FAIR guiding principles (i.e., findability, accessibility, interoperability, and reusability) and can then be of value for other researchers [73]. However, with active machine learning, a single objective is chosen, which determines the experimental selection. This hampers the applicability of the experiments, as only one experimental output is well-studied. For example, when investigating reactions, the conversion is typically selected as the output of interest; however, this limits the information on other properties, such as yields or selectivity. In the worst case, the yields are not measured at all and that information is simply lost; even when the yields are measured, there is no guarantee that all relevant trends are captured, because the experiments were selected to model the conversion rather than the yields. As a result, interesting trends in the reaction yields can remain hidden. With Bayesian optimization, this is less of an issue, as the goal is to optimize an objective, which by definition makes the data less generally applicable. Moreover, multi-objective Bayesian optimization techniques exist, whereas only single-objective strategies are possible for active learning, meaning that all interesting outputs should be incorporated within a single active learning objective [12], [40], [44]. Therefore, to ensure the reusability of the gathered data, it is important that not only the modeled output but also other potentially relevant outputs are measured during experiments.

After creating data that is of wide interest, it is important to be able to incorporate that knowledge into active machine learning tools. Fig. 4 summarizes the different data sources and modeling strategies that can be employed to achieve this. When an active machine learning model is pretrained on literature data, an improved initial experimental selection is achieved that resolves the issue of suboptimal initial selection that was mentioned earlier [57]. The incorporation of literature data is trivial when the experimental uncertainty is similar to that of the newly gathered data. However, when the literature data is of better or inferior quality than the gathered data, it is important for the machine learning model to be able to make a distinction between the two. Heteroscedastic machine learning models exist [63], but they do not necessarily permit the incorporation of two separate noise factors, as the variation in noise is dependent on the variable in heteroscedastic models. Conversely, multi-fidelity active machine learning strategies make it possible to employ widely abundant low-quality data for accurate pretraining of the active machine learning model [74], [75], [76]. These methods have been developed based on simulated “experimental” data only, but they are very promising for improving the performance of active machine learning tools when applied to real experimental data. Moreover, these multi-fidelity models can also be used for the incorporation of data from a mechanistic model into the machine learning model. When the uncertainty of the mechanistic model predictions is known, an appropriate distinction can be made between experimental data and modeled data, both with their respective uncertainties, in the multi-fidelity model. In this way, additional mechanistic information can be incorporated into a machine learning model, which improves the experimental selection.
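A minimal way to let a single Gaussian process weigh data sources of different quality is to assign each training point its own noise variance on the kernel diagonal. This is a simplified stand-in for the multi-fidelity strategies cited above; the noise levels and data are hypothetical:

```python
import numpy as np

def rbf(a, b, ls=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_fit_predict(x_tr, y_tr, noise_var, x_te):
    """GP posterior where each training point carries its own noise variance."""
    K = rbf(x_tr, x_tr) + np.diag(noise_var)
    Ks = rbf(x_te, x_tr)
    Kinv = np.linalg.inv(K)
    mean = Ks @ Kinv @ y_tr
    var = 1.0 - np.sum(Ks @ Kinv * Ks, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

# Abundant but noisy literature data plus one precise in-house measurement
x_lit = np.array([0.0, 1.0, 2.0, 3.0]); noise_lit = 0.25
x_own = np.array([1.5]);                noise_own = 0.01
x_tr = np.concatenate([x_lit, x_own])
y_tr = np.sin(x_tr)
noise_var = np.array([noise_lit] * len(x_lit) + [noise_own] * len(x_own))

x_te = np.array([1.5, 0.0])
mean, std = gp_fit_predict(x_tr, y_tr, noise_var, x_te)
# The posterior is more certain at the precise in-house point (x = 1.5)
# than at the noisy literature point (x = 0.0), so the model pools both
# sources without treating them as equally trustworthy.
```

Full multi-fidelity models go further by learning a mapping between fidelity levels, but the per-point noise idea captures the basic mechanism of distinguishing data quality.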

Data that is closely related—but not similar in nature—can also serve as an initialization for active machine learning models [77]. For example, when modeling reactions with one type of catalyst and literature data on another catalyst are available, this data might still contain valuable information for an active learning model [78]. With active transfer learning, the goal is to leverage this knowledge from nearly similar data to obtain a machine learning model with an improved perception of the examined problem. Active transfer learning is the combination of the two main methods of active machine learning and transfer learning to make machine learning less data intensive. With transfer learning, (abundantly) available low-quality data is used to pretrain a machine learning model, which is then refined with a limited amount of high-quality data. In this way, rudimentary physical knowledge is introduced into the machine learning model, which again improves the initial experimental selection. This methodology has been proven to work on the reaction yield classification of cross-coupling reactions, by pretraining a machine learning model on reactions with different nucleophiles [78].

The reuse of literature data within active machine learning applications will further enhance the performance of these tools. The first active transfer learning approaches are being developed within chemical engineering, but further development of algorithms is crucial to make active transfer learning applicable within all domains of chemical engineering.

4.2. Synthesizability

Active machine learning can be used to determine the optimal query for either optimization or modeling purposes. However, for certain problems, it is not evident that these queries are executable. For example, in catalyst or molecule design, novel compounds are proposed in order to synthesize them and test the property of interest. Here, the representation of the catalyst or molecule is crucial for the synthesizability of the queries. Synthesizability, defined as the feasibility of the proposed queries, refers to whether the proposed catalysts or molecules can actually be synthesized, as illustrated in Fig. 5. Often, a vector containing the catalyst composition is used as a simple representation of a catalyst [54], [79]. This ensures the synthesizability of the catalyst but limits the design space explored by the active machine learning algorithm, as only the composition is varied and no structural or geometrical properties are considered. Ideally, the complete catalyst space is considered for every problem by, for example, considering the complete three-dimensional (3D) geometry as a representation of the catalyst site or molecule. However, not every imaginable catalyst or molecule 3D geometry is synthesizable, so there is a tradeoff between the magnitude of the design space, the so-called creativity, and synthesizability.

As illustrated by the previous example, the problem of synthesizability essentially boils down to a problem of the machine learning representation upon which constraints are added to enforce synthesizability. One intuitive approach is to use the synthesis process of the catalyst or molecule as the machine learning representation. A vector containing the catalyst composition, calcination temperature and time, and presence of ion exchange or impregnation can be used to represent a catalyst. In this way, the synthesizability of the queries is ensured, as every proposed recipe is executable. However, this representation does not necessarily ensure an easy mapping to the property of interest, and an increased amount of data might be required to model this relation.

Aside from this intuitive approach, learned machine learning representations make it possible to create a continuous representation, which ensures the validity of the proposed queries [80], [81]. By training recently developed methodologies such as variational auto-encoders or generative adversarial neural networks on a set of synthesizable molecules or catalysts, a learned machine learning representation—that is, a so-called latent space—can be developed, ensuring the synthesizability of the proposed queries [80], [82], [83]. Upon this representation, additional constraints on the catalyst or molecule can be enforced, according to the application [31].

Finding an adequate representation is always important in machine learning problems. For active machine learning, this representation is essential in order to harmonize both synthesizability and creativity.

5. Conclusions and perspectives

Active machine learning is extremely well suited for use by chemical engineering researchers to speed up experimental campaigns ranging from molecule and catalyst design to reaction and reactor design. However, active machine learning is not well-known among experimental researchers, and many active machine learning applications are not currently user friendly. Better collaboration between machine learning experts and chemical engineers can overcome these barriers. Such interactions will also help to tune active machine learning algorithms, depending on the applied (automated) experimental units and procedures, which will improve the performance of these algorithms. A key barrier here is the suboptimal initial experimental selection, which can be overcome by integrating transfer learning and active learning with the aid of multi-fidelity models. Moreover, the application domain of active machine learning can be significantly extended by adapting general active machine learning algorithms to obtain “tailor-made” algorithms, depending on the setup constraints. While the algorithms should be customized, the data should be generally usable, such that performed experiments can serve multiple purposes. By harmonizing synthesizability and creativity, active machine learning is bound to make significant advances in the fields of molecule and catalyst synthesis. Recent promising breakthroughs will allow active machine learning to become an essential tool for the chemical engineer and will further facilitate autonomous and efficient scientific discoveries, which will contribute to a more sustainable chemical industry in the future.

Acknowledgments

Yannick Ureel, Maarten R. Dobbelaere, and Kevin De Ras respectively acknowledge financial support from the Fund for Scientific Research Flanders (FWO Flanders) through the doctoral fellowship grants (1185822N, 1S45522N, and 3F018119). The authors acknowledge funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (818607).

Compliance with ethics guidelines

Yannick Ureel, Maarten R. Dobbelaere, Yi Ouyang, Kevin De Ras, Maarten K. Sabbe, Guy B. Marin, and Kevin M. Van Geem declare that they have no conflict of interest or financial conflicts to disclose.

References

[1]

Oxford Economics Ltd. The global chemical industry: catalyzing growth and addressing our world’s sustainability challenges. Oxford Economics Ltd., Oxford (2019)

[2]

Ž.R. Lazić. Design of experiments in chemical engineering: a practical guide. Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim (2006)

[3]

G. Franceschini, S. Macchietto. Model-based design of experiments for parameter precision: state of the art. Chem Eng Sci, 63 (19) (2008), pp. 4846-4872.

[4]

A.A. Melnikov, H. Poulsen Nautrup, M. Krenn, V. Dunjko, M. Tiersch, A. Zeilinger, et al. Active learning machine learns to create new quantum experiments. Proc Natl Acad Sci USA, 115 (6) (2018), pp. 1221-1226. DOI: 10.1073/pnas.1714936115

[5]

N. Duong-Trung, S. Born, J.W. Kim, M.T. Schermeyer, K. Paulick, M. Borisyak, et al. When bioprocess engineering meets machine learning: a survey from the perspective of automated bioprocess development. Biochem Eng J, 190 (2023), Article 108764.

[6]

F. Olsson. A literature survey of active machine learning in the context of natural language processing. Swedish Institute of Computer Science, Kista (2009)

[7]

G.B. Marin, V.V. Galvita, G.S. Yablonsky. Kinetics of chemical processes: from molecular to industrial scale. J Catal, 404 (2021), pp. 745-759.

[8]

B. Settles. Active learning. Springer Nature Switzerland AG, Cham (2012)

[9]

Frazier PI. A tutorial on Bayesian optimization. 2018. arXiv:1807.02811v1.

[10]

Y. Ureel, M.R. Dobbelaere, O. Akin, R.J. Varghese, C.G. Pernalete, J.W. Thybaut, et al. Active learning-based exploration of the catalytic pyrolysis of plastic waste. Fuel, 328 (2022), Article 125340.

[11]

N.S. Eyke, W.H. Green, K.F. Jensen. Iterative experimental design based on active machine learning reduces the experimental burden associated with reaction screening. React Chem Eng, 5 (10) (2020), pp. 1963-1972. DOI: 10.1039/d0re00232a

[12]

A.M. Schweidtmann, A.D. Clayton, N. Holmes, E. Bradford, R.A. Bourne, A.A. Lapkin. Machine learning meets continuous flow chemistry: automated optimization towards the Pareto front of multiple objectives. Chem Eng J, 352 (2018), pp. 277-282.

[13]

Y. Amar, A.M. Schweidtmann, P. Deutsch, L. Cao, A. Lapkin. Machine learning and molecular descriptors enable rational solvent selection in asymmetric catalysis. Chem Sci, 10 (27) (2019), pp. 6697-6706. DOI: 10.1039/c9sc01844a

[14]

A.D. Clayton, A.M. Schweidtmann, G. Clemens, J.A. Manson, C.J. Taylor, C.G. Niño, et al. Automated self-optimisation of multi-step reaction and separation processes using machine learning. Chem Eng J, 384 (2020), Article 123340.

[15]

S. Thrun. Exploration in active learning. M.A. Arbib (Ed.), The handbook of brain theory and neural networks, MIT Press, Cambridge (1995), pp. 381-384.

[16]

C.E. Rasmussen, C.K.I. Williams. Gaussian processes for machine learning. MIT Press, Cambridge (2006)

[17]

E.V. Podryabinkin, A.V. Shapeev. Active learning of linearly parametrized interatomic potentials. Comput Mater Sci, 140 (2017), pp. 171-180.

[18]

J. Vandermause, S.B. Torrisi, S. Batzner, Y. Xie, L. Sun, A.M. Kolpak, et al. On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events. NPJ Comput Mater, 6 (1) (2020), p. 20.

[19]

Riis C, Antunes F, Hüttel FB, Azevedo CL, Pereira FC. Bayesian active learning with fully Bayesian Gaussian processes. 2022. arXiv:2205.10186.

[20]

Blundell C, Cornebise J, Kavukcuoglu K, Wierstra D. Weight uncertainty in neural networks. In: Proceedings of the 32nd International Conference on Machine Learning; 2015 Jul 7-9; Lille, France; 2015. p. 1613-22.

[21]

Gal Y, Islam R, Ghahramani Z. Deep Bayesian active learning with image data. In: Proceedings of the 34th International Conference on Machine Learning; 2017 Aug 6-11; Sydney, NSW, Australia; 2017. p. 1183-92.

[22]

Hafner D, Tran D, Lillicrap T, Irpan A, Davidson J. Noise contrastive priors for functional uncertainty. In: Proceedings of the 35th Uncertainty in Artificial Intelligence Conference; 2019 Jul 22-25; Tel Aviv, Israel; 2020. p. 905-14.

[23]

McHutchon A, Rasmussen C. Gaussian process training with input noise. In: Shawe-Taylor J, Zemel R, Bartlett P, Pereira F, Weinberger KQ, editors. Proceedings of the 24th International Conference on Neural Information Processing Systems; 2011 Dec 12-14; Granada, Spain; 2011. p. 1341-9.

[24]

Y. Zhang, A.A. Lee. Bayesian semi-supervised learning for uncertainty-calibrated prediction of molecular properties and active learning. Chem Sci, 10 (35) (2019), pp. 8154-8163. DOI: 10.1039/c9sc00616h

[25]

M. Núñez, D.G. Vlachos. Multiscale modeling combined with active learning for microstructure optimization of bifunctional catalysts. Ind Eng Chem Res, 58 (15) (2019), pp. 6146-6154. DOI: 10.1021/acs.iecr.8b04801

[26]

G. Sivaraman, A.N. Krishnamoorthy, M. Baur, C. Holm, M. Stan, G. Csányi, et al. Machine-learned interatomic potentials by active learning: amorphous and liquid hafnium dioxide. NPJ Comput Mater, 6 (1) (2020), p. 104.

[27]

D. Reker, P. Schneider, G. Schneider, J.B. Brown. Active learning for computational chemogenomics. Future Med Chem, 9 (4) (2017), pp. 381-402. DOI: 10.4155/fmc-2016-0197

[28]

K.A. Brown, S. Brittman, N. Maccaferri, D. Jariwala, U. Celano. Machine learning in nanoscience: big data at small scales. Nano Lett, 20 (1) (2020), pp. 2-10. DOI: 10.1021/acs.nanolett.9b04090

[29]

Hansen MH, Torres JAG, Jennings PC, Wang Z, Boes JR, Mamun OG, et al. An atomistic machine learning package for surface science and catalysis. 2019. arXiv:1904.00904.

[30]

Griffiths RR, Hernández-Lobato JM. Constrained Bayesian optimization for automatic chemical design. 2017. arXiv:1709.05501.

[31]

R.R. Griffiths, J.M. Hernández-Lobato. Constrained Bayesian optimization for automatic chemical design using variational autoencoders. Chem Sci, 11 (2) (2020), pp. 577-586. DOI: 10.1039/c9sc04026a

[32]

K. Tran, Z.W. Ulissi. Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution. Nat Catal, 1 (9) (2018), pp. 696-703. DOI: 10.1038/s41929-018-0142-1

[33]

A.G. Kusne, H. Yu, C. Wu, H. Zhang, J. Hattrick-Simpers, B. DeCost, et al. On-the-fly closed-loop materials discovery via Bayesian active learning. Nat Commun, 11 (1) (2020), Article 5966.

[34]

L.B. Oftelie, P. Rajak, R.K. Kalia, A. Nakano, F. Sha, J. Sun, et al. Active learning for accelerated design of layered materials. NPJ Comput Mater, 4 (1) (2018), p. 74

[35]

J.R. Kitchin. Machine learning in catalysis. Nat Catal, 1 (4) (2018), pp. 230-232. DOI: 10.1038/s41929-018-0056-y

[36]

K.M. Jablonka, G.M. Jothiappan, S. Wang, B. Smit, B. Yoo. Bias free multiobjective active learning for materials design and discovery. Nat Commun, 12 (1) (2021), Article 2312.

[37]

C. Zhang, Y. Amar, L. Cao, A.A. Lapkin. Solvent selection for Mitsunobu reaction driven by an active learning surrogate model. Org Process Res Dev, 24 (12) (2020), pp. 2864-2873. DOI: 10.1021/acs.oprd.0c00376

[38]

A.D. Clayton, J.A. Manson, C.J. Taylor, T.W. Chamberlain, B.A. Taylor, G. Clemens, et al. Algorithms for the self-optimisation of chemical reactions. React Chem Eng, 4 (2019), pp. 1545-1554. DOI: 10.1039/c9re00209j

[39]

B.J. Shields, J. Stevens, J. Li, M. Parasram, F. Damani, J.I.M. Alvarado, et al. Bayesian reaction optimization as a tool for chemical synthesis. Nature, 590 (7844) (2021), pp. 89-96. DOI: 10.1038/s41586-021-03213-y

[40]

K.C. Felton, J.G. Rittig, A.A. Lapkin. Summit: benchmarking machine learning methods for reaction optimisation. Chem-Methods, 1 (2) (2021), pp. 116-122. DOI: 10.1002/cmtd.202000051

[41]

Felton K, Wigh D, Lapkin A. Multi-task Bayesian optimization of chemical reactions. 2020. ChemRxiv: 13250216.v1.

[42]

O. Dogu, A. Eschenbacher, R.J. Varghese, M. Dobbelaere, D.R. D’Hooge, P.H.M. Van Steenberge, et al. Bayesian tuned kinetic Monte Carlo modeling of polystyrene pyrolysis: unraveling the pathways to its monomer, dimers, and trimers formation. Chem Eng J, 455 (2023), Article 140708.

[43]

A. Tran, J. Sun, J.M. Furlan, K.V. Pagalthivarthi, R.J. Visintainer, Y. Wang. pBO-2GP-3B: a batch parallel known/unknown constrained Bayesian optimization with feasibility classification and its applications in computational fluid dynamics. Comput Methods Appl Mech Eng, 347 (2019), pp. 827-852.

[44]

S. Park, J. Na, M. Kim, J.M. Lee. Multi-objective Bayesian optimization of chemical reactor design using computational fluid dynamics. Comput Chem Eng, 119 (2018), pp. 25-37.

[45]

Y. Morita, S. Rezaeiravesh, N. Tabatabaei, R. Vinuesa, K. Fukagata, P. Schlatter. Applying Bayesian optimization with Gaussian process regression to computational fluid dynamics problems. J Comput Phys, 449 (2022), Article 110788.

[46]

C.M. Friend, B. Xu. Heterogeneous catalysis: a central science for a sustainable future. Acc Chem Res, 50 (3) (2017), pp. 517-521. DOI: 10.1021/acs.accounts.6b00510

[47]

Sabatier P. La catalyse en chimie organique. Paris: Hachette Livre; 1920. French.

[48]

S. Ichikawa. Harmonious optimum conditions for heterogeneous catalytic reactions derived analytically with Polanyi relation and Bronsted relation. J Catal, 404 (2021), pp. 706-715.

[49]

R.N. Landau, S.C. Korré, M. Neurock, M.T. Klein, R.J. Quann. Hydrocracking phenanthrene and 1-methyl naphthalene: development of linear free energy relationships. M. Oballa (Ed.), Catalytic hydroprocessing of petroleum and distillates, CRC Press, Boca Raton (2020), pp. 421-432. DOI: 10.1201/9781003067306-22

[50]

S. Vijay, G. Kastlunger, K. Chan, J.K. Nørskov. Limits to scaling relations between adsorption energies?. J Chem Phys, 156 (23) (2022), Article 231102.

[51]

X. Hong, K. Chan, C. Tsai, J.K. Nørskov. How doped MoS2 breaks transition-metal scaling relations for CO2 electrochemical reduction. ACS Catal, 6 (7) (2016), pp. 4428-4437. DOI: 10.1021/acscatal.6b00619

[52]

J. Pérez-Ramírez, N. López. Strategies to break linear scaling relationships. Nat Catal, 2 (11) (2019), pp. 971-976. DOI: 10.1038/s41929-019-0376-6

[53]

M. Zhong, K. Tran, Y. Min, C. Wang, Z. Wang, C.T. Dinh, et al. Accelerated discovery of CO2 electrocatalysts using active machine learning. Nature, 581 (7807) (2020), pp. 178-183. DOI: 10.1038/s41586-020-2242-8

[54]

A.S. Nugraha, G. Lambard, J. Na, M.S.A. Hossain, T. Asahi, W. Chaikittisilp, et al. Mesoporous trimetallic PtPdAu alloy films toward enhanced electrocatalytic activity in methanol oxidation: unexpected chemical compositions discovered by Bayesian optimization. J Mater Chem A, 8 (27) (2020), pp. 13532-13540. DOI: 10.1039/d0ta04096g

[55]

M.R. Dobbelaere, P.P. Plehiers, R. Van de Vijver, C.V. Stevens, K.M. Van Geem. Machine learning in chemical engineering: strengths, weaknesses, opportunities, and threats. Engineering, 7 (9) (2021), pp. 1201-1211.

[56]

D.K. Duvenaud. Automatic model construction with Gaussian processes [dissertation]. University of Cambridge, Cambridge (2014)

[57]

Wang Z, Dahl GE, Swersky K, Lee C, Mariet Z, Nado Z, et al. Pre-training helps Bayesian optimization too. 2022. arXiv:2207.03084.

[58]

S.H. Symoens, S.U. Aravindakshan, F.H. Vermeire, K. De Ras, M.R. Djokic, G.B. Marin, et al. QUANTIS: data quality assessment tool by clustering analysis. Int J Chem Kinet, 51 (11) (2019), pp. 872-885. DOI: 10.1002/kin.21316

[59]

F. Häse, M. Aldeghi, R.J. Hickman, L.M. Roch, A. Aspuru-Guzik. Gryffin: an algorithm for Bayesian optimization of categorical variables informed by expert knowledge. Appl Phys Rev, 8 (3) (2021), Article 031406.

[60]

F. Häse, L.M. Roch, C. Kreisbeck, A. Aspuru-Guzik. Phoenics: a Bayesian optimizer for chemistry. ACS Cent Sci, 4 (9) (2018), pp. 1134-1145. DOI: 10.1021/acscentsci.8b00307

[61]

Snoek J, Larochelle H, Adams RP. Practical Bayesian optimization of machine learning algorithms. In: Pereira F, Burges CJ, Bottou L, Weinberger KQ, editors. Proceedings of the 25th International Conference on Neural Information Processing Systems; 2012 Dec 3-6; Lake Tahoe, NV, USA. Red Hook: Curran Associates Inc.; 2012. p. 2951-9.

[62]

Xie Y, Tomizuka M, Zhan W. Towards general and efficient active learning. 2021. arXiv:2112.07963.

[63]

R.R. Griffiths, A.A. Aldrick, M. Garcia-Ortegon, V. Lalchand, A.A. Lee. Achieving robustness to aleatoric uncertainty with heteroscedastic Bayesian optimisation. Mach Learn Sci Technol, 3 (1) (2021), Article 015004

[64]

R.J. Hickman, M. Aldeghi, F. Häse, A. Aspuru-Guzik. Bayesian optimization with known experimental and design constraints for chemistry applications. Digit Discov, 1 (2022), pp. 732-744. DOI: 10.1039/d2dd00028h

[65]

W.G. Habashi, J. Dompierre, Y. Bourgault, D. Ait-Ali-Yahia, M. Fortin, M.G. Vallet. Anisotropic mesh adaptation: towards user-independent, mesh-independent and solver-independent CFD. Part I: general principles. Int J Numer Meth Fluids, 32 (6) (2000), pp. 725-744.

[66]

B. Burger, P.M. Maffettone, V.V. Gusev, C.M. Aitchison, Y. Bai, X. Wang, et al. A mobile robotic chemist. Nature, 583 (7815) (2020), pp. 237-241. DOI: 10.1038/s41586-020-2442-2

[67]

L. Hoffer, Y.V. Voitovich, B. Raux, K. Carrasco, C. Muller, A.Y. Fedorov, et al. Integrated strategy for lead optimization based on fragment growing: the diversity-oriented-target-focused-synthesis approach. J Med Chem, 61 (13) (2018), pp. 5719-5732. DOI: 10.1021/acs.jmedchem.8b00653

[68]

A.C. Bédard, A. Adamo, K.C. Aroh, M.G. Russell, A.A. Bedermann, J. Torosian, et al. Reconfigurable system for automated optimization of diverse chemical reactions. Science, 361 (6408) (2018), pp. 1220-1225. DOI: 10.1126/science.aat0650

[69]

C. Mateos, M.J. Nieves-Remacha, J.A. Rincón. Automated platforms for reaction self-optimization in flow. React Chem Eng, 4 (9) (2019), pp. 1536-1544. DOI: 10.1039/c9re00116f

[70]

N.S. Eyke, B.A. Koscher, K.F. Jensen. Toward machine learning-enhanced high-throughput experimentation. Trends Chem, 3 (2) (2021), pp. 120-132.

[71]

I. Hahndorf, O. Buyevskaya, M. Langpape, G. Grubert, S. Kolf, E. Guillon, et al. Experimental equipment for high-throughput synthesis and testing of catalytic materials. Chem Eng J, 89 (1-3) (2002), pp. 119-125.

[72]

K.H. Oh, H.K. Lee, S.W. Kang, J.I. Yang, G. Nam, T. Lim, et al. Automated synthesis and data accumulation for fast production of high-performance Ni nanocatalysts. J Ind Eng Chem, 106 (2022), pp. 449-459.

[73]

M.D. Wilkinson, M. Dumontier, I.J. Aalbersberg, G. Appleton, M. Axton, A. Baak, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data, 3 (1) (2016), Article 160018.

[74]

K.P. Greenman, W.H. Green, R. Gómez-Bombarelli. Multi-fidelity prediction of molecular optical peaks with deep learning. Chem Sci, 13 (4) (2022), pp. 1152-1162. DOI: 10.1039/d1sc05677h

[75]

G. Pilania, J.E. Gubernatis, T. Lookman. Multi-fidelity machine learning models for accurate bandgap predictions of solids. Comput Mater Sci, 129 (2017), pp. 156-163.

[76]

J.P. Folch, R.M. Lee, B. Shafei, D. Walz, C. Tsay, M. van der Wilk, et al. Combining multi-fidelity modelling and asynchronous batch Bayesian optimization. Comput Chem Eng, 172 (2023), Article 108194.

[77]

S. Mao, B. Wang, Y. Tang, F. Qian. Opportunities and challenges of artificial intelligence for green manufacturing in the process industry. Engineering, 5 (6) (2019), pp. 995-1002.

[78]

E. Shim, J.A. Kammeraad, Z. Xu, A. Tewari, T. Cernak, P.M. Zimmerman. Predicting reaction conditions from limited data through active transfer learning. Chem Sci, 13 (22) (2022), pp. 6655-6668. DOI: 10.1039/d1sc06932b

[79]

M. Kim, M.Y. Ha, W.B. Jung, J. Yoon, E. Shin, I.D. Kim, et al. Searching for an optimal multi-metallic alloy catalyst by active learning combined with experiments. Adv Mater, 34 (19) (2022), Article 2108900.

[80]

R. Gómez-Bombarelli, J.N. Wei, D. Duvenaud, J.M. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent Sci, 4 (2) (2018), pp. 268-276. DOI: 10.1021/acscentsci.7b00572

[81]

C. Shang, F. You. Data analytics and machine learning for smart process manufacturing: recent advances and perspectives in the big data era. Engineering, 5 (6) (2019), pp. 1010-1016.

[82]

Sanchez-Lengeling B, Outeiral C, Guimaraes GL, Aspuru-Guzik A. Optimizing distributions over molecular space. An objective-reinforced generative adversarial network for inverse-design chemistry (ORGANIC). 2017. ChemRxiv: 5309668.v3.

[83]

Z. Jensen, S. Kwon, D. Schwalbe-Koda, C. Paris, R. Gómez-Bombarelli, Y. Román-Leshkov, et al. Discovering relationships between OSDAs and zeolites through data mining and generative neural networks. ACS Cent Sci, 7 (5) (2021), pp. 858-867. DOI: 10.1021/acscentsci.1c00024
