
Engineering >> 2019, Volume 5, Issue 6 doi: 10.1016/j.eng.2019.02.011

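Big Data Creates New Opportunities for Materials Research: A Review of Machine Learning Methods and Applications for Materials Design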

a Process Systems Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg 39106, Germany
b Process Systems Engineering, Otto-von-Guericke University Magdeburg, Magdeburg 39106, Germany

Received: 2018-11-21 Revised: 2018-12-13 Accepted: 2019-02-25 Published: 2019-08-22

Abstract

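Materials development has historically been driven by human needs and desires, and this is likely to continue for the foreseeable future. The global population is expected to reach 10 billion by 2050, bringing growing demand for clean and efficient energy, personalized consumer products, a secure food supply, and professional healthcare. New functional materials tailored to target properties or performance will be key to meeting these challenges. Traditionally, advanced materials have been discovered empirically or through experimental trial and error. As big data generated by modern experimental and computational techniques becomes more readily available, data-driven or machine learning (ML) methods are opening a new paradigm for the discovery and rational design of materials. This review briefly introduces various ML methods and the related software or tools, and highlights the main ideas and basic steps for applying ML methods to materials research. It then summarizes recent important applications of ML to the large-scale screening and optimal design of porous and polymeric materials, catalytic materials, and energetic materials. Finally, concluding remarks and an outlook are given.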
