
Frontiers of Chemical Science and Engineering >> 2022, Volume 16, Issue 4 doi: 10.1007/s11705-021-2083-5

Machine learning-based solubility prediction and methodology evaluation of active pharmaceutical ingredients in industrial crystallization

Available online: 2021-10-12


Abstract

Solubility is widely regarded as a fundamental property of small-molecule drugs and drug candidates because it has a profound impact on the crystallization process. Solubility prediction, an alternative to experiments that can reduce waste and improve crystallization process efficiency, has attracted increasing attention, yet many challenges remain. Herein, we used seven descriptors, chosen from an understanding of dissolution behavior, to establish two solubility prediction models with machine learning (ML) algorithms. The models were built on the solubility data of 120 active pharmaceutical ingredients (APIs) in ethanol, using random decision forests and an artificial neural network with optimized data structure and model accuracy. The ML models were then compared with traditional prediction methods, including the modified solubility equation and a quantitative structure-property relationship model. The ML models achieved the highest accuracy on the testing set, indicating the best solubility prediction ability among the methods compared. Multiple linear regression and stepwise regression were used to further investigate the critical factors determining the solubility value. The results revealed that both the API properties and the solute-solvent interaction make a non-negligible contribution to the solubility value.
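
The multiple linear regression step mentioned in the abstract can be illustrated with a minimal sketch. The abstract names neither the seven descriptors nor the fitting procedure, so everything below is an assumption made for illustration: the data are synthetic placeholders (not the paper's measurements), the weights are made up, and the helper `standardized_mlr` is a hypothetical name. The idea shown is only that, once descriptors and target are z-scored, the magnitudes of the fitted coefficients give a rough ranking of each descriptor's contribution.

```python
import numpy as np


def standardized_mlr(X, y):
    """Ordinary least squares on z-scored descriptors and target.

    Returns (coefficients, r_squared). Because every column and the
    target are centered, no intercept term is needed.
    """
    Xz = (X - X.mean(axis=0)) / X.std(axis=0)
    yz = (y - y.mean()) / y.std()
    coefs, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    residuals = yz - Xz @ coefs
    r_squared = 1.0 - (residuals @ residuals) / (yz @ yz)
    return coefs, r_squared


# Synthetic placeholder data: 120 samples and 7 descriptors, matching only
# the counts given in the abstract. The values and weights are invented.
rng = np.random.default_rng(0)
n_apis, n_descriptors = 120, 7
X = rng.normal(size=(n_apis, n_descriptors))
assumed_weights = np.array([0.2, -1.5, 0.0, 0.8, 0.0, 0.3, -0.1])
y = X @ assumed_weights + rng.normal(scale=0.1, size=n_apis)

coefs, r_squared = standardized_mlr(X, y)

# Rank descriptor indices by absolute standardized coefficient, largest first.
ranking = np.argsort(-np.abs(coefs))
```

Here `ranking[0]` is the index of the descriptor with the largest standardized effect in the synthetic data. Stepwise regression, which the abstract also mentions, would add or drop descriptors one at a time around a fit like this one; it is not shown here.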
