Journal article details
Journal of Cheminformatics
Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms
Defang Ouyang [1], Zhuyifan Ye [1]
[1] State Key Laboratory of Quality Research in Chinese Medicine, Institute of Chinese Medical Sciences (ICMS), University of Macau, Macau, China
Keywords: Solubility prediction; Organic solvents; QSPR; Machine learning; lightGBM; Deep learning
DOI: 10.1186/s13321-021-00575-3
Source: Springer
【 Abstract 】

Rapid solvent selection is of great significance in chemistry, yet solubility prediction remains a crucial challenge. This study aimed to develop machine learning models that accurately predict compound solubility in organic solvents. A dataset of 5081 experimental data points, each recording the solubility of a compound in an organic solvent at a given temperature, was extracted and standardized. Molecular fingerprints were selected to characterize structural features. lightGBM was compared with deep learning and traditional machine learning methods (PLS, Ridge regression, kNN, DT, ET, RF, SVM) in developing models that predict solubility in organic solvents at different temperatures. Compared to the other models, lightGBM exhibited significantly better overall generalization (logS ± 0.20). For unseen solutes, the model gave a prediction accuracy (logS ± 0.59) close to the expected noise level of experimental solubility data. lightGBM also revealed the physicochemical relationship between solubility and structural features. The method enables rapid solvent screening in chemistry and may be applied to solubility prediction in other solvents.
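The abstract reports two accuracy figures: one for overall generalization and one for unseen solutes. The second requires a test set whose solutes never appear in training. The sketch below shows one way to build such a split in plain Python. It is only an illustration under stated assumptions: the record layout, the field names (`solute`, `solvent`, `temp_K`, `logS`), the function name `split_by_solute`, and the toy values are all hypothetical and are not taken from the paper, which does not describe its splitting procedure here.

```python
import random
from collections import defaultdict


def split_by_solute(records, test_fraction=0.2, seed=0):
    """Split records so each solute falls entirely in train or entirely in test.

    Hypothetical helper: the field name "solute" is an assumption.
    """
    # Group all measurements belonging to the same solute.
    by_solute = defaultdict(list)
    for rec in records:
        by_solute[rec["solute"]].append(rec)

    # Shuffle the solute identifiers reproducibly, then hold out a fraction.
    solutes = sorted(by_solute)
    random.Random(seed).shuffle(solutes)
    n_test = max(1, round(len(solutes) * test_fraction))

    test = [r for s in solutes[:n_test] for r in by_solute[s]]
    train = [r for s in solutes[n_test:] for r in by_solute[s]]
    return train, test


# Made-up placeholder records (not experimental data).
records = [
    {"solute": "solute_A", "solvent": "ethanol", "temp_K": 298.15, "logS": -1.0},
    {"solute": "solute_A", "solvent": "acetone", "temp_K": 308.15, "logS": -0.8},
    {"solute": "solute_B", "solvent": "ethanol", "temp_K": 298.15, "logS": -2.1},
    {"solute": "solute_C", "solvent": "toluene", "temp_K": 313.15, "logS": -1.5},
    {"solute": "solute_D", "solvent": "acetone", "temp_K": 298.15, "logS": -0.4},
]

train, test = split_by_solute(records, test_fraction=0.25, seed=0)
```

A random row-wise split would instead let the same solute appear on both sides, which measures the easier, overall kind of generalization; grouping by solute is what makes the held-out error reflect genuinely unseen compounds.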

【 License 】

CC BY   
