pubmed-article:12115811 | pubmed:abstractText | A quantitative structure-property relationship (QSPR) was developed for predicting the aqueous solubility of drug-like compounds from their chemical structures. A set of 321 structurally diverse drugs or related compounds, with their intrinsic aqueous solubility collected from the literature, was used in this analysis. The data were divided into a training set (n = 267) for building the model and a randomly chosen testing set (n = 54) for assessing the predictive ability of the model. A series of molecular descriptors was calculated directly from chemical structures and a set of eight descriptors, including dipole moment, surface area, volume, molecular weight, number of rotatable bonds/total bonds, number of hydrogen-bond acceptors, number of hydrogen-bond donors and density, was chosen for the final model. The eight-descriptor model generated by multiple linear regression was further optimized by a genetic-algorithm-guided selection method. The model has a correlation coefficient (r) of 0.95 and a root-mean-square (rms) error of 0.56 log units. It predicts the solubility of testing set compounds with a reasonable degree of accuracy (r = 0.84 and rms = 0.86 log units). The present model can serve as a tool for medicinal chemists to guide their early synthetic efforts in arriving at appropriate analogs. | lld:pubmed |
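
The workflow summarized in the abstract (a random 267/54 split, a multiple linear regression on eight descriptors, and evaluation by r and rms error) might be sketched roughly as follows. This is only an illustration under stated assumptions: the descriptor matrix and solubility values are synthetic random placeholders, not the study's data; the function names are hypothetical; and the genetic-algorithm-guided descriptor selection is not reproduced here.

```python
import numpy as np


def split_train_test(n_total, n_test, rng):
    """Randomly partition indices 0..n_total-1 into training and testing sets."""
    order = rng.permutation(n_total)
    return order[n_test:], order[:n_test]


def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]


def r_and_rms(y_true, y_pred):
    """Pearson correlation coefficient and root-mean-square error."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    rms = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return r, rms


rng = np.random.default_rng(0)
n_compounds, n_descriptors, n_test = 321, 8, 54

# ASSUMPTION: synthetic stand-ins for the eight descriptors and log-solubility.
X = rng.normal(size=(n_compounds, n_descriptors))
true_w = rng.normal(size=n_descriptors)
y = X @ true_w + rng.normal(scale=0.5, size=n_compounds)

train_idx, test_idx = split_train_test(n_compounds, n_test, rng)
coef = fit_mlr(X[train_idx], y[train_idx])

r_train, rms_train = r_and_rms(y[train_idx], predict(coef, X[train_idx]))
r_test, rms_test = r_and_rms(y[test_idx], predict(coef, X[test_idx]))
print(f"train: r={r_train:.2f} rms={rms_train:.2f}")
print(f"test:  r={r_test:.2f} rms={rms_test:.2f}")
```

Because the inputs are synthetic, the printed statistics bear no relation to the values reported in the abstract; the sketch only shows how training-set and testing-set r and rms error would be computed from a held-out split.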