PURPOSE: To develop a quantitative structure-bioavailability relationship (QSBR) model for drug discovery and development.

METHODS: A database of drugs with human oral bioavailability data was assembled in electronic form, with structures in SMILES format. Using that database, a stepwise regression procedure was used to link oral bioavailability in humans to substructural fragments in drugs. The regression model was compared with Lipinski's Rule of Five.

RESULTS: The human oral bioavailability database contains 591 compounds. A regression model employing 85 descriptors was built to predict the human oral bioavailability of a compound from its molecular structure. Compared to Lipinski's Rule of Five, false negative predictions were reduced from 5% to 3%, while false positive predictions decreased from 78% to 53%. A set of substructural descriptors was identified that shows which fragments tend to increase or decrease human oral bioavailability.

CONCLUSIONS: A novel quantitative structure-bioavailability relationship (QSBR) was developed. Despite a large degree of experimental error, the model was reasonably predictive and stood up to cross-validation. When compared to Lipinski's Rule of Five, the QSBR model was able to reduce false positive predictions.
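The comparison with Lipinski's Rule of Five reported above can be illustrated with a minimal Python sketch. It is not the authors' procedure. The `Descriptors` container, the function names, and the choice to tolerate one violation are assumptions made for illustration, and the abstract does not state how false positive and false negative rates were defined, so the definitions in `error_rates` are only one plausible reading.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one compound (hypothetical container)."""
    mol_weight: float      # molecular weight in daltons
    log_p: float           # octanol-water partition coefficient
    h_bond_donors: int     # number of OH and NH groups
    h_bond_acceptors: int  # number of N and O atoms


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four limits the compound exceeds."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: a compound passes with at most one violation."""
    return rule_of_five_violations(d) <= max_violations


def error_rates(predicted, observed):
    """Return (false_positive_rate, false_negative_rate).

    Both inputs are sequences of bools, True meaning "good bioavailability".
    Assumed definitions: the false positive rate is the fraction of
    observed-poor compounds predicted good; the false negative rate is the
    fraction of observed-good compounds predicted poor.
    """
    pairs = list(zip(predicted, observed))
    false_pos = sum(1 for p, o in pairs if p and not o)
    false_neg = sum(1 for p, o in pairs if o and not p)
    negatives = sum(1 for _, o in pairs if not o)
    positives = sum(1 for _, o in pairs if o)
    fpr = false_pos / negatives if negatives else 0.0
    fnr = false_neg / positives if positives else 0.0
    return fpr, fnr
```

Applying `passes_rule_of_five` to every compound and passing the resulting predictions to `error_rates`, together with observed classes derived from a chosen bioavailability cutoff, would give a baseline against which a regression model's predictions could be compared in the same way.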