Implemented ADMET Predictions

The implemented absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction models, including their performance measures, are described in our paper, which is available online.1 The 15 models cover a diverse set of ADMET endpoints. Some of the models have already been published, including those for Maximum Recommended Therapeutic Dose (MRTD),2 chemical mutagenicity,3 human liver microsomal (HLM) stability,4 and Pgp inhibitors and substrates.5 We also present several new models, which we make available here for the first time.


Liver Toxicity

  • DILI: Drug-induced liver injury (DILI) has been one of the most commonly cited reasons for drug withdrawals from the market. This application predicts whether a compound could cause DILI. The dataset of 1,431 compounds was obtained from four sources used by Xu et al.8 This dataset contains both pharmaceuticals and non-pharmaceuticals; we classified a compound as causing DILI if it was associated with a high risk of DILI and as not causing DILI if there was no such risk.
    Download DILI dataset or view model performance or view model performance (Old Version)
  • Cytotoxicity (HepG2): Cytotoxicity is the degree to which a chemical causes damage to cells. We developed a cytotoxicity prediction model, using in vitro data on toxicity against HepG2 cells for 6,000 structurally diverse compounds, which we collected from ChEMBL. In developing our model, we considered compounds with an IC50 ≤ 10 μM in the in vitro assay as cytotoxic.
    Download Cytotoxicity dataset or view model performance or view model performance (Old Version)
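
The labeling rule described for the cytotoxicity dataset (IC50 ≤ 10 μM counts as cytotoxic) can be sketched as below. This is a minimal illustration only: the function name and the string labels are hypothetical and are not taken from the vNN code.

```python
def label_cytotoxicity(ic50_um: float, cutoff_um: float = 10.0) -> str:
    """Label a compound from its HepG2 IC50 value, given in micromolar.

    Compounds with IC50 at or below the cutoff are treated as cytotoxic,
    following the 10 uM threshold stated in the text.
    """
    if ic50_um <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return "cytotoxic" if ic50_um <= cutoff_um else "non-cytotoxic"


if __name__ == "__main__":
    for value in (0.5, 10.0, 25.0):
        print(value, label_cytotoxicity(value))
```

The comparison is inclusive, so a compound measured at exactly 10 μM falls in the cytotoxic class.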


Metabolism


Membrane Transporters

  • BBB: The blood-brain barrier (BBB) is a highly selective barrier that separates the circulating blood from the central nervous system. We developed a vNN-based BBB model, using 352 compounds whose BBB permeability values (logBB) were obtained from the literature.6,7 We classified compounds with logBB values of less than –0.3 as BBB non-permeable and those with values greater than +0.3 as BBB permeable.
    Download BBB dataset or view model performance or view model performance (Old version)
  • Pgp Substrates and Inhibitors: P-glycoprotein (Pgp) is an essential cell membrane protein that extracts many foreign substances from the cell. Cancer cells often overexpress Pgp, which increases the efflux of chemotherapeutic agents from the cell and prevents treatment by reducing the effective intracellular concentrations of such agents—a phenomenon known as multidrug resistance. For this reason, identifying compounds that can either be transported out of the cell by Pgp (substrates) or impair Pgp function (inhibitors) is of great interest. We have developed models to predict both Pgp substrates and Pgp inhibitors.5 The Pgp substrate dataset was collected by Hou and co-workers.11 This dataset consists of measurements of 422 substrates and 400 non-substrates. To generate a large Pgp inhibitor dataset, we combined two datasets,12,13 and removed duplicates to form a combined dataset consisting of a training set of 1,319 inhibitors and 937 non-inhibitors.
    Download Pgp Substrates dataset or view model performance or view model performance (Old version)
    Download Pgp Inhibitors dataset or view model performance or view model performance (Old version)
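
The logBB thresholds given for the BBB model can be sketched as a small labeling function. The text does not say how compounds with logBB between –0.3 and +0.3 were handled; the sketch assumes they are left unlabeled, and the function name is hypothetical.

```python
from typing import Optional


def label_bbb(log_bb: float, lower: float = -0.3, upper: float = 0.3) -> Optional[str]:
    """Assign a BBB class from a logBB value.

    Values below `lower` are non-permeable and values above `upper` are
    permeable. Returning None for the range in between is an assumption of
    this sketch, not a rule stated in the text.
    """
    if log_bb < lower:
        return "non-permeable"
    if log_bb > upper:
        return "permeable"
    return None


if __name__ == "__main__":
    for value in (-1.2, 0.0, 0.8):
        print(value, label_bbb(value))
```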


Others

  • hERG (Cardiotoxicity): The human ether-à-go-go-related gene (hERG) codes for a potassium ion channel involved in normal cardiac repolarization. Drug-induced blockade of hERG function can cause long QT syndrome, which may result in arrhythmia and death. We retrieved 282 known hERG blockers from the literature and classified compounds with IC50 values of 10 μM or less as blockers.9 We also collected a set of 404 compounds with IC50 values greater than 10 μM from ChEMBL and classified them as non-blockers.
    Download hERG dataset or view model performance or view model performance (Old Version)
  • MMP (Mitochondrial Toxicity): Given the fundamental role of mitochondria in cellular energetics and oxidative stress, mitochondrial dysfunction has been implicated in cancer, diabetes, neurodegenerative disorders, and cardiovascular diseases. We used the largest dataset of chemical-induced changes in mitochondrial membrane potential (MMP), based on the assumption that a compound that causes mitochondrial dysfunction is also likely to reduce the MMP. We developed a vNN-based MMP prediction model, using 6,261 compounds collected from a previous study that screened a library of 10,000 compounds (~8,300 unique chemicals) at 15 concentrations, each in triplicate, to measure changes in the MMP in HepG2 cells.10 The study found that 913 compounds decreased the MMP, whereas 5,395 compounds had no effect.
    Download MMP dataset or view model performance or view model performance (Old Version)
  • Mutagenicity (AMES Test): Mutagens are chemicals that cause abnormal genetic mutations, which can lead to cancer. A common way to assess a chemical’s mutagenicity is the Ames test. We developed a prediction model, using a literature dataset of 6,512 compounds, of which 3,503 were Ames-positive. We provide further details of the model and its performance in Reference 3.
    Download AMES Test dataset or view model performance or view model performance (Old Version)
  • MRTD: The Maximum Recommended Therapeutic Dose (MRTD) is an estimated upper daily dose that is safe. We built a prediction model based on a dataset of MRTD values publicly disclosed by the FDA, mostly of single-day oral doses for an average adult with a body weight of 60 kg, for 1,220 compounds (most of which are small organic drugs). We excluded organometallics, high-molecular-weight polymers (>5,000 Da), nonorganic chemicals, mixtures of chemicals, and very small molecules (<100 Da). We used an external test set of 160 compounds that were collected by the FDA for validation. The total dataset for our model contained 1,185 compounds.2 The predicted MRTD value is reported in mg/day, based on an average adult weighing 60 kg.
    Download MRTD dataset or view model performance or view model performance (Old Version)
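
Two of the MRTD details above lend themselves to a short sketch: the molecular-weight exclusion window (100 to 5,000 Da) and the reporting of a dose in mg/day for a 60 kg adult. The sketch assumes the underlying dose is expressed per kilogram of body weight per day, which the text does not state explicitly; all names are hypothetical.

```python
BODY_WEIGHT_KG = 60.0  # average adult body weight stated in the text


def passes_weight_filter(mol_weight_da: float,
                         low_da: float = 100.0,
                         high_da: float = 5000.0) -> bool:
    """Keep compounds that are neither very small (<100 Da) nor very large (>5,000 Da)."""
    return low_da <= mol_weight_da <= high_da


def mrtd_mg_per_day(mrtd_mg_per_kg_day: float,
                    body_weight_kg: float = BODY_WEIGHT_KG) -> float:
    """Convert a per-kilogram daily dose (assumed input unit) to mg/day."""
    return mrtd_mg_per_kg_day * body_weight_kg


if __name__ == "__main__":
    print(passes_weight_filter(180.16))
    print(mrtd_mg_per_day(0.5))
```

The other exclusion criteria (organometallics, nonorganic chemicals, mixtures) need structural information and are not covered here.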


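Performance measures of vNN models in 10-fold cross validation using a restricted or unrestricted applicability domain

| Model | Data^a | d0^b | h^c | Accuracy | Sensitivity | Specificity | kappa | R^d | Coverage |
|---|---|---|---|---|---|---|---|---|---|
| DILI | 1427 | 0.60 | 0.50 | 0.72 | 0.71 | 0.74 | 0.44 | | 0.64 |
| | | 1.00 | 0.20 | 0.67 | 0.62 | 0.71 | 0.33 | | 1.00 |
| Cytotox (HepG2) | 6097 | 0.40 | 0.20 | 0.85 | 0.89 | 0.76 | 0.65 | | 0.89 |
| | | 1.00 | 0.20 | 0.84 | 0.89 | 0.73 | 0.63 | | 1.00 |
| HLM | 3219 | 0.40 | 0.20 | 0.81 | 0.86 | 0.72 | 0.59 | | 0.91 |
| | | 1.00 | 0.20 | 0.80 | 0.87 | 0.69 | 0.57 | | 1.00 |
| CYP1A2 | 7558 | 0.50 | 0.20 | 0.90 | 0.95 | 0.71 | 0.68 | | 0.74 |
| | | 1.00 | 0.20 | 0.89 | 0.95 | 0.62 | 0.61 | | 1.00 |
| CYP2C9 | 8072 | 0.50 | 0.20 | 0.91 | 0.96 | 0.55 | 0.54 | | 0.75 |
| | | 1.00 | 0.20 | 0.90 | 0.96 | 0.45 | 0.46 | | 1.00 |
| CYP2C19 | 8155 | 0.55 | 0.20 | 0.61 | 0.93 | 0.87 | 0.55 | | 0.80 |
| | | 1.00 | 0.20 | 0.86 | 0.94 | 0.52 | 0.49 | | 1.00 |
| CYP2D6 | 7805 | 0.50 | 0.20 | 0.89 | 0.94 | 0.61 | 0.58 | | 0.74 |
| | | 1.00 | 0.20 | 0.88 | 0.95 | 0.53 | 0.51 | | 1.00 |
| CYP3A4 | 10373 | 0.50 | 0.20 | 0.88 | 0.92 | 0.75 | 0.68 | | 0.77 |
| | | 1.00 | 0.20 | 0.87 | 0.93 | 0.68 | 0.63 | | 1.00 |
| BBB | 353 | 0.60 | 0.20 | 0.90 | 0.85 | 0.94 | 0.79 | | 0.60 |
| | | 1.00 | 0.10 | 0.83 | 0.76 | 0.89 | 0.65 | | 1.00 |
| Pgp Substrate | 822 | 0.60 | 0.20 | 0.78 | 0.78 | 0.79 | 0.57 | | 0.65 |
| | | 1.00 | 0.20 | 0.72 | 0.73 | 0.72 | 0.45 | | 1.00 |
| Pgp Inhibitor | 2304 | 0.50 | 0.20 | 0.85 | 0.72 | 0.91 | 0.64 | | 0.75 |
| | | 1.00 | 0.10 | 0.81 | 0.74 | 0.86 | 0.61 | | 1.00 |
| hERG | 685 | 0.70 | 0.70 | 0.85 | 0.85 | 0.85 | 0.69 | | 0.76 |
| | | 1.00 | 0.20 | 0.83 | 0.85 | 0.80 | 0.65 | | 1.00 |
| MMP | 6261 | 0.50 | 0.40 | 0.89 | 0.94 | 0.64 | 0.60 | | 0.66 |
| | | 1.00 | 0.20 | 0.87 | 0.94 | 0.52 | 0.50 | | 1.00 |
| AMES | 6512 | 0.50 | 0.40 | 0.81 | 0.74 | 0.86 | 0.61 | | 0.78 |
| | | 1.00 | 0.20 | 0.78 | 0.75 | 0.82 | 0.57 | | 1.00 |
| MRTD^e | 1184 | 0.60 | 0.20 | | | | | 0.80 | 0.67 |
| | | 1.00 | 0.20 | | | | | 0.74 | 1.00 |

^a Number of compounds in the dataset; ^b Tanimoto-distance threshold value; ^c Smoothing factor; ^d Pearson’s correlation coefficient; ^e Regression model.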
See Pipeline Pilot Performance measures
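
The table footnotes name two vNN parameters: a Tanimoto-distance threshold (d0) and a smoothing factor (h). The sketch below shows how such parameters could enter a distance-weighted nearest-neighbor prediction. It assumes the weighting exp(-(d/h)^2) over all training compounds within d0, as we understand it from the vNN publications (for example, Reference 2); it is not the web server's actual code. Representing fingerprints as sets of integer feature identifiers, and every function name, are assumptions of this sketch.

```python
import math
from typing import Optional, Sequence, Set, Tuple


def tanimoto_distance(a: Set[int], b: Set[int]) -> float:
    """Tanimoto distance = 1 - |A and B| / |A or B| for set-based fingerprints."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def vnn_predict(query: Set[int],
                training: Sequence[Tuple[Set[int], float]],
                d0: float,
                h: float) -> Optional[float]:
    """Weighted average of training labels within distance d0 of the query.

    Returns None when no training compound lies within d0, i.e. the query
    is outside the applicability domain and receives no prediction.
    """
    numerator = 0.0
    denominator = 0.0
    for fingerprint, label in training:
        d = tanimoto_distance(query, fingerprint)
        if d <= d0:
            weight = math.exp(-((d / h) ** 2))  # assumed kernel form
            numerator += weight * label
            denominator += weight
    if denominator == 0.0:
        return None
    return numerator / denominator


if __name__ == "__main__":
    train = [({1, 2, 3}, 1.0), ({1, 2, 4}, 0.0), ({7, 8, 9}, 1.0)]
    print(vnn_predict({1, 2, 3}, train, d0=0.6, h=0.2))
    print(vnn_predict({20, 21}, train, d0=0.6, h=0.2))
```

Under this reading, setting d0 to 1.00 admits every training compound, which is why the unrestricted rows in the table report a coverage of 1.00, while a smaller d0 leaves some queries without a prediction and lowers coverage.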
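Performance measures of vNN models in 10-fold cross validation using a restricted or unrestricted applicability domain (Old Version)

| Model | Data^a | d0^b | h^c | Accuracy | Sensitivity | Specificity | kappa | R^d | Coverage |
|---|---|---|---|---|---|---|---|---|---|
| DILI | 1427 | 0.60 | 0.50 | 0.71 | 0.70 | 0.73 | 0.42 | | 0.66 |
| | | 1.00 | 0.20 | 0.67 | 0.62 | 0.72 | 0.34 | | 1.00 |
| Cytotox (HepG2) | 6097 | 0.40 | 0.20 | 0.84 | 0.88 | 0.76 | 0.64 | | 0.89 |
| | | 1.00 | 0.20 | 0.84 | 0.73 | 0.89 | 0.62 | | 1.00 |
| HLM | 3219 | 0.40 | 0.20 | 0.81 | 0.72 | 0.87 | 0.59 | | 0.91 |
| | | 1.00 | 0.20 | 0.81 | 0.70 | 0.87 | 0.57 | | 1.00 |
| CYP1A2 | 7558 | 0.50 | 0.20 | 0.90 | 0.70 | 0.95 | 0.66 | | 0.75 |
| | | 1.00 | 0.20 | 0.89 | 0.61 | 0.95 | 0.60 | | 1.00 |
| CYP2C9 | 8072 | 0.50 | 0.20 | 0.91 | 0.55 | 0.96 | 0.54 | | 0.76 |
| | | 1.00 | 0.20 | 0.90 | 0.44 | 0.96 | 0.46 | | 1.00 |
| CYP2C19 | 8155 | 0.55 | 0.20 | 0.87 | 0.64 | 0.93 | 0.58 | | 0.76 |
| | | 1.00 | 0.20 | 0.86 | 0.52 | 0.94 | 0.50 | | 1.00 |
| CYP2D6 | 7805 | 0.50 | 0.20 | 0.89 | 0.61 | 0.94 | 0.57 | | 0.75 |
| | | 1.00 | 0.20 | 0.88 | 0.52 | 0.95 | 0.51 | | 1.00 |
| CYP3A4 | 10373 | 0.50 | 0.20 | 0.88 | 0.76 | 0.92 | 0.68 | | 0.78 |
| | | 1.00 | 0.20 | 0.88 | 0.69 | 0.93 | 0.64 | | 1.00 |
| BBB | 353 | 0.60 | 0.20 | 0.90 | 0.94 | 0.86 | 0.80 | | 0.61 |
| | | 1.00 | 0.10 | 0.82 | 0.88 | 0.75 | 0.64 | | 1.00 |
| Pgp Substrate | 822 | 0.60 | 0.20 | 0.79 | 0.80 | 0.79 | 0.58 | | 0.66 |
| | | 1.00 | 0.20 | 0.73 | 0.73 | 0.74 | 0.47 | | 1.00 |
| Pgp Inhibitor | 2304 | 0.50 | 0.20 | 0.85 | 0.91 | 0.73 | 0.66 | | 0.76 |
| | | 1.00 | 0.10 | 0.81 | 0.86 | 0.74 | 0.61 | | 1.00 |
| hERG | 685 | 0.70 | 0.70 | 0.84 | 0.84 | 0.83 | 0.68 | | 0.80 |
| | | 1.00 | 0.20 | 0.82 | 0.82 | 0.83 | 0.64 | | 1.00 |
| MMP | 6261 | 0.50 | 0.40 | 0.89 | 0.64 | 0.94 | 0.61 | | 0.69 |
| | | 1.00 | 0.20 | 0.87 | 0.52 | 0.94 | 0.50 | | 1.00 |
| AMES | 6512 | 0.50 | 0.40 | 0.82 | 0.86 | 0.75 | 0.62 | | 0.79 |
| | | 1.00 | 0.20 | 0.79 | 0.82 | 0.75 | 0.57 | | 1.00 |
| MRTD^e | 1184 | 0.60 | 0.20 | | | | | 0.79 | 0.69 |
| | | 1.00 | 0.20 | | | | | 0.74 | 1.00 |

^a Number of compounds in the dataset; ^b Tanimoto-distance threshold value; ^c Smoothing factor; ^d Pearson’s correlation coefficient; ^e Regression model.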
See Knime Performance measures
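
For readers who want to reproduce the classification measures in the tables (accuracy, sensitivity, specificity, kappa, coverage), the sketch below computes them from true and predicted binary labels using the standard confusion-matrix definitions. It assumes that a prediction of None marks a compound outside the applicability domain, and that the quality measures are computed only over compounds that received a prediction; that convention and the function name are assumptions, not details taken from this page.

```python
from typing import Dict, Optional, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[Optional[int]]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, Cohen's kappa, and coverage.

    Labels are 1 (positive) or 0 (negative). A prediction of None means the
    compound was outside the applicability domain (assumed convention).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no compound received a prediction")

    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    nan = float("nan")
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    specificity = tn / (tn + fp) if (tn + fp) else nan
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected) if expected != 1.0 else nan

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
        "coverage": n / len(y_true),
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, None, None]
    for name, value in classification_metrics(truth, preds).items():
        print(f"{name}: {value:.2f}")
```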

References

  1. Schyman, P., R. Liu, V. Desai, and A. Wallqvist. vNN web server for ADMET predictions. Frontiers in Pharmacology. 2017 December 4; 8:889.
  2. Liu, R., G. Tawa, and A. Wallqvist. Locally weighted learning methods for predicting dose-dependent toxicity with application to the human maximum recommended daily dose. Chemical Research in Toxicology. 2012; 25(10):2216-2226.
  3. Liu, R., and A. Wallqvist. Merging applicability domains for in silico assessment of chemical mutagenicity. Journal of Chemical Information and Modeling. 2014; 54(3):793-800.
  4. Liu, R., P. Schyman, and A. Wallqvist. Critically assessing the predictive power of QSAR models for human liver microsomal stability. Journal of Chemical Information and Modeling. 2015; 55(8):1566-1575.
  5. Schyman, P., R. Liu, and A. Wallqvist. Using the variable-nearest neighbor method to identify P-glycoprotein substrates and inhibitors. ACS Omega. 2016; 1(5):923-929.
  6. Muehlbacher, M., G. Spitzer, K. Liedl, and J. Kornhuber. Qualitative prediction of blood–brain barrier permeability on a large and refined dataset. Journal of Computer-Aided Molecular Design. 2011; 25:1095.
  7. Naef, R. A generally applicable computer algorithm based on the group additivity method for the calculation of seven molecular descriptors: heat of combustion, logPO/W, logS, refractivity, polarizability, toxicity and logBB of organic compounds; scope and limits of applicability. Molecules. 2015; 20(10):18279-18351.
  8. Xu, Y., Z. Dai, F. Chen, S. Gao, J. Pei, and L. Lai. Deep learning for drug-induced liver injury. Journal of Chemical Information and Modeling. 2015; 55(10):2085-2093.
  9. Schyman, P., R. Liu, and A. Wallqvist. General purpose 2D and 3D similarity approach to identify hERG blockers. Journal of Chemical Information and Modeling. 2016; 56(1):213-222.
  10. Attene-Ramos, M., R. Huang, S. Michael, K. Witt, A. Richard, R. Tice, A. Simeonov, C. Austin, and M. Xia. Profiling of the Tox21 chemical collection for mitochondrial function to identify compounds that acutely decrease mitochondrial membrane potential. Environmental Health Perspectives. 2015; 123(1):49.
  11. Li, D., L. Chen, Y. Li, S. Tian, H. Sun, and T. Hou. ADMET evaluation in drug discovery. 13. Development of in silico prediction models for P-glycoprotein substrates. Molecular Pharmaceutics. 2014; 11(3):716-726.
  12. Broccatelli, F., E. Carosati, A. Neri, M. Frosini, L. Goracci, T. Oprea, and G. Cruciani. A novel approach for predicting P-glycoprotein (ABCB1) inhibition using molecular interaction fields. Journal of Medicinal Chemistry. 2011; 54(6):1740-1751.
  13. Chen, L., Y. Li, Q. Zhao, H. Peng, and T. Hou. ADME evaluation in drug discovery. 10. Predictions of P-glycoprotein inhibitors using recursive partitioning and naive Bayesian classification techniques. Molecular Pharmaceutics. 2011; 8(3):889-900.

Contact us: v n n a d m e t @ b h s a i . o r g