S2: Deep Learning for predictive toxicology
Environmental exposure to chemical compounds poses serious risks to human health, including disruption of the endocrine system with adverse immune, neurological and developmental effects. The large-scale Collaborative Estrogen Receptor Activity Prediction Project (CERAPP) has already explored the use of machine learning to evaluate, from high-throughput screening data, the binding interactions and activities of environmental chemicals at the ligand-binding domain of the human estrogen receptor (ER). CERAPP includes three activity prediction tasks (Agonist, Antagonist, and Binding), which makes multi-target models of particular interest. Given the cost of experimental testing of chemicals, this class of tasks aims at prioritizing compounds, and higher-sensitivity models are preferred in order to focus on risk prevention.
Methods. We applied a combination of Deep Learning and Support Vector Machine models to predict compound activity for the ER ligand-binding domain. Our ML4Tox solution is developed on the CERAPP ToxCast training set of 1677 chemicals, described by 777 molecular features computed by Mold2. Performance is assessed in terms of Balanced Accuracy (BA) and Matthews Correlation Coefficient (MCC). Models are tested on the CERAPP Literature Evaluation Set (6319 compounds for Agonist, 6539 for Antagonist, 7283 for Binding), derived from the 7522 compounds in the CERAPP Evaluation Set by excluding compounds with relatively high (> 20%) disagreement among literature sources. To control for selection bias and other overfitting effects, the ML4Tox models are trained and evaluated in a 10x5-fold cross-validation scheme that implements a Data Analysis Protocol (DAP) developed by FBK for the US-FDA-led MAQC/SEQC initiatives. We first developed ML4Tox-Agonist, a multilayer neural network for Agonist activity with three hidden layers of 420, 30, and 5 nodes and ReLU activation. The batch size is 300, the optimizer is SGD with adaptive learning rate, and training is capped at 4000 epochs. For the Antagonist task, which is highly unbalanced with less than 3% positive labels in the training set, we developed ML4Tox-Antagonist, a Support Vector Machine (SVM) with linear kernel, C=0.0015 (tuned by grid search) and balanced class weights. For the Binding task, we added Extended Connectivity Fingerprints (ECFs) to the Mold2 features and considered linear and Gaussian kernel SVMs with C=10.
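The model configurations and the 10x5-fold cross-validation described in Methods can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the software used, so scikit-learn is an assumption, as are the feature standardization step and the helper names (`make_agonist_model`, `make_antagonist_model`, `repeated_cv_scores`), which are hypothetical. The sketch does not reproduce the full DAP (for example, the bootstrap confidence intervals or the grid search).

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, matthews_corrcoef
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_agonist_model(max_epochs=4000):
    """MLP with hidden layers (420, 30, 5), ReLU, SGD with adaptive learning rate."""
    return make_pipeline(
        StandardScaler(),  # assumption: scaling is not mentioned in the abstract
        MLPClassifier(
            hidden_layer_sizes=(420, 30, 5),
            activation="relu",
            solver="sgd",
            learning_rate="adaptive",
            batch_size=300,
            max_iter=max_epochs,  # upper bound on the number of epochs
            random_state=0,
        ),
    )


def make_antagonist_model():
    """Linear-kernel SVM with C=0.0015 and balanced class weights."""
    return make_pipeline(
        StandardScaler(),  # assumption, as above
        SVC(kernel="linear", C=0.0015, class_weight="balanced"),
    )


def repeated_cv_scores(make_model, X, y, n_repeats=10, n_splits=5, seed=0):
    """Return per-fold BA and MCC over n_repeats x n_splits stratified CV."""
    cv = RepeatedStratifiedKFold(
        n_splits=n_splits, n_repeats=n_repeats, random_state=seed
    )
    ba, mcc = [], []
    for train_idx, test_idx in cv.split(X, y):
        model = make_model()  # fresh model for every fold
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        ba.append(balanced_accuracy_score(y[test_idx], pred))
        mcc.append(matthews_corrcoef(y[test_idx], pred))
    return np.array(ba), np.array(mcc)
```

With a feature matrix `X` (for instance, Mold2 descriptors) and binary labels `y` as NumPy arrays, `repeated_cv_scores(make_antagonist_model, X, y)` would return 50 BA and 50 MCC values, one per fold, which can then be summarized by their mean and a confidence interval.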
Results. On the first task, ML4Tox-Agonist scored BA=0.648 (CI: 0.638, 0.659; 95% studentized bootstrap confidence interval) and MCC=0.371 (CI: 0.338, 0.403) in cross-validation on the training set, and BA=0.758, MCC=0.447 on the independent evaluation set, slightly improving on the FDA_NCTR_DBB model (BA(eval)=0.75) on the same task and datasets. Further, the ML4Tox-Antagonist model achieved BA=0.616 (CI: 0.593, 0.637) and MCC=0.075 (CI: 0.061, 0.089) in training, and BA=0.717, MCC=0.193 (SP=0.719, SN=0.715) in evaluation, significantly increasing sensitivity with respect to FDA_NCTR_DBB (BA(eval)=0.55, SP=0.984, SN=0.113). The Gaussian SVM ML4Tox-Binding model over combined Mold2 descriptors and ECF features scored BA=0.672 (CI: 0.662, 0.682) in cross-validation on the training set, and BA=0.639 (SP=0.677, SN=0.601) on the evaluation set, with higher sensitivity than the FDA_NCTR_DBB model (BA=0.60, SP=0.935, SN=0.262). The presentation will further discuss (a) the use of alternative descriptors extracted from 2D chemical structures using chemistry featurization methods; (b) the evaluation of multiple configurations of network hyperparameters (e.g., number of layers and nodes, activation functions); (c) the development of a high-sensitivity model for the Binding task; (d) multi-task deep learning.
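For reference, BA is the mean of sensitivity (SN) and specificity (SP), so the evaluation-set BA values reported in Results can be checked directly from the SN/SP pairs. The short sketch below does this and also gives the standard confusion-matrix form of MCC; the function names and the dictionary layout are illustrative choices, not part of the original work.

```python
import math


def balanced_accuracy(sensitivity, specificity):
    """BA = (SN + SP) / 2."""
    return (sensitivity + specificity) / 2.0


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# (SN, SP, reported BA) on the evaluation set, as given in the Results text.
reported = {
    "ML4Tox-Antagonist": (0.715, 0.719, 0.717),
    "ML4Tox-Binding": (0.601, 0.677, 0.639),
    "FDA_NCTR_DBB Antagonist": (0.113, 0.984, 0.55),
    "FDA_NCTR_DBB Binding": (0.262, 0.935, 0.60),
}

for name, (sn, sp, ba) in reported.items():
    print(f"{name}: BA from SN/SP = {balanced_accuracy(sn, sp):.4f} (reported {ba})")
```

The comparison makes the trade-off explicit: the FDA_NCTR_DBB models reach their BA almost entirely through specificity, while the ML4Tox models balance the two terms.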