Model: Earthworm

In the paper “Ghosh, S., Ojha, P. K., Carnesecchi, E., Lombardo, A., Roy, K., & Benfenati, E. (2020). Exploring QSAR modeling of toxicity of chemicals on earthworm. Ecotoxicology and Environmental Safety, 190, 110067. https://doi.org/10.1016/j.ecoenv.2019.110067”, the authors present a study on predicting the toxicity of pesticides to earthworms. In particular, they present a QSAR model that predicts the negative logarithm of the lethal concentration (pLC50) of pesticides towards earthworms.

The dataset and information described in the paper were used to build the alvaRunner project that we present here.

alvaRunner project

This alvaRunner project contains one model:

  • a PLS regression model built using 57 molecules

The model includes the following eight descriptors:

  • nR09: number of 9-membered rings
  • C-025: atom-centred fragment R–CR–R
  • C-028: atom-centred fragment R–CR–X
  • F-083: F attached to C3(sp3)
  • B03[C-C]: Presence/absence of C – C at topological distance 3
  • B04[N-O]: Presence/absence of N – O at topological distance 4
  • B07[N-N]: Presence/absence of N – N at topological distance 7
  • F04[C-N]: Frequency of C – N at topological distance 4
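
To illustrate how the atom-pair descriptors in the list above are defined, the following Python sketch counts element pairs at a given topological distance (the number of bonds on the shortest path) in a hydrogen-depleted molecular graph. The example molecule, the function names and the graph encoding are assumptions made for this illustration; this is not necessarily how the descriptors are computed inside alvaRunner.

```python
from collections import deque


def topological_distances(adjacency, start):
    """Breadth-first search: bonds on the shortest path from `start` to each atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def atom_pair_frequency(elements, adjacency, a, b, distance):
    """Count unordered atom pairs with elements (a, b) at the given distance."""
    count = 0
    for i in range(len(elements)):
        for j, d in topological_distances(adjacency, i).items():
            # j > i ensures that each pair is counted only once.
            if j > i and d == distance and {elements[i], elements[j]} == {a, b}:
                count += 1
    return count


def atom_pair_presence(elements, adjacency, a, b, distance):
    """Binary version: 1 if at least one such pair exists, otherwise 0."""
    return int(atom_pair_frequency(elements, adjacency, a, b, distance) > 0)


# Hypothetical example: the chain N-C-C-C-C-O (hydrogens omitted).
elements = ["N", "C", "C", "C", "C", "O"]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

f04_cn = atom_pair_frequency(elements, adjacency, "C", "N", 4)
b03_cc = atom_pair_presence(elements, adjacency, "C", "C", 3)
b04_no = atom_pair_presence(elements, adjacency, "N", "O", 4)

print(f04_cn)  # 1: the nitrogen and the fourth carbon are 4 bonds apart
print(b03_cc)  # 1: the first and last carbons are 3 bonds apart
print(b04_no)  # 0: N and O are 5 bonds apart
```

In this sketch the B descriptors are simply the F counts reduced to 0 or 1, which matches the "presence/absence" versus "frequency" wording of the list.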

The authors analysed how the selected descriptors relate to toxicity.

The nR09 descriptor has been associated with lipophilicity, since cyclic compounds are more lipophilic than the corresponding open-chain compounds and lipophilicity increases toxicity. Both the C-025 and C-028 descriptors are associated with hydrophobicity and with permeability through the cuticle of earthworms. The fragments counted by F-083, which contain fluorine atoms, affect toxicity because fluorine atoms may form hydrogen bonds and electron donor-acceptor (EDA) complexes with earthworm DNA. The authors also analysed the relationship between toxicity and the atom-pair descriptors B03[C-C], B04[N-O], B07[N-N] and F04[C-N]; their considerations can be found in the paper.

Note that an applicability domain (A.D.) based on the leverage method was added to the model and included in the alvaRunner project. In addition, the formula coefficients were revised, resulting in a model that performs slightly better than the one presented in the original paper.
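
A leverage-based applicability domain can be sketched as follows. The sketch assumes the common textbook formulation, in which the leverage of a molecule with descriptor vector x is h = xᵀ(XᵀX)⁻¹x (X being the training matrix, here with an intercept column) and the warning threshold is h* = 3p/n, with p columns and n training molecules. The page does not state which variant or threshold the project uses, so these choices, the function names and the data are assumptions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverages(train, query):
    """Return h = x^T (X^T X)^-1 x for every row x of `query`."""
    p = len(train[0])
    xtx = [[sum(row[i] * row[j] for row in train) for j in range(p)]
           for i in range(p)]
    result = []
    for x in query:
        z = solve(xtx, x)  # z = (X^T X)^-1 x
        result.append(sum(xi * zi for xi, zi in zip(x, z)))
    return result


# Made-up training matrix: an intercept column plus one descriptor.
train = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
# Two made-up query molecules: one central, one far from the training data.
query = [[1.0, 1.5], [1.0, 10.0]]

threshold = 3 * len(train[0]) / len(train)  # h* = 3p/n (assumed convention)
h_train = leverages(train, train)
h_query = leverages(train, query)
inside_ad = [h <= threshold for h in h_query]
print(inside_ad)  # [True, False]
```

A molecule whose leverage exceeds the threshold lies far from the descriptor space covered by the training set, so its prediction is treated as an extrapolation.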

The scores of the model are presented in the following table:

CV: 5-fold cross-validation (Venetian blinds)

Model name                  Training
                            R2      Q2CV    RMSE    RMSECV
Earthworms Log LC50 (PLS)   0.756   0.651   0.315   0.377
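
The following Python sketch shows how scores of this kind can be obtained with Venetian blinds cross-validation, where every k-th sample is placed in the same fold. It assumes the usual definitions (R2 = 1 - RSS/TSS, RMSE = sqrt(RSS/n), and the same formulas applied to the cross-validated predictions for Q2CV and RMSECV). To stay self-contained it uses a one-variable least-squares line as a stand-in for the PLS model, and the data are made up; none of the numbers relate to the table above.

```python
import math


def venetian_blinds(n_samples, n_folds):
    """Fold f holds samples f, f + n_folds, f + 2 * n_folds, ..."""
    return [list(range(f, n_samples, n_folds)) for f in range(n_folds)]


def fit_line(x, y):
    """Least-squares fit of y = a + b * x (stand-in for the PLS model)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b


def r2_and_rmse(y_true, y_pred):
    """Return (1 - RSS/TSS, sqrt(RSS/n)) for the given predictions."""
    mean = sum(y_true) / len(y_true)
    rss = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    tss = sum((yt - mean) ** 2 for yt in y_true)
    return 1 - rss / tss, math.sqrt(rss / len(y_true))


def cross_validated_predictions(x, y, n_folds):
    """Predict each sample with a model fitted without that sample's fold."""
    preds = [0.0] * len(y)
    for test_idx in venetian_blinds(len(y), n_folds):
        held_out = set(test_idx)
        keep = [i for i in range(len(y)) if i not in held_out]
        a, b = fit_line([x[i] for i in keep], [y[i] for i in keep])
        for i in test_idx:
            preds[i] = a + b * x[i]
    return preds


# Made-up data for illustration only.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1, 9.0, 9.9]

a, b = fit_line(x, y)
r2, rmse = r2_and_rmse(y, [a + b * xi for xi in x])
q2_cv, rmse_cv = r2_and_rmse(y, cross_validated_predictions(x, y, 5))
print(f"R2={r2:.3f} Q2CV={q2_cv:.3f} RMSE={rmse:.3f} RMSECV={rmse_cv:.3f}")
```

Because each cross-validated prediction comes from a model that never saw the predicted sample, Q2CV is expected to be lower than R2 and RMSECV higher than RMSE, as in the table.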

The chart on the left shows the predicted (Y) and experimental (X) values and the one on the right is the Williams plot of the model:

[Figure: two panels titled "Earthworms Log LC50 (PLS)" and "Williams plot". Green: training set; orange: molecules outside the A.D.]
