BMFpred

BMFpred is easy-to-use software that implements the QSAR models described in “F. Grisoni, V. Consonni, M. Vighi (2018). Acceptable-by-design QSARs to predict the dietary biomagnification of organic chemicals in fish, Integrated Environmental Assessment and Management” to predict the laboratory-based fish Biomagnification Factor (BMF) of chemicals. The descriptors used in the models are calculated using alvaDesc technology.

The Science behind BMFpred

BMFpred implements:

  • a kNN regression model, specifically a weighted Nearest Neighbour Regression (wNNR) model
  • an ordinary least squares (OLS) model
  • a Consensus model defined as the arithmetic mean of the values predicted by the wNNR and OLS models
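
The consensus rule in the last item can be written out as a minimal Python sketch. The function name and the input numbers are hypothetical and are not taken from BMFpred; only the arithmetic mean comes from the description above.

```python
def consensus_prediction(wnnr_value: float, ols_value: float) -> float:
    """Arithmetic mean of the wNNR and OLS predictions for one chemical."""
    return (wnnr_value + ols_value) / 2.0


# Illustrative inputs only (not real BMFpred output).
example = consensus_prediction(-1.5, -1.0)
```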

The wNNR model includes four descriptors: MlogP2, the square of the Moriguchi octanol-water partition coefficient; nBt, the total number of bonds; B02[N-O], which indicates the presence/absence of a nitrogen atom and an oxygen atom separated by two bonds; and F06[C-C], which counts the carbon pairs separated by six bonds.
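
To give a feel for how a weighted nearest-neighbour regression turns descriptor values into a prediction, here is a generic Python sketch. It is not the BMFpred implementation: the Euclidean distance, the inverse-distance weights, the default k and the absence of descriptor scaling are all assumptions made for illustration, and the published model may differ on each point. Each row of train_x stands for a descriptor vector of a training chemical (for example the four descriptors above).

```python
import math


def weighted_knn_predict(query, train_x, train_y, k=3):
    """Inverse-distance weighted mean of the k nearest training responses.

    Assumptions (not from the source): Euclidean distance, weights 1/d.
    """
    # Distance from the query to every training chemical, paired with its response.
    pairs = [(math.dist(query, x), y) for x, y in zip(train_x, train_y)]
    pairs.sort(key=lambda pair: pair[0])
    neighbours = pairs[:k]

    # If the query coincides with training points, average their responses
    # instead of dividing by zero.
    exact = [y for d, y in neighbours if d == 0.0]
    if exact:
        return sum(exact) / len(exact)

    weights = [1.0 / d for d, _ in neighbours]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, neighbours))
    return weighted_sum / sum(weights)


# Toy two-descriptor data, purely illustrative.
toy_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
toy_y = [1.0, 2.0, 3.0, 10.0]
toy_prediction = weighted_knn_predict((0.5, 0.0), toy_x, toy_y, k=2)
```

In the toy call the two nearest neighbours are equally distant from the query, so the prediction is simply the mean of their responses.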

The OLS model comprises seven molecular descriptors. Like the wNNR model, it includes MlogP2 and B02[N-O]. In addition, it includes the X0Av, X1Per, SaaaC, VE1_B(m) and B03[N-Cl] descriptors.

Grisoni et al. highlighted that X0Av (average connectivity index of order 0) is related to the fraction of atoms with many valence electrons and to unsaturated/aromatic bonds. X1Per (perturbation connectivity index) is considered sensitive to the presence of heteroatoms, to molecular shape and to the presence of multiple bonds. SaaaC is an atom-type electrotopological state index related to the electron accessibility of specific atom types; specifically, SaaaC is the sum of the E-states of carbon atoms connected to three aromatic atoms. The authors also stated that, on the considered dataset, VE1_B(m) (coefficient sum of the last eigenvector (absolute values) from the Burden matrix weighted by mass) is related to molecular size, branching, and the number of multiple bonds and cycles. Finally, B03[N-Cl], similarly to B02[N-O], indicates the presence/absence of a nitrogen atom and a chlorine atom separated by three bonds.

Model performance

Method      Fitting           Cross-validation  Test set
            RMSE     R2       RMSE     Q2CV     RMSE     Q2TEST
kNN         0.52     0.76     0.52     0.76     0.54     0.75
OLS         0.53     0.75     0.55     0.74     0.57     0.71
Consensus   0.47     0.81     0.49     0.79     0.45     0.82
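
For readers unfamiliar with the statistics reported here, the following Python sketch shows the textbook definitions of RMSE and the coefficient of determination. It is an illustration under assumptions: the source does not state which exact formulas were used, and Q2 variants for cross-validation and test sets can differ in which mean enters the denominator. The function names and numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of `observed`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values, purely illustrative (not the BMF dataset).
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.0, 2.0, 3.0, 5.0]
toy_rmse = rmse(toy_observed, toy_predicted)
toy_r2 = r_squared(toy_observed, toy_predicted)
```

Applying the same formula to predictions made for held-out chemicals gives a Q2-type statistic rather than a fitting R2.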

The models included in BMFpred have been designed to be intrinsically acceptable at the regulatory level and have an error comparable to the observed experimental inter- and intra-species variability.

Additional information is available at the In silico Biomagnification Factor prediction project.

BMFpred was phased out in favor of a more generic solution to create and deploy QSAR/QSPR regression models.