the-ahuja-lab/Metabokiller

Carcinogenicity Prediction using Metabokiller



Introduction

Metabokiller offers a novel, machine learning-based approach that accurately recognizes carcinogens by quantitatively assessing their chemical composition as well as their potential to induce proliferation, oxidative stress, genomic instability, alterations in epigenetic signatures, and activation of anti-apoptotic pathways. It therefore obviates the absolute need for bona fide (non)carcinogens for model training. Along with the carcinogenicity prediction, it reports the contribution of each of these biochemical processes to carcinogenicity, which makes the approach highly interpretable.

The only strong dependency for this resource is RDKit, which can be installed in a local Conda environment:

$ conda create -c conda-forge -n my-rdkit-env rdkit
$ conda activate my-rdkit-env

License Key

Metabokiller is free for academic institutions; commercial use requires a commercial license key. Users (academic or commercial) may apply for a valid "License Key" here.

You can also generate your own predictions using Metabokiller's Colab notebook.

Major dependencies

  1. Signaturizer (v1.1.11)
  2. LIME

The installation procedure takes less than 5 minutes.

$ pip install signaturizer
$ pip install lime

Minor dependencies

  1. os (Python standard library)
  2. scikit-learn (v1.0.2)
  3. pandas
  4. numpy
  5. tqdm
  6. joblib
  7. matplotlib
  8. io (Python standard library)
  9. importlib (Python standard library)
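
Before running predictions, it can help to confirm that the listed packages are importable in the active environment. The following is a minimal sketch using only the standard library. The import names are the usual ones for these packages (scikit-learn is imported as `sklearn`), and the helper `missing_packages` is hypothetical, not part of Metabokiller.

```python
from importlib.util import find_spec

# Import names for the dependencies listed above. scikit-learn is
# imported as "sklearn"; the other names match their package names.
REQUIRED_MODULES = [
    "rdkit", "signaturizer", "lime", "sklearn",
    "pandas", "numpy", "tqdm", "joblib", "matplotlib",
]


def missing_packages(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

Any name reported as missing can then be installed with `pip` or Conda as described above.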

How to use Metabokiller?

Installation using pip

$ pip install Metabokiller

License activation (One time)

>>> from Metabokiller import mk_predictor as mk

Activate the Metabokiller license

>>> mk.license('license key')  # Example: mk.license('KKKVFZ41111WF6RTQ')

To apply for the license, click here.
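
To avoid hard-coding the key in scripts or notebooks, one option is to read it from an environment variable. This is a minimal sketch: the variable name `METABOKILLER_LICENSE_KEY` and the helper `load_license_key` are hypothetical and not part of Metabokiller. Only `mk.license(...)` comes from the instructions above.

```python
import os


def load_license_key(env_var="METABOKILLER_LICENSE_KEY"):
    """Read the license key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your license key."
        )
    return key


# Usage, assuming Metabokiller is installed and the variable is set:
#   from Metabokiller import mk_predictor as mk
#   mk.license(load_license_key())
```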

Examples

To get predictions for individual carcinogenic properties:

>>> from Metabokiller import mk_predictor as mk

Prepare a list of canonical SMILES strings (generated with Open Babel)

>>> smiles = ['ClCC=C', 'C=CCOC(=O)CC(C)C']
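
If the input SMILES come from mixed sources, they can be tidied and converted to Open Babel canonical form first. The sketch below is one possible way to do this, not Metabokiller's own preprocessing. It assumes the Open Babel 3.x Python bindings (the `openbabel` package) are installed, and the helper names `clean_smiles` and `canonicalize` are hypothetical.

```python
def clean_smiles(smiles_list):
    """Strip whitespace, drop empty entries, and remove duplicates (order kept)."""
    seen = set()
    cleaned = []
    for smi in smiles_list:
        smi = smi.strip()
        if smi and smi not in seen:
            seen.add(smi)
            cleaned.append(smi)
    return cleaned


def canonicalize(smiles_list):
    """Convert SMILES strings to Open Babel canonical SMILES.

    Assumes the Open Babel 3.x Python bindings are installed.
    """
    from openbabel import pybel  # imported lazily; optional dependency

    canonical = []
    for smi in clean_smiles(smiles_list):
        mol = pybel.readstring("smi", smi)
        # write("can") returns the canonical SMILES followed by whitespace,
        # so keep only the first whitespace-separated token.
        canonical.append(mol.write("can").split()[0])
    return canonical
```

The result of `canonicalize(...)` is a plain list of strings and can be passed wherever `smiles` is used in these examples.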

Run predictions for any carcinogenic property of interest (e.g., epigenetic modifications)

>>> mk.Epigenetics(smiles)

Save the result as a pandas DataFrame

>>> result = mk.Epigenetics(smiles)

Metabokiller supports the following carcinogen-specific biochemical properties:

  1. Epigenetic alterations

>>> mk.Epigenetics()

  2. Oxidative stress

>>> mk.Oxidative()

  3. Electrophilic property

>>> mk.Electrophile()

  4. Genomic instability

>>> mk.GInstability()

  5. Pro-proliferative response

>>> mk.Proliferation()

  6. Anti-apoptotic response

>>> mk.Apoptosis()
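
To run several of the property-specific predictors on the same SMILES list, the calls can be collected in a loop. The helper below is a small sketch. `run_property_models` is a hypothetical name, and the usage comment assumes each `mk.*` function accepts a list of SMILES, as `mk.Epigenetics(smiles)` does in the example above.

```python
def run_property_models(smiles, predictors):
    """Call each predictor on the same SMILES list and collect results by name."""
    return {name: predict(smiles) for name, predict in predictors.items()}


# Usage, assuming Metabokiller is installed and the license is activated:
#   from Metabokiller import mk_predictor as mk
#   predictors = {
#       "Epigenetics": mk.Epigenetics,
#       "Oxidative": mk.Oxidative,
#       "Electrophile": mk.Electrophile,
#       "GInstability": mk.GInstability,
#       "Proliferation": mk.Proliferation,
#       "Apoptosis": mk.Apoptosis,
#   }
#   results = run_property_models(['ClCC=C', 'C=CCOC(=O)CC(C)C'], predictors)
```

The returned dictionary maps each property name to whatever the corresponding predictor returns.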

To get predictions for all available carcinogenic properties along with their explainability:

>>> from Metabokiller import EnsembleMK

Prepare a list of canonical SMILES strings (generated with Open Babel)

>>> smiles = ['ClCC=C', 'C=CCOC(=O)CC(C)C']

Run predictions for all available carcinogenic properties

>>> EnsembleMK.predict(smiles)

Save the result as a pandas DataFrame

>>> result = EnsembleMK.predict(smiles)

LIME

The biochemical property-focused Metabokiller, by virtue of its construction, offers interpretability through Local Interpretable Model-agnostic Explanations (LIME), an algorithm that explains each prediction with respect to the carcinogen-specific biochemical properties for every SMILES string provided.

To activate interpretability using LIME:

>>> result, explanation = EnsembleMK.predict(['ClCC=C', 'C=CCOC(=O)CC(C)C'], explainability=True)

Get the output from the explainability object and save it to a PDF:

>>> from matplotlib.backends.backend_pdf import PdfPages
>>> from matplotlib import pyplot as plt

>>> pdf = PdfPages("Ensemble-Result.pdf")
>>> for fig in explanation:
...     fig.savefig(pdf, format='pdf')
>>> pdf.close()
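
The same export can be written with `PdfPages` as a context manager, so the file is finalized even if saving one of the figures fails. This is a sketch that assumes `explanation` is an iterable of matplotlib figures, as the loop above implies. The helper name `save_figures` is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, suitable for scripts and servers

from matplotlib import pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


def save_figures(figures, path):
    """Write each figure to its own page of one PDF and return the page count."""
    count = 0
    with PdfPages(path) as pdf:  # the file is finalized when the block exits
        for fig in figures:
            pdf.savefig(fig)
            plt.close(fig)  # release the figure once it has been written
            count += 1
    return count


# Usage, assuming `explanation` comes from
# EnsembleMK.predict(..., explainability=True):
#   save_figures(explanation, "Ensemble-Result.pdf")
```

Closing each figure after it is written keeps memory use flat when many SMILES are explained in one run.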