Repository for the drexml
python package: (DRExM³L) Drug REpurposing using eXplainable Machine Learning and Mechanistic Models of signal transduction
Find the associated publication here:
Esteban-Medina M, de la Oliva Roque VM, Herráiz-Gil S, Peña-Chilet M, Dopazo J, Loucera C. drexml: A command line tool and Python package for drug repurposing. Computational and Structural Biotechnology Journal 2024;23:1129–43. https://doi.org/10.1016/j.csbj.2024.02.027.
Part of the Intelligent Biology and Medicine special issue:
And the BIB
file:
@article{EstebanMedina2024,
title = {drexml: A command line tool and Python package for drug repurposing},
volume = {23},
ISSN = {2001-0370},
url = {http://dx.doi.org/10.1016/j.csbj.2024.02.027},
DOI = {10.1016/j.csbj.2024.02.027},
journal = {Computational and Structural Biotechnology Journal},
publisher = {Elsevier BV},
author = {Esteban-Medina, Marina and de la Oliva Roque, Víctor Manuel and Herráiz-Gil, Sara and Peña-Chilet, María and Dopazo, Joaquín and Loucera, Carlos},
year = {2024},
month = dec,
pages = {1129–1143}
}
The article was written using drexml
version v1.1.0
. Install it using:
pip install drexml==1.1.0
Version v1.1.1
improves the documentation and README
by including a reference to the published article for easier access.
To install the drexml
package use the following:
conda create -n drexml python=3.10
conda activate drexml
pip install drexml
If a CUDA~10.2/11.x (< 12) compatible device is available use:
conda create -n drexml --override-channels -c "nvidia/label/cuda-11.8.0" -c conda-forge cuda cuda-nvcc cuda-toolkit gxx=11.2 python=3.10
conda activate drexml
pip install --no-cache-dir --no-binary=shap drexml
To install drexml
in an existing environment, activate it and use:
pip install drexml
Note that by default the setup
will try to compile the CUDA
modules, if not possible it will use the CPU
modules.
To run the program for a disease map that uses circuits from the preprocessed KEGG
pathways and the KDT
standard list, construct an environment file (e.g. disease.env
):
- using the following template if you have a set of seed genes (comma-separated):
seed_genes=2175,2176,2189
- using the following template if you want to use the DisGeNET [1] curated gene-disease associations as seeds.
disease_id="C0015625"
- using the following template if you know which circuits to include (the disease map):
circuits=circuits.tsv.gz
The TSV
file circuits.tsv
has the following format (tab delimited):
index in_disease
P-hsa03320-37 0
P-hsa03320-61 0
P-hsa03320-46 0
P-hsa03320-57 0
P-hsa03320-64 0
P-hsa03320-47 0
P-hsa03320-65 0
P-hsa03320-55 0
P-hsa03320-56 0
P-hsa03320-33 0
P-hsa03320-58 0
P-hsa03320-59 0
P-hsa03320-63 0
P-hsa03320-44 0
P-hsa03320-36 0
P-hsa03320-30 0
P-hsa03320-28 1
where:
index
: Hipathia circuit idin_disease
: (boolean) True/1 if a given circuit is part of the disease
Note that in all cases you can restrict the circuits to the physiological list by setting use_physio=true
in the env
file.
To run the experiment using 10 CPU cores and 0 GPUs, run the following command within an activated environment:
drexml run --n-gpus 0 --n-cpus 10 $DISEASE_PATH
where:
--n-gpus
indicates the number of gpu devices to use in parallel (-1 -> all) (0 -> None)--n-cpus
indicates the number of cpu devices to use in parallel (-1 -> all) 8DISEASE_PATH
indicates the path to the disease env file (e.g./path/to/disease/folder/disease.env
)
Use the --debug
option for testing that everything works using a few iterations.
Note that the first time that the full program is run, it will take longer as it downloads the latest versions of each background dataset from Zenodo:
https://doi.org/10.5281/zenodo.6020480
The recommended setup is:
- setup
pipx
- setup
miniforge
- use
pipx
to installpdm
- ensure that
pdm
is version >=2.1, otherwise update withpipx
- use
pipx
to inject pdm-bump intopdm
- use
pipx
to installnox
- run
pdm config venv.backend conda
- run
make
, if you want to use a CUDA enabled GPU, usemake gpu=1
- (Recommended): For GPU development, clear the cache using
pdm cache clear
first
The documentation can be found here:
https://loucerac.github.io/drexml/
[1] Janet Piñero, Juan Manuel Ramírez-Anguita, Josep Saüch-Pitarch, Francesco Ronzano, Emilio Centeno, Ferran Sanz, Laura I Furlong. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucl. Acids Res. (2019) doi:10.1093/nar/gkz1021