- Software environment with python v3.7.6 or higher and pandas v1.0.5 or higher
- Alteration databases
- Variant file (single, or multi-sample, MAF or fusion results file; VCF not yet supported) aligned to GRCh38
- (Optional) Clinical trials metadata file(s)
-
A software environment such as
conda
or miniconda is highly recommended. -
Get the repositories:
git clone --recurse-submodules https://github.com/ding-lab/druggability.git
Running this command creates the main
druggability
directory and the subdirectorydruggability_databases
with the database files. (It is possible to clone the repos separately; just be sure to updateDRUGDBDIR
inmyglobal.py
.) -
If relevant, activate the conda/miniconda environment and install additional packages:
pip install -r requirements.txt
.
-
Obtain the
Dockerfile
by direct download or from a "git clone" of thedruggability
repo and then build the image, e.g.,docker build -t druggability .
$ ./druggability.py -h
usage: druggability.py [-h] [-o OUTPUT_FILE] [-l LOG_FILE] [-d] [-nn NORMAL_NAME]
[-tn TUMOR_NAME] [-fn FUSION_SAMPLE_NAME] [-at ANNOTATE_TRIALS]
[-ato TRIALS_AUXILIARY_OUTPUT_FILE] [--dump_trials_only]
[--maf VARIANT_MAF_FILE] [--basicmaf VARIANT_BASICMAF_FILE]
[--fusion VARIANT_FUSION_FILE]
optional arguments:
-h, --help show this help message and exit
-o OUTPUT_FILE alteration database matches
-l LOG_FILE logfile name
-nn NORMAL_NAME MAF normal sample name
-tn TUMOR_NAME MAF tumor sample name
-fn FUSION_SAMPLE_NAME fusion sample name
-at ANNOTATE_TRIALS report clinical trials for this disease keyword
-ato TRIALS_AUXILIARY_OUTPUT_FILE clinical trials auxiliary output filename
--maf VARIANT_MAF_FILE variant file in maf format (requirements: tumor and normal names)
--basicmaf VARIANT_BASICMAF_FILE variant file in basic maf format (requirements: tumor and normal names)
--fusion VARIANT_FUSION_FILE variant file for fusions (requirement: tumor name)
--dump_trials_only
-d, --debug
Notes:
- Releases v1.2+:
- Maf and fusion files can now be specified at the same time. The sample(s) responsible for the database matches are shown in the output tables.
- Each input file type has been given its own command-line flag.
- The sample names passed to the tool (
-nn
,-tn
,-fn
) need to match those appearing in the inputs. This feature can be helpful for comparing results across aliquots.
- For fusion inputs, the "fusion sample name" can be any sample, normal or tumor.
- See the
examples
directory for driver scripts to use as templates, particularly for analyzing a cohort of sample sets. - The variant call files that were used in development are not able to be distributed externally.
-
MAFs in ding-lab or PanCan union maf formats are automatically detected:
$ ./druggability.py --maf LUAD.Somatic.050919.mnp.annot.maf -nn C3N-00169_N -tn C3N-00169_T -l druggability.log -o alterations.out
$ ./druggability.py --maf LUAD.Somatic.050919.mnp.annot.maf -nn C3N-00169_N -tn C3N-00169_T -l druggability.log -o alterations.out -at chol -ato trials.aux
-
Ad hoc MAFs. Variant call files, such as those redistributed with publications, may have a different format than those listed above. In that case,
druggability
can read a tab-delimited file whose first 13 fields are the following:Hugo_Symbol NCBI_Build Chromosome Start_Position End_Position Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Sample_Barcode Matched_Norm_Sample_Barcode HGVSc HGVSp_Short
and analyzed with the
basicmaf
option:$ ./druggability.py --basicmaf cohort.maf -nn 111 -tn 111 -l druggability.log -o alterations.out -at chol -ato trials.aux
Notes:
- Coordinates are 1-based
Variant_Type
should be SNP, DNP, MNP, INS, or DEL
- Input file must be in ding-lab format.
$ ./druggability.py --fusion CPTAC_fusions_v0.1.csv.tsv -fn C3L-00165_T -l druggability.log -o alterations.out
```
$ ./druggability.py --basicmaf mc3.v0.2.8.PUBLIC.code.filtered.Missense.maf.CHOL --fusion TCGA_MC3.allfusion.compile.txt.CHOL.reformatted.tsv -nn TCGA-3X-AAVA-10A-01D-A41A-09 -tn TCGA-3X-AAVA-01A-11D-A417-09 -fn TCGA-3X-AAVA-01A-11R-A41I-07 -l TCGA_MC3_chol.TCGA-3X-AAVA.log -o TCGA_MC3_chol.TCGA-3X-AAVA.out -at chol -ato TCGA_MC3_chol.TCGA-3X-AAVA.aux
```