GitHub - lujonathanh/coffdrop: Comprehensive toolkit for mutual exclusivity and co-occurrence analysis of cancer mutations.

# COFFDROP

Coffdrop is a comprehensive analysis toolkit for mutually exclusive and co-occurring pairs of mutations in sequenced tumor samples, with False Discovery Rate control. Coffdrop is written in Python. It was developed by Jonathan Lu, Jason Pitt, and Lorenzo Pesce at the University of Chicago.

Coffdrop implements a binomial statistical model to assess the significance of the mutual exclusivity or co-occurrence of a pair. To control false discoveries, one can limit the tested pairs by first screening all pairs over the patients with the fewest mutations, then testing only the most significant pairs across the whole distribution. Coffdrop uses the Benjamini-Hochberg procedure to control the false discovery rate.
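As a rough illustration of the two ideas above, the sketch below scores one gene pair with a binomial tail probability and then applies the Benjamini-Hochberg step-up procedure to a list of p-values. The null model used here (the number of samples carrying both alterations follows Binomial(n, f_a * f_b), where f_a and f_b are the per-gene alteration frequencies) is an assumption made for illustration and is not necessarily the model Coffdrop implements. The function names are hypothetical and are not part of the Coffdrop code.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); returns 0.0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))


def pair_pvalues(n_samples, n_a, n_b, n_both):
    """Return (mutual-exclusivity p, co-occurrence p) for one gene pair.

    Assumed null (illustrative only): each sample carries both alterations
    independently with probability (n_a / n_samples) * (n_b / n_samples).
    """
    p_both = (n_a / n_samples) * (n_b / n_samples)
    # Mutual exclusivity: as few or fewer overlaps than observed.
    p_mutex = binom_cdf(n_both, n_samples, p_both)
    # Co-occurrence: as many or more overlaps than observed.
    p_cooccur = 1.0 - binom_cdf(n_both - 1, n_samples, p_both)
    return p_mutex, p_cooccur


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below that k (step-up rule).
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected
```

Screening pairs on a subset of patients first, as described above, reduces the number of p-values passed to `benjamini_hochberg`, which makes the per-rank thresholds less strict.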

After detecting significant pairs, Coffdrop:

1. searches for enriched genes and chromosomal regions
2. searches for enriched pairs
3. plots the mutual-exclusivity and co-occurrence networks and finds the genes with the highest degree centrality
4. searches for triplets of mixed mutually exclusive and co-occurring pairs
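To make the degree-centrality step concrete, here is a minimal sketch that treats the significant pairs as edges of an undirected network and ranks genes by degree centrality (degree divided by the number of other nodes, the same normalization NetworkX's `degree_centrality` uses). The function names and the placeholder gene labels are hypothetical; this is not Coffdrop's own network code.

```python
from collections import defaultdict


def degree_centrality(pairs):
    """Map each gene to degree / (n_genes - 1) in the undirected pair network."""
    neighbors = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-pairs
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n <= 1:
        return {gene: 0.0 for gene in neighbors}
    return {gene: len(nbrs) / (n - 1) for gene, nbrs in neighbors.items()}


def top_genes(pairs, k=3):
    """Return the k genes with the highest degree centrality (ties by name)."""
    cent = degree_centrality(pairs)
    return sorted(cent, key=lambda g: (-cent[g], g))[:k]


# Toy input with placeholder labels, not real results.
toy_pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D")]
print(top_genes(toy_pairs, k=2))
```

In the toy network, gene `A` is paired with every other gene, so it has centrality 1.0 and is listed first.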

Furthermore, it has a flexible preprocessing feature that allows for:

1. handling various mutation types, particularly Copy Number Alterations, which can create significant artefacts because alterations in nearby genes are not independent. To avoid this, one can require genes to be at least a certain distance apart before they are tested.
2. testing only those genes above a certain frequency
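The two preprocessing filters can be sketched as follows. This is an illustration under stated assumptions, not Coffdrop's preprocessing code: the function names, the `positions` layout (gene mapped to a chromosome and a coordinate), and the choice to treat genes on different chromosomes as always far enough apart are all hypothetical.

```python
def filter_by_frequency(sample_to_genes, min_freq):
    """Return the genes altered in at least min_freq (a fraction) of samples."""
    n = len(sample_to_genes)
    counts = {}
    for genes in sample_to_genes.values():
        for gene in set(genes):  # count each gene once per sample
            counts[gene] = counts.get(gene, 0) + 1
    return {gene for gene, c in counts.items() if c / n >= min_freq}


def far_enough(gene_a, gene_b, positions, min_distance):
    """True if the genes are on different chromosomes or >= min_distance apart.

    positions maps gene -> (chromosome, coordinate); this layout is assumed.
    """
    chrom_a, pos_a = positions[gene_a]
    chrom_b, pos_b = positions[gene_b]
    if chrom_a != chrom_b:
        return True
    return abs(pos_a - pos_b) >= min_distance


# Toy data with placeholder labels and coordinates.
samples = {"s1": ["A", "B"], "s2": ["A"], "s3": ["A", "C"], "s4": ["B"]}
positions = {"A": ("1", 1000), "B": ("1", 5000), "C": ("2", 1200)}
kept = filter_by_frequency(samples, min_freq=0.5)
testable = far_enough("A", "B", positions, min_distance=10000)
```

With this toy data, `kept` contains `A` and `B`, and `testable` is `False` because both genes sit on the same chromosome within the distance threshold.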

## Requirements

Coffdrop requires the following Python modules:

1. NetworkX
2. SciPy
3. NumPy
4. matplotlib

# Usage

See "Coffdrop workflow" in the wiki for details.

# Input

Coffdrop provides several Python scripts for processing and integrating MAF and GISTIC files into the alteration matrix format, detailed below.

Alteration matrix. This tab-separated file lists the alterations in your dataset, one sample per row. In each row, the first column is a sample ID and the remaining columns are the genes altered in that sample. Rows need not have the same number of columns, because different samples have different numbers of alterations. In all files, lines starting with '#' are ignored.
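The format described above can be read with a few lines of Python. The following is a minimal sketch, not Coffdrop's own loader: the function name is hypothetical, and skipping blank lines and empty fields is an assumption beyond what the format description states.

```python
import io


def read_alteration_matrix(handle):
    """Parse an alteration matrix into {sample_id: set_of_altered_genes}."""
    sample_to_genes = {}
    for line in handle:
        line = line.rstrip("\r\n")
        # Skip comment lines (per the format) and blank lines (assumed).
        if not line.strip() or line.startswith("#"):
            continue
        fields = [f for f in line.split("\t") if f]
        sample_id, genes = fields[0], fields[1:]
        sample_to_genes[sample_id] = set(genes)
    return sample_to_genes


# Toy matrix with placeholder sample IDs and gene labels.
example = "# comment line\nS1\tA\tB\nS2\tC\n"
matrix = read_alteration_matrix(io.StringIO(example))
print(matrix)
```

Because each row carries its own list of genes, the result is naturally a mapping from sample to gene set rather than a rectangular array.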

# Output

Output files are txt files with each identified pair or triplet as one row.

We provide example matrices (".m2") in the data folder.

