TopDeck

TopDeck is a novel method for identifying samples that may carry fusion genes in The Cancer Genome Atlas (TCGA), using data from recount3. It is designed to produce a subset of candidate fusion gene samples that can then be run through more accurate fusion finders. To do this, TopDeck combines a gene overexpression analysis with a novel exon comparison analysis to produce a list of potential fusion genes.

TopDeck analyses a single 3' fusion gene partner at a time, testing whether it is overexpressed and whether exon expression increases significantly after the expected fusion gene breakpoint. Developed on 10 genes that act as common 3' fusion partners in TCGA (ALK, ARHGAP26, ERG, ETV1, MAML3, NTRK3, RARA, RET, TACC3, TFE3), TopDeck had an overall sensitivity of 36.83%.
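For readers unfamiliar with the metric, sensitivity here is the fraction of known fusion samples that the method flags. The following is a minimal R sketch of that calculation only; the function name and the sample IDs are hypothetical and are not taken from the repository or from TCGA.

```r
# Sensitivity = true positives / (true positives + false negatives).
# `known_fusions` and `flagged` are hypothetical vectors of sample IDs.
sensitivity <- function(known_fusions, flagged) {
  known_fusions <- unique(known_fusions)
  true_pos <- sum(known_fusions %in% flagged)
  true_pos / length(known_fusions)
}

# Toy IDs for illustration only; these are not TCGA samples.
known_fusions <- c("s1", "s2", "s3", "s4")
flagged <- c("s2", "s4", "s9")
sens <- sensitivity(known_fusions, flagged)
```

In this toy case two of the four known fusion samples are flagged, so `sens` is 0.5.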

TopDeck is the culmination of an Honours thesis for Monash University in collaboration with the Peter MacCallum Cancer Centre.

Gene Overexpression

A demonstration of the gene overexpression method using the 3' gene TACC3 is available in the file overexpression_workflow.Rmd, along with the functions used.

Samples were classed as overexpressed if their expression was above the 95th percentile. This threshold was chosen after a preliminary analysis using Tukey's definition of outliers (outliers.Rmd) failed to identify a substantial number of fusion gene samples.
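The two rules mentioned above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code in overexpression_workflow.Rmd or outliers.Rmd: the function names are hypothetical, the expression values are synthetic, and Tukey's rule is taken in its usual form (values above Q3 + 1.5 × IQR).

```r
# Flag samples above a given percentile of expression (default: the 95th).
flag_percentile <- function(expr, prob = 0.95) {
  expr > quantile(expr, probs = prob, names = FALSE)
}

# Flag upper outliers by Tukey's rule: above Q3 + k * IQR, with k = 1.5.
flag_tukey_upper <- function(expr, k = 1.5) {
  q <- quantile(expr, probs = c(0.25, 0.75), names = FALSE)
  expr > q[2] + k * (q[2] - q[1])
}

# Synthetic, right-skewed expression values; not TCGA data.
set.seed(1)
expr <- rlnorm(200, meanlog = 2, sdlog = 0.3)
n_percentile <- sum(flag_percentile(expr))
n_tukey <- sum(flag_tukey_upper(expr))
```

The percentile rule always flags a fixed share of samples, whereas the number flagged by Tukey's rule depends on how heavy the upper tail of the distribution is.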

To further improve the sensitivity of the gene overexpression analysis, a preliminary investigation into incorporating the shape of the distribution was conducted; it is summarised in distribution_shape.Rmd. Because the 95th percentile method flags 5% of every cancer type as potential fusions, a second preliminary investigation looked at reducing its false positive rate by filtering to cancer types of interest through comparison with non-cancer data. This approach is summarised in low_expression.Rmd.
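The false positive concern can be illustrated with a short sketch. It assumes the 95th percentile threshold is computed separately within each cancer type, which the text suggests but does not spell out; the data are synthetic and the column names are hypothetical.

```r
# Synthetic data: two made-up cancer types with very different expression.
set.seed(42)
samples <- data.frame(
  cancer_type = rep(c("TYPE_A", "TYPE_B"), each = 100),
  expr = c(rnorm(100, mean = 5), rnorm(100, mean = 50))
)

# Per-type 95th percentile threshold, repeated for every sample via ave().
threshold <- ave(samples$expr, samples$cancer_type,
                 FUN = function(x) quantile(x, probs = 0.95, names = FALSE))
samples$flagged <- samples$expr > threshold

# Fraction of samples flagged within each cancer type.
flag_rate <- tapply(samples$flagged, samples$cancer_type, mean)
```

Under this assumption, 5% of each type is flagged regardless of its overall expression level, which is why an additional filter on cancer types is worth investigating.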

Change in Exon Expression

A demonstration of the change in exon expression method using the 3' gene TACC3 on bladder urothelial carcinoma samples is available in the file tacc3_demonstration.Rmd, with explanations of the functions used.

Two change in exon expression methods were used: one based on the change in individual exon expression, and one based on the change in average exon expression.
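The general idea of contrasting exon expression on either side of a breakpoint can be sketched as below. This is not the implementation in exon_expression_functions.R, whose exact statistics are defined there; the function name, the matrix layout, the `breakpoint` argument and the data are all assumptions made for illustration. Only an average-based variant is shown; an individual-exon variant would compare per-exon values instead of means.

```r
# exon_expr: hypothetical matrix, rows = exons in 5'-to-3' order, cols = samples.
# breakpoint: hypothetical index of the last exon before the expected breakpoint
# (must leave at least one exon on each side).
average_exon_change_z <- function(exon_expr, breakpoint) {
  before <- colMeans(exon_expr[seq_len(breakpoint), , drop = FALSE])
  after <- colMeans(exon_expr[-seq_len(breakpoint), , drop = FALSE])
  change <- after - before
  # Standardise across samples so unusually large increases stand out.
  (change - mean(change)) / sd(change)
}

# Synthetic data: 6 exons, 20 samples, flat expression plus noise.
set.seed(7)
exon_expr <- matrix(rnorm(6 * 20, mean = 10), nrow = 6,
                    dimnames = list(paste0("exon", 1:6), paste0("s", 1:20)))
# Give one sample raised expression in the exons after the breakpoint.
exon_expr[4:6, "s5"] <- exon_expr[4:6, "s5"] + 8
z <- average_exon_change_z(exon_expr, breakpoint = 3)
top_sample <- names(which.max(z))
```

In this synthetic example the sample with raised expression after the breakpoint, `s5`, receives the largest Z-score.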

The Z-scores for all samples investigated using the change in individual exon expression are available in the file all_diff_objects.Rdata.
The Z-scores for all samples investigated using the change in average exon expression are available in the file prop_cum_all.RData.
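One way to inspect files like these without overwriting objects in the current workspace is to load them into a separate environment. The sketch below uses base R only; `load_into_env` is a hypothetical helper, and because the names of the objects stored in the repository's files are not stated here, the demonstration uses a temporary file with a toy object.

```r
# Load an .Rdata/.RData file into its own environment so the stored
# objects do not overwrite anything in the global workspace.
load_into_env <- function(path) {
  env <- new.env()
  load(path, envir = env)
  env
}

# Self-contained demonstration with a temporary file and a toy object;
# the object names stored in the repository's files are not assumed here.
tmp <- tempfile(fileext = ".Rdata")
toy_z_scores <- c(s1 = -0.2, s2 = 3.1)
save(toy_z_scores, file = tmp)

loaded <- load_into_env(tmp)
object_names <- ls(loaded)
```

Replacing `tmp` with the path to one of the files above and calling `ls(loaded)` lists whichever objects that file contains.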

The true positive rail_ids are available in files:

truepos_diff.csv (individual exon expression)
truepos_prop_cum.csv (average exon expression)

The false positive rail_ids are available in files:

falsepos_diff.csv (individual exon expression)
falsepos_prop_cum.csv (average exon expression)

The change in exon expression method uses functions from exon_expression_functions.R.

Bibliography

A bibliography of the R packages used is available in bibliography.R.

