Releases: TheJacksonLaboratory/sbas
Archive before pre-publication repo cleanup
Differentially Expressed GTEx V8 Tissues
Differential Gene Expression Analysis
This notebook generates the sex-biased differential gene expression analysis. Differential expression (DE) analysis was performed using voom (Law et al., 2014), which transforms gene expression counts and computes associated precision weights, followed by linear modeling and an empirical Bayes procedure using limma.
Within each tissue, the following linear regression model was used to detect sexually dimorphic gene expression:
y = β0 + β1 · sex + ε (error)
where y is the gene expression to be modeled and sex denotes the reported sex of the subject. The function fit_tissue() performs this analysis. It accepts two arguments, the tissue and an object, and creates the model matrix based on the sex of that tissue's samples. We perform a linear fit after calculating normalization factors (based on the library size) and model the dispersion using voom (a mean-variance model of dispersion). The resulting matrices are saved as files.
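As a rough illustration of the model above, the following base R sketch fits an intercept and a sex coefficient to one gene by ordinary least squares. It is not the notebook's fit_tissue(): the expression values are invented, and lm.fit stands in for the limma fit, so voom's precision weights and the empirical Bayes step are not reproduced here.

```r
# Toy log-expression values for one gene in six samples (invented for illustration).
sex <- factor(c("male", "male", "male", "female", "female", "female"),
              levels = c("male", "female"))
y <- c(5.1, 4.9, 5.0, 6.1, 5.9, 6.0)

# Design matrix with an intercept column and a sex indicator column.
design <- model.matrix(~ sex)

# Ordinary least squares fit of y on the intercept and sex.
fit <- lm.fit(design, y)

# The "sexfemale" coefficient is the female-minus-male difference in mean expression.
fit$coefficients
```

With male as the reference level, the intercept is the mean male expression and the sex coefficient is the shift observed in females.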
This release contains the output of executing the Jupyter notebook differentialGeneExpresionAnalysis.ipynb, found in the jupyter subdirectory of the GitHub repository.
1.3 Output
For each tissue selected and specified in the tissues.tsv file in the ../assets directory, the following files are produced:
1.3.1 ../data/tissue_DGE.csv
This file contains the topTable results, reporting ENSG (gene identifier), logFC (log fold change), AveExpr (average log2 expression), t (the model result for the sex bias; see section 2), P.Value, adj.P.Val (p-value adjusted for multiple testing), and B (log-odds of differential expression).
1.3.2 ../data/tissue_refined.csv
These are the differentially expressed genes: those with a fold change of more than 1.5 and a p-value of less than 0.05.
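A filter of this kind could be sketched in base R as follows. The table is a toy stand-in for a topTable result with invented values, and two choices are assumptions rather than facts from the notebook: the fold-change threshold is applied to the absolute logFC on the log2 scale, and the p-value is taken from the P.Value column rather than adj.P.Val.

```r
# Toy stand-in for a topTable result (all values invented for illustration).
dge <- data.frame(
  ENSG    = c("gene_a", "gene_b", "gene_c", "gene_d"),
  logFC   = c(1.20, -0.90, 0.30, 0.80),
  P.Value = c(0.001, 0.020, 0.001, 0.200),
  stringsAsFactors = FALSE
)

# Assumed thresholds: more than 1.5-fold in either direction, p-value below 0.05.
fold_cutoff <- log2(1.5)
p_cutoff <- 0.05

# Keep rows that pass both thresholds.
refined <- dge[abs(dge$logFC) > fold_cutoff & dge$P.Value < p_cutoff, ]
refined$ENSG
```

In this toy table, gene_c fails the fold-change threshold and gene_d fails the p-value threshold, so only gene_a and gene_b remain.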
1.3.3 ../data/tissue_ensg_map.csv
This is the mapping, made with gprofiler, of the ENSG identifiers to their geneSymbols. It eases filtering before the linear model of the junctions is created in the computational step for differential alternative splicing.
1.3.4 ../pdf/tissue-gene-y-voom-MDSplot-100.pdf
These are the counts in a multi-dimensional scaling plot (MDSplot), showing the ability of the model to segregate the samples by sex, marked with a red m for male and a blue f for female self-reported sex phenotypes. In these plots, voom has been used to model the variance.
1.3.5 ../pdf/tissue-gene-y-MDSplot-100.pdf
These are the counts in a multi-dimensional scaling plot (MDSplot), showing the ability of the model to segregate the samples by sex, marked with a red m for male and a blue f for female self-reported sex phenotypes. In these plots, the variance has not been modeled with voom.
Release for attaching differential gene expression artefacts with ropensci/piggyback
This release contains the required input data and artifacts related to the Differential Gene Expression analysis accompanying the publication "The impact of sex on alternative splicing".
Differentially Expressed GTEx V8 Tissues per Skipped Exon Junctions using IJC and SJC
Using rMATS 3.2.5 (http://rnaseq-mats.sourceforge.net/rmats3.2.5/), we obtained counts from aligned fastq files from GTEx (https://www.gtexportal.org/home/tissueSummaryPage; source: GTEx Analysis Release V8, dbGaP Accession phs000424.v8.p2). rMATS 3.2.5 discovered these skipped exon junctions by scanning GENCODE human release 30 (https://www.gencodegenes.org/human/release_30.html) using the comprehensive GTF file. There are 42,611 distinct junctions in this skipped exon file, as annotated in the fromGTF.SE.txt file.
The counts used in the differential analysis were the inclusion junction counts (IJC), the counts on the junctions when the exon was included, and the skipping junction counts (SJC), the counts on the junction that results when the exon was excluded.
Also included are MDSPlots for each of the events modeled, using a linear regression model fitted to log2(counts + 0.05). Two models use sex as the only coefficient, one for IJC alone and one for SJC alone. The model used for the downstream analysis includes sex, the alternative splicing event (as_event), and their interaction (sex * as_event). Limma's voom was used to model the variance. The donor was modeled as a random effect by passing it as the block parameter, with the resulting duplicate correlation used in the final voom calculation and in fitting the final predictive model.
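To make the interaction model concrete, the sketch below builds a design matrix for sex, as_event, and their interaction with base R's model.matrix. The sample layout and the level labels ("IJC" and "SJC" for as_event) are assumptions chosen for illustration. The voom and donor-blocking steps described above are not reproduced.

```r
# Toy layout: each subject contributes one IJC row and one SJC row (labels assumed).
samples <- data.frame(
  sex      = factor(rep(c("male", "female"), each = 2), levels = c("male", "female")),
  as_event = factor(rep(c("IJC", "SJC"), times = 2), levels = c("IJC", "SJC"))
)

# sex * as_event expands to both main effects plus their interaction.
design <- model.matrix(~ sex * as_event, data = samples)
colnames(design)
```

The last column is the interaction term. It measures how the difference between the two count types changes with sex, which is the quantity of interest for sex-biased splicing.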
Release for attaching differentially spliced exons results
Update README.md updated the details for the GTEx release
Release for attaching differential gene expression artefacts with ropensci/piggyback
DGE-deprecated Adds piggyback data fetch from GitHub release (#123)
Metadata file for GTExV8.v1.0
To retrieve the files from this release in R with the ropensci/piggyback R package, use the following command:
# devtools::install_github("ropensci/piggyback@87f71e8", upgrade="never")
piggyback::pb_download(
  repo = "TheJacksonLaboratory/sbas",
  progress = TRUE,
  file = "SraRunTable.txt.gz",
  tag = "GTExV8.v1.0",
  dest = "../data/"
)
TheJacksonLaboratory/sbas version 1.0
Initial release of the TheJacksonLaboratory/sbas repository containing the resources to reproduce the analysis in the publication "The impact of sex on alternative splicing" Karlebach et al., 2018, doi.org/10.1101/490904
First release to attach gtex via {piggyback}
This release facilitates the upload of the GTEx archive retrieved using the {yarn} R package. As the archive is 1.4 GB, it cannot be uploaded as a file, so we work around this by attaching it as a release-associated archive.
More information about this approach can be found in Karthik Ram's RStudio conference 2019 talk (at 9:50) and in the GitHub repo of {piggyback}: https://github.com/ropensci/piggyback.