---
title: II. Differential analysis at the gene level.
author: "Lieven Clement"
date: "[statOmics](https://statomics.github.io), Ghent University"
output:
  html_document:
    theme: flatly
    code_download: false
    toc: false
    toc_float: false
    number_sections: false
bibliography: msqrob2.bib
---
<a rel="license" href="https://creativecommons.org/licenses/by-nc-sa/4.0"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a>
### 1. C. elegans
After fertilization but prior to the onset of zygotic transcription, the C. elegans zygote cleaves asymmetrically to create the anterior AB and posterior P1 blastomeres, each of which goes on to generate distinct cell lineages. To understand how patterns of RNA inheritance and abundance arise after this first asymmetric cell division, we pooled hand-dissected AB and P1 blastomeres and performed RNA-seq (Study [GSE59943](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE59943)).
An example file is provided that shows in detail how

- to extract the experiment metadata needed to understand the design
- to perform a differential analysis in R.

The example file: [elegans example](./elegansDE.html)

1. Try to understand the analysis in the example file.
2. What did we not account for in the data analysis?
3. Alter the rmarkdown code to provide a proper data analysis.
4. Conduct the DE analysis based on the data you have mapped in [tutorial I]().
### 2. Airway example
The data used in this workflow come from an RNA-seq experiment where airway smooth muscle cells were treated with dexamethasone, a synthetic glucocorticoid steroid with anti-inflammatory effects (Himes et al. 2014). Glucocorticoids are used, for example, by people with asthma to reduce inflammation of the airways. In the experiment, four human airway smooth muscle cell lines were treated with 1 micromolar dexamethasone for 18 hours. For each of the four cell lines, we have a treated and an untreated sample. For a more detailed description of the experiment, see the article (PubMed entry [PMID: 24926665](https://pubmed.ncbi.nlm.nih.gov/24926665/)); for the raw data, see the GEO entry [GSE52778](https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE52778).
Here, we will not use our own count table because it is based on only a small subset of the reads.
Instead, we will use the count table that was obtained after mapping all the reads with the read mapper `STAR`. The count table can be imported into R from the following URL: [https://raw.githubusercontent.com/statOmics/SGA/airwaySeqData/star/airwayCountTableStar.csv](https://raw.githubusercontent.com/statOmics/SGA/airwaySeqData/star/airwayCountTableStar.csv)
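
As a starting point, the import could be sketched as follows. The helper name `read_count_table` is hypothetical, and the sketch assumes that the first column of the CSV file holds the gene identifiers and that all remaining columns are per-sample counts; check the file before relying on this.

```r
# Hypothetical helper: read a gene-by-sample count table from a CSV file or URL.
# Assumption: the first column contains the gene identifiers (not verified here).
read_count_table <- function(path) {
  # check.names = FALSE keeps the sample names exactly as written in the header
  counts <- read.csv(path, row.names = 1, check.names = FALSE)
  as.matrix(counts)
}
```

Calling `countTable <- read_count_table(url)` with `url` set to the address above should then return a matrix with genes in the rows and samples in the columns; inspect it with `dim(countTable)` and `head(countTable)`.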
1. Retrieve the metadata to link the samples to the design.
2. Perform the DE analysis with edgeR.
3. Perform the DE analysis using the DESeq2 package (see the [DESeq2 vignette](https://bioconductor.org/packages/release/bioc/vignettes/DESeq2/inst/doc/DESeq2.html)).
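
A minimal sketch of steps 2 and 3, under stated assumptions: `counts` is a gene-by-sample count matrix and `coldata` is a data frame with one row per sample and two factor columns, `cell` (cell line) and `dex` (treatment, with the untreated level as reference). These column names and the helper names `run_edger` and `run_deseq2` are illustrative and not part of the tutorial. Because each cell line contributes a treated and an untreated sample, both sketches include the cell line as a blocking factor.

```r
library(edgeR)
library(DESeq2)

# Hypothetical helper: edgeR quasi-likelihood analysis of the paired design.
# Assumes `coldata` has factor columns `cell` and `dex` (assumed names).
run_edger <- function(counts, coldata) {
  design <- model.matrix(~ cell + dex, data = coldata)
  dge <- DGEList(counts = counts)
  keep <- filterByExpr(dge, design)           # drop genes with too few counts
  dge <- dge[keep, , keep.lib.sizes = FALSE]
  dge <- calcNormFactors(dge)                 # TMM normalization
  dge <- estimateDisp(dge, design)
  fit <- glmQLFit(dge, design)
  # dex is the last term and has two levels, so the last design column
  # is the treatment effect.
  test <- glmQLFTest(fit, coef = ncol(design))
  topTags(test, n = Inf)
}

# Hypothetical helper: the same model fitted with DESeq2.
run_deseq2 <- function(counts, coldata) {
  dds <- DESeqDataSetFromMatrix(countData = counts,
                                colData = coldata,
                                design = ~ cell + dex)
  dds <- DESeq(dds)
  # By default, results() tests the last variable in the design:
  # last factor level versus the reference level.
  results(dds)
}
```

With the real data, `coldata` would be built from the GEO metadata of step 1, and its rows must be in the same order as the columns of `counts`. Comparing the gene lists returned by the two helpers is a useful sanity check.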
### 3. Parathyroid example
The data were presented in the article "Evidence of a Functional Estrogen Receptor in Parathyroid Adenomas" by Haglund F, Ma R, Huss M, Sulaiman L, Lu M, Nilsson IL, Hoog A, Juhlin CC, Hartman J, Larsson C (J Clin Endocrinol Metab, jc.2012-2484, Epub 2012 Sep 28, PMID: 23024189). The sequencing was performed on tumor cultures from 4 patients at 2 time points and under 3 conditions (DPN, OHT and control). One control sample was omitted by the paper authors due to low quality. The package vignette describes how the objects were created from the raw sequencing data provided by the NCBI Gene Expression Omnibus under accession number GSE37211. The gene and exon features are the GRCh37 Ensembl annotations, release 66.
The data can be loaded with `data("parathyroidGenesSE", package="parathyroidSE")`.
1. Summarize the technical repeats.
2. Preprocess the data.
3. Model the data using edgeR and DESeq2.
4. Perform statistical tests to assess the effect of the stimuli at the early timepoint, at the late timepoint, and whether the effect of the stimulus changes over time.
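
Two of these steps can be sketched in base R on toy inputs. The first helper sums the counts of runs that belong to the same sample, which is one common way to summarize technical repeats. The second part builds a design with a treatment-by-time interaction and writes the three questions of step 4 as contrast vectors. The helper names, the factor level names (`Control`, `DPN`, `OHT`, `24h`, `48h`) and the complete toy layout are assumptions for illustration; the real data lack one control sample.

```r
# Hypothetical helper: sum the counts of technical repeats per sample.
# `counts` is a gene-by-run matrix, `sample_id` gives the sample of each run.
collapse_technical_repeats <- function(counts, sample_id) {
  # rowsum() sums rows by group, so transpose to sum over runs (columns)
  t(rowsum(t(counts), group = sample_id))
}

# Toy design mirroring the layout described above (level names are assumptions).
design_df <- expand.grid(
  patient = factor(1:4),
  treatment = factor(c("Control", "DPN", "OHT")),
  time = factor(c("24h", "48h"))
)
design <- model.matrix(~ patient + treatment * time, data = design_df)

# Hypothetical helper: a contrast vector with the given weights, zero elsewhere.
make_contrast <- function(design, weights) {
  contrast <- setNames(numeric(ncol(design)), colnames(design))
  contrast[names(weights)] <- weights
  contrast
}

# DPN versus control at the early timepoint (24h is the reference level)
dpn_early <- make_contrast(design, c(treatmentDPN = 1))
# DPN versus control at the late timepoint: main effect plus interaction
dpn_late <- make_contrast(design, c(treatmentDPN = 1, "treatmentDPN:time48h" = 1))
# Change of the DPN effect over time: the interaction alone
dpn_change <- make_contrast(design, c("treatmentDPN:time48h" = 1))
```

The same pattern applies to OHT. Such numeric vectors can be passed to `glmQLFTest(fit, contrast = ...)` in edgeR, or to `results(dds, contrast = ...)` in DESeq2, provided the coefficient order matches that of the fitted model.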