05. RNASeq and miRNASeq Data
tranSMART has three tables for NGS data: two for RNA sequencing (RNASeq) data and one for miRNA sequencing (miRNASeq) data. One of the RNASeq tables is intended for raw read count observation data, which can be used with the Group Test for RNASeq Advanced Workflow. The R package behind this workflow (edgeR) includes a normalization step. I believe this special case of RNASeq data loading is not part of the original set of HDD data developed by Sanofi; perhaps it was developed for a customer and then contributed by Hyve. At this time, tMDataLoader does not have a specific procedure for loading RNASeq data in raw read count observation format.

All other Analysis Workflows require pre-normalized RNASeq data (RPKM, FPKM, TPM, etc.), which is loaded in the same way as Expression data. Even though there is a special table for miRNA sequencing data, such data can also be loaded as RNASeq. This is mostly a matter of which standard IDs you prefer: for RNASeq data, the “probes” (in this case transcript IDs) are mapped to standard gene symbols, while for miRNA sequencing data miRBase symbols are used as standard IDs.
RNASeq Data (sample)
RNASeq data is loaded from the RNASeqDataToUpload directory in the same way as Expression data. For more details, see “Expression Data”.
TranscriptID | S57023 | S57024 |
---|---|---|
NM_001011874 | 0 | 0.0093 |
NM_001195662 | 0.0384 | 0.051 |
The last character of the data file name (before the extension) must be one of the following letters:
- R: raw data. Values are loaded into the Value column. Raw values are transformed to calculate the log2 value and z-score; log2 values are loaded into the Log value column.
- L: log2 data. Values are loaded into the Log value column; raw values are restored and loaded into the Value column. The z-score is calculated.
- T and Z: z-score data. Both letters have the same meaning: a value is written to the z-score column without modification if it is within the range (-2.5; 2.5); otherwise it is truncated to this range.

NOTE: if data is loaded as R, 0.001 is added to all values before the log2 transformation to avoid dropping 0 values (0 can't be log-transformed). Normalized raw RNASeq data are not expected to have any negative values.
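The file-name rule and the NOTE about the 0.001 offset can be sketched in Python as follows. This is only an illustration under assumptions, not the loader's actual code: the function names and the file name `GSE_A_37424_RNASeq_R.txt` are hypothetical, and the letter is assumed to be case-sensitive.

```python
import math
from pathlib import Path

LOG2_OFFSET = 0.001  # added to raw values before log2, as described in the NOTE


def data_type_from_filename(filename: str) -> str:
    """Return the data-type letter: the last character before the extension."""
    stem = Path(filename).stem
    if not stem or stem[-1] not in {"R", "L", "T", "Z"}:
        raise ValueError(f"no valid data-type letter in {filename!r}")
    return stem[-1]


def log2_of_raw(value: float) -> float:
    """log2 of a raw value after adding the offset, so that 0 is not dropped."""
    if value < 0:
        raise ValueError("normalized raw values are not expected to be negative")
    return math.log2(value + LOG2_OFFSET)


# Hypothetical file name, used only for illustration.
print(data_type_from_filename("GSE_A_37424_RNASeq_R.txt"))
print(log2_of_raw(0.0))
```

With the offset, a raw value of 0 maps to log2(0.001), roughly -9.97, instead of being undefined.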
#PLATFORM_ID: RNASeq999
#PLATFORM_TITLE: Test RNASeq Platform
#SPECIES: Homo Sapiens
Transcript ID | Gene Symbol | Organism |
---|---|---|
NM_001011874 | MEF2C | Homo sapiens |
NM_001195662 | ALDH8A1 | Homo sapiens |
NM_011283 | MEF2A | Homo sapiens |
NM_011441 | MEF2C | Homo sapiens |
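To illustrate how the platform sample above maps transcript IDs to gene symbols, here is a minimal Python sketch that reads the `#KEY: value` header fields and builds a transcript-to-gene lookup. It assumes a tab-delimited file with the column names shown in the sample; the function name `parse_platform` and the inline text are hypothetical and are not part of tMDataLoader.

```python
import csv
import io
import re

# Inline copy of part of the sample above; tab-delimited layout is an assumption.
PLATFORM_TEXT = (
    "#PLATFORM_ID: RNASeq999\n"
    "#PLATFORM_TITLE: Test RNASeq Platform\n"
    "#SPECIES: Homo Sapiens\n"
    "Transcript ID\tGene Symbol\tOrganism\n"
    "NM_001011874\tMEF2C\tHomo sapiens\n"
    "NM_001195662\tALDH8A1\tHomo sapiens\n"
)

# Matches one "#KEY: value" pair; also works if several pairs share a line.
HEADER_FIELD = re.compile(r"#(\w+):\s*([^#]*)")


def parse_platform(text):
    """Split platform text into header metadata and a transcript -> gene map."""
    metadata, table_lines = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            for key, value in HEADER_FIELD.findall(line):
                metadata[key] = value.strip()
        elif line.strip():
            table_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(table_lines)), delimiter="\t")
    gene_by_transcript = {row["Transcript ID"]: row["Gene Symbol"] for row in reader}
    return metadata, gene_by_transcript


metadata, gene_by_transcript = parse_platform(PLATFORM_TEXT)
print(metadata["PLATFORM_ID"], gene_by_transcript["NM_001011874"])
```

Several transcript IDs may map to the same gene symbol, as MEF2C does in the sample, so the lookup is keyed by transcript ID.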
STUDY_ID | SITE_ID | SUBJECT_ID | SAMPLE_ID | PLATFORM | TISSUETYPE | ATTR1 | ATTR2 | CATEGORY_CD | SOURCE_CD |
---|---|---|---|---|---|---|---|---|---|
GSE_A_37424 | 0 | 1 | S57023 | RNASeq999 | Intestine | | | Biomarker_Data+PLATFORM+TISSUETYPE | STD |
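Before loading, it can help to confirm that every sample column in the data file has a row in the mapping file. The Python sketch below is one way to do that. It is a hypothetical pre-load check rather than something the loader is documented to perform, and the helper name `unmapped_samples` is an assumption.

```python
def unmapped_samples(data_header, mapping_rows):
    """Return data-file sample columns that have no SAMPLE_ID row in the mapping."""
    mapped = {row["SAMPLE_ID"] for row in mapping_rows}
    # The first header cell is the transcript ID column, not a sample.
    return [sample for sample in data_header[1:] if sample not in mapped]


# Values copied from the samples above.
data_header = ["TranscriptID", "S57023", "S57024"]
mapping_rows = [
    {
        "STUDY_ID": "GSE_A_37424",
        "SUBJECT_ID": "1",
        "SAMPLE_ID": "S57023",
        "PLATFORM": "RNASeq999",
    }
]

print(unmapped_samples(data_header, mapping_rows))
```

For the sample files shown here, the check reports S57024, because the mapping sample contains a row only for S57023.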
Normalized miRNASeq Data Loading Instructions
miRNASeq data is loaded from the MIRNA_SEQDataToUpload directory in the same way as Expression data. For more details, see “Expression Data”.
miRNASeq Data (sample)
ID_REF | GSM918942 | GSM918943 | GSM918944 | GSM918945 | GSM918946 | GSM918947 | GSM918948 | GSM918949 |
---|---|---|---|---|---|---|---|---|
1 | 0.002908561 | 0.004549935 | 0.021626957 | 0.015697885 | 0.005178485 | 0.00498247 | 0.005311656 | 0.010319512 |
2 | 0.01039278 | 0.010017933 | 0.038167632 | 0.040012373 | 0.010484615 | 0.010744884 | 0.011629359 | 0.023468306 |
3 | 0.006034899 | 0.010552801 | 0.035375773 | 0.027333613 | 0.007408354 | 0.00969822 | 0.0095548 | 0.015315651 |
The last character of the data file name (before the extension) must be one of the following letters:
- R: raw data. Values are loaded into the Value column. Raw values are transformed to calculate the log2 value and z-score; log2 values are loaded into the Log value column.
- L: log2 data. Values are loaded into the Log value column; raw values are restored and loaded into the Value column. The z-score is calculated.
- T and Z: z-score data. Both letters have the same meaning: a value is written to the z-score column without modification if it is within the range (-2.5; 2.5); otherwise it is truncated to this range.

NOTE: if data is loaded as R, 0.001 is added to all values before the log2 transformation to avoid dropping 0 values (0 can't be log-transformed). Normalized raw miRNASeq data are not expected to have any negative values.
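The L and T/Z rules can be sketched in Python as follows. Both helpers are hypothetical. In particular, restoring raw values as a plain `2 ** x` is an assumption, because the text does not say whether the 0.001 offset is subtracted again.

```python
def restore_raw(log2_value: float) -> float:
    """Invert a log2 value to the raw scale (plain 2**x; no offset correction assumed)."""
    return 2.0 ** log2_value


def truncate_zscore(z: float, limit: float = 2.5) -> float:
    """Clamp a z-score to [-limit, limit]; values already inside are unchanged."""
    return max(-limit, min(limit, z))


print(restore_raw(3.0))
print(truncate_zscore(3.1), truncate_zscore(-4.0), truncate_zscore(1.2))
```

A z-score of 3.1 is stored as 2.5 and -4.0 as -2.5, while 1.2 passes through unchanged.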
#PLATFORM_ID: GPL15467seqbased
#PLATFORM_TITLE: Test MIRNAseq Platform
#SPECIES: Homo Sapiens
ID_REF | MIRNA_ID | SN_ID | PLT_NAME | ORGANISM |
---|---|---|---|---|
1 | hsa-miR-1 | | GPL15467seqbased | Homo Sapiens |
2 | hsa-miR-9 | | GPL15467seqbased | Homo Sapiens |
3 | hsa-miR-10a | | GPL15467seqbased | Homo Sapiens |
STUDY_ID | SITE_ID | SUBJECT_ID | SAMPLE_CD | PLATFORM | TISSUETYPE | ATTRIBUTE_1 | ATTRIBUTE_2 | CATEGORY_CD | SOURCE_CD |
---|---|---|---|---|---|---|---|---|---|
mirnaseqbased | | GSM918942 | GSM918942 | GPL15467seqbased | Human | Synovium | | Biomarker_Data+PLATFORM+ATTR1 | STD |