Detection and quantification of events involving novel splicing junctions #277

ArashDepp · 2023-03-24T22:39:47Z

Enjoying the great tool, but I have another question:
One of my sample had following stats of used and unused reads (following the 2-pass alignment strategy with STAR)

read outcome totals across all BAMs
USED: 84178584
NOT_PAIRED: 3698345
NOT_NH_1: 34262716
NOT_EXPECTED_CIGAR: 373242
NOT_EXPECTED_READ_LENGTH: 2487676
NOT_EXPECTED_STRAND: 0
EXON_NOT_MATCHED_TO_ANNOTATION: 1047715
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 192307
CLIPPED: 0
total: 126240585

So I have two questions:

rMATs does not use reads with EXON_NOT_MATCHED_TO_ANNOTATION and JUNCTION_NOT_MATCHED_TO_ANNOTATION condition. But since rMATs can detect and quantify splicing events involving new splicing junctions and splice sites, should not it also use those reads as these are the ones informing the new events?? Could you please solve this confusion?
Secondly, progress report from STAR says that 87.5% reads had unique mapping and 9.5% were multi mappers. Here is the snapshot

But rMATs counts 34262716 as not being NOT_NH_1: Which means nearly 34262716 / 126240585 = 27% reads are multi-mappers. Why this number NOT_NH_1 so much off the STAR calculation (~9.5% in the image)?
samtools stats BAMFILE shows that there were 101268968 paired reads and 97749854 were properly paired. So I am happy that rMATs can use >80 percent of the properly paired reads. But just that the number 34262716 does not fit into picture with what STAR outputs.

Could you please clarify these issues?

The text was updated successfully, but these errors were encountered:

ArashDepp · 2023-03-26T03:26:28Z

Regarding second question: one guess is that rMATs is counting the number of alignments (as opposed to reads). Is that right or make sense? Because the total number reported by rMATs is more than the number of reads in fasta file.

EricKutschera · 2023-03-28T14:22:32Z

rMATS will detect two kinds of novel events that correspond to these two output files: https://github.com/Xinglab/rmats-turbo/tree/v4.1.2#output

fromGTF.novelJunction.[AS_Event].txt: Alternative splicing (AS) events which were identified only after considering the RNA (as opposed to analyzing the GTF in isolation). This does not include events with an unannotated splice site.
fromGTF.novelSpliceSite.[AS_Event].txt: This file contains only those events which include an unannotated splice site. Only relevant if --novelSS is enabled.

The novelJunction events are those where all the exons are annotated in the gtf, but the combination of exons is different. In those cases rMATS can find exons and splice sites in the gtf that explain the alignment, but there is no transcript in the gtf which uses the exact sequence of exons

The novelSpliceSite events use at least 1 splice site which is not in the gtf. The splice site has to be found by looking at an existing exon-exon junction in the gtf and moving the splice site subject to these parameters: --mil (minimum intron length (defaults to 50)) and --mel (maximum exon length (defaults to 500)). Basically rMATS can only use an alignment with a novel splice site if it can define a new exon by making an adjustment to an annotated exon subject to those parameter values

Yes rMATS is counting individual alignments. If one read has 5 total alignments then rMATS will report it 5 times as NOT_NH_1

ArashDepp · 2023-03-28T17:55:20Z

Hello Eric.
Thanks a lot for your answer and input. So if I understood your response correctly and based on following stats:

EXON_NOT_MATCHED_TO_ANNOTATION: 1047715
JUNCTION_NOT_MATCHED_TO_ANNOTATION: 192307

So 1047715 + 192307 reads do not match the definition of exon and intron length and are discarded by rMATs. But rMATs is utilizing the other reads which do not match the exon and junction annotation provided they are within the defined exon/intron length limits.

Also do you think the number 1047715 is too large? (it is ~1% of the total reads/alignments)

EricKutschera · 2023-03-28T18:10:53Z

Yes, rMATS is utilizing other reads which do not match the exon and junction annotation provided they are within the defined exon/intron length limits if --novelSS was used (--novelSS is not on by default)

Also rMATS only attempts to find a novel splice site for an alignment if that alignment includes a splice junction. The JUNCTION_NOT_MATCHED_TO_ANNOTATION reads are the alignments that had a splice junction, but rMATS could not find a suitable novel splice site and so rMATS decided to discard those reads. The EXON_NOT_MATCHED_TO_ANNOTATION alignments didn't have a splice junction in their alignment, the alignment was not within an annotated exon, and rMATS discarded them without trying to find a novel splice site. The novel splice site parameters (--mil and --mel) only apply to alignments with junctions

The number of alignments not matched to the annotation seems fine at 1%

ArashDepp · 2023-03-28T18:17:49Z

Okay. Thanks again for taking time to clarify these doubts and appreciate your help 👍

EricKutschera mentioned this issue Oct 6, 2023

rMATS just detects SE and MXE events #327

Closed

EricKutschera mentioned this issue Jan 29, 2024

To identify differential alternative splicing between two groups #320

Open

EricKutschera mentioned this issue Apr 26, 2024

Discrepancy Between rMATS-turbo Results and IGV Visualization #399

Open

EricKutschera mentioned this issue May 10, 2024

SE events in output table do not match to the GTF annotation #401

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Detection and quantification of events involving novel splicing junctions #277

Detection and quantification of events involving novel splicing junctions #277

ArashDepp commented Mar 24, 2023

ArashDepp commented Mar 26, 2023

EricKutschera commented Mar 28, 2023

ArashDepp commented Mar 28, 2023

EricKutschera commented Mar 28, 2023

ArashDepp commented Mar 28, 2023

Detection and quantification of events involving novel splicing junctions #277

Detection and quantification of events involving novel splicing junctions #277

Comments

ArashDepp commented Mar 24, 2023

ArashDepp commented Mar 26, 2023

EricKutschera commented Mar 28, 2023

ArashDepp commented Mar 28, 2023

EricKutschera commented Mar 28, 2023

ArashDepp commented Mar 28, 2023