Question about no_feature count in transcript_model_counts.tsv #141

FabianJetzinger · 2024-01-11T14:15:41Z

Hello!

I'm trying to understand the output of IsoQuant in more detail, specifically .transcript_counts.tsv and .transcript_model_counts.tsv, using the toy data (MAPT.Mouse) from this repository.

Running:
isoquant.py --reference MAPT.Mouse.reference.fasta --genedb MAPT.Mouse.genedb.gtf --fastq MAPT.Mouse.ONT.simulated.fastq --data_type nanopore -o toy_data_out

While there are 12 transcripts in OUT.transcript_counts.tsv, there are only 10 in OUT.transcript_model_counts.tsv. I take this to mean that for the two transcripts "ENSMUST00000100347.10" and "ENSMUST00000146353.1", IsoQuant does not see enough evidence in the reads to include their transcript models, even though they are known reference transcripts. This seems to be supported by the fact that OUT.read_assignments.tsv shows no FSMs for these two transcripts, only ISMs.

What confuses me, however, is that OUT.transcript_counts.tsv shows "no_feature 15", while OUT.transcript_model_counts.tsv shows "no_feature 0", where I would have expected either "no_feature 15" (the same 15 completely unassigned reads), or even "no_feature X" (where X is the 15 unassigned reads, plus the number of reads assigned to the transcripts which were not included in the transcript_models).
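The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming both files are two-column, tab-separated tables (feature ID, count) with an optional `#`-prefixed header and a `no_feature` summary row, possibly written with leading underscores. The function names `read_counts` and `compare_counts`, the placeholder ID `ENSMUST_EXAMPLE.1`, and all per-transcript counts are made up for the example; only the two transcript IDs and the `no_feature` values 15 and 0 come from the run described here.

```python
import csv
import io


def read_counts(handle):
    """Parse a two-column, tab-separated counts table into {feature_id: count}."""
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank lines and '#'-prefixed header lines
        try:
            # Assumption: summary rows may carry leading underscores; strip them.
            counts[row[0].lstrip("_")] = float(row[1])
        except ValueError:
            continue  # non-numeric second column, e.g. an unprefixed header
    return counts


def compare_counts(ref_counts, model_counts, summary_keys=("no_feature",)):
    """Report transcripts absent from the model counts and both no_feature values."""
    ref_ids = set(ref_counts) - set(summary_keys)
    model_ids = set(model_counts) - set(summary_keys)
    return {
        "missing_from_models": sorted(ref_ids - model_ids),
        "no_feature": (ref_counts.get("no_feature"), model_counts.get("no_feature")),
    }


# Illustrative data only: per-transcript counts are invented, not taken from the real output.
REF_TSV = (
    "#feature_id\tcount\n"
    "ENSMUST00000100347.10\t3\n"
    "ENSMUST00000146353.1\t2\n"
    "ENSMUST_EXAMPLE.1\t40\n"
    "no_feature\t15\n"
)
MODEL_TSV = (
    "#feature_id\tcount\n"
    "ENSMUST_EXAMPLE.1\t38\n"
    "no_feature\t0\n"
)

result = compare_counts(
    read_counts(io.StringIO(REF_TSV)),
    read_counts(io.StringIO(MODEL_TSV)),
)
print(result)
```

To run this against real output, the `io.StringIO` objects could be replaced with `open(path, newline="")` handles for the two count files.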

See isoquant_OUT.zip

What am I missing or misunderstanding here? Is this number just always 0 for the transcript_model? Sadly I wasn't able to come up with an answer from reading older issues, so any help would be greatly appreciated.


andrewprzh · 2024-01-24T10:22:04Z

Dear @FabianJetzinger

Sorry for the delayed response.

Yes, you are right: inconsistencies between OUT.transcript_counts.tsv and OUT.transcript_model_counts.tsv are normal, since the two files differ in nature and are produced by different underlying algorithms. However, there are known minor flaws, and I'm currently working on making the two outputs more consistent with each other.

With respect to the no_feature attribute, you are also right: it seems that this field is simply ignored in OUT.transcript_model_counts.tsv.
I will fix that, thanks for the report!

Best
Andrey

FabianJetzinger · 2024-01-25T16:43:03Z

Thanks for your reply, that makes it much clearer!

andrewprzh · 2024-05-09T09:27:49Z

This should now be fixed in IsoQuant 3.4.

Also, the correlation between transcript_model_counts and transcript_counts is better now.

andrewprzh added the "bug" (Something isn't working) and "weird results" (Something looks odd in the resulting files) labels on Jan 24, 2024

andrewprzh closed this as completed May 9, 2024
