Problem with gtf file #42

Closed
LiuJJ0327 opened this issue Jun 15, 2020 · 7 comments

Comments

@LiuJJ0327

Hi, I downloaded the hg19 GTF file from the UCSC Genome Browser and used the -g option in strawberry; however, I get the same problem as in issue #20.

error information:
strawberry: /root/strawberry/src/alignments.cpp:1859: double compute_doc(uint, uint, const std::vector&, std::vector&, IntronMap&, uint): Assertion `gf.right() >= gf.left()' failed.

I saw that it may be caused by a one-bp intron in the GTF file and that this was fixed in 0.9.3. However, I installed the latest version, 1.1.1, and still have this problem.
Do you have any idea what might be wrong?

@ruolin
Owner

ruolin commented Jun 15, 2020

@LiuJJ0327 Thanks for reporting the bug. I have updated the master branch to c2a4640. Could you try this version?

@LiuJJ0327
Author

> @LiuJJ0327 Thanks for reporting the bug. I have updated the master branch to c2a4640. Could you try this version?

I tried the latest version, but now I get a new error. Please see below.

Has loaded transcripts from 25 Chromosomes/Scaffolds
strawberry: /root/strawberry/src/alignments.cpp:322: void HitCluster::addRefContig(const Contig&): Assertion `_ref_id == contig.ref_id()' failed.
Aborted

@ruolin
Owner

ruolin commented Jun 16, 2020

Can you check whether your GTF file uses the same chromosome IDs as your BAM file? The typical problem here is that one file has the chr prefix and the other does not.
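
One way to run this check is sketched below. It is a minimal sketch, not part of strawberry: the function names are hypothetical, and it assumes the BAM header has first been dumped to text (for example with `samtools view -H test.bam > header.sam`), so that no BAM-reading library is needed.

```python
def gtf_chromosomes(gtf_lines):
    """Collect the sequence names found in column 1 of GTF records."""
    names = set()
    for line in gtf_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        names.add(line.split("\t", 1)[0])
    return names


def sam_header_chromosomes(header_lines):
    """Collect the SN: values from the @SQ lines of a SAM/BAM header."""
    names = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                names.add(field[3:])
    return names


def _strip_chr(name):
    """Drop a leading 'chr' so that 'chr1' and '1' compare equal."""
    return name[3:] if name.startswith("chr") else name


def compare_chromosomes(gtf_names, bam_names):
    """Report names unique to each file and flag a likely chr-prefix mismatch."""
    only_gtf = gtf_names - bam_names
    only_bam = bam_names - gtf_names
    # If unmatched names agree once the prefix is removed, the two files
    # probably differ only in the 'chr' naming convention.
    prefix_mismatch = bool(
        {_strip_chr(n) for n in only_gtf} & {_strip_chr(n) for n in only_bam}
    )
    return {
        "only_in_gtf": sorted(only_gtf),
        "only_in_bam": sorted(only_bam),
        "chr_prefix_mismatch": prefix_mismatch,
    }
```

Typical use would be to pass open file handles, e.g. `compare_chromosomes(gtf_chromosomes(open("hg19_sed.gtf")), sam_header_chromosomes(open("header.sam")))`. Names that appear only in the GTF are not necessarily an error (the BAM may simply lack those scaffolds), but a `chr_prefix_mismatch` of `True` points at the naming problem described above.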

@LiuJJ0327
Author

> Can you check whether your GTF file uses the same chromosome IDs as your BAM file? The typical problem here is that one file has the chr prefix and the other does not.

I tested using: strawberry test.bam -o test_out.gtf -g hg19_sed.gtf -p 8
but I still get the same error. test.bam and hg19_sed.gtf use the same chromosome names.
My real BAM file is H1_rep1.bam. Could you help me figure out the problem? My data are linked below. Thank you so much.

https://drive.google.com/file/d/1muxvaxDc3ZGMG0pR-e2iTVvlBUMkNLfT/view?usp=sharing
https://drive.google.com/file/d/1ZUDjn0yrMQauLdtIQtkdFjfnLNX5IoZ7/view?usp=sharing
https://drive.google.com/file/d/1VDugrDHMDG0hAeUhC_rRxXEqiyhkqQaf/view?usp=sharing

@ruolin
Owner

ruolin commented Jun 19, 2020

@LiuJJ0327 Again, thanks for sharing the test data. After a while, I found that this corner case is causing the problem:

64526:chr1	hg19_refGene	exon	17369	17436	.	-	.	gene_name "MIR6859-3"; gene_id "NR_107063"; transcript_id "NR_107063";
409818:chr15	hg19_refGene	exon	102513727	102513794	.	+	.	gene_name "MIR6859-3"; gene_id "NR_107063"; transcript_id "NR_107063";
461414:chr16	hg19_refGene	exon	67052	67119	.	-	.	gene_name "MIR6859-3"; gene_id "NR_107063"; transcript_id "NR_107063";

As you can see, the same gene (with the same gene_id and transcript_id) is located on more than one chromosome: chr1, chr15, and chr16. I fixed this issue in v1.1.2. I have tested the new version on your input data and it works fine.
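
For anyone who wants to look for such records in their own annotation, here is a minimal sketch that lists every transcript_id appearing on more than one chromosome. It is an illustration only, not the check strawberry performs internally; the function name is hypothetical, and it assumes standard tab-separated GTF lines with a `transcript_id "..."` attribute in column 9.

```python
import re
from collections import defaultdict

# Matches the transcript_id attribute in the GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcripts_on_multiple_chromosomes(gtf_lines):
    """Map each transcript_id seen on more than one chromosome to those chromosomes."""
    chroms = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            chroms[match.group(1)].add(fields[0])
    return {tid: sorted(names) for tid, names in chroms.items() if len(names) > 1}
```

Calling it on an open file handle, e.g. `transcripts_on_multiple_chromosomes(open("hg19_sed.gtf"))`, returns a dictionary keyed by transcript_id. Note that it does not catch a transcript_id reused at several loci on the same chromosome; that would need a separate check on coordinates.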

@ruolin
Owner

ruolin commented Jun 24, 2020

@LiuJJ0327 Does the new version work for you?

ruolin closed this as completed Jul 21, 2020

@LiuJJ0327
Author

> @LiuJJ0327 Does the new version work for you?

Yes. Thank you so much.
