Troubles with GTF files #20

Closed
lietai opened this issue Jul 20, 2018 · 10 comments

lietai commented Jul 20, 2018

I am currently working on human data mapped with hisat2, using the GENCODE FASTA file and annotation:

ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_28/gencode.v28.annotation.gtf.gz
ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_28/GRCh38.primary_assembly.genome.fa.gz

I am having trouble with the GTF option.
It crashes with this error:

Inspecting sample......
strawberry: /home/ruolin/git/strawberry/src/alignments.cpp:1789: double compute_doc(uint, uint, const std::vector&, std::vector&, IntronMap&, uint): Assertion `gf.right() > gf.left()' failed.
Abandon (core dumped)

I also tried with the example data, and it crashed as well.

I tried other GTF files, with no more luck.
Any ideas?

Or maybe you could tell me where you get your GTF file from.

Owner

ruolin commented Jul 21, 2018

@47Lies Did you use the gzipped (.gz) file? If so, please unzip it first.
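
For readers who want to rule this out quickly, here is a minimal Python sketch that checks whether an annotation file is still gzip-compressed and decompresses it if so. It is not part of strawberry; the helper names `looks_gzipped` and `gunzip_to` are hypothetical, and the check only inspects the two-byte gzip magic number.

```python
import gzip
import shutil


def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes 0x1f 0x8b."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"\x1f\x8b"


def gunzip_to(src_path, dst_path):
    """Decompress a gzip file to dst_path, streaming so large GTFs fit in memory."""
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
```

For the files in this report, that would mean checking `gencode.v28.annotation.gtf.gz` and writing the plain `gencode.v28.annotation.gtf` before passing it to the GTF option.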

Author

lietai commented Jul 23, 2018

I used the unzipped file.

Owner

ruolin commented Jul 23, 2018

@47Lies Thank you for reporting! I can reproduce your error. Will get back to you shortly.

Owner

ruolin commented Jul 25, 2018

@47Lies Thanks again. This bug is triggered by a one-bp intron in the gff file, such as the one between the two exon records below (the single base at position 6457265). This has been fixed in 0.9.3: https://github.com/ruolin/strawberry/tree/0.9.3

chr1 ENSEMBL exon 6457184 6457264 . + . gene_id "ENSG00000187017.16";
chr1 ENSEMBL exon 6457266 6457274 . + . gene_id "ENSG00000187017.16";
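
To see whether an annotation contains other gaps like this one, a short Python sketch along the following lines can scan a GTF for very short introns. It is an illustration only, not strawberry's own logic: the function name `find_short_introns` and the `max_len` parameter are hypothetical, exons are grouped by `transcript_id` when present and by `gene_id` otherwise, and coordinates are treated as 1-based inclusive, as in the GTF format.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def find_short_introns(gtf_lines, max_len=1):
    """Return (chrom, strand, group, intron_start, intron_end) for every gap
    of 1..max_len bases between consecutive exons of the same group."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header or comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # blank, malformed, or non-exon record
        attrs = dict(ATTR_RE.findall(fields[8]))
        # Assumption: fall back to gene_id when no transcript_id is given.
        group = attrs.get("transcript_id") or attrs.get("gene_id")
        exons[(fields[0], fields[6], group)].append((int(fields[3]), int(fields[4])))

    hits = []
    for (chrom, strand, group), spans in exons.items():
        spans.sort()
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            # Inclusive intron coordinates between two neighbouring exons.
            intron_start, intron_end = prev_end + 1, next_start - 1
            length = intron_end - intron_start + 1
            if 1 <= length <= max_len:
                hits.append((chrom, strand, group, intron_start, intron_end))
    return hits
```

Passing an open handle on the unzipped annotation to `find_short_introns` returns the gaps found. For the two records above, the intron start and end are both 6457265, so with inclusive coordinates a check of the form `end > start` cannot hold. That matches the assertion messages in this thread, although the exact coordinate convention inside strawberry is an assumption here.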

Author

lietai commented Jul 25, 2018

@ruolin Thank you for your quick response. It does work with your example BAM, geuvadis_300/sample_01.sorted.bam.

But it still fails with my own BAM file:

Inspecting sample......
strawberry: /home/ruolin/git/strawberry/src/alignments.cpp:1939: void filter_intron(const string&, uint, uint, const std::vector&, IntronMap&): Assertion `end > start' failed.

I suspect the same kind of issue as before.

Should I open a new issue?

Do you need one of my .bam files?

Owner

ruolin commented Jul 25, 2018

@47Lies Thank you for testing it! No, you don't need to open a new issue; I think this is still related. If you could share your BAM file with me, it would be very helpful.

Author

lietai commented Jul 26, 2018

@ruolin I managed to extract a few reads that make strawberry crash with the gencode.v28 GTF:

CRASH.bam.gz

I gzipped the file only so that GitHub would accept it.

Best regards

Owner

ruolin commented Jul 27, 2018

@47Lies Thank you very much for providing the test case; it helped me a lot! I have fixed the bug and tested it on your BAM, and it seems to work now: https://github.com/ruolin/strawberry/releases/tag/0.9.3
Please test 0.9.3 on your whole BAM and let me know how it goes.

Author

lietai commented Jul 31, 2018

@ruolin It works, thank you.

Best regards.

Owner

ruolin commented Jul 31, 2018

@47Lies Thank you for your help and patience. I will close the issue now.
