hifiasm crashes with segfault on a toy Hi-C dataset #152

Open
sidorov-si opened this issue Jul 7, 2021 · 8 comments


@sidorov-si

sidorov-si commented Jul 7, 2021

Dear hifiasm team,

I'm developing an nf-core module for hifiasm, and I've tested haplotype phasing with hifiasm using a toy set of PacBio HiFi reads and a small set of Hi-C reads (please find attached). The HiFi reads come from a child genome (GIAB's HG002, SRR10382244), and the Hi-C reads come from a normal human lung tissue (SRR13061060) and are selected so that they map to the HiFi reads.

When I run the following command with hifiasm v0.15.4-r343

time hifiasm \
    -o test.asm \
    --h1 SRR13061060_10000reads_mapped_1.fastq \
    --h2 SRR13061060_10000reads_mapped_2.fastq \
    SRR10382244_subset.fastq

in a conda env on my Mac, it fails quickly with Segmentation fault: 11 (please see the full run log attached).

However, it still produces some output:

-rw-r--r--  1 sidoros  1934034978   4.4M  7 Jul 16:51 test.asm.hic.tlb.bin
-rw-r--r--  1 sidoros  1934034978   4.2K  7 Jul 16:51 test.asm.hic.p_ctg.lowQ.bed
-rw-r--r--  1 sidoros  1934034978   4.9K  7 Jul 16:51 test.asm.hic.p_ctg.noseq.gfa
-rw-r--r--  1 sidoros  1934034978   579K  7 Jul 16:51 test.asm.hic.p_ctg.gfa
-rw-r--r--  1 sidoros  1934034978   5.3K  7 Jul 16:51 test.asm.hic.p_utg.lowQ.bed
-rw-r--r--  1 sidoros  1934034978   5.8K  7 Jul 16:51 test.asm.hic.p_utg.noseq.gfa
-rw-r--r--  1 sidoros  1934034978   686K  7 Jul 16:51 test.asm.hic.p_utg.gfa
-rw-r--r--  1 sidoros  1934034978   5.3K  7 Jul 16:51 test.asm.hic.r_utg.lowQ.bed
-rw-r--r--  1 sidoros  1934034978   5.8K  7 Jul 16:51 test.asm.hic.r_utg.noseq.gfa
-rw-r--r--  1 sidoros  1934034978   686K  7 Jul 16:51 test.asm.hic.r_utg.gfa
-rw-r--r--  1 sidoros  1934034978    11K  7 Jul 16:51 test.asm.ovlp.reverse.bin
-rw-r--r--  1 sidoros  1934034978    44K  7 Jul 16:51 test.asm.ovlp.source.bin
-rw-r--r--  1 sidoros  1934034978   934K  7 Jul 16:51 test.asm.ec.bin

What could be the reason for the segfault?

Thank you,
Slava

hifiasm_output.tar.gz
SRR10382244_subset.fastq.gz
SRR13061060_10000reads_mapped_1.fastq.gz
SRR13061060_10000reads_mapped_2.fastq.gz
hifiasm_run_log.txt

@chhylp123
Owner

Let me have a look at it. But for this example, the coverage is too low for assembly.

@sidorov-si
Author

Thank you @chhylp123! In terms of coverage, do you mean the HiFi reads or the Hi-C reads? I selected the HiFi reads so that they all map to the same contig, so I hoped that they could be assembled.

@chhylp123
Owner

The k-mer plot looks weird. Normal HiFi data should have a k-mer plot like the one in #49 (comment).
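
For readers unfamiliar with the term, a k-mer plot is a histogram of how many distinct k-mers occur once, twice, three times, and so on across the reads. The following is a minimal sketch of that idea only; it is not hifiasm's implementation, and the function names, the toy reads, and the choice of `k=5` are assumptions made for illustration.

```python
from collections import Counter

# Translation table used to build reverse complements.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_histogram(reads, k):
    """Map each multiplicity to the number of distinct canonical k-mers seen that often."""
    counts = Counter()
    for seq in reads:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip k-mers that contain N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return Counter(counts.values())


# Hypothetical toy reads, far too short to be real HiFi data.
toy_reads = ["ACGTACGTAC", "CGTACGTACG", "TTTTGGGGCC"]
hist = kmer_histogram(toy_reads, k=5)
for multiplicity in sorted(hist):
    print(multiplicity, hist[multiplicity])
```

With well-covered data, such a histogram typically shows a clear peak near the sequencing depth; with only a few hundred reads there may be too few k-mers for a recognizable peak to form.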

@sidorov-si
Author

What do they look like? Maybe it's just because these are only 204 HiFi reads mapping to a particular HG002 contig, and on the whole HG002 PacBio HiFi run the k-mer profile would be different?

@chhylp123
Owner

Yes, I think so. So the coverage does not look sufficient. But hifiasm shouldn't crash even in this rare case, so I will have a look at it. For testing, I recommend trying a dataset with enough coverage.

@baozg

baozg commented Jul 8, 2021

@sidorov-si You can assemble the whole HG002 HiFi read set, or just one chromosome, and then grep the HiFi reads listed in the *.p_utg.noseq.gfa for pipeline testing. I have tried assembling a small set of HiFi reads (~200 reads), and it can produce an assembly.
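
The suggestion above could be sketched roughly as follows. This assumes that the reads making up each unitig are listed on `A` lines of the GFA, with the read name in the fifth tab-separated column; please check that against your own output file. The function name and the example lines are hypothetical.

```python
def read_names_from_gfa(lines):
    """Collect read names from GFA 'A' lines, keeping first-seen order without duplicates."""
    seen = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: A, unitig name, start, strand, read name, ...
        if len(fields) >= 5 and fields[0] == "A":
            seen.setdefault(fields[4], None)
    return list(seen)


# Hypothetical lines, made up for illustration (not taken from a real assembly).
example = [
    "S\tutg000001l\t*\tLN:i:25000\n",
    "A\tutg000001l\t0\t+\tread_a\t0\t12000\tid:i:1\tHG:A:a\n",
    "A\tutg000001l\t9000\t-\tread_b\t0\t16000\tid:i:2\tHG:A:a\n",
    "A\tutg000002l\t0\t+\tread_a\t0\t12000\tid:i:1\tHG:A:a\n",
]
print(read_names_from_gfa(example))
```

With a real file, the same function can be given an open file handle, for example `with open("test.asm.hic.p_utg.noseq.gfa") as fh: names = read_names_from_gfa(fh)`. The resulting names can then be written to a file and used to subset the original FASTQ.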

@sidorov-si
Author

Thank you @chhylp123! So, I'm using 204 reads that map to one contig from the whole HG002 assembly produced in your paper. How do you estimate the coverage?
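
For context, one common back-of-the-envelope estimate of coverage is the total number of read bases divided by the length of the region the reads come from. The sketch below shows only that generic arithmetic; it is not necessarily how hifiasm judges coverage, and the function names, the toy records, and the region length are assumptions for illustration.

```python
def fastq_total_bases(lines):
    """Sum sequence lengths in a FASTQ stream with exactly four lines per record."""
    total = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            total += len(line.strip())
    return total


def estimate_coverage(total_bases, region_length):
    """Return average depth as total read bases divided by region length."""
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return total_bases / region_length


# Hypothetical toy records; real HiFi reads are thousands of bases long.
toy_fastq = [
    "@r1\n", "ACGT\n", "+\n", "IIII\n",
    "@r2\n", "ACGTAC\n", "+\n", "IIIIII\n",
]
bases = fastq_total_bases(toy_fastq)
print(estimate_coverage(bases, region_length=5))
```

For the dataset in this issue, `region_length` would be the length of the contig the 204 reads map to, which has to be supplied by the user.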

@chhylp123
Owner

Sorry for the delay. This bug has been fixed in v0.15.5 (see: https://github.com/chhylp123/hifiasm/releases/tag/0.15.5).
