
markdup segmentation fault #393

Closed

ZhifeiLuo opened this issue Mar 28, 2019 · 66 comments

@ZhifeiLuo

sambamba 0.6.8 from bioconda

I ran markdup on a cluster and it generated a segmentation fault, leaving a core.* temporary file. The error is consistent across machines with different CPUs (either Xeon or AMD Opteron). Any thoughts?

@steffenheyne

Probably a similar issue here; I'm also on conda v0.6.8.

For test purposes I have a very small BAM, and the crash seems related to the number of threads used.
The same small BAM with -t 1..8 works fine; with -t 9 it segfaults.
Maybe some threads get zero data!?

$ sambamba markdup --remove-duplicates -t 9 in.bam out.bam

sambamba 0.6.8 by Artem Tarasov and Pjotr Prins (C) 2012-2018
    LDC 1.13.0 / DMD v2.083.1 / LLVM7.0.1 / bootstrap LDC - the LLVM D compiler (0.17.6)

finding positions of the duplicate reads in the file...
  sorted 324798 end pairs
     and 324 single ends (among them 0 unmatched pairs)
  collecting indices of duplicate reads...   done in 57 ms
  found 361424 duplicates
collected list of positions in 0 min 0 sec
Segmentation fault (core dumped)
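
If it helps triage, the failing thread count can be bracketed with a quick loop (a hypothetical sketch over the same in.bam):

    $ for t in $(seq 1 16); do
        sambamba markdup --remove-duplicates -t $t in.bam out.$t.bam \
          || echo "failed at -t $t"
      done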

@steffenheyne

Ah, maybe this is due to a high number of duplicates.
I increased the hash table size with --hash-table-size=4194304 and then it runs through without a segfault.
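
For reference, the full invocation with the larger hash table (same in.bam/out.bam as in my earlier test):

    $ sambamba markdup --remove-duplicates --hash-table-size=4194304 -t 9 in.bam out.bam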

@steffenheyne

So actually the segfaults seem more random: I changed parameters and sometimes it crashes, sometimes it runs through. So far it is unpredictable what causes this.

A bioconda version/build issue? The underlying OS is Ubuntu 18.04.

@steffenheyne

here is a dump
catchsegv.1.txt

@pjotrp
Member

pjotrp commented Apr 1, 2019

Thanks for the dump. I am happy to chase the problem. You can see it is happening in the garbage collector. Can you try the latest binary release and see if it does the same? https://github.com/biod/sambamba/releases

Also, what hardware are you on? Some Xeons have threading problems.
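
For anyone else hitting this: a backtrace from the core file is the most useful thing to attach. A generic recipe (binary path and core file name will differ on your system):

    $ ulimit -c unlimited                    # allow core dumps in this shell
    $ sambamba markdup -t 9 in.bam out.bam   # reproduce the crash
    $ gdb ./sambamba core                    # open the dump against the crashing binary
    (gdb) bt                                 # print the backtrace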

@pjotrp pjotrp self-assigned this Apr 1, 2019
@pjotrp pjotrp added the bug label Apr 1, 2019
@pjotrp pjotrp added this to the 0.7.0 milestone Apr 1, 2019
@steffenheyne

Different cloud servers, but the last one was an Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz.

dump from cloned master:

catchsegv.2.txt

I compiled with make debug, but the dump looks less informative than the one from the conda version.

@pjotrp
Member

pjotrp commented Apr 2, 2019

I think that is one of the Xeons with unreliable hyperthreading. Sambamba is one of the rare tools that brings it out by utilizing all cores. See also #335. Check the list by Intel.

@steffenheyne

Mhmm, that's bad. So using fewer cores could be more stable?

I also get a segfault with sambamba depth ...

@pjotrp
Member

pjotrp commented Apr 2, 2019

One option is to turn off hyperthreading. Sadly, that is hard in most HPC environments. Alternatively, don't use those machines, or rerun until it gets through. You can blame Intel.
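
Where you do have root, hyperthread siblings can be taken offline at runtime via sysfs, without touching the BIOS. A minimal sketch (Linux only; the CPU numbers are illustrative, check your own topology first):

    $ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list   # e.g. prints 0,8
    $ echo 0 | sudo tee /sys/devices/system/cpu/cpu8/online            # take the HT sibling offline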

@steffenheyne

Yeah, thanks!

Btw, I downgraded to v0.6.6 via conda and this seems more stable; so far I don't get any segfaults with the same data.

Does this version use a different underlying library or something like that?

@pjotrp
Member

pjotrp commented Apr 3, 2019

It is possible. What version of LLVM and ldc is it using? The underlying tools are evolving fast. Sambamba did not essentially change markdup between those versions.

@steffenheyne

It looks like the 0.6.6 conda version uses your pre-built packages from GitHub
(https://github.com/biod/sambamba/releases/download/v0.6.6/sambamba_v0.6.6_linux.tar.bz2),

whereas 0.6.8 is built from scratch on the conda server, as given here:
https://github.com/bioconda/bioconda-recipes/blob/master/recipes/sambamba/build.sh

@pjotrp
Member

pjotrp commented Apr 3, 2019 via email

@steffenheyne

steffenheyne commented Apr 3, 2019

Unfortunately this doesn't help: the pre-built 0.6.9 also segfaults, and so does a locally built 0.7.0-pre1.

But from what I observed, I rarely (never?) got segfaults with any version at 1-8 threads, and strangely also not with the local build at 9 threads; higher thread counts segfault really often (always?).

@steffenheyne

steffenheyne commented Apr 3, 2019

Yeah, with the pre-built 0.6.6 downloaded manually (not via conda) it also (so far) never crashes, with any number of threads.

@RichardCorbett

Hi folks,
I'm using sambamba-0.6.9-linux-static on CentOS 6 and 7 and getting what looks like a similar error.

I ran 40 different samples and only 10 hit this error, but I tried this command on a few machines and they all segfaulted.

Do you have any suggestions?

thanks,
Richard

/projects/COLO829SomaticTitration/sambamba-0.6.9-linux-static markdup --tmpdir /projects/COLO829SomaticTitration /projects/COLO829SomaticTitration/normals/A36973_5_lanes_dupsFlagged.bam.spec.bam.p1.bam /projects/COLO829SomaticTitration/normals/A36973_5_lanes_dupsFlagged.bam.spec.bam.p1.bam.dupmark.bam

sambamba 0.6.9 by Artem Tarasov and Pjotr Prins (C) 2012-2019
    LDC 1.14.0 / DMD v2.084.1 / LLVM7.0.1 / bootstrap LDC - the LLVM D compiler (0.17.6)

finding positions of the duplicate reads in the file...
  sorted 147834456 end pairs
     and 4860622 single ends (among them 1137 unmatched pairs)
  collecting indices of duplicate reads...   done in 35457 ms
  found 2953009 duplicates
collected list of positions in 26 min 49 sec
marking duplicates...
Segmentation fault (core dumped)

@pjotrp
Member

pjotrp commented Apr 5, 2019

@RichardCorbett what is the CPU you are on? Does using an older binary of sambamba work?

@RichardCorbett

CPU tested:
Intel Xeon E5-2650 @ 2.20 GHz

I just had some filesystem problems overnight, so retesting is going to be a challenge. In case it's relevant: I noticed I was testing on a BWA aln 0.5.7 BAM file, which includes non-zero mapping qualities for unaligned reads.

@pjotrp
Member

pjotrp commented Apr 5, 2019

Yeah, this may be another Xeon with problems. It will be interesting to see if older versions of sambamba+llvm+ldc show the same problem.

@steffenheyne can you confirm you are still good with the 0.6.6 binary? And both of you: is there any way I could access one of these machines so I can debug the problem? If 0.6.6 works, I could use that same build chain to compile a new binary and see if that keeps going.

@pjotrp
Member

pjotrp commented Apr 5, 2019

sambamba 0.6.6 was built with

    LDC 0.17.1
    using DMD v2.068.2
    using LLVM 3.8.0

@pjotrp
Member

pjotrp commented Apr 5, 2019

Another thing we could try is building sambamba with the GNU D compiler. That would fine-tune the diagnostics: whether the problem is in LLVM or in the D runtime.

@RichardCorbett

A second test run overnight reproduced the error on a machine with this CPU:
144 x Intel(R) Xeon(R) CPU E7-8867 v4 @ 2.40GHz

@pjotrp
Member

pjotrp commented Apr 5, 2019

Great. I am starting to suspect the D runtime. I should try hitting one of our machines hard.

@dpryan79
Contributor

dpryan79 commented Apr 9, 2019

I'll try rebuilding 0.6.6 in bioconda with more recent pinnings to see if (A) it still works without segfaulting and (B) it can still be combined with other recent packages in an environment. That will allow us (@steffenheyne and me) to get around the current issue. I agree that this is likely a D runtime issue.

@pjotrp
Member

pjotrp commented Apr 11, 2019

Thanks. I am too busy to sort it out now, but should have time by Easter.

@sirselim

Just adding my observation of this issue as well.
Processors: 2x Intel Xeon Gold 5118 (12 cores, Skylake-based)

Using conda Sambamba (0.6.8 and now 0.6.9) in a snakemake workflow, I am getting segfaults at the markdup stage.

After setting the markdup rule to call 0.6.6 from conda, things seem to be processing well.

@pjotrp
Member

pjotrp commented Apr 26, 2019

Thanks. This is very annoying; good thing we have 0.6.6.

@pjotrp
Member

pjotrp commented May 31, 2019

#13 at /home/wrk/izip/git/opensource/D/sambamba/BioD/bio/std/hts/bam/readrange.d:178

points at

            _alloc_buffer = uninitializedArray!(ubyte[])(max(size, 65536));

which triggers a GC sweep and segfaults.

The stack trace is informative and it looks like we have invalid pointers on the stack.
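
In other words (a generic D sketch, not the BioD code itself): any GC allocation can kick off a collection, and the collection runs the finalizers of whatever it sweeps, so the crash surfaces at an innocent-looking allocation site:

    import std.array : uninitializedArray;
    import std.algorithm.comparison : max;

    ubyte[] allocBuffer(size_t size)
    {
        // The allocation itself is harmless, but if the heap is under
        // pressure the GC may run a full collect right here, sweeping
        // dead objects and calling their destructors. A corrupted
        // object encountered during that sweep makes this line segfault.
        return uninitializedArray!(ubyte[])(max(size, 65536));
    }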

@pjotrp
Member

pjotrp commented May 31, 2019

The debug version of sambamba renders

0x000000000070e814 in _D3std11parallelism8TaskPool16tryDeleteExecuteMFPSQBwQBv12AbstractTaskZv ()
(gdb) bt
#0  0x000000000070e814 in _D3std11parallelism8TaskPool16tryDeleteExecuteMFPSQBwQBv12AbstractTaskZv ()
#1  0x000000000059a2a5 in _D3std11parallelism__T4TaskS_D3bio4core4bgzf5block19decompressBgzfBlockFSQBrQBqQBoQBm9BgzfBlockCQCoQCn5utils7memoize__T5CacheTQCcTSQDxQDwQDuQDs21DecompressedBgzfBlockZQBwZQBpTQDzTQDgZQGf10yieldForceMFNcNdNeZQCz (this=0x2828282828282828)
    at /gnu/store/lfj9sx1c98nj65vw8gmvz31sh3q8qhm6-ldc-1.16.0-beta2/include/d/std/parallelism.d:605
#2  0x0000000000599b0b in _D3std11parallelism__T4TaskS_D3bio4core4bgzf5block19decompressBgzfBlockFSQBrQBqQBoQBm9BgzfBlockCQCoQCn5utils7memoize__T5CacheTQCcTSQDxQDwQDuQDs21DecompressedBgzfBlockZQBwZQBpTQDzTQDgZQGf6__dtorMFNfZv (this=0x2828282828282828)
    at /gnu/store/lfj9sx1c98nj65vw8gmvz31sh3q8qhm6-ldc-1.16.0-beta2/include/d/std/parallelism.d:747
#3  0x000000000074064c in object.TypeInfo_Struct.destroy(void*) const ()
#4  0x000000000074a1f7 in rt.lifetime.finalize_array(void*, ulong, const(TypeInfo_Struct)) ()
#5  0x000000000074b456 in rt.lifetime.finalize_array2(void*, ulong) ()
#6  0x000000000074b74a in rt_finalizeFromGC ()
#7  0x000000000076c798 in _D2gc4impl12conservativeQw3Gcx5sweepMFNbZm ()
#8  0x0000000000767ff0 in _D2gc4impl12conservativeQw3Gcx11fullcollectMFNbbZm ()
#9  0x0000000000769f1c in _D2gc4impl12conservativeQw3Gcx8bigAllocMFNbmKmkxC8TypeInfoZPv ()
#10 0x0000000000765123 in _D2gc4impl12conservativeQw3Gcx5allocMFNbmKmkxC8TypeInfoZPv ()
#11 0x0000000000765059 in _D2gc4impl12conservativeQw14ConservativeGC12mallocNoSyncMFNbmkKmxC8TypeInfoZPv ()
#12 0x0000000000764fa2 in _D2gc4impl12conservativeQw14ConservativeGC__T9runLockedS_DQCeQCeQCcQCnQBs12mallocNoSyncMFNbmkKmxC8TypeInfoZPvS_DQEgQEgQEeQEp10mallocTimelS_DQFiQFiQFgQFr10numMallocslTmTkTmTxQCzZQFcMFNbKmKkKmKxQDsZQDl ()
#13 0x00000000007651e6 in _D2gc4impl12conservativeQw14ConservativeGC6qallocMFNbmkxC8TypeInfoZS4core6memory8BlkInfo_ ()
#14 0x0000000000768aa8 in _DThn16_2gc4impl12conservativeQw14ConservativeGC6qallocMFNbmkxC8TypeInfoZS4core6memory8BlkInfo_ ()
#15 0x000000000073d3a1 in gc_qalloc ()
#16 0x00000000007374ca in _D4core6memory2GC6qallocFNaNbmkxC8TypeInfoZSQBqQBo8BlkInfo_ ()
#17 0x00000000007499d7 in _D2rt8lifetime12__arrayAllocFNaNbmxC8TypeInfoxQlZS4core6memory8BlkInfo_ ()
#18 0x000000000074ab37 in _d_newarrayU ()
#19 0x000000000074abed in _d_newarrayT ()
#20 0x000000000051b07e in _D7contrib6undead6stream14BufferedStream6__ctorMFCQBwQBrQBn6StreammZCQCpQCkQCgQCc (this=0x7ffe8a37f380, source=0x7ffe8a30ce40, bufferSize=134217728)
    at /home/wrk/izip/git/opensource/D/sambamba/BioD/contrib/undead/stream.d:1628
#21 0x000000000054f772 in bio.std.hts.bam.reader.BamReader.getNativeEndianSourceStream() (this=0x7ffe8a312d00)
    at /home/wrk/izip/git/opensource/D/sambamba/BioD/bio/std/hts/bam/reader.d:517
...

where the last line refers to another allocation

    return new BufferedStream(file, _buffer_size);

@pjotrp
Member

pjotrp commented May 31, 2019

This may be what I am looking for

#0  0x000000000070e814 in _D3std11parallelism8TaskPool16tryDeleteExecuteMFPSQBwQBv12AbstractTaskZv ()
#1  0x000000000059a2a5 in _D3std11parallelism__T4TaskS_D3bio4core4bgzf5block19decompressBgzfBlockFSQBrQBqQBoQBm9BgzfBlockCQCoQCn5utils7memoize__T5CacheTQCcTSQDxQDwQDuQDs21DecompressedBgzfBlockZQBwZQBpTQDzTQDgZQGf10yieldForceMFNcNdNeZQCz (this=0x2026261b0d812821)
    at /gnu/store/lfj9sx1c98nj65vw8gmvz31sh3q8qhm6-ldc-1.16.0-beta2/include/d/std/parallelism.d:605
#2  0x0000000000599b0b in _D3std11parallelism__T4TaskS_D3bio4core4bgzf5block19decompressBgzfBlockFSQBrQBqQBoQBm9BgzfBlockCQCoQCn5utils7memoize__T5CacheTQCcTSQDxQDwQDuQDs21DecompressedBgzfBlockZQBwZQBpTQDzTQDgZQGf6__dtorMFNfZv (this=0x2026261b0d812821)
    at /gnu/store/lfj9sx1c98nj65vw8gmvz31sh3q8qhm6-ldc-1.16.0-beta2/include/d/std/parallelism.d:747
#3  0x000000000074064c in object.TypeInfo_Struct.destroy(void*) const ()
#4  0x000000000074a1f7 in rt.lifetime.finalize_array(void*, ulong, const(TypeInfo_Struct)) ()
#5  0x000000000074b456 in rt.lifetime.finalize_array2(void*, ulong) ()
#6  0x000000000074b74a in rt_finalizeFromGC ()

@pjotrp
Member

pjotrp commented Jun 1, 2019

Just now, for the first time, sambamba ran without crashing :). The problem is out-of-order execution.

@pjotrp
Member

pjotrp commented Jun 2, 2019

I think the problem is with scopedTask, where a task is added to the threadpool using the stack rather than the heap. In particular this section

https://github.com/biod/BioD/blob/master/bio/core/bgzf/inputstream.d#L391

where a task gets created and pushed onto a roundbuf. When the garbage collector kicks in after reading the BAM file, it wants to destroy this object, but the object is in an inconsistent state (maybe the thread already got cleaned up, or it tries to clean up twice). I managed to prevent segfaulting by disabling the garbage collector, but obviously that won't do.

The roundbuf is probably used to keep a task connected with the bgzf unpacking buffer. I am not sure why this is necessary. Also, I am not convinced a threadpool is that much of a benefit for bgzf unpacking, as the single-threaded routine I wrote last year is blazingly fast. Need to figure out what the best approach is...
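
To illustrate the class of bug (a minimal, hypothetical sketch, not the actual BioD code): scopedTask places the Task struct on the caller's stack frame, so storing its address anywhere that outlives the frame leaves a dangling pointer for the collector to stumble over later.

    import std.parallelism;

    int work() { return 42; }

    void*[] retained;  // heap store outliving the frame (stands in for the roundbuf)

    void submit()
    {
        auto t = scopedTask(&work);  // the Task lives on *this* stack frame
        taskPool.put(t);
        retained ~= cast(void*)&t;   // the bug: a stack address escapes to the heap
        t.yieldForce();              // fine while the frame is alive...
    }   // ...but after return the retained pointer dangles; when a later GC
        // sweep inspects it, it finds an object in an inconsistent state

The way out is presumably to stop mixing the two lifetimes: either allocate the task on the heap so the GC owns it outright, or keep it entirely out of GC-scanned memory.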

@pjotrp
Member

pjotrp commented Jun 10, 2019

Adding this logging code to the destructor which segfaults:

struct DecompressedBgzfBlock {
    ~this() {
      stderr.writeln("destroy DecompressedBgzfBlock ",start_offset,":",end_offset," ",decompressed_data.sizeof);
    };
    ulong start_offset;
    ulong end_offset;
    ubyte[] decompressed_data;
}

When running, destruction of the decompressed blocks typically reads

destroy DecompressedBgzfBlock 4945:5091 16
destroy DecompressedBgzfBlock 4616:4945 16
destroy DecompressedBgzfBlock 4287:4616 16
destroy DecompressedBgzfBlock 3958:4287 16
destroy DecompressedBgzfBlock 3629:3958 16
destroy DecompressedBgzfBlock 3300:3629 16
destroy DecompressedBgzfBlock 2971:3300 16
destroy DecompressedBgzfBlock 2642:2971 16
destroy DecompressedBgzfBlock 2313:2642 16
destroy DecompressedBgzfBlock 1984:2313 16
destroy DecompressedBgzfBlock 1655:1984 16
destroy DecompressedBgzfBlock 1326:1655 16
destroy DecompressedBgzfBlock 997:1326 16
destroy DecompressedBgzfBlock 668:997 16
destroy DecompressedBgzfBlock 339:668 16
destroy DecompressedBgzfBlock 0:339 16

but before a segfault we get

destroy DecompressedBgzfBlock 3184080310709005360:3467820302580068384 16
destroy DecompressedBgzfBlock 3184080310725782576:3467820302580068384 16
destroy DecompressedBgzfBlock 3467820298285101088:3467820319995076652 16
destroy DecompressedBgzfBlock 3184361785685716016:3539877896617996576 16
destroy DecompressedBgzfBlock 2318280822927401004:3184080310725782576 16
destroy DecompressedBgzfBlock 3539878068416698400:2318281922439028780 16
destroy DecompressedBgzfBlock 3467820302580068640:2318280822927401004 16
destroy DecompressedBgzfBlock 2318280822927401004:3184361785702493232 16
destroy DecompressedBgzfBlock 3467820298285101088:2318280822927401004 16
destroy DecompressedBgzfBlock 3467820298285101088:2318281922439028780 16
destroy DecompressedBgzfBlock 2318281922439028780:2318281922439360048 16
destroy DecompressedBgzfBlock 2318286380681145392:3184080310725782576 16
destroy DecompressedBgzfBlock 3467820298285101344:2318281922439028780 16
destroy DecompressedBgzfBlock 3539877892323029024:2318281922439094316 16
Program exited with code -11

After making sure start_offset is set to 0, it looks like these blocks have become invalid, yet the garbage collector still tries to clean them up.

@pjotrp
Member

pjotrp commented Jun 10, 2019

Artem wrote:

    Create a new task and put it on the roundbuffer using some magic

My conclusion is that that 'magic' no longer works. Creating a task on the stack and moving it to the roundbuffer on the heap confuses the garbage collector, which is kinda unsurprising. With markdup I can't disable the GC cleanup, so it needs some surgery.

@ekg

ekg commented Jun 19, 2019

We are having the same problem. Is the only solution at present to downgrade to 0.6.6?

@pjotrp
Member

pjotrp commented Jun 19, 2019

It is one of those things that takes a few days of work to fix. It is on my list :/

@ekg

ekg commented Jun 20, 2019

It might save some users a bit of time if you made a kind of warning binary release of 0.6.6-stable and dropped it at the top of the releases page.

I know how much of a pain it can be to track down stuff like this. We just had a big problem due to the Spectre/Meltdown patches changing the way that multithreaded interleaved system calls work. I wonder if this problem is related.

@pjotrp
Member

pjotrp commented Jun 20, 2019

Actually, the latest binary release 0.7.0 works:

https://github.com/biod/sambamba/releases/tag/v0.7.0

The problem is with later versions of the D compiler.

@rikrdo89

I got the same error, "Segmentation fault (core dumped)", using sambamba 0.7.0 installed with conda and running on an LSF cluster... :/

@pjotrp
Member

pjotrp commented Sep 16, 2019

Conda builds with a recent ldc. That is the problem at this point.

@pjotrp
Member

pjotrp commented Sep 23, 2019

A heads up.

At this point, build sambamba with an older ldc, like the binary released on GitHub.

We decided to replace the original BAM reader with a new one I wrote almost 2 years ago and have been testing since. Rather than fix the GC-related issue, we are going to use the new bamreader, which is simpler and therefore (hopefully) easier to maintain. @NickRoz1, who worked as a Google Summer of Code student for me on column-based BAMs, is doing that work.

One important difference is that the worker threads are no longer part of the reader itself. My theory is that performance should be similar ;)

@vanottee

vanottee commented Nov 4, 2019

Same segmentation fault issue here; I had installed it on an HPC system with bioconda. Is there a way to know when the conda build has been updated? Alternatively, if I build from the 0.7.0 source code, how do I actually execute a tool like markdup? Thanks!

@pjotrp
Member

pjotrp commented Nov 25, 2019

I am working on fixing this bug. Track progress here https://thebird.nl/blog/work/rotate.html

@pjotrp
Member

pjotrp commented Nov 28, 2019

@sjackman we have a new release of sambamba that fixes a long-standing bug involving the D runtime and should now compile with all versions of ldc. See also https://travis-ci.org/biod/sambamba

@sjackman

Thanks for the heads up, Pjotr. Once you've tagged a new release, would you like to open a PR to bump the version of Sambamba in Brewsci/bio? You only need to change two lines, and you can do it from the GitHub web interface if you like. See https://github.com/brewsci/homebrew-bio/blob/c5b38cfea1eff4b18ae19d9dded00f990769de84/Formula/sambamba.rb#L6-L7

@pjotrp
Member

pjotrp commented Nov 29, 2019

@sjackman it feels dirty to edit through the web interface. But hopefully it works :)

@sjackman

Hehe. Thanks!

@pmagwene

I hate to revive a closed issue, but I'm running into the segmentation fault issue with markdup in 0.8.0 as installed from conda.

My build:

sambamba 0.8.0 by Artem Tarasov and Pjotr Prins (C) 2012-2020
    LDC 1.24.0 / DMD v2.094.1 / LLVM11.0.0 / bootstrap LDC - the LLVM D compiler (1.24.0)

System info:

  • Ubuntu 21.04
  • CPU: i9-9900K CPU @ 3.60GHz

@pjotrp
Member

pjotrp commented Jul 28, 2021

Does this happen reproducibly? Can you share your BAM file in that case? Hard to fix stuff if we don't get reproducible errors.

@pmagwene

pmagwene commented Jul 29, 2021

An interesting twist on this. The file linked below reproducibly generates segmentation faults on my system when running single-threaded (or with whatever the default thread count is; sambamba markdup -h doesn't say), but it completes OK when run with -t <= 10 (the i9-9900K is 8 cores/16 threads).

https://drive.google.com/file/d/1owSZqrrWkrfbsEdf81iLrlByyOmfU4hs/view?usp=sharing

@pjotrp
Member

pjotrp commented Jul 31, 2021

@pmagwene, thanks for the test file. This is a new issue, not related to the earlier segfaults. I cannot reproduce it on my AMD Ryzen 7 3700X 8-Core Processor or on an Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz. You may want to try the static binary I'll release in a bit at https://github.com/biod/sambamba/releases. If that fails, I suggest opening a new issue for the i9-9900K.

@Emma1997WTT

Ah, maybe this is due to a high number of duplicates. I increased the hash table size with --hash-table-size=4194304 and then it runs through without a segfault.

Hello, I solved my segfault with --hash-table-size=4194304! But I don't quite understand the meaning of this parameter; can you explain it to me? Thank you so much.

@NickRoz1
Contributor

NickRoz1 commented Aug 4, 2022 via email

@Emma1997WTT

Hi. This parameter sets the size of a hashmap. https://en.m.wikipedia.org/wiki/Hash_table

    _table = new HReadBlock[1 << table_size_log2];

If you have 10 key-value pairs but the hashmap has only 9 slots, then some key will be mapped to the same cell as another key (a collision). The same will happen if you have 9 non-distinct keys. In the case of sambamba, I think it moves colliding values to a separate list, and then uses temp files to store these values if the list fills up. Maybe this functionality wasn't tested well, or it didn't account for a dataset of your size or with your number of duplicates (// FIXME: constant!).
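
A toy illustration of what an undersized table means in practice (hypothetical names, not sambamba's actual code): with a fixed power-of-two table, anything that collides is pushed to a spill list, so a larger --hash-table-size simply means fewer keys share a slot and less spill handling.

    import std.stdio;

    struct Entry { ulong key; int value; bool used; }

    struct ToyTable {
        Entry[] table;
        Entry[] spill;  // stand-in for the separate list of colliding entries

        this(size_t sizeLog2) {
            table = new Entry[1 << sizeLog2];  // cf. new HReadBlock[1 << table_size_log2]
        }

        void put(ulong key, int value) {
            auto idx = key & (table.length - 1);  // power-of-two size: mask, not modulo
            if (!table[idx].used)
                table[idx] = Entry(key, value, true);
            else
                spill ~= Entry(key, value, true);  // collision: spills over
        }
    }

    void main() {
        auto t = ToyTable(3);           // only 8 slots, small on purpose
        foreach (k; 0 .. 10UL)
            t.put(k * 8, cast(int) k);  // every key lands in slot 0
        writeln(t.spill.length, " entries spilled");  // prints: 9 entries spilled
    }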


Thanks so much for your quick and professional reply, I get it now! Have a nice day!
