Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0 Initalized packed array buffer #7

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

guinn8
Copy link

@guinn8 guinn8 commented Jan 21, 2022

I have 0 initialized the PackedArray buffer. I was valrgrind'ing around in my application and was getting Conditional jump or move depends on uninitialised value(s) when using a 1- bit packed array buffer and was getting incorrect results. This fix resolves the issue. I reran the tests and everything is passing. This is a fix for #6

Bug can be reproduced in this commit with this input aliquot/_pom_yang_rewrite/ make && valgrind ./main --bound=$((10**6)) --seg_len=$((5 * $((10**3)))) --writebuf_len=1000000 --preimage_count_bits=1 --num_threads=12

@gpakosz
Copy link
Owner

gpakosz commented Jan 21, 2022

Hello @guinn8 👋

I likely won't accept this PR.
At that point, it seems that all it does is working around a problem in your own code.

If you really want to 0 initialize the buffer, imho a proper way is to redefined PACKEDARRAY_MALLOC to use calloc() instead of malloc().

As far as Valgrind goes, I get no issue when running tests

$ make -C _gnu-make/
make: Entering directory '/data/Projects/PackedArray/_gnu-make'
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySelfTest -DPACKEDARRAY_SELF_TEST -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArray.c
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySelfBench -DPACKEDARRAY_SELF_BENCH -DNDEBUG -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArray.c
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySIMDSelfTest -DPACKEDARRAY_SELF_TEST -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArraySIMD.c
mkdir -p /data/Projects/PackedArray/bin/linux-x86_64
cc -o /data/Projects/PackedArray/bin/linux-x86_64/PackedArraySIMDSelfBench -DPACKEDARRAY_SELF_BENCH -DNDEBUG -std=c99 -pedantic -O2 -g /data/Projects/PackedArray/PackedArraySIMD.c
make: Leaving directory '/data/Projects/PackedArray/_gnu-make'
$ valgrind ./bin/linux-x86_64/PackedArraySelfTest
==241616== Memcheck, a memory error detector
==241616== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==241616== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==241616== Command: ./bin/linux-x86_64/PackedArraySelfTest
==241616==
-- PackedArray self test -------------------------------------------------------

sizeof(PackedArray) = 16

1 by 1 packing / unpacking:
   1 bits per item -- success.
   2 bits per item -- success.
   3 bits per item -- success.
   4 bits per item -- success.
   5 bits per item -- success.
   6 bits per item -- success.
   7 bits per item -- success.
   8 bits per item -- success.
   9 bits per item -- success.
  10 bits per item -- success.
  11 bits per item -- success.
  12 bits per item -- success.
  13 bits per item -- success.
  14 bits per item -- success.
  15 bits per item -- success.
  16 bits per item -- success.
  17 bits per item -- success.
  18 bits per item -- success.
  19 bits per item -- success.
  20 bits per item -- success.
  21 bits per item -- success.
  22 bits per item -- success.
  23 bits per item -- success.
  24 bits per item -- success.
  25 bits per item -- success.
  26 bits per item -- success.
  27 bits per item -- success.
  28 bits per item -- success.
  29 bits per item -- success.
  30 bits per item -- success.
  31 bits per item -- success.
  32 bits per item -- success.

bulk packing / unpacking:
   1 bits per item -- success.
   2 bits per item -- success.
   3 bits per item -- success.
   4 bits per item -- success.
   5 bits per item -- success.
   6 bits per item -- success.
   7 bits per item -- success.
   8 bits per item -- success.
   9 bits per item -- success.
  10 bits per item -- success.
  11 bits per item -- success.
  12 bits per item -- success.
  13 bits per item -- success.
  14 bits per item -- success.
  15 bits per item -- success.
  16 bits per item -- success.
  17 bits per item -- success.
  18 bits per item -- success.
  19 bits per item -- success.
  20 bits per item -- success.
  21 bits per item -- success.
  22 bits per item -- success.
  23 bits per item -- success.
  24 bits per item -- success.
  25 bits per item -- success.
  26 bits per item -- success.
  27 bits per item -- success.
  28 bits per item -- success.
  29 bits per item -- success.
  30 bits per item -- success.
  31 bits per item -- success.
  32 bits per item -- success.
==241616==
==241616== HEAP SUMMARY:
==241616==     in use at exit: 0 bytes in 0 blocks
==241616==   total heap usage: 81,921 allocs, 81,921 frees, 73,788,928 bytes allocated
==241616==
==241616== All heap blocks were freed -- no leaks are possible
==241616==
==241616== For lists of detected and suppressed errors, rerun with: -s
==241616== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)  

@guinn8
Copy link
Author

guinn8 commented Jan 21, 2022

Fair enough to no accept the PR as is. Redefining PACKEDARRAY_MALLOC to use calloc is not entirely straight forward as you would need to handle the additional argument, that's why I went for memset. I suppose the issue is that I assumed PackedArray would be zero initialized as a complex data structure. Would you accept a PR that makes the readme and other docs very clear that memory is left uninitialized?

@guinn8
Copy link
Author

guinn8 commented Jan 21, 2022

You are correct about it being a flaw in my application, my code assumed zero initialization. Looked like a bug from my perspective as the error was only observed on some inputs. Super cool library, reduced the memory consumption of my application by a factor of 8 (from 480gb to 60gb)!

@gpakosz
Copy link
Owner

gpakosz commented Jan 21, 2022

Redefining PACKEDARRAY_MALLOC to use calloc is not entirely straight forward as you would need to handle the additional argument, that's why I went for memset

You would do

#define PACKEDARRAY_MALLOC(size) calloc(1, size)

@gpakosz
Copy link
Owner

gpakosz commented Jan 21, 2022

Also the benefit of calloc() is that can leverages the kernel filling pages with 0 depending on the OS. Which means it has the potentiel of being faster than malloc() then memset()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants