Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZDICT_trainFromBuffer_cover is not thread safe #4045

Closed
as-com opened this issue May 14, 2024 · 18 comments
Closed

ZDICT_trainFromBuffer_cover is not thread safe #4045

as-com opened this issue May 14, 2024 · 18 comments
Assignees

Comments

@as-com
Copy link

as-com commented May 14, 2024

Describe the bug

ZDICT_trainFromBuffer_cover is not thread safe due to the use of global state:

/* We need a global context for qsort... */
static COVER_ctx_t *g_coverCtx = NULL;

Calling this function from multiple threads may result in segfault/access violation, use-after-free and other badness.

To Reproduce
Call ZDICT_trainFromBuffer_cover from multiple threads

Desktop (please complete the following information):

  • OS: Windows
  • Version 11 23H2
  • Compiler: MSVC, Visual Studio 2022
  • Flags: default
  • Other relevant hardware specs: N/A
  • Build system: whatever zstd-sys/cargo uses
@Adenilson
Copy link

From what I could understand from the code, the issue derives from having to use a global context to pass extra information (i.e. a context) while performing the call to qsort().

Perhaps the reentrant qsort_r() could be a good fit for this problem?

From the documentation (https://man7.org/linux/man-pages/man3/qsort_r.3.html), quote: "The qsort_r() function is identical to qsort() except that the comparison function compar takes a third argument. A pointer is passed to the comparison function via arg. In this way, the comparison function does not need to use global variables to pass through arbitrary arguments, and is therefore reentrant and safe to use in threads.".

It is available with small variations (i.e. qsort_s()) for both Windows (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170) and BSD (https://man.freebsd.org/cgi/man.cgi?query=qsort_r&sektion=3), therefore should be portable.

I would like to help on this one if it is available.
:-)

@adenilsoncavalcanti
Copy link

An update: I have a first draft patch, it seems to work fine on OSX@clang and Linux@gcc.

I will need a bit of time to get a Windows machine (and apparently a BSD machine too...), so I hope this is not a high priority bug.

I got a few questions, please see below:

@as-com: as you mentioned zstd-sys/cargo, I would assume that you are using zstd from Rust? Do you think it is possible to share a simple test case exposing the bug?

@Cyan4973: concerning testing... what would be the recommended test suite (and how to build/run/etc) for changes in the cover.c file?

So far, I only did my local changes and run 'make test' in the root folder for both OSX and Linux and after a while could only see PASS/Success/etc but I guess there is a more focused test suite for the code in the dictionary folder.

@Cyan4973
Copy link
Contributor

I guess there is a more focused test suite for the code in the dictionary folder.

There isn't actually.

That's the reason why there is no report of any issue using current CI test suite.

So one test must be added to observe the problem, and then see it fixed.

While it's not very hard, it's not completely trivial either :
one would need a Multi-threaded application, that is calling the Training function from each one of its Thread. 2 threads are likely enough to witness the problem. tsan could also be added, but its major downside, on top of being slow, is its lack of availability beyond Linux, so it's preferable to not depend on it.

@Adenilson
Copy link

Sounds good. I will hack something to test the fix.

@Adenilson
Copy link

An update: I wrote a simple dictionary builder example calling ZDICT_trainFromBuffer_cover with 160 samples and each sample with 100KB in size.

For a regular build (i.e. make -j8), it crashes reliably when I use 32 threads, not so reliably with 24 threads and rarely with 16 threads.

The backtrace looks like:
zdict_train_crash

@Adenilson
Copy link

Now when I apply the qsort_r() fix, it seems to run fine but it takes a little while to run (i.e. around 20s in a Intel 11th gen i7 processor).

A link to the output of the test case:
https://gist.github.com/Adenilson/4668b853b89a4eed4ff551c3885a814a

@Adenilson
Copy link

Adenilson commented Jun 17, 2024

The current fix works fine on OSX & Linux. I got validate it for Windows and since I don't have a Win box, it will take a little bit to get one and setup compilers/devel tools/etc.

@Cyan4973
Copy link
Contributor

Thanks @Adenilson ! This is trending well !

Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jun 19, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jun 19, 2024
A test case to demonstrate the crashes reported on facebook#4045.

Not for commit, builds only on POSIX systems.
@Adenilson
Copy link

@Cyan4973 the above branches have a rough draft of how the fix and the PoB (Proof of Bug) look like.

They are not ready for landing yet, given that I need to validate the approach on BSD (I assume that it is an officially supported OS? Or not? Is it part of the CI?) and it is unclear to me ATM how the PoB should be integrated (or not) e.g. as a unit test? Do test runners time out or have resource restraints (i.e. I'm using 48 threads in the PoB)?

Also I noticed that the MS Visual Studio files are for VS 2010. I got VS 2022 installed in an old Windows laptop and was forced to 're-target' the build files to VS 2022 to be able to build zstd.

Is that expected? Or is zstd build in another way for MS Windows (e.g. cmake)?

Finally, it seems that OSX Build on vanilla 'dev' branch (hash 3242ac5 from Friday Jun 14th) is now broken, I guess I got fix that too:
CC obj/generic_noconf/zstdcli_trace.o
compiling single-threaded static library 1.5.6
==> building with threading support
==> building zstd with .gz compression support
==> no liblzma, building zstd without .xz/.lzma support
==> no liblz4, building zstd without .lz4 support
LINK obj/generic_noconf/zstd
zstd build completed
compiling multi-threaded dynamic library 1.5.6
ld: unknown options: -soname=libzstd.so.1
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [obj/generic_noconf/dynamic/libzstd.so.1.5.6] Error 1
make: *** [lib-release] Error 2
make -j8 37.82s user 3.03s system 621% cpu 6.569 total

@Cyan4973
Copy link
Contributor

Cyan4973 commented Jun 19, 2024

I assume that it is an officially supported OS? Or not? Is it part of the CI?

Yes it is.
Here is the last test: https://github.com/facebook/zstd/runs/26246973990

Do test runners time out or have resource restraints (i.e. I'm using 48 threads in the PoB)?

Yes, there's a hard limit at 1 hour, but in this project, we aim at a ~20mn max for any particular test, and ~5mn preferably.
Multithreading is available, though it's unlikely to ramp up to 48 hardware threads.
More problematic can be the RAM usage. I don't know the exact limit, but we noticed in the past that, in order to stay in the "safe zone", RAM usage should remain < 1 GB, which can be a severe constraint.

Is that expected? Or is zstd build in another way for MS Windows (e.g. cmake)?

Yes. Basically, VS2010 project file can be updated to any modern version, so we have used that since the beginning of the project (~2015).
We could bump that to VS2013 if that matters, since we have stopped supporting VS2010 and VS2012.

That being said, an even better future would be to no longer manually maintain any Solution file, and instead get them generated automatically, typically from cmake as part of make all or make vs_solution. We haven't (yet) invested the time to get it done.

Finally, it seems that OSX Build on vanilla 'dev' branch

That's a concern. And it's surprising since macos is part of CI, so such a build error should have been detected.
edit : indeed, here is the last test, and it seems successful: https://github.com/facebook/zstd/actions/runs/9352935877/job/25742255656

Such a bug should be fixed a.s.a.p, though not necessarily by you, nor bundled into another PR with a different objective.

edit : the pb seems bisected to #4067 , but here too, the macos test was run and was successful...
https://github.com/facebook/zstd/actions/runs/9388161997/job/25852791513

edit 2: #4076

@Cyan4973
Copy link
Contributor

Oh, btw,
are you familiar with tsan, thread sanitizer ?
Presuming you know how to use it (likely limited to Linux only), would it help make the test fail more reliably with less threads (2 being ideal) ?

Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jun 19, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
@Adenilson
Copy link

@Cyan4973 I just had dinner and was going to open a detailed bug report for the OSX build breakage and just noticed that you already fixed the build. Thanks a lot!
:-)

I will experiment with tsan and report back the results (also got get a *BSD environment to validate the qsort_r()/s() fix).

Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jun 19, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
@Adenilson
Copy link

An update: I got the fix to build for FreeBSD. Since I don't have a spare system to install FreeBSD, I just used a VM under VirtualBox and it worked reasonably well.

Fortunately it seems that *BSD systems comply with the GNU API for qsort_r(), which simplifies the things (i.e. the same code is used for Linux and *BSD).

I consider now the fix properly ready for review, since it was tested on 4 OSes (Linux, OSX, Windows, FreeBSD) and I will submit a merge request.

What is missing is the PoB (Proof of Bug) integration. Is the later a hard requirement for the first?

@Adenilson
Copy link

One last thing that occurred to me is to verify if it builds fine for Android (IIRC the NDK uses clang).

@Adenilson
Copy link

The test matrix was: clang@OSX, gcc@Linux, gcc@FreeBSD, msvc@Windows.

@Cyan4973
Copy link
Contributor

What is missing is the PoB (Proof of Bug) integration. Is the later a hard requirement for the first?

It's preferable, but if it's too hard to achieve within the constraints of CI, this could be left.

A desirable intermediate solution could be to provide the test program or unit without making it a CI test, shipping with it a documentation to reproduce the test locally.

@Adenilson
Copy link

Perhaps the PoB could be rewritten in a way to teach how to use the dictionary building API, thus addressing the other request on #4066.

I still got investigate TSAN, it may be the case that it is able to simply the PoB quite a bit.

I just noticed that the CI showed that a bot (MINGW64) failed to compile.

That is a combination I haven't tested (gcc@Windows), but I think probably the fix should be pretty simple (i.e. test if the compiler used is GCC and then define _GNU_SOURCE).

So not ready for landing yet, still needs some more work on it.
:-)

Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 1, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed for newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
Adenilson pushed a commit to Adenilson/zstd that referenced this issue Jul 2, 2024
…trant

The two main functions used for dictionary training using the COVER
algorithm require initialization of a COVER_ctx_t where a call
to qsort() is performed.

The issue is that the standard C99 qsort() function doesn't offer
a way to pass an extra parameter for the comparison function callback
(e.g. a pointer to a context) and currently zstd relies on a *global*
static variable to hold a pointer to a context needed to perform
the sort operation.

If a zstd library user invokes either ZDICT_trainFromBuffer_cover or
ZDICT_optimizeTrainFromBuffer_cover from multiple threads, the
global context may be overwritten before/during the call/execution to qsort()
in the initialization of the COVER_ctx_t, thus yielding to crashes
and other bad things (Tm) as reported on issue facebook#4045.

Enters qsort_r(): it was designed to address precisely this situation,
to quote from the documention [1]: "the comparison function does not need to
use global variables to pass through arbitrary arguments, and is therefore
reentrant and safe to use in threads."

It is available with small variations for multiple OSes (GNU, BSD[2],
Windows[3]), and the ISO C11 [4] standard features on annex B-21 qsort_s() as
part of the <stdlib.h>. Let's hope that compilers eventually catch up
with it.

For now, we have to handle the small variations in function parameters
for each platform.

The current fix solves the problem by allowing each executing thread
pass its own COVER_ctx_t instance to qsort_r(), removing the use of
a global pointer and allowing the code to be reentrant.

Unfortunately for *BSD, we cannot leverage qsort_r() given that its API
has changed on newer versions of FreeBSD (14.0) and the other BSD variants
(e.g. NetBSD, OpenBSD) don't implement it.

For such cases we provide a fallback that will work only requiring support
for compilers implementing support for C90.

[1] https://man7.org/linux/man-pages/man3/qsort_r.3.html
[2] https://man.freebsd.org/cgi/man.cgi?query=qsort_r
[3] https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/qsort-s?view=msvc-170
[4] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
@Cyan4973
Copy link
Contributor

Fixed in #4086

@Cyan4973 Cyan4973 self-assigned this Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants