Error/Segfault when writing many partitions #396
Labels: bug
quinnj added a commit that referenced this issue on May 25, 2023:

Fixes #396. As noted in the originally reported issue, enabling debug logging when writing Arrow data with compression can result in segfaults because the underlying CodecX packages have debug logs, causing task switches/migration and thus making the pattern of using a single `X_COMPRESSOR` array indexed by `Threads.threadid()` unsafe, since multiple threads may try to use the same compressor at the same time. We fix this by wrapping each compressor in a `Lockable` and ensuring the `compress` (or `uncompress`) operation holds the lock for the duration of the operation. We also:

* Add a decompressor per thread to avoid recreating them over and over during reading
* Lazily initialize compressors/decompressors in a way that is 1.9+ safe and only creates the object when needed by a specific thread
* Switch from WorkerUtilities -> ConcurrentUtilities (the package was renamed)

Co-authored-by: J S <49557684+svilupp@users.noreply.github.com>
quinnj added a commit that referenced this issue on May 30, 2023:

Fixes #396. As noted in the originally reported issue, enabling debug logging when writing Arrow data with compression can result in segfaults because the underlying CodecX packages have debug logs, causing task switches/migration and thus making the pattern of using a single `X_COMPRESSOR` array indexed by `Threads.threadid()` unsafe, since multiple threads may try to use the same compressor at the same time. We fix this by wrapping each compressor in a `Lockable` and ensuring the `compress` (or `uncompress`) operation holds the lock for the duration of the operation. We also:

* Add a decompressor per thread to avoid recreating them over and over during reading
* Lazily initialize compressors/decompressors in a way that is 1.9+ safe and only creates the object when needed by a specific thread
* Switch from WorkerUtilities -> ConcurrentUtilities (the package was renamed)

Successor to #397; I've added @svilupp as a co-author here since they started the original movement for the code to go in this direction.

Co-authored-by: J S <49557684+svilupp@users.noreply.github.com>
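A rough sketch of the lock-per-compressor pattern described in these commit messages is shown below. It is an illustration, not Arrow.jl's actual internals: the names `ZSTD_COMPRESSORS` and `compress_bytes` are invented, and the real code initializes compressors lazily per thread rather than eagerly as done here.

```julia
using ConcurrentUtilities: Lockable
using CodecZstd: ZstdCompressor
using TranscodingStreams: TranscodingStreams

# One lock-protected compressor per thread slot (the real code creates these
# lazily; eager creation keeps the sketch short).
const ZSTD_COMPRESSORS = map(1:Threads.nthreads()) do _
    c = ZstdCompressor()
    TranscodingStreams.initialize(c)  # allocate the native zstd context once
    Lockable(c)
end

function compress_bytes(bytes::Vector{UInt8})
    lockable = ZSTD_COMPRESSORS[Threads.threadid()]
    # Hold the lock for the entire transcode call: even if this task yields
    # and another task lands on the same threadid slot, the underlying
    # compressor is never used by two tasks at once.
    return lock(lockable) do compressor
        transcode(compressor, bytes)
    end
end
```

Holding the lock across the whole `transcode` call is what makes a task switch during a `@debug` statement harmless: a second task that ends up on the same slot simply blocks until the first one finishes.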
When writing a partitioned table with compression, the write either throws an error or segfaults.
This affects Julia 1.8.5 on both ARM and Linux when threading is enabled. Possible root causes are suggested below.
How to reproduce:
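The reporter's original script is not reproduced above; the following is a minimal sketch of the described trigger (partitioned input, compression, debug logging, and a multi-threaded session, e.g. `julia -t 4`). The file path and table contents are arbitrary.

```julia
using Arrow, Tables, Logging

# Many small partitions so Arrow.write spreads the work across threads.
parts = Tables.partitioner([(a = rand(1_000), b = rand(1:100, 1_000)) for _ in 1:100])

# Debug-level logging introduces yield points inside the codec packages,
# which is what exposes the race.
with_logger(ConsoleLogger(stderr, Logging.Debug)) do
    Arrow.write("partitions.arrow", parts; compress = :zstd)
end
```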
If you don't use compression, use only one thread, don't use logging, or don't write partitioned data (which is what enables the threaded workload), everything is fine.
On Linux, it results in the following errors:
On ARM (Apple M1), it either gives the same errors or sometimes segfaults:
Suspected root cause: the compressors are kept in a single array indexed by `Threads.threadid()`. The `:dynamic` thread scheduler ships as the default, so if any part of the code yields (for example when using logging, as shown above), different tasks might try to tap into the same compressor.
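To see why indexing by `Threads.threadid()` is fragile, the snippet below (an illustration, not Arrow.jl code) counts how many `@spawn`ed tasks resume on a different thread after a yield point; with more than one thread, some typically do, so two tasks can end up sharing the same compressor slot.

```julia
using Base.Threads: @spawn, threadid

function spawn_probe()
    # record the thread id before and after a yield point
    return @spawn begin
        before = threadid()
        yield()                 # any yield point: @debug logging, I/O, ...
        (before, threadid())
    end
end

tasks = [spawn_probe() for _ in 1:1_000]
migrated = count(t -> (ids = fetch(t); ids[1] != ids[2]), tasks)
println("tasks that resumed on a different thread: ", migrated)
```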
Versioninfo: