-
Notifications
You must be signed in to change notification settings - Fork 991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fwrite gzip on Solaris error #4099
Comments
The only thing I see so far to follow up on is in the 2nd deflate call : A: by reading the code it looks like 8391193 is returned by deflateBound when passed 8MB. Maybe it knows that no matter what the contents of the 8MB, it can always reduce a little bit (2,585 bytes). I'm not sure why we're using deflateBound for the 2nd part, because our buffers are fixed. The idea of deflateBound I thought was to provide an upper bound for the result of the compression, so that just that amount can be allocated. Important for the header, because for the very small header, the compressed output is most often larger. But not sure why for the data chunks. Running locally I get the same output, absolutely no difference, including that 8391193. So that was barking up the wrong tree.
|
|
I asked on S.O. here : I fear that will get closed for being too vague. Just a quick review of the code or some pointers from @madler would be awesome! He's by far the top user in that tag too, so fingers crossed! |
Your arithmetic needs a little work. 8,391,193 is greater than 8,388,608. |
Thanks, @madler. Arithmetic was never my strong point :-) Just an error here in comments. I rechecked our code in fwrite.c and it does allocate that 8,391,193 size (slightly larger than 8MB) and looks ok, afaics. |
Now extra tracing is merged, will revisit after next release to CRAN. See comments at the top of #4107. Hence bump to 1.12.11 milestone. |
Latest Solaris output now showing on CRAN with the extra tracing to tackle.
|
deflate returning -2 indicates that the z_stream structure is invalid. This is usually due to either not initializing it in the first place, pointing to where there isn't a z_stream structure, or the z_stream structure getting trampled on in memory somehow. Since this involves multiple threads, I'm guessing the last of those. I can't tell much else from that output. |
I ran v1.13.2 on the excellent R-hub package builder, https://builder.r-hub.io/. Both of the following :
But surprisingly, and frustratingly, both pass fine. CRAN's Solaris could be different to R-hub's Solaris in some way. It could also be a fault that could happen on Windows or Linux too, but is just showing up on CRAN's Solaris. It could be 32bit related, but CI tests 32bit Windows regularly and we haven't seen a problem there. We've already established that the library and headers match on zlib 1.2.11 which is the latest version. Looking at the source code of zlib 1.2.11, there are some
We do request GZIP in We should be able to rule out multi-threaded because the data size is small and we see from the log that only one thread is being used by design which is correct. Not ruling anything out though. I've traced the internal zlib structure locally and will extend the output further. The output in the log already contains the full
|
v1.13.4 is on CRAN today and we have the new output.
|
Ok so the z_stream for data (1): sizeof(z_stream)==56: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 68 55 d7 0f 10 0a 95 fe 40 0a 95 fe 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 state: 90 99 02 08 39 00 00 00 strm==8029990 state->strm==8029990 state->status==57 zalloc==fe950a10 zfree==fe950a40 (s->strm==strm)==1 s->next_out==0 s->avail_in=0 s->next_in=0 deflates()'s checks (excluding status) would return -2 here z_stream for data (2): sizeof(z_stream)==56: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 68 55 d7 0f 10 0a 95 fe 40 0a 95 fe 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 state: 90 99 02 08 39 00 00 00 strm==8029760 state->strm==8029990 state->status==57 zalloc==fe950a10 zfree==fe950a40 (s->strm==strm)==0 s->next_out==0 s->avail_in=0 s->next_in=0 deflates()'s checks (excluding status) would return -2 here The first |
_OPENMP == 201307 so that corresponds to OpenMP 4.0. Should be fine. On first glance it seems the address of However, |
There are no zlib calls between these two trace points: lines 877 and 909, so it's not zlib. An extra trace output on line 882 would confirm that. Let's do one thing at a time, and add that trace point and resubmit. We have to release again anyway to get out of the way of the breaking change to the breaking change to all.equal in recent days in R-devel. |
1.13.6 was accepted on CRAN today and we're now passing on its Solaris. This confirms that #4845 did fix it. |
awesome news! thanks for all that effort debugging! |
NxtIRFcore is failing tests giving z_stream_error on inflate Perhaps this is similar to this problem: Rdatatable/data.table#4099 Try creating z_stream vector on stack, outside OpenMP loop
NxtIRFcore is failing tests giving z_stream_error on inflate Perhaps this is similar to this problem: Rdatatable/data.table#4099 Try creating z_stream vector on stack, outside OpenMP loop Former-commit-id: 2226ecf
Now that 1.12.8 is on CRAN, we have the output from the extra tracing in #4028.
https://www.r-project.org/nosvn/R.check/r-patched-solaris-x86/data.table-00check.html
The text was updated successfully, but these errors were encountered: