Failure to read compressed empty table from java implementation #437

Closed
DrChainsaw opened this issue May 17, 2023 · 3 comments · Fixed by #448

Comments

@DrChainsaw
Contributor

The problem is that the Java implementation sets the length field in RecordBatch to 8 bytes in this case. As a consequence, this check does nothing, so decompression continues, finds len=0 when reading the first 8 bytes from the pointer, and passes a zero-length array to transcode, which then seems to hang indefinitely.

Applying the same kind of check and early return to len as is already done for buffer.length resolves the issue for me.
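To illustrate the proposed check, here is a minimal Python sketch, not the Julia code from the package. It assumes that each compressed buffer starts with an 8-byte little-endian signed integer holding the uncompressed length, which is how I read the Arrow IPC format. The names `decode_buffer` and `encode_buffer` are hypothetical, and `zlib` is only a standard-library stand-in for the real codecs (LZ4 or ZSTD).

```python
import struct
import zlib

PREFIX_SIZE = 8  # assumed: 8-byte little-endian int64 uncompressed length


def encode_buffer(data: bytes) -> bytes:
    """Hypothetical writer: length prefix followed by the compressed body."""
    return struct.pack("<q", len(data)) + zlib.compress(data)


def decode_buffer(raw: bytes) -> bytes:
    """Hypothetical reader with both early returns discussed in this issue."""
    # Existing check: the writer recorded a buffer length of 0
    # (what pyarrow and the Julia writer do for an empty table).
    if len(raw) == 0:
        return b""

    (uncompressed_len,) = struct.unpack_from("<q", raw, 0)

    # Proposed extra check: the buffer is 8 bytes long, but the prefix says
    # the payload is empty (what the Java writer does). Return before the
    # codec is ever handed a zero-length input.
    if uncompressed_len == 0:
        return b""

    body = raw[PREFIX_SIZE:]
    if uncompressed_len == -1:
        # Assumed convention: -1 marks a body that was stored uncompressed.
        return body

    out = zlib.decompress(body)
    if len(out) != uncompressed_len:
        raise ValueError("uncompressed length does not match the prefix")
    return out
```

With this shape, an 8-byte buffer containing only a zero prefix decodes to an empty result in the same way as a zero-length buffer, so both writer conventions are handled before decompression starts.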

It might very well be the Java implementation that is doing something wrong here, as I couldn't find any reference in the format specification on how to describe lengths when using compression. I have opened an issue there as well. File generated by the Java code in that issue: randon_access_to_file.zip.

Both pyarrow and the Julia implementation set it to 0 when writing an empty table to disk.

It does make some sense to set buffer.length to 8 and let the len field carry the information, even though those 8 bytes could have been saved. Both the Java implementation and pyarrow can read the attached file.

@Moelf
Contributor

Moelf commented May 17, 2023

Is this a duplicate of apache/arrow#435?

@DrChainsaw
Contributor Author

No, just happens to touch the same function.

@quinnj
Member

quinnj commented Jun 3, 2023

PR to fix: #448

quinnj added a commit that referenced this issue Jun 5, 2023
Fixes #437. Thanks to @DrChainsaw for the investigation, proposed fix,
and test file.