Fix 1.1.0rc4 hang in pre_transform_dataset #269

Merged 2 commits on Mar 20, 2023

Conversation

jonmmease
Collaborator

Closes #268

This fixes, or at least works around, the hang reported in #268. I don't understand the full cause, but copying the input Arrow table through the IPC format mitigates it. We lose zero-copy, but this should still be an improvement over prior versions because the IPC clone happens entirely in Rust. Just as important, this approach lets us compute an accurate hash of the input table (rather than using the Python id of the PyArrow table), which improves caching.

This works around #268 by copying the input PyArrow table through its IPC bytes representation. It also lets us hash the input PyArrow table properly, so the cache works as intended.
Successfully merging this pull request may close these issues.

1.1.0rc4 hangs after repeated calls to pre_transform_datasets