We have a lot of internal logic for hashing inputs, and it might be nice to expose some of it to users (e.g. https://stackoverflow.com/questions/72177022/how-to-get-hash-of-string-column-in-polars-or-pyarrow).
The `HashBatch` method in `key_hash.h` (not quite merged but close) is likely to be the most performant. However, it does make some sacrifices on uniqueness of hashes in the spirit of performance (so we should make sure to document these).

Reporter: Weston Pace / @westonpace
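For context, here is a rough sketch of the kind of per-value hashing a user might write by hand today, which is roughly what the linked question is asking for. It is not Arrow's API and it is not the `HashBatch` algorithm: `hash_column` is a hypothetical helper name, and the choice of BLAKE2b truncated to 64 bits is only an assumption made for illustration. It runs in pure Python, so it would be far slower than a vectorized kernel.

```python
import hashlib
from typing import Iterable, List, Optional


def hash_column(values: Iterable[Optional[str]]) -> List[Optional[int]]:
    """Hypothetical helper: map each string to a 64-bit hash, keeping nulls as None."""
    hashes: List[Optional[int]] = []
    for value in values:
        if value is None:
            # Preserve nulls rather than hashing them to some sentinel value.
            hashes.append(None)
            continue
        # 8-byte digest -> unsigned 64-bit integer. Distinct inputs can still
        # collide; a 64-bit hash does not guarantee uniqueness.
        digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
        hashes.append(int.from_bytes(digest, "little"))
    return hashes


column = ["apple", "banana", None, "apple"]
hashed = hash_column(column)
```

Equal inputs always produce equal hashes, but the reverse is not guaranteed, which is the same kind of uniqueness caveat that would need documenting for any exposed hash function.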
Note: This issue was originally created as ARROW-16513. Please see the migration documentation for further details.