This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Improved dictionary invariants #1137

Merged
merged 1 commit into from
Jul 5, 2022

Conversation

jorgecarleitao
Owner

This PR changes the internal invariant of DictionaryArray so that it only contains keys that:

  • can be cast to usize
  • are smaller than the length of the values

This allows removing bounds checks when iterating over the values via the keys.
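
To illustrate why validating the keys once makes unchecked access sound, here is a minimal sketch. It does not use this crate's types: `ToyDictionary`, its fields, and its `iter` method are hypothetical names invented for the example, and only the name `try_new` echoes the constructor listed in this PR. The keys are assumed to be `i32` and non-null.

```rust
use std::convert::TryFrom;

/// Toy dictionary-encoded array (hypothetical, not the crate's `DictionaryArray`):
/// `keys[i]` is an index into `values`.
pub struct ToyDictionary {
    // Fields are private so the invariant cannot be broken after construction.
    keys: Vec<i32>,
    values: Vec<String>,
}

impl ToyDictionary {
    /// Checks both invariants once: every key can be cast to `usize`
    /// and every key is smaller than `values.len()`.
    pub fn try_new(keys: Vec<i32>, values: Vec<String>) -> Result<Self, String> {
        for &key in &keys {
            let index = usize::try_from(key)
                .map_err(|_| format!("key {} cannot be cast to usize", key))?;
            if index >= values.len() {
                return Err(format!(
                    "key {} is out of bounds for {} values",
                    index,
                    values.len()
                ));
            }
        }
        Ok(Self { keys, values })
    }

    /// Iterates over the values selected by the keys without a bounds check per item.
    pub fn iter(&self) -> impl Iterator<Item = &str> + '_ {
        self.keys.iter().map(move |&key| {
            // SAFETY: `try_new` verified that 0 <= key < values.len(), and the
            // private fields are never mutated afterwards.
            unsafe { self.values.get_unchecked(key as usize).as_str() }
        })
    }
}
```

In this sketch the cost of validation is paid once at construction, so every later traversal can skip the per-element check.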

Backward incompatible changes

  • DictionaryArray::from_data was replaced by try_new, try_new_unchecked and try_from_keys
  • DictionaryKey now only implements NativeType + TryFrom<usize> + TryInto<usize>
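
As a rough sketch of what the `TryFrom<usize> + TryInto<usize>` bounds allow, the example below converts generically between keys and indices. `ToyKey`, `key_to_index` and `index_to_key` are hypothetical names for illustration only; the crate's `NativeType` bound is left out (`Copy` stands in for it), so this is not the actual `DictionaryKey` definition.

```rust
use std::convert::{TryFrom, TryInto};

/// Hypothetical stand-in for a dictionary key trait: any type that converts
/// fallibly to and from `usize`.
pub trait ToyKey: Copy + TryFrom<usize> + TryInto<usize> {}

impl ToyKey for u8 {}
impl ToyKey for i32 {}
impl ToyKey for u64 {}

/// Converts a key to an index, returning `None` if the key cannot be cast to
/// `usize` (e.g. it is negative) or is not smaller than `len`.
pub fn key_to_index<K: ToyKey>(key: K, len: usize) -> Option<usize> {
    let index = TryInto::<usize>::try_into(key).ok()?;
    if index < len {
        Some(index)
    } else {
        None
    }
}

/// Converts an index back to a key, returning `None` if it does not fit in `K`.
pub fn index_to_key<K: ToyKey>(index: usize) -> Option<K> {
    K::try_from(index).ok()
}
```

For example, `key_to_index(-1i32, 4)` yields `None` because a negative key cannot be cast to `usize`, and `index_to_key::<u8>(256)` yields `None` because 256 does not fit in a `u8`.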

codecov bot commented Jul 3, 2022

Codecov Report

Merging #1137 (a461a93) into main (b3583b6) will increase coverage by 0.03%.
The diff coverage is 78.15%.

@@            Coverage Diff             @@
##             main    #1137      +/-   ##
==========================================
+ Coverage   83.49%   83.52%   +0.03%     
==========================================
  Files         366      366              
  Lines       35635    35799     +164     
==========================================
+ Hits        29752    29902     +150     
- Misses       5883     5897      +14     
Impacted Files Coverage Δ
src/array/dictionary/iterator.rs 100.00% <ø> (+41.66%) ⬆️
src/array/equal/dictionary.rs 100.00% <ø> (ø)
src/array/equal/mod.rs 80.00% <0.00%> (-3.59%) ⬇️
src/compute/arithmetics/mod.rs 74.10% <ø> (ø)
src/io/parquet/read/statistics/dictionary.rs 48.57% <ø> (ø)
src/compute/cast/dictionary_to.rs 22.41% <7.14%> (-4.40%) ⬇️
src/io/avro/read/nested.rs 64.53% <50.00%> (-0.60%) ⬇️
src/io/parquet/read/deserialize/dictionary.rs 79.06% <65.62%> (-2.69%) ⬇️
src/array/growable/dictionary.rs 77.31% <72.22%> (-0.21%) ⬇️
src/io/json/read/deserialize.rs 72.55% <83.33%> (-0.07%) ⬇️
... and 28 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jorgecarleitao jorgecarleitao marked this pull request as ready for review July 3, 2022 20:24
@jorgecarleitao jorgecarleitao merged commit 78a2a63 into main Jul 5, 2022
@jorgecarleitao jorgecarleitao deleted the dict branch July 5, 2022 15:20