This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Improved performance of checking offsets ~-64-73% #1305

Merged: 2 commits, Nov 29, 2022


Collaborator

@ritchie46 ritchie46 commented Nov 28, 2022

This PR ensures that check_offsets is auto-vectorized. A break (or any other early exit) inside a tight loop often prevents auto-vectorization, so the hot loop now only updates a flag variable, and we decide whether to return an error once all the data has been processed.
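
As a rough illustration of the idea (a minimal sketch, not the code in this PR), the two loop shapes can be compared side by side. The function names, the `i64` offset type, and the `String` error type are all assumptions made for the example; the real function in the crate may differ in signature and in the exact checks it performs.

```rust
// Hypothetical sketch: names, types and error messages are illustrative only.

/// Early-exit shape: the `return` inside the loop is the kind of control flow
/// that often stops the compiler from vectorizing the comparison.
fn is_monotonic_early_exit(offsets: &[i64]) -> bool {
    for window in offsets.windows(2) {
        if window[0] > window[1] {
            return false;
        }
    }
    true
}

/// Flag-accumulating shape: every pair is compared, the result is folded into
/// a single boolean, and the decision is made once after the loop.
fn is_monotonic_branch_free(offsets: &[i64]) -> bool {
    let mut any_decreasing = false;
    for window in offsets.windows(2) {
        // `|=` does not short-circuit, so the loop body has no early exit.
        any_decreasing |= window[0] > window[1];
    }
    !any_decreasing
}

/// Wrapper that turns the flag into an error after the data has been scanned.
/// Rejecting empty input mirrors the "err on empty offsets" commit message;
/// the messages themselves are made up.
fn check_offsets_sketch(offsets: &[i64]) -> Result<(), String> {
    if offsets.is_empty() {
        return Err("offsets must have at least one element".to_string());
    }
    if is_monotonic_branch_free(offsets) {
        Ok(())
    } else {
        Err("offsets must be monotonically increasing".to_string())
    }
}
```

The trade-off is that the flag-accumulating version always scans the whole slice, even when the first pair is already invalid. That is usually acceptable for a validation routine, where invalid input is the rare case. Whether the loop is actually vectorized depends on the compiler version and target, so it is worth confirming with a benchmark or by inspecting the generated assembly.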

Gnuplot not found, using plotters backend
check_offsets^10        time:   [212.92 ns 213.20 ns 213.53 ns]                             
                        change: [-73.878% -73.296% -72.720%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

check_offsets^12        time:   [857.28 ns 858.50 ns 859.97 ns]                              
                        change: [-66.556% -66.487% -66.419%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

check_offsets^14        time:   [3.2819 µs 3.2895 µs 3.2976 µs]                              
                        change: [-69.058% -68.997% -68.927%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  5 (5.00%) high mild
  1 (1.00%) high severe

check_offsets^16        time:   [13.689 µs 13.701 µs 13.715 µs]                              
                        change: [-64.854% -64.650% -64.446%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  4 (4.00%) high severe

check_offsets^18        time:   [59.150 µs 59.216 µs 59.300 µs]                             
                        change: [-61.642% -61.554% -61.465%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) low mild
  1 (1.00%) high severe

check_offsets^20        time:   [238.54 µs 239.10 µs 239.75 µs]                             
                        change: [-64.237% -64.127% -64.026%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  2 (2.00%) high severe


@ritchie46 ritchie46 changed the title from "perf: auto-vectorize check_offsets" to "perf: auto-vectorize check_offsets ~-64-73%" Nov 28, 2022

codecov bot commented Nov 28, 2022

Codecov Report

Base: 83.14% // Head: 83.14% // Decreases project coverage by -0.00% ⚠️

Coverage data is based on head (2dc1f3f) compared to base (368aacc).
Patch coverage: 95.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1305      +/-   ##
==========================================
- Coverage   83.14%   83.14%   -0.01%     
==========================================
  Files         369      369              
  Lines       40234    40245      +11     
==========================================
+ Hits        33454    33461       +7     
- Misses       6780     6784       +4     
| Impacted Files | Coverage Δ |
|---|---|
| src/array/specification.rs | 92.23% <95.00%> (-0.16%) ⬇️ |
| src/array/utf8/mod.rs | 84.75% <0.00%> (-0.92%) ⬇️ |
| src/bitmap/immutable.rs | 86.06% <0.00%> (-0.61%) ⬇️ |
| src/io/ipc/read/schema.rs | 95.58% <0.00%> (-0.30%) ⬇️ |
| src/io/ipc/read/file.rs | 97.32% <0.00%> (+0.44%) ⬆️ |
| src/bitmap/utils/slice_iterator.rs | 98.78% <0.00%> (+1.21%) ⬆️ |


@jorgecarleitao jorgecarleitao merged commit 6d2eccd into jorgecarleitao:main Nov 29, 2022
@jorgecarleitao jorgecarleitao changed the title from "perf: auto-vectorize check_offsets ~-64-73%" to "Improved performance of checking offsets ~-64-73%" Nov 29, 2022
@jorgecarleitao jorgecarleitao added the enhancement label (An improvement to an existing feature) Nov 29, 2022
@ritchie46 ritchie46 deleted the auto_vec_offset branch November 29, 2022 08:17
ritchie46 added a commit to ritchie46/arrow2 that referenced this pull request Nov 29, 2022
)

* perf: auto-vectorized try_check_offsets

* err on empty offsets
ritchie46 added a commit to ritchie46/arrow2 that referenced this pull request Mar 29, 2023
)

* perf: auto-vectorized try_check_offsets

* err on empty offsets
ritchie46 added a commit to ritchie46/arrow2 that referenced this pull request Apr 5, 2023
)

* perf: auto-vectorized try_check_offsets

* err on empty offsets