Improve parquet reading performance ~35-40% #3821

Merged
merged 2 commits into from
Jun 27, 2022

ritchie46
Member

The previous implementation memory-mapped the file and then read from the map into a vector. That is wasteful, because we pay for the I/O twice. Removing the memmap and reading directly into the buffers improved performance by ~30%.
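A minimal sketch of what "reading directly into the buffers" can look like, using only the standard library. `read_chunk_direct` is a hypothetical helper for illustration, not the function changed in this PR, and the offset/length pair stands in for whatever byte range a column chunk occupies.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

/// Read `len` bytes starting at `offset` straight from the file into a
/// freshly allocated buffer (hypothetical helper, not the Polars code).
fn read_chunk_direct(file: &mut File, offset: u64, len: usize) -> io::Result<Vec<u8>> {
    // `read_exact` needs an initialized `&mut [u8]`, so the buffer is
    // zero-filled before the file contents overwrite it.
    let mut buf = vec![0u8; len];
    file.seek(SeekFrom::Start(offset))?;
    // Fails with `UnexpectedEof` if the range runs past the end of the file.
    file.read_exact(&mut buf)?;
    Ok(buf)
}
```

Each range is read from the file once, rather than being paged in through a map and then copied into a second buffer.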

But those buffers still need to be zeroed before they are written to, so if we use the memmap directly we can skip that step. That improved performance by another ~8% relative to master.
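To illustrate why using the map directly avoids the zeroing step, here is a small sketch under stated assumptions. A memory map is typically exposed as a plain byte slice (for example, the `memmap2` crate's `Mmap` dereferences to `[u8]`), so a byte range can be borrowed from it without allocating or initializing anything. `chunk_borrowed` is a hypothetical name, and an ordinary slice stands in for the mapped region so the example needs no external crate.

```rust
/// Borrow `len` bytes at `offset` from an already-mapped region.
/// No allocation, no zeroing, no copy: the result points into `mapped`.
/// (Hypothetical helper; `mapped` stands in for a memory-mapped file.)
fn chunk_borrowed(mapped: &[u8], offset: usize, len: usize) -> Option<&[u8]> {
    // `checked_add` guards against overflow; `get` guards against
    // ranges that run past the end of the mapping.
    let end = offset.checked_add(len)?;
    mapped.get(offset..end)
}
```

The trade-off is that the borrowed slice is only valid while the mapping is alive, so the decoder has to consume it before the map is dropped.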

Benchmarks for reading the yellowtrip dataset (`perf stat` output):

master

Performance counter stats for './target/release/memcheck':

         22.978,03 msec task-clock                #    6,296 CPUs utilized          
           465.290      context-switches          #   20,249 K/sec                  
            10.129      cpu-migrations            #  440,812 /sec                   
         2.323.129      page-faults               #  101,102 K/sec                  
    75.015.799.667      cycles                    #    3,265 GHz                    
    73.372.673.203      instructions              #    0,98  insn per cycle         
    13.989.811.069      branches                  #  608,834 M/sec                  
        71.684.372      branch-misses             #    0,51% of all branches        

       3,649539611 seconds time elapsed

      10,707208000 seconds user
      13,337545000 seconds sys

read with zeroed

 Performance counter stats for './target/release/memcheck':

         17.159,57 msec task-clock                #    6,073 CPUs utilized          
           393.657      context-switches          #   22,941 K/sec                  
             7.877      cpu-migrations            #  459,044 /sec                   
         1.995.815      page-faults               #  116,309 K/sec                  
    59.192.854.862      cycles                    #    3,450 GHz                    
    52.386.441.413      instructions              #    0,89  insn per cycle         
    10.072.386.163      branches                  #  586,983 M/sec                  
        69.491.528      branch-misses             #    0,69% of all branches        

       2,825714164 seconds time elapsed

       5,988224000 seconds user
      12,028295000 seconds sys

memory map

 Performance counter stats for './target/release/memcheck':

         16.007,82 msec task-clock                #    6,071 CPUs utilized          
           352.604      context-switches          #   22,027 K/sec                  
             6.504      cpu-migrations            #  406,301 /sec                   
         1.747.134      page-faults               #  109,143 K/sec                  
    55.113.085.851      cycles                    #    3,443 GHz                    
    51.144.360.047      instructions              #    0,93  insn per cycle         
     9.883.151.274      branches                  #  617,395 M/sec                  
        67.085.974      branch-misses             #    0,68% of all branches        

       2,636863878 seconds time elapsed

       6,678761000 seconds user
      10,102559000 seconds sys
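As a rough check on the quoted percentages, the elapsed times from the three runs can be turned into throughput gains relative to master. This is only a sketch: `speedup_percent` is a hypothetical helper, and the constants are the "seconds time elapsed" values copied from the output above.

```rust
/// Throughput gain of `new_secs` over `baseline_secs`, in percent.
fn speedup_percent(baseline_secs: f64, new_secs: f64) -> f64 {
    (baseline_secs / new_secs - 1.0) * 100.0
}

// Elapsed wall-clock seconds from the three `perf stat` runs.
const MASTER: f64 = 3.649539611;
const READ_ZEROED: f64 = 2.825714164;
const MEMORY_MAP: f64 = 2.636863878;
```

With these inputs, reading into zeroed buffers comes out roughly 29% faster than master and the memory-mapped version roughly 38% faster, in line with the ~30% and the additional ~8% quoted in the description.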

@jorgecarleitao FYI

@github-actions github-actions bot added the rust (Related to Rust Polars) label Jun 27, 2022
@ritchie46 ritchie46 changed the title from "Improve parquet reading performance" to "Improve parquet reading performance ~35-40%" Jun 27, 2022
@codecov-commenter

Codecov Report

Merging #3821 (6796be3) into master (ca28e70) will increase coverage by 0.00%.
The diff coverage is 97.91%.

@@           Coverage Diff           @@
##           master    #3821   +/-   ##
=======================================
  Coverage   77.95%   77.95%           
=======================================
  Files         446      447    +1     
  Lines       73865    73946   +81     
=======================================
+ Hits        57578    57645   +67     
- Misses      16287    16301   +14     
| Impacted Files | Coverage Δ |
|---|---|
| polars/polars-io/src/parquet/mod.rs | 93.54% <ø> (ø) |
| polars/polars-io/src/parquet/read.rs | 98.90% <ø> (ø) |
| polars/polars-io/src/parquet/read_impl.rs | 94.53% <95.34%> (-0.47%) ⬇️ |
| polars/polars-io/src/parquet/mmap.rs | 100.00% <100.00%> (ø) |
| ...core/src/chunked_array/logical/categorical/from.rs | 21.62% <0.00%> (-18.92%) ⬇️ |
| ...ars/polars-core/src/series/implementations/utf8.rs | 71.53% <0.00%> (-1.13%) ⬇️ |
| polars/polars-arrow/src/utils.rs | 75.28% <0.00%> (-1.13%) ⬇️ |
| polars/polars-time/src/series/mod.rs | 56.00% <0.00%> (-0.67%) ⬇️ |
| polars/polars-core/src/series/from.rs | 83.12% <0.00%> (-0.63%) ⬇️ |
| ...olars/polars-core/src/frame/groupby/into_groups.rs | 60.29% <0.00%> (-0.30%) ⬇️ |
| ... and 7 more | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ritchie46
Member Author

@thobai this will certainly help ;)

@ritchie46 ritchie46 merged commit 83161a1 into master Jun 27, 2022
@ritchie46 ritchie46 deleted the improve_parquet branch June 27, 2022 10:13
@alexander-beedie
Collaborator

That's quite a gain - especially given that it was hardly slow to start with :)

@ritchie46
Member Author

> That's quite a gain - especially given that it was hardly slow to start with :)

Yep, curious to see the benchmarks. :)

@thobai

thobai commented Jul 1, 2022

@ritchie46 This sounds very promising! Did it already make it into the latest Python release of polars (0.13.51)? I just tested with that version and unfortunately I can't really see a difference in performance for my datasets and queries. But that could also be because most of the time is spent somewhere else and reading wasn't a big fraction of the execution time.
