[SYSTEMDS-3548] Optimize IO path Python interface for SystemDS #2154

Nakroma · 2024-12-17T09:52:50Z

Draft for the student project SYSTEMDS-3548.

Current contributions:

Fixes some minor bugs related to the performance tests
Parallelizes pandas_to_frame_block column processing (see image below for speed up, tested on my machine)

This commit fixes the load_numpy string performance test case. It keeps the CLI usage consistent with the other test cases, but converts the dtype to the correct one internally.

This commit fixes the array boolean convert breaking for row numbers above 64. It also adds a bit more error handling to prevent cases like this in the future.

This commit parallelizes the column processing in the pandas DataFrame to FrameBlock conversion.
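
For readers unfamiliar with the idea, per-column parallelization can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: a plain dict of lists stands in for the pandas DataFrame, and `convert_column` and `convert_columns_parallel` are hypothetical helpers. The real conversion sends each column to Java through Py4J.

```python
from concurrent.futures import ThreadPoolExecutor


def convert_column(name, values):
    # Hypothetical per-column work: here we only stringify every cell.
    return name, [str(v) for v in values]


def convert_columns_parallel(columns, max_workers=4):
    # Submit one task per column; collect results in submission order
    # so the column order of the input is preserved.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(convert_column, name, values)
            for name, values in columns.items()
        ]
        return dict(f.result() for f in futures)


# Stand-in for a small DataFrame: column name -> list of cells.
columns = {"a": [1, 2, 3], "b": [True, False, True], "c": [0.5, 1.5, 2.5]}
converted = convert_columns_parallel(columns)
```

Threads only help here if the per-column work releases the GIL (for example while waiting on I/O or a socket); for pure-Python work the gain would be small.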

codecov · 2024-12-17T10:32:29Z

Codecov Report

Attention: Patch coverage is 20.00000% with 4 lines in your changes missing coverage. Please review.

Project coverage is 72.03%. Comparing base (d3fcfb1) to head (8323515).

Files with missing lines	Patch %	Lines
.../apache/sysds/runtime/util/Py4jConverterUtils.java	20.00%	3 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff            @@
##               main    #2154   +/-   ##
=========================================
  Coverage     72.03%   72.03%           
+ Complexity    43937    43935    -2     
=========================================
  Files          1441     1441           
  Lines        166106   166110    +4     
  Branches      32428    32431    +3     
=========================================
+ Hits         119655   119659    +4     
- Misses        37199    37204    +5     
+ Partials       9252     9247    -5


christinadionysio · 2024-12-18T14:10:33Z

LGTM! Thank you for your contribution @Nakroma!

Baunsgaard · 2024-12-18T14:21:47Z

LGTM as well.

How did you measure the time?
Does it include the startup time of the system?

Nakroma · 2024-12-18T14:29:23Z

@Baunsgaard I used the IO benchmark scripts for the figure provided above:

https://github.com/apache/systemds/blob/main/scripts/perftest/runAllIO.sh
https://github.com/apache/systemds/blob/main/scripts/perftest/python/io/load_pandas.py

Baunsgaard · 2024-12-18T16:08:02Z

> @Baunsgaard I used the IO benchmark scripts for the figure provided above:
>
> https://github.com/apache/systemds/blob/main/scripts/perftest/runAllIO.sh
> https://github.com/apache/systemds/blob/main/scripts/perftest/python/io/load_pandas.py

Great!

Then you can get better numbers:

systemds/scripts/perftest/python/io/load_pandas.py

Line 37 in d3fcfb1

run = "\n".join(

Modify the script so that the context is started in the 'setup' part rather than the 'run' part, and remember to shut down the system with ctx.close() once you are done measuring.
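
To illustrate why this matters, here is a minimal sketch using Python's `timeit`, assuming a stand-in `FakeContext` class with an artificially slow constructor. It is not the SystemDS API and not the benchmark script itself; it only shows how startup cost placed in the timed statement inflates the measurement compared to startup placed in `setup`.

```python
import time
import timeit


class FakeContext:
    """Hypothetical stand-in for a context that is expensive to start."""

    def __init__(self):
        time.sleep(0.05)  # simulated startup cost

    def work(self):
        return sum(range(1000))

    def close(self):
        pass


namespace = {"FakeContext": FakeContext}

# Startup (and shutdown) inside the timed statement: paid on every run.
with_startup = timeit.timeit(
    "ctx = FakeContext(); ctx.work(); ctx.close()",
    globals=namespace,
    number=3,
)

# Startup in setup: executed once and excluded from the measurement.
without_startup = timeit.timeit(
    "ctx.work()",
    setup="ctx = FakeContext()",
    globals=namespace,
    number=3,
)
```

Note that a name created in `setup` is local to the timer, so it cannot be closed from the calling code afterwards.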

Nakroma · 2024-12-19T12:26:28Z

@Baunsgaard Okay, yeah, that makes sense. I pushed a commit for that 👍 I didn't move it to the setup but rather into the global context, so that the .close() time is not included in the timing, and also to support the args.number parameter.
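
The approach described in this comment might look roughly like the sketch below. `CountingContext` is a hypothetical stand-in rather than the SystemDS context class, and `number` plays the role of `args.number`. The object is created once outside the timer, passed in through `globals`, reused for every repetition, and closed only after timing has finished.

```python
import timeit


class CountingContext:
    """Hypothetical stand-in for a long-lived context object."""

    def __init__(self):
        self.calls = 0
        self.closed = False

    def work(self):
        self.calls += 1
        return self.calls

    def close(self):
        self.closed = True


number = 5  # plays the role of args.number

ctx = CountingContext()  # started once, outside the timed region
try:
    # The timed statement sees ctx through the globals mapping.
    elapsed = timeit.timeit("ctx.work()", globals={"ctx": ctx}, number=number)
finally:
    ctx.close()  # shutdown happens after timing, so it is not measured
```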

Baunsgaard · 2024-12-19T13:09:02Z

> @Baunsgaard Okay, yeah, that makes sense. I pushed a commit for that 👍 I didn't move it to the setup but rather into the global context, so that the .close() time is not included in the timing, and also to support the args.number parameter.

what are the times then?

Nakroma · 2024-12-19T13:41:34Z

> what are the times then?

It seems to be a difference of about 1-2 s, at least on my local machine.

Baunsgaard · 2024-12-19T14:03:38Z

> > what are the times then?
>
> seems to be about a difference of 1-2s, at least on my local machine

60% speedup on int32 and 100% on int64 is great!
However, it does seem to me like there is something else taking time from your results. I would expect speedup closer to the number of cores in your system.

Nakroma · 2024-12-19T19:19:57Z

> 60% speedup on int32 and 100% on int64 is great! However, it does seem to me like there is something else taking time from your results. I would expect speedup closer to the number of cores in your system.

So there is some additional constant time: for example, building the FrameBlock a few lines before the .convert calls takes around 400 ms.

I looked at the profiling a bit more and it seems like most time is spent on socket communication between Java and Python. My assumption would be that this adds quite a bit of overhead and doesn't parallelize well.
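
The effect of such a constant, non-parallel portion can be estimated with Amdahl's law. The sketch below is a back-of-the-envelope aid, not a measurement from this PR: it reads "100% speedup" as a 2x speedup, and the core count of 8 is a made-up assumption.

```python
def amdahl_speedup(parallel_fraction, workers):
    # Amdahl's law: S = 1 / ((1 - p) + p / n)
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def parallel_fraction_from_speedup(speedup, workers):
    # Inverse of the formula above: p = (1 - 1/S) / (1 - 1/n)
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


workers = 8  # hypothetical core count, not taken from the thread
observed = 2.0  # "100% speedup" interpreted as 2x (an assumption)

# Fraction of the original runtime that must have been parallelized
# to explain the observed speedup on this many workers.
p = parallel_fraction_from_speedup(observed, workers)

# Upper bound on the speedup with unlimited workers for that fraction.
limit = 1.0 / (1.0 - p)
```

Under these assumptions, only a bit more than half of the runtime would be parallelized, which is consistent with a sizeable serial share such as FrameBlock construction and socket communication.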

Nakroma added 4 commits December 3, 2024 13:03

[SYSTEMDS-3548] Fix performance test for load_numpy string case

d9b439a

[SYSTEMDS-3548] Fix Py4j boolean array convert

ae15fe8

[SYSTEMDS-3548] Parallelize pandas_to_frame_block

0b54a15

Merge branch 'apache:main' into main

8323515

[SYSTEMDS-3548] Update converters.py to be black compliant

a41cfe6

Nakroma marked this pull request as ready for review December 18, 2024 14:04

Nakroma added 2 commits December 19, 2024 13:21

[SYSTEMDS-3548] Update I/O tests to not include systemds context start during run statement

b3e5126

[SYSTEMDS-3548] Remove redundant close statements

89202c1
