Use OnceLock to store TokioRuntime #895

Merged
merged 8 commits into from
Oct 5, 2024

Conversation

Michael-J-Ward
Contributor

@Michael-J-Ward Michael-J-Ward commented Oct 3, 2024

Which issue does this PR close?

Closes #737.

Rationale for this change

To avoid repeatedly recreating the tokio runtime, #341 began storing the tokio runtime on the python heap at datafusion.runtime.

This created the requirement that the datafusion python package must be available to any user of the rust crate datafusion-python.

However, some users (#737) want to leverage the rust-crate datafusion-python to create their own custom python package without depending on or using the python datafusion package.

What changes are included in this PR?

This PR uses std::sync::OnceLock instead of the python heap to ensure that the tokio runtime is created only once.

This both removes the implicit dependency and avoids a rust --> python --> rust roundtrip every time the runtime is used.
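
A minimal sketch of the pattern described above, assuming a stand-in `FakeRuntime` type so that it compiles with the standard library alone. The type, its field, and the `get_runtime` helper are hypothetical names for illustration, not the code in this PR; the real static wraps a tokio runtime.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the tokio runtime wrapper, so the sketch
/// builds without the tokio crate.
pub struct FakeRuntime {
    pub worker_threads: usize,
}

/// Process-wide storage; empty until the first call to `get_runtime`.
static RUNTIME: OnceLock<FakeRuntime> = OnceLock::new();

/// Returns the shared runtime, building it on first use only.
/// (Hypothetical helper name.)
pub fn get_runtime() -> &'static FakeRuntime {
    RUNTIME.get_or_init(|| FakeRuntime { worker_threads: 4 })
}

fn main() {
    let first = get_runtime();
    let second = get_runtime();
    // Both calls hand back a reference to the same instance.
    assert!(std::ptr::eq(first, second));
    println!("worker threads: {}", first.worker_threads);
}
```

Because the value lives in a Rust `static`, looking it up does not need the Python interpreter or an import of the `datafusion` Python module.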

Are there any user-facing changes?

datafusion.runtime has been removed, but that was already unusable for users of the python package.

@Michael-J-Ward Michael-J-Ward marked this pull request as ready for review October 3, 2024 21:18
@Michael-J-Ward
Contributor Author

Because the change was performance related, we should probably benchmark the change.

I did a quick & dirty one (making no effort to isolate resources / stop other processes on my laptop), but the results tell me I should run a cleaner one tomorrow before merging.


Using benchmarks/tpch/tpch.py and 30 runs for each query.

  • main and feat are average query time in milliseconds for main and this branch
  • diff is feat - main (positive numbers are bad)
| query | main | main_std | feat | diff | diff_z | percent_diff |
|---|---|---|---|---|---|---|
| q1 | 875.883 | 103.659 | 935.693 | 59.8103 | 0.576991 | 6.82858 |
| q10 | 741.986 | 23.4227 | 751.555 | 9.56897 | 0.408533 | 1.28964 |
| q11 | 179.048 | 10.8799 | 176.393 | -2.65517 | -0.244043 | -1.48294 |
| q12 | 721.959 | 25.6898 | 721.752 | -0.206897 | -0.00805364 | -0.0286577 |
| q13 | 177.414 | 9.63364 | 183 | 5.58621 | 0.579865 | 3.14869 |
| q14 | 493.9 | 14.5749 | 505.51 | 11.6103 | 0.796601 | 2.35075 |
| q15 | 1068.53 | 27.9732 | 1081.39 | 12.8552 | 0.459553 | 1.20307 |
| q16 | 134.962 | 13.8672 | 136.255 | 1.2931 | 0.0932493 | 0.958124 |
| q17 | 1233.1 | 21.5606 | 1259.44 | 26.3379 | 1.22158 | 2.13591 |
| q18 | 1463.43 | 33.4739 | 1482.25 | 18.8172 | 0.562146 | 1.28583 |
| q19 | 575.676 | 17.9801 | 589.338 | 13.6621 | 0.759845 | 2.37322 |
| q2 | 220.979 | 10.2689 | 232.648 | 11.669 | 1.13634 | 5.28057 |
| q20 | 679.241 | 21.3368 | 686.652 | 7.41034 | 0.347303 | 1.09097 |
| q21 | 1821.93 | 50.2259 | 1829.33 | 7.39655 | 0.147266 | 0.405972 |
| q22 | 159.476 | 6.1729 | 171.555 | 12.0793 | 1.95683 | 7.57438 |
| q3 | 702.131 | 17.5867 | 718.107 | 15.9759 | 0.908405 | 2.27534 |
| q4 | 620.148 | 26.4152 | 639.021 | 18.8724 | 0.714454 | 3.04321 |
| q5 | 743.714 | 20.2565 | 749.21 | 5.49655 | 0.271348 | 0.739068 |
| q6 | 464.876 | 10.8334 | 470.121 | 5.24483 | 0.484136 | 1.12822 |
| q7 | 947.045 | 30.8158 | 960.534 | 13.4897 | 0.437751 | 1.42439 |
| q8 | 776.548 | 31.5608 | 778.169 | 1.62069 | 0.0513514 | 0.208704 |
| q9 | 938.197 | 28.7482 | 947.883 | 9.68621 | 0.336933 | 1.03243 |
| setup | 1.5 | nan | 1.9 | 0.4 | nan | 26.6667 |
| total | 456466 | nan | 464171 | 7704.4 | nan | 1.68784 |
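
The derived columns in the table above appear to follow from `main`, `main_std` and `feat` as `diff = feat - main`, `diff_z = diff / main_std` and `percent_diff = 100 * diff / main`. These formulas are inferred from the reported numbers (they reproduce the q1 and q22 rows), not taken from the benchmark script, so treat the sketch below as an assumption. The `Derived` struct and `derive` function are hypothetical names.

```rust
/// Derived benchmark columns for one query; field names mirror the table headers.
#[derive(Debug)]
pub struct Derived {
    pub diff: f64,
    pub diff_z: f64,
    pub percent_diff: f64,
}

/// Computes the derived columns from the mean on main, the standard
/// deviation on main, and the mean on the feature branch (all in ms).
/// The formulas are inferred from the table, not from the benchmark script.
pub fn derive(main: f64, main_std: f64, feat: f64) -> Derived {
    let diff = feat - main;
    Derived {
        diff,
        diff_z: diff / main_std,
        percent_diff: 100.0 * diff / main,
    }
}

fn main() {
    // q1 row of the first benchmark run.
    let q1 = derive(875.883, 103.659, 935.693);
    println!("{:?}", q1);
}
```

Read this way, `diff_z` expresses the change in units of the run-to-run standard deviation on main, so values well below 1 in magnitude sit inside the noise of a single query.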

@timsaucer
Contributor

From the benchmarking above it appears to be a mostly negative impact. Am I reading that right?

@Michael-J-Ward
Contributor Author

Yes. My ~dirty benchmark had a pretty consistently negative performance impact. Hence, the need for further investigation.

That result surprises me, though.

@Michael-J-Ward Michael-J-Ward marked this pull request as draft October 4, 2024 17:21
@Michael-J-Ward
Contributor Author

Michael-J-Ward commented Oct 4, 2024

Alright, a second benchmark, run after a clean boot, looks like this change has basically no impact.

I wouldn't mind if @andygrove or anyone else has a nice benchmarking server setup to run their own, but I'm more comfortable with this now.


Reminder: diff = feat - main, so positive diffs are bad.

| query | main | main_std | feat | diff | diff_z | percent_diff |
|---|---|---|---|---|---|---|
| q1 | 888.259 | 159.165 | 917.228 | 28.969 | 0.182006 | 3.26132 |
| q10 | 743.379 | 25.2168 | 757.448 | 14.069 | 0.55792 | 1.89257 |
| q11 | 177.052 | 7.93562 | 175.793 | -1.25862 | -0.158604 | -0.710877 |
| q12 | 726.648 | 27.0699 | 735.497 | 8.84828 | 0.326868 | 1.21768 |
| q13 | 179.317 | 8.79525 | 177.714 | -1.60345 | -0.182308 | -0.894196 |
| q14 | 488.414 | 12.0078 | 498.728 | 10.3138 | 0.858923 | 2.11169 |
| q15 | 1063.58 | 23.1427 | 1098.63 | 35.0483 | 1.51444 | 3.2953 |
| q16 | 133.062 | 7.81557 | 131.559 | -1.50345 | -0.192366 | -1.12988 |
| q17 | 1219.1 | 19.424 | 1234.63 | 15.531 | 0.799578 | 1.27398 |
| q18 | 1454.77 | 33.4598 | 1453.37 | -1.4 | -0.0418413 | -0.096235 |
| q19 | 572.231 | 17.7751 | 576.055 | 3.82414 | 0.215141 | 0.668286 |
| q2 | 238.445 | 48.7071 | 228.276 | -10.169 | -0.208778 | -4.2647 |
| q20 | 687.376 | 25.9638 | 696.972 | 9.59655 | 0.369612 | 1.39611 |
| q21 | 1800.74 | 37.2634 | 1819.87 | 19.131 | 0.5134 | 1.0624 |
| q22 | 167.645 | 12.4563 | 168.772 | 1.12759 | 0.0905237 | 0.672604 |
| q3 | 702.986 | 25.4628 | 705.334 | 2.34828 | 0.0922237 | 0.334043 |
| q4 | 621.555 | 23.043 | 628.755 | 7.2 | 0.312459 | 1.15838 |
| q5 | 745.252 | 23.4507 | 741.314 | -3.93793 | -0.167923 | -0.528403 |
| q6 | 459.452 | 16.2195 | 462.979 | 3.52759 | 0.217491 | 0.767782 |
| q7 | 944.11 | 32.3262 | 942.3 | -1.81034 | -0.0560023 | -0.191751 |
| q8 | 765.51 | 35.3682 | 765.686 | 0.175862 | 0.00497232 | 0.0229732 |
| q9 | 970.969 | 208.444 | 944.076 | -26.8931 | -0.129018 | -2.76972 |
| setup | 1.6 | nan | 27 | 25.4 | nan | 1587.5 |
| total | 456747 | nan | 459996 | 3248.6 | nan | 0.711247 |

@Michael-J-Ward Michael-J-Ward marked this pull request as ready for review October 4, 2024 19:20
I also included a reference comment in case future users experience problems with
using datafusion-python behind a forking app server like `gunicorn`.
@Michael-J-Ward
Contributor Author

Ran it one more time with the #[inline] annotation.


| query | main | main_std | feat | diff | diff_z | percent_diff |
|---|---|---|---|---|---|---|
| q1 | 888.259 | 159.165 | 874.948 | -13.3103 | -0.0836261 | -1.49848 |
| q10 | 743.379 | 25.2168 | 752.286 | 8.9069 | 0.353213 | 1.19816 |
| q11 | 177.052 | 7.93562 | 173.355 | -3.69655 | -0.465818 | -2.08784 |
| q12 | 726.648 | 27.0699 | 715.179 | -11.469 | -0.42368 | -1.57834 |
| q13 | 179.317 | 8.79525 | 174.041 | -5.27586 | -0.599853 | -2.94219 |
| q14 | 488.414 | 12.0078 | 477.593 | -10.8207 | -0.901137 | -2.21548 |
| q15 | 1063.58 | 23.1427 | 1075.16 | 11.5724 | 0.500046 | 1.08806 |
| q16 | 133.062 | 7.81557 | 124.021 | -9.04138 | -1.15684 | -6.79486 |
| q17 | 1219.1 | 19.424 | 1231.75 | 12.6483 | 0.651166 | 1.03751 |
| q18 | 1454.77 | 33.4598 | 1446.93 | -7.84138 | -0.234352 | -0.539011 |
| q19 | 572.231 | 17.7751 | 570.862 | -1.36897 | -0.0770161 | -0.239233 |
| q2 | 238.445 | 48.7071 | 242.534 | 4.08966 | 0.0839642 | 1.71514 |
| q20 | 687.376 | 25.9638 | 682.079 | -5.29655 | -0.203997 | -0.770547 |
| q21 | 1800.74 | 37.2634 | 1839.24 | 38.5069 | 1.03337 | 2.1384 |
| q22 | 167.645 | 12.4563 | 177.886 | 10.2414 | 0.822188 | 6.10897 |
| q3 | 702.986 | 25.4628 | 704.593 | 1.6069 | 0.0631076 | 0.228582 |
| q4 | 621.555 | 23.043 | 621.583 | 0.0275862 | 0.00119716 | 0.00443826 |
| q5 | 745.252 | 23.4507 | 742.538 | -2.71379 | -0.115723 | -0.364144 |
| q6 | 459.452 | 16.2195 | 466.848 | 7.39655 | 0.456029 | 1.60986 |
| q7 | 944.11 | 32.3262 | 937.966 | -6.14483 | -0.190088 | -0.650859 |
| q8 | 765.51 | 35.3682 | 759.803 | -5.7069 | -0.161357 | -0.745502 |
| q9 | 970.969 | 208.444 | 931.376 | -39.5931 | -0.189946 | -4.07769 |
| setup | 1.6 | nan | 24.3 | 22.7 | nan | 1418.75 |
| total | 456747 | nan | 455978 | -768.7 | nan | -0.168299 |

Contributor

@timsaucer timsaucer left a comment

Thanks, @Michael-J-Ward !

@Michael-J-Ward Michael-J-Ward merged commit ec8246d into apache:main Oct 5, 2024
23 checks passed
// If we run into that problem, in the future we can look to `delta-rs`
// which adds a check that disallows calls from a forked process
// https://github.com/delta-io/delta-rs/blob/87010461cfe01563d91a4b9cd6fa468e2ad5f283/python/src/utils.rs#L10-L31
static RUNTIME: OnceLock<TokioRuntime> = OnceLock::new();
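
One way such a fork check could look, as a rough sketch only: record the process ID the first time the guard runs, then compare it with the current process ID on later calls. A forked child inherits the recorded value but reports a different PID. This is written in the spirit of the check the comment refers to and is not a copy of the delta-rs code; `INIT_PID`, `same_process_as_init` and `assert_not_forked` are hypothetical names.

```rust
use std::sync::OnceLock;

/// PID recorded the first time the guard runs in this process.
static INIT_PID: OnceLock<u32> = OnceLock::new();

/// Hypothetical helper: true while the current PID matches the recorded one.
/// After a fork, the child keeps the parent's recorded PID in memory but
/// `std::process::id()` returns the child's own PID, so this returns false.
pub fn same_process_as_init() -> bool {
    let recorded = *INIT_PID.get_or_init(std::process::id);
    recorded == std::process::id()
}

/// Hypothetical guard to call before handing out the shared runtime.
pub fn assert_not_forked() {
    assert!(
        same_process_as_init(),
        "runtime was created in a different process (fork detected)"
    );
}

fn main() {
    // In a process that has not forked, the guard passes silently.
    assert_not_forked();
    println!("same process: {}", same_process_as_init());
}
```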

@austin362667 austin362667 Oct 7, 2024

if I'm understanding it correctly (please correct me if I'm not making sense).

So now it uses OnceLock<Runtime> rather than OnceLock<Arc<Runtime>> to get rid of the Python heap entirely. And will it not be thread-safe?

Contributor Author

I recommend looking at the PR diff to see the real change.

There is an intermediate commit that went from OnceLock<Arc<Runtime>> to OnceLock<Runtime>, but that's because the Arc was superfluous.

OnceLock is already Send + Sync and explicitly thread-safe.

So, if you discover that it is not actually thread-safe, please share!
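
A small sketch of the thread-safety claim, using only the standard library: several threads race to initialize one `OnceLock`, and a counter records how many times the initializer actually ran. The names `INIT_CALLS`, `VALUE`, `shared_value` and `race` are hypothetical and the stored `u64` stands in for the runtime.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Counts how many times the initializer closure has run.
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// Shared value; a plain u64 stands in for the runtime here.
static VALUE: OnceLock<u64> = OnceLock::new();

/// Returns the shared value, initializing it on first use.
pub fn shared_value() -> u64 {
    *VALUE.get_or_init(|| {
        INIT_CALLS.fetch_add(1, Ordering::SeqCst);
        42
    })
}

/// Spawns `threads` threads that all request the value, then reports
/// how many times the initializer ran in total.
pub fn race(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(shared_value))
        .collect();
    for handle in handles {
        // Every thread observes the same initialized value.
        assert_eq!(handle.join().unwrap(), 42);
    }
    INIT_CALLS.load(Ordering::SeqCst)
}

fn main() {
    println!("initializer ran {} time(s)", race(8));
}
```

No `Arc` is needed because a `static` already lives for the whole program, and `get_or_init` returns a shared reference that every thread can use.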

Development

Successfully merging this pull request may close these issues.

How do I bring dependencies in my binding?
3 participants