Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix blocking I/O to load python code when the network location contains non-ascii characters #1342

Merged
merged 28 commits into from
Oct 20, 2024

Conversation

bdraco
Copy link
Member

@bdraco bdraco commented Oct 20, 2024

urlsplit is not async safe since it can import unicodedata at runtime which does blocking I/O to load python code from disk. This PR replaces it with a new version that is functionality compatible but rewritten a bit to have better performance and avoid run time imports

To improve performance #190 (comment) suggested cythonizing it but I think thats a lot more to maintain, and the performance issues can likely be solved in python.

Long term this also means we don't have to use SplitResult internally anymore and can fix issues where urlsplit doesn't work as we expected and have been sitting in the issue queue for years in future PRs

Copy link

codspeed-hq bot commented Oct 20, 2024

CodSpeed Performance Report

Merging #1342 will improve performances by ×2.2

Comparing url_parse (4cf4bb0) with master (89695f7)

Summary

⚡ 6 improvements
✅ 77 untouched benchmarks

Benchmarks breakdown

Benchmark master url_parse Change
test_url_make_encoded_with_host_path_and_port 318.3 µs 283.7 µs +12.22%
test_url_make_no_netloc 449 µs 412.9 µs +8.75%
test_url_make_no_netloc_relative 446 µs 410.6 µs +8.62%
test_url_make_with_many_hosts 5.4 ms 3.9 ms +39.66%
test_url_make_with_many_ipv4_hosts 5.6 ms 4.1 ms +37.47%
test_url_make_with_many_ipv6_hosts 11.4 ms 5.2 ms ×2.2

Copy link

codecov bot commented Oct 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.13%. Comparing base (89695f7) to head (4cf4bb0).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1342      +/-   ##
==========================================
+ Coverage   96.09%   96.13%   +0.04%     
==========================================
  Files          27       27              
  Lines        5483     5541      +58     
  Branches      336      358      +22     
==========================================
+ Hits         5269     5327      +58     
  Misses        188      188              
  Partials       26       26              
Flag Coverage Δ
CI-GHA 96.13% <100.00%> (+0.04%) ⬆️
MyPy 45.22% <100.00%> (+0.60%) ⬆️
OS-Linux 99.23% <100.00%> (+<0.01%) ⬆️
OS-Windows 99.27% <100.00%> (+<0.01%) ⬆️
OS-macOS 98.96% <100.00%> (+0.01%) ⬆️
Py-3.10.11 98.94% <100.00%> (+0.14%) ⬆️
Py-3.10.15 99.18% <100.00%> (+0.14%) ⬆️
Py-3.11.10 99.18% <100.00%> (+<0.01%) ⬆️
Py-3.11.9 98.94% <100.00%> (+0.01%) ⬆️
Py-3.12.7 99.18% <100.00%> (+<0.01%) ⬆️
Py-3.13.0 99.18% <100.00%> (+<0.01%) ⬆️
Py-3.9.13 98.90% <100.00%> (+0.14%) ⬆️
Py-3.9.20 99.14% <100.00%> (+0.14%) ⬆️
Py-pypy7.3.16 99.18% <100.00%> (+0.15%) ⬆️
Py-pypy7.3.17 99.20% <100.00%> (+0.15%) ⬆️
VM-macos-latest 98.96% <100.00%> (+0.01%) ⬆️
VM-ubuntu-latest 99.23% <100.00%> (+<0.01%) ⬆️
VM-windows-latest 99.27% <100.00%> (+<0.01%) ⬆️
pytest 99.23% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bdraco bdraco changed the title Vendor urlparse to address performance issues Vendor urlsplit to address performance issues Oct 20, 2024
@bdraco bdraco changed the title Vendor urlsplit to address performance issues Adapt urlsplit to address performance issues Oct 20, 2024
@bdraco bdraco changed the title Adapt urlsplit to address performance issues Adapt urlsplit to address performance issues Oct 20, 2024
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Oct 20, 2024
@bdraco bdraco changed the title Adapt urlsplit to address performance issues Replace urlsplit to address performance issues Oct 20, 2024
@bdraco
Copy link
Member Author

bdraco commented Oct 20, 2024

This is a performance improvement and a bugfix but there is no way to separate so bugfix wins for the changelog messages

@bdraco bdraco changed the title Replace urlsplit to address performance issues Fix blocking I/O to load python code when the network location contains unicode characters Oct 20, 2024
@bdraco bdraco changed the title Fix blocking I/O to load python code when the network location contains unicode characters Fix blocking I/O to load python code when the network location contains non-ascii characters Oct 20, 2024
@bdraco bdraco marked this pull request as ready for review October 20, 2024 23:56
@bdraco bdraco merged commit 5fdf03b into master Oct 20, 2024
43 of 46 checks passed
@bdraco bdraco deleted the url_parse branch October 20, 2024 23:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided There is a change note present in this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant