Use lazy wheel to obtain dep info for new resolver #8532

Closed
wants to merge 5 commits into from

Conversation

McSinyx
Contributor

@McSinyx McSinyx commented Jul 3, 2020

This PR continues the path towards implementing GH-7819 (this is about halfway there). I'm filing it a bit early to iron out the UX we'd want to have. At the time of writing, the patch produces something like the following, which IMHO is a bit too verbose:

Installation of django-rest-swagger

$ pip install django-rest-swagger --no-cache
Collecting django-rest-swagger
  Obtaining dependency information from django-rest-swagger 2.2.0
Collecting openapi-codec>=1.3.1
  Downloading openapi-codec-1.3.2.tar.gz (6.3 kB)
Collecting coreapi>=2.3.0
  Obtaining dependency information from coreapi 2.3.3
Collecting itypes
  Obtaining dependency information from itypes 1.2.0
Collecting coreschema
  Downloading coreschema-0.0.4.tar.gz (10 kB)
Collecting uritemplate
  Obtaining dependency information from uritemplate 3.0.1
Collecting djangorestframework>=3.5.4
  Obtaining dependency information from djangorestframework 3.11.0
Collecting jinja2
  Obtaining dependency information from jinja2 2.11.2
Collecting MarkupSafe>=0.23
  Obtaining dependency information from markupsafe 1.1.1
Collecting django>=1.11
  Obtaining dependency information from django 3.0.8
Collecting sqlparse>=0.2.2
  Obtaining dependency information from sqlparse 0.3.1
Collecting asgiref~=3.2
  Obtaining dependency information from asgiref 3.2.10
Collecting pytz
  Obtaining dependency information from pytz 2020.1
Collecting simplejson
  Downloading simplejson-3.17.0.tar.gz (83 kB)
     |████████████████████████████████| 83 kB 666 kB/s 
Collecting requests
  Obtaining dependency information from requests 2.24.0
Collecting chardet<4,>=3.0.2
  Obtaining dependency information from chardet 3.0.4
Collecting idna<3,>=2.5
  Obtaining dependency information from idna 2.10
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1
  Obtaining dependency information from urllib3 1.25.9
Collecting certifi>=2017.4.17
  Obtaining dependency information from certifi 2020.6.20
Collecting django-rest-swagger
  Downloading django_rest_swagger-2.2.0-py2.py3-none-any.whl (495 kB)
     |████████████████████████████████| 495 kB 329 kB/s 
Collecting coreapi>=2.3.0
  Downloading coreapi-2.3.3-py2.py3-none-any.whl (25 kB)
Collecting itypes
  Downloading itypes-1.2.0-py2.py3-none-any.whl (4.8 kB)
Collecting uritemplate
  Downloading uritemplate-3.0.1-py2.py3-none-any.whl (15 kB)
Collecting djangorestframework>=3.5.4
  Downloading djangorestframework-3.11.0-py3-none-any.whl (911 kB)
     |████████████████████████████████| 911 kB 315 kB/s 
Collecting jinja2
  Downloading Jinja2-2.11.2-py2.py3-none-any.whl (125 kB)
     |████████████████████████████████| 125 kB 691 kB/s 
Collecting MarkupSafe>=0.23
  Downloading MarkupSafe-1.1.1-cp38-cp38-manylinux1_x86_64.whl (32 kB)
Collecting django>=1.11
  Downloading Django-3.0.8-py3-none-any.whl (7.5 MB)
     |████████████████████████████████| 7.5 MB 624 kB/s 
Collecting sqlparse>=0.2.2
  Downloading sqlparse-0.3.1-py2.py3-none-any.whl (40 kB)
     |████████████████████████████████| 40 kB 1.5 MB/s 
Collecting asgiref~=3.2
  Downloading asgiref-3.2.10-py3-none-any.whl (19 kB)
Collecting pytz
  Downloading pytz-2020.1-py2.py3-none-any.whl (510 kB)
     |████████████████████████████████| 510 kB 73 kB/s 
Collecting requests
  Downloading requests-2.24.0-py2.py3-none-any.whl (61 kB)
     |████████████████████████████████| 61 kB 1.0 MB/s 
Collecting chardet<4,>=3.0.2
  Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
     |████████████████████████████████| 133 kB 827 kB/s 
Collecting idna<3,>=2.5
  Downloading idna-2.10-py2.py3-none-any.whl (58 kB)
     |████████████████████████████████| 58 kB 1.1 MB/s 
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1
  Downloading urllib3-1.25.9-py2.py3-none-any.whl (126 kB)
     |████████████████████████████████| 126 kB 820 kB/s 
Collecting certifi>=2017.4.17
  Downloading certifi-2020.6.20-py2.py3-none-any.whl (156 kB)
     |████████████████████████████████| 156 kB 706 kB/s 
Using legacy setup.py install for openapi-codec, since package 'wheel' is not installed.
Using legacy setup.py install for coreschema, since package 'wheel' is not installed.
Using legacy setup.py install for simplejson, since package 'wheel' is not installed.
Installing collected packages: MarkupSafe, urllib3, jinja2, idna, chardet, certifi, uritemplate, sqlparse, requests, pytz, itypes, coreschema, asgiref, django, coreapi, simplejson, openapi-codec, djangorestframework, django-rest-swagger
    Running setup.py install for coreschema ... done
    Running setup.py install for simplejson ... done
    Running setup.py install for openapi-codec ... done
Successfully installed MarkupSafe-1.1.1 asgiref-3.2.10 certifi-2020.6.20 chardet-3.0.4 coreapi-2.3.3 coreschema-0.0.4 django-3.0.8 django-rest-swagger-2.2.0 djangorestframework-3.11.0 idna-2.10 itypes-1.2.0 jinja2-2.11.2 openapi-codec-1.3.2 pytz-2020.1 requests-2.24.0 simplejson-3.17.0 sqlparse-0.3.1 uritemplate-3.0.1 urllib3-1.25.9

cc @cosmicexplorer for review and other thoughts on the lazy wheel

cc @nlhkabu and @ei8fdb for the UI/UX

TODOs:

  • Make all tests pass
  • Make the optimization available only as an opt-in via --use-feature
  • Add functional tests for the opt-in

@McSinyx McSinyx marked this pull request as draft July 3, 2020 15:57
@McSinyx McSinyx force-pushed the lazy-whl-dep-info branch 2 times, most recently from 98455c2 to 0872339 Compare July 5, 2020 10:49
@ofek
Contributor

ofek commented Jul 6, 2020

Need rebase fyi

@McSinyx
Contributor Author

McSinyx commented Jul 6, 2020

Thanks @ofek, you're even faster than @BrownTruck! Just curious, do you have some sort of hook to keep track of the PRs, or did you just happen to see it at the right time?

@ofek
Contributor

ofek commented Jul 6, 2020

I constantly cycle open tabs :)

@McSinyx McSinyx force-pushed the lazy-whl-dep-info branch 2 times, most recently from b46fb6b to 2192d0f Compare July 11, 2020 08:15
@McSinyx McSinyx marked this pull request as ready for review July 11, 2020 08:15
@McSinyx McSinyx requested review from pradyunsg and dholth July 11, 2020 08:15
By invoking pip with --use-feature=lazy-wheel, the new resolver
will try to obtain dependency information by lazily downloading wheels
via HTTP range requests.
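
A minimal sketch of the idea, not pip's actual implementation: a wheel is a zip archive whose central directory sits at the end of the file, so a seekable file-like object that fetches bytes on demand lets `zipfile` read `METADATA` without transferring the whole archive. `RangeBackedFile`, `read_requires_dist` and `fetch_range` are hypothetical names; `fetch_range(start, end)` stands in for an HTTP request carrying a `Range: bytes=start-end` header and is simulated here with an in-memory buffer.

```python
import email
import io
import zipfile
from typing import Callable, List


class RangeBackedFile(io.RawIOBase):
    """Read-only, seekable file whose bytes are fetched on demand.

    Hypothetical helper: fetch_range(start, end) must return the bytes in
    the inclusive range [start, end], as an HTTP Range request would.
    """

    def __init__(self, size: int, fetch_range: Callable[[int, int], bytes]) -> None:
        super().__init__()
        self._size = size
        self._fetch = fetch_range
        self._pos = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos, io.SEEK_END: self._size}[whence]
        self._pos = max(0, base + offset)
        return self._pos

    def readinto(self, buffer) -> int:
        # Only request what the caller asked for, clipped to the file size.
        wanted = min(len(buffer), self._size - self._pos)
        if wanted <= 0:
            return 0
        data = self._fetch(self._pos, self._pos + wanted - 1)
        buffer[:len(data)] = data
        self._pos += len(data)
        return len(data)


def read_requires_dist(wheel_file) -> List[str]:
    """Return the Requires-Dist entries from the wheel's METADATA file."""
    with zipfile.ZipFile(wheel_file) as archive:
        name = next(n for n in archive.namelist() if n.endswith(".dist-info/METADATA"))
        metadata = email.message_from_bytes(archive.read(name))
    return metadata.get_all("Requires-Dist") or []


def build_demo_wheel() -> bytes:
    """Build a fake wheel: one large payload file plus a small METADATA."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr("demo/data.bin", b"\0" * 200_000)
        archive.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: demo\nVersion: 1.0\n"
            "Requires-Dist: requests>=2\nRequires-Dist: idna\n",
        )
    return buffer.getvalue()


wheel_bytes = build_demo_wheel()
fetched = []  # size of every simulated range request


def fetch_range(start: int, end: int) -> bytes:
    chunk = wheel_bytes[start:end + 1]
    fetched.append(len(chunk))
    return chunk


requires = read_requires_dist(RangeBackedFile(len(wheel_bytes), fetch_range))
```

In this simulation only the end-of-archive records, the central directory and the `METADATA` entry are requested, so `sum(fetched)` stays far below `len(wheel_bytes)`. A real client would also need to handle servers that ignore the `Range` header.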
@McSinyx
Contributor Author

McSinyx commented Jul 13, 2020

Per a private discussion with @pradyunsg a few days ago, we came to a consensus that lazy-wheel is not a particularly intuitive name to specify the feature this PR is introducing. Any suggestion would be really appreciated.

@uranusjr
Member

I’d call it something along the lines of fast-deps, since that’s what this would achieve.

@McSinyx
Contributor Author

McSinyx commented Jul 13, 2020

@uranusjr, I've renamed the user-facing feature to fast-deps, while keeping the internal implementation name as well as the description in the test suite.

@McSinyx McSinyx closed this Jul 13, 2020
@McSinyx McSinyx reopened this Jul 13, 2020
@@ -38,18 +38,34 @@ jobs:
env:
- GROUP=1
- NEW_RESOLVER=1
- env:
Member

Are any of these tests exercising the new code? It will only impact the network tests, and there aren't many in our unit tests. Personally I'd trade all of these for 1 or 2 integration tests where we know the new code is being used. werkzeug (which we build on for our mock server test helper) has some built-in helpers for handling range requests so we could do all of this locally, without reaching out to PyPI. The PR in pallets/werkzeug#977 might provide some help in using it.

Member

@chrahunt chrahunt Jul 13, 2020

To add on to this, because of the code that we're touching, these are the specific integration tests I think would give us the most impact:

  1. pip wheel
  2. pip download
  3. pip install of a wheel that has extras that are also wheels

One approach would be to find existing integration tests, then convert them to use our mock server and then parameterize which mock server to use: plain one or range-request-supporting one.
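
The parameterization suggested above could look roughly like the following. This is a sketch under stated assumptions: the two "servers" are plain functions rather than pip's werkzeug-based mock server, `fetch_range` and `download` are hypothetical client helpers, and only the exception name `HTTPRangeRequestUnsupported` comes from this PR. In the real test suite the server list would feed `pytest.mark.parametrize`.

```python
import re
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]
Server = Callable[[Dict[str, str]], Response]

BODY = b"0123456789" * 10


def plain_server(request_headers: Dict[str, str]) -> Response:
    """Ignores Range and always answers 200 with the full body."""
    return 200, {"Content-Length": str(len(BODY))}, BODY


def range_server(request_headers: Dict[str, str]) -> Response:
    """Honours a single 'bytes=start-end' range with 206 Partial Content."""
    match = re.fullmatch(r"bytes=(\d+)-(\d+)", request_headers.get("Range", ""))
    if match is None:
        return 200, {"Accept-Ranges": "bytes", "Content-Length": str(len(BODY))}, BODY
    start = int(match.group(1))
    end = min(int(match.group(2)), len(BODY) - 1)
    chunk = BODY[start:end + 1]
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{len(BODY)}",
        "Content-Length": str(len(chunk)),
    }
    return 206, headers, chunk


class HTTPRangeRequestUnsupported(Exception):
    """Raised when the server does not answer a range request with 206."""


def fetch_range(server: Server, start: int, end: int) -> bytes:
    status, _headers, body = server({"Range": f"bytes={start}-{end}"})
    if status != 206:
        raise HTTPRangeRequestUnsupported("range request is not supported")
    return body


def download(server: Server, start: int, end: int) -> Tuple[str, bytes]:
    """Try the lazy path first, then fall back to a full download."""
    try:
        return "lazy", fetch_range(server, start, end)
    except HTTPRangeRequestUnsupported:
        _status, _headers, body = server({})
        return "eager", body[start:end + 1]


# The same check runs against both server flavours.
SERVERS = [(plain_server, "eager"), (range_server, "lazy")]
results = [download(server, 90, 99) for server, _expected in SERVERS]
```

Running the same assertions against both entries of `SERVERS` checks that the result is identical whether or not the server supports range requests, and that the lazy path is actually exercised when it does.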

@@ -271,6 +271,7 @@ def make_resolver(
force_reinstall=force_reinstall,
upgrade_strategy=upgrade_strategy,
py_version_info=py_version_info,
lazy_wheel='fast-deps' in options.features_enabled,
Member

@chrahunt chrahunt Jul 13, 2020

The RequirementPreparer currently hides all details about the way that we're accessing distributions, so the only thing that the Candidate needs to be aware of is the abstract distribution after preparing. If we put the lazy wheel logic there instead (probably used to create the abstract distribution), then:

  1. it preserves that separation of concerns, with all of the associated benefits
  2. we get more code reuse, since the lazy-wheel-backed abstract distribution would follow the same path as the eager-wheel-backed one, with all of the associated benefits
  3. it gives us control over whether the wheel download is actually lazy - IIUC currently we're relying on the fact that no one happens to access _InstallRequirementBackedCandidate.dist prior to iter_dependencies()
  4. it gives us control over when the real wheel download happens - IIUC currently we're relying on the fact that someone happens to access _InstallRequirementBackedCandidate.dist before actually trying to install the wheel
  5. it works when we compose candidates, like in ExtrasCandidate, which currently directly accesses self.base.dist and bypasses the lazy wheel logic
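
One way to picture this suggestion, as a sketch with hypothetical names only (`Preparer`, `EagerWheelDist`, `LazyWheelDist`, `Candidate` and the two callables are illustrative and do not mirror pip's internal classes): the preparer decides how a distribution is obtained, and the candidate only ever sees an object with the same interface.

```python
from typing import Callable, Dict, Iterator, List, Optional

Wheel = Dict[str, List[str]]


class EagerWheelDist:
    """Distribution backed by a wheel that is downloaded up front."""

    def __init__(self, download_wheel: Callable[[], Wheel]) -> None:
        self._wheel = download_wheel()

    def requires(self) -> List[str]:
        return self._wheel["requires"]

    def wheel(self) -> Wheel:
        return self._wheel


class LazyWheelDist:
    """Same interface, but the full wheel is fetched only when needed."""

    def __init__(
        self,
        fetch_metadata: Callable[[], List[str]],
        download_wheel: Callable[[], Wheel],
    ) -> None:
        self._requires = fetch_metadata()
        self._download_wheel = download_wheel
        self._wheel: Optional[Wheel] = None

    def requires(self) -> List[str]:
        return self._requires

    def wheel(self) -> Wheel:
        # The real download happens here, at a point the preparer controls.
        if self._wheel is None:
            self._wheel = self._download_wheel()
        return self._wheel


class Preparer:
    """Hides whether a distribution was obtained lazily or eagerly."""

    def __init__(self, lazy_wheel: bool) -> None:
        self.lazy_wheel = lazy_wheel

    def prepare(self, fetch_metadata, download_wheel):
        if self.lazy_wheel:
            return LazyWheelDist(fetch_metadata, download_wheel)
        return EagerWheelDist(download_wheel)


class Candidate:
    """Only knows about the prepared distribution, not how it was fetched."""

    def __init__(self, dist) -> None:
        self.dist = dist

    def iter_dependencies(self) -> Iterator[str]:
        return iter(self.dist.requires())


def make_sources(calls: List[str]):
    """Build fake network callables that record what was requested."""

    def fetch_metadata() -> List[str]:
        calls.append("metadata")
        return ["requests"]

    def download_wheel() -> Wheel:
        calls.append("wheel")
        return {"requires": ["requests"]}

    return fetch_metadata, download_wheel
```

With `Preparer(lazy_wheel=True)`, iterating dependencies records only a metadata fetch, and the wheel download is recorded once `dist.wheel()` is called. A composed candidate that reads `base.dist` directly would get the same object and therefore the same behaviour.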

Member

Oooo! I missed ExtrasCandidate when I discussed this w/ @McSinyx.

I think we can cover this by changing how self.dist is populated (instead of changing how iter_dependencies works), since what's really happening in _iter_dependencies is that we're creating a separate distribution object and using it.

Member

Similar to what was done in #8448.

Contributor Author

@pradyunsg, I think we did discuss point (5), and what you came up with (1a28d08) will handle it just fine. Now that we have reached a consensus on how to cover the points listed above, which differs from what this PR is pushing, I'm thinking about filing another one (likely today) to avoid intensive rebasing, which I'm not exactly good at 😄

@chrahunt, thank you for the thoughtful heads up as well as the tips on testing.

@dstufft
Member

dstufft commented Jul 14, 2020

I just thought I'd share pypi/warehouse#8254 here, as it's an idea that tackles this from a different angle, that I think would possibly be better in the long term (although I don't think it needs to stop this work in the short term).

@@ -271,6 +271,7 @@ def make_resolver(
force_reinstall=force_reinstall,
upgrade_strategy=upgrade_strategy,
py_version_info=py_version_info,
lazy_wheel='lazy-wheel' in options.features_enabled,
Member

This also needs a warning to be printed that this functionality is not stable and not meant for production use at this time.
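
A rough sketch of what such a warning could look like. Assumptions: it uses `argparse` as a stand-alone stand-in for pip's own option parser, the logger name and the warning text are made up, and only `--use-feature`, `features_enabled` and the `fast-deps` value come from this thread.

```python
import argparse
import logging
from typing import List

logger = logging.getLogger("fast_deps_sketch")  # hypothetical logger name


def parse_options(argv: List[str]) -> argparse.Namespace:
    """Parse a repeatable --use-feature option into options.features_enabled."""
    parser = argparse.ArgumentParser(prog="pip-sketch")
    parser.add_argument(
        "--use-feature",
        dest="features_enabled",
        metavar="feature",
        action="append",
        default=[],
        choices=["fast-deps"],
        help="Enable new functionality that may be backward incompatible.",
    )
    return parser.parse_args(argv)


def lazy_wheel_enabled(options: argparse.Namespace) -> bool:
    """Return whether fast-deps was requested, warning the user if so."""
    enabled = "fast-deps" in options.features_enabled
    if enabled:
        # Hypothetical wording; the point is that opting in is made visible.
        logger.warning(
            "fast-deps is experimental and not meant for production use "
            "at this time."
        )
    return enabled
```

Calling `lazy_wheel_enabled(parse_options(["--use-feature", "fast-deps"]))` returns `True` and logs one warning; without the flag it returns `False` and stays silent.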

src/pip/_internal/cli/cmdoptions.py (outdated, resolved)
@@ -23,13 +23,18 @@
from pip._internal.network.session import PipSession


class HTTPRangeRequestUnsupported(RuntimeError):
Member

Suggested change
- class HTTPRangeRequestUnsupported(RuntimeError):
+ class HTTPRangeRequestUnsupported(Exception):

@McSinyx
Contributor Author

McSinyx commented Jul 15, 2020

Thank you @dstufft, the initiative for METADATA via the simple API looks really promising and is very likely to be out before the revised JSON API. I'll try to find a way, if possible and not ugly, to make the metadata acquisition part a bit future-proof, so that we don't have to refactor again when we want to change how the metadata is obtained.

@pradyunsg
Member

I had a call with @McSinyx yesterday, where we discussed how we'd implement this functionality. Following @chrahunt's review (who stated what I was thinking during my initial review much more clearly!), we concluded that a better approach than what's being done right now would be to do something like pradyunsg@1a28d08. It's a smaller, more easily reviewable, self-contained change overall.

@McSinyx
Contributor Author

McSinyx commented Jul 16, 2020

I'm closing this since it is superseded by GH-8588.

@McSinyx McSinyx closed this Jul 16, 2020
@McSinyx McSinyx deleted the lazy-whl-dep-info branch August 8, 2020 14:46
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 10, 2021