Dependency resolution differences (wrong) when using custom (i.e. not pypi) repository #4439
Comments
I had read through the code when I created my issue, and my understanding of the cause matches yours. I think, though, that assuming all PyPI-like backends support the simple API would be brave. For example, I was using AWS CodeArtifact as one of my PyPI backends, and that supports the legacy API and not the simple one. I know it is far from ideal, but it's probably best to try every front door for custom repositories and see which APIs are available, rather than resorting to sdist downloads for all custom repositories. It's a sad state of affairs when the thing storing packages can't be trusted to correctly answer really basic questions about what packages need to be installed, but these performance and correctness issues undercut a huge amount of the value that users get from Poetry. I appreciate that Poetry is trying to do the "right thing", but it's tiring to advocate for tools like this and then have it take ages to do its calculations, especially when it's not making use of available API endpoints to do so.
That's the ideal solution, to me at least (option 1). You could try to optimize the guessing game a little by looking at hostnames, though. E.g. if you can tell from a URL that it's Azure Artifacts feeds or AWS CodeArtifact, you can use that to your advantage.
Tip for others who are also affected by this: until this is resolved, we are moving to installing via git instead.
Using private pypi:
Installing with private pypi can take hours for us.
So I should lobby Microsoft to get them to implement PEP 691 and PEP 658 on Azure Artifacts feeds? (I can do that!)
That is correct.
This is still happening when a private Python package from Artifact Registry on Google Cloud Platform (GCP) is being installed.
$ poetry install
Creating virtualenv ./.venv
Updating dependencies
Resolving dependencies... Downloading https://files.pythonhosted.org/packages/be/c8/551a803a6ebb174ec1c124e68b449b98a0961f0b737def601e3c1fbb4cfd/pathspec-0.11.1-py3-none-a
Resolving dependencies... Downloading https://files.pythonhosted.org/packages/39/fd/217e9bf573f710827416e1e6f56a6355b90c2ce7fbf8b83d5729d5b2e0b6/numpy-1.24.2-cp310-cp310-m
Resolving dependencies... Downloading https://files.pythonhosted.org/packages/39/fd/217e9bf573f710827416e1e6f56a6355b90c2ce7fbf8b83d5729d5b2e0b6/numpy-1.24.2-cp310-cp310-m
Resolving dependencies... Downloading https://files.pythonhosted.org/packages/39/fd/217e9bf573f710827416e1e6f56a6355b90c2ce7fbf8b83d5729d5b2e0b6/numpy-1.24.2-cp310-cp310-m
Resolving dependencies... Downloading https://files.pythonhosted.org/packages/39/fd/217e9bf573f710827416e1e6f56a6355b90c2ce7fbf8b83d5729d5b2e0b6/numpy-1.24.2-cp310-cp310-m
Resolving dependencies... (820.7s)
It took more than 820 seconds and hasn't even started the installation. Is this a GCP-related issue, or can it be solved within Poetry?
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Issue
Any source defined in pyproject.toml other than PyPI is always handled internally as a LegacyRepository.
That means metadata is not collected from API calls, but always by downloading and parsing packages, usually sdists.
I probably don't have to explain why this is bad for speed, and people do notice it because the slowdown is significant. See for example #4113.
The bigger problem is correctness: when the metadata would have been available from an API endpoint, but Poetry can't work it out by parsing the sdist, dependency resolution goes wrong.
For example, scikit-image 0.17.2 sdist imports numpy in its setup.py, but it doesn't specify any build requirements in pyproject.toml, so running setup.py fails. Poetry then just silently concludes scikit-image doesn't have any dependencies, which is clearly wrong.
This is exactly what happens in #3464 and is also how I first encountered this bug.
If you install this package from PyPI, however, everything goes smoothly because the metadata is collected from the API endpoint instead.
So in short: for the exact same dependencies, you may get a different dependency resolution depending on whether the source repository is PyPI or something else, even if the alternative source is a direct reverse proxy to PyPI.
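To make the scikit-image failure mode concrete, here is a minimal sketch of the pattern, not Poetry's actual code. The embedded setup script, the module name `module_missing_from_build_env` (a stand-in for `import numpy`), and both function names are made up for illustration. The lenient reader swallows the import error and reports "no dependencies"; the strict one reports "unknown" so a caller could fall back to another metadata source.

```python
from __future__ import annotations

# Hypothetical stand-in for a setup.py that imports a package which is
# not installed in the build environment (as scikit-image does with numpy).
SETUP_SCRIPT = """
import module_missing_from_build_env  # raises ModuleNotFoundError
install_requires = ["example-dep>=1.0"]
"""


def read_requirements_lenient(source: str) -> list[str]:
    """Return declared requirements, or [] if the script cannot be run."""
    namespace: dict = {}
    try:
        exec(source, namespace)
    except Exception:
        # The failure is hidden: the caller cannot tell "no dependencies"
        # apart from "could not determine dependencies".
        return []
    return list(namespace.get("install_requires", []))


def read_requirements_strict(source: str) -> list[str] | None:
    """Return declared requirements, or None if they could not be determined."""
    namespace: dict = {}
    try:
        exec(source, namespace)
    except Exception:
        return None  # explicit "unknown", so the caller can try another route
    return list(namespace.get("install_requires", []))
```

With the lenient variant the resolver would proceed as if the package had no dependencies at all, which matches the behaviour described above.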
Suggested fix
Option 1 - Fully automated
This is the ideal option. Poetry becomes clever enough to figure out for any source if it can provide metadata via an API just like pypi can. A mechanism needs to be built that tests this per configured source.
You could look at hostnames to try and optimize this guessing game a little bit.
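A rough sketch of what such a per-source test might look like, under stated assumptions. It checks whether an index answers with the PEP 691 JSON content type when asked for it, and guesses the vendor from the hostname. The function names, the `fetch` callable, and the hostname patterns are illustrative assumptions, not Poetry APIs; what each vendor actually supports would still have to be verified.

```python
from typing import Callable, Mapping, Tuple
from urllib.parse import urlparse

# Media type defined by PEP 691 for the JSON form of the simple API.
PEP691_JSON = "application/vnd.pypi.simple.v1+json"

# A fetch callable takes (url, request_headers) and returns
# (status_code, response_headers). Injecting it keeps the sketch
# independent of any particular HTTP library.
Fetch = Callable[[str, Mapping[str, str]], Tuple[int, Mapping[str, str]]]


def guess_vendor(url: str) -> str:
    """Guess the hosting product from the hostname (patterns are assumptions)."""
    host = (urlparse(url).hostname or "").lower()
    if host.endswith(".amazonaws.com") and ".codeartifact." in host:
        return "aws-codeartifact"
    if host == "pkgs.dev.azure.com" or host.endswith(".pkgs.visualstudio.com"):
        return "azure-artifacts"
    return "unknown"


def supports_pep691(simple_url: str, fetch: Fetch) -> bool:
    """Return True if the index serves the PEP 691 JSON simple API."""
    status, headers = fetch(simple_url, {"Accept": PEP691_JSON})
    # Header names are case-insensitive, so normalize before looking up.
    lowered = {key.lower(): value for key, value in headers.items()}
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    return status == 200 and content_type == PEP691_JSON
```

The result of such a probe could be cached per source, so the extra request is only paid once.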
Option 2 - User configurable
Allow users to configure the capabilities a source has available in pyproject.toml. This would basically put the responsibility with the user to tell poetry what APIs can be consumed.
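As an illustration of what option 2 could look like, the sketch below validates a source table that declares its capabilities. The `capabilities` key and the capability names are hypothetical: they are not existing Poetry settings, only one possible shape for the proposal.

```python
from dataclasses import dataclass

# Hypothetical pyproject.toml fragment this sketch has in mind
# (`capabilities` is NOT an existing Poetry option):
#
#   [[tool.poetry.source]]
#   name = "internal"
#   url = "https://example.invalid/simple/"
#   capabilities = ["pep691", "pep658"]

# Illustrative capability names; a real design would have to define these.
KNOWN_CAPABILITIES = frozenset({"pep691", "pep658", "json-api"})


@dataclass(frozen=True)
class SourceConfig:
    name: str
    url: str
    capabilities: frozenset


def parse_source(table: dict) -> SourceConfig:
    """Build a SourceConfig from an already-parsed source table."""
    declared = frozenset(table.get("capabilities", []))
    unknown = declared - KNOWN_CAPABILITIES
    if unknown:
        raise ValueError(
            f"unknown capabilities for source {table['name']!r}: {sorted(unknown)}"
        )
    return SourceConfig(table["name"], table["url"], declared)
```

A source without the key would get an empty capability set, so the current download-and-parse behaviour would remain the default.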
If you agree with one of the suggested improvements, I can do the work and open a PR. I'm pretty sure many users will reap the benefits in performance and correctness!