Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More intelligently insert the pip path into sys.path #8241

Closed

Conversation

abadger
Copy link
Contributor

@abadger abadger commented May 15, 2020

When inserting pip's library path into sys.path we do not want to
override the stdlib. This is for several reasons:

(1) The stdlib should only contain the stdlib. So there should not be
other instances of pip there which will override our desired library.
(2) If we were to override the stdlib and the path the pip library is
installed in contains any libraries which override the stdlib
(coughenum34cough) then we could break code which depends on the
stdlib's APIs.

Fixes #8214

@pfmoore
Copy link
Member

pfmoore commented May 15, 2020

We still seem to be trying to solve a theoretical issue with (ever more complex) code. I know this triggers an actual problem, but can we start with a test that demonstrates the real issue?

The code in this PR seems way too complex and potentially very platform sensitive (as shown by the test failures) so I'd want to see strong justification for it, in terms of confirmation that there's no simpler way of fixing the underlying issue.

#8213 is far simpler. It may be the correct solution - the problem was that the explanation/justification for the fix got hopelessly confused, and (again) didn't have a test demonstrating the underlying issue, so we ended up debating generalities that may or may not have been relevant.

@abadger
Copy link
Contributor Author

abadger commented May 15, 2020

Of the three easy ways to fix this issue, I think this one is my favorite. I do have some questions, though:

  • Are there other paths which count as stdlib paths? I only knew about DESTSHARED because of long experience running and packaging python on posix platforms. I don't have similar experience with Windows.
  • Should the check for __platform__ continue to be included (As I stated elsewhere, that check seems to be trying, unsuccessfully, to limit the path changing to running pip from a wheel)? This change should be generic between wheel and non-wheel cases. (But might have reason to avoid in other cases.)

EDIT: sysconfig does not appear to know where the stdlib is installed on Windows so I do not believe this is a viable option.

@pfmoore
Copy link
Member

pfmoore commented May 15, 2020

  • Are there other paths which count as stdlib paths?
  • Should the check for __platform__ continue to be included

This is why I don't like this approach. Expecting (current and future) pip maintainers to know this sort of information, and track any changes tha affect it, is IMO a very high cost, and the issue doesn't justify it.

When inserting pip's library path into sys.path we do not want to
override the stdlib.  This is for several reasons:

(1) The stdlib should only contain the stdlib.  So there should not be
    other instances of pip there which will override our desired library.
(2) If we were to override the stdlib and the path the pip library is
    installed in contains any libraries which override the stdlib
    (*cough*enum34*cough*) then we could break code which depends on the
    stdlib's APIs.

Fixes pypa#8214
@abadger abadger force-pushed the fix-pip-build-env-intelligent-path branch from bc544d6 to c6587af Compare May 15, 2020 08:29
@felixfontein
Copy link

@pfmoore would you prefer this approach? #8213 (comment) IMO that would be easier to understand and maintain.

@abadger
Copy link
Contributor Author

abadger commented May 15, 2020

Of the three easy ways to fix this issue, I think this one is my favorite. I do have some questions, though:

* Are there other paths which count as stdlib paths?  I only knew about `DESTSHARED` because of long experience running and packaging python on posix platforms.  I don't have similar experience with Windows.

Judging by the windows error:

2020-05-15T08:32:45.0323810Z     sys.path.insert(max(stdlib_path_indexes) + 1, pip_installed_path)
2020-05-15T08:32:45.0324486Z ValueError: max() arg is an empty sequence

It's worse than I feared. This error means that not only are there not more possible stdlib locations but the three locations that sysconfig knows about were all empty on the Windows build hosts.

If we can't count on sysconfig knowing where the stdlib lives, I don't think this is a viable option. If @uranusjr knows of another way to get the potential directories, then he might be able to rescue this approach, but I won't be able to.

@uranusjr
Copy link
Member

@felixfontein Appending to the back is probably not the right approach, since the pip referred by path would be shadowed by the pip installed in site-packages. It must be inserted before site-packages to work correctly.

@uranusjr
Copy link
Member

I wonder if we can steal a trick or two from how virtualenv builds sys.path on Python 2 in its site.py.

@felixfontein
Copy link

@uranusjr please read my comment again. I said to only append at the end if none of the paths before contain a subdirectory pip.

@uranusjr
Copy link
Member

But what happens if any of them do? If you append in front of them you’d still get hit by the enum34 issue, which is the thing we want to solve in the first place…

@felixfontein
Copy link

If one of the stdlib paths contains an installed version of pip, I think we're screwed anyway. We either insert our path afterwards (in which case the wrong version of pip is used), or we insert it before (in which case the enum34 problem arises). Are you aware of cases where pip is already in the stdlib path(s)?

@abadger
Copy link
Contributor Author

abadger commented May 15, 2020

@pfmoore Sorry, uranusjr had mentioned this approach and that was the last thing mentioned in the bug report. I knew that approach would have some wrinkles (needing to ask sysconfig for the information. There being more than one directory that constituted the stdlib) so I thought I should take a stab at it so people didn't all decide that it was the right solution without seeing the full scope of what it would entail.

I'm afraid I'm swamped with work so I don't really have time to write a test... I did try just now, but I don't understand all of your test framework so the test doesn't fail:

diff --git a/tests/functional/test_pep517.py b/tests/functional/test_pep517.py
index 1647edf0..20f858b1 100644
--- a/tests/functional/test_pep517.py
+++ b/tests/functional/test_pep517.py
@@ -1,11 +1,25 @@
+import pathlib
+import os
+
 import pytest
 from pip._vendor import toml
 
+import pip
 from pip._internal.build_env import BuildEnvironment
 from pip._internal.req import InstallRequirement
 from tests.lib import make_test_finder, path_to_url, windows_workaround_7667
 
 
+@pytest.fixture
+def break_stdlib():
+    pip_location = pathlib.Path(pip.__file__).parent
+    broken_enum = pip_location / 'enum.py'
+    with open(broken_enum, 'w') as f:
+        f.write('broken()\n')
+    yield
+    os.remove(broken_enum)
+
+
 def make_project(tmpdir, requires=[], backend=None, backend_path=None):
     project_dir = tmpdir / 'project'
     project_dir.mkdir()
@@ -93,6 +107,17 @@ def test_pep517_install(script, tmpdir, data):
     result.assert_installed('project', editable=False)
 
 
+def test_stdlib_not_overridden(script, tmpdir, data, break_stdlib):
+    project_dir = make_project(
+        tmpdir, requires=['test_backend'],
+        backend="test_backend"
+    )
+    result = script.pip(
+        'install', '--no-index', '-f', data.backends, project_dir
+    )
+    result.assert_installed('project', editable=False)
+
+
 def test_pep517_install_with_reqs(script, tmpdir, data):
     """Backend generated requirements are installed in the build env"""
     project_dir = make_project(

I really would like to take the time to figure out how to get the test to test for the problem but I'm simply too busy... I haven't taken a weekend off in two weeks and I'm currently stealing time to work on this at 2:30 in the morning. I'm truly sorry, I hope you understand I'd like to carry this further but I simply can't keep being distracted by yummy, yummy open source problems when there's things I have to work on to keep getting health care right now ;-) You have the reproducer in the bug report which I assume you are now able to duplicate after I detailed it further, so I hope that's enough for someone else to be able to finish off a test for you.

@pfmoore
Copy link
Member

pfmoore commented May 15, 2020

I really would like to take the time to figure out how to get the test to test for the problem but I'm simply too busy

No problem - please don't feel pressured to work on this on your own time, we're all volunteers and understand how that can be. The contributions you've already made have been very helpful, don't feel you need to do any more than you already have.

In my view, there's clearly some edge cases in how we set up isolated envioronments that could do with tidying up, and this fits into that area. I'd be fine with addressing this as part of that work, when someone has the time to do it. We can write tests as part of that - the report here is sufficient as a starting point.

Even if we don't implement a solution to this issue right now the discussion here, has been extremely useful, so what's been done so far won't be wasted.

@pradyunsg
Copy link
Member

@abadger Please don't feel pressured to work on this. It's perfectly fine if we don't fix this right now. :)

@pradyunsg
Copy link
Member

Closing due to lack of movement here. Please feel free to file a new PR if you want to get the ball rolling again! :)

@pradyunsg pradyunsg closed this Feb 26, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

With Enum34 installed on python3.6+, pip picks up enum34 instead of the stdlib enum package
5 participants