Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

During seeding properly uninstall present versions of the wheels #2186

Merged
merged 4 commits into from
Sep 24, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions docs/changelog/2185.bugfix.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
Fixed a bug where while creating a venv on top of an existing one, without cleaning, when seeded
wheel version mismatch occurred, multiple ``.dist-info`` directories may be present, confounding entrypoint
discovery - by :user:`arcivanov`
37 changes: 32 additions & 5 deletions src/virtualenv/seed/embed/via_app_data/pip_install/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
import re
import zipfile
from abc import ABCMeta, abstractmethod
from itertools import chain
from tempfile import mkdtemp

from distlib.scripts import ScriptMaker, enquote_executable
Expand All @@ -31,14 +32,10 @@ def _sync(self, src, dst):

def install(self, version_info):
self._extracted = True
self._uninstall_previous_version()
# sync image
for filename in self._image_dir.iterdir():
into = self._creator.purelib / filename.name
if into.exists():
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gaborbernat I'm sorry, this is NOT an equivalent change.

We went from "delete target prior to copying into" into "uninstall previous distro but do not delete target directory if it wasn't part of the distro".

I specifically preserved that functionality because if there are extraneous files that are NOT part of a valid distro, venv would override them on install and won't fail. Now VEnv will fail as soon as it encounters a directory/file that exists which can happen if overwriting a partially created VEnv directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, how do you end up with a partially created venv? Do you have some real-world examples of this or are you talking HYPOTHETICALS?

Copy link
Contributor Author

@arcivanov arcivanov Sep 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ctrl-C during venv installation with partial install of the package. This will occur intermittently and how intermittent it'll be will depend on the PIP package installation logic which will also change from time to time (whether they write dist-info first or they write package data first etc).

My point is: there was specific logic here that I preserved (not introduced!) that resulted in a very specific behavior that made venv more robust. The cleanup resulted in that logic being removed which made venv less robust and more error prone while the problem I tried to fix was fixed.

Can we please make venv less brittle again and reintroduce the logic that was there before?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is virtualenv, not venv 😊 please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change you introduced relies on everything else (PIP, wheel, bdist_wheel) working 100% correctly 100% of the time: fully installed packages with 100% correct RECORD file that recorded all the files that were actually installed and that package is installed successfully 100% of the time and never interrupted in the process of installation by user action or running out of disk space.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

if into.is_dir() and not into.is_symlink():
safe_delete(into)
else:
into.unlink()
self._sync(filename, into)
# generate console executables
consoles = set()
Expand Down Expand Up @@ -150,6 +147,36 @@ def _create_console_entry_point(self, name, value, to_folder, version_info):
result.extend(Path(i) for i in new_files)
return result

def _uninstall_previous_version(self):
dist_name = self._dist_info.stem.split("-")[0]
in_folders = chain.from_iterable([i.iterdir() for i in {self._creator.purelib, self._creator.platlib}])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The installation happens only into the purelib: https://github.com/pypa/virtualenv/pull/2186/files#diff-05dbe4fcecab074226aa3cf0e1d6285e9a47e4a1c529a20ab49f2b572c3d49eaR38

Why bother with platlib when install will never actually touch it?

Copy link
Contributor

@gaborbernat gaborbernat Sep 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the package is either in the purelib or platlib is an implementation detail of the installer. Our installer uses purelib, but is completely legal for someone else to use platlib for it. We don't control the previous state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then it's even more important to restore the logic above. If the package being removed is in the platlib but the extraneous files are in purelib then while the package is removed from platlib the purelib will stay polluted and installation will fail.

paths = (p for p in in_folders if p.stem.split("-")[0] == dist_name and p.suffix == ".dist-info" and p.is_dir())
existing_dist = next(paths, None)
if existing_dist is not None:
self._uninstall_dist(existing_dist)

@staticmethod
def _uninstall_dist(dist):
dist_base = dist.parent
logging.debug("uninstall existing distribution %s from %s", dist.stem, dist_base)

top_txt = dist / "top_level.txt" # add top level packages at folder level
paths = {dist.parent / i.strip() for i in top_txt.read_text().splitlines()} if top_txt.exists() else set()
paths.add(dist) # add the dist-info folder itself

base_dirs, record = paths.copy(), dist / "RECORD" # collect entries in record that we did not register yet
for name in (i.split(",")[0] for i in record.read_text().splitlines()) if record.exists() else ():
path = dist_base / name
if not any(p in base_dirs for p in path.parents): # only add if not already added as a base dir
paths.add(path)

for path in sorted(paths): # actually remove stuff in a stable order
if path.exists():
if path.is_dir() and not path.is_symlink():
safe_delete(path)
else:
path.unlink()

def clear(self):
if self._image_dir.exists():
safe_delete(self._image_dir)
Expand Down
16 changes: 13 additions & 3 deletions tests/unit/seed/embed/test_bootstrap_link_via_app_data.py
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ def test_seed_link_via_app_data(tmp_path, coverage_env, current_fastest, copies)
pip = site_package / "pip"
setuptools = site_package / "setuptools"

files_post_first_create = list(site_package.iterdir())
files_post_first_create = set(site_package.iterdir())
assert pip in files_post_first_create
assert setuptools in files_post_first_create
for pip_exe in [
Expand Down Expand Up @@ -82,15 +82,25 @@ def test_seed_link_via_app_data(tmp_path, coverage_env, current_fastest, copies)
assert not process.returncode
assert site_package.exists()

files_post_first_uninstall = list(site_package.iterdir())
files_post_first_uninstall = set(site_package.iterdir())
assert pip in files_post_first_uninstall
assert setuptools not in files_post_first_uninstall

# install a different setuptools to test that virtualenv removes this before installing new
version = "setuptools<{}".format(bundle_ver["setuptools"].split("-")[1])
install_cmd = [str(result.creator.script("pip")), "--verbose", "--disable-pip-version-check", "install", version]
process = Popen(install_cmd)
process.communicate()
assert not process.returncode
assert site_package.exists()
files_post_downgrade = set(site_package.iterdir())
assert setuptools in files_post_downgrade

# check we can run it again and will work - checks both overwrite and reuse cache
result = cli_run(create_cmd)
coverage_env()
assert result
files_post_second_create = list(site_package.iterdir())
files_post_second_create = set(site_package.iterdir())
assert files_post_first_create == files_post_second_create

# Windows does not allow removing a executable while running it, so when uninstalling pip we need to do it via
Expand Down