Package installation fail during parallel poetry runs #7370

Open
4 tasks done
VolvoxGlobator opened this issue Jan 19, 2023 · 9 comments
Labels
kind/bug Something isn't working as expected status/triage This issue needs to be triaged

Comments

@VolvoxGlobator

VolvoxGlobator commented Jan 19, 2023

  • Poetry version: 1.3.2

  • Python version: 3.9.7

  • OS version and name: Microsoft Windows 10 Pro 10.0.19044 N/A Build 19044

  • pyproject.toml: toml and lock file

  • I am on the latest stable Poetry version, installed using a recommended method.

  • I have searched the issues of this repo and believe that this is not a duplicate.

  • I have consulted the FAQ and blog for any relevant entries or release notes.

  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option) and have included the output below.

Issue

Package installation fails during multiple simultaneous runs of poetry install (also happens with add).
Way to reproduce
0/ Copy the toml and lock file from the gist into two separate folders (e.g. A and B).
1/ Clean up the cache. In this example, deleting d:\shared_home\poetry_cache\artifacts\0a\68\9f\8998c481b6d80d6d27eabc63ee35e31620a651b9d4f8c71044164af253\tensorflow-2.5.1-cp39-cp39-win_amd64.whl is sufficient.
2/ In both folders, execute poetry install -vvv almost simultaneously.
3/ One of the poetry instances will install tensorflow successfully (e.g. A); the other will fail (e.g. B).
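
Starting the two runs by hand makes the timing hard to hit, so a small launcher may help with step 2. This is only a sketch: `run_in_parallel` is a hypothetical helper name, and it assumes `poetry` is an executable on PATH (a `.cmd` shim on Windows might need a different invocation).

```python
import subprocess


def run_in_parallel(folders, command):
    """Start `command` in every folder at nearly the same time.

    Returns the exit codes in the same order as `folders`.
    """
    # Launch all processes first, so that they overlap in time ...
    procs = [subprocess.Popen(command, cwd=str(folder)) for folder in folders]
    # ... and only then wait for each of them to finish.
    return [proc.wait() for proc in procs]
```

With the folders from step 0, `run_in_parallel(["A", "B"], ["poetry", "install", "-vvv"])` would start both installs together; a non-zero entry in the returned list marks the instance that failed.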

Pure speculation: maybe when a package is not in the cache, each instance of poetry downloads it individually to a temp file. The one that finishes first then renames it to the proper name, in this case tensorflow-2.5.1-cp39-cp39-win_amd64.whl. The slower one tries to do the same, but the destination file already exists, so it fails?
The issue is easily reproducible with the tensorflow package, most likely because it is about half a gigabyte, so it is quite probable that the poetry instances will "meet" on that one.

Output of the failed one

poetry install -vvv
Loading configuration file C:\Users\pkrejci\AppData\Roaming\pypoetry\config.toml
Creating virtualenv a in D:\poetry\b\.venv
Using virtualenv: D:\poetry\b\.venv
Installing dependencies from lock file

Finding the necessary packages for the current system

Package operations: 39 installs, 1 update, 0 removals, 1 skipped

  • Installing certifi (2022.12.7)
  • Installing charset-normalizer (3.0.1)
  • Installing idna (3.4)
  • Installing pyasn1 (0.4.8)
  • Installing urllib3 (1.26.14)
  • Installing cachetools (5.2.1)
  • Installing pyasn1-modules (0.2.8)
  • Installing oauthlib (3.2.2)
  • Installing requests (2.28.2)
  • Installing zipp (3.11.0)
  • Installing rsa (4.9)
  • Installing six (1.15.0)
  • Installing google-auth (2.16.0)
  • Installing importlib-metadata (6.0.0)
  • Installing markupsafe (2.1.2)
  • Installing requests-oauthlib (1.3.1)
  • Installing absl-py (0.15.0)
  • Installing google-auth-oauthlib (0.4.6)
  • Installing grpcio (1.34.1)
  • Installing protobuf (3.20.3)
  • Installing tensorboard-plugin-wit (1.8.1)
  • Installing werkzeug (2.2.2)
  • Installing numpy (1.19.5)
  • Installing tensorboard-data-server (0.6.1)
  • Installing markdown (3.4.1)
  • Updating setuptools (65.6.3 -> 66.0.0)
  • Installing astunparse (1.6.3)
  • Installing flatbuffers (1.12)
  • Installing gast (0.4.0)
  • Installing keras-preprocessing (1.1.2)
  • Installing opt-einsum (3.3.0)
  • Installing termcolor (1.1.0)
  • Installing keras-nightly (2.6.0.dev2021062500)
  • Installing tensorboard (2.11.2)
  • Installing typing-extensions (3.7.4.3)
  • Installing tensorflow-estimator (2.5.0)
  • Installing h5py (3.1.0)
  • Installing google-pasta (0.2.0)
  • Installing wrapt (1.12.1)
Connection pool is full, discarding connection: pypi.org. Connection pool size: 10
Connection pool is full, discarding connection: pypi.org. Connection pool size: 10
Connection pool is full, discarding connection: pypi.org. Connection pool size: 10
  • Installing tensorflow (2.5.1)
  • Installing wheel (0.38.4): Skipped for the following reason: Already installed

  Stack trace:

  3  C:\Python39\lib\site-packages\poetry\installation\executor.py:649 in _download_link
      647│             # No cached distributions was found, so we download and prepare it
      648│             try:
    → 649│                 archive = self._download_archive(operation, link)
      650│             except BaseException:
      651│                 cache_directory = self._chef.get_cache_directory_for_link(link)

  2  C:\Python39\lib\site-packages\poetry\installation\executor.py:720 in _download_archive
      718│                         progress.set_progress(done)
      719│
    → 720│                 f.write(chunk)
      721│
      722│         if progress:

  1  C:\Python39\lib\contextlib.py:126 in __exit__
      124│         if typ is None:
      125│             try:
    → 126│                 next(self.gen)
      127│             except StopIteration:
      128│                 return False

  PermissionError

  [WinError 5] Access is denied: 'D:\\shared_home\\poetry_cache\\artifacts\\0a\\68\\9f\\8998c481b6d80d6d27eabc63ee35e31620a651b9d4f8c71044164af253\\tmpg3beyq4a' -> 'D:\\shared_home\\poetry_cache\\artifacts\\0a\\68\\9f\\8998c481b6d80d6d27eabc63ee35e31620a651b9d4f8c71044164af253\\tensorflow-2.5.1-cp39-cp39-win_amd64.whl'

  at C:\Python39\lib\site-packages\poetry\utils\helpers.py:53 in atomic_open
       49│     tmp_descriptor, tmp_name = tempfile.mkstemp(dir=os.path.dirname(filename))
       50│     try:
       51│         with os.fdopen(tmp_descriptor, "wb") as tmp_handler:
       52│             yield tmp_handler
    →  53│         os.replace(tmp_name, filename)
       54│     except BaseException:
       55│         os.remove(tmp_name)
       56│         raise
       57│

The following error occurred when trying to handle this error:


  Stack trace:

  6  C:\Python39\lib\site-packages\poetry\installation\executor.py:263 in _execute_operation
      261│
      262│             try:
    → 263│                 result = self._do_execute_operation(operation)
      264│             except EnvCommandError as e:
      265│                 if e.e.returncode == -2:

  5  C:\Python39\lib\site-packages\poetry\installation\executor.py:334 in _do_execute_operation
      332│             return 0
      333│
    → 334│         result: int = getattr(self, f"_execute_{method}")(operation)
      335│
      336│         if result != 0:

  4  C:\Python39\lib\site-packages\poetry\installation\executor.py:454 in _execute_install
      452│
      453│     def _execute_install(self, operation: Install | Update) -> int:
    → 454│         status_code = self._install(operation)
      455│
      456│         self._save_url_reference(operation)

  3  C:\Python39\lib\site-packages\poetry\installation\executor.py:488 in _install
      486│             archive = self._download_link(operation, Link(package.source_url))
      487│         else:
    → 488│             archive = self._download(operation)
      489│
      490│         operation_message = self.get_operation_message(operation)

  2  C:\Python39\lib\site-packages\poetry\installation\executor.py:640 in _download
      638│             self._yanked_warnings.append(message)
      639│
    → 640│         return self._download_link(operation, link)
      641│
      642│     def _download_link(self, operation: Install | Update, link: Link) -> Path:

  1  C:\Python39\lib\site-packages\poetry\installation\executor.py:656 in _download_link
      654│                 # prior to Python 3.8
      655│                 if cached_file.exists():
    → 656│                     cached_file.unlink()
      657│
      658│                 raise

  PermissionError

  [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\shared_home\\poetry_cache\\artifacts\\0a\\68\\9f\\8998c481b6d80d6d27eabc63ee35e31620a651b9d4f8c71044164af253\\tensorflow-2.5.1-cp39-cp39-win_amd64.whl'

  at C:\Python39\lib\pathlib.py:1354 in unlink
      1350│         Remove this file or link.
      1351│         If the path is a directory, use rmdir() instead.
      1352│         """
      1353│         try:
    → 1354│             self._accessor.unlink(self)
      1355│         except FileNotFoundError:
      1356│             if not missing_ok:
      1357│                 raise
      1358│
@VolvoxGlobator VolvoxGlobator added kind/bug Something isn't working as expected status/triage This issue needs to be triaged labels Jan 19, 2023
@dimbleby
Contributor

I'd be inclined to call this a great success - I'm pretty sure that in previous releases the two poetrys would have trampled over each other and left a corrupt copy of the wheel in place so that future attempts would all fail.

A one-time failure, with a more-or-less sensible error message, is really not so bad.

@neersighted
Member

Ah, Windows. I wonder if we can open files in the cache with the ability to overwrite, or simply consider a PermissionError here to be a success/use the existing file? Indeed, this is a big win, and things just /work/ on Unix, so this is a tiny edge case that shows the fix was good.
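
The second idea (treat a PermissionError on the final rename as "someone else already put the file there") could look roughly like the sketch below. It follows the temp-file-then-`os.replace` pattern visible in the traceback, but `atomic_open_tolerant` is a hypothetical name, not Poetry's code, and it assumes that an existing destination file is a complete, valid copy; it does not verify hashes.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_open_tolerant(filename):
    """Write to a temp file next to `filename`, then move it into place.

    If the move fails with PermissionError but `filename` already exists,
    assume another process won the race and keep its copy.
    """
    fd, tmp_name = tempfile.mkstemp(dir=os.path.dirname(filename) or ".")
    try:
        with os.fdopen(fd, "wb") as handle:
            yield handle
        try:
            os.replace(tmp_name, filename)
        except PermissionError:
            # Only tolerate the error when a destination file is present.
            if not os.path.exists(filename):
                raise
    finally:
        # After a successful move the temp file is gone; otherwise clean up.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
```

The slower process still pays for its own download here; the sketch only turns the failed rename into a no-op instead of an aborted install.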


@VolvoxGlobator
Author

Ah, Windows. I wonder if we can open files in the cache with the ability to overwrite, or simply consider a PermissionError here to be a success/use the existing file? Indeed, this is a big win, and things just /work/ on Unix, so this is a tiny edge case that shows the fix was good.

Re overwrite: note that the cached package is actively in use in this situation, as it is being installed into one of the virtualenvs. A forced overwrite could cause that other instance to fail instead.

@neersighted
Member

neersighted commented Jan 20, 2023

Right, but files opened with FILE_SHARE_DELETE should remain open as far as the process that is holding the lock is concerned, while allowing us to replace the file on disk. This would result in the same behavior as on a Unix system -- the process reading the file completes its operation (as the open file handle is retained and becomes anonymous) and the file is replaced on disk from the perspective of the other processes.

Some details at https://devblogs.microsoft.com/oldnewthing/20211022-00/?p=105822.

It's been a while since I've had to mess with Windows filesystem semantics, but if I'm not mistaken, setting the share flag should get us the behavior we want -- the catch here being this will require #6205 as I don't think we can control what mode pip uses to open files here.
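
For reference, opening a file with that share flag from Python might look like the sketch below. It is untested on Windows and rests on several assumptions: `open_for_shared_read` is a hypothetical helper, the Win32 call goes through `ctypes`, and whether a later replace of the file actually succeeds still depends on how the other process opened it and on the Windows version.

```python
import os
import sys


def open_for_shared_read(path):
    """Open `path` for binary reading.

    On Windows, request FILE_SHARE_DELETE so that other processes may delete
    or replace the file while this handle stays open.
    """
    if sys.platform != "win32":
        # POSIX already allows replacing a file that is open elsewhere.
        return open(path, "rb")

    import ctypes
    import msvcrt
    from ctypes import wintypes

    GENERIC_READ = 0x80000000
    FILE_SHARE_READ = 0x00000001
    FILE_SHARE_WRITE = 0x00000002
    FILE_SHARE_DELETE = 0x00000004
    OPEN_EXISTING = 3
    FILE_ATTRIBUTE_NORMAL = 0x80
    INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.argtypes = [
        wintypes.LPCWSTR, wintypes.DWORD, wintypes.DWORD, wintypes.LPVOID,
        wintypes.DWORD, wintypes.DWORD, wintypes.HANDLE,
    ]
    kernel32.CreateFileW.restype = wintypes.HANDLE

    handle = kernel32.CreateFileW(
        os.fspath(path), GENERIC_READ,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        None, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, None,
    )
    if handle is None or handle == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())

    # Wrap the Win32 handle in a C file descriptor, then in a file object.
    fd = msvcrt.open_osfhandle(handle, os.O_RDONLY | os.O_BINARY)
    return os.fdopen(fd, "rb")
```

This only covers files that Poetry itself opens; as noted above, it would not change how pip opens the wheel.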

@GrantAnt

GrantAnt commented Mar 16, 2024

The thread is a bit old, but I faced the same problem. If several processes need to run poetry, you need to set up the cache dir dynamically. Poetry offers two ways: environment variables or poetry config. Both act system-wide, so all processes are configured the same.

But the option can also be put in poetry.toml. Since I needed the path set dynamically, I used the following option:

poetry config cache-dir /folder/path/ --local
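
When Poetry is launched from a script, the environment-variable route can also be scoped to a single child process instead of the whole system. A minimal sketch, assuming the `POETRY_CACHE_DIR` variable mentioned in the commits referenced further down in this thread; `env_with_private_cache` is a hypothetical helper name.

```python
import os
import tempfile


def env_with_private_cache(base_env=None, prefix="poetry-cache-"):
    """Return (env, cache_dir): a copy of the environment in which
    POETRY_CACHE_DIR points at a freshly created directory."""
    env = dict(os.environ if base_env is None else base_env)
    cache_dir = tempfile.mkdtemp(prefix=prefix)
    env["POETRY_CACHE_DIR"] = cache_dir
    return env, cache_dir
```

The returned `env` can be passed to `subprocess.run(["poetry", "install"], cwd=folder, env=env)`. The trade-off is that every process then downloads its own copy of each package, and the caller has to delete `cache_dir` afterwards.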

geritwagner added a commit to CoLRev-Environment/colrev that referenced this issue Jul 26, 2024
see python-poetry/poetry#7370

commit 2ee412f
Author: Gerit Wagner <gerit.wagner@uni-bamberg.de>
Date:   Fri Jul 26 08:16:58 2024 +0200

    try runner.temp with env

commit 1f92ed7
Author: Gerit Wagner <gerit.wagner@uni-bamberg.de>
Date:   Fri Jul 26 08:08:59 2024 +0200

    try POETRY_CACHE_DIR

commit fc4982b
Author: Gerit Wagner <gerit.wagner@uni-bamberg.de>
Date:   Fri Jul 26 08:03:40 2024 +0200

    update cache dir

commit e53e6f4
Author: Gerit Wagner <gerit.wagner@uni-bamberg.de>
Date:   Fri Jul 26 07:56:05 2024 +0200

    set poetry cache directory
tgummerer added a commit to pulumi/pulumi that referenced this issue Sep 23, 2024
Poetry uses a cache for packages to speed up installation of the venv.
It seems like the primitives it uses for that are causing issues when
running multiple instances of poetry in parallel that are installing
the same packages.  This seems to happen only on Windows.  Make these
tests run sequentially on Windows to avoid this test flakiness.

There's an upstream issue for poetry as well (python-poetry/poetry#7370)
github-merge-queue bot pushed a commit to pulumi/pulumi that referenced this issue Sep 23, 2024
Poetry uses a cache for packages to speed up installation of the venv.
It seems like the primitives it uses for that are causing issues when
running multiple instances of poetry in parallel that are installing the
same packages. This seems to happen only on Windows. Make these tests
run sequentially on Windows to avoid this test flakiness.

There's an upstream issue for poetry that describes this as well
(python-poetry/poetry#7370)

#17183

5 participants