{cache-dir}/virtualenvs may be too volatile a storage location for some users #3346

Open
2 tasks done
hwalinga opened this issue Nov 10, 2020 · 9 comments
Labels: area/venv (Related to virtualenv management), status/needs-consensus (Consensus among maintainers required)

Comments

@hwalinga

  • I have searched the issues of this repo and believe that this is not a duplicate.
  • I have searched the documentation and believe that my question is not covered.

Feature Request

By default, Poetry saves its virtualenvs in {cache-dir}/virtualenvs, which resolves to ~/.cache/pypoetry/virtualenvs on Linux and ~/Library/Caches/pypoetry/virtualenvs on macOS. However, these cache folders are generally considered safe to delete on both Linux and macOS.

The XDG Base Directory Specification also reserves the cache directory for non-essential files (i.e. those that can be trivially recreated without user interaction).

I am reporting this because I emptied the cache on a server, and that took down a web server running in the background from a Poetry virtualenv, unbeknownst to me (luckily it was the weekend).

So I recommend setting a different default for the virtualenv folder. If you follow the XDG specification, that would probably be ~/.local/share/poetry/virtualenvs (FYI: Pipenv already uses ~/.local/share/virtualenvs), and there is probably a similar folder for macOS.

(Default virtualenvs folder for Windows should probably be fine.)

@hwalinga added the kind/feature (Feature requests/implementations) and status/triage (This issue needs to be triaged) labels on Nov 10, 2020
@mrijken

mrijken commented Nov 11, 2020

You know you can change the location, which makes a different default unnecessary?

poetry config virtualenvs.path /path/to/cache/directory/virtualenvs

or to the project / repo location:

poetry config virtualenvs.in-project true
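
As a rough sketch of the first option, a small shell helper could create a directory outside the cache and register it with Poetry. The helper name `use_persistent_venv_dir` is made up for illustration and is not part of Poetry; only the `poetry config virtualenvs.path` call comes from the comment above.

```shell
# Hypothetical helper (not part of Poetry): store virtualenvs in a directory
# that cache cleaners will not touch. $1 is the target directory.
use_persistent_venv_dir() {
    venv_target="${1:-}"
    if [ -z "$venv_target" ]; then
        echo "usage: use_persistent_venv_dir <directory>" >&2
        return 2
    fi
    # Create the directory first so a bad path fails before Poetry is touched.
    mkdir -p "$venv_target" || return 1
    poetry config virtualenvs.path "$venv_target"
}

# Usage (assumes `poetry` is on PATH):
#   use_persistent_venv_dir "$HOME/.local/share/poetry/virtualenvs"
```

As far as I can tell, environments already created under the old path are not migrated by this setting, so each project would likely need a fresh `poetry install` afterwards.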

@hwalinga
Author

Yes, but I found that out when it already went wrong :-)
(I wouldn't expect people to completely read the docs, especially for a package manager. So, any rm happy person can make the same mistake as me.)

But it would really be nice to have a saner default instead. I think ~/.local/share/poetry/virtualenvs would be the perfect location (just as Pipenv does it), and for macOS that seems to be ~/Library/Application Support/poetry/virtualenvs or just ~/Library/poetry/virtualenvs.

@DrLuke
Contributor

DrLuke commented Nov 11, 2020

> (I wouldn't expect people to completely read the docs, especially for a package manager. So, any rm happy person can make the same mistake as me.)

I would at least expect people to try to find out where the venv goes when deploying this to a production environment, seeing how it obviously doesn't end up in the project directory by default.

I agree, however, that putting them in the cache directory by default might not be the best solution, as that means your venv could vanish seemingly at random while you're working with it. At best it's a nuisance because you have to re-run install; at worst this could be rather devastating for someone with a slow or data-capped internet connection.

Personally, I would prefer them to always end up in the project directory by default, as that means that deleting the project directory will clean up all data associated with it. If it's at a different location, like ~/.local, you could collect a substantial number of dangling venvs if you don't clean it up regularly.
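
To illustrate the cleanup concern, here is a minimal sketch that lists virtualenv directories that look untouched for a while. The function name `list_stale_venvs` and the 90-day default are assumptions, and directory modification time is only a crude heuristic: the script has no way of knowing whether the owning project still exists.

```shell
# Hypothetical helper: list virtualenv directories whose top-level entry has
# not been modified for more than N days (default 90). This only flags
# candidates for manual review; nothing is deleted.
list_stale_venvs() {
    stale_root="$1"
    stale_days="${2:-90}"
    # A missing root simply means there is nothing to report.
    [ -d "$stale_root" ] || return 0
    find "$stale_root" -mindepth 1 -maxdepth 1 -type d -mtime +"$stale_days"
}

# Usage:
#   list_stale_venvs "$HOME/.local/share/poetry/virtualenvs" 180
```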

@hwalinga
Author

> I would at least expect people to try to find out where the venv goes when deploying this to a production environment, seeing how it obviously doesn't end up in the project directory by default.

I learned it the hard way, but agreed.

> Personally, I would prefer them to always end up in the project directory by default, as that means that deleting the project directory will clean up all data associated with it. If it's at a different location, like ~/.local, you could collect a substantial number of dangling venvs if you don't clean it up regularly.

I think that is also a very good solution, perhaps even better.

@finswimmer added the area/venv (Related to virtualenv management) label on Nov 27, 2020
@felciano

This bug just bit me hard on macOS. Optimization software like CleanMyMac will clear out ~/Library/Caches/ periodically, wiping out packages downloaded by Poetry. They've (correctly, IMO) pointed to the Apple technical docs, which pretty clearly indicate that this directory shouldn't include any files that applications depend on and can't recreate themselves (e.g. "the application does not require cache data to operate properly, but it can use cache data to improve performance").

This seems like a valid and well-documented design decision, and one that Poetry should respect out of the box. Since Python apps aren't typically expected to be able to reinstall required libraries on their own (!), shouldn't a different default cache location be used on macOS?
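
For anyone who wants to check whether they are exposed, a minimal sketch follows. The function name `is_in_cache_dir` is hypothetical; it only tests whether a path sits under the two cache locations named in this thread (or under `$XDG_CACHE_HOME` when that is set).

```shell
# Hypothetical helper: succeed (exit 0) if the given path lies inside a
# directory that cache cleaners commonly treat as disposable.
is_in_cache_dir() {
    case "$1" in
        # Linux default cache dir, or the XDG override when set.
        "${XDG_CACHE_HOME:-$HOME/.cache}"/*) return 0 ;;
        # macOS per-user cache dir.
        "$HOME/Library/Caches"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (assumes `poetry` is on PATH and is run inside a project):
#   is_in_cache_dir "$(poetry env info --path)" && echo "venv lives in a cache dir"
```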

@YodaEmbedding

It depends on the interpretation of the XDG specification:

> There is a single base directory relative to which user-specific data files should be written. This directory is defined by the environment variable $XDG_DATA_HOME.

> There is a single base directory relative to which user-specific non-essential (cached) data should be written. This directory is defined by the environment variable $XDG_CACHE_HOME.

Is the data essential? It can be reconstructed (by re-downloading) easily, so it sounds rather cache-like. On the other hand, it's a practical annoyance for this data to disappear and force a redownload. Additionally, if the cache is cleared, the user must manually redownload the data, which means it violates the idea that a user should not be able to tell if the cache is suddenly cleared.

TL;DR: storing it in $XDG_DATA_HOME is probably more practical.

@neersighted added the status/needs-consensus (Consensus among maintainers required) label and removed the kind/feature (Feature requests/implementations) and status/triage (This issue needs to be triaged) labels on Oct 4, 2022
@neersighted changed the title from "{cache-dir}/virtualenvs might not be the best place to safe virtualenvs" to "{cache-dir}/virtualenvs may be too volatile a storage location for some users" on Oct 4, 2022
@YodaEmbedding

YodaEmbedding commented Aug 7, 2023

XDG_STATE_HOME (i.e. ~/.local/state) was recently introduced as an "in-between" option between XDG_CACHE_HOME and XDG_DATA_HOME.

> The $XDG_STATE_HOME contains state data that should persist between (application) restarts, but that is not important or portable enough to the user that it should be stored in $XDG_DATA_HOME. It may contain:
>
>   • actions history (logs, history, recently used files, …)
>   • current state of the application that can be reused on a restart (view, layout, open files, undo history, …)

Theoretically, anything inside XDG_CACHE_HOME should be regeneratable with zero additional user interaction. Since poetry does not automatically run poetry install when it detects missing virtual environments, there is non-zero user interaction required to regenerate the cache:

rm -rf ~/.cache/pypoetry/virtualenvs
poetry run python main.py  # Does not work without additional user interaction!
poetry install             # Manual user interaction.
poetry run python main.py  # Now it works.

Thus, XDG_STATE_HOME is a more appropriate place.
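
Until the default changes, the manual step in the sequence above could be hidden behind a wrapper. This is only a sketch: `poetry_run_or_install` is a made-up name, not a Poetry feature, and it assumes that `poetry env info --path` prints nothing (and fails) when the project has no virtualenv.

```shell
# Hypothetical wrapper (not a Poetry feature): reinstall dependencies when the
# project's virtualenv is missing, then run the requested command.
poetry_run_or_install() {
    # Assumption: this prints the venv path, or nothing if no venv exists.
    pri_venv="$(poetry env info --path 2>/dev/null)" || pri_venv=""
    if [ -z "$pri_venv" ] || [ ! -d "$pri_venv" ]; then
        # The cache was probably cleared; recreate the environment first.
        poetry install || return 1
    fi
    poetry run "$@"
}

# Usage (assumes `poetry` is on PATH):
#   poetry_run_or_install python main.py
```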


TL;DR: ~/.local/state/pypoetry/virtualenvs.
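
To compare the three candidate locations discussed in this thread, here is a small sketch that resolves each XDG base directory using the fallbacks from the XDG specification. The function name `xdg_venv_candidate` is an assumption, and the `pypoetry/virtualenvs` suffix simply mirrors the paths quoted above.

```shell
# Hypothetical helper: print where virtualenvs would live under a given XDG
# base directory kind (cache, data, or state).
xdg_venv_candidate() {
    case "$1" in
        cache) xdg_base="${XDG_CACHE_HOME:-$HOME/.cache}" ;;
        data)  xdg_base="${XDG_DATA_HOME:-$HOME/.local/share}" ;;
        state) xdg_base="${XDG_STATE_HOME:-$HOME/.local/state}" ;;
        *)
            echo "unknown kind: $1 (expected cache, data, or state)" >&2
            return 2
            ;;
    esac
    printf '%s\n' "$xdg_base/pypoetry/virtualenvs"
}

# Usage:
#   for kind in cache data state; do xdg_venv_candidate "$kind"; done
```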

@Secrus self-assigned this on Sep 25, 2024
@shayneoneill

shayneoneill commented Nov 26, 2024

Just noting that I just got bitten by this. I've filed a bug report with CleanMyMac about it, but it looks like it's not a true "bug" from the perspective of how a cache is supposed to be used (hopefully they'll put an exemption in for that directory, though).

That said, if the cache directory is volatile, it is a poor choice (and a virtualenv is not a cache).

9 participants