poetry install does not populate cache #2203
Does anyone have a suggested workaround? I'd like to be able to cache my dependencies in CI, but that seems tricky to do because of this issue.
The problem: the firmware is currently built in 4 variants, and each variant downloads all submodules again, which wastes bandwidth and is pretty slow on my machine. Poetry was also downloading all of its Python deps again and again. We create one Docker data volume which persists. It is mapped inside the container to '/root/.cache', so all tools aware of this standard location should naturally benefit. Poetry should benefit in the future when they fix python-poetry/poetry#2203; pip is already using it today. To improve the git submodules situation I adopted the following strategy:

1. On the first run I clone the repo once into /root/.cache/repos/trezor-firmware. Let's call it the canonical repo.
2. On each run I perform an initial update step which checks out the requested tag and brings this canonical repo up to date (including its submodules).
3. When building each variant of the firmware, we copy the canonical repo to /tmp/trezor-firmware instead of cloning it. This is faster. We do a copy because we want to work from scratch, i.e. there should be no left-over files from previous compilations, because the source canonical repo is clean.

To blow the caches one can run BLOW_CACHES=1 ./build-docker.sh
The problem: the firmware is currently built in 4 variants, and each variant downloads all submodules again, which wastes bandwidth and is pretty slow on my machine. Poetry was also downloading all of its Python deps again and again. We create one Docker data volume which persists. It is mapped inside the container to '/root/.cache', so all tools aware of this standard location should naturally benefit. Poetry should benefit in the future when they fix python-poetry/poetry#2203; pip is already using it today. To improve the git submodules situation I adopted the following strategy (sketched below):

1. On the first run I clone the repo once into /root/.cache/repos/trezor-firmware. Let's call it the canonical repo. In the case of a local repo, we can be even faster by doing a direct rsync from /local.
2. On each run I perform an initial update step which checks out the requested tag and brings this canonical repo up to date (including its submodules).
3. When building each variant of the firmware, we overlay the canonical repo in /repo/.cache/ws instead of copying or cloning it. This is instant. Using an overlayfs also keeps the canonical repo read-only, and we can easily scratch any left-over build files/artifacts before each run.

To blow the caches one can run BLOW_CACHES=1 ./build-docker.sh
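For readers who want to reproduce this kind of setup, here is a minimal sketch of the canonical-repo-plus-overlay idea described above. The paths, the `TAG` variable, and the final build step are illustrative assumptions; they are not taken from the actual build-docker.sh.

```sh
CACHE=/root/.cache                      # Docker data volume mounted here
CANONICAL=$CACHE/repos/trezor-firmware  # canonical clone, kept up to date across runs
WS=/tmp/ws                              # throwaway overlay mount point

# 1. clone once, then keep the canonical repo (and its submodules) current
[ -d "$CANONICAL" ] || git clone --recursive https://github.com/trezor/trezor-firmware "$CANONICAL"
git -C "$CANONICAL" fetch --tags
git -C "$CANONICAL" checkout "$TAG"                      # TAG is assumed to be set by the caller
git -C "$CANONICAL" submodule update --init --recursive

# 2. build each variant on an overlay so the canonical repo stays read-only
mkdir -p /tmp/upper /tmp/work "$WS"
mount -t overlay overlay \
  -o "lowerdir=$CANONICAL,upperdir=/tmp/upper,workdir=/tmp/work" "$WS"
cd "$WS" && ./build.sh                                   # hypothetical per-variant build step
```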
I confirm the same behaviour using stock Debian containers.
This is the expected behavior. The data in

The lock file is the source of truth for
I don't really understand the relevance of these parts of the comment:
(and)
This issue is about caching, not correctness. I'm not objecting to what

As for the rest of your comment:
It's been a long time since I wrote this issue, so perhaps I understood the cache wrong or the behavior has changed, but IIRC the presence of that cache dramatically improved install times. As for why I want it, I said so in the original comment:
To clarify further, I'm using Docker-based builds (in Travis), so they do the thing where they archive a bunch of files on disk so you can unarchive them in later builds in fresh Docker containers, thereby skipping a bunch of work for that later build. If there is another disk location that would be helpful for CI caching instead of, or in addition to, that one, that would be helpful to know; I don't see any documentation along these lines. It's also worth noting that I opened this issue before the addition of
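As an illustration of that archive/unarchive pattern, something like the following could be wrapped around the build. The archive location and the choice of ~/.cache/pypoetry as the directory worth persisting are assumptions, not documented Poetry behaviour.

```sh
CACHE_DIR="$HOME/.cache/pypoetry"
CACHE_ARCHIVE=/persistent/poetry-cache.tar.gz   # hypothetical location the CI persists between builds

# restore the cache in the fresh container, if a previous build saved one
if [ -f "$CACHE_ARCHIVE" ]; then
    mkdir -p "$CACHE_DIR"
    tar -xzf "$CACHE_ARCHIVE" -C "$CACHE_DIR"
fi

poetry install

# save the (possibly updated) cache for the next build;
# note that, per this issue, a plain install may leave it empty
mkdir -p "$CACHE_DIR"
tar -czf "$CACHE_ARCHIVE" -C "$CACHE_DIR" .
```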
The data in

That being said, there is a cache that matters for the installation and it's
I didn't know (and couldn't find anything) about this directory when I originally commented on this issue, and I ended up caching my entire virtualenv to speed up CI builds instead. This works fine but isn't very granular, and also isn't really how virtualenvs are intended to be used. If caching this directory would work instead, then possibly the only thing that could be improved for this issue is surfacing this directory in the docs? (If it's not there already, maybe I was just looking in the wrong place.)
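A rough sketch of that virtualenv-caching workaround, assuming an in-project virtualenv; the archive path is illustrative:

```sh
poetry config virtualenvs.in-project true   # keep the venv at ./.venv so it is easy to cache

VENV_ARCHIVE=/persistent/venv.tar.gz        # hypothetical location the CI persists between builds
[ -f "$VENV_ARCHIVE" ] && tar -xzf "$VENV_ARCHIVE"   # restore .venv from a previous build

poetry install                              # cheap if .venv was restored and poetry.lock is unchanged

tar -czf "$VENV_ARCHIVE" .venv              # save for the next build
```

As noted above, this caches installs wholesale rather than per package, so it helps most when the archive is keyed on poetry.lock and rebuilt whenever the lock file changes.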
Thanks @sdispater, that is very helpful! I think between the existence of

@bobwhitelock I agree, having it documented somewhere would be really handy for those of us in charge of the build systems. I did a quick skim but no docs sections jumped out at me as being appropriate. Candidates: the entry on the
Per comments in the thread, there's no bug here: this can be closed.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Issue

poetry install does not populate the filesystem cache, but poetry add does. Aside from being a little surprising, this makes caching on CI machines effectively impossible, since they only ever install and the cache never changes, so they always have to start with a blank slate.

A shell session demonstrating the issue:
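(The original session was not captured in this copy of the issue; the following is an illustrative reconstruction of the reported behaviour. The cache path and the example package are assumptions.)

```sh
rm -rf ~/.cache/pypoetry      # start from a blank cache, as a fresh CI machine would

poetry install                # installs everything from poetry.lock
ls ~/.cache/pypoetry          # still missing/empty: install wrote nothing to the cache

poetry add requests           # resolving a new dependency (arbitrary example package)
ls ~/.cache/pypoetry          # now the cache directory exists and has content
```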