Docker images contain 20MB of deleted /var/lib/apt/lists/ files #90
Comments
Unfortunately, repacking the tarballs is out of the question, so the best
we can do here is fix the comment and file an issue upstream (perhaps
linking to the upstream issue in our comment?)
If this does get fixed upstream, we can leave our "RUN" line in place as a
fallback, since it will then be a no-op zero layer and coalesce just like
"ENV", etc metadata layers do. 👍 (especially since the upstream tarballs
have flip-flopped on this a few times in the past, IIRC)
Thanks for the detailed report!
|
Thank you for the reply!
Out of curiosity, could you elaborate? Is it for reproducibility?
Sounds good on both counts! Would you mind filing upstream? (I'd have to create a launchpad account just for this)
I think it would be good to make upstream regressions for this more obvious. So perhaps either:
|
Partially for reproducibility, but more because Canonical has asked to be the "source of truth" for the bits that become "Ubuntu" on the Docker Hub (I'm merely a proxy for their builds).
Yeah, I'll go ahead and file a launchpad issue -- having this directory be explicitly empty by design in the tarballs they provide isn't something they'd ever committed to doing, so I didn't want to assume that it was a "bug" (I'm not 100% sure whether we're the only consumers of these particular tarballs). If we file an issue and they're warm to the idea (and modify their tarballs), then I'd be more amenable to making it an assertion of emptiness instead. 👍 |
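As a rough illustration of what an "assertion of emptiness" could check, here is a minimal Python sketch. It is not the actual Dockerfile line: the function name `assert_apt_lists_empty` and the `rootfs` parameter are hypothetical, and it assumes that "empty" means no entries at all under the directory.

```python
import os


def assert_apt_lists_empty(rootfs="/"):
    """Fail loudly if var/lib/apt/lists under `rootfs` has any entries.

    Hypothetical helper: a missing directory is treated as empty.
    """
    lists_dir = os.path.join(rootfs, "var/lib/apt/lists")
    leftovers = sorted(os.listdir(lists_dir)) if os.path.isdir(lists_dir) else []
    if leftovers:
        # Show only the first few names to keep the message readable.
        raise AssertionError(f"{lists_dir} is not empty: {leftovers[:5]}")
```

Unlike an unconditional delete, a check like this would turn an upstream regression into a visible build failure rather than a silent 20MB of dead weight.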
Issue filed: https://bugs.launchpad.net/cloud-images/+bug/1699913 |
Thank you for filing that. Since there has been no response to the ticket, I was going to register for a launchpad account and leave another comment explaining further. However, my attempts to log in using a newly created Ubuntu One account failed with "Oops! Sorry, something just went wrong in Launchpad." :-( |
Is there anyone we can ping to get some movement on that launchpad ticket? |
@edmorley I was able to log in to Launchpad with my Ubuntu One account today. I recommend trying again to see if it works now. 👍 |
Ah I forgot to update the above - someone kindly helped fix the issue with my account since then :-) |
Does anyone have any contacts at Ubuntu who might be able to help get some traction in the launchpad issue? |
Looking forward to Ubuntu taking this stuff out of their tarball. Until that happens, a new |
I support @pranas's suggestion. A simple multi-stage build does save ~26MB. By the way, I don't see any drawbacks of a single-layer base image. 🤔 |
Still no reply in that launchpad issue after 10 months sadly. Does anyone have any idea who else we can CC to it? It's not clear where the code for generating the cloud images lives, so I can't even fix it myself 😞 |
Indeed, this is fixed. 👍 |
Only for cosmic+ though! I should see about backporting some of the recent changes to at least bionic |
@mwhudson Using the multi-stage workaround seems pretty good for the non-cosmic images. |
We have no plans to move Ubuntu to use a multi-stage build because they don't preserve permissions enough to be usable for a full distro rootfs. |
Please see the background for this in https://bugs.launchpad.net/cloud-images/+bug/1699913. I can assure you that Docker is very efficient at managing empty layers -- there is no extra transport cost for this line for versions of the tarball which do not have any contents there. In fact, I do intend to adjust that line at some point to verify that the directory is in fact empty instead of forcing it to be so. If you need further help understanding the justification, please refer to that Launchpad issue and the Docker Community Forums, the Docker Community Slack, or Stack Overflow (as the issues here are not intended as a support forum). |
Hi! Before I dive in, I just want to say thank you for maintaining these images :-)
So I happened to notice this section present in the Dockerfiles (generated here):
I agree it's a good idea to remove these files to force later "apt-get update", however the comment about saving space is not correct, since deleting files in a layer after they've already been added won't free up the space. The comment seems to have been copy-pasted from this script (which isn't run across multiple layers so actually does save space).
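To illustrate why a later deletion does not shrink the image, here is a toy model in Python. It is only a sketch, not Docker's actual implementation: the paths and sizes are made up, and a deletion is modelled as a zero-byte marker recorded in the later layer.

```python
MB = 1024 * 1024
WHITEOUT = None  # marker meaning "this path was deleted in this layer"


def stored_size(layers):
    """Bytes stored and transferred: every layer counts in full."""
    return sum(
        size
        for layer in layers
        for size in layer.values()
        if size is not WHITEOUT
    )


def visible_size(layers):
    """Bytes visible in the final filesystem after applying layers in order."""
    merged = {}
    for layer in layers:
        for path, size in layer.items():
            if size is WHITEOUT:
                merged.pop(path, None)  # hidden, but still stored in the earlier layer
            else:
                merged[path] = size
    return sum(merged.values())


# Layer 1: the extracted base archive (illustrative sizes only).
base = {"/bin/sh": 1 * MB, "/var/lib/apt/lists/archive_Packages": 20 * MB}
# Layer 2: a cleanup step that deletes the lists after the fact.
cleanup = {"/var/lib/apt/lists/archive_Packages": WHITEOUT}

print(stored_size([base, cleanup]) // MB)   # 21
print(visible_size([base, cleanup]) // MB)  # 1
```

In this model the cleanup layer hides the files from the final filesystem, but the 20MB still ships as part of the first layer.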
Rather than just correcting the comment, it would be best to avoid the 20MB wasted space in the first place.
The files in /var/lib/apt/lists/ come from the base image archive from Canonical, which is directly extracted using the ADD command's tar file support. This cannot be switched to the curl/untar/delete pattern used in downstream images, since until the base archive is extracted there are no binaries in the image to use. As such, the removal of /var/lib/apt/lists/ needs to occur prior to the Docker build process.

This example shows the Ubuntu 16.04 image being reduced from 118MB to 97.6MB by doing exactly that...
Output:
I guess the question will be whether to store both the original base archive and the processed one in this repo (so people can still use the hashes and compare), or whether to just store the processed one.
Also, I think it's worth pushing the upstream maintainers of these base images to remove the APT lists from them, which will avoid all of this busywork. Perhaps this size-reduction use-case is a more compelling one for them than that outlined here:
https://bugs.launchpad.net/cloud-images/+bug/1685399
Many thanks!