Alpine images too big (should use multi-staged build) #339
Comments
Hi @harmv, thank you for raising this! The upstream postgres image is also not multi-staged (https://github.com/docker-library/postgres/blob/master/13/alpine/Dockerfile). imho: there must be some way to reduce the size, so if you have a proof-of-concept PR, I'd be happy to look into it.
Could you please provide me with a more detailed test description for this?
$ docker pull docker.io/postgis/postgis:13-3.3-alpine
13-3.3-alpine: Pulling from postgis/postgis
Digest: sha256:dea154c9000546b9bcc07cf55646563d7a7401637083c82867472b130bed27b8
Status: Image is up to date for postgis/postgis:13-3.3-alpine
docker.io/postgis/postgis:13-3.3-alpine
$ docker run -it --rm docker.io/postgis/postgis:13-3.3-alpine sh
/ # gcc
sh: gcc: not found
/ # autoconf
sh: autoconf: not found
/ # automake
sh: automake: not found
/ # perl
sh: perl: not found
/ # apk info
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.17/main: No such file or directory
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.17/community: No such file or directory
alpine-baselayout-data
musl
busybox
busybox-binsh
alpine-baselayout
alpine-keys
ca-certificates-bundle
libcrypto3
libssl3
ssl_client
zlib
apk-tools
scanelf
musl-utils
libc-utils
xz-libs
libgcc
libstdc++
libuuid
libcom_err
libffi
libverto
krb5-conf
keyutils-libs
krb5-libs
gdbm
libsasl
libldap
ncurses-terminfo-base
ncurses-libs
libedit
libxml2
libgpg-error
libgcrypt
libxslt
zstd-libs
readline
tzdata
icu-libs
icu-data-full
bash
su-exec
zstd
nss_wrapper
.postgresql-rundeps
ca-certificates
openexr
libbz2
brotli-libs
nghttp2-libs
libcurl
cfitsio
libdeflate
libexpat
freexl
geos
giflib
libsz
hdf5
hdf5-cpp
aom-libs
libde265
numactl
x265-libs
libheif
libjpeg-turbo
json-c
kealib
minizip
liburiparser
libkml
mariadb-connector-c
hdf5-hl
netcdf
unixodbc
libtirpc-conf
libtirpc
ogdi
openjpeg
pcre2
libpng
freetype
fontconfig
lcms2
libwebp
tiff
poppler
libpq
sqlite-libs
proj
qhull
libxau
libmd
libbsd
libxdmcp
libxcb
libx11
libxext
libxrender
pixman
cairo
libgeotiff
lz4-libs
librttopo
libspatialite
librasterlite2
xerces-c
gdal
llvm15-libs
pcre
protobuf-c
.postgis-rundeps
/ #
Hi @harmv, I think I found the reason for the increase in the Alpine image size. In the build log:
The dive CI tool also detects it:
$ CI=true dive docker.io/postgis/postgis:13-3.3-alpine
Using default CI config
Image Source: docker://docker.io/postgis/postgis:13-3.3-alpine
Fetching image... (this can take a while for large images)
Analyzing image...
efficiency: 76.2937 %
wastedBytes: 263481244 bytes (264 MB)
userWastedPercent: 47.9720 %
Inefficient Files:
Count Wasted Space File Path
2 263 MB /usr/lib/libLLVM-15.so
2 428 kB /etc/ssl/certs/ca-certificates.crt
3 276 kB /lib/apk/db/installed
2 56 kB /usr/local/share/postgresql/postgresql.conf.sample
3 43 kB /lib/apk/db/scripts.tar
2 2.4 kB /etc/passwd
2 1.4 kB /etc/group
2 874 B /etc/shadow
3 414 B /lib/apk/db/triggers
3 282 B /etc/apk/world
2 86 B /etc/shells
2 0 B /usr/lib/llvm15/lib/libLLVM-15.so
2 0 B /usr/bin/unlzma
2 0 B /usr/bin/factor
2 0 B /usr/bin/uniq
2 0 B /usr/bin/unexpand
2 0 B /usr/bin/tty
2 0 B /usr/bin/truncate
2 0 B /usr/bin/tr
2 0 B /usr/bin/timeout
2 0 B /usr/bin/test
2 0 B /usr/bin/tee
2 0 B /usr/bin/tail
2 0 B /usr/bin/tac
2 0 B /usr/bin/sum
3 0 B /usr/bin/strings
2 0 B /usr/bin/split
2 0 B /usr/bin/sort
2 0 B /usr/bin/shuf
2 0 B /usr/bin/shred
2 0 B /usr/bin/sha512sum
2 0 B /usr/bin/sha256sum
2 0 B /usr/bin/sha1sum
2 0 B /usr/bin/seq
2 0 B /usr/bin/realpath
2 0 B /usr/bin/readlink
2 0 B /usr/bin/printf
2 0 B /usr/bin/paste
2 0 B /usr/bin/od
2 0 B /usr/bin/nproc
2 0 B /usr/bin/nohup
2 0 B /usr/bin/nl
2 0 B /usr/bin/mkfifo
2 0 B /usr/bin/md5sum
2 0 B /usr/bin/lzma
2 0 B /usr/bin/lzcat
2 0 B /usr/bin/install
2 0 B /usr/bin/id
2 0 B /usr/bin/hostid
2 0 B /usr/bin/head
2 0 B /usr/bin/fold
2 0 B /usr/bin/unlink
2 0 B /usr/bin/expr
2 0 B /usr/bin/expand
2 0 B /usr/bin/env
2 0 B /usr/bin/du
2 0 B /usr/bin/dirname
2 0 B /usr/bin/cut
2 0 B /usr/bin/comm
2 0 B /usr/bin/cksum
2 0 B /usr/bin/basename
2 0 B /usr/bin/awk
2 0 B /usr/bin/[
2 0 B /tmp
2 0 B /lib/apk/exec
2 0 B /usr/bin/unxz
2 0 B /usr/bin/wc
2 0 B /lib/apk/db/lock
2 0 B /usr/bin/who
2 0 B /usr/bin/whoami
2 0 B /usr/bin/xzcat
2 0 B /bin/uname
2 0 B /bin/true
2 0 B /bin/touch
3 0 B /bin/tar
2 0 B /bin/sync
2 0 B /bin/stty
2 0 B /usr/bin/yes
2 0 B /bin/sleep
2 0 B /bin/rmdir
2 0 B /bin/rm
2 0 B /bin/pwd
2 0 B /bin/printenv
2 0 B /bin/nice
2 0 B /bin/mv
2 0 B /bin/mktemp
2 0 B /bin/mknod
2 0 B /bin/mkdir
2 0 B /bin/ls
2 0 B /bin/ln
2 0 B /bin/false
2 0 B /bin/echo
2 0 B /bin/df
2 0 B /bin/dd
2 0 B /bin/date
2 0 B /bin/cp
2 0 B /bin/chown
2 0 B /bin/chmod
2 0 B /bin/chgrp
2 0 B /bin/cat
2 0 B /bin/base64
2 0 B /usr/sbin/chroot
2 0 B /usr/lib/libLLVM-15.0.6.so
2 0 B /bin/stat
Results:
FAIL: highestUserWastedPercent: too many bytes wasted, relative to the user bytes added (%-user-wasted-bytes=0.4797200053524177 > threshold=0.1)
SKIP: highestWastedBytes: rule disabled
FAIL: lowestEfficiency: image efficiency is too low (efficiency=0.7629371198317819 < threshold=0.9)
Result:FAIL [Total:3] [Passed:0] [Failed:2] [Warn:0] [Skipped:1]
For now, I can't think of a better solution than to wait until the base image (postgres:15-alpine3.17) is updated.
Lesson learned: after each docker build we should run the dive CI tool as a check, so that similar problems are detected as early as possible.
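The "run dive after each build" check could be sketched roughly as below. This is a minimal sketch, not the project's actual CI setup: `check_efficiency` is a hypothetical helper name, and it assumes dive's CI output contains a line of the form `efficiency: 76.2937 %`, as in the log above.

```shell
# Hypothetical helper: reads dive CI output on stdin and succeeds only if
# the reported efficiency (in percent) is at least the given threshold.
# Exit codes: 0 = ok, 1 = below threshold, 2 = no efficiency line found.
check_efficiency() {
  threshold="$1"
  awk -v t="$threshold" '
    # Match only the summary line, e.g. "efficiency: 76.2937 %"
    /^[[:space:]]*efficiency:/ { found = 1; value = $2 }
    END {
      if (!found) exit 2
      exit ((value + 0 >= t + 0) ? 0 : 1)
    }
  '
}

# Usage, assuming dive is installed:
#   CI=true dive docker.io/postgis/postgis:13-3.3-alpine | check_efficiency 90
```

dive already applies its own pass/fail thresholds in CI mode, so a wrapper like this is only useful if a build script wants a separate, explicit limit.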
Hm... Reading through your comments, I now realize that my initial report is not correct. That should be just as good as a multi-staged build. So maybe a simpler solution is possible.
Interestingly, it might be the case that libLLVM is not needed at all at runtime. As a test I added
to the RUN step in the Dockerfile of postgis. After that I can start the db just fine and run my own project's GIS-related unit tests against it. Is this a bug in the postgres Dockerfile?
There is some trickery to manually filter out perl, python & tcl. Is such a thing also needed for llvm15-libs? If so, that would save double the space. (The reported problem was caused by libLLVM being counted twice in the final image; with this change it would not appear at all, instead of appearing once.) [edit] It seems LLVM is required for JIT. Whether you need that in the Alpine image is debatable.
Suggested to postgres to drop llvm from runDeps. See: docker-library/postgres#1044
Anyway, I think postgis would benefit from a multi-stage build in its Dockerfile. I'll take a stab at it this week and provide you with an example (in 10 days or so). Stay tuned...
imho: However, a
imho: For example, might there be unanticipated secondary effects that could affect stability? (Although there is already a thorough PostGIS test at build time.) Currently, only the -master image is multi-stage based, but it does not have such a high stability requirement.
Yeah, I agree. If the dependency of postgresql on llvm is to be cut, that should be done upstream (postgres) and not here (and they probably won't do that).
I gave this some thought. So, adding a multi-staged build should also be preferable from a stability point of view (besides the size issues).
Yeah, agreed
Ah nice, so there is a working example of a similar multi-staged build. I'll take a look.
Please check out my attempt (I edited the Dockerfile directly, not the template): harmv@4283d74
pro:
Size
libLLVM version is correct now
Is this something you would consider taking?
@harmv: Now the latest `postgis/postgis:13-3.3-alpine` is 425MB and your proposal is 475MB,
Hm..
I'm confused about the 50MB diff too, though. I can reproduce it.
Something is wrong in my attempt, for sure.
your trick
Ok, I got it. This is caused by the line
So in order to work around this, I need a more specific COPY command, not all of /usr/local. Can you give me some hints? Which contents of the postgis build/install should be copied? I am a bit surprised about the versioning in docker images, though...
There were 2 big upgrades last weekend, so the
And there is a weekly cron (
Not really, it's strange and unfamiliar territory for me too.
I managed to get the size down to 425 MB by adding a workaround for docker issue 21950. While doing that I encountered docker issue 45015. See: harmv@82325bd
End result: (13.3-alpine)
My initial bug report was for the huge size. Investigation showed that this was caused by 2 issues:
Given that:
I suggest we just close this issue and leave everything as is.
@harmv I'll close the issue. Thanks for the investigation efforts!
I noticed that the docker-postgis alpine images are far too big,
e.g.:
postgis/postgis 13-3.3-alpine a21d01173429 2 weeks ago 556MB
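When comparing sizes like the one above across builds, it can help to work from exact byte counts rather than the rounded column in the image list. A minimal sketch, assuming decimal megabytes (10^6 bytes); `size_diff_mb` is a hypothetical helper and the second image tag in the usage comment is a made-up local build name:

```shell
# Hypothetical helper: prints the difference between two byte counts
# (second minus first) in whole decimal megabytes.
size_diff_mb() {
  echo $(( ($2 - $1) / 1000000 ))
}

# Usage, assuming docker is installed and both images exist locally:
#   a=$(docker image inspect --format '{{.Size}}' postgis/postgis:13-3.3-alpine)
#   b=$(docker image inspect --format '{{.Size}}' my-postgis:multistage)
#   size_diff_mb "$a" "$b"
```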
Cause
The Dockerfile does not use a multi-staged build.
Steps to reproduce:
Actual result
The final image contains, besides the required postgres & postgis binaries, unneeded build tooling (g++, gcc, clang-dev, perl, autoconf, automake, etc.).
Expected
The final image is much smaller.
Only the required binaries are in the final image. (postgres + postgis)
The intermediate build tooling (compilers etc.) is not in the final image.