exporting to image takes a lot of time #1704

Closed
saswatac opened this issue Sep 25, 2020 · 16 comments · Fixed by #2181

Comments

@saswatac

Description

After an initial build, I make changes in my code and run a follow-up build. The changes affect only the last couple of layers.
During the build, the cache is used as expected, and only the last two steps in the Dockerfile are run, which is quite fast.

But exporting the image takes extremely long, and exporting the layers takes most of that time.

Steps to reproduce the issue:

  1. Example dockerfile -
ARG BASE_IMAGE

FROM $BASE_IMAGE
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . plugin
RUN pip install -e plugin/

Build command: buildctl build --frontend dockerfile.v0 --opt build-arg:BASE_IMAGE=$BASE_IMAGE --local context=. --local dockerfile=. --output type=image,name=$IMAGE,push=true

  2. Make a change within the context by adding a new file - touch test
  3. Run the build again -
    buildctl build --frontend dockerfile.v0 --opt build-arg:BASE_IMAGE=$BASE_IMAGE --local context=. --local dockerfile=. --output type=image,name=$IMAGE,push=true

Describe the results you received:

Logs of initial build -

[+] Building 306.1s (10/10) FINISHED                                                                                                                                                                       
 => [internal] load build definition from Dockerfile                                                                                                                                                  0.0s
 => => transferring dockerfile: 32B                                                                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                                                                     0.1s
 => => transferring context: 2B                                                                                                                                                                       0.0s
 => [internal] load metadata for ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780                                                                                            1.1s
 => [internal] load build context                                                                                                                                                                     0.0s
 => => transferring context: 1.05kB                                                                                                                                                                   0.0s
 => [1/5] FROM ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780@sha256:511f1789b448769f8c65289eacc68802cbc698e76c5a71c7e791529529f30c62                                     50.1s
 => => resolve ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780@sha256:511f1789b448769f8c65289eacc68802cbc698e76c5a71c7e791529529f30c62                                      0.0s
 => => sha256:093611ee8362e4ab0ba93ebb4369d0cbd697a6969599d7dce72a44de66c2988e 801B / 801B                                                                                                            0.8s
 => => sha256:0e5ecd78f7d47ad2a0ee3f2fb4d6d9152b770ceb678dfbee412eb0a0fdfd5e07 4.99kB / 4.99kB                                                                                                        0.9s
 => => sha256:2619cbbd09bc8aea0d815f76538d60607f617a16e0aa14898da49e90a016f361 210.37kB / 210.37kB                                                                                                    0.0s
 => => sha256:bb866943075d8b9323ecb225b78ba35951d12d2b89cec128c6a32f984bdcc39e 65.05MB / 65.05MB                                                                                                      0.0s
 => => sha256:17e7cc3635760ed0bdcd8fc56aecab51f200772d25210279b65f4af737a1a15f 282B / 282B                                                                                                            0.0s
 => => sha256:ca3cf29a20a36d34ecd6ff2bb40510f1e6cbf1bdd65361a8dd226a162359262a 221.87MB / 221.87MB                                                                                                    0.0s
 => => sha256:51f6bbaddf3422de1bca5c2810a14590157769a50287b086e3f3a4c12285eac1 282.04MB / 282.04MB                                                                                                    0.0s
 => => sha256:d8f1569ddae616589c5a2dabf668fadd250ee9d89253ef16f0cb0c8a9459b322 7.22MB / 7.22MB                                                                                                        0.0s
 => => sha256:577ec257523584a7f6f469c5cd7fde99a61fe202ca616282c519ff8b51b5502e 71.51MB / 71.51MB                                                                                                      0.0s
 => => sha256:fce285a92e7cc2a4f38295ba4397de7009ac961fe847696d8160e131ca5a491f 32.88kB / 32.88kB                                                                                                      1.0s
 => => sha256:8877b914cc13891de8b77a77d47e4041f5e68279678475c5b567035bc0a30fc2 6.35kB / 6.35kB                                                                                                        0.0s
 => => sha256:85386706b02069c58ffaea9de66c360f9d59890e56f58485d05c1a532ca30db1 8.45MB / 8.45MB                                                                                                        0.0s
 => => sha256:ee9b457b77d047ff322858e2de025e266ff5908aec569560e77e2e4451fc23f4 184B / 184B                                                                                                            0.0s
 => => sha256:be4f3343ecd31ebf7ec8809f61b1d36c2c2f98fc4e63582401d9108575bc443a 688.74MB / 688.74MB                                                                                                    0.0s
 => => sha256:ff59175ffb10ac81014c7e75963f131e169af77c54ed5fe3db3721a26a03958d 100.43MB / 100.43MB                                                                                                    0.0s
 => => sha256:2e64fc198b25c4fdbfbdcabe25c5b8be51d3a7503b790311adbd752d05263c7e 31.98kB / 31.98kB                                                                                                      0.9s
 => => sha256:8208f996d5b6d74a0730d4c2534e4de2a4d4791b96829888db85a83bef2a467f 12.69MB / 12.69MB                                                                                                      0.0s
 => => sha256:7ddbc47eeb70dc7f08e410a6667948b87ff3883024eb41478b44ef9a81bf400c 26.69MB / 26.69MB                                                                                                      0.0s
 => => sha256:8c3b70e3904492c753652606df4726430426f42ea56e06ea924d6fea7ae162a1 845B / 845B                                                                                                            0.0s
 => => sha256:45d437916d5781043432f2d72608049dcf74ddbd27daa01a25fa63c8f1b9adc4 162B / 162B                                                                                                            0.0s
 => => sha256:511f1789b448769f8c65289eacc68802cbc698e76c5a71c7e791529529f30c62 9.58kB / 9.58kB                                                                                                        0.0s
 => => sha256:c6a35edcbf60f663afad6be4cde65a3ba51cd4350a3012fbf1bd7eb5e656b17a 184.37kB / 184.37kB                                                                                                    0.0s
 => => sha256:97cf27a8d23588f7b5ebae7a8121fcb86c764155402c9960e1eb83ff8a940934 3.98MB / 3.98MB                                                                                                        0.0s
 => => sha256:34665c999038db027210c8adec8d1125f5ae5eec377d288150a9343446edf191 21.89MB / 21.89MB                                                                                                      0.0s
 => => sha256:d69c80a2beb90bb1c1cdefba8a441b9087117db406ae71b5cce633916224e2c2 34.44kB / 34.44kB                                                                                                      0.0s
 => => sha256:bbfe1e46071b05d22b9a2ac4c8195bb88619f3c30dd9095111f6ac04db04a4c8 144.09kB / 144.09kB                                                                                                    0.0s
 => => sha256:6e130b935f6e8cef47d29cab9157500fb248bcc6f10b46464d88aef96802891d 1.67MB / 1.67MB                                                                                                        0.0s
 => => sha256:aa2caab8a31b2b363a90f5b62798fd4389c490626fb5e80641fd73f30f20ff2e 2.55MB / 2.55MB                                                                                                        0.0s
 => => sha256:ad58b939462673827740aa52109a96869972735f893e80aea0b7867fd249900c 120.11MB / 120.11MB                                                                                                    0.0s
 => => sha256:c1bbdc448b7263673926b8fe2e88491e5083a8b4b06ddfabf311f2fc5f27e2ff 35.36kB / 35.36kB                                                                                                      0.0s
 => => sha256:f8411aa8f601daede995a881ff432d5db704b783498ea6cb3ddfa03e0dfa4b88 3.97MB / 3.97MB                                                                                                        0.0s
 => => sha256:4e9cbc876b85d1e8510a106e2367463c751b12c27cfaea2718ae9cd214575811 42.89MB / 42.89MB                                                                                                      0.0s
 => => sha256:d0297fb6b42e605fe5fd64e42b7868d7d7f3d6c919e5205252ffd0baa31fa5a5 106.86MB / 106.86MB                                                                                                    0.0s
 => => sha256:7805f075af6af4bcf09e0bc1b62db3e5b3f98675d64476c2cd2836513a571379 2.79kB / 2.79kB                                                                                                        0.0s
 => => sha256:3d76c9d9305859dd3899e240b86d1784e3ca1f21a40a7a998b1c1216d2ee1570 62.67MB / 62.67MB                                                                                                      0.0s
 => => sha256:e377fcd0529c576cfa106c3f066ae416503abafed8d61fdc8063bc15ae2577b4 1.22GB / 1.22GB                                                                                                        0.0s
 => => sha256:e0a0b3318a8653396fb8b3f7c01c41345fad19c2558adb64aaa78e0732f2978f 25.09MB / 25.09MB                                                                                                      0.0s
 => => sha256:e98b27c3902ea700e8883fa2b7006df68ce24938cfd8976950e360a336878642 245.64MB / 245.64MB                                                                                                    0.0s
 => => sha256:1b42e1c527376dd92216f6890a417e5061763a5c21ccb2f19ce5fe75c32587fe 252.10kB / 252.10kB                                                                                                    0.0s
 => => sha256:ab7d779e564ec0cdd1af0ddf44cd1c688a0da68a79c3f60469174678e3b80244 2.39MB / 2.39MB                                                                                                        1.1s
 => => sha256:3ef0e804676f1005f8a8287f00389f3af038ff218221429a0494e4f639a15ee7 3.75MB / 3.75MB                                                                                                        1.1s
 => => sha256:29ab2d635982a66b2474861b6931708c726c6b8735ef3ace4d01a5e8b1f92db9 646B / 646B                                                                                                            1.2s
 => => sha256:46af60ee0d3d2974f105c2ea9f753d239d32c1b9f419e87cfc89cfd447609c41 1.23MB / 1.23MB                                                                                                        1.0s 
 => => sha256:ba325c2c5a1d419dc53de6ccb3ee315ba1713a12e6e4bf10468144e8d06e7e1d 9.47kB / 9.47kB                                                                                                        1.3s
 => => sha256:867687c4b777e19decf3fdede297cfd4188f2ca6b742586784df182301413309 301.26MB / 301.26MB                                                                                                    6.4s
 => => sha256:497f8b1df3a82a854f8c288dff5c1ad09041534236bf35cab1c2aa626fda0ec2 9.47kB / 9.47kB                                                                                                        1.3s
 => => sha256:2e64fc198b25c4fdbfbdcabe25c5b8be51d3a7503b790311adbd752d05263c7e 31.98kB / 31.98kB                                                                                                      0.9s
 => => unpacking ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780@sha256:511f1789b448769f8c65289eacc68802cbc698e76c5a71c7e791529529f30c62                                   43.4s
 => [2/5] COPY requirements.txt .                                                                                                                                                                     0.1s
 => [3/5] RUN pip install --no-cache-dir -r requirements.txt                                                                                                                                          4.3s
 => [4/5] COPY . plugin                                                                                                                                                                               0.3s
 => [5/5] RUN pip install -e plugin/                                                                                                                                                                  2.9s
 => exporting to image                                                                                                                                                                              247.1s
 => => exporting layers                                                                                                                                                                             244.8s
 => => exporting manifest sha256:4aef12ff37b0bdd9bc5fa4f3a0af53f6fe3f6ff6eec7558193ff608f40c2314b                                                                                                     0.0s
 => => exporting config sha256:4ca91380fa7e23d455bfb0f01ca3d344ca1dc4bc8d00dcdba4becf3efe1ec450                                                                                                       0.0s
 => => pushing layers                                                                                                                                                                                 1.1s
 => => pushing manifest for neuralconceptncstest.azurecr.io/loss_plugin:latest 

Logs of the second build -

[+] Building 93.5s (10/10) FINISHED                                                                                                                                                                        
 => [internal] load build definition from Dockerfile                                                                                                                                                  0.1s
 => => transferring dockerfile: 197B                                                                                                                                                                  0.0s
 => [internal] load .dockerignore                                                                                                                                                                     0.0s
 => => transferring context: 2B                                                                                                                                                                       0.0s
 => [internal] load metadata for ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780                                                                                            0.6s
 => [internal] load build context                                                                                                                                                                     0.0s
 => => transferring context: 1.46kB                                                                                                                                                                   0.0s
 => [1/5] FROM ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780@sha256:511f1789b448769f8c65289eacc68802cbc698e76c5a71c7e791529529f30c62                                      0.0s
 => => resolve ncstest.azurecr.io/notebook:0.8-ncs-main-dev-2.8.0-release-v2-8.193592780@sha256:511f1789b448769f8c65289eacc68802cbc698e76c5a71c7e791529529f30c62                                      0.0s
 => CACHED [2/5] COPY requirements.txt .                                                                                                                                                              0.0s
 => CACHED [3/5] RUN pip install --no-cache-dir -r requirements.txt                                                                                                                                   0.0s
 => [4/5] COPY . plugin                                                                                                                                                                               0.2s
 => [5/5] RUN pip install -e plugin/                                                                                                                                                                  2.8s
 => exporting to image                                                                                                                                                                               89.7s
 => => exporting layers                                                                                                                                                                              85.5s 
 => => exporting manifest sha256:a1e616afcb71d8784b54e1fdefcaaa7e36f4ef72cd479a795ae90ba65b262d06                                                                                                     0.0s 
 => => exporting config sha256:38e146bd09b7eb5c011bed694899c5082e65eec4c7b529ae6bcaeb037f9e7761                                                                                                       0.0s 
 => => pushing layers                                                                                                                                                                                 1.0s
 => => pushing manifest for neuralconceptncstest.azurecr.io/test:latest                                                                                                                               2.6s

Describe the results you expected:

Why does exporting the layers take this much time (244 sec in the initial build, 89 sec in the second build)?
Are there any options I am missing that could help speed it up?

version

buildkitd --version
buildkitd github.com/moby/buildkit 22e230744171b4442101731951bbbecf97796ea5

buildkitd --version
buildkitd github.com/moby/buildkit v0.7.2 22e230744171b4442101731951bbbecf97796ea5

Any other relevant information:

I am running the rootless buildkit daemon image in Kubernetes as a sidecar container. buildctl is run from the development container in the pod.

@tonistiigi
Member

This is likely due to the containerd differ. Post a full reproducer so it can be verified. It might depend on the time precision of your backing file system. If it is very low, the containerd differ falls back to comparing files by data.
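
As a rough illustration of the fallback described above, here is a simplified model in Python. It is not the containerd code: the function name `classify` and its return strings are made up for this sketch, and it assumes only size and modification time are considered for regular files. The point it shows is that when both timestamps have a zero nanosecond part, metadata cannot be trusted and the file contents have to be read.

```python
NANOS_PER_SEC = 1_000_000_000


def classify(size_a, mtime_ns_a, size_b, mtime_ns_b):
    """Decide how two regular files can be compared (hypothetical helper).

    Returns "different", "same", or "compare-content". The last value means
    metadata is inconclusive and the file data itself must be read.
    """
    if size_a != size_b:
        return "different"
    sec_a, nano_a = divmod(mtime_ns_a, NANOS_PER_SEC)
    sec_b, nano_b = divmod(mtime_ns_b, NANOS_PER_SEC)
    if sec_a != sec_b:
        return "different"
    if nano_a == 0 and nano_b == 0:
        # Both timestamps may have been truncated to whole seconds, so equal
        # mtimes prove nothing. Empty files are trivially equal.
        return "same" if size_a == 0 else "compare-content"
    return "same" if nano_a == nano_b else "different"
```

With values taken from `os.stat(path).st_size` and `os.stat(path).st_mtime_ns`, every pair of unchanged files whose mtimes end in `.000000000` lands in the "compare-content" branch, which is the slow path.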

@saswatac
Author

The backing filesystem is ext4 -

df -T
Filesystem           Type       1K-blocks      Used Available Use% Mounted on
overlay              overlay    129901008  19360148 110524476  15% /
tmpfs                tmpfs          65536         0     65536   0% /dev
tmpfs                tmpfs        7169488         0   7169488   0% /sys/fs/cgroup
/dev/sda1            ext4       129901008  19360148 110524476  15% /dev/termination-log
/dev/sda1            ext4       129901008  19360148 110524476  15% /etc/resolv.conf
/dev/sda1            ext4       129901008  19360148 110524476  15% /etc/hostname
/dev/sda1            ext4       129901008  19360148 110524476  15% /etc/hosts
shm                  tmpfs          65536         0     65536   0% /dev/shm
/dev/sdc             ext4       103081248  14127212  88937652  14% /home/user/.local/share/buildkit
tmpfs                tmpfs        7169488        12   7169476   0% /run/secrets/kubernetes.io/serviceaccount
tmpfs                tmpfs        7169488         0   7169488   0% /proc/acpi
tmpfs                tmpfs          65536         0     65536   0% /proc/kcore
tmpfs                tmpfs          65536         0     65536   0% /proc/keys
tmpfs                tmpfs          65536         0     65536   0% /proc/timer_list
tmpfs                tmpfs          65536         0     65536   0% /proc/sched_debug
tmpfs                tmpfs        7169488         0   7169488   0% /proc/scsi
tmpfs                tmpfs        7169488         0   7169488   0% /sys/firmware

This is running on Azure Kubernetes Service, and the file system is on an Azure persistent disk volume mounted at /home/user/.local/share/buildkit . I also tried removing the persistent volume mount at /home/user/.local/share/buildkit , but got similar results. The node on which it is running is Ubuntu 16.04.7 LTS, kernel 4.15.0-1093-azure.

@saswatac
Author

saswatac commented Oct 28, 2020

@tonistiigi It is indeed the case that the containerd differ is falling back to comparing files by data. However, the issue is not with the backing filesystem. I notice that most files in the image have 0 nanoseconds in their last-modified times.

To reproduce, you can use this dockerfile -

FROM tensorflow/tensorflow:latest-gpu-jupyter
COPY . .

Build -
sudo buildctl build --frontend dockerfile.v0 --local context=. --local dockerfile=. --output type=image,name=tensorflow/tensorflow:test

After the initial build, if I create a new file (touch test) and build again, then all the files are compared when exporting the layers (most or all of them by data), and as a result the export takes a lot of time.

If we look at modify times inside the docker image (docker run --rm -it tensorflow/tensorflow:latest-gpu-jupyter bash), for example -

root@faf2479a2b9d:/tf# stat /usr/local/lib/python3.6/dist-packages/tensorflow
  File: /usr/local/lib/python3.6/dist-packages/tensorflow
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 8dh/141d	Inode: 15436378    Links: 13
Access: (2755/drwxr-sr-x)  Uid: (    0/    root)   Gid: (   50/   staff)
Access: 2020-10-28 18:38:14.909599126 +0000
Modify: 2020-09-29 00:25:57.000000000 +0000
Change: 2020-10-28 18:38:14.869599156 +0000
 Birth: -

The modify times have 0 nanoseconds.
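
To check how widespread this is in an unpacked image, a small script along these lines could count the entries whose modification time has a zero nanosecond part. This is only a sketch: the helper names `is_truncated` and `count_truncated` are made up here, and it assumes Python 3 is available inside the container or wherever the root filesystem is mounted.

```python
import os
from typing import Tuple

NANOS_PER_SEC = 1_000_000_000


def is_truncated(mtime_ns: int) -> bool:
    """True when the timestamp has no sub-second part."""
    return mtime_ns % NANOS_PER_SEC == 0


def count_truncated(root: str) -> Tuple[int, int]:
    """Return (truncated, total) over all files and directories below root."""
    truncated = total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            # lstat so that symlinks are inspected themselves, not their targets
            st = os.lstat(os.path.join(dirpath, name))
            total += 1
            if is_truncated(st.st_mtime_ns):
                truncated += 1
    return truncated, total
```

For example, `count_truncated("/usr/local/lib/python3.6/dist-packages")` run inside the image would report how many of the entries there fall into the zero-nanosecond case shown by the stat output above.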

@tonistiigi
Member

So what is the time precision of the files you create during the build? Can you confirm these files have nanoseconds in your fs?

Unfortunately, I don't see any other solution for this than to break away from the containerd differs and implement our own high-performance ones. Not an easy task.

@saswatac
Author

In the example

FROM tensorflow/tensorflow:latest-gpu-jupyter
COPY . .

nothing much is happening. I can confirm the files on my host filesystem have nanoseconds.

I do not build tensorflow/tensorflow:latest-gpu-jupyter . I am guessing it is built with docker.

When building with docker, I do notice, however, that the files lose their nanosecond precision (probably because the build context is copied through tar during docker build?).

If the base image is built with docker, and its files do not have nanoseconds, should it actually matter?

I added some additional debug logs in

func doubleWalkDiff(ctx context.Context, changeFn ChangeFunc, a, b string) (err error) {

and it seems to me that it diffs the entire tree from the root. The diff is supposed to happen between the upper and lower layers, but is it possible that it is happening between the merged layer and the lower layer?

If we compare the lower and upper layers, and the lower layer has files with 0 nanoseconds that are not overridden in the upper layer, then it should not be a problem?

@tonistiigi
Member

The differ is not overlay-specific (it doesn't know what is upper/lower). It just compares two mountpoints.

@saswatac
Author

OK. I don't have a lot of context, but it does not seem right that the entire filesystem gets read and compared.
I added a debug line after

logrus.Debugf("compare %s and %s: %s", f1.path, f2.path, k)

If I have a dockerfile like

FROM tensorflow/tensorflow:latest-gpu-jupyter
COPY foo /tmp

no directory except /tmp is modified from the base image, yet in the logs I can see lines like

DEBU[0006] Using double walk diff for /tmp/containerd-mount829474460 from /tmp/containerd-mount017619953 
DEBU[0006] compare /.local and /.local: modify          
DEBU[0006] compare /bin and /bin: modify                
DEBU[0006] compare /bin/bash and /bin/bash: modify      
...
DEBU[0009] compare /usr/local/lib/python3.6/dist-packages and /usr/local/lib/python3.6/dist-packages: modify 
...
DEBU[0008] compare /usr/local/cuda and /usr/local/cuda: modify

It compares the entire tree, and all these comparisons are done by file content because of the missing nanoseconds in the files / directories.
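
The cost pattern seen in these logs can be modelled with a small sketch. This is an illustrative simplification, not the continuity implementation (which walks both trees in lockstep). The function name `double_walk` and the "compare" label are made up here. It shows that the number of per-path comparisons depends on the size of the whole tree, not on how much changed.

```python
import os


def double_walk(a: str, b: str):
    """Yield (kind, relative_path) for every path found in either tree.

    kind is "add" (only in b), "delete" (only in a), or "compare" (in both,
    so the two entries would have to be checked against each other).
    """
    def entries(root):
        found = set()
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                found.add(os.path.relpath(os.path.join(dirpath, name), root))
        return found

    in_a, in_b = entries(a), entries(b)
    for rel in sorted(in_a | in_b):
        if rel not in in_a:
            yield "add", rel
        elif rel not in in_b:
            yield "delete", rel
        else:
            yield "compare", rel
```

Adding a single file to a copy of a large tree yields one "add" and one "compare" for every other path. If each of those comparisons has to read file data because the timestamps carry no nanoseconds, the export time grows with the size of the base image.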

@gjymosha

@saswatac Hi saswatac, is there any solution now? I encountered the same problem.

@saswatac
Author

@gjymosha Unfortunately not. If you are running the build in a cloud environment, you may want to check that disk I/O is not being throttled. For me, increasing the disk size helped improve the times, but it is still not satisfactory.

@tonistiigi I looked at the code a bit further. I am wondering whether, instead of doing a double walk diff in https://github.com/containerd/continuity/blob/master/fs/diff.go#L111 , we can implement https://github.com/containerd/continuity/blob/master/fs/diff_unix.go#L33 and do diffdirchanges . This would avoid all the unnecessary comparisons by file content. Is there any particular reason it was not implemented?
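
The idea of walking only the changed directory can be sketched as follows. This is a hypothetical illustration of the general approach, not the continuity code or the fix that was eventually merged. The function name `upper_dir_changes` is invented, it assumes the overlayfs convention that a whiteout is a character device with device number 0/0, and it ignores opaque directories.

```python
import os
import stat


def upper_dir_changes(upper: str, lower: str):
    """Derive changes by walking only the overlay upper directory.

    Returns a list of (kind, relative_path) with kind in
    {"add", "modify", "delete"}. Paths that exist only in lower are never
    visited, so unchanged files are neither stat-ed nor read.
    """
    changes = []
    for dirpath, dirnames, filenames in os.walk(upper):
        for name in sorted(dirnames + filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, upper)
            st = os.lstat(full)
            if stat.S_ISCHR(st.st_mode) and st.st_rdev == 0:
                # overlayfs whiteout: the path was deleted in this layer
                changes.append(("delete", rel))
            elif os.path.lexists(os.path.join(lower, rel)):
                changes.append(("modify", rel))
            else:
                changes.append(("add", rel))
    return changes
```

In this model the amount of work is proportional to what the build step wrote, which is why it would avoid the content comparisons for untouched base-image files. A directory that exists in both layers is reported as "modify" here even if only a child changed. A real implementation would need to treat that case more carefully.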

@gjymosha

@saswatac Thank you very much for your reply. I have another question: since the snapshots generated in the build steps are already changeset files, why not use the snapshot content directly to generate a new layer, instead of recalculating the new content by comparing the two mountpoints? @tonistiigi

@Xplouder

Xplouder commented Feb 5, 2021

I'm experiencing the same problem.

Exporting inside a Gitlab runner in Kubernetes:
[screenshot not preserved]

Looks like the job is stuck. At the moment it's over 15 min, of which probably 14 are from the exporting phase...

UPDATE: the job timed out at 60 min. Tested it with export types registry and local.

@ktock
Collaborator

ktock commented Jun 16, 2021

FYI: opened a draft PR to solve this at #2181.

@saswatac
Author

saswatac commented Oct 9, 2021

I just noticed that this did not make it into the 0.9.1 release. When is it planned to include this fix?

@tonistiigi
Member

This is too big to pick for a patch release and will come with v0.10.

@Shaked

Shaked commented Nov 11, 2021

Is there any workaround for this issue until 0.10 release? For some reason this only happens to us with one specific build while all others - on the same platform - work fast and as expected.

@av1v3k

av1v3k commented Feb 15, 2024

Came here to check for the same issue, which occurred in a multi-stage build.
It takes a lot of time.

nerdctl build -t cp-test:v0.1 --build-arg="environment=dev" --build-arg="node_environment=production" --target builder .

Above is the command I used.

My Dockerfile has a 2-stage build. So, in order to debug the 1st stage, I executed the above command.
