
Improve performance of CI system #1407

Closed
consideRatio opened this issue Jul 16, 2021 · 32 comments · Fixed by #1703
Labels
type:Arm (Issue specific to arm architecture) · type:Enhancement (A proposed enhancement to the docker images)

Comments

@consideRatio
Collaborator

consideRatio commented Jul 16, 2021

We have crippled our CI system's performance after introducing support for arm64-based images. A key reason is that emulating arm64 builds on the amd64-based runners GitHub provides is far slower, and on top of that we now end up building base-notebook and minimal-notebook for arm64 in sequence alongside the other images.

I'm not fully sure how we should optimize this in the long run, but let's work under the assumption that we will have high-performance self-hosted arm64-based GitHub Actions runners that can work in parallel with the amd64 runners. Below is an overview of a very optimized system, where several parts can be implemented separately.

  1. Nightly builds
    We have nightly builds with :nightly-amd64 and :nightly-arm64 tags.

  2. amd64 / arm64 in parallel
    All tests for amd64 and arm64 run in parallel, relying on nightly-amd64 and nightly-arm64 caches

  3. Images in parallel where possible
    All tests for individual images are run in a dedicated job that needs its base image job to complete.

    Some images can run in parallel:

    • base
    • minimal
    • scipy | r
    • tensorflow | datascience | pyspark
    • all-spark
  4. Avoid rebuilds when merging
    Tests finish by updating a github container registry associated with a PR. By doing so, our publishing job on merge to master can opt to use the images as they were built during tests if they are considered fresh enough.

  5. Parallel manifest creation
    Merge to default branch triggers manifest creation jobs on both amd64 and arm64. If we opt to not optimize using step 4 then this could also build fresh images using nightly cache first.

  6. Combine manifests into one before pushing to the official registry
    Merge to default branch triggers a job that pulls both the amd64 image and the arm64 image and defines a combined docker manifest, which is then pushed to our official container registry. I think this could be done with something like docker manifest create <name of combined image> <amd64 only image> <arm64 only image>, but @manics knows more and I lack experience with this (a rough sketch follows below).
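
A rough sketch of what step 6 could look like as a workflow step (image and tag names are placeholders, and this assumes the per-arch images have already been pushed):

      - name: Combine per-arch images into one manifest
        run: |
          # Create a manifest list referencing the already-pushed
          # single-arch images, then push only that metadata.
          docker manifest create jupyter/base-notebook:latest \
            jupyter/base-notebook:amd64 \
            jupyter/base-notebook:arm64
          docker manifest push jupyter/base-notebook:latest

No layers are rebuilt or re-uploaded by such a step; docker manifest only assembles and pushes the manifest list.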

Standalone performance issue

This standalone issue will go away if we adopt better strategies like the above, and it isn't critical to fix on its own I'd say. But currently we rebuild minimal-notebook without using the cache during push-multi, even though push-multi for base-notebook has already run. I think it is because we re-tag jupyter/base-notebook:latest.

@consideRatio added the type:Enhancement and type:Arm labels on Jul 16, 2021
@mathbunnyru
Member

Related: #1203

@mathbunnyru
Member

Added this issue to the milestone.

@mathbunnyru
Member

Another idea is to improve PR speed - build ARM images only in master, or when the commit message contains some pre-defined string.
This way PRs will run at the same speed as before (around 25 minutes).

Also, we might want to use actions/cache.

@manics
Contributor

manics commented Jul 20, 2021

Could you move the multiarch build into a separate GitHub workflow? You'd then get multiple CI statuses on PRs, and could choose to merge after the amd64 job passes instead of waiting for all jobs?

@consideRatio
Collaborator Author

consideRatio commented Jul 20, 2021

@manics I agree that is an important optimization - note that you only need separate jobs, not separate workflows (a single workflow can contain several jobs). I suggest we both separate amd64 from arm64 (optimization 2) and separate images from each other (optimization 3).
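
For illustration, a minimal sketch of what optimizations 2 and 3 could look like as separate jobs in one workflow (job names and make targets are illustrative, and the base image would still need to be shared between dependent jobs via a registry or cache):

    jobs:
      amd64-base:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
          - run: make build/base-notebook
      amd64-minimal:
        # Waits for its base image job, but runs independently of the arm64 jobs.
        needs: amd64-base
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
          - run: make build/minimal-notebook
      arm64-base:
        # Runs in parallel with the amd64 jobs; could later target a
        # self-hosted arm64 runner instead of emulation.
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v2
          - run: make build/base-notebook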

@consideRatio
Collaborator Author

I've ordered 7 RPi computers and look to make them self-hosted arm64 based runners for us in the Jupyter ecosystem where needed.

@mathbunnyru
Member

I've ordered 7 RPi computers and look to make them self-hosted arm64 based runners for us in the Jupyter ecosystem where needed.

Wow, nice! :)

I also wanted to create some VMs on ARM to use it as self-hosted runners, but if you're already on it, that's great 👍

@mathbunnyru
Member

@consideRatio did you have any luck with arm runners?

@consideRatio
Collaborator Author

@mathbunnyru I have a k8s cluster running on the 7 Raspberry Pi computers etc, but I've still failed to deploy the GitHub runner software on k8s. I'm left quite clueless about what is going on with that and have failed to debug it.

actions/actions-runner-controller#732

@mathbunnyru
Member

I see. Unfortunately, I have almost zero experience with k8s and absolutely zero experience with self-hosted runners, so I can't help you right now :(

@mathbunnyru
Member

@consideRatio I noticed that the build times in master are really slow.
It seems that for multi-arch images the push step rebuilds everything from scratch, so we do the same thing (building the image) twice.
Could you please take a look?

The latest master build is still running - the build step took only 1h 16m 43s, while the push has already been running for almost an hour and has not finished yet.
https://github.com/jupyter/docker-stacks/runs/3431621105

@consideRatio
Collaborator Author

consideRatio commented Aug 30, 2021

The latest master build is still running - the build step took only 1h 16m 43s, while the push has already been running for almost an hour and has not finished yet.

Hmmm, looking into this a bit, my guess is that the cache for layers grows too large and that causes it to be discarded along the way, forcing a rebuild or similar. This guess is supported by the fact that previous builds successfully used a cache for at least the base-notebook image, and that suddenly stopped working when more images were added to the build in recent PRs.

I have a few ideas on what we could do:

  1. We could increase the available disk space for the runner.
  2. I'm not sure about our use of the --rm and --force-rm flags when we publish and with docker buildx build.
    Are they supported by docker buildx build?
    If they work, doesn't that mean that we lose some relevant cache?
  3. We could accept this failure until other performance optimizations make it go away, for example by having more, separate jobs that push to registries in between.

Practically, option 1 can be done like this.

      # Without this our cache may get reset.
      #
      # NOTE: This step needs to run before actions/checkout to not end
      #       up with an empty workspace folder.
      #
      - name: Maximize build space
        uses: easimon/maximize-build-space@b4d02c14493a9653fe7af06cc89ca5298071c66e
        with:
          root-reserve-mb: 51200 # 50 GB
          build-mount-path: /var/lib/docker/tmp # remaining space
          remove-dotnet: "true"
          remove-haskell: "true"
          remove-android: "true"

To try option 2, we would just experiment with removing --rm and --force-rm.

@trallard
Member

trallard commented Feb 4, 2022

Hey folks, I would like to help with optimising the CI 😉

Also - on the matter of arm64, it turns out that you can ask for permanent free access to machines at https://github.com/WorksOnArm/equinix-metal-arm64-cluster - it might be worth submitting a request.

@mathbunnyru
Member

Wow, I didn't know about this project - I will submit a request in a few days, thank you!

@mathbunnyru
Member

mathbunnyru commented Feb 4, 2022

So, I can share my vision of how to make this work.
And, to be honest, I don't see another way.
My solution probably requires one big PR at some point (though I can see how it could be done in several big steps), but it will resolve around 5 huge issues at the same time (maybe even more).

  1. Get native arm runners. This is a must.
    Without native runners, we end up building under QEMU, which means we have to wait much longer and debug some strange arm under qemu behavior.
    I personally know nothing about QEMU, and I believe it makes debugging problems much more difficult.
    We will also be able to build datascience-notebook for arm (it builds fine on my native arm VM, but doesn't under QEMU).

  2. In the main workflow we create a random tag.
    We can also use git hash, to make it easier to debug, though there can be several builds for the same hash.
    We then create two parallel build jobs - one for arm and one for x86.

  3. Build different platforms in parallel.
    We can build our x86 images in under 30 minutes.
    I think that's quite a good result.
    ARM images probably take the same time.

  4. Test these platforms in parallel.
    We will be able to easily test arm images (they are not tested right now at all).

  5. We create manifests, calculate tags, and so on using docker run in parallel.
    Note, these tags are gonna be correct (we won't assume x86 tags for arm images, as we do right now).
    We save these tags/manifests using github artifacts or some kind of shared location, which we can later access from the main workflow.

  6. Remember the random tag from step 2?
    We push these images in our GHCR with this tag and arch prefix in parallel.
    For example, the amd64 runner will push jupyter/base-notebook:amd64-randomtag, jupyter/minimal-notebook:amd64-randomtag and so on, while the aarch64 runner will push jupyter/base-notebook:aarch64-randomtag, jupyter/minimal-notebook:aarch64-randomtag and so on (randomtag is the tag from step 2; a sketch of steps 6 and 7 follows after the notes below).

  7. In the main workflow we wait till these two parallel steps finish.
    We try to merge the tags between arm and x86: if a tag matches across architectures, we create a multi-platform image; if not, we just push the tag as-is.
    We use docker manifest for that.
    https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/
    We're gonna use the hard way, because we build in different jobs and we only want to orchestrate manifests.
    We do not build, run, or otherwise modify our images at this point.
    There is room for improvement - if the same tag existed for a different arch in the past, we would have to do some extra work to handle it, but in the beginning we can skip this part.

  8. We push manifests to wiki and images to dockerhub.

Note:

  • we use GHCR as our buffer between native runners and the merging part.
  • we get rid of QEMU completely
  • any performance improvements can still be applied (like building different notebooks in parallel) if necessary
  • We will use docker manifest for merging the different archs; this way our images won't be rebuilt during push (we don't even tell docker manifest how to build the images, it only performs manifest merging).
  • We won't need to maximize build space, because we only had those problems after we started using docker buildx.
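
To make steps 6 and 7 a bit more concrete, a rough sketch (the registry path, image name and the RANDOM_TAG variable are placeholders):

      # Step 6, on each per-arch runner (the amd64 one shown here).
      # RANDOM_TAG is assumed to have been set earlier (the tag from step 2).
      - name: Push arch-prefixed tags to GHCR
        run: |
          docker tag jupyter/base-notebook:latest ghcr.io/jupyter/base-notebook:amd64-${RANDOM_TAG}
          docker push ghcr.io/jupyter/base-notebook:amd64-${RANDOM_TAG}

      # Step 7, in the main workflow, once both arch jobs have finished:
      - name: Merge architectures into a multi-platform manifest
        run: |
          docker manifest create ghcr.io/jupyter/base-notebook:${RANDOM_TAG} \
            ghcr.io/jupyter/base-notebook:amd64-${RANDOM_TAG} \
            ghcr.io/jupyter/base-notebook:aarch64-${RANDOM_TAG}
          docker manifest push ghcr.io/jupyter/base-notebook:${RANDOM_TAG}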

@manics
Contributor

manics commented Feb 4, 2022

@mathbunnyru what does docker buildx do in step 6? If you've already built and pushed the single arch images and just need a multiarch manifest you can use docker manifest {create,push} with the tags of those images. Since this is just a metadata operation it should be even faster than a cached build.
https://www.docker.com/blog/multi-arch-build-and-images-the-simple-way/

@mathbunnyru
Member

Thanks, I meant docker run. I've added docker manifest to step 7.

@mathbunnyru
Member

@manics what do you think about my proposal?

@consideRatio
Collaborator Author

Quick note from mobile: native arm runners - I got them running on a k8s cluster set up with Raspberry Pi computers. The downside: you can't use the same actions etc. you have used in a GitHub workflow. setup-python etc. relies on cached versions in an amd64-maintained GitHub CI environment that won't work on arm64. So, going arm64 native means abandoning typical actions we have relied on.

Imo, I'm more positive about having standalone arm64 builds that then get combined in a manifest-augmentation step. Anyhow, on mobile, just wanted to warn about the challenges of arm64 runners.

@mathbunnyru
Member

Thanks, @consideRatio!
Actually, the steps we will need on these runners are quite simple - install Python, the Python requirements, make and docker.
This should be enough to build, test and push our images.
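
For example, a sketch of a job on a self-hosted arm64 runner that avoids the setup-* actions entirely (runner labels, file names and make targets are placeholders):

      build-aarch64:
        runs-on: [self-hosted, linux, ARM64]
        steps:
          - uses: actions/checkout@v2
          # Use the system Python instead of setup-python, which (as noted
          # above) does not work on these arm64 runners.
          - run: python3 -m pip install -r requirements-dev.txt
          - run: make build/base-notebook
          - run: make test/base-notebook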

@manics
Contributor

manics commented Feb 4, 2022

Proposal generally sounds good!

In step 6 how are the manifests and tags calculated? Do they require both architectures to be built before calculation?

If the calculations in step 6 can be run in parallel for the separate architectures, that avoids having to pull the images back down again; instead you could push directly to Docker Hub and save the tags/manifests as JSON or text files, uploaded as GitHub build artifacts (one set of artifacts for each architecture).

The main workflow could then fetch those artifacts, combine them as necessary, and update everything else without even touching the images.
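
For example (artifact names and paths are just placeholders):

      # In each per-arch job:
      - name: Upload computed tags/manifests
        uses: actions/upload-artifact@v2
        with:
          name: tags-manifests-amd64
          path: tags/  # directory of JSON/text files produced by this job

      # In the main workflow (downloads all uploaded artifacts):
      - name: Download tags/manifests from all architectures
        uses: actions/download-artifact@v2
        with:
          path: tags/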

@mathbunnyru
Member

Proposal generally sounds good!

In step 6 how are the manifests and tags calculated? Do they require both architectures to be built before calculation?

If the calculations in step 6 can be run in parallel for the separate architectures, that avoids having to pull the images back down again; instead you could push directly to Docker Hub and save the tags/manifests as JSON or text files, uploaded as GitHub build artifacts (one set of artifacts for each architecture).

The main workflow could then fetch those artifacts, combine them as necessary, and update everything else without even touching the images.

I didn't want to push to Docker Hub directly, because we might end up in a situation where the x86 images are fine and already uploaded but arm doesn't build for some reason.
In my solution, this fails the whole build and our Docker Hub images are not changed at all.
If we pushed directly to Docker Hub, a failure like that would leave it in an inconsistent state.

@manics I've updated my proposal to include your suggestions.
Does it make more sense now?

@manics
Contributor

manics commented Feb 6, 2022

Yes, makes sense to me 😄

@trallard
Member

trallard commented Feb 9, 2022

Hey folks - since my brain works in a very visual way I went ahead and made a diagram which captures the proposed approach above:

  • From the proposal written here by @mathbunnyru: the random tag in step 2 could be a combination of the GH SHA and a date/epoch timestamp, to make it easier to trace a build back (a tiny sketch follows at the end of this comment)

Docker stacks - Build Docker images workflow diagram
(You can see the high-resolution schematic here: https://res.cloudinary.com/nezahualcoyotl/image/upload/v1644401496/docker-stacks-schematic_ilb2yz.png)

As I mentioned in #1203 I would be happy to start working on a prototype to start parallelising stuff
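
A tiny sketch of the tag idea from the first bullet above (the variable name is arbitrary):

      - name: Compute a unique build tag from the commit SHA and a timestamp
        run: echo "BUILD_TAG=${GITHUB_SHA::8}-$(date +%s)" >> "$GITHUB_ENV"

Later steps in the same job could then refer to it as ${{ env.BUILD_TAG }}.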

@mathbunnyru
Member

@trallard very nice!

A few points:

  1. We create manifests and tags somewhere near "Test images" (so, in separate jobs for ARM and x86). We save this info as build artefacts and use it later to merge the different architectures. This can be done in parallel with testing and pushing images.
  2. We can add "Push tags and manifests" before the "Can merge tags?" question, or even in parallel with it. I think this step will simply gather the build artefacts and prepend them to our wiki.
  3. "Build multi-platform image" - I would rename this to something like "Merge different architectures' manifests to create a multi-platform image". We do not build in our main job; that's important.

As I mentioned in #1203 I would be happy to start working on a prototype to start parallelising stuff

Please, proceed, I won't have much time for a few months, but I'm ready to review and help if needed.

@mathbunnyru
Member

Also, I think this diagram will be very useful in the future if/when we implement this, so it might be worth adding a separate page to the Contribution Guide.

@trallard
Member

trallard commented Feb 9, 2022

For completeness, I have updated the diagram to reflect @mathbunnyru's comment above.

Docker Stacks - CI workflow schematic

@mathbunnyru
Member

One more small update - we probably want to do "Push tags and manifests" in the main workflow.
This is easier, because the main workflow is going to run in a GitHub-provided environment (so we can use the existing "commit and push" GitHub workflow).
This way we will also push only when the tests have passed, and they have passed for each image on each arch.

@mathbunnyru
Member

So it's a step right before "Can merge tags?"

@trallard
Member

trallard commented Feb 9, 2022

🎉 fixed! thanks as usual for the comments/reviews

docker-stacksv3

@mathbunnyru
Member

I think what we can do for now is not create multi-platform images at all, and make aarch64 tags look like jupyter/base-notebook:aarch64-latest (just add a prefix).
Users will have to use tags like this for aarch64 images, but at least our images will be reliable.

Right now, every update is a pain - I have to rebuild 5 times to get to the point where the images on DockerHub are the same as if I had built them from source.

@maresb
Contributor

maresb commented May 1, 2022

A simple-minded comment I posted under an unrelated issue, together with the response from @mathbunnyru:

Also, just an idea, sorry if this is way too naive, but would it simplify everything to make the various images into stages of a single Dockerfile? It seems like that way Docker would take care of the build dependency tree for you, so that you don't even have these problems in the first place.

This is a good suggestion. But we still have to deal with amd64/aarch64 differences (for example, we're not building everything under aarch64).
Also, it doesn't make sense to build everything and only then test everything - it makes sense to test as early as possible (otherwise you have to wait for all the builds even if the base image doesn't work).
Tagging is not an easy thing either.
Overall, I can see some advantages, but I see many disadvantages as well.
