Evaluate fanout for rust-lang/rust CI #175

MarcoIeni · 2024-10-25T09:33:23Z

To optimize our CI and move work off large runners, we want to evaluate how much fanout would help.

Tasks

Explain how much time we could save by using fanout instead of building the stage 1 compiler for every target:

Understand what the stage 1 artifacts are: what do I need to cache, and when?
- Maybe it's in build/$host/stage$stage (see the compile.rs file of bootstrap). The host should be the target_triple.
Understand where I need to download these artifacts.
Understand when the first CI job should stop
Understand when the first CI job should start
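
A rough way to frame the estimate above, as a minimal sketch: compare the runner minutes spent when every job builds stage 1 itself against a fanout layout where one job builds it once, uploads the artifacts, and every job downloads them. The function name, the parameters, and the numbers in the example are all hypothetical; none of them are measurements from rust-lang/rust CI.

```python
def fanout_runner_minutes_saved(num_jobs, stage1_minutes, upload_minutes, download_minutes):
    """Estimate total runner minutes saved by building stage 1 once.

    All inputs are assumptions to be replaced with measured values.
    """
    # Baseline: every job builds the stage 1 compiler on its own.
    baseline = num_jobs * stage1_minutes
    # Fanout: one build, one upload, then each job downloads the artifacts.
    fanout = stage1_minutes + upload_minutes + num_jobs * download_minutes
    return baseline - fanout


# Made-up numbers, only to show the shape of the calculation.
saved = fanout_runner_minutes_saved(
    num_jobs=10, stage1_minutes=30, upload_minutes=2, download_minutes=1
)
print(saved)  # 258
```

This only counts total runner time. Wall-clock time per job does not improve under this model, because each downstream job still waits for the stage 1 build and additionally pays for the upload and download.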

Right now we don't have a way to see how long each stage takes. We could upload metrics about each stage of the CI to Datadog or S3 and analyze them. It would also be nice to include CPU utilization, to understand whether a bigger machine helps a certain step or whether it's worth parallelizing some operations (if possible).
- Identify "checkpoints" at which to send metrics.
- Check whether metrics.json is present (https://ci-artifacts.rust-lang.org/rustc-builds-alt/...). If not, ask Jakub.
Is it possible to avoid building LLVM on Windows? See the discussion on "CI: unset NO_DOWNLOAD_CI_LLVM for 2 windows jobs" (rust#132781).
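
The per-stage metrics idea above could be prototyped along these lines. This is a minimal sketch, not how bootstrap or the existing metrics.json works: the `StageMetrics` class, the `checkpoint` name, and the record fields are hypothetical, and the actual upload to Datadog or S3 is left out. It records wall-clock time and CPU time (including waited-for child processes, via `os.times()`) around each checkpoint.

```python
import json
import os
import time
from contextlib import contextmanager


class StageMetrics:
    """Collects wall-clock and CPU time per named checkpoint (hypothetical helper)."""

    def __init__(self):
        self.records = []

    @contextmanager
    def checkpoint(self, name):
        wall_start = time.perf_counter()
        cpu_start = os.times()
        try:
            yield
        finally:
            wall = time.perf_counter() - wall_start
            cpu_end = os.times()
            # user + system time of this process and of children it waited for.
            cpu = sum(cpu_end[:4]) - sum(cpu_start[:4])
            cores = os.cpu_count() or 1
            self.records.append({
                "stage": name,
                "wall_seconds": wall,
                "cpu_seconds": cpu,
                # Fraction of the machine's total CPU capacity used in this stage.
                "cpu_utilization": cpu / (max(wall, 1e-9) * cores),
            })

    def to_json(self):
        return json.dumps(self.records, indent=2)


metrics = StageMetrics()
with metrics.checkpoint("stage1"):
    sum(range(100_000))  # stand-in for the real build step
print(metrics.to_json())
```

A low `cpu_utilization` for a stage would suggest that a bigger machine is unlikely to help that stage unless the work can be parallelized.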

Random optimizations I found while studying this that might be worth looking at later.

We could have one GitHub workflow where one job builds the stage 1 + stage 2 compiler and various test jobs depend on (need) that job. So we build the stage 2 compiler in one job and then fan out into separate jobs that run the tests.

Do we always run all CI jobs or just the CI jobs for the affected components? I.e. can we skip tests if the code change doesn't affect them? E.g. if the change only touches comments, do we need to run the tests across every OS and target?
- answer: compiling the compiler takes the most time, so this is not worth working on.
We build the docker image every time. How long does it take? Is caching working? https://github.com/MarcoIeni/rust/blob/a9d17627d241645a54c1134a20f1596127fedb60/src/ci/docker/run.sh#L93
- answer: by enabling timestamps in GitHub Actions I found that building the docker container doesn't take long (2 min).

MarcoIeni self-assigned this Oct 25, 2024