CircleCI hardcodes a maximum build time of 5 hours #970

Closed
tlaurion opened this issue Feb 3, 2021 · 31 comments

Comments

@tlaurion
Collaborator

tlaurion commented Feb 3, 2021

Houston, we have a problem:
https://app.circleci.com/pipelines/github/tlaurion/heads/694/workflows/9d351748-166a-4fa6-b0e4-7a08510cedcc/jobs/765

Build fails because:

Build timed out after 5h0m0s

context deadline exceeded

Documented here:
https://discuss.circleci.com/t/job-times-out-after-5-hours/32220/4

Possible solutions:

  • Reduce the number of boards supported under CI
  • Reduce the number of coreboot versions built (move all boards to the latest coreboot version so that only coreboot 4.11 and the latest version are built; the coreboot musl-cross for each coreboot version takes a lot of time to build)
  • Build on top of a Docker image that includes the musl-cross-make toolchain we use, and revert to having coreboot built on top (bad idea)
  • Test ccache ?
  • Check impacts of bumping CPUS=4 to something higher
  • Test if memory issues are still happening when removing CPUS= statement altogether.
@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

Some stats for the first board built by CircleCI, obtained by running grep "2021-01-31" on the log downloaded locally.

  • The first make BOARD= build (2h53) needs to build musl-cross-make if not in cache.
2021-01-31 18:52:52+00:00 Wrong gawk detected: 
--2021-01-31 18:52:53--  http://gnu.mirror.constant.com/gawk/gawk-4.2.1.tar.xz
2021-01-31 18:52:53 (33.1 MB/s) - '/root/project/packages/gawk-4.2.1.tar.xz.tmp' saved [2985412/2985412]
2021-01-31 18:54:26+00:00 WGET https://www.coreboot.org/releases/coreboot-blobs-4.11.tar.xz
--2021-01-31 18:54:26--  https://www.coreboot.org/releases/coreboot-blobs-4.11.tar.xz
2021-01-31 18:57:13 (152 KB/s) - '/root/project/packages/coreboot-blobs-4.11.tar.xz.tmp' saved [25766200/25766200]
2021-01-31 18:57:16+00:00 CONFIG coreboot-blobs
2021-01-31 18:57:16+00:00 MAKE coreboot-blobs
2021-01-31 18:57:16+00:00 DONE coreboot-blobs
2021-01-31 18:57:16+00:00 WGET https://www.coreboot.org/releases/coreboot-4.11.tar.xz
--2021-01-31 18:57:16--  https://www.coreboot.org/releases/coreboot-4.11.tar.xz
2021-01-31 19:02:16 (133 KB/s) - '/root/project/packages/coreboot-4.11.tar.xz.tmp' saved [40730352/40730352]
2021-01-31 19:24:14+00:00 CONFIG coreboot

Here, the difference between the last two timestamps shows that building the coreboot buildstack took ~22 minutes.

2021-01-31 19:24:15+00:00 WGET https://github.com/richfelker/musl-cross-make/archive/38e52db8358c043ae82b346a2e6e66bc86a53bc1.tar.gz
--2021-01-31 19:24:15--  https://github.com/richfelker/musl-cross-make/archive/38e52db8358c043ae82b346a2e6e66bc86a53bc1.tar.gz
--2021-01-31 19:24:16--  https://codeload.github.com/richfelker/musl-cross-make/tar.gz/38e52db8358c043ae82b346a2e6e66bc86a53bc1
2021-01-31 19:24:16 (7.76 MB/s) - '/root/project/packages/musl-cross-38e52db8358c043ae82b346a2e6e66bc86a53bc1.tar.gz.tmp' saved [142143]
2021-01-31 19:24:16+00:00 CONFIG musl-cross
2021-01-31 19:24:16+00:00 WGET https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.14.62.tar.xz
--2021-01-31 19:24:16--  https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.14.62.tar.xz
2021-01-31 19:24:17 (242 MB/s) - '/root/project/packages/linux-4.14.62.tar.xz.tmp' saved [100957512/100957512]
2021-01-31 19:25:00+00:00 MAKE musl-cross
--2021-01-31 19:25:00--  https://ftp.gnu.org/pub/gnu/gcc/gcc-8.3.0/gcc-8.3.0.tar.xz
2021-01-31 19:25:02 (49.3 MB/s) - 'gcc-8.3.0.tar.xz' saved [63694700/63694700]
--2021-01-31 19:25:02--  http://git.savannah.gnu.org/gitweb/?p=config.git;a=blob_plain;f=config.sub;hb=3d5db9ebe860
2021-01-31 19:25:02 (2.53 MB/s) - 'config.sub' saved [36159]
--2021-01-31 19:25:25--  https://ftp.gnu.org/pub/gnu/binutils/binutils-2.32.tar.xz
2021-01-31 19:25:25 (39.0 MB/s) - 'binutils-2.32.tar.xz' saved [20774880/20774880]
--2021-01-31 19:25:33--  https://www.musl-libc.org/releases/musl-1.1.24.tar.gz
2021-01-31 19:25:33 (10.7 MB/s) - 'musl-1.1.24.tar.gz' saved [1024988/1024988]
--2021-01-31 19:25:33--  https://ftp.gnu.org/pub/gnu/gmp/gmp-6.1.2.tar.bz2
2021-01-31 19:25:34 (18.6 MB/s) - 'gmp-6.1.2.tar.bz2' saved [2386766/2386766]
--2021-01-31 19:25:36--  https://ftp.gnu.org/pub/gnu/mpc/mpc-1.1.0.tar.gz
2021-01-31 19:25:36 (6.51 MB/s) - 'mpc-1.1.0.tar.gz' saved [701263/701263]
--2021-01-31 19:25:36--  https://ftp.gnu.org/pub/gnu/mpfr/mpfr-4.0.2.tar.bz2
2021-01-31 19:25:36 (9.80 MB/s) - 'mpfr-4.0.2.tar.bz2' saved [1652074/1652074]
--2021-01-31 19:25:38--  http://ftp.barfooze.de/pub/sabotage/tarballs//linux-headers-4.19.88.tar.xz
2021-01-31 19:25:44 (164 KB/s) - 'linux-headers-4.19.88.tar.xz' saved [1052880/1052880]
2021-01-31 21:03:50+00:00 DONE musl-cross

The last two lines tell us that, after the packages were downloaded, building musl-cross (musl-cross-make) took 1h38.

  • After musl-cross-make is built, building a linux version shared across boards takes ~30 minutes:
2021-01-31 21:03:55+00:00 MAKE linux
2021-01-31 21:31:20+00:00 DONE linux
  • Then, all other modules are built, normally shared across boards.
  • As of right now, we have variable versions for: coreboot, linux, and cryptsetup.
  • For the above board, building the rom took 2h53, of which the coreboot buildstack took 22 minutes, musl-cross-make took 1h38, linux took around 30 minutes, and the rest of the modules took the remaining 23 minutes.
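
For anyone wanting to redo this breakdown on another log, here is a minimal sketch that derives the stage durations from the timestamps quoted above. The stage names and the `minutes_between` helper are illustrative only, not part of the heads build system; the timestamps are copied from the log excerpts in this comment.

```python
from datetime import datetime

# Timestamps copied from the log excerpts above (all on 2021-01-31, UTC).
FMT = "%Y-%m-%d %H:%M:%S"

# Stage labels are illustrative; each maps to a (start, end) pair from the log.
stages = {
    "coreboot buildstack": ("2021-01-31 19:02:16", "2021-01-31 19:24:14"),
    "musl-cross": ("2021-01-31 19:25:44", "2021-01-31 21:03:50"),
    "linux": ("2021-01-31 21:03:55", "2021-01-31 21:31:20"),
}


def minutes_between(start: str, end: str) -> float:
    """Return the elapsed time between two log timestamps, in minutes."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


durations = {name: minutes_between(*span) for name, span in stages.items()}
for name, mins in durations.items():
    print(f"{name}: {mins:.1f} min")
```

The rounded figures quoted in the comment add up as 22 + 98 + 30 + 23 = 173 minutes, which matches the 2h53 total.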

Next board.

  • After that, the next board, which depends on the same coreboot and linux versions, takes 1m26.

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

I don't know a lot about CircleCI but this states:

Jobs have a maximum runtime of 5 hours. If your jobs are timing out, consider running some of them concurrently using workflows.

So could we not just split up tasks into jobs?

Edit: i.e. we could have each board be a job, in which case the 5hr limit would apply to the time it takes to build each board

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

Actually following on from above, it looks like we could really speed up building by splitting jobs and making them parallel. A workflow could go something like:

Initial prep -> Build a single board of coreboot 4.8.1, 4.11 and 4.13 in parallel -> build all the rest of the boards in parallel.

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

@tlaurion messy as hell PoC here

So looks like we can persist data between jobs with persist_to_workspace and attach_workspace but that's not too friendly with root (/) so we will have to re-run apt install for each job unless we setup a docker image with those dependencies installed.

Edit: slightly better here, still not in parallel though.

Edit: even better with parallel here

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

@Tonux599 : https://app.circleci.com/pipelines/github/Tonux599/heads/67/workflows/7d0131b5-f6e3-44d9-8b35-2154c57f5b79
Is awesome!!!!

My suggestion would be to split it up by coreboot version. Looking in your config

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

Suggestions:

  • Add apt install call as a variable and call it instead of duplicating.

I think if that last build succeeds, we can do another round of tweaks after! Awesome @Tonux599! You are now the CircleCI master of heads :)

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

It does look promising, I'll open a PR with cleaner commits so we can go forward with its current state. One thing to note though, I'm not convinced the cache will be saved correctly as the job save_cache inherits from the last job built workspace (I think). So it may not contain coreboot 4.11 or 4.13 musl-cross.

I think a cleaner way of doing it might be:

Build a CB 4.8.1 board -> build a CB 4.11 board -> build a CB 4.13 board -> save cache -> build everything else in parallel.

Saying that though, persist_to_workspace and attach_workspace allow picking certain subdirectories, so we may still be able to have save_cache last, with defined directories from its parents.

More testing and playing around required!

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

Wrapping my head around this.

Trying to understand why the builds are taking so long still in between the dependencies.

If I get this right, the jobs that gate parallelization should be smaller steps, to save time between dependent builds (in this case, all other board builds wait for qemu-coreboot to finish before starting):

  • make musl-cross
  • make all other boards

Watching the builds, my intuition is that a make target is currently missing in coreboot to do what we want. We have coreboot, coreboot-blobs, coreboot.menuconfig, coreboot.saveconfig and their clean counterparts, whereas a coreboot.crossgcc target (building crossgcc-i386 and iasl) would speed up all other boards. We could then have the following minimal shared dependencies:

  1. make musl-cross CPUS=8 V=1
  2. make BOARD=$(candidate of each coreboot versions) coreboot.crossgcc CPUS=8 V=1 (dependency on 1)
  3. make BOARD= (all other boards candidates) (dependency on 2)

What do you think from your experiences of today @Tonux599 ?

The reasoning out of this is

  • qemu-coreboot builds musl-cross, coreboot.crossgcc, linux, and modules, then builds its coreboot stages and wraps everything in. (1h39 without a musl-cross, coreboot or linux cache)
  • librem_mini builds its coreboot.crossgcc for another version, its different linux version and modules, then finally builds its coreboot stages and wraps everything in.
    • librem_mini_v2 does the same but depends on the finish time of qemu-coreboot and librem_mini coreboot. That could hypothetically be parallelized.

Catch my drift? Is it a silly idea? The total build time of

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

It does look promising, I'll open a PR with cleaner commits so we can go forward with its current state. One thing to note though, I'm not convinced the cache will be saved correctly as the job save_cache inherits from the last job built workspace (I think). So it may not contain coreboot 4.11 or 4.13 musl-cross.

I think a cleaner way of doing it might be:

Build a CB 4.8.1 board -> build a CB 4.11 board -> build a CB 4.13 board -> save cache -> build everything else in parallel.

Saying that though, persist_to_workspace and attach_workspace allow picking certain subdirectories, so we may still be able to have save_cache last, with defined directories from its parents.

More testing and playing around required!

@Tonux599 yeah... I'll sleep on it, but I really think
1- musl-cross only for qemu-coreboot board
2- then parallelize coreboot.crossgcc only for different coreboot versions
3- then parallelize boards on top of coreboot.crossgcc cache, built on top of musl-cross (in a parallelized way?)

From my analysis of build time #970 (comment)
We know that building modules is minimal as opposed to building musl-cross, coreboot crossgcc and linux.
While linux versions have been varying a lot across boards recently, caching linux is not really interesting as of now (the real solution would be to upgrade it for all board configs; migrating old boards to the latest coreboot will also limit the crossgcc build time for different boards).

We now add a caching time of 15 minutes per board building of parent + extraction time from the child. (Which could be reduced by selective caching like you said).

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

Firstly I need to amend my previous comment, by the looks of it all the workspace layers are applied so the cache should be saved correctly.

1. make musl-cross CPUS=8 V=1

2. make BOARD=$(candidate of each coreboot versions) coreboot.crossgcc CPUS=8 V=1 (dependency on 1)

3. make BOARD= (all other boards candidates) (dependency on 2)

This is what I would be aiming for. However, the only problem with that is all the small "things" (gpg, busybox, cryptsetup) would needlessly be built side by side at point 2. I think ideally we need:

  1. make musl-cross CPUS=8 V=1
  2. make (all the small stuff, gpg, busybox, cryptsetup)
  3. make BOARD=$(candidate of each coreboot versions) coreboot.crossgcc CPUS=8 V=1
  4. make BOARD= (all other boards candidates)
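
To make the ordering concrete, here is a small sketch that models the four steps above as a dependency graph and groups jobs into waves that could run in parallel. The job names are hypothetical, and the sketch assumes each step depends on the previous one and that step 3 fans out into one job per coreboot version (4.8.1, 4.11, 4.13); none of this is taken from the actual CircleCI config.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names; each job maps to the set of jobs it depends on.
deps = {
    "musl-cross": set(),
    "small-modules": {"musl-cross"},
    "crossgcc-4.8.1": {"small-modules"},
    "crossgcc-4.11": {"small-modules"},
    "crossgcc-4.13": {"small-modules"},
    "other-boards": {"crossgcc-4.8.1", "crossgcc-4.11", "crossgcc-4.13"},
}


def parallel_waves(graph):
    """Group jobs into waves; all jobs within one wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = parallel_waves(deps)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

Under these assumptions the three crossgcc jobs land in the same wave, which is where most of the parallel gain would come from.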

There is definitely more optimisation to be done; I think it's just a case of sifting through it and making adjustments where we can.

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

@Tonux599 Can you trigger a rebuild please? (it will create another build, just so we reuse your cache and see)

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

Never mind, the cache is not ready yet. 30 minutes, damn. Normal. I could cancel the workflow. I guess I'll be able to trigger a rebuild.

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

Never mind, the cache is not ready yet. 30 minutes, damn. Normal.

Yeah we probably don't actually need all those workspace layers being applied, just one from each variant board of coreboot.

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

@Tonux599 A quickfix here though would be to replace the qemu-coreboot by one of the boards that has it all already. Like the x230-hotp-maximized board.

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

Never mind, the cache is not ready yet. 30 minutes, damn. Normal.

Yeah we probably don't actually need all those workspace layers being applied, just one from each variant board of coreboot.

Not sure I got that, sorry.

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

If we're talking about testing the cache, we will have to wait anyway, as it's not saved at any other point.

Never mind, the cache is not ready yet. 30 minutes, damn. Normal.

Yeah we probably don't actually need all those workspace layers being applied, just one from each variant board of coreboot.

Not sure I got that, sorry.

It's in effect applying the state of the filesystem from each job (So about 20) on top of one another before creating the cache, but in reality we only need one board of coreboot 4.8.1, 4.11 and 4.13 (so 3) and then create a cache.

@Tonux599
Contributor

Tonux599 commented Feb 3, 2021

Updated here which implements above as save_cache appears to be stalling.

@tlaurion
Collaborator Author

tlaurion commented Feb 3, 2021

Ohhh, now I get it: you cache . instead of ./crossgcc, ./build and others.

@tlaurion
Collaborator Author

tlaurion commented Feb 4, 2021

https://circleci.com/api/v1.1/project/github/Tonux599/heads/116/output/101/0?file=true&allocation-id=601b21ec1b2fdd228f4c5265-0-build%2F1C84DA94

is really interesting...
Downloading workspace layers

  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/0ff72300-8429-4c74-8fac-05369efd4314/0/113.tar.gz - 3.6 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/625f6ef3-f23e-407d-b5d7-8399671e0958/0/108.tar.gz - 4.1 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/12a9f227-b118-4fa0-bcb3-c1034d60bfc8/0/108.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/c5c0ea91-026c-4607-a423-82f6e06a329c/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/3b0eb27f-e591-42a6-b499-78f93be1e162/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/b1816471-b0ba-4147-aa6a-bf2c434aac9d/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/5fd57d84-616f-4082-926c-807b1c65479e/0/108.tar.gz - 5.3 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/597a1726-0644-451c-ba01-c32ee9cd63d1/0/108.tar.gz - 5.4 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/d07abbb4-2d82-4848-ac17-7deba89d96b1/0/108.tar.gz - 5.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/ecbcd98f-67bd-415d-b8df-29c5bb193f7a/0/108.tar.gz - 4.1 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/0753d3f6-edef-4409-a414-3332435c7114/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/ca087f17-db89-4ba7-b2d3-1dc522d1fdee/0/108.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/80d3dd79-0167-4c99-98d0-588487e88cfa/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/e3329b1d-f1b1-4c79-bf47-62ea0640cef6/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/b52059b0-9c0d-467a-86e1-8a1300aeccd8/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/e6fc142e-a1ca-455f-911e-0a7e8f084136/0/108.tar.gz - 3.8 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/d390f5bb-ac42-43e7-a2ec-6db71db2e231/0/108.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/d8025a60-1417-4939-b820-9b2c517effd6/0/109.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/9a494f13-851e-46f4-8ff3-fe249aeff896/0/108.tar.gz - 5.8 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/2422d846-ec72-495f-bc85-40b5bd0d6f6f/0/108.tar.gz - 3.9 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/311d3b5b-e273-42b9-9672-9daa24d3f0b3/0/108.tar.gz - 5.8 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/777a5e09-6cd8-4ba9-8e65-ad9ced491d4b/0/108.tar.gz - 5.0 GB
  workflows/workspaces/7d0131b5-f6e3-44d9-8b35-2154c57f5b79/0/9bab0344-adbe-4e38-8d5f-c49f5acf8ad7/0/108.tar.gz - 5.1 GB

That could be tweaked.

But downloading and extracting all those files takes a lot of time because the layers are not specialized.
We are heading for another 5 hours, will check in the morning.
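
As a rough sense of scale, the layer sizes in the listing above can simply be summed. This is only a quick sketch over the numbers copied from that listing; the variable names are illustrative.

```python
# Sizes in GB, copied in order from the "Downloading workspace layers" listing above.
layer_sizes_gb = [
    3.6, 4.1, 3.9, 3.9, 3.9, 3.9, 5.3, 5.4, 5.9, 4.1, 3.9, 3.9,
    3.9, 3.9, 3.9, 3.8, 3.9, 3.9, 5.8, 3.9, 5.8, 5.0, 5.1,
]

# Total compressed data a single job has to download before it can start building.
total_gb = sum(layer_sizes_gb)
print(f"{len(layer_sizes_gb)} layers, {total_gb:.1f} GB in total")
```

That is roughly 100 GB of compressed layers for one job to download and extract.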

@Tonux599
Contributor

Tonux599 commented Feb 4, 2021

@tlaurion
Collaborator Author

tlaurion commented Feb 4, 2021

Ok, but the thing is that parallelization is still only happening 2x at a time. I didn't realize that, picking up the build late.

You can without a doubt put CPUS=8 everywhere, maybe even more. Not specifying CPUS= results in the build going to 36 cores, leaving us room to tune the number of cores before we exhaust memory limits. Originally, CircleCI builds were pinned with the --load parameter. We might also fall back to that if needed.

Where putting it to 8 barely unmade it https://app.circleci.com/pipelines/github/tlaurion/heads/700/workflows/0a3a9b23-aa8d-4557-b848-c8830e8be276

@tlaurion
Collaborator Author

tlaurion commented Feb 4, 2021

@Tonux599 :
Blunt tests for the night to compare:

And finally, having a compare point to 8 cores involved:

@Tonux599
Contributor

Tonux599 commented Feb 4, 2021

So it failed again, looks like workspace layers don't like overwriting each other.

Another build here which is sequential for the first three boards, saves a cache, then runs in parallel: https://app.circleci.com/pipelines/github/Tonux599/heads/69/workflows/1585f145-847c-4b2f-b498-ac6ea0a528ac

Edit: @tlaurion if the above works I'll open a PR based on it. I have other ideas on how to make the CI a bit cleaner too, like using commands and parameters to reduce the amount of repetitive code, as detailed here

@tlaurion
Collaborator Author

tlaurion commented Feb 4, 2021

* master equivalent, but passing CPUS=16 instead of CPUS=4: https://app.circleci.com/pipelines/github/tlaurion/heads/702/workflows/f59eb61b-fd41-4641-9179-787806d99032/jobs/773

This worked, so I'll push that to master as a temporary fix so that CI builds are successful for now.

@tlaurion
Collaborator Author

tlaurion commented Feb 4, 2021

Will actually push my luck further and try CPUS=24: https://app.circleci.com/pipelines/github/tlaurion/heads/705/workflows/357533c7-8e62-416f-8479-b2f548949081/jobs/822

Edit: worked.

@Tonux599
Contributor

@tlaurion please see #977

@tlaurion
Collaborator Author

tlaurion commented Feb 27, 2021

Quick review.

  • me_cleaner only downloaded for xx30 blobs, not xx20

  • Not sure I understand clearly the dependencies. Everything depends on librem-mini, where librem-mini depends on kgpe-d16?

Other than that, I'm impressed by the time gained! Awesome!

@Tonux599
Contributor

Tonux599 commented Feb 27, 2021

* me_cleaner only downloaded for xx30 blobs, not xx20

So we have @Thrilleratplay's modified me_cleaner in branch already for xx20 so we don't need to download anything.

* Not sure I understand clearly the dependencies. Everything depends on librem-mini, where librem-mini depends on kgpe-d16?

I'll try and explain. What's basically happening is we are building a board for coreboot 4.8.1 (qemu-coreboot) and saving the whole project's workspace. Then loading it again for coreboot 4.11 (kgpe-d16_workstation) to be built. Saving. Loading. Then coreboot 4.13 (librem_mini) is built.

Now the workspace from librem_mini is saved, and that includes all the built toolchains from coreboot 4.8.1, 4.11 and 4.13. Also, most of the modules are built. This workspace is then loaded by all the other boards, in parallel, and they are then built.

Don't get me wrong, there are still things we could do to make this better, notably we can have a single board from each coreboot build in parallel, and then build the remaining boards. Things get complicated then though as CircleCI refuses to overwrite other saved workspaces, so we have to be careful in instructing CircleCI to only save files/folders that will not clash. As I said in the PR though, hopefully what I've done here is a good starting point.

@tlaurion
Collaborator Author

tlaurion commented Mar 1, 2021

* me_cleaner only downloaded for xx30 blobs, not xx20

So we have @Thrilleratplay's modified me_cleaner in branch already for xx20 so we don't need to download anything.

@Tonux599 Right. Sorry for the confusion. Forgot about that.

Don't get me wrong, there are still things we could do to make this better, notably we can have a single board from each coreboot build in parallel, and then build the remaining boards. Things get complicated then though as CircleCI refuses to overwrite other saved workspaces, so we have to be careful in instructing CircleCI to only save files/folders that will not clash. As I said in the PR though, hopefully what I've done here is a good starting point.

@Tonux599 : I quickly reviewed, and I see this PR as a safe, direct move for when 5-hour builds fail on a fresh build without cache. I still don't get the gist of how to optimize this better, but from what I see, a fresh build takes 5h43 (vs 4h56), whereas a full build of all boards from the most complete cache available takes 4h (vs 1h19).

We should merge preventively?

@tlaurion
Collaborator Author

tlaurion commented Mar 1, 2021

@Tonux599

Following the rabbit to try to understand what is happening and consuming time on builds even when a cache is available. Quick observations on the last rebuild with a full cache available:

  • On rebuild, the total cache deflates in the prep step. This cache is smaller (5.9GB vs 8+) and does not seem to contain all compiled binaries. Might be fine or not, dunno yet.
  • Persisting to workspace takes 14m for the prep step
  • qemu-coreboot, which is next to be built, restores the 111 workspace, which takes 3m
  • qemu-coreboot build time is 8 minutes even though the full cache is supposed to be available and restored (attaching the workspace 111.tar.gz, which is 6.3GB)
  • Then things get exponential.
    • The next sequential board, kgpe-d16-workstation, eats 2x workspaces of the same size and deflates 6.3GB x 2, which now takes 7m44
    • The next sequential build, librem_mini, eats 3 workspace caches of 6.3GB, taking 10m35
    • Each parallel board built thereafter eats 4x workspace caches, taking 18 minutes of deflating time per board compilation.
    • The save cache step also attaches the workspace x4 without actually saving the cache, taking another 13 minutes.
    • So basically, for all combined parallel tasks (20) we seem to lose at least 10 minutes on each subsequent build by reusing too many workspaces, accounting for the approximate loss of time between actual cache reuse (master) and rebuild time based on multiple workspace deflation (3m for 1x vs 16 minutes for 4x): 20 × 10 = 200 minutes; 200 / 60 ≈ 3.33h lost because of multiple workspace cache deflation.
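
The estimate in the last bullet can be reproduced with a few lines. This is only a sketch of that arithmetic: the two inputs are the figures quoted above, and the variable names are illustrative.

```python
parallel_boards = 20      # number of parallel board jobs quoted above
lost_min_per_board = 10   # "at least 10 minutes" lost per job, a lower bound

# Total job time spent deflating redundant workspace layers.
lost_minutes = parallel_boards * lost_min_per_board
lost_hours = lost_minutes / 60
print(f"{lost_minutes} min = {lost_hours:.2f} h")
```

Note that this sums time across jobs; since those jobs run in parallel, the wall-clock impact on a workflow may well be smaller than the summed figure.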

@Tonux599 : I understand that the cache alone cannot be reused, and that workspaces need to be saved and passed as persistent workspaces between each sequential build? Wouldn't there be a way to pass workspaces only on fresh builds, when caches are not available? Maybe I'm just dreaming here. Can you link to the docs you have in the back of your mind for further optimizations?

@tlaurion
Collaborator Author

tlaurion commented Dec 4, 2021

#1015 merged

tlaurion closed this as completed Dec 4, 2021