CircleCI hardcodes a maximum build time of 5 hours #970
Some stats obtained by grepping the logs of the first board being built by CircleCI:
The difference between the last two timestamps shows that building the coreboot buildstack took ~22 minutes.
The last two lines tell us that, after the packages were downloaded, building musl-cross (musl-cross-make) took 1h38.
Next board.
I don't know a lot about CircleCI but this states:
So could we not just split the tasks up into jobs? Edit: i.e. could we have each board be its own job, in which case the 5-hour limit would apply to the time it takes to build each board.
Actually, following on from the above, it looks like we could really speed up building by splitting jobs and making them parallel. A workflow could go something like: initial prep -> build a single board each of coreboot 4.8.1, 4.11 and 4.13 in parallel -> build all the rest of the boards in parallel.
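A minimal sketch of what such a fan-out could look like in `.circleci/config.yml`. This is an assumption-laden illustration, not Heads' actual config: the Docker image, job names and board list are made up for the example, and only the `make BOARD=<name>` entry point reflects how Heads is actually built:

```yaml
version: 2.1

jobs:
  prep:
    docker:
      - image: debian:10   # illustrative image, not the real Heads builder
    steps:
      - checkout
      - run: echo "install build dependencies here"   # placeholder prep step

  build_board:
    parameters:
      board:
        type: string
    docker:
      - image: debian:10
    steps:
      - checkout
      # Each board gets its own job, and therefore its own 5-hour budget.
      - run: make BOARD=<< parameters.board >>

workflows:
  build:
    jobs:
      - prep
      - build_board:
          requires: [prep]
          matrix:
            parameters:
              board: [qemu-coreboot, kgpe-d16_workstation, librem_mini]
```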
@tlaurion messy-as-hell PoC here. So it looks like we can persist data between jobs with persist_to_workspace. Edit: slightly better here, still not in parallel though. Edit: even better, with parallelism, here.
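For reference, the two CircleCI primitives that make this hand-off work are `persist_to_workspace` and `attach_workspace`. A sketch of passing the build tree between two jobs; the persisted paths are illustrative guesses at the Heads tree, not the PoC's actual list:

```yaml
jobs:
  build_qemu_coreboot:
    docker:
      - image: debian:10
    steps:
      - checkout
      - run: make BOARD=qemu-coreboot
      # Hand selected build output to downstream jobs in this workflow.
      - persist_to_workspace:
          root: .
          paths:
            - build      # illustrative: compiled board artifacts
            - crossgcc   # illustrative: the coreboot toolchain

  build_kgpe_d16:
    docker:
      - image: debian:10
    steps:
      - checkout
      # Overlay the upstream job's persisted files onto this checkout.
      - attach_workspace:
          at: .
      - run: make BOARD=kgpe-d16_workstation
```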
@Tonux599 : https://app.circleci.com/pipelines/github/Tonux599/heads/67/workflows/7d0131b5-f6e3-44d9-8b35-2154c57f5b79 My suggestion would be to split it up by coreboot version. Looking in your config
Suggestions:
I think if that last build succeeds, we can do another round of tweaks after! Awesome @Tonux599! You are now the CircleCI master of Heads :)
It does look promising; I'll open a PR with cleaner commits so we can go forward with its current state. One thing to note though: I'm not convinced the cache will be saved correctly, as the job
I think a cleaner way of doing it might be: build a CB 4.8.1 board -> build a CB 4.11 board -> build a CB 4.13 board -> save cache -> build everything else in parallel. Saying that though, more testing and playing around is required!
Wrapping my head around this, trying to understand why the builds still take so long in between the dependencies. If I get this right, the dependencies linked to parallelization should be smaller steps, to save time between dependent builds (like in this case, where all other board builds wait for qemu-coreboot to finish before starting).
Watching the builds progress, my intuition is that a make target is currently missing in coreboot to do what we want. We have coreboot, coreboot-blobs, coreboot.menuconfig, coreboot.saveconfig and their clean counterparts; a coreboot.crossgcc target (building crossgcc-i386 and iasl) would speed up all other boards. We could then have the following minimal dependencies shared (see the sketch after this comment):
What do you think, from your experience today, @Tonux599? The reasoning behind this is
Catch my drift? Is it a silly idea? The total build time of
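To make the idea above concrete: assuming Heads grew the proposed `coreboot.crossgcc` target (it does not exist at this point; the target name and the persisted path are hypothetical), a toolchain-only warm-up job might look like:

```yaml
jobs:
  crossgcc_warmup:
    docker:
      - image: debian:10
    steps:
      - checkout
      # Hypothetical target: build only crossgcc-i386 and iasl for this
      # board's coreboot version, without building the board itself.
      - run: make BOARD=qemu-coreboot coreboot.crossgcc
      # Hand the toolchain to every board job that shares this coreboot
      # version, so none of them rebuilds crossgcc from scratch.
      - persist_to_workspace:
          root: .
          paths:
            - crossgcc   # hypothetical location of the built toolchain
```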
@Tonux599 yeah... I'll sleep on it, but I really think
From my analysis of build time in #970 (comment): we now add a caching time of 15 minutes per parent board build, plus extraction time in the child (which could be reduced by selective caching, like you said).
Firstly, I need to amend my previous comment: by the looks of it, all the workspace layers are applied, so the cache should be saved correctly.
This is what I would be aiming for. However, the only problem with that is that all the small "things" (gpg, busybox, cryptsetup) would needlessly be built side by side at point 2. I think ideally we need:
There is definitely more optimisation to be done; I think it's just a case of sifting through it and making adjustments where we can.
@Tonux599 Can you trigger a rebuild please? (It will create another build, just so we reuse your cache and see.)
Never mind, the cache is not ready yet. 30 minutes, damn. Normal. I could cancel the workflow; I guess I'll be able to trigger a rebuild.
Yeah, we probably don't actually need all those workspace layers applied, just one from each variant board of coreboot.
@Tonux599 A quick fix here, though, would be to replace qemu-coreboot with one of the boards that already has it all, like the x230-hotp-maximized board.
Not sure I got that, sorry.
If we're talking about testing the cache, we will have to wait anyway, as it's not saved at any other point.
It's in effect applying the state of the filesystem from each job (so about 20) on top of one another before creating the cache, but in reality we only need one board each of coreboot 4.8.1, 4.11 and 4.13 (so 3) and then create a cache.
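One way to avoid stacking ~20 workspace layers is to have each of the three toolchain-bearing jobs persist only its own, non-overlapping subtree. A sketch under the assumption that each coreboot version's toolchain lands in its own directory; the paths are guesses for illustration, not Heads' real layout:

```yaml
# In the coreboot 4.8.1 board job (qemu-coreboot):
- persist_to_workspace:
    root: .
    paths:
      - crossgcc/coreboot-4.8.1   # only this job's toolchain, nothing shared

# In the coreboot 4.11 board job (kgpe-d16_workstation):
- persist_to_workspace:
    root: .
    paths:
      - crossgcc/coreboot-4.11    # disjoint from the 4.8.1 layer, so no clash
```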
Updated here, which implements the above as
Hooooo, now I get it. Instead of ./crossgcc, ./build and others.
is really interesting...
That could be tweaked. But then all that file downloading and extraction takes a looooot of time, because it is not specialized.
@tlaurion keep an eye on this one, as the last one failed:
OK, but the thing is that parallelization is still happening only 2x at a time. Didn't realize that, picking up the build late.
You can without a doubt put CPUS=8 everywhere, maybe even more. Not specifying CPUS= results in the build going up to 36 cores, leaving us room to tune the number of cores to play with before we exhaust memory limits. Originally, CircleCI builds were pinned with the --load parameter; we might also fall back to that if needed. Putting it to 8 barely made it: https://app.circleci.com/pipelines/github/tlaurion/heads/700/workflows/0a3a9b23-aa8d-4557-b848-c8830e8be276
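The knob being discussed is just the variable passed to make. A sketch of the run step with the two throttling options mentioned above (the board name is illustrative; make's --load-average is standard GNU make, not CircleCI-specific):

```yaml
steps:
  - run:
      name: Build with a bounded core count
      # Cap parallelism explicitly; left unset, the build fans out to all
      # 36 visible cores and risks hitting the container's memory limit.
      command: make CPUS=8 BOARD=x230-hotp-maximized
      # Fallback discussed above: let make throttle on load average instead:
      #   make --load-average=16 BOARD=x230-hotp-maximized
```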
@Tonux599 :
And finally, having a comparison point with 8 cores involved:
So it failed again; it looks like workspace layers don't like overwriting each other. Another build here, which is sequential for the first three boards, saves a cache, then runs in parallel: https://app.circleci.com/pipelines/github/Tonux599/heads/69/workflows/1585f145-847c-4b2f-b498-ac6ea0a528ac Edit: @tlaurion if the above works I'll open a PR based on it. I have other ideas on how to make the CI a bit cleaner too, like using
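A sketch of the workflow shape being tried here: three boards chained with `requires` (one per coreboot version), a cache saved by the last of them, then the rest fanning out in parallel. Job names, the cache key and the trailing board list are illustrative, and the other jobs' definitions are omitted for brevity:

```yaml
jobs:
  board_librem_mini:           # coreboot 4.13, last of the sequential chain
    docker:
      - image: debian:10
    steps:
      - checkout
      - attach_workspace:
          at: .
      - run: make BOARD=librem_mini
      # Only this job writes the cache, once all three toolchains exist.
      - save_cache:
          key: heads-build-v1-{{ .Branch }}
          paths:
            - build
            - crossgcc

workflows:
  build:
    jobs:
      - board_qemu_coreboot                  # coreboot 4.8.1
      - board_kgpe_d16:                      # coreboot 4.11
          requires: [board_qemu_coreboot]
      - board_librem_mini:
          requires: [board_kgpe_d16]
      - build_board:                         # everything else, in parallel
          requires: [board_librem_mini]
          matrix:
            parameters:
              board: [x230-hotp-maximized, t430]   # one entry per remaining board
```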
This worked, so I'll push that to master as a temporary fix so that CI builds are successful as of now.
Will actually push my luck more and push CPUS=24: https://app.circleci.com/pipelines/github/tlaurion/heads/705/workflows/357533c7-8e62-416f-8479-b2f548949081/jobs/822 Edit: worked.
Quick review.
Other than that, I'm impressed by the time gained! Awesome!
So we have @Thrilleratplay's modified me_cleaner in the branch already for xx20, so we don't need to download anything.
I'll try and explain. What's basically happening is: we are building a board for coreboot 4.8.1 (qemu-coreboot) and saving the whole project's workspace. Then loading it again for the coreboot 4.11 board (kgpe-d16_workstation) to be built. Saving. Loading. Then the coreboot 4.13 board (librem_mini) is built. That workspace from librem_mini is now saved, and it includes all the built toolchains from coreboot 4.8.1, 4.11 and 4.13. Most of the modules are also built. This workspace is then loaded by all the other boards, in parallel, and they are then built.
Don't get me wrong, there are still things we could do to make this better; notably, we could build a single board from each coreboot version in parallel, and then build the remaining boards. Things get complicated then, though, as CircleCI refuses to overwrite other saved workspaces, so we have to be careful to instruct CircleCI to only save files/folders that will not clash. As I said in the PR though, hopefully what I've done here is a good starting point.
@Tonux599 Right. Sorry for the confusion. Forgot about that.
@Tonux599 : I quickly reviewed, and see this PR as a safe immediate move, since a fresh build without cache would otherwise fail the 5-hour limit. I still don't get the gist of how to optimize this better, but from what I see, a fresh build takes 5h43 (vs 4h56), while a full build of all boards from the most complete cache available takes 4h (vs 1h19). Should we merge preventively?
Following the rabbit hole, trying to understand what is happening and consuming time on builds even when the cache is available. Quick observations on the last rebuild with a full cache available:
@Tonux599 : I understand that the cache alone cannot be reused, and that workspaces need to be saved and passed as persistent workspaces between each sequential build? Wouldn't there be a way to only pass workspaces on fresh builds, when caches are not available? Maybe I'm just dreaming here. Can you link the docs you have in the back of your mind for further optimizations?
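On the question of reusing the cache on fresh builds: CircleCI's `restore_cache` accepts a list of key prefixes and falls back down the list, so a job can start from the newest matching cache when one exists and from an empty tree otherwise. A sketch with illustrative keys matching the save_cache example further up:

```yaml
steps:
  - checkout
  - restore_cache:
      keys:
        # Most specific first; each entry is a prefix match, so a partial
        # hit still restores the newest cache saved under that prefix.
        - heads-build-v1-{{ .Branch }}
        - heads-build-v1-
  - run: make BOARD=qemu-coreboot   # proceeds from scratch on a cache miss
```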
#1015 merged |
Houston, we have a problem:
https://app.circleci.com/pipelines/github/tlaurion/heads/694/workflows/9d351748-166a-4fa6-b0e4-7a08510cedcc/jobs/765
Build fails because:
Documented here:
https://discuss.circleci.com/t/job-times-out-after-5-hours/32220/4
Possible solutions: