chore: (ci) split cache #2531

Merged
merged 10 commits into from
Jan 30, 2024

tychoish
Contributor

As discussed out of band, this PR:

  • creates a separate set of caches for each branch. This means that the first build of any PR will be a bit slower than it is today, but that every push after that will be (quite) quick.
  • splits the caches between unittests/glaredb/python/nodejs/clippy: all of these compile the world, and can't share each other's cache because they have different flags.
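
The two points above amount to a cache key that varies by job and by branch. The following is a minimal sketch of such a key scheme, not the workflow configuration from this PR: the function name `cache_key`, the key layout, and the lockfile-hash component are all illustrative assumptions.

```python
import hashlib

# Job names taken from the PR description; each one compiles with different
# flags, so each gets its own cache.
JOBS = ("unittests", "glaredb", "python", "nodejs", "clippy")


def cache_key(job: str, branch: str, lockfile: bytes) -> str:
    """Build a hypothetical cache key that is unique per job and per branch.

    The lockfile digest is an assumed extra component so that dependency
    changes produce a fresh key; the PR does not say whether it does this.
    """
    if job not in JOBS:
        raise ValueError(f"unknown job: {job}")
    digest = hashlib.sha256(lockfile).hexdigest()[:12]
    # Flatten slashes so the branch name reads as a single key segment.
    safe_branch = branch.replace("/", "-")
    return f"{job}-{safe_branch}-{digest}"


if __name__ == "__main__":
    for job in JOBS:
        print(cache_key(job, "tycho/ci-per-branch-caching", b"example lockfile"))
```

With a scheme like this, the first build on a new branch finds no matching key and has to populate every cache, which matches the slower first build described above.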

@tychoish
Contributor Author

This is an interesting study:

  • the first build has to populate the caches and build everything. The caches (particularly the unittests cache) are also huge, so we lose a lot of time uploading files. It took 16m wallclock time and 32m billable time (compile tasks run on large runners).
  • the second build could just use the caches: it took 13m30s end to end, with exactly 13m of billable time. The fact that both numbers are 13 is a coincidence. We might be able to tweak this down to about 10m of each, but not much further.
    • the unittests had to compile glaredb the binary twice.
    • the python best case isn't much better than using the "build" cache.
  • SQLServer
    • count pushdown would save about 2 minutes from the longest part of the build, so that would help the developer feedback time in any case (but same cost)
    • table scans are pretty slow everywhere for the bikeshare trip data
    • even with count pushdown, it's still a bit slow, but some of that might be how we do the data load
  • until sqlserver gets faster there's no time-related reason to break out the remaining integration test group, but it might help clarity in some cases (particularly postgres)
  • I've noticed for days that the node bindings have equivalent performance on the midsized runners, so I'm going to go with that.
  • One thing I noticed (anecdotally) is that the GitHub Actions scheduler made a mistake: it started the build task after it had started other tasks with the same dependencies, which meant that the total time was a bit longer (in both cases).

I think you can set up a sort of fallback restore key, so that a job falls back to a different cache when its own is missing. I'm going to give that a try, and move the unittests around so they can get quicker.
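
The fallback idea can be sketched as a lookup: try the exact key first, then walk a list of progressively broader prefixes and take the newest cache that matches. This is a simplified model of how restore keys behave in GitHub's cache action, not the configuration tried in this PR. The function name `resolve_cache`, the key strings, and the in-memory list standing in for the cache store are assumptions, and the model ignores branch scoping rules.

```python
from typing import Optional, Sequence


def resolve_cache(
    key: str,
    restore_keys: Sequence[str],
    stored: Sequence[str],
) -> Optional[str]:
    """Pick which stored cache to restore.

    `stored` is assumed to be ordered from oldest to newest. An exact match
    on `key` wins; otherwise each restore key is tried in order as a prefix,
    and the newest matching entry is returned.
    """
    if key in stored:
        return key
    for prefix in restore_keys:
        matches = [k for k in stored if k.startswith(prefix)]
        if matches:
            return matches[-1]  # newest entry with this prefix
    return None  # full cache miss: the job builds from scratch


if __name__ == "__main__":
    # Hypothetical store: only the base branch has populated caches so far.
    stored = ["unittests-main-aaa", "glaredb-main-bbb", "unittests-main-ccc"]
    hit = resolve_cache(
        "unittests-feature-zzz",
        ["unittests-feature-", "unittests-"],
        stored,
    )
    print(hit)
```

Under this model, the first build on a fresh branch would start from the newest cache for the same job instead of from nothing, which is the slow case measured above.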

@tychoish tychoish merged commit 00d0db8 into tycho/ci-dag-reorg Jan 30, 2024
22 checks passed
@tychoish tychoish deleted the tycho/ci-per-branch-caching branch January 30, 2024 21:02

2 participants