Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up builds on CI with more caching #3200

Closed
overlookmotel opened this issue May 7, 2024 · 22 comments
Closed

Speed up builds on CI with more caching #3200

overlookmotel opened this issue May 7, 2024 · 22 comments
Assignees
Labels
C-performance Category - Solution not expected to change functional behavior, only performance

Comments

@overlookmotel
Copy link
Contributor

Follow on from #2981.

The largest component of the time it takes to run benchmarks on CI is building the benchmarks.

Obviously the benchmarks for whichever crates a PR touches do need to be rebuilt, but all the rest could be cached.

The docs for Swatinem/rust-cache say:

In particular, the workspace crates themselves are not cached since doing so is generally not effective.

That would indeed make sense for a single crate, but in a workspace project where the majority of PRs will only touch a single crate, I don't think this is a good strategy.

Not sure how to solve it, but I imagine if we can improve the caching strategy, it could improve CI times significantly.

@overlookmotel overlookmotel added the C-bug Category - Bug label May 7, 2024
@overlookmotel overlookmotel changed the title Speed up builds on CI Speed up builds on CI with more caching May 7, 2024
@overlookmotel overlookmotel added C-performance Category - Solution not expected to change functional behavior, only performance and removed C-bug Category - Bug labels May 7, 2024
@overlookmotel
Copy link
Contributor Author

overlookmotel commented May 7, 2024

Here's a wacky idea:

If we could determine which crates' benchmarks are unaffected by the changes in a PR, could prevent benchmarks for those crates running at all.

I've been using this hack on #2457 to get faster feedback:

// Add cached results for benchmarks which weren't run
const cacheZipPath = pathJoin(__dirname, 'cachedBenches.tar.gz'),
cacheDir = pathJoin(dataDir, 'cache');
await fs.mkdir(cacheDir);
await extractTar({file: cacheZipPath, cwd: cacheDir});
for (const filename of await fs.readdir(cacheDir)) {
const match = filename.match(/^(.+)_(\d+)\.out$/);
assert(match, `Unexpected file in cache: ${filename}`);
const [, component, pid] = match;
if (components.has(component)) continue;
const outPath = pathJoin(dataDir, filename);
await fs.rename(pathJoin(cacheDir, filename), outPath);
profileFiles.push({pid: +pid, path: outPath});
}

It only runs the benchmarks for NAPI, but uploads a cached copy of results for all the other benchmarks to CodSpeed, so that CodSpeed's dashboard doesn't show "missing benchmark" and they all show as 0% change, except the one that I'm interested in.

This would also stop every second PR from showing a nonsense "oxc_semantic regressed 3%" / "oxc_semantic improved 5%" message.

@Boshen
Copy link
Member

Boshen commented May 7, 2024

I already tried.

You need something like https://github.com/holmgr/cargo-sweep to tidy up things, but I got into dead ends. I read all the issues around this topic and even tried building my own tools, but there's some problem with cargo not keeping some meta if I remember correctly.

The best option is actually https://github.com/mozilla/sccache with an aws s3 backend, but I'm too cheap for that :-)

@overlookmotel
Copy link
Contributor Author

Ah. It's more complex than I thought.

The best option is actually https://github.com/mozilla/sccache with an aws s3 backend

How about using sccache with Github Actions backend?
https://github.com/mozilla/sccache/blob/main/docs/GHA.md

And what do you think about my "wacky idea" above?

@Boshen
Copy link
Member

Boshen commented May 7, 2024

If the goal is to speed up the benchmark build, 2 things:

  1. lto = "thin", it should cut the build time by a huge amount with about 10% perf regression
  2. figure out why it's building the entire workspace?

edit: oh it's building the whole crate ... we can split all the crates up because cargo caches at the crate level :-)

Run cargo build --release -p oxc_benchmark --bench lexer --features codspeed
    Updating crates.io index
   Compiling oxc_span v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_span)
   Compiling oxc_index v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_index)
   Compiling oxc_allocator v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_allocator)
   Compiling oxc_ast_macros v0.0.0 (/home/runner/work/oxc/oxc/crates/oxc_ast_macros)
   Compiling oxc_diagnostics v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_diagnostics)
   Compiling oxc_sourcemap v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_sourcemap)
   Compiling oxc_macros v0.0.0 (/home/runner/work/oxc/oxc/crates/oxc_macros)
   Compiling oxc_syntax v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_syntax)
   Compiling oxc_ast v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_ast)
   Compiling oxc_tasks_common v0.0.0 (/home/runner/work/oxc/oxc/tasks/common)
   Compiling oxc_semantic v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_semantic)
   Compiling oxc_parser v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_parser)
   Compiling oxc_codegen v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_codegen)
   Compiling oxc_minifier v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_minifier)
   Compiling oxc_linter v0.0.0 (/home/runner/work/oxc/oxc/crates/oxc_linter)
   Compiling oxc_prettier v0.0.0 (/home/runner/work/oxc/oxc/crates/oxc_prettier)
   Compiling oxc_transformer v0.12.5 (/home/runner/work/oxc/oxc/crates/oxc_transformer)
   Compiling oxc_benchmark v0.0.0 (/home/runner/work/oxc/oxc/tasks/benchmark)

@Boshen
Copy link
Member

Boshen commented May 7, 2024

How about using sccache with Github Actions backend?

https://github.com/mozilla/sccache/blob/main/docs/GHA.md

Already tried a year go #268

@Boshen
Copy link
Member

Boshen commented May 7, 2024

And what do you think about my "wacky idea" above?

I don't not a fan of constructing our infra against a third party tool ...

@overlookmotel
Copy link
Contributor Author

Already tried a year go #268

Damn!

cargo caches at the crate level

Would you mind explaining that a bit more? This is what I was trying to say originally, but I'm not great on infrastructure stuff, so probably don't know the right words/concepts.

@Boshen
Copy link
Member

Boshen commented May 7, 2024

Split oxc_benchmark into more crates, i.e. oxc_bench_lexer etc etc

I can try this tomorrow when I wake up.

edit: I think it's best to use lto = "thin" in local and CI, and lto = "fat" for final release, I already did this for Rspack.

@overlookmotel
Copy link
Contributor Author

Ah I see. I didn't try that exactly, but I did try putting all oxc_benchmark's dependencies (oxc_linter, oxc_prettier etc) behind different features and then only enabling the feature required for each benchmark. It helped a bit, but not much.

But maybe problem is I didn't get them to use different cache keys, so they all destroyed each others' caches?

@overlookmotel
Copy link
Contributor Author

overlookmotel commented May 7, 2024

The other thing we could try is having 1 job build oxc_benchmark, and then all the other jobs that actually run the benchmarks wait for the build to complete before starting.

That wouldn't reduce the start-to-end time to run benchmarks, but it wouldn't tie up 10 workers all building the same thing - which was the original problem raised in #2981.

@Boshen
Copy link
Member

Boshen commented May 7, 2024

The other thing we could try is having 1 job build oxc_benchmark, and then all the other jobs to actually run the benchmarks wait for that to complete.

Oh yes, this is simpler, let's do this.

@overlookmotel
Copy link
Contributor Author

overlookmotel commented May 7, 2024

Oh yes, this is simpler, let's do this.

It would be an easy first step. But, personally I really like fast feedback on benchmarks, so if the solution of multiple crates / features gets the benchmarks to the finish line faster, I would favour that approach.

I think quite possibly when I tried it before I screwed it up by having them all clobber each others' caches. You are much better at this stuff than me, and can most likely get more juice out of it than I could.

I think it's worth a try at least.

@overlookmotel
Copy link
Contributor Author

overlookmotel commented May 8, 2024

Just to go back to 2 things you said earlier:

I already tried.

You need something like https://github.com/holmgr/cargo-sweep to tidy up things, but I got into dead ends. I read all the issues around this topic and even tried building my own tools, but there's some problem with cargo not keeping some meta if I remember correctly.

I've just read a bunch of the issues linked to from Swatinem/rust-cache#37

Can't say I understand most of it, but it seems like maybe "some problem with cargo not keeping some meta" was that cargo judges if a file is changed or not by its mtime, but restoring the cache on Github Actions does not preserve the timestamps. If so, this looks like a promising solution for that:

rust-lang/cargo#6529 (comment)

(this "retimer" tool was never made public, but it doesn't sound complicated to build something similar)

It may well be that I'm being hopelessly naive, and there's a whole rabbit hole of further complications. But given that CI speed is a priority for Oxc, maybe it's worthwhile giving it another shot?

And what do you think about my "wacky idea" above?

I don't not a fan of constructing our infra against a third party tool ...

I didn't see it like that. I saw it as hacking an external tool to make it do what we want. i.e. make the infra fit us, rather than us fit the infra (and we are already hacking CodSpeed's API, to do sharded benchmarks).

@overlookmotel
Copy link
Contributor Author

And just to drop the link as a sort of "bookmark", I found this comment interesting on the subject:
yewstack/yew#2340 (comment)

@milesj
Copy link
Contributor

milesj commented May 9, 2024

IMO I'd suggest sccache + S3 again. I use it in moon just fine.

@overlookmotel
Copy link
Contributor Author

@milesj Does S3 not run up the bills quite a bit?

@Boshen
Copy link
Member

Boshen commented May 9, 2024

@overlookmotel That's why are living in a refugee camp. Too poor.

@Boshen
Copy link
Member

Boshen commented May 9, 2024

We don't need more caches, we just need to fix our code #3213

closing as non-actionable.

@Boshen Boshen closed this as not planned Won't fix, can't repro, duplicate, stale May 9, 2024
@overlookmotel
Copy link
Contributor Author

overlookmotel commented May 10, 2024

@Boshen Concerning sccache: How much data storage do we need?

sccache supports Cloudflare R2 which is cheaper than S3 - $1.50 a month for 100 GB storage, and data transfer is free.

Surely even us refugees can afford $1.50 a month?!

Yes, we can improve our code, but at that price, why not do both?

@overlookmotel
Copy link
Contributor Author

Tell you what... I'll split it with you. 75 cents a month each. 😆

@Boshen
Copy link
Member

Boshen commented May 10, 2024

Adding cache and speeding up the build will hide away the fact that our code is too slow to compile. If you have 75 cents, can you spare me a little bit more for https://buildjet.com/for-github-actions 😛

@milesj
Copy link
Contributor

milesj commented May 10, 2024

FWIW, moon's S3 CI costs are only like $8 a month. I have the bucket auto-delete old files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-performance Category - Solution not expected to change functional behavior, only performance
Projects
None yet
Development

No branches or pull requests

3 participants