Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hitting multiple "global" module caches #575

Closed
jerrymarino opened this issue Mar 1, 2021 · 13 comments
Closed

Hitting multiple "global" module caches #575

jerrymarino opened this issue Mar 1, 2021 · 13 comments

Comments

@jerrymarino
Copy link
Contributor

I'd like to use a global module cache to mitigate i/o from a bazel build. It looks like the feature "global module cache" is almost doing this with a non hermetic cache, but it ends up creating a per "configuration" cache.

Specifically:

paths.join(prerequisites.bin_dir.path, "_swift_module_cache"),

This is an issue when the build has multiple OS versions, or transition configurations. It seems to creates module cache per config

bazel-out//ios-x86_64-min10.0-applebin_ios-ios_x86_64-dbg-ST-f15c78a5cd43/bin/_swift_module_cache
bazel-out//ios-x86_64-min11.0-applebin_ios-ios_x86_64-dbg-ST-78e822f89cdf/bin/_swift_module_cache

To fix, we can take the first part of the bin_dir instead of the entire dir: it'd have the effect of pointing swiftc at bazel-out/_swift_module_cache. We could make the change to fix it rules_swift or just fix it externally 👍

@keith
Copy link
Member

keith commented Mar 1, 2021

Hrm interesting. I have heard in the past that these modules can be tied to some of those configurations, for example in the minimum OS case wouldn't you have to make sure to produce the cache with the lowest number? Or is that done automatically with the hashes in the filenames?

@keith
Copy link
Member

keith commented Mar 1, 2021

Also how many module caches do you have that this causes a problem?

@jerrymarino
Copy link
Contributor Author

Yeah I believe it does have to namespace material in module cache at the config level, I haven't looked at this in depth in a while. Case in point other apps seem to use ~/Library/Developer/Xcode/DerivedData/ModuleCache.noindex for all compilation

LMK if your open to this change and I can put up a PR for it. Also, I think it might cause less indexer data to be written but I need to double check.

@keith
Copy link
Member

keith commented Mar 1, 2021

Good point. Seems reasonable to me, @allevato might be able to provide more context on the original path decision tho

@jerrymarino
Copy link
Contributor Author

ATM I've got one swift cache per OS version, per transition config, per bazel workspace

@allevato
Copy link
Member

allevato commented Mar 2, 2021

In the two examples shown above, since the iOS minimum deployment target differs, then the implicit modules between the two builds would differ as well. Moving the cache location out to bazel-out/_swift_module_cache most likely won't change anything of significance; it just means instead of having this layout:

bazel-out/<iOS 10.0 config-specific-path>/bin/_swift_module_cache/<iOS 10.0 clang invocation hash>/
  <.pcms compiled for 10.0>
bazel-out/<iOS 11.0 config-specific-path>/bin/_swift_module_cache/<iOS 11.0 clang invocation hash>/
  <the same .pcms, but compiled for 11.0>

you'll have this:

bazel-out/_swift_module_cache/
  <iOS 10.0 clang invocation hash>/
    <.pcms compiled for 10.0>
  <iOS 11.0 clang invocation hash>/
    <the same .pcms, but compiled for 11.0>

So you still won't actually be sharing anything between the two builds, because Clang still has to compile the modules differently when the minimum deployment target changes.

Now, because Starlark transitions can affect Bazel settings that end up being a superset of what gets passed to the Clang invocation, you could theoretically end up with multiple configurations that could share modules that don't today—and you can detect this by looking at all the _swift_module_cache directories under bazel-out/X and seeing if two or more configs X have a module cache subdirectory with the same Clang invocation hash. But if you find that that's not actually happening in practice, I wouldn't spend time worrying about it.

FWIW, the addition of this feature was before we started using any Starlark transitions, so the possibility of the above happening was zero; all the configuration bits that affect the part of the path before the Starlark transition hash (e.g., ios-x86_64-min10.0-applebin_ios-ios_x86_64-dbg) also explicitly affect the Clang invocation.

@jerrymarino
Copy link
Contributor Author

jerrymarino commented Mar 2, 2021

Ah thanks for clarifying @allevato Yep, my example isn't 100% comprehensive of where the current feature beaks down.

In the example above, getting 1 module cache per OS version in that case would be hinged on having identical transitions per OS, which may not always happen in practice. Specifically, since Bazel's transition config hash is computed on other Bazel CLI args than OS version. There's some issues on github about this e.g. bazelbuild/bazel#12731 and rules_ios ( a ruleset that wraps rules_swift ) originally used multiple config hashes.

I'd still like to have a global cache to:

  1. share the module cache with swift and clang - not 100% sure what it reuses, I'll need to look more into that.
  2. minimize index store size. As dep PCMs written into the unit files, I get larger index store when using this feature compared to giving it an global cache as proposed. I'm getting 2x the index store size: say 200MB VS 400MB. This is fundamentally less data for index-import to process. The per OS versions module cache directories cause this on todays swift release.

For me the indexstore issue is enough of an improvement move the "global module cache" out of rules_swift. Given the transition feature I think it makes sense to make it global.

@allevato
Copy link
Member

allevato commented Mar 2, 2021

share the module cache with swift and clang - not 100% sure what it reuses, I'll need to look more into that.

While you could share the same module cache location, just as the situation above, you won't actually be sharing any modules themselves—you'll end up with the same modules in two different hash subdirectories. There are two reasons for this:

  1. The compiler invocation used internally by ClangImporter in swiftc to import modules is significantly different than anything that would be used when compiling regular Objective-C code with clang.
  2. Even if the invocations were compatible, there's no guarantee that the version of clang distributed in a particular Xcode matches exactly the version of clang linked into swiftc in the same Xcode, and the compiler version must match exactly for the modules to be compatible. Sadly, they're mismatched far more often than they happen to match—I love the table of versions at the bottom of Xcode's Wikipedia page because it shows just how often they don't match.

These two points are why I had to add support for -emit-pcm to swiftc to support explicit modules. Clang can't be reliably used to generate compatible ones; a compiler can only consume implicit or explicit modules that it generated itself, with a compatible invocation.

@jerrymarino
Copy link
Contributor Author

@allevato thanks for sheding some light on the clang/swift interop - looks like we might not be able share the modules from clang / swift and at a higher level might not make sense anyways.

This being said, using the proposed global path still is still cutting my global swift indexstore size in half which makes this a useful to me. Also as we're guaranteed to get a module cache per transition hash - that puts it at O(N swift_library's) swift module caches in the worst case. The same issue exists on the native cc rules: O( N objc_librarys ) objc module caches in the worst case.

So atleast 2 main benefits of using a global cache are:

  • cut my global index store size in half
  • remove per library module cache in worst case ( for per lib transition hashes )

From the compilers' perspective only, it shouldn't cause issues given data is correctly namespaced in that global cache. In general, it seems worth considering since other apps seem OK with use a global cache. In my experience, I haven't seen any negative side effects of a global cache yet with Bazel or remote builds.

Do you know of any cons to the proposed change?

@allevato
Copy link
Member

allevato commented Mar 2, 2021

I can't think of any other problems off the top of my head.

The implicit module cache is already non-hermetic because it's placed at the configuration-specific output root instead of being sandboxed to each target (which would of course defeat the purpose), so it's not like moving it higher up would make it "more non-hermetic".

In the past, we've seen issues where teams would use the global module cache, sync, and then rebuild and the module cache wouldn't be invalidated properly causing the compiler to crash because the underlying header files changed. It happened nondeterministically enough and rarely enough that I was never able to track down the cause, but that happened even when the cache was restricted to the configuration subdirectory. I suppose there's a slim possibility that moving it higher could cause the likelihood of that to increase, but Clang should be isolating the modules well enough in the first place. But it also should have been doing it so that the crashes we saw would have never occurred... 🤷‍♂️

So if you see other concrete benefits from it, I think it should be fine.

@jerrymarino
Copy link
Contributor Author

Thanks for taking the time out of your day to weigh in on this. I totally agree: re tracking down corruption of that cache is hard to reproduce. I've seen it sparsely too. If sourcekit is pointing at a global module cache it will have the same issue when completing/indexing. Some out band intervention e.g. "Remove derived data" when rebasing or cleaning state during project generation time has always seemed required - effectively treating every last module cache as ephemeral.

Today the compilers still put the global state into the module-cache-path so even with today's feature we can get corruption from other modules. Putting it at the transition hash level theoretically may reduce hash collisions when compared to a "more global" cache, but I'm don't know the probability 🤔 . It's totally tangental but maybe there is upside to explicit and hermetic PCM in the build system - hope that can obviate the implicit stuff.

@keith
Copy link
Member

keith commented Oct 26, 2021

Doesn't seem like there's anything super clear to do here. I think the true way out at this point is explicit modules

@keith keith closed this as completed Oct 26, 2021
@luispadron
Copy link
Contributor

Apologies for restarting this old conversation but we pretty regularly hit compilation issues like one described here.

Do ya'll still run into this or have you worked around it somehow?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants