-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance issues with swift.index_while_building
#561
Comments
swift.index_while_building
swift.index_while_building
At Target we used a global index, but only when running locally: 6b77a6b. Remote cache hits would still download to each module folder, but post build we would copy those into the global index. Locally built modules would go directly into the global index. We then only We personally found that the largest hit was the vast number of files and directories being made (and re-made, since I would love to see a first class solution for a global index store, but I think it would require changes to Bazel: bazelbuild/bazel#7527 |
@brentleyjones great to hear that you've got this working! Also thanks for taking the time to write up how you're dealing with this, I'm going to give your blog post a read 👍
So you're writing a per- I was thinking it'd be ideal to stripping out a minimal subset of records before writing to the cache - perhaps that's not 100% necessary if the artifacts are reasonably small. I was also thinking to leverage With the incremental Yep it seems like Bazel needs some solution to bazelbuild/bazel#7527. Like the clang module cache this is basically a mutable tree artifact that many actions need to mutate 🤔 |
Looks like this feature was introduced here, yay! #563 |
Definitely pass that to reduce extra work |
My analysis was with that flag on, btw. |
Nice, I think this is a sensible default! I tried passing Basically there is a O(N indexes * M imports) problem here, so repos with a ton of code will have same problem but with internal code: N*M indexing internal code anyhow. Making a per module index ( N * M ) is the real root of the issue that going with a global index will fix. In addition to burning CPU and I/O, I get OOMs due to the amount of code it's loading in memory: specifically when I think to implement "Index while building" in Bazel there's some options. Use the original architecture of a global index and implicit cacheRemote exec / caching Workers use a global index internally, and then write records and units into the remote cache by copying material from the worker's global index into Locally Local machines w/o cache writes write to a global index directly. Ideally Xcode can just read from this index, possibly with a patch to Xcode or somehow preventing the re-mapping Implement "per module" indexingPossibility 1. Perhaps we can add a feature to clang and swift that only index files in the module. This could be nice for other reasons |
This potentially seems doable with the worker, similar to how it gives a global module cache? We'd just need to be able to copy out the relevant bits of the index to the desired output directory after the |
Yep it seems totally reasonable and I think we'd just need a C++ program to extract the relevant units/records. The other piece to this is making it work with the swift and objc rules. In practice, this rules_swift feature doesn't work with a large majority of the objc code and there isn't yet a way to handle objc code and remote caching I know if. The other simple/pragmatic option is to disable "Index while building" in the interim: driving this with Xcode, letting it manage the indexes incrementally as you open files after a build. With a focused project it will quickly index the subset of the code in the editor w/o paying to index the entire graph. Even a theoretical optimized approach that uses a global index will have a noticeable impact on build perf. |
This PR adds the capability to import output files from a compilation: to improve performance and experiment with using a "global shared index" while remote caching a subset for lib. The github issue has some overall mentions of this bazelbuild/rules_swift#561 Index while building is performant in clang and swift becauase it conditionally writes data based on files in a global index cache reused across all clang/swift invocations in the build. The overhead of 3-5% performance hit in the whitepaper of "Index while building" was hinged on the fact it'd be using a global index cache. With the per library index as rules_swift implemented this feature, "Index while building" adds a 300% increase in build times in my testing. Longer term, can add flags to clang and swift indexing to be aware of remote compilation and caching, this is a quick hack incase that doesn't ever happen. Remote build possibilities: The plan is to be able to use a global index on compilation fix compilation performance and minimize the O ( M imports * N modules ) disk usage. I'd then intend to use this code to import a subset of the global index material into the remote cache by putting it in `bazel-out` at `swift_library`'s "index-store" path. Eventually a feature can be added to rules_cc following the same pattern for .m, .cpp Local build possibilities: I plan to import swift_library cache remote cache hits from `bazel-out` into Xcode's index. However, importing output files as a capability is useful even if that doesn't play out: if we orient index-import around output files, then it doesn't have to enumerate and process an entire index ever again. Another build related program ran aas an aspect or via BEP could read outputs and then invoke index-import with these outputs.
Overall architecture of design for "Index While Building" in Bazel - MobileNativeFoundation/index-import#53 - bazelbuild/rules_swift#567 - bazelbuild/rules_swift#561
* Writeup doc on Index While Building Overall architecture of design for "Index While Building" in Bazel - MobileNativeFoundation/index-import#53 - bazelbuild/rules_swift#567 - bazelbuild/rules_swift#561
This PR adds the capability to import output files from a compilation: to improve performance and experiment with using a "global shared index" while remote caching a subset for lib. The github issue has some overall mentions of this bazelbuild/rules_swift#561 Index while building is performant in clang and swift becauase it conditionally writes data based on files in a global index cache reused across all clang/swift invocations in the build. The overhead of 3-5% performance hit in the whitepaper of "Index while building" was hinged on the fact it'd be using a global index cache. With the per library index as rules_swift implemented this feature, "Index while building" adds a 300% increase in build times in my testing. Longer term, can add flags to clang and swift indexing to be aware of remote compilation and caching, this is a quick hack incase that doesn't ever happen. Remote build possibilities: The plan is to be able to use a global index on compilation fix compilation performance and minimize the O ( M imports * N modules ) disk usage. I'd then intend to use this code to import a subset of the global index material into the remote cache by putting it in `bazel-out` at `swift_library`'s "index-store" path. Eventually a feature can be added to rules_cc following the same pattern for .m, .cpp Local build possibilities: I plan to import swift_library cache remote cache hits from `bazel-out` into Xcode's index. However, importing output files as a capability is useful even if that doesn't play out: if we orient index-import around output files, then it doesn't have to enumerate and process an entire index ever again. Another build related program ran aas an aspect or via BEP could read outputs and then invoke index-import with these outputs.
Closed by #567 |
Good morning! I'm again looking at how to use the feature
swift.index_while_building
with Bazel and found it to introduce significant build performance issues. The impact of using this flag is a 300% increase in swift compilation times as opposed to not using it.From what I can tell, it adds multiple GB of output to the build - as there is a per
swift_library
index, all system modules and deps seem to be indexed O(N swift_library) times. rules_ios uses the programindex-import
which churns trying to import these indicies for longer than it'd takes Xcode to index the source code after build. Using a "global index" - a single index mitigates the perf impact, but I'm not sure how this would work with remote caching.I was curious how "index while building" is working with
rules_swift
and Bazel in general for larger builds. XCHammer doesn't have this feature and works around it by letting Xcode index a focused subset after the build.The text was updated successfully, but these errors were encountered: