support C++20 Modules #19940
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
This PR already got a lot of attention at Google in the group of C++ toolchain maintainers / experts. There’s a desire to have it, but no concrete/incompatible plans yet. The design would need some changes so that it’s compatible and supports Google well. (Think of easier maintenance in the future.) I’m not an expert in C++, but I will start the discussion internally and come back with possible requirements/changes when we figure out what they are.
Some people are out of office. The main discussion will start in the second week of November. I’ll post the next update after that.
I rebased the PR onto the latest master branch due to
gentle ping :-)
CMake developer here; just tracking how modules are being implemented in various places :). I read through the design doc and had a few comments. Since it was already merged, I figured that here may be better; can move wherever is best though.
No, Clang doesn't have such plans (deprecating the 2-phase compilation model), at least for now.
Yes, but the story of the 2-phase compilation model seems really appealing, so a build system supporting the 2-phase compilation model may be an advantage. In the future, build systems may be able to support both (or even more) compilation models and let users make the choice.
To be more precise, there may be multiple kinds of BMIs in the future and Clang may have a 3-phase with the trimmed BMI being the "interesting" bit for importers in the future, but still using the full BMI for codegen. Clang is also getting a (proper rather than "frontend does the 2-phase internally" of today) 1-phase compilation like GCC and MSVC as well.
I agree. However, I prioritized 1-phase over 2-phase for CMake due to compiler support.
Agreed. However, given the simplicity of the 1-phase, I find it better for the initial implementation. There are a number of performance things that can be looked at in the future:
Basically my main interest is in getting things working across the ecosystem as a baseline before we start up our ricer cars. Of course, Bazel can do as they please; I can only offer my view on things here.
This issue was filed against CMake. Unconditional redirection of
You might want to consider a way to tell Bazel that some sources do not use modules and can therefore completely skip scanning (and, if nothing in the target needs scanned, the target's collation step as well).
// if cpp20_module enabled, only c++20-deps-scanning will produce .d file
// other actions will reuse the .d file from c++20-deps-scanning
I don't think this is accurate, as the "real" compile may mention other files in its `.d` output, including, but not limited to:
- the BMI files that are read (or only those that are used)
- modmap files
- header units which are translated into imports may stop reading the header and read the BMI directly
The last one should be covered by the header changing -> trigger a rescan. Not listing it here allows the build graph to skip the compile when the change is inconsequential to it, by waiting for the scan to say so rather than queuing up the compile automatically.
Preconditions.checkState(module.isFileType(CppFileTypes.CPP_MODULE), "Non-module? %s", module);
var skyValue = actionExecutionValues.get(module.getGeneratingActionKey());
if (skyValue == null) {
return null;
This seems like a problematic error case; no messages or context about what happened?
src/main/java/com/google/devtools/build/lib/rules/cpp/CppCompileAction.java
public CppCompileActionBuilder setPcmFiles(NestedSet<Artifact.DerivedArtifact> pcmFiles) {
this.pcmFiles = pcmFiles;
return this;
}
Indentation seems weird here. Also looks like a missing newline after this brace.
<li>Clang use cppm </li>
<li>GCC can use any source file extension </li>
<li>MSVC use ixx </li>
All three can use any extension with the right flags (e.g., `-x c++-module` or `-interface`/`-interfacePartition`). These are the preferred extensions.
var scanDepsBuilder = initializeCompileAction(sourceArtifact);
scanDepsBuilder.setActionName(CppActionNames.CPP20_DEPS_SCANNING);
scanDepsBuilder.setOutputs(ddiFile, dotdFile, null);
// only c++20-deps-scanning add .d file
As noted elsewhere, this seems unwise.
src/main/java/com/google/devtools/build/lib/rules/cpp/Cpp20ModuleDepMapAction.java
content.append("module-file=");
content.append(moduleName);
content.append("=");
content.append(modulePath);
Does any escaping mechanism need to be considered here (e.g., for spaces in the path)?
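To make the escaping question concrete, here is a minimal Python sketch rather than the PR's Java code. It assumes the modmap is consumed as a response file with GNU-style quoting, which would need to be checked for each compiler. The helper names `escape_arg` and `modmap_line` are hypothetical, and the `-fmodule-file=<name>=<path>` spelling is Clang's flag, used here for illustration.

```python
import shlex

# Characters that a GNU-style response-file tokenizer treats specially
# (assumption: the modmap is read through such a tokenizer).
_SPECIAL = set(' \t\n\\"\'')


def escape_arg(arg: str) -> str:
    """Backslash-escape whitespace, quotes, and backslashes in one argument."""
    return "".join("\\" + ch if ch in _SPECIAL else ch for ch in arg)


def modmap_line(module_name: str, bmi_path: str) -> str:
    """Hypothetical helper: one Clang-style modmap entry, escaped as a single token."""
    return escape_arg(f"-fmodule-file={module_name}={bmi_path}")


# A path containing a space would otherwise be split into two arguments.
line = modmap_line("foo", "bazel-out/my cfg/bin/foo.pcm")
print(line)

# shlex.split stands in for the compiler's tokenizer in this sketch.
print(shlex.split(line))
```

Without the escaping step, the same path would tokenize into two arguments and the second half would be treated as a stray input file.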
src/main/java/com/google/devtools/build/lib/rules/cpp/Cpp20ModuleDepMapAction.java
@SerializedName("source-path")
private String sourcePath;
Future header unit support would require reading `use-source-path` (bool) and `lookup-method` (enum). It might be prudent to read these and fail gracefully with a message about header unit non-support.
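The "fail gracefully" suggestion could look roughly like the following Python sketch (the PR itself is Java). It assumes that a missing `lookup-method` means the default `by-name` lookup and that any other value indicates a header unit; both assumptions should be checked against the P1689 paper. The function and exception names are hypothetical, and the boolean field mentioned in the comment is left out.

```python
import json


class UnsupportedHeaderUnitError(ValueError):
    """Raised when a scan result refers to something other than a named module."""


def named_module_requires(ddi_text: str) -> list[str]:
    """Return the named modules a DDI requires, rejecting header-unit lookups."""
    ddi = json.loads(ddi_text)
    names = []
    for rule in ddi.get("rules", []):
        for req in rule.get("requires", []):
            # Assumption: an absent `lookup-method` means "by-name".
            method = req.get("lookup-method", "by-name")
            if method != "by-name":
                raise UnsupportedHeaderUnitError(
                    f"header units are not supported yet: "
                    f"{req.get('logical-name')!r} uses lookup-method {method!r}"
                )
            names.append(req["logical-name"])
    return names


sample = '{"version": 1, "revision": 0, "rules": [{"requires": [{"logical-name": "b"}]}]}'
print(named_module_requires(sample))
```

Raising a dedicated error type keeps the failure distinguishable from malformed JSON, so the build can report "header units are unsupported" instead of a generic parse failure.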
TL;DR The Bazel team has decided to accept this PR, I'll be doing the reviews and I'll get some help from internal C++ experts, namely @trybka. We identified the following risks:
- We'd like to keep the maintenance costs at a minimum: the Bazel team will only do reviews on PRs after the initial community review. We won't address any issues that are reported. We don't mind if the community addresses them.
- We'd like to keep the change behind an experimental flag, to mitigate the risk of divergent implementations. While the change is under the experimental flag, there is no guarantee about incompatible changes. If Google does an internal implementation, we'd like it to match, to reduce maintenance costs.
- We'd also like to make the change as "modular" as possible, in order to make it easier to remove in the future. That might happen in the unlikely scenario that Google doesn't implement support for C++20 modules and this remains the only complexity in CppCompileAction that can't be rewritten to Starlark. In case this scenario plays out, the C++20 modules support will probably need to be implemented in a different way.
That said, we do see the benefits of this change for both the community and Google. Thank you for your contribution.
hi @comius, I have split this XXL PR into 6 smaller commits. Initially, I hoped to divide it into independent small patches (see #22425, #22427), but that proved to be unfeasible due to dependencies between the patches (#22429). Later, I plan to use stacked PRs to facilitate code review. However, stacked PRs require creating branches in the target repository first, and I'm not sure if I could be granted the necessary permissions. I've also created a demo of stacked PRs in my repository (https://github.com/PikachuHyA/bazel/pulls) as a backup. Do you have any suggestions on the code review process? BTW, the Windows CI is broken; I will fix it later.
@mathstuf Thanks for your comments. I will make the related code changes as soon as possible.
Nothing should require that; tools doing so should…work on that. It's kind of crazy to make tools not available for external contributors to projects. I believe https://stacked-git.github.io/ does most of its work locally so that at least you're not tied to any GitHub limitations.
I split the XXL PR #19940 into several small patches.

This is the first patch of Support C++20 Modules. It only adds the `module_interfaces` attribute.

Example:

- foo.cppm

```
// foo.cppm
export module foo;
// ...
```

- BUILD.bazel

```
cc_library(
    name="foo",
    copts=["-std=c++20"],
    module_interfaces=["foo.cppm"],
    # features=["cpp20_module"]
)
```

The build fails with the following message:

```
➜ bazel build :foo
ERROR: bazel_demo/BUILD.bazel:1:11: in cc_library rule //:foo:
Traceback (most recent call last):
	File "/virtual_builtins_bzl/common/cc/cc_library.bzl", line 40, column 42, in _cc_library_impl
	File "/virtual_builtins_bzl/common/cc/semantics.bzl", line 123, column 13, in _check_can_module_interfaces
Error in fail: attribute module_interfaces: requires --experimental_cpp20_modules
ERROR: bazel_demo/BUILD.bazel:1:11: Analysis of target '//:foo' failed
ERROR: Analysis of target '//:foo' failed; build aborted
INFO: Elapsed time: 0.106s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
ERROR: Build did NOT complete successfully
```

To build with C++20 Modules, the flag `--experimental_cpp20_modules` must be added.

```
➜ bazel build :foo --experimental_cpp20_modules
ERROR: bazel_demo/BUILD.bazel:1:11: in cc_library rule //:foo:
Traceback (most recent call last):
	File "/virtual_builtins_bzl/common/cc/cc_library.bzl", line 41, column 34, in _cc_library_impl
	File "/virtual_builtins_bzl/common/cc/cc_helper.bzl", line 1225, column 13, in _check_cpp20_modules
Error in fail: to use C++20 Modules, the feature cpp20_modules must be enabled
ERROR: bazel_demo/BUILD.bazel:1:11: Analysis of target '//:foo' failed
ERROR: Analysis of target '//:foo' failed; build aborted
INFO: Elapsed time: 0.091s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
ERROR: Build did NOT complete successfully
```

To build with C++20 Modules, the feature `cpp20_modules` must also be enabled.

```
bazel build :foo --experimental_cpp20_modules --features cpp20_modules
```

The flag `--experimental_cpp20_modules` applies globally, while the feature `cpp20_modules` applies per target. This patch itself does not yet do anything with the C++20 module interfaces.

Closes #22425.

PiperOrigin-RevId: 643303029
Change-Id: I08d8a1186d2ddd1c632f1e768442e504b87a0691
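The two-level gating shown in the error messages above can be restated as a small Python sketch. This is not the actual Starlark from `semantics.bzl` or `cc_helper.bzl`; the function name and parameters are hypothetical, and only the two error strings are taken from the build output.

```python
def check_cpp20_modules(module_interfaces, flag_enabled, enabled_features):
    """Hypothetical restatement of the checks behind the two build errors."""
    if not module_interfaces:
        # Targets without module interfaces are unaffected by either switch.
        return
    if not flag_enabled:
        # Global switch: --experimental_cpp20_modules
        raise ValueError(
            "attribute module_interfaces: requires --experimental_cpp20_modules"
        )
    if "cpp20_modules" not in enabled_features:
        # Per-target switch: --features cpp20_modules (or the target's features)
        raise ValueError(
            "to use C++20 Modules, the feature cpp20_modules must be enabled"
        )


# Passes silently once both the flag and the feature are set.
check_cpp20_modules(["foo.cppm"], True, {"cpp20_modules"})
```

Checking the global flag first matches the order in which the two errors appear in the logs.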
This patch adds `compiler_input_flags_feature` and `compiler_output_flags_feature` to the features. It follows #22717.

By default, the features `compiler_input_flags_feature` and `compiler_output_flags_feature` are included through `CppActionConfigs.java` in the `getFeaturesToAppearLastInFeaturesList` method. For reference, see the relevant code here:

https://github.com/bazelbuild/bazel/blob/0dbfaccaf5bee5ea7f11c01db1fc0cd1ca7f3810/src/main/java/com/google/devtools/build/lib/rules/cpp/CppActionConfigs.java#L1513-L1573

## Background

I modified `tools/cpp/unix_cc_toolchain_config.bzl` and found no input and output on macOS when testing #19940 with the new action names `c++20-deps-scanning` and `c++20-module-compile`.

As discussed in #22429 (comment), I added these two features to `unix_cc_toolchain_config.bzl`. The Windows toolchains already have these features, so no modifications were necessary for `windows_cc_toolchain_config.bzl`.

- Windows input flags: https://github.com/bazelbuild/bazel/blob/786a893ef6f69a8f77ca008a478bf67abfdcdc57/tools/cpp/windows_cc_toolchain_config.bzl#L1073-L1095
- Windows output flags: https://github.com/bazelbuild/bazel/blob/786a893ef6f69a8f77ca008a478bf67abfdcdc57/tools/cpp/windows_cc_toolchain_config.bzl#L960-L1020

cc @comius

Closes #22743.

PiperOrigin-RevId: 643345702
Change-Id: I5715d25e12c7a3616d1fdb484f77ef7cd0fd1bba
This patch adds `dependency_file_feature` to the features when the OS is macOS.

The feature `dependency_file_feature` is added by default through `getLegacyFeatures` in `CppActionConfigs.java`:

https://github.com/bazelbuild/bazel/blob/0dbfaccaf5bee5ea7f11c01db1fc0cd1ca7f3810/src/main/java/com/google/devtools/build/lib/rules/cpp/CppActionConfigs.java#L93-L117

## Background

I modified `tools/cpp/unix_cc_toolchain_config.bzl` and found that `dependency_file` did not work on macOS when testing #19940 with the new action names `c++20-deps-scanning` and `c++20-module-compile`.

After adding `dependency_file_feature` to the features, it works.

cc @comius

Closes #22717.

PiperOrigin-RevId: 643345857
Change-Id: I50210592edd1082e2328c7e4ab68bd0c76087aaa
hi @peakschris, (#22425 (comment)) thanks very much for your interest in this PR.
I have completed the one-phase compilation support for GCC, Clang, and MSVC, as well as the two-phase compilation support for Clang.
The first patch has already been merged. The remaining patches need some time for review.
Thanks. Your feedback and anticipation are really valuable to us.
I am hugely excited to use this for my own project. Congrats on your hard work and perseverance.
Thanks @PikachuHyA, I asked the question here and then moved it to the closed PR that I found. It looks like this is an absolutely mammoth task, excellent effort!
I split the XXL PR #19940 into several small patches.

This is the second patch of Support C++20 Modules. It adds the C++20-related tools.

## Overview

This patch contains two tools: `aggregate-ddi` and `gen-modmap`. These tools are designed to facilitate the processing of C++20 modules information and direct dependent information (DDI). They can aggregate module information, process dependencies, and generate module maps for use in C++20 modular projects.

## The format of DDI

The format of DDI content is [p1689](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html). For example:

```
{
  "revision": 0,
  "rules": [
    {
      "primary-output": "path/to/a.pcm",
      "provides": [
        {
          "is-interface": true,
          "logical-name": "a",
          "source-path": "path/to/a.cppm"
        }
      ],
      "requires": [
        {
          "logical-name": "b"
        }
      ]
    }
  ],
  "version": 1
}
```

## Tools

### `aggregate-ddi`

#### Description

`aggregate-ddi` is a tool that aggregates C++20 module information from multiple sources and processes DDI files to generate a consolidated output containing module paths and their dependencies.

#### Usage

```sh
aggregate-ddi -m <cpp20modules-info-file1> -m <cpp20modules-info-file2> ... -d <ddi-file1> <path/to/pcm1> -d <ddi-file2> <path/to/pcm2> ... -o <output-file>
```

#### Command Line Arguments

- `-m <cpp20modules-info-file>`: Path to a JSON file containing C++20 module information.
- `-d <ddi-file> <pcm-path>`: Path to a DDI file and its associated PCM path.
- `-o <output-file>`: Path to the output file where the aggregated information will be stored.

#### Example

```sh
aggregate-ddi -m module-info1.json -m module-info2.json -d ddi1.json /path/to/pcm1 -d ddi2.json /path/to/pcm2 -o output.json
```

### `generate-modmap`

#### Description

`generate-modmap` is a tool that generates a module map from a DDI file and a C++20 modules information file. It creates two output files: one for the module map and one for the input module paths.

#### Usage

```sh
generate-modmap <ddi-file> <cpp20modules-info-file> <output-file> <compiler>
```

#### Command Line Arguments

- `<ddi-file>`: Path to the DDI file containing module dependencies.
- `<cpp20modules-info-file>`: Path to the JSON file containing C++20 modules information.
- `<output-file>`: Path to the output file where the module map will be stored.
- `<compiler>`: The compiler the modmap is generated for. Only `clang`, `gcc`, and `msvc-cl` are supported.

#### Example

```sh
generate-modmap ddi.json cpp20modules-info.json modmap clang
```

This command will generate two files:

- `modmap`: containing the module map.
- `modmap.input`: containing the module paths.

Closes #22427.

PiperOrigin-RevId: 668488153
Change-Id: Icde51b498f1ecc5c1182427029d0a81ce7c2f686
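As a rough illustration of what the modmap generation described above does conceptually, the following Python sketch reads the example DDI and emits Clang-style modmap lines. It is not the PR's tool: `read_ddi` and `clang_modmap` are hypothetical names, the module-name-to-BMI mapping is passed in directly instead of being read from a modules-info file (whose format is not shown here), and only Clang's `-fmodule-file=<name>=<path>` form is covered.

```python
import json

# The DDI example from the commit message above.
DDI = """
{
  "revision": 0,
  "rules": [
    {
      "primary-output": "path/to/a.pcm",
      "provides": [
        {"is-interface": true, "logical-name": "a", "source-path": "path/to/a.cppm"}
      ],
      "requires": [{"logical-name": "b"}]
    }
  ],
  "version": 1
}
"""


def read_ddi(text):
    """Return (provided, required) module names from a P1689-style document."""
    provides, requires = [], []
    for rule in json.loads(text).get("rules", []):
        provides += [p["logical-name"] for p in rule.get("provides", [])]
        requires += [r["logical-name"] for r in rule.get("requires", [])]
    return provides, requires


def clang_modmap(requires, module_to_bmi):
    """Hypothetical: one -fmodule-file=<name>=<path> line per required module."""
    missing = [name for name in requires if name not in module_to_bmi]
    if missing:
        raise KeyError(f"no BMI known for required modules: {missing}")
    return [f"-fmodule-file={name}={module_to_bmi[name]}" for name in requires]


provides, requires = read_ddi(DDI)
print(provides, requires)
print("\n".join(clang_modmap(requires, {"b": "path/to/b.pcm"})))
```

Failing on a missing BMI, instead of silently skipping it, surfaces an incomplete dependency declaration before the compiler reports an unresolved import.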
## Summary

I have split the XXL PR [#19940](#19940) into several smaller patches. This is the third patch to support C++20 Modules, which adds the `deps-scanner` tool and updates toolchains.

This patch includes:

1. New action names
2. File extensions
3. Build variables
4. Updated toolchains for compiling C++20 Modules

## Action Names

Three action names have been added:

- `c++-module-deps-scanning`
- `c++20-module-compile`
- `c++20-module-codegen`

When two-phase compilation is employed:

- `c++-module-deps-scanning`: Scans source files and retrieves C++20 Modules dependencies, storing them in `<filename>.ddi`.
- `c++20-module-compile`: Compiles the C++20 Modules Interfaces to a Built Module Interface (BMI), converting `<filename>.cppm` to `<filename>.pcm`.
- `c++20-module-codegen`: Compiles the BMI to an object file, converting `<filename>.pcm` to `<filename>.o`.

When one-phase compilation is employed:

- `c++-module-deps-scanning`: Operates similarly to two-phase compilation.
- `c++20-module-compile`: Compiles the C++20 Modules Interfaces directly to an object file `<filename>.o` and produces a BMI `<filename>.pcm` as a byproduct.

## File Extensions

We follow the file extensions preferred by different compilers, adding two new `ArtifactCategory`s: `CPP_MODULE_GCM` and `CPP_MODULE_IFC`.

- Clang uses `.pcm` (CPP_MODULE, already exists).
- GCC uses `.gcm` (CPP_MODULE_GCM, new).
- MSVC uses `.ifc` (CPP_MODULE_IFC, new).

Following the CMake implementation, we added three extra `ArtifactCategory`s: `CPP_MODULES_INFO`, `CPP_MODULES_DDI`, and `CPP_MODULES_MODMAP`.

- The `.ddi` file (CPP_MODULES_DDI) stores the dependencies information of one source file.
- The `.CXXModules.json` file (CPP_MODULES_INFO) stores dependencies information for an entire target.
- The `.modmap` file (CPP_MODULES_MODMAP) maps module names to BMIs, with different formats for each compiler.

Additionally, a special `ArtifactCategory`, `CPP_MODULES_MODMAP_INPUT`, is an auxiliary file used to easily obtain the requested BMI paths.

## Build Variables

Two build variables, `CPP_MODULE_MODMAP_FILE` and `CPP_MODULE_OUTPUT_FILE`, have been added.

- `CPP_MODULE_MODMAP_FILE` specifies the path to the `.modmap` file and is used by the `cpp20_modmap_file_feature`.
- `CPP_MODULE_OUTPUT_FILE` specifies the output name of the BMI when one-phase compilation is employed and is used by the `cpp20_module_compile_flags_feature`.

## Toolchains

Three action configs (`cpp_module_scan_deps`, `cpp20_module_compile`, and `cpp20_module_codegen`) have been added, corresponding to the action names section.

Two features (`cpp_module_modmap_file_feature` and `cpp20_module_compile_flags_feature`) have been added, corresponding to the build variables section.

Using C++20 Modules necessitates topological ordering for the compilation units. For more details, see the [Discovering Dependencies](https://clang.llvm.org/docs/StandardCPlusPlusModules.html#discovering-dependencies) section.

Considering the various compilers, I have added the `deps-scanner` tool. The default implementation is a script wrapper that uses different scanning methods depending on the compiler.

The wrapper `deps_scanner_wrapper` is generated by a template file `<compiler>_deps_scanner_wrapper.sh.tpl`.

Three template files have been added:

- `clang_deps_scanner_wrapper.sh.tpl`
- `gcc_deps_scanner_wrapper.sh.tpl`
- `mvsc_deps_scanner_wrapper.bat.tpl`

For a demonstration of how to scan C++20 dependencies, please refer to this [demo](https://github.com/PikachuHyA/cpp20_modules_scan_dependency_demo).

Closes #22429.

PiperOrigin-RevId: 669241384
Change-Id: Id9ee2f66cb075446d0c38e6a6c70786ad9b28022
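The topological-ordering requirement mentioned in the Toolchains section can be illustrated with a short Python sketch. This is not how Bazel schedules actions; it only shows the constraint that a module's BMI has to exist before any importer is compiled. The function name `compile_order` and the module names are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter


def compile_order(requires_by_module):
    """Order modules so that each one comes after everything it imports.

    `requires_by_module` maps a module name to the set of modules it imports,
    e.g. as collected from the scanner's per-source results.
    """
    # TopologicalSorter expects node -> predecessors, which is exactly
    # "module -> modules that must be built first".
    return list(TopologicalSorter(requires_by_module).static_order())


# Hypothetical target: `app` imports `a`, and `a` imports `b`.
deps = {"app": {"a"}, "a": {"b"}, "b": set()}
order = compile_order(deps)
print(order)

# Circular imports are invalid; the sorter reports them as an error.
try:
    compile_order({"x": {"y"}, "y": {"x"}})
except CycleError as err:
    print("cycle detected:", err.args[1])
```

In a real build the ordering is expressed through action inputs and outputs, not an explicit sorted list, but the dependency edges are the same ones the scanning step discovers.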
Hi PikachuHyA, congrats on getting the first 3 PRs merged, and thanks to the reviewers :-) It appears that we need to wait for PR4 to be rebased and merged before we can experiment with this, is that correct?
Yes. Once #22553 is merged, Bazel will basically support C++20 Modules.
This PR implements support for C++20 Modules in Bazel.
the design doc: bazelbuild/proposals#354
the discussion: #19939
the demo: https://github.com/PikachuHyA/async_simple
the extra tests: https://github.com/PikachuHyA/bazel_cxx20_module_test
see #4005