Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --cfg and --rustc-cfg flags to output compiler configuration #9002

Merged
merged 51 commits into from
Feb 23, 2021

Conversation

volks73
Copy link
Contributor

@volks73 volks73 commented Dec 19, 2020

This PR is my attempt to address #8923.

I have added the --cfg flag to the cargo rustc subcommand and the --rustc-cfg flag to the cargo build, cargo check, cargo test, and cargo bench subcommands, respectively. Both versions of the flag, --cfg and --rustc-cfg, do the same thing, but I thought it looked weird for the cargo rustc subcommand to be cargo rustc --rustc-cfg. The following example invocations are possible, once stabilized:

  • cargo rustc --cfg
  • cargo build --rustc-cfg
  • cargo check --rustc-cfg
  • cargo test --rustc-cfg
  • cargo bench --rustc-cfg

In each case, compilation is aborted and only compiler configuration is emitted. All of the context creation and configuration is still executed, but execution of the compilation job is aborted. Similar to the --unit-graph implementation, the --cfg/--rustc-cfg flag is hidden, marked as a "unstable", and requires the -Z unstable-options flag at the moment. A complete example invocation with this PR would be:

$ cargo +nightly rustc -Z unstable-options --cfg

I am open to alternatives for the flag name. I have thought of --compiler-cfg, --compile-cfg, and --target-cfg, but I went with --rustc-cfg because it is the Rust compiler (rustc) configuration (cfg). The --target-cfg could be confusing because there are Cargo targets and Compiler targets. A lone --cfg for the build, check, test, and bench subcommands would also be ambiguous and configuration that is being displayed.

Originally, I was only going to add the --cfg flag to the cargo rustc subcommand, but build, check, test, and bench all have the --unit-graph flag, and I used that flag's implementation and existence as a template for this implementation. I am not opposed to having just the --cfg flag for the cargo rustc subcommand as originally proposed in #8923.

I discovered during my initial investigation to implement the feature and familiarization with the Cargo codebase, that as part of the build context creation and compilation process, Cargo internally calls the rustc --print cfg command for all targets and stores the output in the RustcTargetData type. It does this for the host and any explicitly defined targets for cross-compilation. Compilation features, such as +crt-static, added to the compilation process through Cargo environment variables or TOML configuration directives are added to the internal rustc --print cfg command call. So, the --cfg/--rustc-cfg just emits the contents of the RustcTagetData type as JSON. There is no need to call the rustc --print cfg command again. The implementation then becomes nearly identical to the --unit-graph implementation.

The output is only in JSON, similar to the --unit-graph feature, instead of the key-value and name style of the rustc --print cfg output because it easily resolves issues related to multi-targets, explicit target (--target <TRIPLE>) versus host configurations, and cross compilation. Subcommands and other projects can parse the JSON to recreate the key-value and name style if desired based on a specific target or the host. Here is an example of the JSON output (formatted for humans, the actual output is condensed), which is also available as a comment in the cargo::core::compiler::rustc_cfg module:

{
     "version": 1,
     "host": {
         "names": ["windows", "debug_assertions"],
         "arch":"x86_64",
         "endian":"little",
         "env":"msvc",
         "family":"windows",
         "features":["fxsr","sse","sse2"],
         "os":"windows",
         "pointer_width":"64",
         "vendor":"pc"
     },
     "targets": {
         "x86_64-unknown-linux-gnu": {
             "names": ["debug_assertions", "unix"],
             "arch":"x86_64",
             "endian":"little",
             "env":"gnu",
             "family":"unix",
             "features": ["fxsr","sse","sse2"],
             "os":"linux",
             "pointer_width":"64",
             "vendor":"unknown"
         },
         "i686-pc-windows-msvc": {
             "names": ["windows", "debug_assertions"],
             "arch":"x86",
             "endian":"little",
             "env":"msvc",
             "family":"windows",
             "features":["fxsr","sse","sse2"],
             "os":"windows",
             "pointer_width":"32",
             "vendor":"pc"
         }
     }
 }

I decided to remove the "target_" prefix from the relevant configurations to reduce "noise" in the output. Again, I am open to alternatives or suggestions on the JSON format.

The `--cfg` flag is added to the `cargo rustc` subcommand. The
implementation generally follows the `--unit-graph` implementation in
that it aborts compilation after the build context is created.

I discovered that cargo runs the `rustc --print cfg` every time it
builds/compiles a package. It stores this information for all compiler
targets and the host in the `RustcTargetData` struct, which is further
populated when the build context is created. When the `rustc --print
cfg` command is ran internally, all of the Cargo configurations and
environment variables are applied to the command. This means that the
command does not need to be re-run for the `--cfg` flag. Instead, I just
needed to print what was already populated into the `RustcTargetData`
struct for the build context and abort before executing the
build/compile job.

The existence of the `rustc --print cfg` command being executed
internally to Cargo also meant that multi-target and cross-compilation
are naturally supported. However, the output kind of has to be JSON
because it is not possible to select a single compiler target to print.
It gets messy very quickly if multiple targets are specified and which
one to use for the "human" output similar to the `rustc --print cfg`
command. The solution is to output in JSON like the `--unit-graph` flag
and include the data for the host and any specified targets. A
downstream parser can then pull out/extract the target data that is
needed.

The `--cfg` flag needs to be added to the `cargo build` subcommand, too.
The `--unit-graph` flag is available in both subcommands and so should
the `--cfg` flag for consistency. Ultimately, the `cargo build`
subcommand "calls" the `cargo rustc` subcommand. In other words, both
subcommands use the same implementation.

The flag does not appear in the help text, but the `--unit-graph` does
not either. I need to work on this as well.
The `--cfg` option is changed to `--rustc-cfg` for the build, bench,
test, and check subcommands but left as `--cfg` for the rustc
subcommand. The `--rustc-cfg` flag is more descriptive of
configuration (cfg) that will be printed as JSON, but it looks weird if
`--rustc-cfg` is used with the rustc subcommand, i.e.:

```pwsh
PS C:\>cargo rustc --rustc-cfg
```

versus

```
PS C:\>cargo rustc --cfg
```

By using the rustc subcommad, the type of output and configuration is
known, but wiht the other subcommands it is ambiguous what a lone
`--cfg` would output.
The name of the flag changed for some subcommands from `--cfg` to
`--rustc-cfg` but the parsing of the arguments was not updated to
account for the different names.
Following the template of the `--unit-graph` option, the `--cfg` and
`--rustc-cfg` flags are placed behind the `-Z unstable-options` flag.
Now, if the `--cfg` or `--rustc-cfg` flags are used without the `-Z
unstable-option` flag, an error message will appear. Once stablized, I
believe this requirement can be removed.
The `BuildConfig.cfg` is changed to `BuildConfig.rustc_cfg`. I am afraid
of a name collision or confusion with just a `cfg` field. The
`rustc_cfg` field indicates this is for printing the compiler (rustc)
configuration from the `rustc --print cfg` like process, instead of some
higher level or more generic "configuration" type.
Here again, the `cfg` name is too generic and could be confusing. It is
changed to `rustc_cfg` in similar convention to the `unit_graph` module
and to relate it to the `--rustc-cfg` flag.
The original format was quick-n-dirty and more like a "raw dump" of the
compiler configurations. The quick-n-dirty format would probably work
just find for 99% of the use cases but it was kind of noisy and verbose.
The target features were not great. The new format is less noisy (no
"target_" everywhere) and provides an array of the target features. It
is just more clean and elegant.

Note, there is no implementation or CLI argument to select a JSON format
version. I am not sure such an implementation is even needed.
Similar to the `--unit-graph` flag, which is also hidden, the `--cfg`
and `--rustc-cfg` are hidden. I think they will be made visible once the
feature is stablized.
@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Eh2406 (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 19, 2020
@ehuss
Copy link
Contributor

ehuss commented Dec 24, 2020

Thanks for the PR! Just a heads up, with the holidays, it may take a while to get a proper look at this.

Just my initial reaction, I'm a little uneasy about adding so many flags. AFAIK, the cfg values only depend on the --target flag, is that correct? The reason --unit-graph was done that way is because it depends on the command and pretty much all the flags, so it couldn't really work as a separate command.

I would also avoid hard-coding keys and stripping prefixes, and just dump the contents of the Cfg map. Is there a particular reason to not display the whole thing?

@volks73
Copy link
Contributor Author

volks73 commented Dec 24, 2020

Thanks for the PR! Just a heads up, with the holidays, it may take a while to get a proper look at this.

Your welcome, and I understand. Happy holidays!

Just my initial reaction, I'm a little uneasy about adding so many flags. AFAIK, the cfg values only depend on the --target flag, is >that correct? The reason --unit-graph was done that way is because it depends on the command and pretty much all the flags, so >it couldn't really work as a separate command.

Yes, I believe the cfg values are only dependent on the --target option(s) and the reasoning for the --unit-graph flag makes sense. The --cfg/--rustc-cfg flag should probably only be added for the cargo rustc subcommand. Obtaining the compiler configuration for a specific Cargo target, like for tests, could be obtained with cargo rustc --test --cfg instead of cargo test --rustc-cfg. My original thought was to only add the flag to the cargo rustc subcommand anyways. I will hold off on making this change until further confirmation is received.

I would also avoid hard-coding keys and stripping prefixes, and just dump the contents of the Cfg map. Is there a particular reason to not display the whole thing?

Up until commit 234089, the initial implementation was just a dump of the contents:

{
    "version": 1,
    "host": {
        "names": ["windows", "debug_assertions"],
        "key_pairs": [
            {"target_arch": "x86_64"},
            {"target_endian": "little"},
            {"target_env": "msvc"},
            {"target_family": "windows"},
            {"target_feature": "fxsr"},
            {"target_feature": "sse"},
            {"target_feature": "sse2"},
            {"target_os": "windows"},
            {"target_pointer_width": "64"},
            {"target_vendor": "pc"}
        ]
    },
    "targets": [
        "x86_64-unknown-linux-gnu": {
            "names": ["debug_assertions", "unix"],
            "key_pairs": [
                {"target_arch": "x86_64"},
                {"target_endian": "little"},
                {"target_env": "gnu"},
                {"target_family": "unix"},
                {"target_feature": "fxsr"},
                {"target_feature": "sse"},
                {"target_feature": "sse2"},
                {"target_os": "linux"},
                {"target_pointer_width": "64"},
                {"target_vendor": "unknown"}
            ]
        },
         "i686-pc-windows-msvc": {
            "names": ["windows", "debug_assertions"],
            "key_pairs": [
                {"target_arch":"x86"},
                {"target_endian":"little"},
                {"target_env":"msvc"},
                {"target_family":"windows"},
                {"target_feature":"fxsr"},
                {"target_feature":"sse"},
                {"target_feature":"sse2"},
                {"target_os":"windows"},
                {"target_pointer_width":"32"},
                {"target_vendor":"pc"}
         }
    ]
}

And looking at the output, I see a lot of "target_" being repeated. I use repeated prefixes as a "code smell" or indicator that something should be refactored/restructured. From the consumer side of the JSON output, like in a crate or third party subcommand, I would not have Cfg::target_os, Cfg::target_family, Cfg::target_env, etc. methods/fields, but Cfg::os, Cfg::family, Cfg::env, etc. or even Cfg::Target::os, Cfg::Target::family, Cfg::Target::env, etc. So, I thought the JSON should mirror this.

I recognize hard-coding and stripping prefixes is maybe not the most flexible or future-proof implementation. All of the output I tested with the rustc --print cfg --target <TRIPLE> command had the hard-coded fields, i.e. "target_env", "target_os", "target_endian", etc., albeit an empty string for targets without the value. This seemed to indicate that the "target_*" configurations were not going away and would always be present, but it is possible new configurations (prefixed with target_ or not) might be added or used and this is the concern? I thought the Version field would help in these situations.

I don't really have a strong reason for the proposed JSON format other than "look and feel", smaller number of bytes, and empirical evidence that certain fields are always available, which is not great because it is really for machine consumption, not human consumption, bandwidth/performance is not a concern, and I am not familiar enough with future compiler changes. Honestly, I struggled with the format and have mentally run around in circles on it. So, I decided I reached my limits and it was time to seek advice and review. I would greatly appreciate some direction, suggestions, and/or alternative organization of the output (when time and capacity permit).

An example of me mentally running around in a circle, a hybrid approach that condenses "target_*" configurations to "target" array field but maintains some extensibility because everything is an array where new configurations can just be added as new maps, but indicates that both the "names" and "target" configurations/fields are ultimately optional and may not exist:

{
    "version": 1,
    "host": [
        {"names": ["windows", "debug_assertions"]},
        {"target": [
            {"arch": "x86_64"},
            {"endian": "little"},
            {"env": "msvc"},
            {"family": "windows"},
            {"features": ["fxsr", "sse", "sse2"]},
            {"os": "windows"},
            {"pointer_width": "64"},
            {"vendor": "pc"}
        ]}
    ],
    "targets": [
        "i686-pc-windows-msvc": [
            {"names": ["windows", "debug_assertions"]},
            {"target": [
                {"arch":"x86"},
                {"endian":"little"},
                {"env":"msvc"},
                {"family":"windows"},
                {"features":["fxsr","sse","sse2"]},
                {"os":"windows"},
                {"pointer_width":"32"},
                {"vendor":"pc"}
            ]}
        ]
    ]
}

@alexcrichton
Copy link
Member

I would agree with @ehuss that we probably don't want to munge the values and instead just report exactly what rustc is telling us. I think that a flag also may not be necessary for this, we can probably just add this json blob to the ones already emitted?

@volks73
Copy link
Contributor Author

volks73 commented Jan 5, 2021

I think that a flag also may not be necessary for this, we can probably just add this json blob to the ones already emitted?

I am not aware of other flags that output JSON and skip compiling/building. Are you referring to the --unit-graph flag? If yes, then I am good with making this implementation, too. Would it just be:

{
    "rustc": ["windows", "debug_assertions", "target_arch='x86'", "target_endian='little'", ...]
}

somewhere within the --unit-graph JSON?

@alexcrichton
Copy link
Member

Oh I thought that this was just a normal json message coming from Cargo during build? We can add it to --unit-graph as well but I think it's fine for us to just add other blobs coming out during a build, they should have an identifier on them and implementations should be skipping identifiers they don't recognize.

@volks73
Copy link
Contributor Author

volks73 commented Jan 7, 2021

Oh I thought that this was just a normal json message coming from Cargo during build?

The goal was to get the compiler configurations with Cargo modifications, such as those defined in environment variables or Cargo configuration files, without building/compiling the project, similar to the rustc --print cfg command. I have not looked into the JSON messages that are emitted during the Cargo build process to see if the compiler configurations are part of the output, but I could see where this information would also be useful in that context as well.

I looked at the --unit-graph output. There is the platform field, which is null if compilation is for the host and a target triple if it is for cross-compilation. A cfg, configurations, rustc, compiler_cfg, compiler_configurations, rustc_configurations, or other similarly named field could be added after the platform field:

...
"platform": null,
"compiler_configurations":["debug_assertions","windows","target_arch='x86'","target_endian='little'", ...],
"mode": "build",
...

I would still like to consider a separate flag for the cargo rustc --cfg command and not include the configuration in the --unit-graph output for two reasons:

  1. The --unit-graph feature appears to be further along and closer to stabilization. I do not want this feature to slow down the stabilization of the --unit-graph feature as the --unit-graph feature appears to be more broadly desired. The compiler configuration field could be added after stabilization as a Version 2 of the --unit-graph output since I could see the configuration information for each unit being useful as well.
  2. A cargo rustc --cfg command would be a nice analog to the rustc --print cfg command, which prints the configuration but skips compilation. The compiler configuration would be buried in the --unit-graph output. The cargo rustc --cfg command is more "semantically meaningful" to me, i.e. "show me the compiler configuration for the cargo compilation process".

@alexcrichton
Copy link
Member

Ah ok I understand now, thanks for helping me to clarify. I missed that this was a pre-compilation step rather than something learned during compilation.

To add to what @ehuss was mentioning earlier, this feels like the current implementation is a bit heavyweight relative to what it's doing. For example this is downloading the entire crate graph, which it probably doesn't need if it's just scraping config values. Unfortunately we don't really have a great analog for something similar to this and where it would go today, so I don't know where it would best go.

Only the `cargo rustc --cfg` argument remains. Having the `--rustc-cfg`
flag for all other subcommands as too much and deemed not really
necessary or useful.
The `--cfg` flag is changed to `--print-cfg`, so it is a little more
obvious what the flag will do. At the same time, the implementation is
changed to only affect the `cargo rustc` subcommand. It skips
downloading the entire crate graph and compiling/building the project.
Instead, it just obtains the rustc target data similar to the
`ops::compile` function and then returns. The `ops::print_cfg` function
is added, but this could also be moved to just the `cargo rustc`
subcommand.
@volks73
Copy link
Contributor Author

volks73 commented Feb 20, 2021

I have changed to using the exec function instead of the exec_with_output. Something must have been up with my system because when I originally tried to use the exec function, I was getting no output and the command appeared to hang and not work. Now it works!

Anyways, I will work on adding a test next.

Not sure how to handle running the test on different hosts, so the first
test is explicit about the compile target to print the configuration.
Using the multitarget feature to print the configuration of multiple
compiler target configurations, a test is created.
The `RUSTFLAGS` environment variable is used to add the "crt-static"
target feature and test printing the target compiler configuration
contains the `target_feature="crt-static"` line.
A `.cargo/config.toml` file is used to add the "crt-static" target
feature and test printing the compiler target configuration contains the
`target_feature="crt-static"` line.
The `panic="unwind"` appears in the output for the CI tests, but not in
my local tests. I need to investigate the origin of this configuration
but it causes the CI builds to fail.
The `with_stdout_contains` was mis-used. Since some lines may or may not
appear for some compiler targets and environments (nightly, beta,
stable, etc.) the tests would fail because the output was not identical.
Instead of a using raw strings, each line with the arch, endian, env,
family, vendor, pointer_width, etc. that are known to always be
present (at least of the CI builds) are included. This should work for
the CI environments and my local environment.
@volks73
Copy link
Contributor Author

volks73 commented Feb 23, 2021

I have added a couple of tests. These tests include using the --print cfg command with the -Z multitarget option, too.

@alexcrichton
Copy link
Member

Looks good to me, thanks!

@bors: r+

@bors
Copy link
Collaborator

bors commented Feb 23, 2021

📌 Commit 2a5355f has been approved by alexcrichton

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 23, 2021
@bors
Copy link
Collaborator

bors commented Feb 23, 2021

⌛ Testing commit 2a5355f with merge 8d8459a7f81bbb335b38665f980e03ce5c5b9ce9...

@bors
Copy link
Collaborator

bors commented Feb 23, 2021

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 23, 2021
@ehuss
Copy link
Contributor

ehuss commented Feb 23, 2021

@bors retry
macos: The runner has received a shutdown signal.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 23, 2021
@bors
Copy link
Collaborator

bors commented Feb 23, 2021

⌛ Testing commit 2a5355f with merge 4ae4aad...

@bors
Copy link
Collaborator

bors commented Feb 23, 2021

☀️ Test successful - checks-actions
Approved by: alexcrichton
Pushing 4ae4aad to master...

@bors bors merged commit 4ae4aad into rust-lang:master Feb 23, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request Feb 27, 2021
Update cargo

11 commits in bf5a5d5e5d3ae842a63bfce6d070dfd438cf6070..572e201536dc2e4920346e28037b63c0f4d88b3c
2021-02-18 15:49:14 +0000 to 2021-02-24 16:51:20 +0000
- Pass the error message format to rustdoc (rust-lang/cargo#9128)
- Fix test target_in_environment_contains_lower_case (rust-lang/cargo#9203)
- Fix hang on broken stderr. (rust-lang/cargo#9201)
- Make it more clear which module is being tested when running cargo test (rust-lang/cargo#9195)
- Updates to edition handling. (rust-lang/cargo#9184)
- Add --cfg and --rustc-cfg flags to output compiler configuration (rust-lang/cargo#9002)
- Run rustdoc doctests relative to the workspace (rust-lang/cargo#9105)
- Add support for [env] section in .cargo/config.toml (rust-lang/cargo#9175)
- Add schema field and `features2` to the index. (rust-lang/cargo#9161)
- Document the default location where cargo install emitting build artifacts (rust-lang/cargo#9189)
- Do not exit prematurely if anything failed installing. (rust-lang/cargo#9185)
@ehuss ehuss added this to the 1.52.0 milestone Feb 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants