RFC: Set CARGO_CHECK environment variable when type checking #3748

Open
wants to merge 6 commits into
base: master
107 changes: 107 additions & 0 deletions text/3748-cargo_check_environment_variable.md
Author

@akiselev akiselev Dec 21, 2024

Thank you for the links! I searched Rust issues and RFCs but forgot to search the cargo repo ☹️

It doesn't look like the discussion has moved much in years, is there a reason against forcing the issue via RFC at this point?

The Rust landscape has changed significantly since the last major activity in the discussion, and rust-analyzer seems to be the de facto LSP implementation at this point. What are the downsides of introducing an LSP-specific check command whose only difference (for now) is setting that environment variable? Or, alternatively, allowing callers to set a `CARGO_NO_BUILD` environment variable themselves, one that is officially sanctioned and supported by Rust/Cargo (if not set automatically)?

I understand and appreciate the hesitance to stabilize contracts without ironing out all of the generalities, but the discussion about cargo modes and the `cargo check && cargo build` case feels very ivory tower. Rust is increasingly being used to integrate with C++ code beyond *-sys crates as cxx and other tools mature, and I feel like a way to notify build scripts not to run extraneous steps is very much needed regardless of the aforementioned issues. Personally, I always configure rust-analyzer to use a subdirectory so that it doesn't block `cargo build` anyway, so build caching between `cargo check` and `cargo build` wouldn't apply (and I'm curious what fraction of the community does too).

That said, I'm biased as I feel acute pain with cxx/cxx-qt, where sccache doesn't seem to help. Worst case scenario I can set the environment variables myself and use a custom fork.

Member

> It doesn't look like the discussion has moved much in years, is there a reason against forcing the issue via RFC at this point?

No. While an RFC doesn't really force anything until accepted, it is good to open a discussion. Sometimes people hang out in https://internals.rust-lang.org/ first for a pre-RFC, before preparing a more formal proposal here.

@@ -0,0 +1,107 @@
- Feature Name: `cargo_check_environment_variable`
- Start Date: 2024-12-20
- RFC PR: [rust-lang/rfcs#3748](https://github.com/rust-lang/rfcs/pull/3748)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Add a new environment variable `CARGO_CHECK` that is set to `1` when running `cargo check` or similar type-checking operations, so that build scripts can skip expensive compilation steps that are unnecessary for Rust type checking, such as compiling external C++ code in cxx-based projects.

# Motivation
[motivation]: #motivation

Rust development heavily relies on IDE tooling like rust-analyzer, which frequently invokes `cargo check` to provide real-time type information and diagnostics. Many projects use build scripts (`build.rs`) to generate Rust code and compile external dependencies. For example:

- cxx-rs generates Rust bindings for C++ code and compiles C++ source files
- cxx-qt generates Rust bindings for Qt code and runs the Qt Meta-Object Compiler (MOC)
- Projects using Protocol Buffers generate Rust code from .proto files
- bindgen generates Rust bindings from C/C++ headers

Currently, every time rust-analyzer runs `cargo check`, the build script in the changed crate must execute its full build process, including steps like compiling C++ code that are only needed for linking but not for type checking. Normally the build script would only be run when a file registered via `cargo::rerun-if-changed` is changed, which generally doesn't include the Rust source code. However, when using `cxx` to create bridges between C++ and Rust, the build script must be run for every change in the Rust bridges. `cxx` bindings usually change rarely, but in projects like `cxx-qt`, which interface between Rust and Qt types, they change significantly more often. This impacts IDE responsiveness, especially in projects with complex build scripts.

This is particularly important for projects using cxx-qt and similar frameworks where the build scripts perform extensive code generation and compilation.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

When writing a build script (`build.rs`), you can now check the `CARGO_CHECK` environment variable to determine if the build is being performed for type checking purposes:

```rust
fn main() {
    generate_rust_bindings();

    // Only compile external code when not type checking
    if std::env::var("CARGO_CHECK").is_ok() {
        compile_cpp_code();
    }
}
```

Contributor

Should this be `is_err`/`!is_ok`?
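The review question on this snippet suggests the condition is inverted: as written, the C++ code would be compiled only when `CARGO_CHECK` is set. A minimal corrected sketch follows, assuming the proposed `CARGO_CHECK` variable (which Cargo does not set today). `needs_native_build` is a hypothetical helper, and `generate_rust_bindings` and `compile_cpp_code` are stand-ins for the placeholders used in the RFC text.

```rust
use std::env;

/// Hypothetical helper: decides whether the expensive native build should run.
/// The RFC proposes the value `1` for check-only invocations; any other state
/// (unset, or a different value) is treated here as a full build.
fn needs_native_build(cargo_check: Option<&str>) -> bool {
    cargo_check != Some("1")
}

// Stand-ins for the placeholder functions named in the RFC example.
fn generate_rust_bindings() {
    println!("cargo::warning=generating Rust bindings");
}

fn compile_cpp_code() {
    println!("cargo::warning=compiling C++ sources");
}

fn main() {
    // Existing Cargo directive (the `cargo::` form needs Cargo 1.77 or newer).
    // Assumption: Cargo would track the proposed variable like any other
    // environment variable, so switching from `cargo check` to `cargo build`
    // re-runs this script. The RFC does not specify this.
    println!("cargo::rerun-if-env-changed=CARGO_CHECK");

    // Bindings are always needed, since type checking depends on them.
    generate_rust_bindings();

    // Only compile external code when NOT type checking.
    let cargo_check = env::var("CARGO_CHECK").ok();
    if needs_native_build(cargo_check.as_deref()) {
        compile_cpp_code();
    }
}
```

Comparing against the literal value `1` rather than only testing for presence is a design choice of this sketch; it keeps an accidental `CARGO_CHECK=0` from skipping the native build.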

This allows build scripts to optimize their behavior based on the build context. When rust-analyzer or a developer runs `cargo check`, the build script can skip time-consuming steps that aren't necessary for type checking.

This feature primarily benefits library authors who maintain build scripts, especially those working with external code generation and compilation. Regular Rust developers using these libraries will automatically benefit from improved IDE performance without needing to modify their code.

# Reference-level explanation
Contributor

This should address things like caching and output generation. Quoting @alexcrichton in rust-lang/cargo#4001 (comment):

> I'm not 100% certain this still fits in Cargo myself. The major downside of implementing a feature like this is that `cargo check && cargo build` gets slower than it currently is. Cargo currently caches build script invocations for those two commands, which means that all the work done by `cargo check` is reused by `cargo build` and isn't redone.
>
> [...]
>
> As an author of crates like cc and cmake it's also somewhat ambiguous to me about what "check mode" would do for libraries like that. Should they do nothing? Type-check the C code? Make sure it all compiles? Basically I don't think that C/C++ have any real meaningful distinction like Rust does for `cargo check` and `cargo build`, so there's not really an obvious choice of what these library crates would do, which would require even more opt-in or configuration on behalf of all users.

High level proposal: for a build script, CHECK_ONLY should mean:

  1. Do any relevant checks that don't include codegen, if possible
  2. Do what is necessary to check the Rust code
  3. This can always be disregarded in favor of running the full build script

So for C code, CHECK_ONLY would mean doing the C/C++ configure (but not build) step and possibly invoking bindgen. Usually this is `./configure` or `cmake` without `--build` to verify dependencies and environment, and prepare final headers and source files. bindgen can then be run on these final headers. !CHECK_ONLY then means that `make`, `cmake --build`, or `cc` should be run.

OUT_DIR should be the same between check and build so configuration output can be reused. The build script will probably always run the configure step; cmake knows how to skip a lot if files are already up to date (see footnote 1).

For crates, cc should totally ignore this environment variable and not change anything - the build script author will just have the option to skip invoking cc if CHECK_ONLY is set. cmake's Config should add new methods like only_configure (always called) and only_build (called if CHECK_ONLY is unset), since Config::build currently does both these things. cmake::build and Config::build could optionally listen to CHECK_ONLY to determine whether to run the only_build step.

Footnotes

  1. Cmake's configure caching can sometimes be pretty slow for bigger projects. It would be possible for build scripts to thumbprint relevant environment and skip invoking cmake if OUT_DIR has an updated thumbprint, but this would be advanced usage.
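
The configure/build split proposed in this comment could be sketched roughly as follows. `CHECK_ONLY` is the commenter's placeholder name, and `Step` and `plan_steps` are hypothetical names used only for illustration; nothing here is an existing `cc` or `cmake` crate API, and the sketch only plans the phases instead of invoking any tool.

```rust
use std::env;

/// Hypothetical phases of a native build driven from build.rs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Step {
    /// e.g. `./configure` or `cmake` without `--build`
    Configure,
    /// generate Rust bindings from the final headers
    Bindgen,
    /// e.g. `make`, `cmake --build`, or `cc`
    Build,
}

/// Configure and bindgen always run, because the Rust code cannot be
/// type-checked without them; the codegen-heavy build step is skipped
/// when only checking.
fn plan_steps(check_only: bool) -> Vec<Step> {
    let mut steps = vec![Step::Configure, Step::Bindgen];
    if !check_only {
        steps.push(Step::Build);
    }
    steps
}

fn main() {
    // Assumption: the variable is simply present when only checking.
    let check_only = env::var_os("CHECK_ONLY").is_some();
    for step in plan_steps(check_only) {
        println!("would run: {:?}", step);
    }
}
```

Because the check-only plan is a prefix of the full plan, a later full build in the same OUT_DIR could in principle reuse the configure output, which is the reuse the comment asks for.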

[reference-level-explanation]: #reference-level-explanation

Cargo will set the `CARGO_CHECK` environment variable to `1` when running `cargo check`.

The environment variable will not be set for commands that require full compilation:
- `cargo build`
- `cargo run`
- `cargo test`

# Drawbacks
Contributor

Related to #3748 (comment): as proposed, this is only beneficial in cases where whatever the build script does is more significant than compiling the build script + deps itself. IME this is indeed the common case, since often a crate like cmake or bindgen is needed to generate the headers; actually compiling only adds `cmake --build`.

[drawbacks]: #drawbacks

1. **Potential for Inconsistencies**: Build scripts might behave differently during type checking vs. full compilation, which could theoretically lead to different type checking results compared to the final build.

2. **Increased Complexity**: Build script authors need to consider an additional factor when determining their behavior, which adds some complexity to the build system. On the other hand, they can ignore the feature entirely and just run all build steps regardless.

3. **Maintenance Burden**: The Rust and Cargo teams will need to maintain this feature and ensure it remains consistent across different commands and contexts.

# Rationale and alternatives
Contributor

rust-lang/cargo#10126 proposed CARGO_MODE that adds doc, test, doctest, and a handful of others. Personally I would rather have a single environment variable that says whether or not codegen should happen (as is proposed here) rather than needing to know which modes do and don't require codegen, which isn't robust against adding new modes. Somehow indicating bench/test mode could be useful but this seems better as an independent feature.

[rationale-and-alternatives]: #rationale-and-alternatives

Alternative designs considered:

1. Define a standard environment variable that isn't set by `cargo check` but is officially encouraged by Rust for RLS and other IDE tooling. This would avoid any unexpected behavior from build scripts with other `cargo check` consumers but still provide a standard way for build scripts to skip unnecessary steps.

2. Do Nothing: If we do nothing, build scripts will continue to run all build steps even when it's not necessary, significantly impacting Rust ergonomics when interfacing with external languages.
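
To illustrate the first alternative, here is a rough sketch of how IDE tooling might set such a variable itself when spawning Cargo. The variable name `CARGO_CHECK` is reused purely for illustration (the alternative does not name one), and `ide_check_command` is a hypothetical helper, not part of rust-analyzer or Cargo. The sketch only constructs the command and prints its environment overrides; it does not run Cargo.

```rust
use std::process::Command;

/// Hypothetical helper: builds the `cargo check` invocation an IDE might use,
/// opting in to check-only build scripts by setting the variable explicitly.
fn ide_check_command() -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("check").env("CARGO_CHECK", "1");
    cmd
}

fn main() {
    let cmd = ide_check_command();
    // `get_envs` lists only the variables explicitly set on the command.
    for (key, value) in cmd.get_envs() {
        println!("{:?} = {:?}", key, value);
    }
}
```

Under this alternative, a plain `cargo check` from a terminal would leave build scripts unchanged, and only tools that opt in would see the reduced behavior.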

# Prior art
Member

Mind providing a link for each prior art item so we can find the references more easily?

[prior-art]: #prior-art

1. **Go Build Tags**: Go allows conditional compilation using build tags, which can be used to skip certain build steps based on the build context.

2. **Bazel's Configuration Transitions**: Bazel provides mechanisms to modify build behavior based on the target being built.

3. **Cargo Features**: The existing feature flag system in Cargo demonstrates the value of conditional build behavior.

4. **Other Cargo Environment Variables**: Cargo already sets several environment variables during builds:
   - `CARGO_CFG_TARGET_OS`
   - `CARGO_MANIFEST_DIR`
   - `OUT_DIR`

This proposal follows the established pattern of using environment variables to communicate build context to scripts.
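
As a rough sketch of that pattern, a build script could read the proposed variable alongside the ones Cargo already provides. `CARGO_CFG_TARGET_OS` and `OUT_DIR` are existing Cargo variables; `CARGO_CHECK` is the proposed one and is not set by Cargo today. `BuildContext` and `read_context` are hypothetical names for illustration.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical summary of the context a build script runs in.
#[derive(Debug, PartialEq)]
struct BuildContext {
    target_os: Option<String>,
    out_dir: Option<PathBuf>,
    check_only: bool,
}

/// Reads the context through a lookup function so the logic can be
/// exercised without touching the real process environment.
fn read_context(lookup: impl Fn(&str) -> Option<String>) -> BuildContext {
    BuildContext {
        target_os: lookup("CARGO_CFG_TARGET_OS"),
        out_dir: lookup("OUT_DIR").map(PathBuf::from),
        // Proposed variable: treated as check-only when it equals "1".
        check_only: lookup("CARGO_CHECK").as_deref() == Some("1"),
    }
}

fn main() {
    // In a real build script the values come from the environment Cargo sets up.
    let ctx = read_context(|name| env::var(name).ok());
    println!("{:?}", ctx);
}
```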

# Unresolved questions
[unresolved-questions]: #unresolved-questions

1. Should the environment variable be set for other commands that don't require full compilation?
   - `cargo doc`
   - `cargo clippy`

2. How should this interact with parallel builds where some targets need full compilation and others only need type checking? (Is this even a thing?)

3. Should we provide additional variables to distinguish between different types of type-checking operations (IDE, clippy, etc.)?

4. How do we ensure build scripts don't diverge too much between type checking and full compilation modes?

# Future possibilities
[future-possibilities]: #future-possibilities

1. **Extended Build Contexts**: Introduce additional environment variables for other build contexts:
   - `CARGO_DOC` for documentation generation
   - `CARGO_IDE` specifically for IDE tooling