Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new --orchestrator-id flag to rustc #635

Closed
1 of 3 tasks
LukeMathWalker opened this issue May 17, 2023 · 14 comments
Closed
1 of 3 tasks

Add a new --orchestrator-id flag to rustc #635

LukeMathWalker opened this issue May 17, 2023 · 14 comments
Labels
disposition-merge The FCP starter wants to merge this finished-final-comment-period The FCP has finished, action upon the disposition label needs to be taken major-change A proposal to make a major change to rustc major-change-accepted A major change proposal that was accepted proposed-final-comment-period An FCP has been started, cast your votes and raise concerns T-compiler Add this label so rfcbot knows to poll the compiler team

Comments

@LukeMathWalker
Copy link

LukeMathWalker commented May 17, 2023

Proposal (Updated on 2024-09-06)

Context

The JSON documentation for a crate often refers to items (e.g. functions, traits, types, etc.) defined in one or more its dependencies, either direct or transitive.
This can happen in a variety of scenarios:

  • A local function uses a foreign type as one of input parameters or as its return type
  • A foreign trait is implemented for a local type or used in a bound
  • Re-exports
  • Etc.

Problem

The JSON documentation for a crate doesn't provide many details when it comes to foreign items. If you want to dig deeper, you must generate the JSON documentation for the crate where they are defined and then combine the information from those two JSON documents to get a complete picture1.
In other words, you need to walk the dependency graph to perform analyses where foreign items are in scope.

To walk the graph, you need to go from "I have this foreign item" to "here's the JSON documentation for crate X, where that item was defined".
This is the difficult part: the JSON documentation for a crate contains very little information about third party crates. In particular, it only reports their names and the root URLs to their docs.
In the simplest scenario, that's not a problem: the crate name is enough to generate the JSON documentation if there is only one instance of that crate in the dependency graph.
It becomes an issue when multiple versions of the same crate appear together in the same JSON document. There will be multiple entries in external_crates with the same name, one for each version of that duplicated dependency.
There is no bullet-proof mechanism to disambuigate between those entries and map them back to specific package entries in the dependency tree of the "root" crate2. That's the problem this MCP tries to solve.

The proposed solution

For rustdoc

rustdoc will provide an additional field in ExternalCrate, named orchestrator_id.
orchestrator_id will be of type Option<String>.
orchestrator_id will be (optionally) provided by the tool orchestrating the overall build. This will be cargo in the most common case, but it may as well be an alternative orchestrator (bazel, buck2, etc.). If an orchestrator decides to provide an id for each build unit, it must guarantee that those ids uniquely identify that unit within the dependency graph.

For cargo

cargo will populate the orchestrator_id field.
The id will be a valid string for the --package flag in cargo (e.g. cargo rustdoc -p <build-id> should always work). This will allow tool builders to walk the graph without ambiguity when working with JSON documentation.

Implementation strategy

How would rustdoc receive this orchestrator_id, in practice?
rustdoc looks at the rmeta file to gather information about third party crates. We want to include orchestrator_id in that file to get it from cargo to rustdoc. This requires rustc to cooperate.

Adding a new (unstable) --orchestrator-id option to rustc seems to be the simplest way forward.
This could then be stabilised as is or, if preferred, moved to a different "channel" (e.g. as part of -Cmetadata, its own -C flag, etc.).

Mentors or Reviewers

A mentor would definitely be appreciated, I never touched any of the code involved here.

Process

The main points of the [Major Change Process][MCP] are as follows:

  • File an issue describing the proposal.
  • A compiler team member or contributor who is knowledgeable in the area can second by writing @rustbot second.
    • Finding a "second" suffices for internal changes. If however, you are proposing a new public-facing feature, such as a -C flag, then full team check-off is required.
    • Compiler team members can initiate a check-off via @rfcbot fcp merge on either the MCP or the PR.
  • Once an MCP is seconded, the Final Comment Period begins. If no objections are raised after 10 days, the MCP is considered approved.

Comments

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

Notes

This proposal supercedes #622, incorporating the feedback and ideas that were surfaced in the associated Zulip stream.

Previous proposal, before first round of feedback

The problem

rustdoc's JSON output for a crate must often refer to items (e.g. functions, traits, types, etc.) defined in one or more its dependencies.

Very little information is captured about those dependencies: their name and the root URL to their docs.

This causes issues when multiple versions of the same crate appear as direct dependencies (e.g. via renames) of the crate we are documenting—e.g. it becomes impossible to find out where items are coming from.

The proposed solution

Add a --build-id flag to rustc.
The value would be provided by cargo and then captured as part of the .rlib metadata, which would in turn make it available to rustdoc, as well as other parts of the compiler that might benefit from it (e.g. diagnostics, such as version mismatches which currently limit themselves to perhaps two different version of CRATE_NAME are being used?).

The provided build-id must:

  • Uniquely identify the crate it points to within the current workspace;
  • Be a valid string for the --package flag in cargo (e.g. cargo rustdoc -p <build-id> should always work). This allows toolmakers to fetch more information on-demand when working with rustdoc's JSON output;
  • Be as human-friendly as possible (e.g. use just the crate name if there is no ambiguity) in order to be used in diagnostics.

Footnotes

  1. This is an intentional design choice, that's been discussed at length in the rustdoc team and that's been endorsed by all JSON-based tool builders that interface with the team.

  2. See this Zulip thread for more details.

@LukeMathWalker LukeMathWalker added major-change A proposal to make a major change to rustc T-compiler Add this label so rfcbot knows to poll the compiler team labels May 17, 2023
@rustbot
Copy link
Collaborator

rustbot commented May 17, 2023

This issue is not meant to be used for technical discussion. There is a Zulip stream for that. Use this issue to leave procedural comments, such as volunteering to review, indicating that you second the proposal (or third, etc), or raising a concern that you would like to be addressed.

cc @rust-lang/compiler @rust-lang/compiler-contributors

@estebank
Copy link

Given that this is a user visible flag (otherwise I'd just second):

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Oct 24, 2023

Team member @estebank has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period An FCP has been started, cast your votes and raise concerns disposition-merge The FCP starter wants to merge this labels Oct 24, 2023
@petrochenkov
Copy link

@rfcbot concern other-existing-options

The zulip thread lists some alternatives like -Cmetadata, the proposal needs to describe why the existing options are not enough and why a new separate option is necessary for this.

One alternative, for example, is making -Cmetadata more extensible so it could include e.g. json with different pieces of auxiliary information like this.

@pnkfelix
Copy link
Member

pnkfelix commented Oct 26, 2023

(I'm checking my box under the assumption that the implementation of this MCP will explore leveraging an extension of --metadata rather than adding a whole new --build-id flag.)

@rfcbot reviewed

@wesleywiser
Copy link
Member

@rfcbot concern option-name

When reading the title, I assumed this was an flagthat would allow the user to choose what --build-id value is passed to the linker. I would prefer we choose a different name for this flag if we can't reuse or extend -Cmetadata as @petrochenkov suggests.

@compiler-errors
Copy link
Member

@apiraino:

I don't think it makes sense to block this FCP on finding a mentor, unless you think the change to the compiler is so large that its acceptance should be conditional on finding someone to implement it (which seems out of the ordinary) -- IMO, the FCP is just for approval of the technical change. Also, the syntax is @rfcbot to file a concern on the FCP.

@apiraino
Copy link
Contributor

@compiler-errors fair point, sorry for the noise. Just wanted to be sure that that point was't forgotten.

@LukeMathWalker
Copy link
Author

I've updated the proposal to reflect the feedback above (and in Zulip).
In particular:

  • I've cut down the scope to the bare minimum required to solve the rustdoc-related problem
  • Clarified what the proposal means for each "actor"
  • Deferred the final decision as to what will be the "blessed" channel to pass this information (a dedicated flag, -Cmetadata, etc.)

@apiraino apiraino added the to-announce Announce this issue on triage meeting label Sep 6, 2024
@petrochenkov
Copy link

I still don't understand what is the point of this, rustc is already able to identify crates uniquely.
Do you plan to somehow make these unique identifiers human-readable using higher-level tools?

I'll remove my concern to unblock the experimentation, but the scenario that I'd prefer to see in the future is

  • actually use this option from some build system to show the use case and motivation clearly
  • understand how the same use case can be covered by information already available to rustc, and remove the option

@rfcbot resolve other-existing-options

@wesleywiser
Copy link
Member

@rfcbot resolve option-name

@rfcbot
Copy link
Collaborator

rfcbot commented Sep 13, 2024

🔔 This is now entering its final comment period, as per the review above. 🔔

psst @estebank, I wasn't able to add the final-comment-period label, please do so.

@rfcbot
Copy link
Collaborator

rfcbot commented Sep 23, 2024

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

psst @estebank, I wasn't able to add the finished-final-comment-period label, please do so.

@pnkfelix pnkfelix added finished-final-comment-period The FCP has finished, action upon the disposition label needs to be taken final-comment-period The FCP has started, most (if not all) team members are in agreement labels Sep 26, 2024
@apiraino
Copy link
Contributor

apiraino commented Oct 7, 2024

@rustbot label -final-comment-period +major-change-accepted

@apiraino apiraino closed this as completed Oct 7, 2024
@rustbot rustbot added major-change-accepted A major change proposal that was accepted and removed final-comment-period The FCP has started, most (if not all) team members are in agreement labels Oct 7, 2024
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge The FCP starter wants to merge this finished-final-comment-period The FCP has finished, action upon the disposition label needs to be taken major-change A proposal to make a major change to rustc major-change-accepted A major change proposal that was accepted proposed-final-comment-period An FCP has been started, cast your votes and raise concerns T-compiler Add this label so rfcbot knows to poll the compiler team
Projects
None yet
Development

No branches or pull requests

9 participants