Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an unstable --output-format option to cargo rustdoc #12103

Closed
LukeMathWalker opened this issue May 7, 2023 · 20 comments · Fixed by #12252
Closed

Add an unstable --output-format option to cargo rustdoc #12103

LukeMathWalker opened this issue May 7, 2023 · 20 comments · Fixed by #12252
Assignees
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` disposition-merge FCP with intent to merge finished-final-comment-period FCP complete S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review T-cargo Team: Cargo

Comments

@LukeMathWalker
Copy link

LukeMathWalker commented May 7, 2023

Problem

rustdoc (on nightly) exposes an --output-format option to choose the format of the generated documentation: either HTML (the default) or JSON (the new unstable backend).

If you choose the JSON format, the output file of the unit is no longer /target/doc/<crate_name>/index.html; it becomes /target/doc/<crate name>.json.
Since there is no way to specify the output format as an option of cargo rustdoc itself, cargo assumes that the only possible output path for a doc unit is the HTML one. As a consequence, the JSON output is never considered "fresh" and we miss out on caching (see the test case in #12100, which fails on master).

Proposed Solution

Since cargo must remain unaware of the semantics of rustdoc's extra arguments, I suggest we add --output-format as an unstable cargo rustdoc argument, mimicking the semantics of the underlying --output-format option for rustdoc.

This would allow cargo to reason about it and take it into account in the fingerprinting logic.

Notes

No response

@LukeMathWalker LukeMathWalker added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage. labels May 7, 2023
@ehuss
Copy link
Contributor

ehuss commented May 8, 2023

It might be good to list all the concerns that need to be considered for this. For example:

  • What happens with --open? Does that generate an error?
  • Is there any interaction with -Z rustdoc-scrape-examples? Does the JSON include the scraped data? Or should scraping be disabled when generating the JSON?
  • This mentions adding it to cargo rustdoc, would it also be added to cargo doc? If not, is there a reason?
    • If we do add this to cargo doc, would it also generate JSON files for all the dependencies?

Overall seems like a reasonable flag to add to me.

@ehuss ehuss added S-needs-mentor Status: Issue or feature is accepted, but needs a team member to commit to helping and reviewing. S-needs-team-input Status: Needs input from team on whether/how to proceed. and removed S-triage Status: This issue is waiting on initial triage. labels May 8, 2023
@epage
Copy link
Contributor

epage commented May 8, 2023

@obi1kenobi since you rely on rustdoc's output format, I'm curious on your thoughts on this issue.

@LukeMathWalker
Copy link
Author

Thanks for raising those points @ehuss, I'll try to address them based on what I know.

* What happens with `--open`? Does that generate an error?

I don't think it's necessary to return an error, at least for the JSON format.
Most browsers have native support for rich viewers when it comes to JSON documents, and I do find mysel opening the JSON docs in the browser from time to time for exploration. I suggest we keep honouring the current behaviour, even if the format is JSON.

* Is there any interaction with [`-Z rustdoc-scrape-examples`](https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#scrape-examples)? Does the JSON include the scraped data? Or should scraping be disabled when generating the JSON?

I'm not familiar with the feature, nor with its implications for the JSON format. @aDotInTheVoid perhaps?

* This mentions adding it to `cargo rustdoc`, would it also be added to `cargo doc`? If not, is there a reason?
  * If we do add this to `cargo doc`, would it also generate JSON files for all the dependencies?

Yes, adding it to cargo doc is the natural extension and I do expect it to behave as you describe—generate the JSON docs for all dependencies.
There are some edge cases to keep in mind in that case (i.e. the JSON filename is the crate name, leading to issues if you have multiple versions of the same crate in your dependency tree), but they must probably must resolved regardless.

I have limited this issue to cargo rustdoc since that is the primary interface at the moment and it looks like a reasonable first step.

@obi1kenobi
Copy link
Member

Thanks for the ping @epage. I'm strongly in favor of this. It's a reasonable feature to have, and from my perspective, improving the rustdoc JSON cache hit rate is extremely worth it. Thanks to @LukeMathWalker for figuring this out.

Anything that improves rustdoc JSON performance would be hugely impactful for cargo-semver-checks, since that's by far the "long pole" in its perf graph right now.

For context, cargo-semver-checks v0.20 shipped a massive perf improvement: 109x faster checking time in tokio, 2354x in aws-sdk-ec2, etc. We've also added caching of the "baseline" rustdoc JSON which gives us another 2x speedup on top of those numbers.

Since we've significantly reduced the time taken by all the other steps, the "current" version's rustdoc JSON generation is easily 70-80% of cargo-semver-checks wall-clock time right now. In turn, our users report that cargo-semver-checks even after the optimizations can be up to 50% of their total CI time


the JSON filename is the crate name

This is ever so slightly imprecise, the filename is the lib target's name with all - chars turned into _. This bit cargo-semver-checks recently so I'm bringing it up so it doesn't bite anyone else.

-Z rustdoc-scrape-examples

I believe the JSON does not have an ability to include examples. It only includes doc comments that are explicitly set on items. For the time being, scraping can be disabled. I have no opinion at the moment on whether rustdoc JSON should include examples, and I think it's fine to decide on that later since the JSON format is unstable anyway.

--open

I'm fine with attempting to open the JSON file with the user's default viewer (mine would be VSCode), falling back to the browser if no default viewer is available. If this is complex to achieve on day 1, I'm also fine with this being an error for now.

cargo doc

Totally fine with adding it there too, it makes sense to me.

LukeMathWalker added a commit to LukeMathWalker/pavex that referenced this issue May 15, 2023
This PR is mostly the result of a few "instrument, measure, optimise"
cycles, looking to understand where `pavex_cli` spends its time when
generating application code.

As it turns out, with a warm cache, the vast majority of the time is
spent computing the JSON docs for the local crates. As it stands, this
is not something that we can optimise on our own.
We have two options, which both require upstream work:

1. Make the JSON docs generation faster;
2. Compute multiple JSON docs in parallel (up to the number of threads
available on the machine).

I don't have any concrete leads on 1., while 2. depends on adding
`--output-format` support to `cargo doc`. See
rust-lang/cargo#12103

While hunting inefficiencies down, I added instrumentation around the
key steps of the compiler as well as enriched the `--debug` option with
a disk-based output that can be loaded in the Chrome tracing viewer
(chrome://tracing or ui.perfetto.dev).

You get this kind of visualisation:

<img width="1242" alt="immagine"
src="https://github.com/LukeMathWalker/pavex/assets/20745048/bd5381c3-d1db-48ce-9b59-eadb956ff4a3">

This is an execution trace with a warm cache. You can clearly see how
the lion share of the execution time it spend in `_compute_crate_docs`.

Miscellaneous changes:
- Re-added the `realworld` scaffolding in the `examples` folder;
- The package graph and the SQLite connection are now created in
parallel. We save ~10ms on the overall execution time.
@LukeMathWalker
Copy link
Author

Is there any concern here that remains to be addressed? Or should we start sketching the implementation?

@weihanglo
Copy link
Member

weihanglo commented May 18, 2023

Most concerns seem to be answered. I got some minor questions here: How stable is the rustdoc JSON output? Does it make sense that a user wants to keep different versions of JSON output at the same time?

Apart from my trivial questions, I feel like the flag isn't something that cannot be rolled back, and It also won't affect major build workflow of Cargo. Thus I am happy to discuss the design and implementation.

@LukeMathWalker
Copy link
Author

LukeMathWalker commented May 18, 2023

How stable is the rustdoc JSON output? Does it make sense that a user wants to keep different versions of JSON output at the same time?

It is versioned (via the format_version field), but a specific rustdoc build can only emit a single version of the output.
Supporting multiple versions is primarily a "client-side" concern, if you want your tool to be able to work across multiple nightly releases (e.g. what @obi1kenobi does for cargo-semver-checks).

@weihanglo weihanglo added the T-cargo Team: Cargo label May 18, 2023
@weihanglo
Copy link
Member

Let's do a quick poll before starting.

@rfcbot fcp merge

Proposed solution

Since cargo must remain unaware of the semantics of rustdoc's extra arguments, I suggest we add --output-format as an unstable cargo rustdoc argument, mimicking the semantics of the underlying --output-format option for rustdoc.

This would allow cargo to reason about it and take it into account in the fingerprinting logic.

@rfcbot
Copy link
Collaborator

rfcbot commented May 18, 2023

Team member @weihanglo has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period An FCP proposal has started, but not yet signed off. disposition-merge FCP with intent to merge labels May 18, 2023
@epage
Copy link
Contributor

epage commented May 18, 2023

So long as this is unstable, I have no concern with this. Once we have it,. it'll then also make it easier to explore the benefits and challenges with making it stable (assuming the rustdoc flag is stable)

@rfcbot rfcbot added final-comment-period FCP — a period for last comments before action is taken and removed proposed-final-comment-period An FCP proposal has started, but not yet signed off. labels May 21, 2023
@rfcbot
Copy link
Collaborator

rfcbot commented May 21, 2023

🔔 This is now entering its final comment period, as per the review above. 🔔

@weihanglo weihanglo added S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review and removed S-needs-team-input Status: Needs input from team on whether/how to proceed. S-needs-mentor Status: Issue or feature is accepted, but needs a team member to commit to helping and reviewing. labels May 21, 2023
@weihanglo
Copy link
Member

There is no need to wait util FCP ends. People interested in doing it can start. The Cargo Team may not have a large capacity to mentor but for the review part we'll try our best.

@LukeMathWalker
Copy link
Author

I'll get started next week then 👍🏻 Thanks!

@rfcbot rfcbot added finished-final-comment-period FCP complete to-announce and removed final-comment-period FCP — a period for last comments before action is taken labels May 31, 2023
@rfcbot
Copy link
Collaborator

rfcbot commented May 31, 2023

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

@charmitro
Copy link
Contributor

I'll get started next week then 👍🏻 Thanks!

Hi @LukeMathWalker , were you able to start working on this?

@LukeMathWalker
Copy link
Author

No, a few other things had to take priority.

@charmitro
Copy link
Contributor

No, a few other things had to take priority.

Can other folks(including me) give it a try? Else I believe it's better to assign it to yourself.

@LukeMathWalker
Copy link
Author

If you have the time to start on it immediately, go ahead 😁

@charmitro
Copy link
Contributor

@rustbot claim

@ehuss ehuss removed the to-announce label Dec 21, 2023
charmitro added a commit to charmitro/cargo that referenced this issue Jan 8, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
charmitro added a commit to charmitro/cargo that referenced this issue Jan 8, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
charmitro added a commit to charmitro/cargo that referenced this issue Jan 8, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
charmitro added a commit to charmitro/cargo that referenced this issue Jan 11, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
charmitro added a commit to charmitro/cargo that referenced this issue Jan 11, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
charmitro added a commit to charmitro/cargo that referenced this issue Jan 11, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
charmitro added a commit to charmitro/cargo that referenced this issue Jan 11, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
bors added a commit that referenced this issue Jan 11, 2024
Add unstable `--output-format` option to  `cargo rustdoc`

Add unstable `--output-format` option to "rustdoc"

We achieved this by:
* Handle `--output-format` argument, accepting `html` or `json`
* A new field `json` in `CompileMode::Doc`.
* If `json` is passed, we append the following in
   `prepare_rustdoc`:
  1. `-Zunstable-options`
  2. `--output-format=json`

Fixes #12103
@bors bors closed this as completed in 9276399 Jan 11, 2024
@weihanglo
Copy link
Member

@obi1kenobi @LukeMathWalker
The --output-format flag for cargo rustdoc is available on nightly. It would be much appreciated if you can provide feedback on the cache story, or other UX issues :)

stupendoussuperpowers pushed a commit to stupendoussuperpowers/cargo that referenced this issue Feb 28, 2024
We achieved this by:
   * Handle `--output-format` argument, accepting `html` or `json`
   * If `json` is passed, we append the following in
   `prepare_rustdoc`:
     	1. `-Zunstable-options`
	2. `--output-format`

Fixes rust-lang#12103

Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` disposition-merge FCP with intent to merge finished-final-comment-period FCP complete S-accepted Status: Issue or feature is accepted, and has a team member available to help mentor or review T-cargo Team: Cargo
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants