Skip to content

get rid of some false negatives in rustdoc::broken_intra_doc_links #132748

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

lolbinarycat
Copy link
Contributor

@lolbinarycat lolbinarycat commented Nov 7, 2024

rustdoc will not try to do intra-doc linking if the "path" of a link looks too much like a "real url".

however, only inline links (text) can actually contain a url, other types of links (reference links, shortcut links) contain a reference which is later resolved to an actual url.

the "path" in this case cannot be a url, and therefore it should not be skipped due to looking like a url.

fixes #54191

@rustbot
Copy link
Collaborator

rustbot commented Nov 7, 2024

r? @notriddle

rustbot has assigned @notriddle.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Nov 7, 2024
@lolbinarycat lolbinarycat force-pushed the rustdoc-intra-doc-link-warn-more-54191 branch from df3d0f6 to 8ae6586 Compare November 7, 2024 21:41
@rust-log-analyzer

This comment has been minimized.

Copy link
Contributor

@notriddle notriddle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good idea, but I have found one problem. Consider a sample like this:

//@ check-pass
#![no_std]
#![deny(rustdoc::broken_intra_doc_links)]

// regression test for https://github.com/rust-lang/rust/issues/54191
// false positives

//! This is not an intra-doc link: [`foobar`]
//!
//! [`foobar`]: /index.html

That test case fails, because it thinks /index.html is an intra-doc link. You're right that certain kinds of links aren't allowed to be URLs—that's the Unknown type links.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 7, 2024
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@lolbinarycat
Copy link
Contributor Author

welp, turns out the standard library itself contains several of these false negatives!

@lolbinarycat
Copy link
Contributor Author

...or perhaps there's some deeper bug with intra-doc link generation?

there seems to be some false positives containing spaces, even though there is a matching reference definition...

I guess we can just re-enable unconditionally ignoring links with spaces?

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@lolbinarycat
Copy link
Contributor Author

Ok, this breaks a lot of ui tests. This might actually have a pretty big impact, maybe we should do a crater run or something..?

@lolbinarycat lolbinarycat added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 8, 2024
@notriddle
Copy link
Contributor

It's looks fine to me. Here's the PRs where each of these test cases were added:

test case PR
disambiguator-endswith-named-suffix.rs #127016
weird-syntax.rs #112014
redundant_explicit_links-utf8.rs #115070

The last two are both contrived ICE tests, and all of them are intended to be parsed as intra-doc links. Making it so that they warn seems fine to me.

r? @GuillaumeGomez do you also think this is a suitable bug fix?

@rustbot rustbot assigned GuillaumeGomez and unassigned notriddle Nov 8, 2024
@notriddle notriddle self-assigned this Nov 8, 2024
@GuillaumeGomez
Copy link
Member

I think so. I'm not completely satisfied with the current impact though. Intra-doc links should not be triggered on items that contain characters which can't be in an ident or a path, like Clone\(\). Maybe a different warning in these cases?

@lolbinarycat
Copy link
Contributor Author

@GuillaumeGomez what about just amending the current warning to account for other common mistakes?

honestly makes me wish we had --explain for lints, putting a full help message for each one could get a bit verbose.

regardless, i think that should be a separate issue, since the lint output isn't the most helpful in general (it just recommends escaping the brackets, and doesn't mention other common issues, such as misspelled link references)

@GuillaumeGomez
Copy link
Member

what about just amending the current warning to account for other common mistakes?

Do you have an example of before/after of what you have in mind by any chance?

@lolbinarycat
Copy link
Contributor Author

Do you have an example of before/after of what you have in mind by any chance?

well, the most obvious one is some sort of "consider adding a reference: [whatever]: //some-url.example/", but i also think trying to catch typos by comparing the edit distance of the missing reference to that of existing references could be handy.

and perhaps we should also mention surrounding code by backticks, since code snippets should generally be in code blocks instead of individually escaping chars.

@GuillaumeGomez
Copy link
Member

well, the most obvious one is some sort of "consider adding a reference: [whatever]: //some-url.example/", but i also think trying to catch typos by comparing the edit distance of the missing reference to that of existing references could be handy.

Sounds good to me. The distance between two items could be nice too (as a follow-up?).

@lolbinarycat
Copy link
Contributor Author

This was discussed in the latest t-rustdoc meeting

Consensus was we need some sort of additional heuristic to allow stuff like "[1, 2]" without escaping, but no consensus was reached as to what that heuristic should be.

@lolbinarycat
Copy link
Contributor Author

This was discussed in this month's t-rustdoc meeting, and the consensus was that reference-style intra-doc links should not be checked unless they are wrapped with backticks.

this means something like [`Type.method()`] will cause a lint to be emitted, but [Type.method()] will not.

lolbinarycat and others added 8 commits April 18, 2025 14:47
rustdoc will not try to do intra-doc linking if the "path"
of a link looks too much like a "real url".

however, only inline links ([text](url)) can actually contain
a url, other types of links (reference links, shortcut links)
contain a *reference* which is later resolved to an actual url.

the "path" in this case cannot be a url, and therefore it should
not be skipped due to looking like a url.

fixes rust-lang#54191
fix link kind check

Co-authored-by: Michael Howell <michael@notriddle.com>
add error annotation

Co-authored-by: Michael Howell <michael@notriddle.com>
@lolbinarycat lolbinarycat force-pushed the rustdoc-intra-doc-link-warn-more-54191 branch from a918960 to 6747e21 Compare April 18, 2025 19:55
@lolbinarycat
Copy link
Contributor Author

I know @notriddle wanted to actually make the lints more generous in some cases, but the way this is currently implemented is that links that look like urls get completely ignored by the entire intra-doc link system, not just ignored by the lint.

so, the best I can do without significant overhauls is to revert to the previous buggy behavior if there are no backticks.

this is in an effort to reduce the amount of code churn caused by
this lint triggering on text that was never meant to be a link.

a more principled hierustic for ignoring lints is not possible
without extensive changes, due to the lint emitting code
being so far away from the link collecting code,
and the fact that only the link collecting code
has access to details about how the link appears in the
unnormalized markdown.
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

collapsed links and reference links have a pretty particular syntax,
it seems unlikely they would show up on accident.
@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added the A-rustdoc-json Area: Rustdoc JSON backend label Apr 19, 2025
@rustbot
Copy link
Collaborator

rustbot commented Apr 19, 2025

Some changes occurred in tests/rustdoc-json

cc @aDotInTheVoid

@lolbinarycat lolbinarycat force-pushed the rustdoc-intra-doc-link-warn-more-54191 branch from 513d9b7 to 32a73bf Compare April 19, 2025 15:56
jhpratt added a commit to jhpratt/rust that referenced this pull request Apr 21, 2025
…aumeGomez

jsondocck: Require command is at start of line

In one place we use `///`@`` instead of `//`@`.` The test-runner allowed it, but it probably shouldn't. Ran into by `@lolbinarycat` in rust-lang#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(`@.name=='Foo')].id"`
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in rust-lang#137103

r? `@GuillaumeGomez`
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Apr 21, 2025
…aumeGomez

jsondocck: Require command is at start of line

In one place we use `///`@`` instead of `//`@`.` The test-runner allowed it, but it probably shouldn't. Ran into by `@lolbinarycat` in rust-lang#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(`@.name=='Foo')].id"`
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in rust-lang#137103

r? `@GuillaumeGomez`
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 21, 2025
…aumeGomez

jsondocck: Require command is at start of line

In one place we use `///``@``` instead of `//``@`.`` The test-runner allowed it, but it probably shouldn't. Ran into by ``@lolbinarycat`` in rust-lang#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(``@.name=='Foo')].id"``
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in rust-lang#137103

r? ``@GuillaumeGomez``
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 21, 2025
Rollup merge of rust-lang#140076 - aDotInTheVoid:jsondocline, r=GuillaumeGomez

jsondocck: Require command is at start of line

In one place we use `///``@``` instead of `//``@`.`` The test-runner allowed it, but it probably shouldn't. Ran into by ``@lolbinarycat`` in rust-lang#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(``@.name=='Foo')].id"``
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in rust-lang#137103

r? ``@GuillaumeGomez``
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Apr 22, 2025
jsondocck: Require command is at start of line

In one place we use `///``@``` instead of `//``@`.`` The test-runner allowed it, but it probably shouldn't. Ran into by ``@lolbinarycat`` in rust-lang/rust#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(``@.name=='Foo')].id"``
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in #137103

r? ``@GuillaumeGomez``
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Apr 24, 2025
jsondocck: Require command is at start of line

In one place we use `///``@``` instead of `//``@`.`` The test-runner allowed it, but it probably shouldn't. Ran into by ``@lolbinarycat`` in rust-lang/rust#132748 (comment):

```
error: unknown disambiguator `?(`
##[error] --> /checkout/tests/rustdoc-json/fns/return_type_alias.rs:3:25
  |
3 | ///@ set foo = "$.index[?(``@.name=='Foo')].id"``
  |                         ^^
  |
```

Maybe it's also worth erroring on this like we added in #137103

r? ``@GuillaumeGomez``
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rustdoc-json Area: Rustdoc JSON backend S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rustdoc does not warn about broken links if they contain . or []
6 participants