Fix relative path handling for --extern-html-root-url#152977
Fix relative path handling for --extern-html-root-url#152977rust-bors[bot] merged 3 commits intorust-lang:mainfrom
Conversation
|
r? @notriddle rustbot has assigned @notriddle. Use Why was this reviewer chosen?The reviewer was selected based on:
|
src/librustdoc/html/format.rs
Outdated
| match cache.extern_locations[&def_id.krate] { | ||
| ExternalLocation::Remote(ref s) => { | ||
| *is_remote = true; | ||
| let is_abs = s.contains("://") || s.starts_with('/'); |
There was a problem hiding this comment.
There's actually the (uncommon) possibilities that someone specifies a network-path reference URI (which would begin // without the scheme: prefix); or a relative-path reference that happens to include :// somewhere within the path (for example ../this_is_silly://but_perfectly_valid).
Perhaps instead use a regex, such as that from Appendix B of RFC 3986, and treat the URI as absolute if (i) either of the scheme or authority components are present; or (ii) the path component begins with /.
Moreover, that's a check we only need to perform once when creating the ExternalLocation::Remote—so perhaps move it into an is_absolute boolean of the variant and avoid recomputing it every time we create a URL for that crate?
There was a problem hiding this comment.
Updated — is_absolute is now computed once in to_remote using proper RFC 3986 scheme validation and stored in the variant. Also renamed is_remote to is_absolute throughout, which should answer your second question too.
src/librustdoc/html/format.rs
Outdated
| ExternalLocation::Remote(ref s) => { | ||
| *is_remote = true; | ||
| let is_abs = s.contains("://") || s.starts_with('/'); | ||
| *is_remote = is_abs; |
There was a problem hiding this comment.
I'm not sure what this is_remote variable does? Is it the case that extern crates that are reached by a relative path should set it to false?
There was a problem hiding this comment.
Yes, relative paths set it to false so that root_path still gets prepended in make_href, same as local paths. The name is confusing, your suggestion to store is_absolute in the variant would clean this up.
src/librustdoc/html/format.rs
Outdated
| let extra = relative_to.len().saturating_sub(1); | ||
| let mut b: UrlPartsBuilder = iter::repeat_n("..", extra).collect(); |
There was a problem hiding this comment.
I think we might be able to use let mut b = href_relative_parts(&[], relative_to); here instead?
There was a problem hiding this comment.
It takes &[Symbol] so the types don't line up directly, and passing an empty slice to mean "go to root" is a bit of a semantic stretch IMO. The explicit repeat_n is clearer about intent here.
There was a problem hiding this comment.
Amazing, thanks again :) I've only really the one concern - that we might also need to consider relative paths when using such ExternalLocation::Remote.url elsewhere, but perhaps that's moot if they're guaranteed only to be output when based at the crate root?
| let is_absolute = url.starts_with('/') | ||
| || url.split_once("://").is_some_and(|(scheme, _)| { | ||
| scheme.bytes().next().is_some_and(|b| b.is_ascii_alphabetic()) | ||
| && scheme | ||
| .bytes() | ||
| .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'+' | b'-' | b'.')) | ||
| }); |
There was a problem hiding this comment.
Makes me wonder whether we should do further sanity checking here, eg ensuring that url does not have query or fragment parts - but that can be dealt with as a separate issue/PR.
There was a problem hiding this comment.
Agreed, worth a follow-up.
src/librustdoc/html/format.rs
Outdated
| ExternalLocation::Remote { ref url, .. } => { | ||
| // `ExternalLocation::Remote` always end with a `/`. | ||
| format!("{s}{path}", path = fmt::from_fn(|f| path.iter().joined("/", f))) | ||
| format!("{url}{path}", path = fmt::from_fn(|f| path.iter().joined("/", f))) |
There was a problem hiding this comment.
Are these links guaranteed only to appear based from the crate root, or do we also need to add appropriate levels of ../ for the relative url case here (and in the other few places I've commented "ditto" below)? If adjustment is required then I guess the code to do that can be factored out somewhere (a method on ExternalLocation perhaps)?
There was a problem hiding this comment.
Yeahh, these other sites have the same gap. Before this PR none of them handled relative paths either. I'll expand the fix to cover them.
There was a problem hiding this comment.
Okay, factored out a remote_url_prefix helper and applied it to the macro href and primitive link sites. The source link in context.rs and the make_href/root_path interaction are more involved since source pages have different depth semantics, so I'll handle those in a follow-up.
src/librustdoc/html/format.rs
Outdated
| let cname_sym = ExternalCrate { crate_num: def_id.krate }.name(cx.tcx()); | ||
| let builder: UrlPartsBuilder = | ||
| [s.as_str().trim_end_matches('/'), cname_sym.as_str()] | ||
| [url.as_str().trim_end_matches('/'), cname_sym.as_str()] |
| ExternalLocation::Remote(ref s) => { | ||
| root = s.to_string(); | ||
| ExternalLocation::Remote { ref url, .. } => { | ||
| root = url.to_string(); |
There was a problem hiding this comment.
If we can't properly handle relative extern locations here for now, could we add a FIXME: comment?
There was a problem hiding this comment.
Added FIXME comments on both sites.
| name: e.name(self.tcx).to_string(), | ||
| html_root_url: match external_location { | ||
| ExternalLocation::Remote(s) => Some(s.clone()), | ||
| ExternalLocation::Remote { url, .. } => Some(url.clone()), |
There was a problem hiding this comment.
As above, perhaps add a FIXME: here?
| if !is_remote && let Some(root_path) = root_path { | ||
| if !is_absolute && let Some(root_path) = root_path { | ||
| let root = root_path.trim_end_matches('/'); | ||
| url_parts.push_front(root); | ||
| } |
There was a problem hiding this comment.
Sorry, one more thing. If we have a url like ../../core/index.html that is relative to the current page, won't prefixing it with root cause it to break (unless root happens to be a path to the current directory)?
There was a problem hiding this comment.
Sorry, can we add a FIXME: here too?
51db628 to
2f68c59
Compare
2f68c59 to
b9614b2
Compare
Co-authored-by: Michael Howell <michael@notriddle.com>
There was a problem hiding this comment.
While looking for more bugs, I wrote a few test cases that you probably want to incorporate. All of them passed, so it seems fine. Just bringing them up:
diff --git a/tests/rustdoc-html/extern/extern-html-root-url-relative.rs b/tests/rustdoc-html/extern/extern-html-root-url-relative.rs
index df6ebf1aedd..ba2b50c6bf2 100644
--- a/tests/rustdoc-html/extern/extern-html-root-url-relative.rs
+++ b/tests/rustdoc-html/extern/extern-html-root-url-relative.rs
@@ -1,4 +1,4 @@
-//@ compile-flags:-Z unstable-options --extern-html-root-url core=../ --extern-html-root-takes-precedence
+//@ compile-flags:-Z unstable-options --extern-html-root-url core=../ --extern-html-root-takes-precedence --generate-link-to-definition
// At depth 1 (top-level), the href should be ../core/...
//@ has extern_html_root_url_relative/index.html
@@ -9,7 +9,19 @@
// At depth 2 (inside a module), the href should be ../../core/...
pub mod nested {
//@ has extern_html_root_url_relative/nested/index.html
- //@ has - '//a/@href' '../../core/iter/index.html'
+ //@ has - '//a/@href' '../../core/future/index.html'
#[doc(no_inline)]
- pub use std::iter;
+ pub use std::future;
}
+
+// Also depth 2, but for an intra-doc link.
+//@ has extern_html_root_url_relative/intra_doc_link/index.html
+//@ has - '//a/@href' '../../core/ptr/fn.write.html'
+/// [write](<core::ptr::write()>)
+pub mod intra_doc_link {
+}
+
+// link-to-definition
+//@ has src/extern_html_root_url_relative/extern-html-root-url-relative.rs.html
+//@ has - '//a/@href' '../../core/iter/index.html'
+//@ has - '//a/@href' '../../core/future/index.html'There was a problem hiding this comment.
Added, thanks for the extra coverage.
|
@bors r+ |
Rollup of 12 pull requests Successful merges: - #149169 (ptr::replace: make calls on ZST null ptr not UB) - #150562 (Fix doc link used in suggestion for pinning self) - #152418 (`BTreeMap::merge` optimized) - #152679 (rustc_expand: improve diagnostics for non-repeatable metavars) - #152952 (mGCA: improve ogca diagnostic message ) - #152977 (Fix relative path handling for --extern-html-root-url) - #153017 (Implement debuginfo for unsafe binder types) - #152868 (delete some very old trivial `Box` tests) - #152922 (rustc_public: Make fields that shouldn't be exposed visible only in `rustc_public`) - #153032 (Fix attribute parser and kind names.) - #153051 (Migration of `LintDiagnostic` - part 3) - #153060 (Give a better error when updating a submodule fails)
Rollup merge of #152977 - arferreira:fix-extern-html-relative-depth, r=notriddle Fix relative path handling for --extern-html-root-url When `--extern-html-root-url` receives a relative path (like `../`), rustdoc uses it as a literal prefix regardless of page depth. This works at depth 1 (`crate/index.html`) but breaks at depth 2+ (`crate/module/struct.Foo.html`) because the relative path doesn't account for the extra nesting. This patch detects relative vs absolute URLs in the `Remote` branch of `url_parts` and prepends the necessary `../` segments based on the current page depth. Absolute URLs (`https://...`) and server-absolute paths (`/docs/...`) are unchanged. This makes relative paths a viable option for `--extern-html-root-url`, which they weren't before. Related to #152917, which exposed that relative paths weren't handled correctly. cc @eggyal
When
--extern-html-root-urlreceives a relative path (like../), rustdoc uses it as a literal prefix regardless of page depth. This works at depth 1 (crate/index.html) but breaks at depth 2+ (crate/module/struct.Foo.html) because the relative path doesn't account for the extra nesting.This patch detects relative vs absolute URLs in the
Remotebranch ofurl_partsand prepends the necessary../segments based on the current page depth. Absolute URLs (https://...) and server-absolute paths (/docs/...) are unchanged.This makes relative paths a viable option for
--extern-html-root-url, which they weren't before.Related to #152917, which exposed that relative paths weren't handled correctly.
cc @eggyal