-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup markdown span handling, take 2 #80550
Conversation
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
This comment has been minimized.
This comment has been minimized.
r? @jyn514 (rust-highfive has picked a reviewer for you, use r? to override) |
I think we should consider asking pulldown if they're willing to do this; I think it would be useful and in scope. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decide if locate() can be simplified by assuming s is always in md
I would expect this to cause unsoundness if wrong, so I'd prefer not to do that. I don't think current pulldown API guarentees it will be borrowed from the original string, it could be borrowed from Box::leak
or something.
Overall this is great though ❤️
src/librustdoc/html/markdown.rs
Outdated
// Ranges provided for Borrowed strings may differ from where the actual string comes | ||
// from. Since the &str points into `md`, we can calculate its actual location. | ||
CowStr::Borrowed(s) => locate(s, span), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why only Borrowed
variants have the wrong span? I would expect instead for it to be wrong for all spans, but we'd only be able to use locate
on Borrowed variants because otherwise it points to a different allocation.
That brings up a couple other questions I have:
- Why does
locate
only work for things within the same allocation? Can we improve it to work regardless? - When would pulldown give us a
Boxed
orInlined
span?
I think this PR would be good to land regardless, but the comment seems off, we should at least fix that before landing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get the easy ones out of the way first:
-
The comment tries to hint at the current problem: we get a
Borrowed
string that contains the destination of a reference-style link, and a range that points at the link usage itself. -
I'm not sure about
Inlined
, but it seems like it's a small string optimization for theBoxed
variant. It exists to avoid allocating onclone
if the boxed string is short enough.
As far as I can see, Pulldown provides a Borrowed
if the string directly comes from the source. If any transformation is necessary (e.g. unescaping used when scanning ref defs or such), Pulldown builds and hands out a Boxed
string.
The issue is not that Borrowed
strings can have an incorrect span, it's that for those, we can actually restore the span we need. For any other variant, it's reasonable (I think) to assume that the string contained in the variant is some sort of derivative of the original source, and can not be located by either pointer arithmetic or string search. (Just as I write this, if I'm correct this makes rfind
-ing on the Boxed
variant kinda pointless, because the string will not be an exact match anyway. Otherwise, Pulldown would have handed out a Borrowed
).
Hope this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is not that Borrowed strings can have an incorrect span, it's that for those, we can actually restore the span we need.
Right, that's what I expected. Can you say that in the comment? Right now it makes it sound like the range will only be different for Borrowed, which isn't true.
Can you take a look at 1 to see if it's possible to get rid of the unsafe? (and hopefully make this work for Boxed
variants too, but after your later comments that might not be possible.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not completely certain what you're referring to as "1" (ah I realized you meant your own numbered point). Getting rid of the unsafe is as easy as casting the pointer to usize
but that's just masking the problem. Otherwise, I would not change locate
if there's a chance for a pulldown update that allows us to remove it entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, r=me with the comment fixed then, and hopefully we can get rid of locate in a pulldown update soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found where my expectation break down: LinkReplacer can give us Borrow
variants that borrow from something other than md
. I made a note of this.
@bors delegate=bugadani |
✌️ @bugadani can now approve this pull request |
9cbac74
to
b67e8c1
Compare
b67e8c1
to
f073543
Compare
@bors r=jyn514 |
📌 Commit f073543 has been approved by |
☀️ Test successful - checks-actions |
This PR includes the cleanups made in #80244 except for the removal of
locate()
.While the biggest conceptual part in #80244 was the removal of
locate()
, it introduced a diagnostic regression.Additional cleanup:
RefCell
to avoid building two separate vectors for the linksWork to do:
locate()
can be simplified by assumings
is always inmd
cc @jyn514 This is the best I can do without patching Pulldown to provide multiple ranges for reference-style links. Also, since
locate
is probably more efficient thanrfind
(at least it's constant time), I decided to not check the link type and just cover every &str as it was before.