Designate doc comments #99
314b2b8 to 59bf0d3
My first implementation attempt used regular expressions but, unfortunately, the examples were not passing (https://travis-ci.org/github/tree-sitter/tree-sitter-rust/builds/753422004#L413). My regular expressions seemingly had trouble capturing the pattern. Therefore, I've reworked the approach to use externals (commented in be26083).
59bf0d3 to d25be16
Since doc comments are generally multi-line, I've made the It'd be useful to have a node without the leading
Is there any update on this one?
What's the current status on this? Is any help needed?
Ping?
What about
I do not have merge access to this repository. This is not the only PR in review limbo right now.
@dcreager is there someone maintaining tree-sitter-rust at github right now?
@resolritter can you rebase on the latest and I'll approve the PR.
d25be16 to b09ed6e
FYI #126
The current implementation is incomplete because Rust requires exactly three slashes for doc comments; any more than that and it turns back into a normal line comment. This is explained in the current documentation for comments: https://doc.rust-lang.org/reference/comments.html#examples. Moving this to Draft while I try to fix that. Edit: Should be fixed
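For illustration only, the rule from the linked reference can be sketched as a small standalone C classifier. This is not the PR's scanner code: the names `CommentKind` and `classify_line_comment` are hypothetical, and the sketch only looks at the first characters of a single line. It assumes the reference's examples: exactly three slashes is an outer doc comment, four or more is a plain comment, and `//!` (including `//!!`) is an inner doc comment.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for this sketch; they do not come from the PR. */
typedef enum {
  NOT_A_COMMENT,
  LINE_COMMENT,
  OUTER_DOC_COMMENT,
  INNER_DOC_COMMENT
} CommentKind;

/* Classify one line by its leading characters only. */
static CommentKind classify_line_comment(const char *line) {
  assert(line != NULL);
  if (strncmp(line, "//", 2) != 0) {
    return NOT_A_COMMENT;
  }
  /* `//!` starts an inner doc comment; extra bangs do not change that. */
  if (line[2] == '!') {
    return INNER_DOC_COMMENT;
  }
  /* Exactly three slashes: the third is '/', the fourth is anything else. */
  if (line[2] == '/' && line[3] != '/') {
    return OUTER_DOC_COMMENT;
  }
  /* Two slashes, or four and more, are ordinary line comments. */
  return LINE_COMMENT;
}
```

Under these assumptions, `/// Doc` classifies as an outer doc comment while `//// Doc` falls back to a plain line comment.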
b09ed6e to 2e6b9b1
============================================

/// Doc
/// Comment
//! as well. But I think //!! will become a normal comment. The last I worked on is this
From https://doc.rust-lang.org/reference/comments.html#examples
//! - Inner line doc
//!! - Still an inner line doc (but with a bang at the beginning)
So I implemented it as "any ! makes it a doc comment, no matter how many"
I believe it's only up till four, IIRC it's the same highlight in vim.
In SpaceVim (without LSP) the highlight is different (grey for comments, orange for doc comments) and that was really useful IMHO.
You need to rebase, because I've merged your other PR. Having parser.c in the repository is annoying.
2e6b9b1 to 5c30f3d
Rebased and also added support for
This PR seems to have introduced a bug where the scanner can get into an infinite loop. I'm seeing the Tree-sitter test suite hang after updating
It looks like the infinite loop was happening during some randomized mutation of the
@resolritter Feel free to do a new PR if you can get this to avoid an infinite loop. I also have questions about the need to do this using the external scanner. Couldn't you just do this in the grammar?
grammar({
  // ...
  rules: {
    // ...
    doc_comment: $ => token(choice(
      // exactly three leading slashes
      seq('///', optional(/[^/].*/)),
      seq('//!', /.*/),
    )),
    // any number of leading slashes other than three, which would produce a doc comment.
    line_comment: $ => token(seq(
      '//', optional('//'), /.*/
    )),
  }
});
Of course, this would not join all of the adjacent doc comments into one continuous node - you'd get one node per comment. I think this might be better though: I think it would make it easier to determine what ranges of text contain the documentation itself, because you wouldn't have to deal with leading whitespace. I also just think that it retains more information to provide a node for each comment, and it's somewhat "lossy" to group them all into one node. I'm still open to the other approach though, if people have strong feelings that it's more useful to get a single node.
Whatever happens, I promise we won't wait a year to merge this time.
Having the whole text in a single node is why it was done this way. What would be the alternative for highlighting the code below?
/// ```
/// use foo::Foo;
/// let bar = Foo::new("foo");
/// ```
I can only infer the following steps:
Having the text in a single node gets rid of steps 1 and 2. Or do you see a more efficient way to go about that? Or do you think having to traverse the tree in order to collect the text is not a problem?
How can I try this randomization when testing locally?
if (started_with_slash == false || lexer->lookahead != '/') {
  lexer->result_symbol = DOC_COMMENT;
  while (true) {
    while (lexer->lookahead != '\n') {
I think at least one of the problems is in this line: it should check for lexer->lookahead != 0 as well
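A minimal sketch of the suggested fix, written against a mock lexer so it compiles on its own. `MockLexer`, `mock_init`, `mock_advance` and `skip_to_line_end` are hypothetical stand-ins, not tree-sitter's API. The sketch assumes, as the review comment implies, that the lookahead reads as 0 at end of input and that advancing at end of input does not move past it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real lexer: lookahead is 0 at end of input. */
typedef struct {
  const char *input;
  size_t pos;
  int lookahead;
} MockLexer;

static void mock_init(MockLexer *lexer, const char *input) {
  assert(lexer != NULL && input != NULL);
  lexer->input = input;
  lexer->pos = 0;
  lexer->lookahead = (unsigned char)input[0];
}

static void mock_advance(MockLexer *lexer) {
  if (lexer->lookahead == 0) {
    return; /* already at end of input: nothing left to consume */
  }
  lexer->pos++;
  lexer->lookahead = (unsigned char)lexer->input[lexer->pos];
}

/* Consume characters up to, but not including, the newline or end of input.
 * Without the `!= 0` test, a final line with no trailing '\n' would never
 * satisfy the exit condition and the loop would spin forever. */
static size_t skip_to_line_end(MockLexer *lexer) {
  size_t consumed = 0;
  while (lexer->lookahead != '\n' && lexer->lookahead != 0) {
    mock_advance(lexer);
    consumed++;
  }
  return consumed;
}
```

With input that ends in a doc comment and no trailing newline, the loop stops at end of input instead of hanging.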
This is a good point. It is definitely "lossier" than other nodes since it also includes the leading whitespace for contiguous lines. What's being gained by this approach is the ease of fetching the whole content directly, at the cost of less precision for the ranges. Feel free to close #128 if you feel like it isn't a good tradeoff.
I think there are challenges either way, but it is more straightforward if you have a separate node for each comment. Copying the doc comments' text into a separate buffer is not an option - we need to parse the code in place so that the positions of the nodes in the nested syntax tree correspond correctly to the original file.
So what we need to do is to retrieve a list of ranges from the original file that should be parsed, together, in a nested language (markdown). We can then parse the contents of those ranges using Tree-sitter's
If we have a separate node for each comment, then we need to
If, on the other hand, we have one giant node, then we need to:
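As a rough illustration of the "list of ranges from the original file" idea, the sketch below scans raw source text and records the byte range of each `///` comment's content, skipping the leading whitespace and the slashes. It is an assumption-laden stand-in: it works on plain text rather than on syntax nodes, it ignores `//!` and block comments, and `ByteRange` and `collect_doc_ranges` are hypothetical names, not part of Tree-sitter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical half-open byte range [start, end) into the original source. */
typedef struct {
  size_t start;
  size_t end;
} ByteRange;

/* Record the content range of every line that starts (after indentation)
 * with exactly three slashes. Returns the number of ranges written. */
static size_t collect_doc_ranges(const char *source, ByteRange *out,
                                 size_t max_ranges) {
  assert(source != NULL && out != NULL);
  size_t len = strlen(source);
  size_t count = 0;
  size_t pos = 0;
  while (pos < len && count < max_ranges) {
    /* Find the end of the current line (newline or end of input). */
    size_t line_end = pos;
    while (line_end < len && source[line_end] != '\n') {
      line_end++;
    }
    /* Skip indentation so it is not part of the recorded range. */
    size_t i = pos;
    while (i < line_end && (source[i] == ' ' || source[i] == '\t')) {
      i++;
    }
    /* Exactly three slashes: a fourth slash makes it a plain comment. */
    if (strncmp(source + i, "///", 3) == 0 && source[i + 3] != '/') {
      out[count].start = i + 3;
      out[count].end = line_end;
      count++;
    }
    pos = line_end + 1;
  }
  return count;
}
```

Because the ranges index into the original buffer, positions found by a nested parse over them would still correspond to the original file, which is the property the comment above asks for.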
I was not aware that
Since this use-case is supported by the query API, I am fine with closing #128.
Which tree-sitter test suites caught the regression? Could we run them in the CI of tree-sitter-rust?
closes #88