JSDoc comment parsing supports multiline backticks #50677

sandersn · 2022-09-07T18:50:54Z

Previously, backticks in JSDoc ended at the end of a line. This PR changes that so that backticks continue parsing until another backtick. I'm not sure this is a good change, and I left the code a bit inelegant until we decide.

Note that triple backticks and single backticks are treated the same way: both have to be closed. The code change is already very clunky, and I haven't thought about how much clunkier it would be to track backtick history in the parser.

I would appreciate opinions on whether this is a sensible change in behaviour, as well as ideas on how hard it would be to distinguish triple from single backticks.

Fixes #47679

Previously, backticks in JSDoc ended at the end of a line. This PR changes that so that backticks continue parsing until another backtick. Note that triple backticks and single backticks are treated the same way. The change is already very clunky, and I haven't thought about how much clunkier it would be to track backtick history in the parser.

jakebailey · 2022-09-09T05:51:11Z

I haven't read the PR yet, but I enabled the "tsdoc" ESLint linter plugin and there are even quite a few comments in our repo that have unclosed backtick blocks, so I'm half worried what we might break out there.

sandersn · 2022-09-12T17:35:24Z

I agree. After some discussion with @andrewbranch , this can be summarised as improvement in correctness at the cost of breaking existing code. I'm inclined to close this PR, but other opinions are welcome. @DanielRosenwasser you might want to weigh in.

Edit: Probably we need an estimate of how common unbalanced backticks are.

trusktr · 2022-10-01T16:42:12Z

I haven't read the PR yet, but I enabled the "tsdoc" ESLint linter plugin and there are even quite a few comments in our repo that have unclosed backtick blocks, so I'm half worried what we might break out there.

Why are there unclosed back ticks? I've never seen that in my entire life, and it is an interesting coincidence that the TS repo (of all places) has them.

jakebailey · 2022-10-01T17:20:17Z

Just a typo (which I've already fixed). I'm only worried because JSDoc has semantic meaning in some codebases where an unclosed backtick might eat an @param or @deprecated or something later.

DanielRosenwasser · 2023-01-24T21:56:42Z

#52398 is a related issue - any recollection whether \ escaping was discussed around this?

DanielRosenwasser · 2023-01-25T19:23:27Z

Let's revisit this and consider whether or not we can give suggestion diagnostics for plain JS users, and errors for checkJs users.

sandersn · 2023-01-30T16:43:18Z

@DanielRosenwasser Here's a summary of the discussion from the editor meeting. No conclusions, just more issues raised:

In addition to quoting tags and decorators, we would like to have a way to quote jsdoc comments so they too could be embedded. The ending token */ is the problem and would need a quoting mechanism, perhaps *\/
This PR breaks on incorrectly closed backticks, and the breaks in our codebase indicated that breaks would happen all over.
(One thing I forgot to mention: distinguishing single backticks from triple backticks is a difficult change given the current jsdoc tokeniser.)
The breaks, while common, might not be too bad since it just results in badly formatted jsdoc for .ts, and some missing information for .js.
In .js we could detect and issue an error when the parser ends in a quoted state, although there may be some linters that already do this. (@jakebailey ran some tool to detect unclosed backticks on our code.)

So one possible path forward is to add an "unclosed backtick" error to this PR, for JS files only. We'd need to check what linters issue similar errors, however.

rbuckton · 2023-02-03T18:00:41Z

I am not sure the JSDoc parsing changes are sufficient to correctly handle embedded markdown. If we want backtick parsing to be reliable, we should probably ensure we support the following cases as well:

Code spans in markdown aren't interpreted as code spans if there is a hard line break (e.g. 2+ newlines) anywhere after the starting backtick, and only include soft line breaks if it can find a trailing backtick.

`b
c`

is interpreted as a code span:

b c

while

`b

c`

is interpreted as literal characters:

`b

c`

Code spans must also be balanced so that they are not terminated improperly:

```abc```

is interpreted as:

abc

while

```abc``

is interpreted as:

```abc``

Code spans can contain backticks as long as the starting and ending backtick runs are greater in number than the number of sequential backticks to include in the code span:

```a `b` c```

is interpreted as:

a `b` c

If the code text of code span needs to include a backtick at the start or end, a space is required:

`` `b``

is interpreted as:

`b

while

`` ` ``

is interpreted as:

`

Code blocks begin with three or more backticks, followed by any non-backtick characters (primarily used for language selection), and end when the parser encounters the same number of backticks on a separate line, or the end of the input:

```abc
def
```

is interpreted as:

def

while

```abc
def```

is interpreted as (note the backticks are included in the code block):

def```

Also, code blocks aren't only detected from backticks. Fenced code blocks can also be written using 3+ sequential tilde characters:

~~~
abc
~~~

is interpreted as:

abc

rbuckton · 2023-02-03T18:01:48Z

tests/cases/fourslash/quickInfoJSDocBackticks.ts

 goTo.marker("x");
 verify.quickInfoIs("(parameter) x: string", "hi there `@param`");
 goTo.marker("y");
-verify.quickInfoIs("(parameter) y: string", "hi there `@ * param\nthis is the margin");
+verify.quickInfoIs("(parameter) y: string", "hi there `@ * param\nthis is the margin\n@param {string} z OOPS, unclosed backtick!");


This isn't correct. Markdown won't detect this as a code span because there's no closing backtick, so it will treat the backtick as a literal character.

rbuckton · 2023-02-03T18:28:42Z

I was experimenting with better markdown parsing for TSDoc a few years ago. The parsing rules I pursued were that any @tag at the start of a line would be considered a markdown block element, following the rest of the markdown rules for everything else. So:

@param x `a
@param y `b

Would treat the code span starting at `a` as unterminated, because blocks are parsed before spans. This also means the code span starting at b would be unterminated.

This gets a bit tricker when JSDoc parameter descriptions flow onto the next line, since JSDoc rules don't align with markdown's block-level indentation rules. Consider an ordered list:

1. ```abc
def
```

is interpreted as:

def

while

2. ```abc
   def
   ```
3. abc
```def
ghi
```
4. abc
   ```def
   ghi
   ```

is interpreted as:

```
def
```
abc

ghi

abc
```
ghi
```

Ordered list indentation expects the text to flow at the same indentation level. So for (1), the ```abc block ends immediately because the content that follows is part of a different block. (2) is properly indented, (3) only contains abc in its description, the code block is a different paragraph, and (4) is properly indented.

This makes the rules around @param a bit weird and open to interpretation. Due to varying commenting styles, we probably want to use the indentation of the 2nd line of the tag (if present) up to the offset where the parameter description would start. If the first line starts with a code fence, we would use the offset of the parameter description as the indentation:

@param a this is a multiline parameter
whose initial indentation is 0.

@param b this is a multiline parameter
         whose initial indentation is 9.

in the case of code blocks whose initial indentation is 0:

@param a this has an unterminated code block at indentation 0
```
the initial indentation here is also 0
@param b since the code block shares the indentation of this param tag,
this content is considered part of the code block.

if the indentation of the code block is greater than the block tag that follows:

@param a this has an unterminated code block at indentation 0
  ```
  the indentation here is 2
@param b this is not considered part of the code block.

if the code block starts the parameter description, its initial indentation should be the offset of the start of the description:

@param a ```
         the code block has an indentation of 9.
@param b this is not considered part of the preceding code block.

sandersn · 2023-07-25T18:06:49Z

I'd rather avoid parsing real markdown code blocks, since our JSDoc parsing is already quite slow. But as @rbuckton shows, any improvement less than real parsing is unlikely to improve things. I'm going to close this PR.

dreamorosi · 2023-07-25T18:11:18Z

@sandersn is there going to be any alternative to solve the issue?

sandersn · 2023-07-25T21:53:19Z

@dreamorosi I haven't come up with anything yet. If you have some ideas, please do propose them on the original issue.

typescript-bot assigned sandersn Sep 7, 2022

typescript-bot added Author: Team For Milestone Bug PRs that fix a bug with a specific milestone labels Sep 7, 2022

sandersn requested review from andrewbranch, gabritto and jakebailey September 7, 2022 18:51

sandersn requested a review from DanielRosenwasser September 12, 2022 17:35

sandersn mentioned this pull request Sep 12, 2022

TypeScript class method decorator not rendering properly in version >= 1.63.2 #47679

Open

DanielRosenwasser mentioned this pull request Jan 24, 2023

Support for comments inside JSDoc @example comments #52398

Open

5 tasks

rbuckton suggested changes Feb 3, 2023

View reviewed changes

sandersn closed this Jul 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

JSDoc comment parsing supports multiline backticks #50677

JSDoc comment parsing supports multiline backticks #50677

Uh oh!

sandersn commented Sep 7, 2022

Uh oh!

jakebailey commented Sep 9, 2022

Uh oh!

sandersn commented Sep 12, 2022 •

edited

Loading

Uh oh!

trusktr commented Oct 1, 2022 •

edited

Loading

Uh oh!

jakebailey commented Oct 1, 2022 •

edited

Loading

Uh oh!

DanielRosenwasser commented Jan 24, 2023

Uh oh!

DanielRosenwasser commented Jan 25, 2023

Uh oh!

sandersn commented Jan 30, 2023

Uh oh!

rbuckton commented Feb 3, 2023

Uh oh!

rbuckton Feb 3, 2023

Uh oh!

rbuckton commented Feb 3, 2023

Uh oh!

sandersn commented Jul 25, 2023

Uh oh!

dreamorosi commented Jul 25, 2023

Uh oh!

sandersn commented Jul 25, 2023

Uh oh!

Uh oh!

JSDoc comment parsing supports multiline backticks #50677

JSDoc comment parsing supports multiline backticks #50677

Uh oh!

Conversation

sandersn commented Sep 7, 2022

Uh oh!

jakebailey commented Sep 9, 2022

Uh oh!

sandersn commented Sep 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

trusktr commented Oct 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jakebailey commented Oct 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DanielRosenwasser commented Jan 24, 2023

Uh oh!

DanielRosenwasser commented Jan 25, 2023

Uh oh!

sandersn commented Jan 30, 2023

Uh oh!

rbuckton commented Feb 3, 2023

Uh oh!

rbuckton Feb 3, 2023

Choose a reason for hiding this comment

Uh oh!

rbuckton commented Feb 3, 2023

Uh oh!

sandersn commented Jul 25, 2023

Uh oh!

dreamorosi commented Jul 25, 2023

Uh oh!

sandersn commented Jul 25, 2023

Uh oh!

Uh oh!

sandersn commented Sep 12, 2022 •

edited

Loading

trusktr commented Oct 1, 2022 •

edited

Loading

jakebailey commented Oct 1, 2022 •

edited

Loading