Skip to content

Fix unicode escapes in jsx identifiers and extended unicode characters in jsdoc #32716

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

weswigham
Copy link
Member

Which are the two other places we used isIdentifierStart in the scanner.

Fixes #32701

while (isIdentifierPart(text.charCodeAt(pos), ScriptTarget.Latest) && pos < end) {
pos++;
let nextChar: number;
while (isIdentifierPart(nextChar = codePointAt(text, pos), ScriptTarget.Latest) && pos < end) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it intentional not to allow unicode escape sequences in JSDoc identifiers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe so, but I dunno. Given that it's a comment, it just means that we'll interpret the \u0061 verbatim. @sandersn would be the authority here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Doesn't this change cause us to allow escape sequences in JSDoc identifiers?

jsdoc.app doesn't mention unicode as far as google can tell, so we are free to do whatever we want. I vote that JSDoc identifiers work the same as normal ones, which should mean allowing unicode escape sequences.

Copy link
Member

@sandersn sandersn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused after reading @ajafff's question. How does this change affect jsdoc? The tests make it seems like unicode identifiers (and escapes?) now work, but his question implies that it doesn't.

pos++;
let nextChar: number;
while (isIdentifierPart(nextChar = codePointAt(text, pos), ScriptTarget.Latest) && pos < end) {
pos += charSize(nextChar);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stupid question: why isn't charSize(c) === 2 when c >= 0x10000 instead of c > 0x10000?

while (isIdentifierPart(text.charCodeAt(pos), ScriptTarget.Latest) && pos < end) {
pos++;
let nextChar: number;
while (isIdentifierPart(nextChar = codePointAt(text, pos), ScriptTarget.Latest) && pos < end) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused. Doesn't this change cause us to allow escape sequences in JSDoc identifiers?

jsdoc.app doesn't mention unicode as far as google can tell, so we are free to do whatever we want. I vote that JSDoc identifiers work the same as normal ones, which should mean allowing unicode escape sequences.

@weswigham
Copy link
Member Author

he tests make it seems like unicode identifiers (and escapes?) now work, but his question implies that it doesn't.

Unicode characters now scan as identifiers in jsdoc, escapes are still not parsed.

@weswigham
Copy link
Member Author

Doesn't this change cause us to allow escape sequences in JSDoc identifiers?

No.

@sandersn
Copy link
Member

sandersn commented Aug 6, 2019

That makes sense. It would be nice to have JSDoc support escapes as well, but I think the right way to do that is to make JSDoc use the normal scanner. That's a complete rewrite of the low-level parsing code too, so I'm not sure it's worth it to us.

@weswigham
Copy link
Member Author

@sandersn Oh, too late. I already pushed a commit that adds unicode escape support into jsdoc.

@weswigham weswigham merged commit f333684 into microsoft:master Aug 6, 2019
@weswigham weswigham deleted the simplify-jsx-identifier-scanning branch August 6, 2019 22:14
@orta
Copy link
Contributor

orta commented Aug 10, 2019

I'm pretty sure this deprecates #32631 - can someone double check me?

@weswigham
Copy link
Member Author

I don't think so - is jsdoc, at least, we're still going to stop parsing before the hyphen.

@HiEv
Copy link

HiEv commented Oct 5, 2019

I don't think so - is jsdoc, at least, we're still going to stop parsing before the hyphen.

I'm a bit confused by this reply. In JSDoc you shouldn't stop parsing before the hyphen if there's no space. The JSDoc documentation says that it should be able to handle dashes in its member names. For example:

/**
 * @typedef {object} node
 * @property {string} aria-activedescendant - Description.
 */
/** @type {node} */
var node;

That does work in JSDoc from the commandline, but not in VSCode. (See the comments here.)

In VSCode, if you hover the mouse over the "node" on the "@typedef" line, it should show:

type node = {
	aria-activedescendant: string;
}

However, instead you get:

type node = {
	aria: string;
}

with the "-activedescendant" part missing.

If you put quotes around "aria-activedescendant", then you get this instead:

type node = {
	(Missing): string;
}

#32631 was supposed to fix that problem (#14395).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Insonsistent parsing of JsxIdentifier
5 participants