Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MEGA-ISSUE: Syntax highlighting in language-javascript #875

Open
savetheclocktower opened this issue Jan 19, 2024 · 12 comments
Open

MEGA-ISSUE: Syntax highlighting in language-javascript #875

savetheclocktower opened this issue Jan 19, 2024 · 12 comments
Labels
bug Something isn't working

Comments

@savetheclocktower
Copy link
Contributor

savetheclocktower commented Jan 19, 2024

IMPORTANT: Some issues have already been fixed!

If you’re still on the regular 1.113 release, you might be suffering from a problem that has already been fixed. Many fixes landed on master in #859. You are encouraged to download the latest rolling release — both (a) to see whether what you’re reporting has been fixed, and (b) so that you can enjoy the fixes that have been reported since the 1.113 release.


This will serve as a catch-all issue for any problems you may encounter with syntax highlighting in JavaScript files (language-javascript). If you have problems with the new syntax highlighting (or folds, or indents) that are specific to js or jsx files, keep reading.

Something isn't highlighting correctly!

If there are any highlighting regressions since 1.113:

First, please scroll down and see if someone else has reported the issue. If not, please comment with

  • A small amount of sample code to reproduce the issue
  • Screenshots (before and after would be ideal, but just the “after” is OK for obvious things

I want to go back to the old highlighting!

You can easily opt out of the new Tree-sitter highlighting for any language, but first please:

  • subscribe to this issue so you'll know when the problem is fixed
  • remove the config change when the next release comes out so that you can enjoy the new grammar once again

To revert to the old Tree-sitter grammar for this language only, add the following to your config.cson:

".js.source":
  core:
    useLegacyTreeSitter: true

To revert to the TextMate-style grammar for this language only, add the following to your config.cson:

".js.source":
  core:
    useTreeSitterParsers: false
@savetheclocktower savetheclocktower added the bug Something isn't working label Jan 19, 2024
@savetheclocktower
Copy link
Contributor Author

There have already been a handful of JavaScript fixes in #859 (check the commit descriptions for details), and I've upgraded the tree-sitter-javascript parser to the latest master since the 1.113 release.

@claytonrcarter
Copy link
Contributor

I'm see some inconsistency in identifier coloring in JS. In particular, variables are scoped and colored differently inside and outside of "blocks", and also depending on how they're used. Here's an example (this is the Firefox theme, but he results are the same for the built in themes that I tried)
Screen Shot 2024-01-19 at 8 27 20 AM

Notice how foo is not highlighted on lines 5, 10, 11, 16 and 17. From what I can tell:

  • foo is only getting source.js when outside of these blocks (eg line 3)
  • foo is getting source.js, meta.block.js for references inside these blocks.
  • foo is getting source.js, meta.block.js, variable.other.assignment.js for assignments inside these blocks (eg 6)

In CSS/terms, this means that .syntax--source.syntax--js (blue, in my case) is winning outside of the block, that .syntax--js is winning inside the block (unstyled, for me), and that .syntax--js.syntax--variable (blue again) is winning in assignments.

The legacy tree-sitter highlighter does not appear to apply the meta.block.js scope, so all occurrences of foo are blue for me, presumably because .syntax--source.syntax--js is winning in the "inside block, not assignment" cases.

I note that in your write up on scopes you say that you want "to avoid the temptation to classify every remaining identifier as variable.other and thereby dilute the significance of the scope", and I totally get that, but this is a case where semantically similar/identical things are looking different. Is this a case where themes should be updated, or what purpose is meta.block.js serving here?

@claytonrcarter
Copy link
Contributor

Oh, BTW I tried this out on the latest version of #859 and I didn't see it changing. The screenshot above was taken while running that branch. (At least to the best of my knowledge. 😄 )

@savetheclocktower
Copy link
Contributor Author

I'm see some inconsistency in identifier coloring in JS. In particular, variables are scoped and colored differently inside and outside of "blocks", and also depending on how they're used. Here's an example (this is the Firefox theme, but he results are the same for the built in themes that I tried)

You had me nervous for a second.

The variable scopes are present on lines 1, 2, 6, 7, 12, and 13, because that's where variables are declared, assigned, or reassigned. You are seeing what you are seeing in the Firefox theme (assuming it's the firefox-syntax package) because that package makes utterly insane choices.

For instance, the whole file is scoped as source.js, and .syntax--source.syntax--js in the Firefox theme is what applies that cyan-ish color. But .syntax--js by itself is what gives things that whitish color. The .syntax--variable class name also applies a cyan-ish color.

Anything inside a block gets scoped as meta.block.js for the sake of semantics, and that's enough to ensure that nonvariables are not cyan. This theme is nuts and will leave you less informed about the meanings of various tokens than you'd be if everything were dark gray.

@savetheclocktower
Copy link
Contributor Author

what purpose is meta.block.js serving here?

This post illustrates the sorts of things you can do with meta scopes. The TextMate scope system exposes itself to settings and snippets in a way that the Tree-sitter system can't, so the more scope information you have at the cursor, the more flexibility you have.

@DeeDeeG
Copy link
Member

DeeDeeG commented Feb 6, 2024

In the "Solarized Dark" theme, I kind of like console.log() and console.error() etc. being blue. It seems more contrasty and stands out. [EDIT: Specific color not as important as contrastiness.]

I tend to think console.log() should stand out, as it indicates something an end-user should know, something being logged on a server, or good old "print a message" debugging. Great to be able to spot these a mile away while skimming/scrolling and using it for "console.log() debugging".

Legacy Tree-sitter

(I like this highlighting)

console error stands out with Legacy Tree-sitter -- Solarized Dark Theme

Modern Tree-sitter

(Seems less contrasty to me, less variety of color leading to slightly less readable/skimmable code? Particularly, console.error() as green sort of doesn't jump out to me like the bright blue does. )

console error blends in with Modern Tree-sitter -- Solarized Dark Theme

Code Snippet

https://github.com/pulsar-edit/package-frontend/blob/95ce50674bf4698521019bf67ea24498ba6a144e/microservices/download/utils.js#L126-L140

@savetheclocktower
Copy link
Contributor Author

The legacy Tree-sitter grammar picks such utterly ridiculous scope names for both console.log that I'm amazed they align to the same color in any theme.

In the legacy grammar, console is scoped as support.variable.dom, which… is a confusing mélange of ideas. error is scoped as entity.name.function because that's how the legacy grammar scopes all function calls — also wrong.

The modern grammar scopes these as support.class.builtin.console.js and support.function.builtin.console.js. Most themes don't try to scope things so granularly as to emphasize some utility functions over others. But that's the advantage of verbosity — it allows users to do that very thing!

Drop this into your user stylesheet:

@import "syntax-variables";

.syntax--source.syntax--js {
  .syntax--builtin.syntax--console {
    color: @syntax-color-function;
  }
}

@asiloisad
Copy link
Contributor

asiloisad commented Feb 28, 2024

The modern tree-sitter auto-indent is broken if multiple cursors are used. Tested in safe mode.

  1. a state before
    image
  2. press Enter
  3. state after
    a) legacy tree-sitter
    image
    b) modern tree-sitter
    image

@savetheclocktower
Copy link
Contributor Author

savetheclocktower commented Feb 28, 2024

Eesh. Nice catch. I'll be able to make unit tests out of these examples. (Once I fix this bug, of course — which does not seem like it will be easy!)

@savetheclocktower
Copy link
Contributor Author

@asiloisad, a fix for this issue is now present in #941.

We analyze the tree to determine when we need to adjust indentation after a newline is inserted. But Pulsar has always wanted to answer that question as soon as possible — right after the given buffer change is made. Even if the change is part of a transaction. That's painful for modern Tree-sitter because it would require an immediate and synchronous tree reparse.

Instead, we try to get away with waiting until the transaction is over, then auto-indenting the given row. But this isn't always correct, because a transaction can contain any number of arbitrary edits. By the end of the transaction, the content that we wanted to auto-indent may no longer be on the same row that it was on when we were first asked! So in complex cases, we give up on the atomic adjustments and just do an auto-indent of the entire range affected by the transaction.

Previously, our heuristic for determining when to perform this fallback was simple: if a transaction contains more than one edit, use the fallback behavior. But that logic triggers in a multi-cursor edit scenario such as the one in your example. Since the edit increases the indent on line 2, invoking auto-indentation over the entire buffer range affected by the three edits will set the indention on line 4 to match that of line 2, and the indentation on line 7 to match that of line 5. Hence the ever-increasing indentation. This happens even though we can perform each of the three indentation adjustments just fine, since they're on different lines and don't get in each other's way.

So the new heuristic is a bit laxer: if the content of the line is identical to what it was in the middle of the transaction, then it's still safe to do an atomic re-indent.

The fallback option of a transaction-wide auto-indent will sometimes fail to produce the outcome that the user wants, as in your example. I hate that fallback. It's a better outcome than giving up on trying to auto-indent altogether — if we'd done that in your example, none of the return } lines would've been indented at all — but only barely.

But the other option is bad, too. We could disable async indentation altogether, and get indentation behavior that's just as good as what we get in the other modes. But then your example would involve forcibly re-parsing the tree each time we inserted a newline — three times total — and then once more at the end of the transaction. Those are three “throwaway” trees that we can't use for any other purpose.

And even in your example that's probably the best approach! Three re-parses isn't that bad; the user won't even notice it! The hard part is that the decision point inside suggestedIndentForBufferRow — re-parse now, or wait until the end of the transaction? — happens in a context where we don't know how many subsequent edits will happen. For synchronous indenting to be the default, we need to have some sort of safety valve so that the edge cases won't end up freezing the browser.

One option might be to start out in synchronous indent mode, but only allow ourselves a certain number of re-parses (or a certain amount of total re-parse time) in a given transaction before we decide that our budget has been exceeded and flip to async indent mode.

At any rate, this change in heuristic is a good thing and makes multi-cursor edits less tricky. And it kicks the can down the road.

Thanks for the report!

@savetheclocktower
Copy link
Contributor Author

@asiloisad: Actually, I rubber-ducked myself into the idea of a time budget in the previous comment — then decided to implement it while my brain still found the idea fascinating. This is now also part of #941!

So your example should now work just fine for two different reasons: first, because the edits in question will result in synchronous indentation decisions, since the tree re-parses that are needed won't come close to exceeding the re-parse time budget. Second, because even if we were forced to fall back to async indent, we recognize your example as one in which we can still auto-indent three lines atomically because their contents won't have changed between the original indent hinting request and the end of the transaction. The new test case proves this.

@asiloisad
Copy link
Contributor

thanks you for investigation, a detailed explanation and fix. the logic around auto-indent is quite hard. I'm waiting for 1.115 :) It's should resolve the same problem, but in Python #880 👍.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants