Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make handling of valid JS/TS ids/Type aliases consistent with the rest of rules #4078

Conversation

jkavalec
Copy link

Resolves #4072

Changes

  • added a simple expansion to a regex to check for valid ids containing _ and $. This makes it consistent with the rest of rules where it previously correctly highlighted const functions and class declaration, while splitting ids and TS type aliasses into highlighted and non-highlighted parts
  • added markups that check already working cases and the cases that were fixed

Checklist

  • Added markup tests, or they don't apply here because...
  • Updated the changelog at CHANGES.md

@joshgoebel
Copy link
Member

This makes it consistent with the rest of rules where it previously correctly highlighted const functions and class declaration

Could you provide a little more color here? You're saying this worked in the recent past...?

@jkavalec
Copy link
Author

jkavalec commented Aug 3, 2024

Hi, sorry for unclear description, I meant by that "previously" that those const functions and class declarations are already highlighted correctly, and with my patch the rest of the cases are highlighted correctly as well.

@joshgoebel
Copy link
Member

I find this very non-idiomatic... I googled and looked around and the popular guides I see refer to UPPER case for constants, camelCase for other things, etc - which I'm quite familiar with... but nothing like this.

I'm not certain we want to support niche cases - so let's start there - whose idea of proper naming conventions is this? Can you show me any popular style guides or large tech companies suggesting this type of naming convention?

@jkavalec
Copy link
Author

jkavalec commented Aug 5, 2024

As I mentioned in original issue, I am working on a language that is using snake case and should cooperate with TS, and when I release the first version I would like IDs that are valid JS ids to be displayed correctly in tools that use Highlight.js as their highligher - so for instance (SO/ChatGPT) fail to highlight properly whereas Claude AI that uses different highlighter (prisma I guess) has none of these issues

From a few languages I tried this is the only language that has this problem - For instance Kotlin/Java/C++ none of these exhibit same problem and all valid IDs are treated as such and all the names are treated consistently

There is already partial support for this as mentioned in case of const functions and class declarations, my fix makes this consistent with the rest of the output

On top of that I did nothing crazy here, it is just a fix to regex that changes from regex that doesn't match valid ID names into regex that matches, there was no change of logic or anything that would break previous code, there were no test cases broken, I added all the test cases to verify the behavior and consistency of the result

At this point this is pretty much only tool that fails to match valid JavaScript ID, all IDEs, prism.js which is biggest alternative to highlight.js do not break on valid JS names

@joshgoebel
Copy link
Member

joshgoebel commented Aug 9, 2024

As I mentioned in original issue, I am working on a language

You may want to write a custom grammar then rather than trying to shoehorn your language into another language's grammar.

For instance Kotlin/Java/C++ none of these exhibit same problem and all valid IDs are treated as such and all the names are treated consistently

Untrue. I took a look just now and all these grammars all have very different handling of identifiers.

There are a few different contexts here we should examine, syntactical and idiomatic. When we can easily understand the syntax, we aim to highlight all valid syntax. For example this might be valid in some languages, but NOT idiomatic.

# made up example code
constant ADS&----!@---#&ADH = 32;

That's pretty awful and hopefully not idiomatic - but if we can match the entire const [ident] = such that it's valid syntactically then we'll do so - because we can be certain ADS&----!@---#&ADH is a constant.

Meanwhile if we just have a random construct like:

[1, 2, 3, ADS&----!@---#&ADH]

That MAY be a valid identifier, but if it's not idiomatic we're not going to highlight it as a constant. Generally our rule is to not highlight identifiers/variable names unless the language has a strong syntax for such things, or strong idioms. So something like $x would count as a variable in PHP (strong syntax), and something like POWER_LEVEL counts as a constant in many different languages - not by syntax, but by idiom.

This is an opinion of the library, not everyone likes it. (that we care strongly about idioms and factor them into our rules)

There is already partial support for this as mentioned in case of const functions and class declarations

Yes, see above this is fine - because it's 100% clear syntactically in those cases.

prism.js which is biggest alternative to highlight.js do not break on valid JS names

Sure it does (using your own examples). A quick check of both JS and TS reveals it's far from consistent.

Screen Shot 2024-08-09 at 6 51 43 PM Screen Shot 2024-08-09 at 6 51 59 PM

only tool that fails to match valid JavaScript ID, all IDEs, prism.js

This is not a stated goal of the project - to match anyone else example by example. If you prefer Prism better, use that.

@joshgoebel
Copy link
Member

If you're interested in improving obviously syntactical context cases (in any of our grammars) that's a conversation worth having (though I'd file it as a dup of #2500)... but we'd be open to improvements - cases like const or class etc - if we don't have those correct in various languages, happy to fix them. Though even there idioms matter, ie:

const x = 3; // x is `variable.constant`
const FastCar = "Ferrari"; // FastCar is `title.class` (by idiom)

But we're not interested in highlighting random identifiers that are not following the common/popular idiomatic conventions for a given language - that would include the very oddly named JS/TS variables you're wanting us to support.

@joshgoebel
Copy link
Member

Closing this as a #wontfix. We'll support only the idiomatic CamelCase naming (for auto-highlighting).

@joshgoebel joshgoebel closed this Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JavaScript/Typescript Upper_Snake_Case issues
2 participants