Use parse tree #2

tecosaur · 2024-05-11T17:18:00Z

Previously we were just using the Tokenizer from JuliaSyntax, but we might as well use the full parse-tree. It's higher-quality data, and with just a little more work we can have much more accurate highlighting.

Subject to change as I work on this, but here's a before and after screenshot for the current progress:

Before

After

Specific showcases

c42f · 2024-05-21T03:50:00Z

Beautiful :-)

The highlighting of x in the comment is particularly neat.

It makes me want to include tools - maybe even a parser mode - to further process the content of docstrings and comments within JuliaSyntax. But I know I shouldn't go there 😆

tecosaur · 2024-05-21T05:18:12Z

Thanks Claire 🙂

> The highlighting of x in the comment is particularly neat.

It's inspired by https://code.tecosaur.net/tec/simple-comment-markup, which I consistently find slightly nice to have.

> It makes me want to include tools - maybe even a parser mode - to further process the content of docstrings and comments within JuliaSyntax. But I know I shouldn't go there 😆

Heh, yea, I can see the temptation. I'm tempted to add an ad-hoc markdown parser for docstrings.

For the future, something else I'm considering is using an extensible list of highlighting passes. That way DSLs introduced by macros could get custom highlighting, for instance (I'd love to see styled"..." with highlighting).

Once I've applied the tweak we discussed on slack, I'm thinking we can just merge this and improve from here. The particulars of the highlighting scheme (the faces used, and where they're applied) are explicitly mentioned as subject to change.

jakobnissen · 2024-05-25T13:03:16Z

One point that might be an improvement: in the help, example julia> prompts are colored green. This makes them difficult to visually distinguish from the actual prompts in the REPL history.
For this reason, OhMyREPL.jl highlights julia> prompts in the help menu in red.
I would prefer it if REPL mirrored OhMyREPL here.

Current master:

1.10 with OhMyREPL:

tecosaur · 2024-05-25T15:22:13Z

@jakobnissen two things:

1. I'm not sure that julia prompt highlighting should be considered by this library.
2. The behaviour you reference is already customisable: see "Highlight julia-repl code in Markdown specially" (julia#54423).

BioTurboNick · 2024-05-26T21:00:21Z

using JuliaSyntaxHighlighting
a = """function foo(x::AbstractArray{<:Integer{{Bar}}, 3})\n    findfirst('z', (("happy"))) === nothing == [0x40, 0x52, 0x62, [[0x63]]] += 2\nend""" # Not intended to be real code, just to demonstrate syntax
write(stdout, JuliaSyntaxHighlighting.highlight(a))

My comments:

- Parentheses/braces at highest level could have the same color as a function name/type, respectively. They're already distinct enough, so don't need to separate them with different colors.
- Parentheses and brackets could alternate with two colors (light and dark versions of one) the same way you have braces now.
- Consider using bright white for literals rather than a color.
- nothing could share the same coloring as literals.
- If literals are no longer purple, then alternating purple/light purple is available for brackets.
- Operators could have the same color as functions, except inside braces they could have the same color as types.
- I don't love that the assignment operator is multicolor. I assume the intent is to make sure it's distinguishable from equality? But maybe I'll find it a good tradeoff if the other colors are smoothed out. Or maybe the whole thing could just use the red?
- Commas in type signatures currently have the same color as types, but commas elsewhere are the standard gray. Perhaps they should all be gray? Or they could all take on the color of their parentheses/brackets/braces level?
- The in-comment blue-in-ticks should probably be dark blue to match the darker comment, otherwise the eye is drawn to it.

I've mocked up what most of my present suggestions would look like:

Alternate where the types and operators inside a type parameter also alternate so they match the next brace level:

EDIT: Hmm, perhaps the brighter color should be the outer level for the brackets (and maybe also for the types/braces?), so the eye isn't drawn first to the inner one.

KristofferC · 2024-05-27T07:30:45Z

> The behaviour you reference is already customisable: "Highlight julia-repl code in Markdown specially" (julia#54423)

99.9% of users will use the defaults and will now be confused about what came from Markdown rendering and what came from the actual REPL. Making the default non-confusing seems like a good idea.

tecosaur · 2024-05-27T11:01:41Z

Thanks for the feedback @BioTurboNick! To keep the conversation rolling, I'll lay out my initial thoughts on your comments.

> Parentheses/braces at highest level could have the same color as a function name/type, respectively. They're already distinct enough, so don't need to separate them with different colors.

[behaviour] Matching paren colouring to the context seems viable, but a bit of a hassle. I'm inclined to bump that off to a future PR (possibly by someone else) even if we're sold on it.

[defaults] There is also a no-colour option available by setting JuliaSyntaxHighlighting.RAINBOW_DELIMITERS_ENABLED[] = false.

> Parentheses and brackets could alternate with two colors (light and dark versions of one) the same way you have braces now.

[defaults] I think this is a question of defaults, since julia_rainbow_paren_{1-6} can be customised.

I don't mind discussion on defaults, but I would like to separate it out from discussion on the capabilities/design 🙂.
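To make the alternation idea concrete, here is a minimal, dependency-free sketch of cycling through a fixed number of paren faces by nesting depth. It is not this package's implementation: `toy_rainbow_parens` and its `nfaces` keyword are hypothetical names, only round parentheses are handled, and parens inside strings or comments are not skipped (which a parse-tree based approach would handle properly).

```julia
# Hypothetical helper: pairs each paren with a face name chosen by nesting depth.
# The face names follow the julia_rainbow_paren_{1-6} pattern mentioned above;
# the cycling logic itself is an assumption made for illustration.
function toy_rainbow_parens(code::AbstractString; nfaces::Int = 6)
    faces = Pair{Char, Symbol}[]
    depth = 0
    for c in code
        if c == '('
            depth += 1
            # mod1 wraps depth into 1:nfaces, so level nfaces+1 reuses face 1
            push!(faces, c => Symbol("julia_rainbow_paren_", mod1(depth, nfaces)))
        elseif c == ')'
            depth > 0 || continue  # ignore unbalanced closing parens
            push!(faces, c => Symbol("julia_rainbow_paren_", mod1(depth, nfaces)))
            depth -= 1
        end
    end
    return faces
end

# Two-colour alternation, as suggested for parentheses and brackets:
toy_rainbow_parens("f((x))"; nfaces = 2)
```

With `nfaces = 2` the outer pair gets face 1 and the inner pair face 2, so a light/dark alternation falls out of the same mechanism as the six-face default.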

> Consider using bright white for literals rather than a color.

[defaults] Maybe? I'm not entirely sure what defaults ~~everybody~~ most people would be happy with here.

> nothing could share the same coloring as literals.

Currently, they are? I'm assuming by literals you mean :symbols?

> If literals are no longer purple, then alternating purple/light purple is available for brackets.

[defaults] That's an interesting thought...

> Operators could have the same color as functions, except inside braces they could have the same color as types.

[defaults] Perhaps. I think this might be somewhat divisive though; IIRC a few people on Slack liked them being different.

[behaviour] Refer to the earlier comments on paren highlighting matching its context.

> I don't love that the assignment operator is multicolor. I assume the intent is to make sure it's distinguishable from equality? But maybe I'll find it a good tradeoff if the other colors are smoothed out. Or maybe the whole thing could just use the red?

This behaviour was suggested by Fredrik Ekre: https://fredrikekre.se/posts/highlight-julia/#infix_operators_and_assignment

Maybe it's worth making an option? Not sure.

> Commas in type signatures currently have the same color as types, but commas elsewhere are the standard gray. Perhaps they should all be gray? Or they could all take on the color of their parentheses/brackets/braces level?

Hmm, I'm more open to making all commas grey than matching the parenthesis level from an implementation difficulty perspective, but both ideas seem worth considering aesthetically to me.

> The in-comment blue-in-ticks should probably be dark blue to match the darker comment, otherwise the eye is drawn to it.

I think semantically code is the right face to use, and currently that's cyan to match the current Markdown tty rendering behaviour. This can be discussed a few different ways:

- Terminal emulators will already use different brightness depending on their particular colour set.
- We can adjust the default colour of the code face.
- We can introduce a derived face with a different default foreground here, but I'd rather not if we can get away without doing so.

BioTurboNick · 2024-05-27T13:41:29Z

@tecosaur - To be fair you did direct me to this PR for feedback :-) I can break out the defaults into a separate issue if you like.

The drive behind my suggestions is that 1) a color change should signal something important, and too many color changes in a small space interfere with intelligibility; and 2) we should minimize the number of times the same color can mean different things. A default that follows these principles more closely will be a nicer user experience, however you want to achieve that.

Just a few other response points:

- Using a neutral color (gray/white) somewhere makes the colored special entities stand out more, and symbols/literals that can take arbitrary forms seem reasonable candidates for that.
- I could be wrong but it looks like the nothing is a dark purple while the literals (not symbols) are a light purple? That's what I was referring to there.
- Using the color table Frames posted in Slack, the typical presentation is White text with Cyan code. It appears that in all terminals shown they're roughly similar brightness. Comments are using Bright Black I believe? So dropping Cyan to Blue, which has a similar drop in brightness across terminals, seems sensible. In your example screenshot a few posts up, I can barely see the comment and then "x" really pops out.

tecosaur · 2024-05-29T04:18:48Z

> To be fair you did direct me to this PR for feedback :-) I can break out the defaults into a separate issue if you like.

All good, I just think it's helpful to separate the two in conversation. If nothing else, one takes a lot more effort than the other 😛

We can also continue this conversation after this PR is merged. I like to merge changes that improve a situation even if they're not perfect/end-state by themselves, but that doesn't mean I think the work/conversation should stop once this PR is merged.

I'm in agreement with regards to wanting colour changes to signal semantic information/be actively helpful, not just mimic a rainbow. The complication, I think, is that there are multiple reasonable ways of doing so. Figuring out a good balance here is an active WIP as far as I'm concerned.

One of the reasons why more faces are used than there are ANSI colours is that I want to allow going beyond ANSI 4-bit colour. I think this affects "minimize the number of times the same color can mean different things" a bit.

I think we can "have our cake and eat it too" (somewhat) by using the inheritance pattern. E.g. the faces julia_string, julia_char, julia_number, julia_symbol, and julia_singleton could all inherit from julia_literal: this lets you customise the whole bunch at once, or target them individually 🙂.
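As a rough illustration of that inheritance pattern, the following dependency-free sketch resolves a face's foreground by walking up to its parent. It is a toy model, not the StyledStrings implementation: `ToyFace`, `TOY_FACES`, and `resolve_foreground` are hypothetical names, and the colours are arbitrary.

```julia
# Toy model of face inheritance; NOT the StyledStrings API.
struct ToyFace
    foreground::Union{Symbol, Nothing}  # nothing means "not set on this face"
    inherit::Union{Symbol, Nothing}     # name of the parent face, if any
end

# Hypothetical face table using the face names from the comment above.
const TOY_FACES = Dict{Symbol, ToyFace}(
    :julia_literal => ToyFace(:magenta, nothing),
    :julia_string  => ToyFace(nothing, :julia_literal),
    :julia_number  => ToyFace(nothing, :julia_literal),
    :julia_symbol  => ToyFace(:yellow, :julia_literal),  # targeted individually
)

# Walk up the inheritance chain until some face sets a foreground.
function resolve_foreground(faces, name::Symbol)
    while true
        face = faces[name]
        face.foreground !== nothing && return face.foreground
        face.inherit === nothing && return :default
        name = face.inherit
    end
end

resolve_foreground(TOY_FACES, :julia_string)  # falls back to julia_literal's colour
resolve_foreground(TOY_FACES, :julia_symbol)  # uses its own override
```

Changing the single `:julia_literal` entry recolours every face that does not set its own foreground, while `:julia_symbol` keeps its override. That is the "customise the whole bunch at once, or target them individually" behaviour described above.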

> So dropping Cyan to Blue, which has a similar drop in brightness across terminals

I think you may have mixed up the bright and non-bright colours? Anyway, we'd also need to work out how to enact this change (we've got at least two options).

tecosaur · 2024-06-05T03:24:07Z

Let's merge this since it's an improvement, but keep on talking about how it can be made better 😀.

@BioTurboNick if I could convince you to make two issues, one on changes to defaults you think would be a good idea, and the other on changes in behaviour, that would be ideal. If not, no worries 🙂.

c42f

This seems reasonable, though I think there's some lack of clarity in the code regarding tokens vs interior nodes. I generally find it pays to split those cases apart based on haschildren().

(I've thought before that it could be useful to have the kind() of a for token be different from the kind of the for interior node. This would probably clarify the situation for you here. But also would be a pain in other ways.)

Also you could probably use JuliaSyntax's flags to simplify some things.

src/JuliaSyntaxHighlighting.jl

c42f · 2024-06-24T03:42:34Z

+    elseif JuliaSyntax.is_prec_comparison(nkind) && JuliaSyntax.is_trivia(node);
+        :julia_comparator
+    elseif isplainoperator(node); :julia_operator
+    elseif nkind == K"..." && JuliaSyntax.is_trivia(node); :julia_operator


This whole thing might be cleaner if you split the big elseif thing into two sections based on haschildren()?

Because, for example, K"for" means two different things, depending on whether you're looking at a token (!haschildren(), i.e. the literal for trivia token in the source) vs an interior node of the tree representing a for loop, with the comparison and body as children.

tecosaur · 2024-07-13

Hmmm, that's a good point. I might leave this for a future refactor (I'm sure there will be a few updates from this initial rewrite) though.
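For readers following along, the split suggested in this review thread can be sketched on a toy tree. This does not use JuliaSyntax's node types or its haschildren(): `ToyNode`, `toy_haschildren`, `toy_token_face`, and `collect_faces!` are hypothetical stand-ins, and the face names are illustrative only.

```julia
# Toy stand-in for a parse-tree node; NOT JuliaSyntax's actual types.
struct ToyNode
    kind::Symbol
    children::Vector{ToyNode}
end
ToyNode(kind::Symbol) = ToyNode(kind, ToyNode[])  # convenience constructor for tokens

# In this toy model a node is a token exactly when it has no children.
toy_haschildren(node::ToyNode) = !isempty(node.children)

# Token-only rules: here :for always means the literal keyword in the source.
function toy_token_face(kind::Symbol)
    if kind in (:for, :in, :end)
        :julia_keyword
    else
        :default
    end
end

# Interior nodes only recurse; tokens are the only things that receive a face.
function collect_faces!(out::Vector{Pair{Symbol, Symbol}}, node::ToyNode)
    if toy_haschildren(node)
        for child in node.children
            collect_faces!(out, child)
        end
    else
        push!(out, node.kind => toy_token_face(node.kind))
    end
    return out
end

# An interior :for node (the loop) whose first child is the :for token (the keyword).
loop = ToyNode(:for, [ToyNode(:for), ToyNode(:Identifier), ToyNode(:in),
                      ToyNode(:Identifier), ToyNode(:end)])
faces = collect_faces!(Pair{Symbol, Symbol}[], loop)
```

Because the interior `:for` node never reaches `toy_token_face`, the two meanings of the same kind cannot be confused, which is the point of branching on children first.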

tecosaur · 2024-07-14T08:29:14Z

Let's not let the perfect be the enemy of improvement 🙂. I've moved some of the non-JuliaSyntax work that had collected here to #4, and I'd be keen to continue working on improving the highlighting across issues/future PRs.

tecosaur added 2 commits May 6, 2024 00:17

Use the parse tree

a755e96

Warn about the potential for changes to highlights

5331c52

tecosaur force-pushed the use-parse-tree branch from 89e1246 to 57d1721 Compare May 12, 2024 10:16

BioTurboNick mentioned this pull request Jun 7, 2024

Make it look more like this #3

Open

tecosaur force-pushed the use-parse-tree branch from 57d1721 to 122a882 Compare June 23, 2024 05:32

tecosaur marked this pull request as ready for review June 23, 2024 05:34

c42f reviewed Jun 24, 2024

LilithHafner mentioned this pull request Jul 4, 2024

Remove JuliaSyntaxHighlighting from 1.11 JuliaLang/julia#55023

Closed

Improve parse-tree based highlighting

d33ae74

tecosaur force-pushed the use-parse-tree branch 3 times, most recently from a533511 to d33ae74 Compare July 14, 2024 08:26

tecosaur merged commit d33ae74 into main Jul 14, 2024

tecosaur deleted the use-parse-tree branch July 14, 2024 08:27
