add unicode completions by latex name #1327

Draft: wants to merge 1 commit into base: main

@tqml tqml commented Oct 23, 2024

Feature: Unicode variable completion by latex name (WIP)

This PR makes it more convenient to work with variables whose names contain Unicode symbols. For example, suppose a variable θ₀ exists. With this PR, θ₀ is added to the code completion suggestions when the user types a string close to theta_0, which is easier on keyboards that lack Greek characters.

Principle:

  • The user types a sequence that should be autocompleted.
  • Iterate through all known identifiers (variables, function names, etc.); this is already done today.
  • For each symbol, check whether it contains any Unicode characters.
  • If so, compute an alternative name for the symbol that corresponds to its LaTeX name:
    • θ₀ becomes theta_0, because the LaTeX commands to type it are \theta\_0
    • xyz_θ becomes xyz_theta
  • Check whether the typed sequence matches the alternative name. If so, add the symbol to the list of autocomplete suggestions.

Note: The autocomplete suggestion from theta_0 to θ₀ only occurs if a symbol with this name already exists.
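The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `ASCII_NAMES` table, `latex_alias`, and `alias_completions` are hypothetical names, the table covers only a handful of characters, and simple prefix matching stands in for whatever "close to" matching the language server actually uses.

```julia
# Hypothetical lookup table: Unicode character => ASCII spelling of its LaTeX name.
# A real implementation would need to cover far more characters.
const ASCII_NAMES = Dict{Char,String}(
    'θ' => "theta",
    'α' => "alpha",
    'ω' => "omega",
    'ζ' => "zeta",
    '₀' => "_0",
    '₁' => "_1",
)

# Return the ASCII alternative name, or `nothing` if the identifier is pure ASCII.
# Characters missing from the table are kept unchanged.
function latex_alias(name::AbstractString)
    isascii(name) && return nothing
    return join(get(ASCII_NAMES, c, string(c)) for c in name)
end

# Collect the existing identifiers whose alias starts with what the user typed.
# Assumption: prefix matching is used here purely for illustration.
function alias_completions(prefix::AbstractString, identifiers)
    matches = String[]
    for id in identifiers
        alias = latex_alias(id)
        alias === nothing && continue
        startswith(alias, prefix) && push!(matches, id)
    end
    return matches
end

identifiers = ["θ₀", "xyz_θ", "alpha", "count"]
alias_completions("theta", identifiers)  # finds "θ₀" through its alias "theta_0"
alias_completions("xyz_t", identifiers)  # finds "xyz_θ" through "xyz_theta"
alias_completions("alp", identifiers)    # empty: no identifier containing α exists
```

Because only identifiers that already exist are scanned, typing alp never produces α unless some symbol containing α has been defined first.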

Demo:

Screen.Recording.2024-10-09.at.10.41.16.mov

Edits: Clarification about the working principle and when the completions would occur

@tqml tqml marked this pull request as draft October 23, 2024 20:13

tqml commented Oct 23, 2024

Change log updated here:

julia-vscode/julia-vscode#3720

@tqml tqml changed the title [WIP] add unicode completions by latex name add unicode completions by latex name Oct 23, 2024
jwortmann (Contributor) commented

I don't take any decisions over which PRs should or shouldn't be merged in this repository, but are you aware that the language server already provides the unicode symbols when you type the LaTeX name (or an emoji name), but starting with a backslash, i.e. like in actual LaTeX or in the Julia REPL?

    if pt isa CSTParser.Tokens.Token && pt.kind == CSTParser.Tokenize.Tokens.BACKSLASH
        latex_completions(string("\\", CSTParser.Tokenize.untokenize(t)), state)
    elseif ppt isa CSTParser.Tokens.Token && ppt.kind == CSTParser.Tokenize.Tokens.BACKSLASH && pt isa CSTParser.Tokens.Token && (pt.kind === CSTParser.Tokens.CIRCUMFLEX_ACCENT || pt.kind === CSTParser.Tokens.COLON)
        latex_completions(string("\\", CSTParser.Tokenize.untokenize(pt), CSTParser.Tokenize.untokenize(t)), state)

Also I think it is quite common to use the greek letter names written in ASCII as variable names (like alpha, beta, mu), and then I imagine that it would be problematic if they are automatically converted to unicode symbols when auto-completing.
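For reference, the backslash-style lookup mentioned here can be tried directly in Julia. The sketch below assumes that `REPL.REPLCompletions.latex_symbols`, an internal and unexported table in Julia's REPL standard library, is what backs the REPL's tab completion. It is shown only to illustrate the name-to-symbol mapping, not how the language server is implemented.

```julia
using REPL

# Assumption: this internal table maps backslash names (e.g. "\\theta") to Unicode strings.
symbols = REPL.REPLCompletions.latex_symbols

theta = symbols["\\theta"]  # the Greek letter
sub0 = symbols["\\_0"]      # the subscript zero

# Concatenating the two pieces gives the identifier from the PR description.
println(theta * sub0)
```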


tqml commented Oct 29, 2024

Thanks for the comment!

Yes, I'm aware of it, but it's not what this PR is about (sorry, my description was not clear enough in this regard).

This PR tries to do something different: the completion only occurs when a matching Unicode symbol already exists, so just typing alpha would not autocomplete to the Unicode symbol. In my example above, alpha would not autocomplete while zeta and omega do, because Unicode symbols matching those names are already defined.

The very first symbol needs to be typed the LaTeX-backslash way. However, having to use the LaTeX backslash all the time slows you down quite a bit (in my experience), which is why I usually stop using Unicode symbols altogether and just use ASCII after a while (which is a shame, because I think it's a really nice feature!).
I also admit that this is mostly due to my keyboard (German, Mac), where the backslash is in an inconvenient position; however, I think it could be helpful overall.

I'm open to comments and feedback because this is a very opinionated change (maybe it would make sense to have a preference toggle for it).

jwortmann (Contributor) commented

Okay, I understand now, thanks for the explanation. Then I guess it could be something useful. Maybe it should do another check so that this Unicode conversion is not offered if a symbol with the corresponding ASCII name is also already defined. Though it would probably be quite a bad and confusing code style if names like θ and theta were defined at the same time, so I don't know how realistic such a use case is.
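A minimal sketch of such a check, assuming the aliases have already been computed: `filter_shadowed` and the sample data are hypothetical and not part of this PR.

```julia
# `aliases` maps each existing Unicode identifier to its ASCII alias (hypothetical data).
# Drop every entry whose alias is itself a defined identifier, so that typing
# `theta` never offers θ while a variable literally named `theta` exists.
function filter_shadowed(aliases::Dict{String,String}, identifiers)
    defined = Set(identifiers)
    return filter(pair -> !(pair.second in defined), aliases)
end

identifiers = ["θ", "ω", "theta"]
aliases = Dict("θ" => "theta", "ω" => "omega")

remaining = filter_shadowed(aliases, identifiers)
# Only the ω entry survives, because `theta` is already defined as an ASCII name.
```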


tqml commented Oct 30, 2024

Thanks for pointing it out, it was not clear in the initial description.
At least in my quick tests it worked quite nicely and made working with Unicode way faster!

Good point, that could be confusing.
I'm not sure what the right thing to do is. I lean toward just showing both symbols, but maybe with some ranking/order: for example, the query theta matches a variable named theta exactly, so it should get a higher score/rank than θ (which matches only through its alternative name).

Actually, I think some kind of ranking for completion suggestions would make sense overall, but from what I saw it is not implemented.
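One way such a ranking could look, as a rough sketch only: the `score` and `ranked_completions` functions, the score values, and the prefix-based matching are all assumptions made for illustration, not existing language server behavior.

```julia
# Score a candidate: 2 for a match on the real name, 1 for an alias-only match, 0 for none.
function score(query, name, alias)
    startswith(name, query) && return 2
    alias !== nothing && startswith(alias, query) && return 1
    return 0
end

# `candidates` holds (name, alias) tuples; alias is `nothing` for pure ASCII names.
function ranked_completions(query, candidates)
    scored = [(score(query, name, alias), name) for (name, alias) in candidates]
    filter!(s -> s[1] > 0, scored)          # drop non-matching candidates
    sort!(scored; by = first, rev = true)   # highest score first
    return [name for (_, name) in scored]
end

candidates = [("θ", "theta"), ("theta", nothing), ("thread_id", nothing), ("ω", "omega")]
ranked_completions("the", candidates)  # "theta" is listed before "θ"
```

Both symbols stay visible, but the direct match on the real name is offered first.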
