Bowtie symbol isn't a valid character. #23820

Open
rofinn opened this issue Sep 22, 2017 · 16 comments
Labels
domain:unicode Related to unicode characters and encodings parser Language parsing and surface syntax

Comments

@rofinn
Contributor

rofinn commented Sep 22, 2017

julia> ⋈ = 1
ERROR: syntax: invalid character "⋈"

FWIW, \bowtie tab completed on the REPL.

@JeffBezanson
Sponsor Member

Ah, one of those category Sm characters. They have a range of meanings so we need to whitelist them. Should this be an operator or just an identifier character?
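For readers who want to see what "category Sm" means here, the following is a minimal sketch in Python (not Julia's parser) that looks up both symbols in the Unicode database. The `OPERATOR_WHITELIST` set and the `classify` helper are hypothetical names made up for illustration; they only mimic the idea of whitelisting individual math symbols.

```python
import unicodedata

BOWTIE = "\u22c8"  # ⋈
JOIN = "\u2a1d"    # ⨝


def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# Hypothetical whitelist: only symbols listed here are accepted as operators.
OPERATOR_WHITELIST = {JOIN}


def classify(ch):
    """Toy classification: Sm characters are rejected unless whitelisted."""
    if unicodedata.category(ch) != "Sm":
        return "not a math symbol"
    return "operator" if ch in OPERATOR_WHITELIST else "rejected"


print(describe(BOWTIE), classify(BOWTIE))
print(describe(JOIN), classify(JOIN))
```

Both characters share the same general category, so the category alone cannot tell a parser which one to accept; that decision has to be made per character.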

@JeffBezanson JeffBezanson added parser Language parsing and surface syntax domain:unicode Related to unicode characters and encodings labels Sep 22, 2017
@stevengj
Member

stevengj commented Sep 22, 2017

Yes, ⋈ (U+22c8) was one of the characters that got left out because it wasn't immediately clear if it should be an operator or an identifier.

It seems like it is mainly used as a relation symbol. Most of these relational algebra symbols were added in #8036. Confusingly, the join symbol ⨝ (U+2a1d) (tab-complete \Join) is already supported as an infix operator, and looks nearly identical to the "bowtie".

One option would be to add a mapping to our custom Julia Unicode normalization that simply parses ⋈ (U+22c8) as equivalent to ⨝ (U+2a1d).
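As a rough illustration of that option, here is a sketch in Python (not the actual Julia implementation) of a custom mapping applied to source text before parsing. The names `CUSTOM_NORMALIZATION` and `normalize_source` are hypothetical. The sketch also shows that standard NFC normalization leaves ⋈ unchanged, which is why an extra mapping would be needed.

```python
import unicodedata

BOWTIE = "\u22c8"  # ⋈
JOIN = "\u2a1d"    # ⨝

# Hypothetical custom mapping: treat the bowtie as the join operator.
CUSTOM_NORMALIZATION = {BOWTIE: JOIN}
_TABLE = str.maketrans(CUSTOM_NORMALIZATION)


def normalize_source(text):
    """Apply NFC first, then the custom look-alike mapping."""
    return unicodedata.normalize("NFC", text).translate(_TABLE)


# NFC alone does not unify the two symbols.
print(unicodedata.normalize("NFC", BOWTIE) == BOWTIE)
print(normalize_source("a \u22c8 b") == "a \u2a1d b")
```

With a mapping like this, code typed with either symbol would reach the parser as the same operator.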

@rofinn
Contributor Author

rofinn commented Sep 22, 2017

I would like to use it as an infix operator for natural join between 2 relations, but I'd be fine if it was just a valid function name.

@rofinn
Contributor Author

rofinn commented Sep 22, 2017

Oh, I didn't realize they were different unicode symbols. I'm fine using the join symbol, but \join wasn't tab completing on the REPL in 0.6 or 0.7 (5 days old).

@StefanKarpinski
Sponsor Member

It's capital \Join not \join (I have no idea why).

@StefanKarpinski
Sponsor Member

Normalizing these seems sane since they look exactly alike.

@stevengj
Member

The reason it was \Join is because it is that way in LaTeX, and we were hesitant about innovating in LaTeX names in #8036. I agree that adding the lower-case tab completion, along with the normalization, makes sense.

@stevengj
Member

stevengj commented Sep 22, 2017

One oddity is that \Join seems to be treated as a fullwidth character, while bowtie is treated as narrow:

julia> charwidth('⋈') # bowtie
1

julia> charwidth('⨝') # join
2

(\Join displays oddly in the REPL too, similar to #3721, since my (Mac) terminal also seems to be confused about its character width.)

@JeffBezanson
Sponsor Member

My terminal and/or font seems to have the widths swapped --- bowtie takes up 2 columns (though the terminal doesn't know it) and join takes up 1.

@iamed2
Contributor

iamed2 commented Sep 22, 2017

My terminal handles \Join correctly but Julia doesn't

EDIT: Julia with OhMyREPL displays it with one column but moves the cursor symbol another column right, despite typing in the correct place. Julia without OMR displays it with 2 columns but removes second column on backspace.

EDIT 2: I have iterm2 with Unicode 9 widths turned on; that's probably relevant

@rofinn
Contributor Author

rofinn commented Sep 22, 2017

Would there be any issue with making all the unicode substitutions case-insensitive? Is there an example where the casing would matter?

@stevengj
Member

@rofinn, since the same tab-substitution list is now supported by several editors, and we can't control the case sensitivity of editor tab substitution without providing an explicit list of all possibilities (\Join, \JOiN, …), I think it's better for consistency's sake to just support a limited set of variations (e.g. \Join and \join).

See also the discussion in #21646 about rationalizing the latex completions.

@StefanKarpinski
Sponsor Member

IIRC, there are examples where case matters in LaTeX escapes.

@iamed2
Contributor

iamed2 commented Sep 22, 2017

Example:

\delta: δ
\Delta: Δ
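
To make the collision concrete, here is a small Python sketch of a case-sensitive completion table. The `COMPLETIONS` dictionary is a hypothetical four-entry excerpt (with the lower-case \join alias proposed above already added), not the real list shipped with Julia or any editor.

```python
# Hypothetical excerpt of a tab-completion table; keys are case-sensitive.
COMPLETIONS = {
    "\\delta": "δ",
    "\\Delta": "Δ",
    "\\Join": "⨝",
    "\\join": "⨝",  # explicit lower-case alias
}


def complete(name):
    """Exact, case-sensitive lookup; returns None when there is no entry."""
    return COMPLETIONS.get(name)


def case_insensitive_collisions(table):
    """Find names that would map to more than one symbol if case were ignored."""
    seen = {}
    for name, symbol in table.items():
        seen.setdefault(name.lower(), set()).add(symbol)
    return {name: symbols for name, symbols in seen.items() if len(symbols) > 1}


print(complete("\\join"))
print(complete("\\JOiN"))
print(case_insensitive_collisions(COMPLETIONS))
```

Ignoring case would make \delta ambiguous, while listing a couple of explicit aliases keeps every lookup unambiguous.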

@rofinn
Contributor Author

rofinn commented Sep 22, 2017

Alright, I forgot about \delta vs \Delta.

@stevengj
Member

stevengj commented Oct 8, 2021

Update: \join and \Join nowadays both tab-complete to U+2A1D. It's still not clear what to do about \bowtie (U+22C8).

5 participants