infix notation for more functions #4498

my-little-repository · 2013-10-13T14:05:47Z

This is a feature request. The in function has both a prefix and an infix notation. It would be nice to have infix notation for the functions beginswith, endswith and contains, e.g.

julia> "julialang" beginswith 'j'
true

lindahua · 2013-10-13T15:30:26Z

If we provide infix form for beginswith, then a large number of other functions may want to get this privilege. I think it is enough to provide a very small number of infix operators that cover the most common operations.

Also "julialang" beginswith 'j' doesn't seem to be a huge improvement over beginswith("julialang", 'j') to me.

my-little-repository · 2013-10-13T16:54:06Z

I beg to differ. There are not a large number of functions that may be written in infix notation. When it comes to string manipulation, I have not seen any other functions than the ones I am describing above.

And "julialang" beginswith 'j' seems to give the same improvement over beginswith("julialang", 'j') that x in y gives over in(x,y). I would even dare to say that it is in fact a bigger improvement, because the infix in notation introduces an ambiguity in the language (in is also a synonym for = in for loops). There is no such ambiguity with beginswith.
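
For reference, the two spellings being compared can be run side by side. This is a minimal sketch that assumes a current Julia release, where beginswith has since been renamed startswith; occursin is used here as a stand-in for the contains of the time, and the variable names are only illustrative.

```julia
s = "julialang"

# `in` is one of the few word-like operators: both spellings call the same function.
infix_result  = 'j' in s
prefix_result = in('j', s)

# The string predicates from the request only have the prefix form.
starts = startswith(s, 'j')     # was called beginswith when this issue was opened
ends   = endswith(s, "lang")
has    = occursin("lia", s)     # assumption: stands in for the old contains

println((infix_result, prefix_result, starts, ends, has))
```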

johnmyleswhite · 2013-10-13T17:00:26Z

I agree with Dahua: there's no reason to do this unless we make it possible to use all binary functions as infix operators.

quinnj · 2013-10-13T17:38:35Z

Also see #2703 for the discussion on infix operators.

This was closed when the in syntax was added, but there was also discussion of allowing for user-specified infix functions along these lines. Perhaps another issue could be opened for the generalized case.

-Jacob

JeffBezanson · 2013-10-14T08:14:48Z

There are some problems with that, such as [1 + 2] and [x in y] being 1-element arrays and [x f y] being a 3-element array. Maybe a Haskell-like x `f` y could work.
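
The parsing difference described here can be checked directly. This is a small sketch for a current Julia release; max is only an example of a binary function bound to the name f.

```julia
x, y = 1, 2
ys = [1, 2]
f = max                 # any binary function; `max` is only an example

one_sum  = [x + y]      # a single expression inside the brackets
one_test = [x in ys]    # `in` is an operator, so this is also a single expression
three    = [x f y]      # whitespace separates three elements: x, the function f, and y

println(size(one_sum), " ", size(one_test), " ", size(three))
```

Because whitespace already means concatenation inside brackets, a bare x f y cannot become a call to f without changing what [x f y] means.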

johnmyleswhite · 2013-10-14T15:29:39Z

I would be pretty into the Haskell approach to infix functions.

diegozea · 2013-11-10T05:56:50Z

I like the Haskell approach.
There is something similar in R: you can define an infix function by putting % around its name.

ssfrr · 2014-01-27T19:46:16Z

Haskell automatically treats all-symbol functions as infix, and alphanumeric functions can be treated as infix with the backticks, as @JeffBezanson mentions. I like this in the sense that it's less special case ("these are the operators that we've hard-coded into the parser to be infix"). Of course with the unicode support it's a little more complicated to define clearly what's alphanumeric.

Though the trade off of making it easy to define custom symbolic infix operators is that people will. :)

Making symbolic functions infix and allowing alphanumeric functions to be "infixed" using backticks are orthogonal questions though, so perhaps if there's much more discussion on it we should split it out into a separate issue.

ssfrr · 2014-01-27T20:20:08Z

Reading around a little more on this, the backticks have been discussed here where @StefanKarpinski is not a fan. In #552 @JeffBezanson also says that this isn't happening because of limited ascii space.

Unicode/symbolic operators were discussed in #552 as a possible future feature but not a high priority.

damiendr · 2014-10-31T13:46:04Z

The Haskell approach would be especially nice in the context of mini-domain specific languages.
E.g. being able to create a graph this way:

a -> b -> c

with -> being the "connect" function.

The downside of restricting infix functions to unicode characters is that there is no easy way to type →.
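
A rough sketch of what the Unicode route could look like for this graph example, assuming a current Julia release in which → is one of the symbols the parser accepts as an infix operator. The Path type and the edges helper are hypothetical names made up for this illustration.

```julia
# Hypothetical container for a chain of node names.
struct Path
    nodes::Vector{String}
end

# Methods cover either grouping of a chained expression.
→(a::String, b::String) = Path([a, b])
→(a::String, p::Path)   = Path([a; p.nodes])
→(p::Path, b::String)   = Path([p.nodes; b])

# List the directed edges along the path.
edges(p::Path) = collect(zip(p.nodes[1:end-1], p.nodes[2:end]))

p = "a" → "b" → "c"
println(edges(p))
```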

StefanKarpinski · 2014-10-31T13:59:36Z

\rightarrow<tab>

jiahao · 2014-10-31T14:55:49Z

I'm tempted to close this in light of #552, #6929.

damiendr · 2014-11-02T10:17:37Z

@StefanKarpinski This doesn't work on any of my code editors, and neither do MacOSX's substitution rules (seems like they only work when editing rich text). But even then, that's 5 times the number of keystrokes (6 times on a fr keyboard), plus remembering the macro name. Also → confuses both Terminal.app and iTerm.app for the time being. I'm not entirely against the use of unicode in source code, but it feels like it's a bit ahead of its time...

The ability to neatly express embedded DSLs is a really, really powerful feature for a language, especially when coupled with Julia's metaprogramming. And freedom in defining your own infix operators is crucial for that! For instance let's say you want to use <- or :=. These operators can be typed in a completely obvious way with 2 keystrokes in Graphviz and Pascal, respectively. It would seem a bit impractical to require \colonequals<tab> or \leftarrow<tab> or alt-2254 in Julia... especially when ASCII == is accepted.

My feeling is that the current approach to infixes in Julia is a bit too restrictive for embedded DSLs, also because it's totally not obvious which of the many unicode math operators will work. An explicit approach would be much clearer IMHO.

StefanKarpinski · 2014-11-02T14:44:46Z

> The ability to neatly express embedded DSLs is a really, really powerful feature for a language, especially when coupled with Julia's metaprogramming. And freedom in defining your own infix operators is crucial for that! For instance let's say you want to use <- or :=.

I see where you're coming from but I think many Lispers would strongly disagree with this position.

eschnett · 2014-11-02T15:18:05Z

I've thought about this before, and came to the conclusion that the cleanest way is to actually store unicode characters in the source code, as this gives a standardized and unambiguous representation of operators.

The presentation (i.e. how it looks in an editor) can be changed. I was thinking of adding hooks to emacs for loading/saving files in Julia mode e.g. to replace \in by ∈, but then discovered that emacs's Julia mode already handles this (when pressing the tab key), and the visual representation of unicode characters works fine as well (in emacs).

I don't know what I would do in another editor. I would hope that displaying unicode works fine everywhere, but entering unicode still seems to be a problem. Julia's emacs mode's choice of latex notation is just the right thing for most Julia users, but also shows that this is neither standardized nor universally available.

The best I could come up with would be a preprocessor that translates "longhand" notation (e.g. ->) into unicode before compiling. This may need to be coupled to an unambiguous syntax for these operators. It seems that -> is fine in this respect, but <- is not -- the expression 1<-2 already has a meaning.
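
The existing meaning of 1<-2 can be confirmed by asking the parser. This is a minimal sketch using Meta.parse from a current Julia release; the variable name ex is only illustrative.

```julia
ex = Meta.parse("1<-2")

println(ex)             # shows how the parser groups the tokens
println(ex.args[1])     # the operator that was actually parsed
println(eval(ex))       # the value of the comparison
```

The expression is read as a less-than comparison against a negative number, which is why a longhand <- could not be introduced without changing the meaning of existing code.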

In an editor, where the unicode character is created instantaneously and on demand, this is fine, but doing this in an automated way only works if e.g. all operators need to be surrounded by white space or parentheses, as in Lisp.

Fortress had some interesting ideas in this respect.

jiahao · 2014-11-02T17:35:39Z

> Julia's emacs mode's choice of latex notation is just the right thing for most Julia users, but also shows that this is neither standardized nor universally available.

The code for tab-completion also exists in the base REPLCompletions module, so to the extent that Julia's REPL builds and is useable, we do have some consistency in the user interface.
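
As a rough illustration, the completion table can be inspected from code. This sketch assumes a current Julia release, where the module lives in the REPL standard library; latex_symbols is an internal, undocumented table and is used here as an assumption, not as a supported API.

```julia
using REPL

# Assumption: `latex_symbols` is an internal table and may change between releases.
table = REPL.REPLCompletions.latex_symbols

for name in ("\\rightarrow", "\\in", "\\times", "\\div")
    println(name, " => ", get(table, name, "(not found)"))
end
```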

> I would hope that displaying unicode works fine everywhere

Unfortunately we have very little control over the display of Unicode; this is primarily a question of what fonts people use. In fact, some commonly used fonts have some inexcusable mistakes in their glyph tables (e.g. #8429 for a particularly insidious problem with phi and varphi due to old Unicode code point mappings). And quite a few fonts have incorrect glyph bounding box offsets for combining diacritics, so that they end up rendered on the wrong characters.

In general, we have seen that displaying with default fonts on OSX is pretty good, but atrocious on Linux and Windows. You can see for yourself by viewing the tab completion table with your font of choice.

> Fortress had some interesting ideas in this respect.

I had been told this too, and yet when I dug into the details, I found Fortress's Unicode support wanting in clarity. In particular, Fortress's spec (v1.0; pdf) conveys no evidence that they have thought about the visual ambiguities inherent in Unicode support.

The Fortress spec does not make clear what level of Unicode equivalence the language supports. If we assume no canonicalization (default), the language treats as semantically different visually ambiguous characters like µ (micro) vs. μ (mu), and Å (Ångström) vs. Å (A with ring) vs. Å (A with ring combining character). (Julia currently supports NFC canonicalization natively for identifiers (canonicalize unicode identifiers #5434) and there may be custom canonicalization support in the future (add custom JULIA normalization? JuliaStrings/utf8proc#11) since neither NFC nor NFKC does what we really want.)
Fortress's insistence on a standard rendering that looks mathematical (Appendix B) only makes things far worse for producing multiple identifiers which cannot be disambiguated visually.
- The spec states that M is rendered as an italic M, which can then (depending on the font) be visually indistinguishable (yet semantically different) from 𝑀 (U+1D440).
- The spec states that the variable called OMEGA13 (all ASCII) is rendered the same as Ω₁₃ (U+3A9 U+2081 U+2083), yet the spec appears to treat these two as semantically distinguishable.
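
The equivalence classes mentioned above can be explored with the Unicode standard library. This is a small sketch for a current Julia release; the variable names and the nfc/nfkc helpers are only illustrative, and it shows string normalization rather than what the parser does to identifiers.

```julia
using Unicode

angstrom_sign = "\u212b"     # ANGSTROM SIGN
a_with_ring   = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
a_combining   = "A\u030a"    # A followed by COMBINING RING ABOVE
micro_sign    = "\u00b5"     # MICRO SIGN
greek_mu      = "\u03bc"     # GREEK SMALL LETTER MU

nfc(s)  = Unicode.normalize(s, :NFC)
nfkc(s) = Unicode.normalize(s, :NFKC)

# NFC unifies the three spellings of A with ring ...
println(nfc(angstrom_sign) == a_with_ring)
println(nfc(a_combining) == a_with_ring)

# ... but only the compatibility form NFKC maps the micro sign to mu.
println(nfc(micro_sign) == greek_mu)
println(nfkc(micro_sign) == greek_mu)
```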

JeffBezanson · 2014-11-02T23:56:53Z

Yes I think having multiple renderings of source code is a bad idea. Occasionally you typeset code for publication e.g. in a latex document, and that's fine as it's easily distinguished from other uses of the code. But having a preprocessor or multiple ways to type the "same" identifier strikes me as massive unnecessary confusion.

eschnett · 2014-11-03T03:27:10Z

I didn't refer to how Fortress handles unicode -- I wasn't familiar with that, and it's sad that this is so problematic. What I meant was that Fortress defines several styles for presenting code, one of them being essentially ASCII that can be easily displayed and input everywhere, and which is still quite readable once you get past the double brackets.

One could invent something similar for Julia (or rather for unicode in general): A way to enter and/or display unicode characters on devices that either don't have the respective fonts available, or where the backslash-name-tab completion is not available. This would not be part of the Julia language, but would be a convention that editors can follow, similar to backslash-name-tab.

cdsousa · 2014-11-03T11:32:04Z

I would say that the backslash itself could be used for that purpose, after deprecating its (not so common?) current uses. Then a \times b would become equivalent to a × b ...

StefanKarpinski · 2014-11-03T12:03:24Z

Good luck prying backslash out of the linear algebraists' cold, dead hands.

eschnett · 2014-11-03T14:00:26Z

backslash-name-space won't quite work, because people may write a÷b which is then rendered as a\divb. Maybe backslash-brackets would work -- a\[div]b.

The notation does not need to be entirely unambiguous, since the regular uses of the characters one chooses could be escaped. For example (but I'm not suggesting this), one could also re-use plain square brackets for this, as in a[div]b. Any actual use of a bracket then needs to be translated to [openbracket] or [closebracket].

ntessore · 2014-11-03T18:51:17Z

I'm sorry if this is a naive idea, but could one not resolve \leftarrow<tab> to a :leftarrow symbol (or maybe a special syntax, ::leftarrow or :leftarrow: to not clash with other symbols), and then let the :leftarrow symbol be rendered as a Unicode <- when the terminal supports it? One could then always type out the symbols normally, and nothing exotic would get written to files. This is, as far as I know, how Mathematica handles their special symbols.

Edit: I see now that this is probably what you are discussing anyway. Carry on.

JeffBezanson · 2014-11-03T19:44:28Z

I think the best solution is also the most realistic and most straightforward: wait for platforms, applications, and fonts to gradually get better unicode support. Even the humble misc-fixed fonts support almost all the characters you might want. I find browser, phone, and editor support is already quite good, and I don't even use any Apple products.

I don't think we need to pick up the slack for text editors. Doing technical programming in an editor that's not customizable and doesn't support entering special symbols seems like a very strange requirement. Seriously, get a better editor.

StefanKarpinski · 2014-11-03T20:10:12Z

Agree. We're designing a language for the next 20 years, not the last 20.

tkelman · 2014-11-03T21:18:28Z

> Yes I think having multiple renderings of source code is a bad idea. Occasionally you typeset code for publication e.g. in a latex document, and that's fine as it's easily distinguished from other uses of the code. But having a preprocessor or multiple ways to type the "same" identifier strikes me as massive unnecessary confusion.

Thoughts on Mathematica's typeset input style? I personally think that it (or something like it) is the nicest solution to the spaces-as-delimiters problem, it's clear from context when you're typing into a typeset matrix object, as it is in a textbook, where it really isn't so obvious in generally-monospaced unicode source.

Granted this shouldn't really be an issue for the base language, I see this as something that could be done at an IDE / IME level by IJulia or Juno down the road.

damiendr · 2014-11-04T14:29:27Z

As for distinguishing a <— 2 from a < -2, there is always the option to use a dash rather than a hyphen-minus, since most keyboard layouts have dashes.

In the end infix notation is syntactic sugar. It only makes sense when it's simpler and nicer to use than function call notation; else it's better to do without it. And if the expr space symbol space expr syntactic space is already too crowded to support Scala-style universal infixes, I guess the unicode solution is the only compromise left indeed... which I think is unfortunate but then, that's just a matter of taste.

Maybe what can be done is to write documentation that encourages editors to bundle both a syntax mode AND a set of useful macros for Julia.

I would also argue for making the list of accepted unicode characters as inclusive as possible, as the arity of an operator symbol might depend on whether it's being used in linear algebra, control theory or some obscure branch of topology.

jiahao · 2014-11-04T15:09:42Z

> Thoughts on Mathematica's typeset input style?

We already have tab-completion supporting LaTeX-style input, which is the de facto open source standard for non-ASCII input in an ASCII environment. Is there really a need for another completely different input method that users have to learn?

StefanKarpinski · 2014-11-04T16:45:21Z

LaTeX style input is supported in the REPL, vim, emacs, TextMate, Sublime, IJulia and probably several others. Not sure how much better we're supposed to be doing. If there are editors that don't yet have support for this, additions are certainly welcomed.

ntessore · 2014-11-04T17:02:05Z

@StefanKarpinski Pardon me if I go off on a tangent, but how do you get LaTeX-style input in TextMate 2?

tkelman · 2014-11-04T22:52:44Z

(apologies for somewhat hijacking the issue, again)

> We already have tab-completion supporting LaTeX-style input, which is the de facto open source standard for non-ASCII input in an ASCII environment.

Sure, for single characters at a time. But unless you're actually full-on rendering math-mode LaTeX, you're still limited to mostly-monospaced characters, uniform line heights, etc in an ASCII environment.

> Is there really a need for another completely different input method that users have to learn?

A need? No, definitely not. But I think it could be interesting to look into alternate methods of inputting and/or rendering math-formatted code down the line, to go beyond the limitations of an ASCII environment. Literate Julia, or something like it. Not that it worked out all that well for Fortress, or really any example aside from Mathematica. There may be architectural limits to IPython or mainstream editors that make this completely impractical, I don't know.

StefanKarpinski · 2014-11-05T01:15:09Z

Not actually sure – in Sublime Text, you can use the UnicodeMath package. I assume there's a similar package in TextMate but I might be mistaken.

Glen-O · 2015-07-20T08:18:49Z

Note: I don't know any details about changes possibly made in 0.4, I'm using 0.3.10.

How about making use of the existing meaning of |>, and generalising it slightly?

5|>mod<|6

Right now, we have "hello" |>print ≡ print("hello"). Could we not simply extend this to have print<| "hello" ≡ print("hello") as well, and "hello" |>print<| " world" ≡ print("hello"," world")? Not only does it maintain an existing notation, but it extends it in a completely sensible way. The only caveat is that <| is currently an "unused" infix operator, so it's possible that some package is already using it for something.
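
For comparison, here is what the existing |> already allows in a current Julia release. This sketch does not implement the proposed <| extension; it only shows that a two-argument call can be approximated today by piping into a closure. uppercase replaces print so that there is a return value to compare.

```julia
# Existing behaviour: x |> f is f(x).
piped  = "hello" |> uppercase
direct = uppercase("hello")

# A two-argument call can be approximated by closing over the second argument.
r = 5 |> (x -> mod(x, 6))

println((piped, direct, r))
```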

Another currently-unused infix operator that would also work nicely is --. So the notation would be 5 --mod-- 6. And it looks like ** also is available (it currently throws an error saying to use ^ instead... kind of a waste to just block its use entirely), so 5 **mod** 6 could work.

Other notations that aren't in use that could work nicely include <>, ><, and .. (another currently-unused infix operator).

(incidentally, I'm the Glen O that suggested exploiting the colon notation for custom infix)

jakebolewski · 2015-08-11T22:31:14Z

Closing in light of #6929. Any addition I feel will have to provide a compelling implementation.

matt2000 · 2016-04-24T06:39:41Z

I found this issue when searching for information on how to define my own infix operator/functions, so I'll share the solution I ultimately came up with, for the benefit of future searchers, and as an example of why Julia may not need this feature at the language level.

julia> macro <(arg, fn, args...)
       :($fn($arg, $args...))
       end
julia> @< "julialang" beginswith 'j'
true

On the other hand, if there's a better solution for this now, I'd appreciate pointers to any documentation.
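
A variation on the same idea, written as a sketch for a current Julia release. The macro name @infix is hypothetical and chosen to avoid relying on an operator as a macro name; esc is added so that the function and arguments are resolved in the caller's scope, and startswith is the current name of beginswith.

```julia
# Rewrite `@infix lhs fn rhs...` into the ordinary call `fn(lhs, rhs...)`.
macro infix(lhs, fn, rhs...)
    esc(:($fn($lhs, $(rhs...))))
end

a = @infix "julialang" startswith 'j'
b = @infix 7 mod 4
c = @infix 1 max 2 3      # extra arguments are passed through

println((a, b, c))
```

This keeps the rewriting at the macro level, so nothing about the parser or the set of built-in operators has to change.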

stevengj · 2017-10-06T17:11:30Z

See #16985 on defining custom infix operators.

ssfrr mentioned this issue Feb 2, 2014

Function chaining #5571

Closed

stevengj mentioned this issue May 23, 2014

exploiting colon for custom infix operators #6946

Closed

ViralBShah removed the feature label Feb 14, 2015

jakebolewski added the parser label Jun 2, 2015

jakebolewski closed this as completed Aug 11, 2015
