non-interpolating string literal syntax? #11764

Closed
StefanKarpinski opened this issue Jun 19, 2015 · 49 comments
Labels
speculative Whether the change will be implemented is speculative

Comments

@StefanKarpinski
Member

It's come up pretty regularly that having a non-interpolating string literal syntax would be handy. Maybe

'''This costs $10'''

Whether to do indentation stripping or not would be an open question.

@StefanKarpinski StefanKarpinski added the speculative Whether the change will be implemented is speculative label Jun 19, 2015
@StefanKarpinski StefanKarpinski changed the title non-interpolating string literal syntax non-interpolating string literal syntax? Jun 19, 2015
@JeffBezanson
Member

The best option is to have all string literal syntax be non-interpolating. Simple :)

@johnmyleswhite
Member

Interpolatio delenda est.

@ScottPJones
Contributor

Yes, that's a severe problem with the current interpolation syntax, which I'd love to see deprecated.
As I've suggested before, why couldn't we start off by at least adding the \( ) interpolation syntax, which doesn't eat up another ASCII character (because \ is already special) and would be consistent with similar languages such as Swift (which seems to have a LOT of similarities to Julia)?
Then later on, maybe you could support the $ interpolation syntax only in an i"..." string, for example, and wean people off of it slowly (but before Julia 1.0!).

@StefanKarpinski
Member Author

Realistically, the current interpolation syntax is not going anywhere. There seem to be no other people who find the $ for interpolation problematic – aside from Jeff, who just doesn't like interpolation at all. The $ is generally used for interpolation throughout the language, so the \( ) thing just isn't going to happen.

@JeffBezanson
Member

In fact as soon as you need parens the value of interpolation goes down significantly. Interpolation works best when "just a $variable" is interpolated. I suspect most people prefer string(f(x)) to "$(f(x))".
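
For reference, the two spellings compared above produce the same string. This is a minimal sketch; `f` and `x` are placeholder names chosen for illustration, not anything from the thread.

```julia
# Hypothetical function and value, used only for illustration.
f(x) = x^2
x = 3

via_string = string(f(x))   # explicit conversion of the call result
via_interp = "$(f(x))"      # interpolating a call requires parentheses
simple     = "x is $x"      # the bare-variable case needs no parentheses
```

Both `via_string` and `via_interp` hold the same text; the difference is purely one of notation.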

@StefanKarpinski
Member Author

Jeff, c'mon man, you had the chance to get rid of this a couple years ago and didn't do it. Let's not rehash this. I get it, you don't like interpolation. Let's focus on the issue at hand, which is not whether to get rid of interpolation in normal string literals or change its syntax. I guess I should have known better than to open this issue.

@JeffBezanson
Member

I'm just saying that $x is better than \(x). Extreme conciseness is one of the benefits of interpolation.

you had the chance to get rid of this a couple years ago

I don't think that's a very accurate description.

@StefanKarpinski
Member Author

We had horse traded this at some point (back when we could still do that). I agreed to getting rid of interpolation in exchange for something or other. I think the plan was juxtaposition instead.

@JeffBezanson
Member

Yes, we were considering x "" y juxtaposition, but I think that syntax had too many problems of its own. It's pretty non-standard and is oddly space-sensitive, so it didn't seem like enough of a win. Between $-interpolation and juxtaposition I'm fine with picking interpolation.

@ScottPJones
Contributor

@JeffBezanson It's no extra characters at all, for the $(expression) case, and doesn't lead to people complaining about $ signs getting eaten or having to be quoted...
Ruby: #{expr}
Python: '%d' % value
Julia: $name or $(expr)
Perl/PHP: "$name" or "@{[expr]}" (note: single quoted strings are not interpolated)
JavaScript: ${expr} (this is a new feature in ECMAScript 6)
Swift: \(expr)
Only Perl/PHP have such a concise syntax, but only for a variable, and their variable names start with $ anyway, so it kind of makes sense... of course with the serious downside that $ is a rather common character in strings, but they at least have the escape that single quoted strings don't interpolate.
If you are trying to produce a string that has interpolation in Julia, that later would be read in as a Julia program, you need to always use the $(variable) syntax anyway, because otherwise you have to be very careful about what character comes after the variable name...
You think people really prefer string to interpolation? I wish they did, but I haven't seen evidence of it!
😞

@StefanKarpinski I can't find it right now, but a lot of people have been complaining about string interpolation and would like to at least have non-interpolated strings that would be more compatible with C/C++/Swift/Java/JSON etc. However, I don't think the ''' syntax would be good. People were looking for syntax like u"...", U"...", or utf8"...", utf16"...", utf32"...". The prefixes would also work identically on """ strings: indentation removed, but no interpolation.
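
The point above about needing `$(variable)` whenever the next character could extend the identifier can be sketched as follows. The names `prefix` and `prefix_backup` are made up for illustration, and `Meta.parse` is used only to inspect what the parser sees.

```julia
prefix = "data"

# Parentheses delimit the interpolated name explicitly.
safe = "$(prefix)_backup"

# Without parentheses the parser reads the longest identifier it can,
# so the string below refers to a variable named prefix_backup.
ex = Meta.parse("\"\$prefix_backup\"")
refers_to_longer_name = :prefix_backup in ex.args
```

Here `safe` is `"data_backup"`, while evaluating the unparenthesized form would fail unless a variable called `prefix_backup` happened to exist.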

@StefanKarpinski
Member Author

The UNIX shells started the whole interpolation with $ thing, and since shells are probably the oldest and most widely distributed scripting languages around, I'd say there is some prior art.

The reason using u"""...""" etc. for non-interpolating strings is not great is that we want to move to where all double-quote-like strings behave the same and have things like interpolation, escaping, and indent removal handled by the parser, rather than implemented over and over again in macros. In fact, the kind of nested parsing that interpolation inside of double quotes does is not possible to implement via macros. The sane place that we'd like to be – and maybe I should have led with this – is where you define a single @foo_str macro, and you get foo"..." and foo"""...""" for free and they have consistent behavior.
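
As a rough illustration of the single-macro idea, here is a sketch of how a string macro behaves in current Julia releases. The macro name `shout` is hypothetical; the only assumption is the standard rule that a macro named `@shout_str` is invoked by the `shout"..."` literal syntax.

```julia
# Hypothetical string macro: receives the literal text and upper-cases it.
# The text arrives without interpolation, so `$` is just a character.
macro shout_str(s)
    uppercase(s)
end

single = shout"costs $5"
triple = shout"""costs $5"""
```

One definition serves both the single-quoted and triple-quoted forms, and neither performs interpolation.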

@ScottPJones
Contributor

I thought that @foo_str already worked on both foo"..." and foo"""..."""?
Maybe you need a @foo_nistr convention, so that people can have macros that specifically don't interpolate.
A lot of people do want the option (like Perl/PHP have), of having non-interpolating strings.
After working on the triplequote code, it seems that it would be much nicer to have all that logic handled in the C code parser, and maybe both single double-quote and triple double-quote parsing could be done with simply a couple of flags to a single parse routine, for interpolate or not, indentation removal or not, escaping or not... and allow the string macros to select which things they want...

@hayd
Member

hayd commented Jun 19, 2015

The cost of putting in a backslash hardly seems worth another string literal type/macro. You also have to escape " so it's not that surprising...

An alternative solution, for the most common problem, is for strings like "$10" to be allowed to parse since what follows the $ is not an identifier.
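
For concreteness, the escaping cost mentioned above looks like this in an ordinary literal. This is a small sketch with an arbitrary example sentence.

```julia
# Both `"` and `$` need a backslash inside a standard string literal.
quoted = "She said \"it costs \$10\""

dollars     = count(==('$'), quoted)   # literal dollar signs in the result
quotes      = count(==('"'), quoted)   # literal double quotes in the result
backslashes = count(==('\\'), quoted)  # the escapes themselves are not stored
```

The backslashes exist only in the source text; the resulting string contains the plain characters.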

@StefanKarpinski
Member Author

An alternative solution, for the most common problem, is for strings like "$10" to be allowed to parse since what follows the $ is not an identifier.

Seems a bit too brittle. Currently it's whatever expression comes after, which is pretty simple.

@ScottPJones
Contributor

@hayd The problem is simply that since the string formats are almost identical, people copy strings from C/C++/Java/JSON directly into their programs... and then end up with bugs, and have to go through line by line to find the cases of $ that need to be quoted in their strings... not fun.
$ is just too common a character to always have to be quoting it, just for some syntactic sugar.
The argument about the conciseness of $variable for string interpolation really falls down there, when people have to add lots of \ just to make Julia happy...
That's why I'd rather have \(expr), that has no problems about what follows, it's not brittle, and it doesn't "eat" a common character that doesn't have to be quoted in most other languages.

@JeffBezanson
Member

Composing the different kinds of string literals is a significant problem; e.g. if you want a utf-16 string without interpolation. It starts to get ugly.

@ScottPJones
Contributor

I wouldn't want interpolation for any of them... what people were asking for was to have C/C++/Java like strings, and not have to worry about $, and as a separate item, to be able to have things like C/C++'s u"...", U"..." syntax available.

@JeffBezanson
Member

I wouldn't want interpolation for any of them

No argument there :)

@hayd
Member

hayd commented Jun 19, 2015

Seems a bit too brittle.

How's about allowing either identifier or digits. e.g. "$10a" would fail but "$10.00" would work (similar to "$a.00" works now). This seems to cover the most common cases without breaking any existing code.

Edit, to clarify:

"$10.00" == "\$10.00"
"$a.00" == "10.00"  # if a = 10

@kmsquire
Member

I'm not a fan of

'''You've won $1,000,000 (not interpolated)'''

mostly because that starts to cloud the syntax (it's too close to """...""", and programmers coming from Python, at least, will get confused and start using these interchangeably).

Why not go back to a xxx_str macro (I think it used to be b"I want my $2 (not interpolated)")?

(And for what it's worth, I do rather like string interpolation--thanks for defending it Stefan!)

@yuyichao
Contributor

For me $10 is not the most common use case. I personally use literal $ in strings mostly with PyPlot for LaTeX labels. It would be quite surprising if the behavior of $ in a string depended on whether it's in front of a number or not.

@hayd
Member

hayd commented Jun 20, 2015

I meant to say this before: if we go with a macro, how's about: raw"...".

@kmsquire
Member

I kind of like raw"...".

@Keno
Member

Keno commented Jun 20, 2015

We used to have @L_str that did this, but that was removed.

@ScottPJones
Contributor

I'd like to have a raw, but also have another that is just \ escapes, no interpolation (or Swift-style \(expr) interpolation).

@cdsousa
Contributor

cdsousa commented Jun 20, 2015

I'm with @ScottPJones,
+1 for the raw "just \ escapes" literal, but preferably with the innocuous \(expr) interpolation (I find myself having to write $(variable)_something many times anyway).

@hayd
Member

hayd commented Jun 20, 2015

IMO raw should have no interpolation, innocuous or otherwise. I think the one tricky bit/ambiguity is backslash escapes, e.g. """a"b""" and """a\"b""" parse the same... Python's r"..." gets around this by not allowing backslash escapes at all; this has the benefit of being simple but (for Julia) means you couldn't start/end a raw string with a ".
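
For comparison, this is how the `raw"..."` literal that ships in current Julia releases treats the quote question raised above. It is a sketch of observed behavior, not a statement about what was being proposed in this comment.

```julia
# A backslash immediately before a quote still escapes the quote,
# so a raw string can contain (or start/end with) a double quote.
with_quote = raw"a\"b"

# Backslashes elsewhere are kept verbatim, and `$` is never special.
path  = raw"C:\dir\file"
price = raw"$10"
```

So `with_quote` is the three characters `a"b`, while `path` keeps both of its backslashes.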

@cdsousa
Contributor

cdsousa commented Jun 20, 2015

yeah, I shouldn't have said "raw", I should have said "just \ escapes"...

@ScottPJones
Contributor

Yes, I'd like a raw, but what I said was I wanted another one, that was closer to C/C++/Java/Swift/JSON...
I've been having crazy ideas today about some other interesting useful additions, that would still not require quoting any extra characters... (just \ and ").
What would people think of the following format?

u"This is a string with funny Unicode characters, like \⊕ it costs $10, and it also can handle 0x\(hex(mychar))"

@stevengj
Member

@yuyichao, in PyPlot you can use L"..." for LaTeX labels. This also has the added benefit of allowing you to omit the $ signs entirely if the entire label is an equation, e.g. L"$\alpha + 1$" and L"\alpha + 1" are equivalent. The resulting strings are also displayed as rendered equations in IJulia.

@ScottPJones
Contributor

@stevengj That trick only helps people using PyPlot though, in the specific case of LaTeX labels, correct?
Dealing with having to quote $ specially in strings just so interpolation can save two characters is a pain for a lot of people.

@felipenoris
Contributor

Maybe you should deprecate interpolation on "$xyz" case, and keep interpolation on "$(xyz)".
If you're porting strings, either by copy-pasting code or reading from files, "you won $10" will work.
When you want to interpolate, you're generally writing from scratch. So writing "you won $$(prize)" will translate to "you won $10", and I wouldn't be bothered by the additional parentheses. I find it sufficiently concise. At least, it forces the programmer to be intentional about using the interpolation feature. But, in my opinion, conciseness by itself is not the main reason to interpolate, but to have access to language expressions inside strings ... in a concise way. :)
Btw, string(f(x)) may be better than "$(f(x))" in this simple case, but I hate doing paste0("I want to interpolate ", var1, ", and ", var2, ", and, "var3) , as I would write in R. In fact, it's so cumbersome that what I just wrote has an error.
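
A sketch of the comparison being made, written in Julia with `string` standing in for R's `paste0`. The values assigned to `var1`, `var2`, `var3`, and `prize` are placeholders. Note that the literal dollar sign is written `\$` here, since that is the spelling current Julia accepts.

```julia
var1, var2, var3 = "a", "b", "c"

# Concatenation style: every separator is its own quoted fragment.
pasted = string("I want to interpolate ", var1, ", and ", var2, ", and ", var3)

# Interpolation style: one literal with the variables inline.
interpolated = "I want to interpolate $var1, and $var2, and $var3"

# A literal dollar sign followed by an interpolated value.
prize = 10
message = "you won \$$(prize)"
```

The two styles build identical strings; the interpolated form simply has fewer places to misplace a comma or quote.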

@stevengj
Member

Yes, that's only for the PyPlot/LaTeX use-case mentioned above. Having to escape $ is a pain, but having a more verbose interpolation syntax is also a pain (not to mention a huge disruption of existing Julia code).

I feel like the ship has sailed here on radical changes to interpolation syntax. A macro is a good option, but there is the question of whether it should still make backslash substitutions (by default, a string macro will treat backslashes as literal chars); it seems like one could get a large number of different "raw" string macro variants depending on the data source, and I'm not clear on what should be in Base...

@malmaud
Contributor

malmaud commented Jul 11, 2015

What if we used suffixes to customize the kind of raw string you want? Something like raw"..." escapes both $ and \, raw"..."d escapes only $, and raw"..."b escapes only \.
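
The suffix idea can be sketched with the flag argument that Julia string macros already accept (the same mechanism regex literals such as `r"..."i` use). The macro name `lit` and the meaning given to the `d` flag are assumptions made for this example; only the `d` variant is shown, because a variant that keeps `$` interpolation cannot be written as a plain macro over the literal text.

```julia
# Hypothetical macro, no suffix: the text is returned untouched,
# so neither `$` nor `\` is special.
macro lit_str(s)
    s
end

# Hypothetical `d` suffix: `$` stays literal, but backslash escapes
# such as \t are processed with Base's unescape_string.
macro lit_str(s, flags)
    flags == "d" || error("unsupported flag: ", flags)
    unescape_string(s)
end

fully_raw   = lit"a\tb $x"
dollar_only = lit"a\tb $x"d
```

With this sketch `fully_raw` keeps the backslash and the `t`, while `dollar_only` contains a real tab character; both keep the dollar sign.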

@kmsquire
Member

I think that's reasonable. Most of the time, people would just use raw"...", and would only use one of the suffixes if they needed the extra functionality.


@felipenoris
Contributor

In the context of avoiding disruption of existing code, raw"..." sounds good, as long as there is a solution for "you won $$(prize)" case.

@hayd
Member

hayd commented Aug 3, 2015

Any chance raw".." might sneak into 0.4 ? Is it simply:

macro raw_str(x)
    x
end

@shashi
Contributor

shashi commented Aug 3, 2015

It's true that string interpolation is currently the most concise way to do... string interpolation. I have grown to love this feature, except when it presents you with unhelpful line numbers in error messages. Here is a ~220 line md"" string with copious amounts of interpolation. It's really awesome when you are composing long documents and keep the flow. Here is the rendered result.

We do need a non-interpolating string literal syntax. The $ symbol is kind of an unwieldy beast in this specific case as it conflicts with LaTeX interpolation in Markdown as @stevengj also pointed out. So for example $a + b$ is taken as interspersed LaTeX whereas $(a+b) will be evaluated (if it's in its own line, cc @one-more-minute ). I quite like the \() suggestion as an alternative to $, it's got a bit of TeX feel to it. But there are some other good suggestions here as well like raw"".

@StefanKarpinski
Member Author

I'm not too concerned about needing literal $ in strings except for LaTeX in docstrings, where it's really a shame that we can't just write $\sqrt{x^2 + y^2}$ like one normally does.

@MikeInnes
Member

Our docstrings are already reasonably smart about interpolation and escaping, so it's not that awful.

julia> x = "bar"
"bar"

julia> md"""
       foo $x baz

       $\sqrt{x^2 + y^2}$
       """
  foo bar baz

\sqrt{x^2 + y^2}

@jverzani
Member

jverzani commented Aug 5, 2015

Could the docstrings be smart enough to also parse

out = md"""
$$
displaymath
$$
"""

as LaTeX and not as

julia> out.content
2-element Array{Any,1}:
 $                                             
 Base.Markdown.Paragraph(Any["displaymath ",$])

I can't find a satisfactory workaround.

@MikeInnes
Member

That should work without the spaces between the $s and the TeX. See the fft docstring. We could certainly parse something like that as TeX as well if it's standard – in general I'd like to follow the lead of other parsers, e.g. pandoc.

@jverzani
Member

jverzani commented Aug 5, 2015

Thanks, that could work. In case it is any push to extend the syntax a bit, the demo at http://pandoc.org/try/ parses both of these:

$$\sin(x)^2$$

$$
\sin(x)^2
$$

as LaTeX, but does not parse this as LaTeX:

$$

\sin(x)^2
$$

@hayd
Member

hayd commented Aug 5, 2015

If we do this, we should use \[ .. \] rather than $$ (which is deprecated). I feel like there's a PR with that change already...

@jverzani
Member

jverzani commented Aug 5, 2015

Using \[ is certainly reasonable. Thanks for the consideration of adding the feature.

@stevengj
Member

stevengj commented Aug 5, 2015

@hayd, $$ may be deprecated in LaTeX proper, but in Markdown it is the only commonly supported syntax for display-mode equations.

@MikeInnes
Member

See also this thread for some relevant discussion.

@jballanc
Contributor

jballanc commented Nov 8, 2015

Just stumbled across this issue and thought I'd add a slight correction to the list of other languages' interpolation syntaxes. Ruby can use "#$global" and "#@ivar" to interpolate values without the "{}". Presumably this was originally part of Ruby's Perl inheritance, but today very few Ruby programmers even know about these variants. Indeed, usually they only discover it accidentally when some string literal doesn't come out as expected. Currently, there is an open issue with Ruby to remove the shorthand, as it is very, very rarely used any longer: https://bugs.ruby-lang.org/issues/10541 .

Obviously the situation is not completely analogous with Julia since Ruby's short-hand interpolation syntax doesn't work with local variables. In the comments on that issue, though, there's a very good analysis that even when the short-hand could have been used, it largely wasn't. Removing the short-hand interpolation syntax seems to be the simplest solution to the issues presented here, and Ruby's experience seems to indicate that it wouldn't be horribly missed in the long term.

@tkelman
Contributor

tkelman commented Jan 26, 2017

close given the raw string macro from #19900?
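
For readers arriving here later, a short sketch of what the `raw"..."` literal available in current Julia releases does with the examples from this thread. The variable names are chosen for illustration only.

```julia
price = 10

# Standard literal: the dollar sign must be escaped, then the value interpolated.
escaped = "This costs \$$(price)"

# Raw literal: `$` is an ordinary character, nothing is interpolated.
literal = raw"This costs $10"

# The LaTeX case from the docstring discussion, written without escapes.
latex = raw"$\sqrt{x^2 + y^2}$"
```

`escaped` and `literal` are the same string, and `latex` holds the formula text exactly as typed.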
