Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: add "++" operator #11686

Merged
merged 2 commits into from
Jul 27, 2015
Merged

RFC: add "++" operator #11686

merged 2 commits into from
Jul 27, 2015

Conversation

jdlangs
Copy link
Contributor

@jdlangs jdlangs commented Jun 12, 2015

I was quite surprised issue #11030 reach 160+ comments and no one had created a PR for introducing ++, as it seemed to have very broad support. It was a pleasant surprise how easy it was to add. Serious thanks to @mbauman for providing the hint to get this functionality that an outsider like me would otherwise have no idea about.

Speaking of being an outsider, I tried to be as non-controversial as possible, by just replicating the current * concat method and not allowing any other generic use. I was motivated to make this PR because I really liked the idea of a generic sequence concat operator but others can work out the details for that.

This will break code if anyone was doing something like x++1 (as x + (+1)) which is pretty unlikely but possible.

I guess the long-term plan would be to decide if ++ should be added to 0.4, and then if so, the whole debacle of deprecating * would be discussed/argued/executed in 0.5-dev? I'm looking forward to watching the comments with my popcorn...

@johnmyleswhite
Copy link
Member

I like this operator. For me, the toughest question is how this relates to vcat/hcat.

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 12, 2015

Also, I didn't know how to make x ++ y ++ z parse to ++(x,y,z) so it creates ++(++(x,y), z)

@mbauman
Copy link
Member

mbauman commented Jun 12, 2015

Hooray code! Thanks for starting the ball rolling here. It's so much easier to discuss implementations instead of going around in circles of spilled ink.

Given the timeline @IainNZ and I were discussing in #11030 (comment), maybe this should be pared down to simply be the addition of the parsed operator (with no builtin meaning at all). That way this can be developed over the course of 0.4 in a package to experiment with things like Char ++ String and Char ++ Char and maybe even vcat and general iterables.

x ++ y ++ z parse to ++(x,y,z)

That's a bit of a harder problem. Right now + and * are special cased: https://github.com/JuliaLang/julia/blob/master/src/julia-parser.scm#L771. You'd have to make the chain-op argument a list, and propagate that through to, e.g., replace eq? with (I think) memq. But that's likely to be met with a little resistance since @JeffBezanson has expressed interest in removing this special parsing entirely. So I think this is a good start.

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 12, 2015

The test failure is coming from a parsing test of "+ + +" which was added when fixing #8994. Currently, repr(parse("+ + +")) == ":(+(++))" and with this change it should be ":(+(+ +))" or ":(+(+(+)))"

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 12, 2015

@mbauman Thanks again for taking time to point out some of the internals. I won't bother following up on the multi-argument parsing then.

As an additional note, I think if it is legitimate to do x = x ++ y, IMO there isn't really a reason to not include ++=. I know @JeffBezanson is concerned about encouraging inefficient code patterns but I feel like that can be handled effectively in documentation by being extra clear that any operator of that form actually makes a copy. Not having the option at all seems a bit too prescriptive from my POV.

@@ -108,6 +108,9 @@ end
const .≤ = .<=
const .≠ = .!=

#For now, no generic behavior for ++
++(x,y) = error("++ only defined for strings")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better for this to be a no method error.

@tkelman
Copy link
Contributor

tkelman commented Jun 13, 2015

Ref #9532, past issue on parsing of ++ vs --

This would be hypothetically breaking some not-particularly-legible code, but I think I'd be in favor of allowing it to be parsed as an operator for now (would we need to make -- no longer an error a syntax deprecation warning for consistency?) without giving it a meaning yet.

@PallHaraldsson
Copy link
Contributor

"I was quite surprised issue #11030 reach 160+ comments", yes.. I'm not trying to do that here..

See my latest comment there about: "Also, I didn't know how to make x ++ y ++ z parse to ++(x,y,z) so it creates ++(++(x,y), z)", parsing to ++(x,y,z) would be good but not if it results in two operations.

If "not allowing any other generic use. [..] This will break code if anyone was doing something like x++1 (as x + (+1)" your operator might be better, because it disallows non-string types, or worse depending how you look at it.. This already works (or with println):

julia> string(1, "10", 1)
"1101"

It's grate that you're learning Julia and trying to help. I'm not sure we should add this if string or that function renamed to cat or c function might be better, then we have to get rid of two operators.. if that ended up as the recommended way.

Would the ++ ever be useful for other things, such as, in C++?

[Another operator (already taken..) || is good.. used by SQL, would not have had the numbers problem or could cat them.. It's only defined for booleans, not sure it's a good idea to define it for this for all non-booleans.. Maybe it would work? As the result of || is also bool and you would get an error down the line. Immediately if in a condition, but this is probably a very bad idea as errors should be localized.]

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 16, 2015

@StefanKarpinski I removed the generic fallback method so any call that isn't using strings will throw a MethodError.

@PallHaraldsson:

Would the ++ ever be useful for other things, such as, in C++?

I think if people really wanted an increment and/or decrement unary operator, we would have seen more requests on the mailing list, etc. It's also pretty tough to make a good case that x++ is much better or more convenient than x+=1.

@ScottPJones
Copy link
Contributor

I really think that this should not be defined with string in the general case, because that will make it unusable as a general concatenation operator, which was one of the big reasons that I was pushing for ++ or .. in the first place. It is also not consistent with the current * string concatenation operator.

It also means that you couldn't fix things so that

b"abcde" ++ b"fghij"

would work as expected.
If somebody really wants to stringify something, then they can use string already.

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 17, 2015

@ScottPJones I think you misread the changes here. This doesn't call string on arbitrary objects. It is currently an exact copy of the * string concat method and will only work when passed string arguments. b"abc" ++ b"def" or "123" ++ 4 will throw a MethodError.

However, it is looking like people are more interested in allowing the operator parsing without even the string concat logic. I personally am somewhat skeptical about adding an operator without giving it some sort of default meaning (why is it being added then?) but will leave it up to the experts.

@ScottPJones
Copy link
Contributor

Here is a little bit of what I'd been working on (but all my suggestions for a specific operator, including .., $+,++,, kept getting shot down):

{T <: Union{AbstractString,Char}}(a::T, b::T...) = string(a, b...)
(a::Char, b::Char...) = string(a, b...)
{T <: Union{UInt8, UInt16, UInt32, Char}}(a::Union{T,AbstractVector{T}}, b::Union{T,AbstractVector{T}}...) = vcat(a, b...)
julia> "foo""b"
"foob"
julia> "foo"'c'
"fooc"
julia> b"foo"b"bar"
6-element Array{UInt8,1}:
 0x66
 0x6f
 0x6f
 0x62
 0x61
 0x72

@ScottPJones
Copy link
Contributor

@phobon Yes, I saw that, that's why I said "in the general case". I was afraid that you'd change it to use string() for other than concatenations with AbstractString.

@ScottPJones
Copy link
Contributor

However, it is looking like people are more interested in allowing the operator parsing without even the string concat logic.

@phobon I tried exactly that, and had my PR #11102 peremptorily closed 😞
I've been watching this PR with great interest, hoping that you'd have more success than I did getting this in...

@mbauman
Copy link
Member

mbauman commented Jun 17, 2015

Let's not get distracted by the general case. Part of the beauty in this PR is its minimalism. Once the operator can be parsed, a package can drive development of the more complicated cases. As it stands, this PR entails two fairly straightforward decisions:

  • Do we want to parse ++ as an operator, knowing that it might break some rickety code?
  • Do we want to export ++ from base as a concatenation operator? This needs documentation in that case.

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 17, 2015

@ScottPJones I think those methods all look quite reasonable. However, I don't think this is the place to discuss them since that amount of logic will definitely not be in this PR, which is just about whether the ++ operator should be introduced.

@ScottPJones
Copy link
Contributor

@phobon That's fine, although I would suggest the change I had in the first method, that allows Chars to be concatenated with AbstractStrings.
@mbauman I don't think two lines added in order to handle the most useful general case makes this not so minimalist... However, if this PR does make it in, I'll immediately submit a PR to add those two lines! 😀

@ScottPJones
Copy link
Contributor

Do we want to parse ++ as an operator, knowing that it might break some rickety code?

@mbauman That's why I'd been trying, in #11102, to find some character(s), such as \bigoplus ,
that could not break anything at all... (that character, in particular, is only used as a unary operator
AFAICT in math, so it still could unambigously be used as a general concatenation operator)

Hooray code! Thanks for starting the ball rolling here. It's so much easier to discuss implementations instead of going around in circles of spilled ink.

Remember, I did submit code also, for PR #11102...
(edited: fixed my stupid typo on the PR #!)

@jdlangs
Copy link
Contributor Author

jdlangs commented Jun 17, 2015

For those confused, @ScottPJones is meaning to link to #11102.

For now, I won't propose any further extension beyond what * already does. I'm sure if ++ is accepted though, making ++(::String,::Char) work will definitely go forward.

@ScottPJones
Copy link
Contributor

BTW, very good work @phobon, I'll be incredibly happy to see this get in...

@jdlangs jdlangs force-pushed the concat_operator branch from 3557e7c to 1a9d16e Compare July 9, 2015 18:32
@jdlangs jdlangs changed the title WIP: "++" as [string] concatenation operator WIP: add "++" operator Jul 9, 2015
@jdlangs jdlangs force-pushed the concat_operator branch from 1a9d16e to 0c2b1dd Compare July 9, 2015 18:58
@jdlangs
Copy link
Contributor Author

jdlangs commented Jul 9, 2015

Ok, I removed all exports and concat meaning so this PR is now just about the parsing. This is already sufficient to experiment with various ++ methods down the road. General consensus seems to be that allowing the parsing is ok, despite some possible minor breakage. Are there any other aspects that should be addressed/discussed?

@jdlangs jdlangs changed the title WIP: add "++" operator RFC: add "++" operator Jul 9, 2015
@jdlangs
Copy link
Contributor Author

jdlangs commented Jul 24, 2015

I downloaded and grep'd through all the package code for the string "++" and found it used in two places. Both of them looked to be mistaken translations from C/C++ where the increment operator was not rewritten. I'll go ahead and submit PRs to those packages regardless of what happens here.

Given the imminent 0.4 feature freeze, is there any decision? Possible future development with the operator is going to need it defined in 0.4 for Compat. @JeffBezanson, I think it's your call given your previous comment.

@prcastro
Copy link
Contributor

👍 This is a nice adition to 0.4

@ViralBShah
Copy link
Member

I would prefer not to have such a major syntax change this late into the 0.4 release cycle, and instead have it for the 0.5 cycle.

@tkelman
Copy link
Contributor

tkelman commented Jul 26, 2015

This is a pretty minor parsing adjustment and leaves most of the question of what to do with this operator for packages to play with, so I don't think of it as that disruptive to do now? The other recent syntax change in #12285 was quite a bit more disruptive than this would be.

@jdlangs
Copy link
Contributor Author

jdlangs commented Jul 27, 2015

@JeffBezanson, I've added the n-ary parsing behavior. I did what was probably very naive changes to make it work but it seems to be working ok and not conflicting with the chain parsing of + or *.

@jdlangs
Copy link
Contributor Author

jdlangs commented Jul 27, 2015

Btw, @aviks, your Ito.jl package is one of the two that currently (mistakenly?) uses ++ here

@JeffBezanson
Copy link
Member

Looks good, thanks!

@StefanKarpinski
Copy link
Member

I wonder if ConcatString could be defined like this:

immutable ConcatString{N,S<:String} <: String
    strings::NTuple{N,S}
end

It's probably pretty common to know how many things have been concatenated up to a point. If you're constructing one of these in a loop you should probably be using an IOBuffer anyway.

@JeffBezanson
Copy link
Member

Yes, we might want to experiment with some iovec-like things like that. Would you still be in favor of n-ary parsing of ++ though?

@StefanKarpinski
Copy link
Member

Yes, n-ary parsing strikes me as better in any case.

JeffBezanson added a commit that referenced this pull request Jul 27, 2015
@JeffBezanson JeffBezanson merged commit b64a49f into JuliaLang:master Jul 27, 2015
@waldyrious
Copy link
Contributor

🎆 👏

@jdlangs jdlangs deleted the concat_operator branch July 27, 2015 22:22
@tkelman
Copy link
Contributor

tkelman commented Jul 28, 2015

Quoting myself from #11686 (comment)

would we need to make -- no longer an error a syntax deprecation warning for consistency?

@jakebolewski
Copy link
Member

I don't think so.

@waldyrious
Copy link
Contributor

Shouldn't this go on NEWS.md?

@quinnj
Copy link
Member

quinnj commented Aug 9, 2015

Not NEWS-worthy IMO, since we're not actually adding any functionality, just parsing it as an operator.

@waldyrious
Copy link
Contributor

Well, given the vast amount of discussion that surrounded this issue, I thought people would be interested nevertheless. Besides, it is a language change, albeit admittedly a small one.

@ssfrr
Copy link
Contributor

ssfrr commented Aug 9, 2015

I wonder if 1 / length(discussion) is a reliable measure of language impact. 😉 Maybe lines_changed / length(discussion).

@waldyrious
Copy link
Contributor

I'm just saying, lots of people will be interested in knowing this news, regardless of how significant the developers consider this change to be. Purposely choosing to be quiet about this change is IMO a disservice to all the people who engaged in those conversations.

@StefanKarpinski
Copy link
Member

It's fine to announce it and include a list of new operators. Please feel free to make a PR, @waldyrious.

@waldyrious
Copy link
Contributor

Will do :) What other operators should I include? #12472? Any others I'm not aware of?

@waldyrious
Copy link
Contributor

Alternatively I can add only ++ and let others more in-the-loop do update the file to include the remaining operators.

@StefanKarpinski
Copy link
Member

I think you can mention ++ explicitly and just link to the appropriate parser file, which is definitive.

@waldyrious
Copy link
Contributor

Ok. I also realized that and were already parsed in 0.3, so that's irrelevant here. The only new operators besides ++ seem to be ÷= and .÷= (#9009). Are those what you were thinking of when you wrote of "a list of new operators", @StefanKarpinski? I'm not sure you want me to include them in the PR or not, so I'll stick with ++ and they can be added later.

waldyrious added a commit to waldyrious/julia that referenced this pull request Aug 11, 2015
@StefanKarpinski
Copy link
Member

Sure, list them all – for some reason I thought more operators were added in 0.4 but maybe it was 0.3.

@tkelman
Copy link
Contributor

tkelman commented Aug 11, 2015

I think more unicode operators are parsed on 0.4 than on 0.3 but not very many of them have pre-assigned meanings.

StefanKarpinski added a commit that referenced this pull request Aug 11, 2015
@waldyrious
Copy link
Contributor

If I'm reading git diff v0.3.11 master -- src/julia-parser.scm correctly, the only ones added to 0.4 were ++, ÷= and .÷=. I would have added the latter to #12554 if I was sure, but I can open a new PR if desired. If so, please confirm I'm not missing anything else :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.