This repository has been archived by the owner on Apr 13, 2023. It is now read-only.

add F#-style "pipe" operator |> #6615

Closed
gavinking opened this issue Oct 18, 2016 · 91 comments

Comments

@gavinking
Contributor

gavinking commented Oct 18, 2016

Several times in the past, users (including me!) have requested an F#-style pipe operator, which accepts any type T as the left operand, and a unary function type F(T) as the right operand.

val |> fun

would be an abbreviation for:

let (_ = val) fun(_)

Thus, you could write print("hello") as "hello" |> print.

If I'm not mistaken, |> is naturally left-associative.

I'm now favorably-disposed toward this proposal, and the syntax seems viable. It looks like it can be implemented by desugaring and need not impact the backends in the initial implementation.

An open question is: is foo |> bar considered a legal statement? Can I write this:

void hello() {
    "hello" |> print;
}

I would say that we should accept this code.

Discussion on Gitter led to me proposing two additional variations of this operator, by analogy to the existing ?. and *. operators.

  • maybe ?|> fun would propagate nulls from left to right, being equivalent to if (exists _=val) then fun(_) else null
  • tuple *|> fun would spread a tuple over the parameters of an arbitrary-arity function, being equivalent to let (_=val) fun(*_)

(Note that these operators could also be easily defined in terms of |> and a higher-order function. For example, tuple *|> fun is equivalent to tuple |> unflatten(fun).)
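
For readers who don't know Ceylon, the three desugarings above can be modelled in a few lines. This is only a sketch in Python, not Ceylon and not the proposed implementation: the helper names pipe, maybe_pipe and spread_pipe are hypothetical, None stands in for Ceylon's null, and times here is a local stand-in for the function of the same name used later in the thread.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def pipe(val: T, fun: Callable[[T], R]) -> R:
    """Model of `val |> fun`: evaluate val once, then call fun on it."""
    return fun(val)


def maybe_pipe(val: Optional[T], fun: Callable[[T], R]) -> Optional[R]:
    """Model of `maybe ?|> fun`: skip the call and propagate None."""
    return None if val is None else fun(val)


def spread_pipe(tup: tuple, fun: Callable[..., R]) -> R:
    """Model of `tuple *|> fun`: spread the tuple over fun's parameters."""
    return fun(*tup)


def times(x: int, y: int) -> int:
    """Local stand-in for a two-argument function."""
    return x * y


# "hello" |> len
length = pipe("hello", len)
# x ?|> ((x) => times(x, 5)), once with a value and once with null
present = maybe_pipe(3, lambda x: times(x, 5))
absent = maybe_pipe(None, lambda x: times(x, 5))
# [3, 5] *|> times
product = spread_pipe((3, 5), times)
```

A multi-argument function only fits ?|> through a wrapping lambda, which is the shape of the objection raised in the replies.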

Both of these are useful—I've wanted something like ?|> many times when writing real Ceylon code—but the objection was raised by @someth2say that they're too ascii-arty for this language. That's a reasonable objection, and we should give it some weight.

Feedback?

@FroMage
Contributor

FroMage commented Oct 18, 2016

As usual, I'm skeptical about introducing new operators that don't exist or don't have the same syntax as in other languages (or do they?), because although their meaning has a justification, they're still cryptic for newcomers.

I'm also skeptical about their value. I don't think I've ever wanted them. As a comparison I've wanted the .. operator in #4049 a lot (an order of magnitude) more often (at least in Java), which is similar except it sticks to a single return value.

I'm not opposed to them, just very skeptical, especially when foo() |> bar is much clearer as bar(foo()) IMO. Similarly for foo() |*> bar versus bar(*foo()), I don't see the point. Unless the tuple can be null and it behaves like |*?> and then my next argument applies to it.

I do believe foo() |?> bar has value over if (exists f = foo()) then bar(f) else null, but I am afraid single-argument functions are the exception and not the norm, so if it's the second argument that can be null and should avoid the function call, it's useless.

I'm not opposed to those operators, but I'm not convinced at all yet. Just waiting to be convinced ;)

@jvasileff
Contributor

jvasileff commented Oct 18, 2016

I don't have an opinion about |> yet. I've always thought about a compose operator, which I've wanted at times.

But I do think ?|> would be GREAT. I'm constantly wanting to map over optionals to avoid the if (...) then ... else null pattern.

but I am afraid single-argument functions are the exception

If I'm not mistaken, the multi-argument use would look like:

Integer? x = ...;
Integer? y = x ?|> ((x)=>times(x, 5));

@gavinking
Contributor Author

gavinking commented Oct 18, 2016

@FroMage My feeling is that we would introduce |> as a first step. I don't think I would add ?|> and *|> initially, since I kinda agree with the thinking that they're potentially cryptic.

I agree that bar(foo()) is clearer than foo() |> bar, however, I don't think that reasoning holds when it's

bar(foo(baz(fee(fi(fo(fum))))))

vs

fum |> fo |> fi |> fee |> baz |> foo |> bar

@bjansen
Contributor

bjansen commented Oct 18, 2016

I think foo() |> bar might be too simplistic to consider this operator as useful. What Gavin didn't mention is that it would be chainable. Consider the following example:

request |> parseParameters |> validateParameters |> doStuff |> writeResponse

vs

Request request = ...;

value params = parseParameters(request);
value paramsAgain = validateParameters(params);
value output = doStuff(paramsAgain);
writeResponse(output);

@FroMage
Contributor

FroMage commented Oct 18, 2016

fum |> fo |> fi |> fee |> baz |> foo |> bar

OK fine, true that's better, but again assumes single arguments is common.

Integer? y = x ?|> ((x)=>times(x, 5));

OK but if you want to chain this more, it's really starting to smell, especially if you have a chain of methods with heterogeneous numbers of arguments.

@gavinking
Contributor Author

OK fine, true that's better, but again assumes single arguments is common.

Well that's the point of *|> if it wasn't clear. You could write stuff like:

[foo, bar] *|> fun |> otherFun

But sure, this is optimized for the case of one argument, that's for sure.

@bsideup

bsideup commented Oct 18, 2016

assumes single arguments is common

Until you start working with RxJava, Streams API, Spark, or any other functional API :)

@someth2say
Contributor

I can't sleep thinking about how powerful this can be, and how much I dislike the |> syntax.
I've been thinking that the purpose of this is to be able to define the "calling" of a "sequence" of functions with some initial parameter(s).
This almost naturally drives me to use two constructs already known in the language: sequences and method calls.
I propose the following syntax:
{ fo ; fi ; fee ; baz ; foo ; bar }(fum)
where fo...bar are the functions, and fum is the initial value.
This syntax clearly expresses the order in which the functions are applied, is compatible with ? and * and, IMHO, is more Ceylonic than ascii-art operators.
Also, I can see many other advantages for this syntax:

  • The body can be understood as a function definition itself, so it can be directly used for functions:
    function chain => { foo ; bar ; baz }
    Yes, you can also do the same with |>, but it does not look that good to me.
  • Not only a single initial value can be used, but anything like a function parameter list:
    { times ; Integer.string } (x , y)
    In fact { f1 ; ... ; fn } is a function that accepts the same parameters as f1 and returns the same type as fn.

Thoughts?

@someth2say
Contributor

Now that I think a bit more about this, I found the ? prefix does not smoothly fit with multiple parameters. { ?foo; bar } (x, y)

? usually means "if the input exists, then use it for the following". But when multiple parameters can be used, the input will be a collection of parameters, not just a single one. And this collection will always exist (that is, it can never be null).
In edge cases, the parameter collection may be empty, contain a single null parameter, or contain many null parameters. None of those cases really matches the current ? semantics.

Anyway, as we are proposing the syntax, we can also adapt the semantics to our needs.
Some options are:

  1. Always refuse ? on the first function. I don't really like it, but it is feasible.
  2. Accept ? on the first function iff it only accepts a single nullable parameter. ? will then check only this parameter.
  3. Accept ? on the first function, and redefine it to check non-nullness for all parameters.
    I slightly prefer 3), but will accept 2).

@simonthum

I'd like to remind people that a compose-like operator, as @jvasileff mentioned, would give 99% of the expressive power with minimal language changes - there are several operators you can't use on functions waiting for reuse.

I also wrote a compose function that composes functions of a single type, and its reverse case, quite trivially:

"Chain homogeneous functions in order last to first called (or outer to inner calls)."
shared X(X) chain<X>(X(X)+ functions)
        => functions.reduce(compose<X, X, [X]>);

If some type wizards could extend it to accept X(Y) this issue would boil down to an SDK call. But I doubt that's feasible, or only as compose2, compose3, compose4... which I find bad enough to warrant an operator. I think * or + would be fine:

function chain => foo + bar + baz;
(foo + bar + baz)(arg);

If I'm not mistaken, spreading would happen naturally.

@welopino

welopino commented Oct 24, 2016

It would be more effective to have monadic for - than to have the pipe operator (which is only for single argument functions). If at all, it is necessary to have a right associative version of concatenation for the creation of immutable data structures. But I know that you will not accept this proposal, because it make a whole bunch of addition to ceylon necessary or at least desirable: Implicits, no constraint of e.g. Summable interface (chain different types), right associativity, ascii art (there is no justification against ascii art, e.g.why is there a fixed set of operators for any class??), operator precedence definition (I can really understand than you dont like that - it makes a language a horror) ... have fun - but you will not have, and for some reason it is good so.

@gavinking
Contributor Author

gavinking commented Oct 24, 2016

@simonthum the source of the discomfort with using + or * to represent function composition is that functions don't form a semigroup. OTOH, as you've observed, functions of matching input and output type do form a semigroup (a monoid, even), so the notation would be reasonable in that case. But note that the pipe |> doesn't demand that input and output types be the same, which makes it a lot more generally-useful for representing a sequence of data transformations.
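
The monoid remark can be checked concretely. The following is a Python sketch (not Ceylon, and not the chain function quoted earlier, which composes last-to-first): functions from a type to itself are combined with a hypothetical then helper, identity is the unit, and a fold over them gives a chain. The names then, identity and chain are assumptions made for illustration.

```python
from functools import reduce
from typing import Callable, TypeVar

X = TypeVar("X")


def identity(x: X) -> X:
    """Unit of the monoid: composing with it changes nothing."""
    return x


def then(f: Callable[[X], X], g: Callable[[X], X]) -> Callable[[X], X]:
    """Binary operation of the monoid: apply f first, then g."""
    return lambda x: g(f(x))


def chain(*functions: Callable[[X], X]) -> Callable[[X], X]:
    """Fold any number of same-typed functions, first to last called."""
    return reduce(then, functions, identity)


def inc(n: int) -> int:
    return n + 1


def double(n: int) -> int:
    return n * 2


def square(n: int) -> int:
    return n * n


# Grouping does not matter (associativity), and identity is neutral.
left_grouped = then(then(inc, double), square)
right_grouped = then(inc, then(double, square))
```

Because every function has the same input and output type, the fold is always well-typed. A pipeline whose stages change type at each step has no single such type, which is the limitation the comment points at.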

@gavinking
Contributor Author

gavinking commented Oct 24, 2016

Hi, @welopino

the pipe operator (which is only for single argument functions)

Well, that's not quite right: the proposal I've presented above is not just for single-parameter functions. Though, of course, it's most natural in that case.

But I know that you will not accept this proposal, because it make a whole bunch of addition to ceylon necessary or at least desirable:

No, that's not right at all. Ceylon already has higher-order generics, so the type Category (and even Functor/Monad) can be easily represented in our type system. But it doesn't seem to me that that's necessary in order to solve the problem that we're looking at here. It looks like overkill, frankly.

Implicits

I don't see what "implicits" have to do with this at all. The problem of abstracting over the notion of "composition" at a high level is a job for higher-order generics, not for implicit type conversions.

no constraint of e.g. Summable interface (chain different types)

What you're trying to describe here is the higher-order generic type Category, it seems to me. We could certainly add such a type, but I don't see how that really solves the immediate problem.

ascii art (there is no justification against ascii art, e.g.why is there a fixed set of operators for any class??), operator precedence definition (I can really understand than you dont like that - it makes a language a horror) ... have fun - but you will not have, and for some reason it is good so.

Well you went all ranty at the end here, and I can't tell what this has to do with the issue we're discussing.

@gavinking
Contributor Author

There has been some discussion between @someth2say, @luolong, and myself on the gitter channel. Both @simonthum and @someth2say have been pushing in the direction of a syntax for function composition, instead of the F#-style "piping" at the value level.

It seems to me, however, that |> can do both. It looks like |> is a perfectly well-defined associative binary operator meaning:

  • application, if the LHS is of type X and the RHS is of type Y(X), and
  • composition, if the LHS is of type Y(X) and the RHS is of type Z(Y).

I can't seem to be able to construct any case where this would be ambiguous and/or non-associative right now. Perhaps I'm being dense, and there's something obvious that I'm missing.

@gavinking
Contributor Author

there's something obvious that I'm missing

Well, OK, there's this one:

  • LHS: Object(Object)
  • RHS: Object(Object)

That case is indeed ambiguous :-/
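
To see why that case is ambiguous, here is a small Python model (not Ceylon) with the two readings written as separate, hypothetically named functions. The describe function plays the role of an Object(Object): it accepts anything, including another function.

```python
from typing import Any, Callable


def apply_reading(lhs: Any, rhs: Callable[[Any], Any]) -> Any:
    """Read `lhs |> rhs` as application: rhs(lhs)."""
    return rhs(lhs)


def compose_reading(lhs: Callable[[Any], Any],
                    rhs: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Read `lhs |> rhs` as composition: x -> rhs(lhs(x))."""
    return lambda x: rhs(lhs(x))


def describe(obj: object) -> object:
    """Stand-in for an Object(Object): accepts any value at all."""
    return f"<{type(obj).__name__}>"


# Both readings are well-typed for describe |> describe,
# yet they produce different kinds of result.
as_application = apply_reading(describe, describe)    # a plain value
as_composition = compose_reading(describe, describe)  # a new function
```

With a narrower parameter type on the right-hand side, only one of the two readings type-checks, so the overload resolves itself. The clash needs a right-hand side whose parameter type also admits functions.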

@someth2say
Contributor

Well, in fact, any case where:

  • LHS: Callable(A,B)
  • RHS: Callable(X,Y) given Y satisfies Callable(A,B)
    In other words, if RHS accepts as Y a parameter, and LHS satisfies Y, we have an ambiguity.

I can see several approaches for solving the ambiguity:

  • Use the ( X |> Y |> Z)(param) syntax (yes, I know I am being boring). This implies |> will never mean application, only composition.
  • Use the facts that 1) this can only happen in the first position of the |> chain, and 2) the LHS must satisfy Callable for the ambiguity to arise.
    If this situation is found, then evaluate |> as one of them (i.e. application).
    If this happens, then just force composition (this may be tricky for the typechecker)
  • Disallow using a Callable in the first position of the |> chain.
    This will disallow things like function comp(Object param) => param |> X |> Y |> Z, but we can live with it.

@gavinking
Contributor Author

In other words, if RHS accepts as Y a parameter, and LHS satisfies Y, we have an ambiguity.

Well, yes, of course. And can you think of any type other than Object or Anything that would satisfy that? I can't.

@jvasileff
Contributor

... |> print does seem useful though.

A couple other interesting examples:

class X() satisfies X(X) {}
Float f(X g(X x)) => 1.0;
Float | Float(X) whatIsIt = X() |> f;

(similar for class X() satisfies X() {})

and this, which is an extension of the Object(Object) example:

Anything(String)(String) a => nothing;
String(Anything(String)) b => nothing;
String | String(String) whatIsIt = a |> b;

@someth2say
Contributor

Interesting.
First example:
Assuming you can satisfy Callable (currently you can't), you are in the ambiguity previously described.
You can't decide if X (the Callable) or X (the parameter) will be the parameter for f.

Second example:
Can be rewritten as:

alias SToA => Anything(String);
SToA a(String str) => nothing;
String b(SToA stoa) => nothing;

So in this example, desugaring the |> operator, you have (allow me to abuse <=>)

a |> b <=> b(a) <=> String(SToA(String)) <=> String(String)

So whatIsIt is a String(String).
Quoting myself:
( f1 |> ... |> fn)

is a function that accepts the same parameters as f1 and returns the same type as fn.

Agreed, I've cheated a bit here, creating the SToA alias to avoid currying a, but it is clearer this way, IMHO.

@xkr47
Contributor

xkr47 commented Nov 1, 2016

just gotta say I don't especially like |> .. it would be nice to use unicode characters....

@luolong

luolong commented Nov 1, 2016

Unicode characters are notoriously difficult to type...

@arseniiv

arseniiv commented Nov 1, 2016

@xkr47 I like unicode too, but it seems time for exclusive non-ASCII tokens in a programming language has not yet arrived. However, non-ASCII tokens could be alternate forms of ASCII ones: for example, Haskell has a compiler extension (and a source code directive) for that.

@lucaswerkmeister
Contributor

I actually proposed support for ∨ ∧ ∀ ∃ ∈ ≤ ≠ etc. a while ago ;) I wasn’t entirely joking, but sadly nothing came of it.

@gavinking
Contributor Author

gavinking commented Apr 16, 2018

Hrm, so I suppose it would be possible to get rid of the parens around anonymous functions, and write

void fun()
        => "hello world"
        |> String.size
        |> (len) => Integer.format(len, 16)
        |> print;

If we parsed anonymous function bodies with a slightly higher precedence (higher than assignments and |>). Not sure if that’s a good idea, however. It means that => would no longer have the same precedence everywhere in the language. In particular it means that anonymous function bodies would have a different grammar from regular functions. You couldn't write: do((x) => y = x). (Though of course you could still write do((x) { y = x; }) which is almost the same number of characters and arguably clearer.)

@gavinking
Contributor Author

gavinking commented Apr 16, 2018

If we parsed anonymous function bodies with a slightly higher precedence (higher than assignments and |>). Not sure if that’s a good idea, however.

Well, actually, I've managed to do a bit better than that. I've hacked it so that |> has a lower precedence than almost everything, including assignments and anonymous functions. That means that the only "weirdness" is that you can use |> in the body of a regular function or value declaration or in a specification statement, but not in an anonymous function or an assignment expression (unless you wrap it in parens).

That actually feels pretty reasonable to me....

@gavinking
Contributor Author

This is now working well enough that you folks can try it out, if you like. For example, the following code:

shared void run() {
    "hello world how are you today, I'm doing great !!!! xxxxxyyy"
            |> String.size
            |> (Integer i) => Integer.format(i, 16) 
            |> String.uppercased 
            |> String.trimmed 
            |> print;
}

prints 3C with both ceylon run and ceylon run-js.

@gavinking
Contributor Author

gavinking commented Apr 17, 2018

So I'm still not completely finished with this. The following issues remain:

  1. While it's certainly convenient to be able to write
    ... |> (Integer i) => Integer.format(i, 16) |> ... 
    the truth is that the resulting grammar isn't completely "clean", not in a way that you'll really notice as a user of the language, but definitely in a way that will bother me when I have to write it down in the spec. I'm inclined to think it's probably better to have a clean grammar, at the cost of having to leave in the parentheses in
    ... |> ((Integer i) => Integer.format(i, 16)) |> ...
    So I'm thinking of rolling back some of the work I did yesterday.
  2. I want type inference for anonymous function parameters in pipelines. What that boils down to is adding parameter type inference for immediately-invoked anonymous functions, stuff like
    ((i) => Integer.format(i, 16))(100)
    We've never needed that before because there was never a good reason to define an anonymous function and immediately call it. Now, sure, infer anonymous fun parameter type from usage #7353 gets us part of the way there, but it definitely doesn't work in all cases. Fortunately this looks pretty easy to implement (though perhaps if I'm going to do it, I should bite off parameter type inference for anonymous functions in assignments #7058 at the same time).

@xkr47
Contributor

xkr47 commented Apr 17, 2018

the truth is that the resulting grammar isn't completely "clean"

Yeah despite the cleaner look, it only feels natural when you split the |> on different lines. If you one-linify the whole statement then it feels strange that the following |> would not be part of the preceding (Integer i) => Integer.format(i, 16) function.

@xkr47
Contributor

xkr47 commented Apr 17, 2018

And of course the normal-precedence version would be auto-indented like this:

shared void run() {
    "hello world how are you today, I'm doing great !!!! xxxxxyyy"
            |> String.size
            |> (Integer i) => Integer.format(i, 16) 
                |> String.uppercased 
                |> String.trimmed 
                |> print;
}

.. revealing the precedence thought error
.. while funnily enough still producing the exact same end result 😁

@gavinking
Contributor Author

while funnily enough still producing the exact same end result

ASSOCIATIVITY FTW!!!!11!1
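
The two parses discussed above can be compared directly. This Python sketch (not Ceylon) uses a hypothetical variadic pipe helper and a to_hex stand-in for Integer.format(i, 16). The flat form treats every |> as a stage of the outer chain, while the nested form lets the anonymous function's body swallow the stages that follow it.

```python
from typing import Any, Callable


def pipe(val: Any, *funs: Callable[[Any], Any]) -> Any:
    """Model of val |> f1 |> f2 |> ...: feed val through each stage in turn."""
    for fun in funs:
        val = fun(val)
    return val


def to_hex(i: int) -> str:
    """Stand-in for Integer.format(i, 16): lowercase base-16 digits."""
    return format(i, "x")


text = "hello world how are you today, I'm doing great !!!! xxxxxyyy"

# Flat parse: the lambda is one stage among several.
flat = pipe(text, len, lambda i: to_hex(i), str.upper, str.strip)

# Nested parse: the lambda body contains the rest of the chain.
nested = pipe(text, len, lambda i: pipe(to_hex(i), str.upper, str.strip))
```

Both forms yield the same string for the 60-character input, because regrouping the stages of a pipeline does not change what is applied or in what order.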

gavinking added a commit that referenced this issue Apr 17, 2018
- higher precedence for |>
- commit to desugaring-based impl
@gavinking
Contributor Author

So I'm thinking of rolling back some of the work I did yesterday.

Done. And I threw in an impl of #3229, the >|> operator.

I want type inference for anonymous function parameters in pipelines.

Still TODO.

gavinking added a commit that referenced this issue Apr 18, 2018
@gavinking
Contributor Author

gavinking commented Apr 18, 2018

I've now also added <|< and <|, but what direction should they associate in?

  • for <|< it doesn't matter, since function composition is truly associative, but
  • for <| it does:
    • if it's right-associative, it's just the same as |>, but lets you write your chain in the opposite direction, value-last (not really very useful, IMO)

    • if it's left-associative, it has a completely different usecase: sending multiple values to a consuming function, for example:

       write <| "hello" <| "world";
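
The difference between the two associativities can be sketched as two folds. This is a Python model, not Ceylon, and it rests on an assumption the thread does not spell out: for the left-associative reading to make sense, the consuming function is taken to be curried, so that each <| supplies one more argument. The names back_pipe_right, back_pipe_left and the curried write below are hypothetical.

```python
from functools import reduce
from typing import Any, Callable


def back_pipe_right(*operands: Any) -> Any:
    """Right-associative: f <| g <| x  ==  f <| (g <| x)  ==  f(g(x))."""
    *funs, value = operands
    return reduce(lambda acc, fun: fun(acc), reversed(funs), value)


def back_pipe_left(fun: Callable[[Any], Any], *values: Any) -> Any:
    """Left-associative: f <| a <| b  ==  (f <| a) <| b  ==  f(a)(b)."""
    return reduce(lambda f, v: f(v), values, fun)


def write(first: str) -> Callable[[str], str]:
    """Assumed curried consumer: takes one value, then the next."""
    return lambda second: f"{first} {second}"


# Right-associative: the same pipeline as |>, written value-last.
shouted = back_pipe_right(str.upper, str.strip, "  hi ")

# Left-associative: several values sent to one consumer.
message = back_pipe_left(write, "hello", "world")
```

Under the right-associative fold, write <| "hello" <| "world" would try to call "hello" as a function, which is why the two choices serve different use cases.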
      

@gavinking
Contributor Author

gavinking commented Apr 18, 2018

Parameter type inference is done for |>. But I should also add it for >|>.

gavinking added a commit that referenced this issue Apr 18, 2018
not sure what associativity they should have
gavinking added a commit that referenced this issue Apr 18, 2018
gavinking added a commit that referenced this issue Apr 18, 2018
- add pipeline operator |> #6615
- add fish operator >|> operator for #3229
- parameter type inference for anonymous functions in assignments #7058
- parameter type inference for immediately-invoked anonymous functions
@gavinking
Contributor Author

I've merged this work to master. Still need to mention |> in the spec.

gavinking added a commit that referenced this issue Apr 18, 2018
@gavinking
Contributor Author

Added to spec. Closing this issue, but please reopen if you run into any serious problem with this work.

@gavinking gavinking modified the milestones: 1.4, 1.4.0 beta Apr 18, 2018
@xkr47
Contributor

xkr47 commented Jun 8, 2018

@lucaswerkmeister
Contributor

Ceylon used to have something like that, in the ancient past, where a b meant a.b, and a b c meant a.b(c), called “operator-style expressions”. We killed it ages ago (8c713a1), because it was a syntactical nightmare.

@xkr47
Contributor

xkr47 commented Jun 8, 2018

yeah.. interesting thread.. this would have aliased a bit differently but I guess same/similar problems would arise.
