proposal: spec: pipe operator with explicit result passing #70826
Comments
The way I'm understanding the proposal is that the following example from the proposal text... a() |> f1(piped, b) |> f2(c, piped) ...would behave approximately like the following: (func () {
piped := a()
piped := f1(piped, b)
return f2(c, piped)
})() Does that match what you were intending? Do you expect that the |
Effectively, yeah. That exact code snippet isn't legal because you can't shadow a variable in the same scope, but if you could then that would be pretty much equivalent.
In this version, yeah, single-value only. I thought of a couple of ways to do multi-value, but none of them seemed sufficient. My original idea was to actually introduce a new operator to refer to the previous expression's result and that would work as a stand-in for multiple results, i.e. |
The point about shadowing I suppose the following is a more "honest" desugaring: // (this would be easier to write out with something like #21498 ...)
(func () WhateverF2Returns {
piped := (func () WhateverF1Returns {
piped := a()
return f1(piped, b)
}())
return f2(c, piped)
}()) That does, however, raise an interesting point about the design: This design does implicitly shadow the symbol So I guess the key question here would be: is the readability improvement of the overall feature sufficient to overcome the (admittedly subjective) readability decrease caused by an implicit scope created by some syntax that isn't like anything else in today's Go. 🤔 |
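As a rough illustration of the desugaring discussed above, here is a minimal sketch in today's Go. The functions a, f1, and f2 and their signatures are hypothetical stand-ins chosen only so the program compiles; the proposal does not define them.

```go
package main

import "fmt"

// a, f1, and f2 are hypothetical stand-ins for the functions in the
// proposal's example; only their shapes matter here.
func a() int { return 2 }

func f1(x, b int) int { return x * b }

func f2(c, x int) string { return fmt.Sprintf("%d:%d", c, x) }

// pipeline spells out a() |> f1(piped, b) |> f2(c, piped) in today's Go.
// Each function literal is its own scope, so the inner piped is a different
// variable from the outer one rather than a redeclaration.
func pipeline(b, c int) string {
	return func() string {
		piped := func() int {
			piped := a()
			return f1(piped, b)
		}()
		return f2(c, piped)
	}()
}

func main() {
	fmt.Println(pipeline(3, 7)) // a() = 2, f1(2, 3) = 6, f2(7, 6) = "7:6"
}
```

Note that every literal needs an explicit result type, which is one of the differences from the proposed operator mentioned in the thread.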
This is pretty much what happened to me, too. I tried to come up with syntaxes that would involve Another difference with the desugared version is that anonymous functions like that would need to have return types.
I don't think the scoping is a readability problem. That's just a technical detail. Thinking about it simply as " There are some potential complications if someone tries to do something with anonymous functions or change the order of operator precedence with parentheses. Something like f() |> func() {
f2() |> stuff(piped) // piped is f2(), not f().
}()
// or
f() |> (f2() |> stuff(piped)) |> f3(piped) In other words, someone explicitly creating a new scope inside of the pipeline could be a bit confusing. In the latter case, I think that's pretty simple to just not allow. Could give a In the former case, though, that could be a problem with calling a function that takes a function argument. For example, strings.SplitSeq(input, "\n") |>
xiter.Map(
func(v string) string { v + piped }, // This is weird.
piped,
) Maybe it's possible to not allow access to |
To be more specific about what I was concerned about with this implicit scope, some details:
I don't mean to say that any of what I've been saying is a showstopper. Just some reactions I had while trying to understand the meaning of this new syntax from your proposal trying to put myself in the shoes of someone who doesn't have this proposal text to help them understand what it means. I guess overall there's just a big tension here where the drive to make this as concise as possible unfortunately makes it quite unlike anything else in Go, because Go is not a very concise language overall. Making it less concise makes it less valuable as a language feature, because the conciseness is the whole point of this. In my earlier comment I alluded to #21498 in a throwaway code comment in my code example, but it occurs to me that if we did have a more concise anonymous function syntax then it might be more defensible to do something like the earlier proposal #33361, with the requirement that the rhs of There's still lots of debate over there about exactly which syntax to use for those functions, so I'll just pick one arbitrarily from comments near the end of that issue for an example: a() |>
|aResult| f1(aResult, b) |>
|f1Result| f2(c, f1Result) Or, the less contrived examples from the older proposal: s := strings.TrimSpace(x) |>
|trimmed| strings.ToLower(trimmed) |>
|lowered| strings.ReplaceAll(lowered, "/old/", "/new")
http.HandleFunc("/",
LogMiddleware(IndexHandler) |>
|m| SomeOtherMiddleware(m) |>
|m| RequireAuthMiddleware(m)
) I will concede right up front that I don't think this is substantially more readable than the original proposal, but it does at least avoid an implicit declaration of an arbitrary name in favor of having the author choose an argument name explicitly, and it supports expressions that generate more than one result. It also hopefully removes some of the mystery of what's going on for someone who was already familiar with the shorthand anonymous function syntax. (Some of the other candidates for lightweight anonymous function syntax include the |
I think the proposed syntax is still unreadable compared to a simple for loop: var ints []int
for s := range strings.SplitSeq(input, "\n") {
if n, _ := strconv.ParseInt(s, 10, 0); n > 0 {
ints = append(ints, int(n)) // ParseInt returns int64
}
} and of the syntaxes listed above, I think the intermediate variables one is more readable since each operation is more clearly marked. maybe I just don't mind naming the variables p0, p1, p2, ..., and letting the unused var checker catch failing to pass it to the next operation. and I don't see the point of |
The problem isn't trying to create a one-liner. It's the ability to edit the code later. Using intermediate variables is highly error-prone. For example, let's say I'm constructing some pipeline that needs to do a map and then a filter, and then I come back later and need to insert a new filter in between the existing map and filter. The following happens: mapped := xiter.Map(mapFunc, seq)
filtered := xiter.Filter(filterFunc, mapped)
// later
mapped := xiter.Map(mapFunc, seq)
filtered := xiter.Filter(insertedFilterFunc, mapped)
filtered = xiter.Filter(filterFunc, mapped) // Accidentally left mapped the same. It might seem trivial in an example like this, but in larger, more complicated pipelines it can be a real issue.
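To make the editing hazard concrete, here is a small sketch of how it could play out. Map, Filter, and the three stage functions are hypothetical slice-based stand-ins, not the xiter API; the point is only that the stale argument compiles without complaint because filtered is still used afterwards.

```go
package main

import "fmt"

// Map and Filter are hypothetical slice-based stand-ins for the xiter
// helpers used in the thread, with the function argument first.
func Map[T, U any](f func(T) U, in []T) []U {
	out := make([]U, 0, len(in))
	for _, v := range in {
		out = append(out, f(v))
	}
	return out
}

func Filter[T any](keep func(T) bool, in []T) []T {
	var out []T
	for _, v := range in {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}

func double(v int) int { return v * 2 }

func isPositive(v int) bool { return v > 0 }

func isSmall(v int) bool { return v < 10 }

// buggy inserts the isSmall stage but leaves the last stage reading mapped.
func buggy(seq []int) []int {
	mapped := Map(double, seq)
	filtered := Filter(isSmall, mapped)
	filtered = Filter(isPositive, mapped) // should read filtered
	return filtered
}

// fixed threads each stage's result into the next one.
func fixed(seq []int) []int {
	mapped := Map(double, seq)
	filtered := Filter(isSmall, mapped)
	filtered = Filter(isPositive, filtered)
	return filtered
}

func main() {
	seq := []int{-1, 2, 30}
	fmt.Println(buggy(seq)) // the isSmall stage is silently skipped
	fmt.Println(fixed(seq))
}
```

The compiler's unused-variable check does not help here, since every name is read at least once.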
That's a valid argument, but there are things that literally can not be done with the ints := func(yield func(int) bool) {
for s := range strings.SplitSeq(input, "\n") {
if n, _ := strconv.ParseInt(s, 10, 0); n > 0 {
if !yield(int(n)) { // ParseInt returns int64
return
}
}
}
} and instead of constructing a pipeline of map/filter/whatever function calls. That's essentially what I've been doing myself in my code because of how unergonomic such pipelines currently are. |
I like that. Even without anonymous functions, just changing it to require a manual naming of each pipeline stage is a good one. That also allows you to put the name on the next line, somewhat avoiding the problem with previous lines ending with One thing that could be confusing, though, is what exactly the scope of those names is if the naming is part of the |
Since I was imagining these as just normal anonymous functions, of course for me these symbols exist only inside the function just like any other function argument. That does admittedly mean that if you want to make additional use of some value at a later pipeline stage then that would need to be done manually somehow, rather than the symbol automatically being in scope. None of the motivating examples we discussed so far required that sort of thing, so I didn't try to solve for it. My initial instinct is that if you need more than just directly propagating the results from one expression into the directly-following expression then you should probably use a different technique so you can manage the symbols more explicitly. This proposal is already in a tricky part of the concise vs. readable vs. maintainable tradeoff and so I expect that complicating it any further would topple the scale. |
Thanks for the well thought out and well described proposal. One of the goals of Go is that Go programs are mostly comprehensible to people unfamiliar with Go. The code in this proposal doesn't seem to meet that criterion: the new This proposal introduces a new name, Most importantly, this doesn't handle errors. Many Go functions return both a value and an error, and there is no way to handle such functions with this operator. That is, this might just apply to special cases unusable by much Go code. That makes the relative obscurity of the construct worse, in that it will be rare. The problem this is solving is writing chains of functions using lists of variables (we agree that packing everything into a single expression is difficult to read). That problem does not seem important enough, and common enough, to deserve a fairly complex new language construct. For these reasons, this is a likely decline. Leaving open for four weeks for final comments. |
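The error-handling concern can be illustrated with a short sketch. parsePort is a hypothetical example function, not something from the proposal; it shows the kind of two-result stage that, as far as the thread describes the single-value design, could not sit in the middle of a pipeline.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePort trims its input and parses it as a port number. Under the
// proposal, strings.TrimSpace(s) |> strconv.Atoi(piped) could not feed a
// further stage, because Atoi produces two results and the thread describes
// this version of the operator as single-value only.
func parsePort(s string) (int, error) {
	trimmed := strings.TrimSpace(s)
	port, err := strconv.Atoi(trimmed)
	if err != nil {
		return 0, fmt.Errorf("parse port %q: %w", trimmed, err)
	}
	if port < 1 || port > 65535 {
		return 0, fmt.Errorf("port %d out of range", port)
	}
	return port, nil
}

func main() {
	fmt.Println(parsePort(" 8080 "))
	_, err := parsePort("http")
	fmt.Println(err != nil)
}
```

Each stage here needs its own error check between calls, so the intermediate variables cannot simply be collapsed into a chain.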
Thank you.
It is something not particularly common in Go-like languages, that's true, but it's not that uncommon in and of itself. Elm, Elixir, and OCaml all have very similar operators, and the operator itself was originally based on shell pipes, something that many people are familiar with regardless of the programming languages that they primarily use.
There was some previous discussion about this, including a suggestion to explicitly name the previous value (#70826 (comment)). Although, at that point I feel like an alternative proposal that just simply allows you to shadow variables in the same scope that they were declared in would be a better idea. For example, iter := data()
^iter := xiter.Map(iter, func(v int) float64 { return float64(v / 2) })
^iter := xiter.Filter(iter, func(v float64) bool { return v >= 1 })
process(iter) There are a few downsides to that, though, including that it can't be a single expression, which is probably not actually a problem, and that shadowing variables can have unexpected side-effects if they're captured somewhere. Languages that let you do that usually are either immutable or encourage immutability. It might be possible to work around that by simply limiting the usage, though, i.e. you can't shadow a variable that has been captured in between its declaration and the attempt to shadow it. Or maybe some way to create immutable variables will be added in the future and they will just allow shadowing directly. I might put together a proposal for that based on an older comment I made in another issue somewhere that involved a special naming convention for immutable variables similar to how exporting works now so that you could see at a glance if a variable was immutable or not. Iterators would work well as immutable variables since they're just function values and thus don't have any internal state of their own.
I think that the previous suggestion might also solve this problem, since you could just handle errors like normal. |
@DeedleFake Have you considered tweaking the original function composition operator instead? // Given that f, g, and h are all functions...
fgh := f | g | h
// The above statement would be transformed into this pseudocode:
fgh := func(...inputs) ...outputs {
return h(g(f(inputs...)))
} Instead of executing immediately, This definition has the advantage that it requires no changes to the parser - a new case simply needs to be added to the Then the example in your proposal would become: // Note: The xiter functions presented here have different definitions
// than the ones in the original proposal:
func xiter.Map[I, O any](func(I) O) func(iter.Seq[I]) iter.Seq[O]
func xiter.Filter[T any](func(T) bool) func(iter.Seq[T]) iter.Seq[T]
getInts := strings.SplitSeq |
xiter.Map(func(v string) int {
n, _ := strconv.ParseInt(v, 10, 0)
return int(n)
}) |
xiter.Filter(func(v int) bool { return v > 0 })
ints := getInts(lines, "\n") |
Composing functions without executing them is an interesting idea. I sure hope, though, that it wouldn't encourage folks to write something like this: ints := (
strings.SplitSeq |
xiter.Map(func(v string) int {
n, _ := strconv.ParseInt(v, 10, 0)
return int(n)
}) |
xiter.Filter(func(v int) bool { return v > 0 })
)(lines, "\n") Perhaps it's just me, but I don't really find that any more readable than the nested form this proposal was presented as an alternative to. In particular, it took me quite some staring at this code (and, for that matter, the example with a separate This does of course also still have the question of how one would actually handle errors. In contrived examples like this it's easy to just ignore the (I realize the above is not what was proposed, but it seems like the above would be valid if that proposal were accepted.) |
I agree that none of the examples presented so far are particularly convincing. Let me give it a shot: Given a preexisting file that looks something like this: [user1]
owner=root
files=hello.txt
other_metadata=...
[user2]
owner=groot
files=a.txt,b.txt
[user3]
owner=noot
files=hello.txt,c.txt If we want to parse this into a deduplicated list of filenames, normally we'd have to write something like this: fileSet := map[string]struct{}{}
for _, line := range strings.Split(data, "\n") {
k, v, _ := strings.Cut(line, "=")
if k != "files" {
continue
}
for _, file := range strings.Split(v, ",") {
fileSet[file] = struct{}{}
}
}
sortedFiles := slices.Sorted(maps.Keys(fileSet))
// Do something with sortedFiles... However, with the function composition operator import . "golang.org/x/exp/shell"
var parseFilenames = strings.Lines | Grep("^files=") | Cut("-d= -f2-") | Grep("-oE '[^,]+'") | Sort("-u") | slices.Collect
func main() {
sortedFiles := parseFilenames(data)
// Do something with the files
}
// Note: signatures would be something like:
package shell
func Grep(args string) func(iter.Seq[string]) iter.Seq[string] |
No change in consensus. |
A small demo that implements pipe with xiter: |
Go Programming Experience
Experienced
Other Languages Experience
Elixir, JavaScript, Ruby, Kotlin, Dart, Python, C
Related Idea
Has this idea, or one like it, been proposed before?
Yes, several times. This variant directly addresses the main issues brought up in those proposals.
Does this affect error handling?
Not directly, though it possibly could in some cases.
Is this about generics?
Not directly, though it addresses a situation that has arisen as a result of generics.
Proposal
When a pipe operator has been proposed before (#33361), the primary issues with it were
I think that the first point is arguable as a reason not to consider a feature, but more importantly I think that the situation there has changed and I think that #49085's continued discussion is good evidence that some way to fix the issue that a pipe operator would address is a very popular idea. With generics being added and now iterators, too, some way to write chains of function/method calls in a left-to-right or top-to-bottom manner has, I think, gained a fair bit of usefulness that it didn't have back in 2019 (#33361). Simply using methods like many other languages, such as Rust, do, has a lot of problems that have been pointed out in the above issue, but functions have none of those problems. Their only issue in this regard is simply syntactic.
Points 2 and 3, however, I think are very solvable in a simple way: Add a special variable that is defined for the scope of each piece of the pipe that contains the value of the previous expression instead of magically inserting function arguments. For example, assuming that the bikeshed is painted piped: a() |> f1(piped, b) |> f2(c, piped)
The first |> operator creates a new scope that exists only for the expression immediately to its right, in this case a call to f1(). In that scope, it defines a variable, piped, containing the result of the expression to its left, in this case just a(). The second |> operator creates a new scope that shadows the existing piped, introducing a new piped variable containing the result of the expression to its left, in this case a() |> f1(piped, b). And so on with a longer pipeline.
This completely fixes problem 2, as it now makes piping extremely explicit. It mostly fixes problem 3 as it makes the operator significantly more flexible, reducing the need for writing APIs specifically to accommodate it. I think this is not completely solvable, though, as, at some point, someone will always write something that they probably shouldn't have.
It also allows the pipe operator to become non-exclusive to function calls. Any single-value expression now becomes valid at any point in a pipeline, allowing even things like
For a more practical example, here's some iterator usage:
Side note: I'm not a huge fan of needing to put the |> operator at the end of a line. I think Elixir's way of doing it with the operator at the beginning of each of the subsequent lines looks way better. Unfortunately, Go's semicolon insertion rules kind of make this necessary unless someone can come up with a way to do it that doesn't involve special-casing the |> operator, which I definitely think would be unnecessary. For comparison's sake, here's that same iterator chain written the other way around:
Language Spec Changes
A section would have to be added about the |> operator. It shouldn't directly affect any existing parts of the spec, I don't think.
Informal Change
The |> operator allows expressions to be written in a left-to-right manner by implicitly passing the result of one into the next in the form of a variable called piped that is scoped only to the right-hand side of each usage of |>, shadowing any existing variables named piped in parent scopes, including previous |> usages in the same pipeline.
Is this change backward compatible?
Yes.
Orthogonality: How does this change interact or overlap with existing features?
It allows a compromise between adding generic types in method calls (#49085) and function calls having poor ergonomics for certain use cases.
Would this change make Go easier or harder to learn, and why?
Slightly harder as the idea of the specially-scoped variable and its automatic shadowing of its counterparts in previous pipeline stages would have to be explained.
Cost Description
Tiny compile-time cost. No runtime costs. Slight increase in language complexity. Slight increase in potential for poorly written code as some people might misuse the operator.
Changes to Go ToolChain
All tools that parse Go code would have to be updated. gofmt and goimports would be affected the most.
Performance Costs
Compile-time cost is minimal. Runtime cost is nonexistent.
Prototype
No response