Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: spec: variadic type parameters #66651

Closed
ianlancetaylor opened this issue Apr 2, 2024 · 163 comments
Closed

proposal: spec: variadic type parameters #66651

ianlancetaylor opened this issue Apr 2, 2024 · 163 comments
Labels
dotdotdot ... LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Milestone

Comments

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Apr 2, 2024

Proposal Details

Background

There are various algorithms that are not easy to write in Go with generics because they are naturally expressed using an unknown number of type parameters. For example, the metrics package suggested in the generics proposal document is forced to define types, Metric1, Metric2, and so forth, based on the number of different fields required for the metric. For a different example, the iterator adapter proposal (https://go.dev/issue/61898) proposes two-element variants of most functions, such as Filter2, Concat2, Equal2, and so forth.

Languages like C++ use variadic templates to avoid this requirement. C++ has powerful facilities to, in effect, loop over the variadic type arguments. We do not propose introducing such facilities into Go, as that leads to template metaprogramming, which we do not want to support. In this proposal, Go variadic type parameters can only be used in limited ways.

Proposal

A generic type or function declaration is permitted to use a ... following the last type parameter of a type parameter list (as in T... for a type parameter T) to indicate that an instantiation may use zero or more trailing type arguments. We use T... constraint rather than T ...constraint (that is, gofmt will put the space after the ..., not before) because T is a list of types. It's not quite like a variadic function, in which the final argument is effectively a slice. Here T is a list, not a slice.

We permit an optional pair of integers after the ... to indicate the minimum and maximum number of type arguments permitted. If the maximum is 0, there is no upper limit. Omitting the numbers is the same as listing 0 0.

(We will permit implementations to restrict the maximum number of type arguments permitted. Implementations must support at least 255 type arguments. This is a limit on the number of types that are passed as type arguments, so 255 is very generous for readable code.)

type Metric[V... 1 0 comparable] /* type definition */
func Filter[K any, V... 0 1 any] /* function signature and body */
func Filter[K, V... 0 1 any]     /* same effect as previous line */

With this notation V becomes a variadic type parameter.

A variadic type parameter is a list of types. In general a variadic type parameter may be used wherever a list of types is permitted:

  • In a function parameter or result list
    • func SliceOf[T... any](v T) []any
    • a variadic type parameter may not be used as the type of a regular variadic parameter
  • In a variable declaration, to define a variadic variable.
    • func PrintZeroes[T... any]() { var z T; fmt.Println(z) }
  • In a struct declaration, to define a variadic field.
    • type Keys[T... any] struct { keys T }

A variadic variable or field may be used wherever a list of values is permitted.

  • When calling a function, either a conventional variadic function or a function with a parameter whose type is itself a corresponding variadic type parameter with compatible min/max values and constraints.
  • In a struct composite literal, setting a variadic field with corresponding type.
  • In a slice composite literal where the constraint of the variadic type parameter is assignable to the element type of the slice.
  • Similarly, in an array composite literal, if it uses the [...]T syntax (here T is an ordinary type or type parameter, not a variadic type parameter).
  • On the left hand side of a for/range statement, if the variadic type parameter is constrained to ensure that at most two variables are present.

Note that a variadic type parameter with a minimum of 0 may be used with no type arguments at all, in which case a variadic variable or field of that type parameter will wind up being an empty list with no values.

Note that in an instantiation of any generic function that uses a variadic type parameter, the number of type arguments is known, as are the exact type arguments themselves.

// Key is a key for a metric: a list of values.
type Key[T... 1 0 comparable] struct {
	keys T
}

// Metric accumulates metrics, where each metric is a set of values.
type Metric[T... 1 0 comparable] struct {
	mu sync.Mutex
	m map[Key[T]]int
}

// Add adds an instance of a value.
func (m *Metric[T]) Add(v T) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.m == nil {
		m.m = make(map[Key[T]]int)
	}
	// Here we are using v, of type T,
	// in a composite literal of type Key[T].
	// This works because the only field of Key[T]
	// has type T. This is ordinary assignment
	// of a value of type T to a field of type T,
	// where the value and field are both a list.
	m.m[Key[T]{v}]++
}

// Log prints out all the accumulated values.
func (m *Metric[T]) Log() {
	m.mu.Lock()
	defer m.mu.Unlock()
	for k, v := range m.m {
		// We can just log the keys directly.
		// This passes the struct to fmt.Printf.
		fmt.Printf("%v: %d\n", k, v)

		// Or we can call fmt.Println with a variable number
		// of arguments, passing all the keys individually.
		fmt.Println(k.keys, ":", v)

		// Or we can use a slice composite literal.
		// Here the slice has zero or more elements,
		// as many as the number of type arguments to T.
		keys := []any{k.keys}
		fmt.Printf("%v: %d\n", keys, v)
	}
}

// m is an example of a metric with a pair of keys.
var m = Metric[string, int]{}

func F(s string, i int) {
	m.Add(s, i)
}

Variadic type parameters can be used with iterators.

// Seq is an iterator: a function that takes a yield function and
// calls yield with a sequence of values. We always require one
// value K, and there can be zero or more other values V.
// (This could also be written as Seq[K, V... 0 1 any].)
type Seq[K any, V... 0 1 any] = func(yield func(K, V) bool)

// Filter is an iterator that filters a sequence using a function.
// When Filter is instantiated with a single type argument A,
// the f argument must have type func(A) bool,
// and the type of seq is func(yield func(A) bool).
// When Filter is instantiated with two type arguments A1, A2,
// the f argument must have type func(A1, A2) bool,
// and the type of seq is func(yield func(A1, A2) bool).
func Filter[K, V... 0 1 any](f func(K, V) bool, seq Seq[K, V]) Seq[K, V] {
	return func(yield func(K, V) bool) {
		// This is range over a function.
		// This is permitted as the maximum for V is 1,
		// so the range will yield 1 or 2 values.
		// The seg argument is declared with V,
		// so it matches the number on the left.
		for k, v := range seq {
			if f(k, v) {
				if !yield(k, v) {
					return
				}
			}
		}
	}
}

In a struct type that uses a variadic field, as in struct { f T } where T is a variadic type parameter, the field must have a name. Embedding a variadic type parameter is not permitted. The reflect.Type information for an instantiated struct will use integer suffixes for the field names, producing f0, f1, and so forth. Direct references to these fields in Go code are not permitted, but the reflect package needs to have a field name. A type that uses a potentially conflicting field, such as struct { f0 int; f T } or even struct { f1000 int; f T }, is invalid.

Constraining the number of type arguments

The Filter example shows why we permit specifying the maximum number of type arguments. If we didn't do that, we wouldn't know whether the range clause was permitted, as range can return at most two values. We don't want to permit adding a range clause to a generic function to break existing calls, so the range clause can only be used if the maximum number of type arguments permits.

The minimum number is set mainly because we have to permit setting the minimum number.

Another approach would be to permit range over a function that takes a yield function with an arbitrary number of arguments, and for that case alone to permit range to return more than two values. Then Filter would work as written without need to specify a maximum number of type arguments.

Work required

If this proposal is accepted, at least the following things would have to be done.

  • Spec changes
  • Changes to the go/ast package
    • We would permit an Ellipsis in the constraint of the last type parameter.
    • This should have a separate sub-proposal.
  • Type checker changes
    • Possible changes to the go/types API, a separate sub-proposal.
  • Tools would need to adjust.
  • go/ssa changes
  • gopls changes
  • Communication with other tool builders
  • Analyzer changes
  • gofmt adjustments (minor)
  • Compiler backend changes
    • The shape of a function with a variadic type parameter would have to consider the number of type arguments.
    • Changes to the importer and exporter
@gopherbot gopherbot added this to the Proposal milestone Apr 2, 2024
@ianlancetaylor ianlancetaylor added LanguageChange Suggested changes to the Go language v2 An incompatible library change labels Apr 2, 2024
@zephyrtronium
Copy link
Contributor

Aside from the addition of constraints on the count of type parameters, this seems to be a more precise statement of #56462, for some additional background. (cc @DeedleFake)

@zephyrtronium
Copy link
Contributor

Presumably func F[T... 2 1 any]() is illegal. What about func G[T... 1 1 any]()? If that's legal, it feels awkward for it to have a substantially different meaning from func H[T... 0 0 any]().

@ianlancetaylor
Copy link
Contributor Author

@zephyrtronium Thanks, I entirely forgot about #56462.

func G[T... 1 1 any]() requires a single type argument (which is valid but useless). func H[T... 0 0 any]() permits any number of type arguments. I'm not sure I see the awkwardness.

@jimmyfrasche
Copy link
Member

If there were tuple types that looks like it would satisfy the MetricN case and simplify variadic fields/values since you could just wrap any type list in a tuple and not have to define additional machinery.

It seems unfortunate to be limited to a single variadic type as that means you couldn't write, for example, a decorator for any function

@diamondburned
Copy link

Would a [V 0..1 any] and [V 0.. any] syntax look clearer? What about [V 0:1 any] and [V 0: any]?

@Merovius
Copy link
Contributor

Merovius commented Apr 2, 2024

Another question:

func F[T... any](T) {
    F(T, T) // legal?
}

@timothy-king
Copy link
Contributor

@Merovius Presumably the rules would need to be tightened to not allow calls that generate infinite shapes. The following could also generate an infinite series of func shapes.

func rec[T... 0 0 any](v func(T)) {
   var f func(int, T)
   rec(f)
}
var _ = rec(func(){})

Though the word "compatible" might be doing some heavy lifting to disallow these example.

@ianlancetaylor
Copy link
Contributor Author

@jimmyfrasche A major goal here is to avoid requiring both Filter and Filter2 in #61898. I don't think a tuple type addresses that.

@Merovius I think that is already prohibited by the rule saying that a generic function may only refer to itself using identical type parameters.

@jimmyfrasche
Copy link
Member

@ianlancetaylor I meant that if you also had tuples you could offload the handling of values of type-vectors onto them. However it occurs to me you'd still need some mechanism for writing out the parameters to a closure.

@eihigh

This comment has been minimized.

@leaxoy
Copy link

leaxoy commented Apr 3, 2024

If it is used as a struct field, what type should it be, tuple? Or some other unknown type?

@leaxoy
Copy link

leaxoy commented Apr 3, 2024

type Seq[K any, V... 0 1 any] = func(yield func(K, V) bool)

in addition, why isn't this the case here? @ianlancetaylor

type Seq[E... 1 2 any] = func(yield func(E) bool)

@johanbrandhorst

This comment was marked as resolved.

@Merovius
Copy link
Contributor

Merovius commented Apr 3, 2024

@ianlancetaylor I can't find that rule in the spec? I can find one for generic types, but nothing equivalent for functions.

@aarzilli
Copy link
Contributor

aarzilli commented Apr 3, 2024

What does PrintZeroes[]() do?

@ianlancetaylor
Copy link
Contributor Author

@leaxoy

If it is used as a struct field, what type should it be, tuple? Or some other unknown type?

If a struct field has a variadic type, then it is actually a list of struct fields. The types of the listed fields are the list of type arguments.

@ianlancetaylor

This comment was marked as resolved.

@ianlancetaylor
Copy link
Contributor Author

@Merovius Apologies, I'm probably misremembering it. But it seems to me that we need that rule. Otherwise we can write

func F[T any]() string {
    var v T
    s := fmt.Sprintf("%T", v)
    if len(s) >= 1000 {
        return s
    }
    return F[struct{f T}]()
}

And indeed that currently fails to compile:

foo.go:11:8: instantiation cycle:
	foo.go:17:14: T instantiated as struct{f T}

@ianlancetaylor
Copy link
Contributor Author

@aarzilli

What does PrintZeroes do?

The same thing as fmt.Println(). That is, it does not pass any arguments.

@leaxoy
Copy link

leaxoy commented Apr 3, 2024

The types of the listed fields are the list of type arguments.

Is this list mutable? Does it support iterate and subscript access?

There is a scenario where variable generics are used as function parameters and are expected to be accessible through subscript.

@aarzilli
Copy link
Contributor

aarzilli commented Apr 3, 2024

@aarzilli

What does PrintZeroes do?

The same thing as fmt.Println(). That is, it does not pass any arguments.

I'm guess that these:

func f1[T... any](x T) { fmt.Println(&x) }
func f2[T... any](x T) any { return x }

the first one is never allowed and the second one is only allowed if called with a single type? It seems weird to me that this introduces variables that aren't real variables and struct fields that aren't real struct fields.

@Merovius
Copy link
Contributor

Merovius commented Apr 3, 2024

@leaxoy The proposal intentionally does not propose allowing that. There are undeniably useful abilities, but they are out of scope here.

We do not propose introducing such facilities into Go, as that leads to template metaprogramming, which we do not want to support. In this proposal, Go variadic type parameters can only be used in limited ways.

@Merovius
Copy link
Contributor

Merovius commented Apr 3, 2024

@ianlancetaylor I'm a bit worried about the ability to specify bounds on the number of type-parameters, as it seems to me to have the potential to introduce corner cases allowing computation. For that reason, it would be useful to have a relatively precise phrasing what those rules are.

@changkun
Copy link
Member

changkun commented Apr 3, 2024

  • Is f[~T... any] legal?
  • Is kk := Keys{keys: ???}; for k := range kk.keys {...} legal?
  • Is there an "unpack" to type params in this design?

@Merovius
Copy link
Contributor

Merovius commented Apr 3, 2024

@changkun

Is f[~T... any] legal?

I'm not sure what that syntax should mean, so I'd say no. Note that F[~T any] isn't legal either.

Is kk := Keys{keys: ???}; for k := range kk.keys {...} legal?

I don't think so (assuming you meant Key[T]{keys}?), as that is a struct type with fields f0, …, fn, there is no kk.keys.
Otherwise you'd have to elaborate what Keys is in your example.

Is there an "unpack" to type params in this design?

No.

@aarzilli
Copy link
Contributor

I'm daring to suggest that this particular use case is strong enough that it can justify that extra complexity

I think tuples aren't buying you anything in your proposal, you could use structs and just have the ... operator "spread" a struct over an argument list.

var t1 = struct{ a int, b string }{1, "hello"}
var t2 = struct{ a int}{ 42 }
var t3 = struct{}{}

x, y := t1...      // x=1, y="hello"
strconv.Itoa(t2...)   // "42"
fmt.Println("one", t1..., "two", t2..., "three", t3...)  // one 1 hello two 42 three

etcætera.

Your syntax in particular is also ambiguous:

func f() (int, error)

is this returning a tuple of int and error or an int and an error?

@bjorndm
Copy link

bjorndm commented Apr 12, 2024

@aarzilli Yes that is #64613 or close to it. If struct unpacking is needed for this feature, that seems like a simpler way to go at it than tuples.

@atdiar
Copy link

atdiar commented Apr 12, 2024

I think that the discussion about tuples is simply a discussion about ways that one would try to implement multiple variadic type parameters.

Basically, that would either keep tuple types as a purely generic programming construct (as variadic type parameters, implicitly (un)packed where needed) or inscribe them in the spec.

Tuples as a reified concept is some sort of sugar around anonymous structs.
Perhaps that this part (reification) is not really necessary anymore once we have variadic type parameters?

If it is necessary, then the other question is whether the proposed implementation of variadic type parameters is not so limiting that people start to want tuples still.

I guess, if people are to try and generate code, the 255 limit might be too limiting, although that's very unlikely to be the norm. That limit could potentially be removed one day anyway. Or having multiple variadic type parameters, they could just be juxtaposed to increase the limit.

@rogpeppe
Copy link
Contributor

@aarzilli

I'm daring to suggest that this particular use case is strong enough that it can justify that extra complexity

I think tuples aren't buying you anything in your proposal, you could use structs and just have the ... operator "spread" a struct over an argument list.

Yes, that is definitely a possibility. Tuples are, after all, "just" structs with unnamed members.

Still, on balance, I think they'd be worthwhile:

  • they pair very nicely with variadic type parameters: there's a one-to-one relationship between tuple types and variadic type parameters.
  • there's no need to invent a field name for each member: .0 feels more natural than .T0, or whatever spelling one might choose for a struct tuple field. Go as a language doesn't generally tend to use arbitrary names in the type system.
  • the syntax for creating a tuple literal ((1, "hello", true)) is much more natural than creating the equivalent literal with an anonymous struct (struct{T0 int, T1 string, T2 bool}{1, "hello", true}).
  • It seems to me that the "spread" operation you suggest should be considered bad practice to use on arbitrary structs: just like using unkeyed composite literals, it breaks the ability to add new fields to a struct while maintaining API compatibility. So the fact that it would work for structs as well as tuples is a kind of attractive nuisance.
  • it's arguably nice to have a clear distinction between structs and tuples at reflect time too. For example, I'd argue that the natural JSON encoding of the tuple (1, "hello") is [1, "hello"] not {"T0": 1, "T1": "hello"}, but if tuples were just regular structs, that's not a distinction that a codec could make.

Your syntax in particular is also ambiguous:

func f() (int, error)

is this returning a tuple of int and error or an int and an error?

I should point out that the semantics I am suggesting for tuples are (almost?) identical to those proposed in #64457.
Thus, according to that proposal, to return a tuple of int and error, one would write:

func f() ((int, error))

@neild
Copy link
Contributor

neild commented Apr 12, 2024

A possible alternate approach to variadic templates using tuples:

Start with tuples as in #64457.

Adjust range-over-func to unpack tuples:

var seq iter.Seq[(int, string)] // a sequence of tuples
for k, v := range seq {
  // k is an int, v is a string
}

Add an additional assignability rule: A function value taking and/or returning some set of parameters is assignable to a variable of a func type that takes and/or returns a single tuple containing the same parameters.

var f1 func(int, int)   // func taking two ints
var f2 func((int, int)) // func taking a tuple of two ints
f1 = f2 // ok
f2 = f1 // ok

(This implies that the calling convention for a func taking and/or returning a single tuple is the same as that for a func taking and/or returning the flattened contents of the tuple. I think that should be straightforward, but I'm not a compiler person.)

Given this, we write Filter as:

func Filter[V any](seq Seq[V], func(V) bool) Seq[V] {
  return func(yield func(V) bool) {
    seq(func(v V) bool {
      if f(v) {
        return yield(v)
      }
      return true
    })
  })
}

This is almost exactly the current definition of Filter, but we call seq directly instead of using range. (Since we defined range as unpacking tuples above, and we don't actually want to unpack here.)

In the two-element case, this is parameterized as Filter[(int, string)]. The filter function in this case is a func taking a tuple parameter (func((int, string)) bool), but the assignability rule above means the user can pass a func with two parameters:

var seq Seq[(int, string)]
filtered := Filter(func(k int, v string) bool {
  return k > 10
}, seq)

@atdiar
Copy link

atdiar commented Apr 12, 2024

This is slightly off-topic and I'd rather not continue this discussion here but I think I remember one issue with tuples as types being: how to unpack unexported fields. Within the package the tuple type would be defined in, it should be fine.
But let's say the tuple type is unexported and yet used as the type of a function parameter?
(possibly relevant when this type has a constructor function and a non-useful zero-value)

Should the unexported field be unpacked when outside of the package the tuple was defined in?

Also, it seems to be a bit complex, if manageable, to have tuple typed values and multiple values. I am a bit concerned by what would happen when someone has to think about how to unpack those.

Variadic type parameters which really are some kind of tuples of type parameters seem to strike a good balance and would integrate in a smoother way with the existing language, I personally find.

@bjorndm
Copy link

bjorndm commented Apr 13, 2024

Actually as defined in this proposal variable type arguments are not tuples but type lists, which is different, we should not conflate them.

Tuples are one way in which this type list could be made usable but there are other, simpler alternatives.

@Merovius
Copy link
Contributor

I remember one issue with tuples as types being: how to unpack unexported fields.

Tuples have no field names, so no "unexported fields". This is an issue with the "struct-unpacking operator" proposal, not tuples.

@atdiar
Copy link

atdiar commented Apr 13, 2024

Actually as defined in this proposal variable type arguments are not tuples but type lists, which is different, we should not conflate them.

Tuples are one way in which this type list could be made usable but there are other, simpler alternatives.

This is to be taken slightly more in the mathematical sense since type parameters are not types themselves anyway.
But fair enough.

I remember one issue with tuples as types being: how to unpack unexported fields.

Tuples have no field names, so no "unexported fields". This is an issue with the "struct-unpacking operator" proposal, not tuples.

Indeed, that's true. Forgot that if tuple types are implemented as very specific kinds of structs, this problem goes away. So this point notwithstanding.

@rogpeppe
Copy link
Contributor

rogpeppe commented Apr 14, 2024

Edit: this possibility was already suggested by @ianlancetaylor here.

I've been wondering about the constraints on the number of type arguments.
To start with I'm not very keen on the use of 0 to mean "unlimited" in the second
argument (vs actual "zero" in the first), but that's really a minor gripe.

More substantively, I wonder whether it's necessary to have
those numeric constraints at all.

The Filter example shows why we permit specifying the maximum number of type arguments. If we didn't do that, we wouldn't know whether the range clause was permitted, as range can return at most two values.

It is indeed the case that range can return at most two values, but
does that have to be the case?

I'm wondering if it might actually be OK to change the language to
allow range to return any number of values. That is, range
would be permitted over any function of the form:

type Seq[T ...any] = func(yield func(T...) bool)

That would allow both a zero-element range clause (something that
has been mooted in the past) and also, and (probably less usefully), 3 or more
elements.

Given the ability, with variadic type parameters, to write code that
works generically over all the kinds of Seq, perhaps this is actually
OK.

Then there would be no need for the numeric constraints at all. AFAICS
the contraint used for the Metric example is only a convenience.
There's nothing in that code that actually requires there to be at
least one type parameter AFAICS, and even the comments seem to forget
that constraint:

// Or we can use a slice composite literal.
// Here the slice has zero or more elements

AFAICS that "zero" should actually be "one.

@rogpeppe
Copy link
Contributor

rogpeppe commented Apr 14, 2024

One another aspect of the proposal, I tend to agree with some other replies
that limiting to a single variadic type parameter is overly
restrictive.

Being able to represent a full function signature is a very useful
capability.

It becomes easy to write many useful wrapper functions that are
otherwise a tad laborious. For example, assuming the tuple-based
implementation, here's a simple memoizer that could work on any
function that has non-variadic arguments:

func Memoize[A ...comparable, R ...any](f func(A...) R...) func(A...) R... {
	m := make(map[A] R)
	return func(args A...) R... {
		r, ok := m[args]
		if ok {
			return r...
		}
		r = f(args...)
		m[args] = r
		return r...
	}
}

Although this can't directly represent functions with variadic
arguments, it becomes straightforward to write an adaptor:

func FromVariadic[Args ...any, V any, Ret ...any](f func(Args..., ...V) Ret...) func(Args..., []V) Ret...) {
	return func(args Args..., v []V) Ret... {
		return f(args..., v...)...
	}
}

func ToVariadic[Args ...any, V any, Ret...any](f func(Args..., []V) Ret...) func(Args..., ...V) Ret... {
	return func(args Args..., v ...V) Ret... {
		return f(args..., v)...
	}
}

Another, more abstract, reason for allowing multiple variadic type
parameters is that it isn't possible to make a generic type or
function that wraps more than one variadic type without that
capability. With variadic function parameters, it's always possible to
make one of the arguments a slice, but that option isn't available for
variadic type parameters.

So I suggest that multiple variadic type parameters should be allowed,
with the rule that if there is more than one variadic type
parameter present, all variadic type parameters must be presented
as a type tuple.

For example:

	Memoize[(float64, byte, int, int), (string,)](strconv.FormatFloat)
	FromVariadic[(string, any]), (int, error)](fmt.Printf)
	FromVariadic[(), any]), (string,)](fmt.Sprint)

To my mind allowing multiple variadic type parameters
doesn't seem to add a great deal of complexity to the proposal,
but does make it considerably more useful.

@rogpeppe
Copy link
Contributor

rogpeppe commented Apr 14, 2024

To summarise my thoughts from my various posts above, I propose the following:

FWIW things started falling into place for me when I realised that a
tuple can be to a variadic type parameter as a slice is to a variadic
function parameter. In current Go, a variadic function parameter
"decays" into a slice inside the variadic function, and must be spread
out again with ... if we are to call another variadic function.
Similarly, in my tuple-based suggestion, a variadic type parameter
"decays" into a tuple inside the generic function and must be spread
out again with ... if we are to treat it as a list of types again.

To my mind this direct analogy makes variadic type parameters
easier to explain and fit more naturally into the language,
and keeping things explicit reduces the overall "surprise
factor".

@AndrewHarrisSPU
Copy link

To summarise my thoughts from my various posts above, I propose the following:

Just to see how it looked on the page, I tried to write everything in the iteration adapters proposal (not just Filter) tuple-fied and in many ways I thought things looked better. Still, I'm really unsure about the full impact of putting these ideas in the language. Some interesting cases were Zip and Merge. How malleable should tuples be, can they be nested, or pulled apart, or otherwise manipulated in type parameter lists? How much additional specification is needed around this? What's readable?

Zip

The tuple-fied implementation of the Zip function and the Zipped type can generally cover zipping an m-ary tuple and an n-ary tuple, covering more possibilities than the original non-variadic implementations. Also, if I'm not mistaken, the originally proposed variadic type parameters cannot, in one implementation, indicate the association amongst zipping two 1-ary sequences of V1, V2 as well as two 2-ary sequences K1, V1, K2, V2 - it's not generally m+n zipping. This looks like a use case somewhat better served with multiple variadics.

The original implementation of Zipped/Zipped2 structs use Ok1 and Ok2 booleans when zipping sequences with unequal lengths. It still seemed like a much cleaner idea in tuple-vision, better than putting Oks in the tuples.

Merge

In contrast, there's an interesting detail with MergeFunc. These are the given signatures:

func MergeFunc[V any](x, y Seq[V], f func(V, V) int) Seq[V] { ... }

//sic: this function should be named "MergeFunc2", I guess
func MergeFunc[K, V any](x, y Seq2[K, V], f func(K, K) int) Seq2[K, V] { ... } 

The original 1-ary implementation uses a comparison function f func(V, V) int, and the original 2-ary implementation uses a comparison function f func(K, K) int - just K, not K and V. For a 2-ary variadic type parameter or tuple-fied implementation, it's more natural to arrive at a comparison function func(V, V) int where V has both components, not just the first component. It may be that the this form is good enough, but what if we really wanted to compare with the first component only? With non-tuple, original variadic type parameters, it seems straightforward:

func MergeFunc[K, V... 0 1 any](x, y Seq[K, V], f func(K, K) int) Seq[K, V] { ... }

I was sort of puzzled by how to write the same relationship with tuples. How can we express that K, (K, V), (K, V1, V2) etc. are valid, but () or (K, (V1, V2)) (if nesting is even legal in any case) are invalid here?

@rogpeppe
Copy link
Contributor

To summarise my thoughts from my various posts above, I propose the following:

Just to see how it looked on the page, I tried to write everything in the iteration adapters proposal (not just Filter) tuple-fied and in many ways I thought things looked better. Still, I'm really unsure about the full impact of putting these ideas in the language. Some interesting cases were Zip and Merge. How malleable should tuples be, can they be nested, or pulled apart, or otherwise manipulated in type parameter lists? How much additional specification is needed around this? What's readable?

That's a great exercise! Here's my stab at it: https://go.dev/play/p/CpQpxovY9VP

The fact that the results of Zip are always tuplified is arguably not a great UX.
Maybe it might make sense to have Zip (single-arity iterators only) and ZipN (any arity iterators)
variants.

There are some other questions over what API works best too, but in general
it seems like it's straightforward to implement all the shapes we might choose.

Zip

The tuple-fied implementation of the Zip function and the Zipped type can generally cover zipping an m-ary tuple and an n-ary tuple, covering more possibilities than the original non-variadic implementations. Also, if I'm not mistaken, the originally proposed variadic type parameters cannot, in one implementation, indicate the association amongst zipping two 1-ary sequences of V1, V2 as well as two 2-ary sequences K1, V1, K2, V2 - it's not generally m+n zipping. This looks like a use case somewhat better served with multiple variadics.

Yes, this isn't something the original proposal could do. This is a concrete example of the "it isn't possible to make a generic type or function that wraps more than one variadic type without that capability" reasoning from this comment.

The original implementation of Zipped/Zipped2 structs use Ok1 and Ok2 booleans when zipping sequences with unequal lengths. It still seemed like a much cleaner idea in tuple-vision, better than putting Oks in the tuples.

Merge

In contrast, there's an interesting detail with MergeFunc. These are the given signatures:

func MergeFunc[V any](x, y Seq[V], f func(V, V) int) Seq[V] { ... }

//sic: this function should be named "MergeFunc2", I guess
func MergeFunc[K, V any](x, y Seq2[K, V], f func(K, K) int) Seq2[K, V] { ... } 

The original 1-ary implementation uses a comparison function f func(V, V) int, and the original 2-ary implementation uses a comparison function f func(K, K) int - just K, not K and V. For a 2-ary variadic type parameter or tuple-fied implementation, it's more natural to arrive at a comparison function func(V, V) int where V has both components, not just the first component. It may be that the this form is good enough, but what if we really wanted to compare with the first component only? With non-tuple, original variadic type parameters, it seems straightforward:

func MergeFunc[K, V... 0 1 any](x, y Seq[K, V], f func(K, K) int) Seq[K, V] { ... }

I was sort of puzzled by how to write the same relationship with tuples. How can we express that K, (K, V), (K, V1, V2) etc. are valid, but () or (K, (V1, V2)) (if nesting is even legal in any case) are invalid here?

Something like this perhaps?

func MergeFunc[K any, V... any](x, y Seq[K, V...], f func(K, K) int) Seq[K, V...] {
	return func(yield func(K, V...) bool) bool {
		next, stop := Pull(y)
		defer stop()
		k2, v2, ok2 := next()
		for k1, v1 := range x {
			for ok2 && f(k1, k2) > 0 {
				if !yield(k2, v2...) {
					return false
				}
				k2, v2, ok2 = next()
			}
			if !yield(k1, v1...) {
				return false
			}
		}
		for ok2 {
			if !yield(k2, v2...) {
				return false
			}
			k2, v2, ok2 = next()
		}
	}
}

Although it hasn't been explicitly stated yet, I think the rules for arity of
function returns and arguments should be reasonably straightforward to
state: essentially if inside some generic code I call a function that's
been declared as func() (T, V...) then it appears to return exactly two values,
the second of which is a tuple containing all the elements of T.

Thinking further, a similar rule must apply to all values: although the generic code can
reason about the values in terms of their declared types, the
actual values still retain their original type under the hood:

For example:

type S[... V] struct {
	X (V..., error)
}

func (s S[V]) m() {
	// From this code's point of view there are
	// exactly two elements in the tuple S
	// and we can extract them as such.
	fmt.Println(s.X.1)		// Always prints the error part of s.X.

	// Under the hood, the value is actually the original
	// tuple, so if we use reflection, element 1 of the
	// tuple might _not_ be error, depending on how
	// many elements there are in V.
	fmt.Println(reflect.ValueOf(s.X).Elem(1))
}

@neild
Copy link
Contributor

neild commented Apr 17, 2024

In a tuple world, I'd say that Merge should operate on the full tuple and MergeFunc should take a func operating on tuples.

func MergeFunc[T any](x, y Seq[T], f func(T, T) int) Seq[T]
var seq1, seq2 Seq[(int, string)]
merged := MergeFunc(seq1, seq2, func(x, y (int, string)) int {
  return cmp.Compare(x.0, y.0)
})

This is a bit different from the API in #61898: You can't Merge a sequence where only the keys are comparable (but you can MergeFunc it), but you're not limited to only considering the keys in MergeFunc.

@perj
Copy link

perj commented Apr 21, 2024

I think it's worth keeping in mind that this is blocking the iterators proposal. Rightly so, in my opinion, because the Filter and Filter2 do look weird and I would much prefer to have a single Filter to cover both. It's great that the Go team have stopped to reconsider it before accepting that proposal, thanks for that.

If tuples would be the way to go, I think it should first be considered to be allowed inside variadic type parameter functions only, to keep changes at a minimum. Also, general support for tuples would be a completely different propsal, IMO.

With that in mind, the original proposal seems to be almost exactly that, although not named as tuples. It does constrain it to only be used expanded, not as a single value. The original proposal doesn't use ellipsis, but it could be added to allow for more generic tuples later.

There seems to be a general agreement that a single variadic type list is too limiting, I agree as well. Parentheses have been suggested as a workaround, which seems reasonable.

func EqualFunc[V1, V2 ...any](x Seq[V1], y Seq[V2], f func(V1..., V2...) bool) bool

I suppose if I want a reference to this function I would then write var equalfunc = xiter.EqualFunc[(string), (int)], requiring the parentheses even if it's a single type. I say it should be func EqualFunc[V1, V2 ...any](x Seq[V1...], y Seq[V2...], f func(V1..., V2...) bool) bool in the context of this proposal, though.

the reflect package needs to have a field name

Perhaps this restriction could be lifted, or the restriction for unique field names? var v struct { _ int; _ int} is valid syntax, so there's at least some precedence. Making struct { F T... } have two F fields would be new, but as long as accessing the fields by name is restricted, similar to _ fields, it would work.

return cmp.Compare(x.0, y.0)

This syntax wouldn't be allowed, which is why the original takes K only. This also shows another reason why general tuples need to be discussed separately.

T [:]any // 0 or more

I prefer this syntax for size constraints.
EqualFunc would be func EqualFunc[V1, V2 [:]any](x Seq[V1...], y Seq[V2...], f func(V1..., V2...) bool) bool or secondly func EqualFunc[V1, V2 [:]...any](x Seq[V1...], y Seq[V2...], f func(V1..., V2...) bool) bool if the general opinion is that ellipsis is required.

@chad-bekmezian-snap
Copy link

chad-bekmezian-snap commented Apr 21, 2024

I am worried that this discussion is losing clear direction, and therefore failing to progress, due to the many different proposals/variations being thrown into the mix via comments. Perhaps we should create a separate issue for such discussion? Or pull some of the proposed changes into separate proposals so they can go through a more clear and distinct vetting process?

@mikeschinkel
Copy link

I wanted to chime in against the sigils if they aren't actually doing anything other than signaling it to the reader. To me, I don't really see any different between variadic types and any other one.

One thing I would worry about is that without explicit sigils, free-form arguments might block some preferred syntax for a future enhancement to the language that has yet to be proposed or even conceived.

But I have no examples of what that might be, so it is merely a worry I have.

@Splizard
Copy link

Splizard commented May 4, 2024

There are two major pieces to this proposal that I consider to be incredibly unintuitive. The numbers in the type parameter list and the fact that these variadic parameters can be used as fields and variables. What type is keys in the Metrics example? The solution for reflection is very weird.

I think this proposal is overfitting to the the use-case for metrics and the iterators proposal. Any proposal for variadic type parameters should consider Go holistically and enable a way to represent both variadic function arguments and results. For example, I would instead propose this (which will allow other pieces of the puzzle here to fall in place that can improve Go):

// Multiple type parameter lists can be used to defined seperate groups of 
// type parameters, each can end with a variadic type parameter. This is
// useful for representing both function arguments and results.
func Log[A ...any][R ...any](fn func(A...)R...) func(A...)R... {
	return func(args A...)R... {
		fmt.Println("Calling function with ", fmt.Sprint(args))
		return fn(args...) // variadic values must be expanded when used, to make it clear what they are.
	}
}

Firstly, to address everyone here who is talking about Tuples, Go already has tuples, if we had support for variadic type parameters, these are representable like this:

type Tuple[T ...any] func() T...

func NewTuple[T ...any](v T...) Tuple[T...] {
	return func() T... { return v... }
}

So the real reason there is a discussion here about tuples, must be because functions are not comparable, the problem being that these tuples can't be used as map values. Well maybe we have an opportunity here to both improve generics and maps. Currently maps are a bit of a special case syntax wise, as instead of being something like map[string, string], they are represented as map[string]string. Unlike other generic types. We can fix this!

// This means, the default map[string]int syntax is simply a shorthand
// for the following generic type signature, eliminating it as a special case 
// type-declaration wise.
type map[K ...comparable][V any] struct{} // map[string][int] == map[string]int

Now if we take another look at the Metric example, we can do this:

type Metric[T... comparable] struct {
	mu sync.Mutex
	m map[T...]int // maps can now support variadic comparable types as a key type
}
func (m *Metric[T]) Add(v T...) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if m.m == nil {
		m.m = make(map[T...]int)
	}
	m.m[v...]++ // variadic values must always be expanded when used, they cannot be used as values directly.
}
func (m *Metric[T]) Log() {
	m.mu.Lock()
	defer m.mu.Unlock()
	for k, v := range m.m {
		// k is of type func() T...
		fmt.Printf("%v: %d\n", fmt.Sprint(k())), v)	// Option 1
		fmt.Println(fmt.Sprint(k()), ":", v)		// Option 2
		keys := (func(args ...any)[]any{return args})(k())
		fmt.Printf("%v: %d\n", keys, v)			// Option 3
	}
}

Then for sequences:

type Seq[K any, V ...any] = func(yield func(K, V...) bool)

func Filter[K any, V ...any](f func(K, V...) bool, seq Seq[K, V...]) Seq[K, V...] {
	return func(yield func(K, V...) bool) {
		for k, v := range seq {
			if f(k, v...) {
				if !yield(k, v...) {
					return
				}
			}
		}
	}
}

No magic numbers or weird struct fields! Seems much more Go-like to me. What does everyone think?

@atdiar
Copy link

atdiar commented May 4, 2024

Those numbers if I understand well are used to retrofit some of our current iterators without a performance penalty (when ranging over a map, one has the choice over retrieving key AND value or just key).

Basically, this proposal introduces another way to parameterize a type. Where we only had type parameters we also have variadic ones.
To differentiate between the two, we need a new notation. (T... is proposed).

If we tried to use T ...constraint it would be ambiguous as it would denote a list of constraints, not a list of type parameters.

You're right that tuples exist in some form in Go within the compiler.

Other than that, Go has a product types types and the closest to tuples are structs.
What you are defining as "tuple" seem closer to a tuple constructor and of course, each of these constructors would be unique anyway, (the hardware meaning of a function being different from the mathematical meaning here, it would be hard to prove function equality).

You're right, or at least I currently agree that we should try and see if it is possible to have multiple variadic parameters.
Unlike variadics in normal functions, these are specific parameters and I think it shouldn't lead to much confusion and allow to abstract functions.

@AndrewHarrisSPU
Copy link

AndrewHarrisSPU commented May 5, 2024

@Splizard

There are two major pieces to this proposal that I consider to be incredibly unintuitive. The numbers in the type parameter list and the fact that these variadic parameters can be used as fields and variables. What type is keys in the Metrics example?

I guess some of the noise in this thread stems from tension between meaningfully different semantics of:

Taking type Key[...] out of the metrics listing seems like an example of how semantics can differ ... I do think this is different than a yes/no on multiple variadic elements in a type parameter list. Multiple variadic elements could be done with either/some in-between semantic perspective. (I believe multiple variadic elements, with this proposal's variadic variable semantics, would be sufficient for the use cases explored).

// Multiple type parameter lists can be used to defined seperate groups of 
// type parameters, each can end with a variadic type parameter. This is
// useful for representing both function arguments and results.
func Log[A ...any][R ...any](fn func(A...)R...) func(A...)R... {
...

...
What does everyone think?

This seems hard to comment on, I'm not sure what "separate groups of type parameters" implies. What would the type and value of i be here, or would the code be invalid?

func F[T ...any](t T) T { return t }
func G[T1 ...any][T2 ...any](t1 T1, t2 T2) (T1, T2) { return t1, t2 } // edit: added the return

i := G(F(G[][int](0))) // edit: added missing paren

@perj
Copy link

perj commented May 5, 2024

the fact that these variadic parameters can be used as fields and variables

Several people has said this is unintuitive. I do think it'd help if the ellipsis was added there. Otherwise I don't have a problem with it, personally. Adding 0, 1, etc. as a suffix does seem a bit weird, which is why I question if that is really needed.

Those numbers if I understand well are used to retrofit some of our current iterators without a performance penalty (when ranging over a map, one has the choice over retrieving key AND value or just key).

That's not what it says in the OP, but I suppose it could be another reason. The purpose of the numbers according to OP is to allow to restrict the max to 2, to in turn allow the range statement to work. It would be a compile error to range on an unbounded variable type. It continues to say that allowing range to iterate over more than two values is another option, one I feel like looks quite attractive after the difficulties finding a good syntax.

Said clearly: drop the numbers and allow range on more than two values.

I'm not sure what "separate groups of type parameters" implies. What would the type and value of i be here, or would the code be invalid?

func F[T ...any](t T) T { return t }
func G[T1 ...any][T2 ...any](t1 T1, t2 T2) (T1, T2)

i := G(F(G[][int](0))

I think it would be valid, i would have type int, assuming G has the body { return t1, t2 }. I guess the question you're posing is whether letting G[][int] and G[int][] have the same signatures and semantics is a good idea. I can't really see a problem with it, they likely still would be separate types to the compiler. I'm assuming here that the variable type assignment is greedy, in that parameters to G are assigned to T1 as far as possible, if not specified. I think that seems natural, though it probably has to be pointed out in the spec.

Finally, the [T1... any][T2... any] syntax seems better than using parentheses. I suppose one could view ][ as an operator in this syntax. If seen that way, other characters could also be used. But ][ is fine to me.

@DeedleFake
Copy link

Treating ][ as an operator could lead to some weird parsing oddities. For example, func Example[T1 any] [T2 any](v1 T2, v2 T2) would be invalid. Yes, it's formatted incorrectly, but incorrectly formatted code should still compile, and gofmt wouldn't even be able to fix it if it was actually just simply parsing completely incorrectly.

@AndrewHarrisSPU
Copy link

@perj

I think it would be valid, i would have type int, assuming G has the body { return t1, t2 }.

Oops, yeah - I meant that.

I guess the question you're posing is whether letting G[][int] and G[int][] have the same signatures and semantics is a good idea. I can't really see a problem with it, they likely still would be separate types to the compiler.

Actually I hadn't considered the sense that G[][int] would sort of concatenate or intersect down to G[int] (and, I'm skeptical that the concatenation interpretation generalizes well). But - if they're still separate types to the compiler, would this be invalid?

G[][int](0) == G[int][](0)

If that's invalid, it seems surprising - it's as if the identity function F makes type checking succeed where it wouldn't otherwise.

I had wondered if the code should be invalid because there's a mismatch between G emitting 2 variadic terms, while F accepts 1, or if there's some telescoping where F is parameterized on some type that's a pair of "empty list" and int, and the outermost G is parameterized on that pair (T1) plus another "empty list" (T2).

@perj
Copy link

perj commented May 6, 2024

Treating ][ as an operator could lead to some weird parsing oddities. For example, func Example[T1 any] [T2 any](v1 T2, v2 T2) would be invalid. Yes, it's formatted incorrectly, but incorrectly formatted code should still compile, and gofmt wouldn't even be able to fix it if it was actually just simply parsing completely incorrectly.

Ok, I mostly wanted to point out that another set of characters could be chosen there, in case someone would want to argue for a better one. But I'm with you on that.

Actually I hadn't considered the sense that G[][int] would sort of concatenate or intersect down to G[int] (and, I'm skeptical that the concatenation interpretation generalizes well). But - if they're still separate types to the compiler, would this be invalid?

G[][int](0) == G[int][](0)

If that's invalid, it seems surprising - it's as if the identity function F makes type checking succeed where it wouldn't otherwise.

Since you're calling G, it doesn't matter that the two versions of G has different types, they both have the same result type, int. So this expression would be valid. Trying to compare G[][int] and G[int][] would be invalid. I suppose when you do var v = G[][int] then v would have the type func(int)int and var v = G[int][] would have the same type. At that level, they're the same. I'm not familiar enough with the type parameters implementation to know if that's the same on all levels, so I was a bit careless while expressing that.

I had wondered if the code should be invalid because there's a mismatch between G emitting 2 variadic terms, while F accepts 1, or if there's some telescoping where F is parameterized on some type that's a pair of "empty list" and int, and the outermost G is parameterized on that pair (T1) plus another "empty list" (T2).

Ah, I see. To me, the inner call G has a single list of return values. These are then used to generate the variadic type list for F, whose single list of return values are then used to generate the second call to G. Another way to say it is that the inner call to G is completely resolved to a concrete type before the resolution of types for F is even considered.

@Splizard
Copy link

Splizard commented May 6, 2024

@AndrewHarrisSPU
I think I agree with @perj on the validity of your example, although, I would add that with my proposed syntax, variadic type parameters must always be expanded when used, so they are clearly distinguished from ordinary type parameters.

func F[T ...any](t T...) T... { return t... }
func G[T1 ...any][T2 ...any](t1 T1..., t2 T2...) (T1..., T2...) { return t1..., t2... }

i := G(F(G[][int](0))

I would also add that additional type parameters (ie.with ][) are only valid when the preceding type parameter set ends with a variadic type parameter.

If we tried to use T ...constraint it would be ambiguous as it would denote a list of constraints, not a list of type parameters.

I don't understand this one, can you elaborate what you mean by a list of constraints? ... is used to represent variadics, it's not a type specifier in Go. type Foo ...constraint doesn't make sense to me for any future Go syntax. Is there a parser issue I am not aware of? Following the same pattern as variadic function arguments, seems much more intuitive to me.

@atdiar
Copy link

atdiar commented May 6, 2024

@Splizard

In func(args ...int), args represent a list of int.

If we make a comparable statement,

T in f[T ...constraint] should represent a list of constraints and not a list of constrained type parameters.

That's a bit subtle, but I think that explains why the notation T... was proposed, in order to refer to a list of type parameters instead.

Edit: seems to be unclear. Let me try again.

In non-generic code, there is a difference between f([]int) and f(...int) that is somewhat alleviated because ...int has been made equal to []int in some (but not all) cases.

Here it's not the type parameterization that is variadic but rather the type parameters themselves that should be. So the behavior is analogous to that of expecting a slice argument for a normal function.

That's why [T... any, U... comparable] should probably be possible just as a normal function can be written
f([]int, []float32)

Instantiation may then require specific syntax such as some form of brackets if we consider the "slice" form.

Otherwise, indeed, if it's simple variadically parametered types (instead of variadic type parameters!) that we want, then
T ...constraint is sufficient, and mirrors the well-known variadic function notation.

Also, being able to differentiate regular type parameters from variadic type parameters seem useful.

@ianlancetaylor
Copy link
Contributor Author

Thanks for the great discussion. It has been very helpful. I think it's clear that this idea needs some more thought. I'm going to withdraw this proposal and return to the general idea later.

@ianlancetaylor ianlancetaylor closed this as not planned Won't fix, can't repro, duplicate, stale May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dotdotdot ... LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Projects
None yet
Development

No branches or pull requests