
proposal: spec: generics: type switch on parametric types #45380

Open · rogpeppe opened this issue Apr 4, 2021 · 219 comments
Labels: generics, LanguageChange, LanguageChangeReview, Proposal

Comments

@rogpeppe (Contributor) commented Apr 4, 2021

In the discussion for #45346, this comment mentioned the possibility of adding a type switch on parametric types. Given that it's threatening to derail the discussion there, I'm creating this proposal to talk about it.

Proposal: type switching on type parameters

I propose that a new statement be added to the language which allows a function with type parameters to further constrain its type parameters:

switch type T {
case A1:
case A2, A3:
   ...
}

The switched type (T above) must be the name of a type parameter. The type in a case arm can be any type, including constraint types.

The first case whose constraint matches the type parameter is chosen.

Within the chosen arm of the switch statement, the type parameter being switched on becomes further restricted by the selected type, just as if it had originally been constrained by that type too. Precisely, given a switched type T constrained by C, and a case type A, within the code of that arm T takes on the type:

    interface{
        C
        A
    }

If there are multiple cases mentioned in the arm (A1, A2, ... An), then T is constrained by using a union element:

    interface{
        C
        A1 | A2 | ... | An
    }

Example

type Stringish interface {
	string | fmt.Stringer
}

func Concat[S Stringish](x []S) string {
    switch type S {
    case string:
        // S is constrained by interface {Stringish; string} which is the same as
        // interface{string} which is the same as string, so we can use x
        // as a normal []string slice.
        return strings.Join(x, "")
    case fmt.Stringer:
        // S is constrained by interface {Stringish; Stringer}
        // which is the same as `Stringer`, so we can call
        // its `String` method but we can't use x directly as
        // []fmt.Stringer because it might have any layout
        // or size.
        var buf strings.Builder
        for _, s := range x {
            buf.WriteString(s.String())
        }
        return buf.String()
    }
}

The above assumes that the constraint:

interface {
     X | Y | Z
     X
}

would allow all operations allowed on X, but I believe that's true under the #45346 proposal.

Note: I don't think there's any need to allow ~ to be used directly in these type switch cases - case interface {~T}: is sufficient if necessary.
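
To make the multi-case rule above concrete, here's a minimal sketch using the proposed syntax (Number and Mod3 are illustrative names, and this of course doesn't compile today):

type Number interface {
	int | int64 | float64
}

func Mod3[T Number](v T) T {
	switch type T {
	case int, int64:
		// T is now constrained by interface{ Number; int | int64 },
		// whose type set supports %, so the integer-only operation
		// becomes available.
		return v % 3
	default:
		return 0
	}
}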

@rogpeppe (Contributor, Author) commented Apr 4, 2021

Replying to @candlerb #45346 (comment)

func Sort[T Lessable[T]](list []T) {
	var less func(a, b T) bool
	switch type T {
	case Ordered:
		less = func(a, b T) bool { return a < b }
	case Lesser:
		less = func(a, b T) bool { return a.Less(b) }
	}
    
	// sort using less
}

I can't see how the two branches of the "switch" can be compiled in the same function instance. For a given T, only one branch or the other is semantically valid.

Isn't this very similar to what happens with a switch x := x.(type) statement currently?

Let's assume for the sake of argument that two versions of Sort with different type parameters can be compiled into the same code (the compiler might do this when both types have the same GC layout).

Then the type switch statement could look at the metadata for the type and "unlock" other operations on the type, just as a type switch does for interface values in current Go - there's no particular reason that Sort would need to be split into two versions just because of the type switch. Splitting is of course still a valid implementation strategy and would result in more efficient code, at the usual cost of potential code bloat.
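
For comparison, a rough emulation of that "look at the metadata and unlock operations" behaviour is possible in current Go, though it only catches exact (non-derived) type arguments and needs a fallback. This sketch (with an illustrative signature, not the Lessable one above) switches on a zero value converted to any:

func Sort[T any](list []T, fallback func(a, b T) bool) {
	less := fallback
	switch any(*new(T)).(type) {
	case int:
		less = func(a, b T) bool { return any(a).(int) < any(b).(int) }
	case string:
		less = func(a, b T) bool { return any(a).(string) < any(b).(string) }
	}
	sort.Slice(list, func(i, j int) bool { return less(list[i], list[j]) })
}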

@Merovius (Contributor) commented Apr 4, 2021

Note: I don't think there's any need to allow ~ to be used directly in these type switch cases - case interface {~T}: is sufficient if necessary.

I assume you would apply the same to union-elements?

FWIW, while it doesn't add expressive power, I'd prefer if we'd allow both approximation- and union-elements directly, if we can make it work without syntactical ambiguities. It's more convenient and IMO still clear. And I would prefer if the cases can more directly reflect what's in the constraint, that is, I think

type Constraint interface {
    ~int | ~int8 | ~string
}

func ThisSyntax[T Constraint]() {
    switch type T {
    case ~int | ~int8:
        // …
    case ~string:
        // …
    }
}

func IsClearerThanThisSyntax[T Constraint]() {
    switch type T {
    case interface{~int | ~int8 }:
        // …
    case interface{ ~string }:
        // …
    }
}

But it's a weakly held opinion.

I think this would be good, but one concern I have is that I think it would have unfortunate overlap with simply allowing approximation-elements in type switches and assertions (let's call it "approximate type switch"). I think the form of type switch you propose here (let's call it "parameter type switch") is clearly better for type parameters. But the approximate type switch is clearly better for "sum-types", as you'd need to unpack the value at some point - so you'd want to type-switch on a value, not a type.

So, IMO

  1. It is slightly unfortunate to have both
  2. If we never add "sum-types", the parameter type switch would be better
  3. If we do add "sum-types" and only want one, the approximate type switch would be better

Of course, we can always eat the cost of "slightly unfortunate" and add both, when the time comes.

@Merovius (Contributor) commented Apr 4, 2021

Isn't this very similar to what happens with a switch x := x.(type) statement currently?

One significant difference is that switch x := x.(type) visibly declares a new variable (shadowing the old one), scoped to the switch case block.

Arguably, the parameter type switch could "declare a new type" (shadowing the old one), scoped to the switch case block. But it's still arguably more implicit and "magic".

@rogpeppe (Contributor, Author) commented Apr 4, 2021

I think this would be good, but one concern I have is that I think it would have unfortunate overlap with simply allowing approximation-elements in type switches and assertions (let's call it "approximate type switch"). I think the form of type switch you propose here (let's call it "parameter type switch") is clearly better for type parameters. But the approximate type switch is clearly better for "sum-types", as you'd need to unpack the value at some point - so you'd want to type-switch on a value, not a type.

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal. In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value, but that's not the case here, where once we know about the type, we know the type of all values that use that type.

That's why I chose to restrict this proposal to allow only exactly type parameter names to be specified in the switch, because then there's no ambiguity - we're further constraining the already-known type and thus we know that all arguments, variables and return parameters that use that type are constrained likewise.

@rogpeppe (Contributor, Author) commented Apr 4, 2021

To try to be clearer, this proposed statement is a "switch on a type" rather than a "switch on the type of a value". That's why it's very different from the current type switch statement, and also why it's specifically useful in the context of parametric types.

@Merovius (Contributor) commented Apr 4, 2021

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal.

I think they are different, but I don't think they are orthogonal/independent. In particular, the "approximate type switch" would in terms of what you can express subsume the "parameter type switch" (or at least most of it). That is, you can write code doing the same, with the same static guarantees, using either - even if less convenient.

In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value

That is true. But this proposal doesn't talk about regular interface values, it talks about type parameters. So, if you are in a situation where you can use the "parameter type switch", you do know that two values have the same type. That is, you could write

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return math.Max(a, b.(~float64))
    default:
        if a > b {
            return a
        }
        return b
    }
}

and this is statically known to be correct - even if inconvenient, by requiring an extra type-assertion, you know this type-assertion can't fail.

That's why I chose to restrict this proposal to allow only exactly type parameter names to be specified in the switch, because then there's no ambiguity - we're further constraining the already-known type and thus we know that all arguments, variables and return parameters that use that type are constrained likewise.

I don't see how case ~int supposedly adds ambiguity, where interface{ ~int } doesn't. One is simply syntactic sugar for the other. Sorry, I think I misunderstood what you were saying, disregard this.

To try to be clearer, this proposed statement is a "switch on a type" rather than a "switch on the type of a value". That's why it's very different from the current type switch statement, and also why it's specifically useful in the context of parametric types.

Yupp :) That's what I meant when I said this is clearly better for type parameters :)

@rogpeppe (Contributor, Author) commented Apr 4, 2021

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal.

I think they are different, but I don't think they are orthogonal/independent. In particular, the "approximate type switch" would in terms of what you can express subsume the "parameter type switch" (or at least most of it). That is, you can write code doing the same, with the same static guarantees, using either - even if less convenient.

In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value

That is true. But this proposal doesn't talk about regular interface values, it talks about type parameters. So, if you are in a situation where you can use the "parameter type switch", you do know that two values have the same type. That is, you could write

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return math.Max(a, b.(~float64))
    default:
        if a > b {
            return a
        }
        return b
    }
}

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion, and neither could the float64 returned by math.Max be assigned to the return parameter.

You'd need something like this instead:

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return interface{}(math.Max(float64(a), float64(interface{}(b).(~float64)))).(T)
    default:
        if a > b {
            return a
        }
        return b
    }
}

That would become less verbose if value type assertions could be done directly on non-interface values, but still not great.

and this is statically known to be correct - even if inconvenient, by requiring an extra type-assertion, you know this type-assertion can't fail.

Although as a programmer, you might know that those dynamic type assertions are OK, they seem problematic to me - they're verbose and easy to get wrong (with a nasty panic if you do).
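
For reference, the nearest compilable pattern in current Go drops the ~ entirely and special-cases only the exact float64 type argument. A sketch, using the same constraints.Ordered as above:

func Max[T constraints.Ordered](a, b T) T {
	if af, ok := any(a).(float64); ok {
		// The inner assertions cannot fail: a and b have the same type,
		// and T was just established to be exactly float64.
		return any(math.Max(af, any(b).(float64))).(T)
	}
	if a > b {
		return a
	}
	return b
}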

@Merovius (Contributor) commented Apr 4, 2021

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion

Who knows, we are talking about a speculative design with no proposal filed :) IMO, x.(~float64) should evaluate to float64 exactly and if ~float64 is used as a case, a should have type float64. But either way, that doesn't matter a lot.

neither could the float64 returned by math.Max be assigned to the return parameter.

Probably true. This is harder to handwave away :)

Although as a programmer, you might know that those dynamic type assertions are OK, they seem problematic to me - they're verbose and easy to get wrong (with a nasty panic if you do).

That's why I said this proposal is better, as long a we're only concerned with type parameters.

@zephyrtronium (Contributor) commented Apr 4, 2021

As I see it, this proposal is to add a way to say, "given a thing that might be one of possibly infinitely many types, add special handling for cases where it is one of a chosen set of types." This is already exactly what the current syntax for type switches does, where "thing" means "value" rather than "type." Considering how similar they are, why is the proposed syntax so different?

In particular, with the current generics proposal, it would be possible to implement parsing with the only new AST nodes being those needed to describe constraints (type lists for the proposal as accepted, or type set operations for #45346, both only on interface declarations). Depending on how the implementation of type parameters is handled, static analysis of type-checked programs could be done with only the same new node; tools that only analyze function bodies might not need to be updated at all. How is the cost of an entirely new syntactic construct to every Go code parser and to every static analysis tool using them justified?

I would prefer if the spelling of these type switches were also switch T.(type).

@thepudds (Contributor) commented Apr 4, 2021

I would prefer if the spelling of these type switches were also switch T.(type).

One other option — in an earlier conversation (still in the context of type lists), Ian had floated this syntax:

func F[T constraints.Integer]() { 
    switch T { 
    case int: 
    case int8: 
    } 
} 

The proposed semantics at that time though were not as nice as the semantics proposed here under type sets, I think.

@zephyrtronium (Contributor) commented Apr 4, 2021

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion, and neither could the float64 returned by math.Max be assigned to the return parameter.

You'd need something like this instead:

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return interface{}(math.Max(float64(a), float64(interface{}(b).(~float64)))).(T)
    default:
        if a > b {
            return a
        }
        return b
    }
}

That would become less verbose if value type assertions could be done directly on non-interface values, but still not great.

I agree with @Merovius that I would expect case ~float64: to cause T to become float64 within the case, but my reason is specifically that, as proposed in #45346, ~float64 is not a type. There are no values of type ~float64, only of types whose underlying type is float64. Aside from your comment, none of the suggestions in this thread seem to suggest that behavior should change.

(I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other, so it should be fine under even the most conservative type checking of this proposal to return T(math.Max(float64(a), float64(b))).)
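
For reference, that narrower observation already compiles in current Go, with no type switch needed at all - with constraint ~float64, conversions between T and float64 are allowed in both directions:

func Max[T ~float64](a, b T) T {
	return T(math.Max(float64(a), float64(b)))
}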

@urandom commented Apr 4, 2021

I would think that if a case were to include an approximation type (e.g. ~float64), then the compiler should be able to deduce that within the branch T would be a type whose underlying type is float64, and should thus be able to convert a float64 back to T. If that is really the case, then restricting such a switch to only non-approximation types sounds like an unnecessary restriction.

@Merovius (Contributor) commented Apr 4, 2021

I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other

Yes, but to allow it, the compiler would have to know about it. It is easy to use v.(~float64) as a float64, because neither v itself, nor its type, actually changes - you look at a new variable with a new type. Meanwhile, if you'd want to convert T(v) after that, the compiler would have to take into account that the type-assertion before succeeded. That's not how Go works so far - for example, you can't do

var r io.Reader = new(bytes.Buffer)
_ = r.(*bytes.Buffer) // succeeds
var br *bytes.Buffer = (*bytes.Buffer)(r) // type-assertion succeeded, so we know that we can convert r to *bytes.Buffer

It's a little different, because we are talking type parameters, but it still destroys the elegant simplicity.

So, I do agree with @rogpeppe that type switching on the type parameter is useful and enables you to do new things, because there is inherent logic in saying "inside the switch case, T is interpreted as being constrained to the type set in the case".

I still don't think they are orthogonal though. They still both have significant overlap.

@Merovius (Contributor) commented Apr 4, 2021

@zephyrtronium

This is already exactly what the current syntax for type switches does, for where "thing" means "value" rather than "type." Considering how similar they are, why is the proposed syntax so different?

I believe it is because of what you mention - we aren't switching on a value, but on a type. But FTR, I don't think it really matters for the substance of the proposal whether it's switch T, switch type T or switch T.(type) - so maybe we shouldn't worry about the color of our bikeshed just yet :)

@Merovius (Contributor) commented Apr 4, 2021

FWIW, after the discussion so far, I've come around to this. I think it's a good addition :)

I still would like to allow union- and approximation elements directly (without wrapping them in interface) though and I also think that this shouldn't prevent us from adding type switches/assertions using approximation elements either :)

@rogpeppe (Contributor, Author) commented Apr 4, 2021

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion

Who knows, we are talking about a speculative design with no proposal filed :) IMO, x.(~float64) should evaluate to float64 exactly and if ~float64 is used as a case, a should have type float64.

FWIW the semantics I'd expect from an approximate type switch on a value would be that the value inside the case would allow just the same operations as if the type had been constrained by the switched type in a generic function parameter.

That is, in this example, case ~float64 would allow only operations allowed by every type with underlying type float64, and since you can't assign a value with a defined float64 type to a float64 without an explicit type conversion you wouldn't be able to call Max, hence the need for an explicit type conversion in my code snippet.

Note that being able to assert on ~float64 doesn't help when you're wanting to operate on a slice or other higher level type though, because the constraint syntax doesn't allow interface{[]~float64} AFAICS.

However, switching on the type parameter itself does potentially help that case, because then the slice can be used as an argument to any function with suitable constraints.

For example:

func MaxVec[F comparable](xs []F) F {
    switch F {
    case interface{ ~float64 | ~float32 }:
        // This is OK because F is here constrained by interface{ comparable; ~float64 | ~float32 }.
        return MaxFloatVec(xs)
    default:
        ...
    }
}

func MaxFloatVec[F interface{ ~float64 | ~float32 }](xs []F) F {
    ...
}

This makes me realise an interesting point here: even if one has an approximate type switch, unless you significantly add to the power of the approximation and union element features (a big ask, given their current elegance), you will still not be able to do something like assert on the underlying type of a type component such as a slice element.

That is, something like this would probably not be allowed, which arguably makes approximate type switches on values considerably less interesting as a candidate feature for the language.

func X(x interface{}) {
    switch x := x.(type) {
    case []~float64:
        ...
    }
}

@rogpeppe (Contributor, Author) commented Apr 4, 2021

@zephyrtronium The concrete syntax I've proposed here is only one of many possible. A syntax like switch T { would work just as well; in fact I think I might prefer that, although it arguably gives less clues to the reader of the code about what's going on.

@rogpeppe (Contributor, Author) commented Apr 4, 2021

I still would like to allow union- and approximation elements directly

Note that you could add union elements for completeness, but they're technically redundant with respect to this proposal because a case with multiple comma-separated elements is the same as a union element.

Come to think of it: you can use multiple elements in a type switch case currently, but the result is the original interface type. That behaviour probably can't change for backward compatibility reasons, but if one allowed a union element directly, one could constrain the available operations to the intersection of operations available on all the selected types, just as with generics.

That doesn't affect this proposal though.

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

@Merovius (Contributor) commented Apr 4, 2021

FWIW the semantics I'd expect from an approximate type switch on a value would be that the value inside the case would allow just the same operations as if the type had been constrained by the switched type in a generic function parameter.

That would require ~float64 to be a type though. That is, with v := v.(~float64), v needs to have some type. IMO float64 is the most natural and most useful type here. Why would I not want it to be a float64? Except avoiding having to learn "v.(~T) evaluates to a T"?

Note that this is completely different when we are talking about this proposal. When you change the constraints put on T inside the case block, you can obviously do more things with it, because you can have many values of that type and be sure they are the same type.

Agreed, to the rest of the comment.

That is, something like this would probably not be allowed, which arguably makes approximate type switches on values considerably less interesting as a candidate feature for the language.

FWIW, I think approximate type switches will be a must if we ever allow to use all interfaces as types. Up until then, they are a nice-to-have, at best (if something like this proposal gets accepted - I do strongly feel that we need some way to specialize on the type argument eventually).

Note that you could add union elements for completeness, but they're technically redundant with respect to this proposal because a case with multiple comma-separated elements is the same as a union element.

This is a drawback to me. Personally, I think I would consider disallowing multiple cases in this type switch construct. It seems we need to choose whether we'd rather be consistent with other switch constructs or be consistent with the constraint syntax. Unfortunate.

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

Can you give an example? I'm not sure what would make it harder. Conceptually, each case block is just an anonymous generic function with a more constrained type parameter. ISTM that if we can type-check a generic function, we should already be able to type-check this construct as well?
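
That reading can be made concrete with the Concat example from the top post: the fmt.Stringer arm corresponds to a standalone generic function whose type parameter carries the tightened constraint, and that function type-checks in current Go:

func concatStringers[S fmt.Stringer](x []S) string {
	var buf strings.Builder
	for _, s := range x {
		buf.WriteString(s.String())
	}
	return buf.String()
}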

@zephyrtronium (Contributor) commented Apr 4, 2021

@Merovius

I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other

Yes, but to allow it, the compiler would have to know about it. It is easy to use v.(~float64) as a float64, because neither v itself, nor its type, actually changes - you look at a new variable with a new type. Meanwhile, if you'd want to convert T(v) after that, the compiler would have to take into account that the type-assertion before succeeded. That's not how Go works so far - for example, you can't do

var r io.Reader = new(bytes.Buffer)
_ = r.(*bytes.Buffer) // succeeds
var br *bytes.Buffer = (*bytes.Buffer)(r) // type-assertion succeeded, so we know that we can convert r to *bytes.Buffer

Getting a bit off-topic over an example, but I don't follow your argument here. You can do this:

var r io.Reader = new(bytes.Buffer)
switch r := r.(type) {
case *bytes.Buffer:
	var br *bytes.Buffer = r
}

which much more closely resembles the example under discussion. The question is which operations (specifically conversions) are legal on a parameterized type within a case of a type switch that effectively redefines the type parameter's constraint.

Now, there is a bit of a difference here in that I use r := r.(type) whereas the switch on a parameterized type does not. You can't assign var br *bytes.Buffer = r without shadowing, of course, because the r remains type io.Reader. However, the difference is that r is a value with an interface type, whereas T in the original example is a constraint – not even a type. Per the proposal, within the switch case, the latter is defined to operate as if the function's constraint type set were originally the intersection of the actual constraint and the constraint used in the case. The only types in that intersection are convertible to float64 and float64 is convertible to any type in it, so by the behavior specified in the accepted proposal, conversions between those types are allowed.

Perhaps this does relate to @rogpeppe's comment while I was typing this:

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

And, perhaps this line leads me to conclude that switch T.(type) is not quite the correct syntax, either, because this sort of type constraint switch effectively includes a redeclaration, which arguably should not be implicit.

@zephyrtronium (Contributor)

@rogpeppe

@zephyrtronium The concrete syntax I've proposed here is only one of many possible. A syntax like switch T { would work just as well; in fact I think I might prefer that, although it arguably gives less clues to the reader of the code about what's going on.

Noted. To me, there is a major difference between proposing an extension of semantics for existing syntax to analogous behavior, and proposing new syntax when an analogous one already exists. But perhaps I am a bit too early to this complaint.

@randall77 (Contributor)

I can definitely see some trickiness with this idea.

func f[T any](a T) {
	switch T {
	case float64:
		var x T // x is a float64
		x = a   // both are type "T", why can't I assign them?
	}
}

I think it is cleaner if you introduce a new type-parameter-like-identifier, like:

func f[T any](a T) {
	switch U specializing T { // or some other syntax
	case float64:
		var x U // x is a float64
		x = a   // now it is clear this is not allowed, as it is assigning a T to a U.
	}
}

@rogpeppe (Contributor, Author) commented Apr 4, 2021

That would require ~float64 to be a type though. That is, with v := v.(~float64), v needs to have some type. IMO float64 is the most natural and most useful type here. Why would I not want it to be a float64?

Because the original type still carries semantic weight and could be useful. A type switch tells you what the dynamic type of the value is; it doesn't convert it to some other type.

For example, I'd expect the following code to print "MyFloat", not "float64":

type MyFloat float64

func P[F any](f F) {
    switch type F {
    case ~float64:
        fmt.Printf("%T\n", f)
    default:
        fmt.Printf("%T\n", f)
    }
}

func main() {
    P(MyFloat(64))
}

@rogpeppe (Contributor, Author) commented Apr 4, 2021

I can definitely see some trickiness with this idea.

func f[T any](a T) {
    switch T {
        case float64:
            var x T // x is a float64
            x = a // both are type "T", why can't I assign them?

In this proposal, you can. Within that case, T is known to be exactly float64 as if with a type alias.
So both x and a are of both type float64 and T.

The important point is that the specialisation affects all variables in scope that use type T.
It's OK to do that because, unlike regular interface values, there's no special GC shape associated with T, so we can specialise without needing to create a new value.

That is, we're not creating a new T scoped to the case - we are genuinely specialising the original T for the extent of that case.

@randall77 (Contributor)

Ah, ok, so we could think of these switches as happening at the top level. We put them syntactically lower down in the AST just to share all the code outside the switch.
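
In miniature: under that reading, the float64 arm of the earlier f would correspond to a generated specialisation like this, which is ordinary Go (the name is illustrative):

func f_float64(a float64) {
	var x float64
	x = a // both sides are float64, so the assignment is fine
	_ = x
}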

@rogpeppe (Contributor, Author) commented Apr 4, 2021

Ah, ok, so we could think of these switches as happening at the top level. We put them syntactically lower down in the AST just to share all the code outside the switch.

Yes, you could look at it that way, although I'm not sure how helpful that is.

The way I look at it is that within the case statement, you get a more precise view of what T happens to be, so you can do more specific operations on values defined in terms of T. If the generated code isn't fully expanded out for every possible type, the switch operation may well involve some runtime cost (not at the top level) to look up the relevant dictionary of available operations, much as the usual type switch does with the method dictionary.

@Merovius (Contributor) commented Apr 4, 2021

@zephyrtronium

ISTM that the confusion is that you are talking about this proposal (reasonably so) while I was discussing a different idea - namely allowing type switches/assertions on values to assert on the underlying type. And I was discussing that idea to compare its expressive power to the one proposed by @rogpeppe. Everything you say is true, under this proposal.

@atdiar commented Sep 11, 2023

Without going into the details of your future talk, and for the sake of this technical problem that really needs solving: why is it difficult to check whether a Go type satisfies another?

We do that all the time?
What if it's a union? Wouldn't it suffice to check every member of the union? If at least one member satisfies, we have overlap?

Note: There are algorithms for that already.

Is the issue the number of overlapping cases? That shouldn't be an issue in practice if we really have compiler-enforced disjoint cases (the programmer needs to provide the intersection case when one is detected).

@Merovius (Contributor)

for the sake of this technical problem that really needs solving

I agree that it needs solving, but I disagree that it needs solving in the next two months.

As I said, I don't think we should re-hash the same conversation we had then. Feel free to assume these problems are easy, against my recommendation. I don't believe I will convince you otherwise.

@atdiar commented Sep 11, 2023

Ok fine. A technical discussion postponed only for the sake of self-promotion. That's a bit of a pity. I wouldn't do that. :)

@Merovius (Contributor)

@atdiar The offer I made you over a year ago still stands: If you write up a detailed proposal of how to allow methods in unions, I'll be happy to look it over.

@Merovius (Contributor)

FWIW as long as the compiler is not required to answer any of these questions (or questions of that kind), there probably is no future problem:

  1. Is a given constraint satisfiable (is its type set empty)?
  2. Do two constraints have a non-empty intersection (are their type sets disjoint)?
  3. Does one constraint imply another (is the type set of one a subset of the other)?
  4. Does a list of constraints entirely cover another (are there unhandled cases in a switch)?

This class of question is hard to answer, in general, and very strongly interacts with the exact constraints we allow. I believe any of them can be relatively easily translated into a reduction-proof for CNFSAT, with even a relatively simple constraint language (as long as it allows methods in unions - currently, we disallow them, which enables us to answer these questions).

Under that assumption, the proposal in the top post creates problems, for example, because it allows this code:

func F[T any](v T) string {
	switch type T {
	case interface{~string}, fmt.Stringer:
		// Here, T is constrained on `interface{ ~string | fmt.Stringer }`,
		// which is not a constraint we currently allow. 
		//
		// That requires answering "does interface{ ~string | fmt.Stringer } imply C?",
		// to type-check this call:
		G[T](v) 
	}
}

func G[T C](v T) { /* … */ } // with some constraint C

Obviously, any proposal requiring the compiler to check that such a type-check is exhaustive would immediately ask one of those difficult questions. In my opinion, we have to be content with simply checking each type in-order, as with regular (type-)switches.

Any proposal allowing overloaded functions or methods (i.e. having multiple functions/methods with the same name, differing only in constraints) is extremely likely going to have to ask one of those questions, to decide what actual function to call for a given type argument (that is how this program breaks clang). If we allow Refined Method Constraints, I believe we still have to require there to only be one method per name (though as I've shown, that should be enough).


The most general proposal I can see right now is what the top-post proposes, with the added restriction that

If a case statement has multiple cases, they must not be interfaces containing methods or the predeclared identifier comparable and they must not embed any such interfaces.

(matching the current restriction on union elements). In particular

  1. There is no requirement for the cases to be disjoint. The first matching case will be taken.
  2. There is no requirement for the cases to be exhaustive. If in doubt, you'll need a default case or add a superfluous panic("unreachable"), unfortunate as that is.
  3. The only implicit constraints this can construct take the form of existing interface-definitions. That is, given valid interfaces A and B, interface{ A ; B } is always a valid (covering the single-constraint branches) and given A and B without methods or comparable, interface{ A | B } is valid (covering the multi-constraint branches).

With that, the compiler only ever has to answer the question "does a given type argument satisfy a specific (refined) constraint". That question is easy to answer and will always remain easy to answer by construction.
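
Spelled out with a method-free example constraint, the implicit interfaces from point 3 are already expressible today (names illustrative):

type C interface{ ~int | ~string }

type singleArm interface{ C; ~int }           // interface{ A; B }, from a one-type case
type multiArm interface{ C; ~int | ~string }  // embeds interface{ A | B }, from a multi-type case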

I think we might still consider weakening the power of the proposal anyways. The reason is that if we want to be able to have methods in unions, we might still want to limit the actual set of interfaces you can express (for example by limiting their nesting depth or similar). If we do that, the implicit interfaces created by case branches might then violate that structure (e.g. any case increments the nesting depth) - which might be surprising and/or inconvenient. I think eliminating multi-constraint branches altogether might make it a bit more likely that no problems crop up, but even that I don't feel super confident about.

In short, no version of this proposal really makes me entirely confident about the prospects of allowing methods in unions. But there is a version of this proposal I feel confident we could do right now - if we accept that it might create friction in some cases later. Personally, I'd be unhappy to make that commitment without knowing what exactly we are talking about.

Alternatively, we could commit to never allow methods in unions. In that case, with the restriction above, this proposal seems fine to me. Personally, I'm not particularly happy making this commitment - especially if we at some point want to use general interfaces as union types.

Lastly, we could commit to ignoring the problem - either we describe a heuristic to solve the resulting CNFSAT problems in the spec, or we leave it up to the implementation to perhaps time out, or we leave it up to the implementation to impose limits… Personally, I'd be unhappy with this as well.


All of these are, of course, opinions. I've thought about these problems quite a bit, but take these opinions as you like.

@ianlancetaylor (Contributor)

@atdiar That last comment is not appropriate in this forum. Please be constructive. Thanks.

@atdiar commented Sep 11, 2023

@ianlancetaylor my apologies.

@Merovius, a bit more than a year ago I wrote a proposal in which I detailed one way to proceed with entailment (what you called an interface implying another), and it is on hold.

At the time, you had parsed it (we had had a little discussion off github).

You're right that depending on the constraint available in the language, the problem is more or less tractable.
This is the same issue people have with dependent types/ unrestrained refinement types.

In the present case, barring major upsets, it should be tractable because, unlike in the refinement-type case, the set of expressible constraints is bounded for the compiler.

It's easy to establish the equivalence between a constraint and the set of types it denotes. Then entailment is simply set inclusion.

@Merovius (Contributor) commented Sep 11, 2023

Are you referring to #53734? I'll comment over there.

[edit] Well, I read through that and you don't actually talk about removing the restriction for methods in unions over there, so I'll comment here anyways. You have recognized the problem with that proposal yourself:

In theory, checking for C may seem exponential in the number of constraints being combined (O(k^N), EXPSPACE in the WORST case only).
This seems to be a case of expansion of a boolean formula into DNF form.

However, in practice, the restriction on the domains of many constraints (which makes them incompatible with each other) simplify things. (e.g. if a is int and e string, aeg and aeh can be eliminated).
The checking should also be parallelizable.

So it shouldn't be a big worry, I think.

You are correct that you are describing an expansion into DNF. You are correct that this requires exponential runtime in general. You are correct that in the current language, that is not a big problem, because constraints end up being mutually exclusive (in all but a finite number of cases).

But if we were to allow methods in union elements, that would of course no longer be true. You'd have to expand the actual full DNF. So what you propose would indeed take exponential time to check. Because you are proposing a brute-force solution to an NP-complete problem. [/edit]

@jimmyfrasche (Member)

I want something like this but the part where the meaning of the type parameter implicitly changes in the block never quite sat right with me. My preference would be for something a little more explicit:

  1. allow defining a new type parameter in the switch
  2. the cases can contain any constraint spec(s)
  3. if there is a single constraint spec the new type parameter is scoped to that block and conversions between the new and old type parameter are legal

For example, the first switch switches on the parameter T and defines the result to be the new type parameter S which can be used:

var x T
switch type S T {
  case ~int | ~uint:
    y := S(x)
    x = T(y)
  case fmt.Stringer, ~float64:
    // S undefined as it could match either constraint
}
switch type T {
  case int:
    // we know T is an int
    // but no new type parameter declared
    // so we can't really do much about it
}

This is roughly equivalent to allowing all constraints to be interface types:

var x T
switch y := any(x).(type) {
  case ~int | ~uint:
  case fmt.Stringer, ~float64:
}
switch any(x).(type) {
  case int:
}

While I also think interface types should be generalized, the parameter type switch avoids having to smuggle everything through interfaces and allows conversions and more compile-time type checking.

@Merovius (Contributor)

@jimmyfrasche I don't think I understand why you'd prefer that. It seems like more typing and syntax without more power, to me. What's the benefit of having to have a second type parameter and having to explicitly convert?

@jimmyfrasche (Member)

  • you still have access to the outer type if need be
  • you still have access to all the variables with their original type
  • you don't need to keep track of changes between the world inside the case-block and outside the switch statement in your head, they're in the code as explicit conversions
  • it's more similar to the dynamic type switch so there are fewer differences to learn

@rogpeppe (Contributor, Author)

@jimmyfrasche
Would this code be valid under your suggestion?

func ElemIndex[T comparable](x []T, e T) int {
	switch type S T {
	case byte:
		return bytes.IndexByte(([]S)(x), S(e))
	default:
		for i, g := range x {
			if g == e {
				return i
			}
		}
		return -1
	}
}

If so, perhaps you could share your reasoning. If not, how would you accomplish a similar task?

@jimmyfrasche (Member)

No. I don't think anything like that should be allowed, though. It would be too magical.

I suppose you could use unsafe quite safely in this case, if you absolutely needed the performance boost.

@atdiar commented Jan 15, 2024

Like @Merovius, this does seem a bit more confusing to me.

In your initial example, is
case fmt.Stringer, ~float64 equivalent to case fmt.Stringer | ~float64?

If so, should the type parameter really be undefined?

@jimmyfrasche (Member)

It's different. The case is hit for either of those constraints. If you had case int, uint you could also write it as case int | uint but fmt.Stringer | ~float64 is not a valid constraint. I'd be fine if that didn't make it. I put it in because it's how you can use .(type) switches.

@atdiar commented Jan 15, 2024

That part is interesting, that's something to ponder. The current limitation might also be lifted at some point but that can indeed be something to keep in mind for later consideration.

@mikeschinkel commented Jan 16, 2024

I would love this feature, but I think the syntax should be more consistent with the interface type assertion syntax, using symbol characters rather than a keyword, since the type keyword is used in declarations and symbols are used for accessors.

So instead of switch type S T {...} I am proposing switch [S T] {...}.

I would specifically prefer this not just for consistency with type assertion but more importantly because I think it would be easier (at least for me) to recognize this when scanning code vs. if we use keywords.

Modifying @jimmyfrasche's earlier example:

var x T
switch [S T] {
  case ~int | ~uint:
    y := S(x)
    x = T(y)
  case fmt.Stringer, ~float64:
    // S undefined as it could match either constraint
}
switch [T] {
  case int:
    // we know T is an int
    // but no new type parameter declared
    // so we can't really do much about it
}

Of course I am doing a bit of bikeshedding here, so maybe just upvote or downvote this idea to indicate sentiment?

@arvidfm commented Jan 16, 2024

@rogpeppe For the record, this does not compile today:

func F[T byte](list []T) int {
	// cannot convert list (variable of type []T) to type []byte
	_ = []byte(list)
	// cannot use list (variable of type []T) as []byte value in argument to bytes.Index
	return bytes.Index(list, []byte("."))
}

It seems you can't convert a generic slice to a concrete one, even if the type parameter is constrained by the concrete type in question. Same goes for converting a []T to a []S even when T and S have the same constraints. So I don't think either version of the type switch would make your example work by itself.
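
The conversion that does work today is an element-wise copy, assuming the constraint pins down the element type (the helper name is illustrative):

func toBytes[T byte](list []T) []byte {
	out := make([]byte, len(list))
	for i, v := range list {
		out[i] = byte(v) // T -> byte conversion is allowed: T's type set is {byte}
	}
	return out
}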

@corto25 commented Jan 16, 2024

While the title of this issue is constraining us to think in terms of a switch, could there be room to explore other ways to specialize behavior for specific types or families of types? My feeling is that type parameter switches will lead to code that is very difficult to test for all the types it can/will be used with.

I would favor a technique that explores a form of overloading for generic function specialization.

For example, given this generic function:

func PrettyPrint[T any](v T) string {
  return fmt.Sprintf("%v", v)
}

Example of specialization of the function, just for the float64 type:

func PrettyPrint[T = float64](v T) string {
  return fmt.Sprintf("%.2f", math.Abs(v))
}

Or this syntax:

func PrettyPrint[](v float64) string {
  return fmt.Sprintf("%.2f", math.Abs(v))
}

@Merovius (Contributor)

Overloading is a pretty fraught topic. In particular, what should happen if a type satisfies the constraints of multiple functions? I'll note that this is exactly what makes C++ concepts NP-complete: It allows overloading of functions and, if the constraints of multiple functions match a type, chooses "the most specific one". Determining that is an NP-complete problem. This wouldn't be a problem for Go yet (right now, checking which constraint is "more specific" can probably be implemented efficiently), but it definitely is a road that should be chosen with care.

I'll also note that overloading is the one form of polymorphism Go has so far eschewed.

Personally, I still favor the original proposal here (with the exception of the , form, I think), with no real preference for the exact syntax. I don't feel personally convinced by @jimmyfrasche's arguments.

I'm not sure having access to the "outer" type really matters. Especially given that the type argument is the same - so the dynamic type of interfaces or what expressions evaluate to don't change - and the type parameter in the switch case will have stronger constraints than the one on the function. So ISTM anything that's possible with the "outer" one should also be possible with the "inner" one.

The original type switch just seems the simpler syntax and - to me, at least - the simpler semantics as well. I find it significantly easier to understand than having to introduce an entirely new type parameter. YMMV, of course.

@rogpeppe (Contributor, Author)

@jimmyfrasche wrote in reply to this comment:

No. I don't think anything like that should be allowed, though. It would be too magical.

I suppose you could use unsafe quite safely in this case, if you absolutely needed the performance boost.

FWIW there's no need for unsafe for that case, even now: https://go.dev/play/p/n0yqq6EGPfy
It's even possible (though awkward and probably inefficient) to do it with only a single type check and no possibility of panic: https://go.dev/play/p/IcIpvbdGooe

This is precisely the kind of case that this proposal is designed for: it's common to have several values all defined in terms of the same type parameter. Your suggestion doesn't help with that common case at all.

For the record, this is what it would look like using this proposal:

func ElemIndex[T comparable](x []T, e T) int {
	switch type T {
	case byte:
		return bytes.IndexByte(x, e)
	default:
		for i, g := range x {
			if g == e {
				return i
			}
		}
		return -1
	}
}

@arvidfm wrote:

@rogpeppe For the record, this does not compile today:

func F[T byte](list []T) int {
	// cannot convert list (variable of type []T) to type []byte
	_ = []byte(list)
	// cannot use list (variable of type []T) as []byte value in argument to bytes.Index
	return bytes.Index(list, []byte("."))
}

But this does (and this is what this proposal is aimed at replacing):

func F[T any](list []T) int {
	return bytes.IndexByte(any(list).([]byte), '.')
}
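
One caveat worth noting for the ~ discussion above: that assertion matches only the exact type []byte, so a slice of a derived byte type falls through (or panics, in the unchecked form). A checked sketch, with illustrative names:

type MyByte byte

func G[T any](list []T) int {
	if b, ok := any(list).([]byte); ok {
		return bytes.IndexByte(b, '.')
	}
	return -1 // e.g. G[MyByte]: any(list) has dynamic type []MyByte, so the assertion fails
}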

@arvidfm commented Feb 10, 2024

I am once again finding myself daydreaming about this proposal as I'm playing around with the new experimental function iterators in 1.22. I wanted to write a function that takes any kind of iterable type and spits out an iter.Seq:

type Iterable[V any] interface {
	~func(func(V) bool) | ~[]V
}

func AsSeq[V any, VI Iterable[V]](it VI) iter.Seq[V] {
	switch type VI {
	case ~func(func(V) bool):
		return iter.Seq[V](it)
	case ~[]V:
		return func(yield func(V) bool) {
			for _, v := range it {
				if !yield(v) {
					return
				}
			}
		}
	}
}

which could then be used as:

func Map[T, U any, TS Iterable[T]](it TS, f func(T) U) iter.Seq[U] {
	return func(yield func(U) bool) {
		seq := AsSeq[T](it)
		seq(func(t T) bool {
			return yield(f(t))
		})
	}
}

This would have made for a really nice and ergonomic API, where users could pass in any kind of iterable value to Map, even derived ones, and have it Just Work™. For iter.Seq in particular, supporting derived types would have been nice because then all my functions could have returned a custom Seq-derived type with methods like Collect() for converting to a slice or Count() for getting the total number of elements.

Maybe someone smarter than me can think of a solution, but for the life of me I can't come up with an implementation that doesn't impose major constraints, like forcing the user to convert to iter.Seq before calling into the API, not allowing derived types, or having to use reflection.

The best I've managed to come up with is this reflection-based monstrosity:

func AsSeq[V any, VS Iterable[V]](it VS) iter.Seq[V] {
	switch vi := any(it).(type) {
	case iter.Seq[V]:
		return vi
	case []V:
		return func(yield func(V) bool) {
			for _, v := range vi {
				if !yield(v) {
					return
				}
			}
		}
	}

	// VS is a derived type, fall back to reflection
	v := reflect.ValueOf(it)
	return func(yield func(V) bool) {
		switch v.Kind() {
		case reflect.Func:
			v.Call([]reflect.Value{reflect.ValueOf(yield)})
		case reflect.Slice:
			length := v.Len()
			for i := 0; i < length; i++ {
				if !yield(v.Index(i).Interface().(V)) {
					return
				}
			}
		default:
			panic("unexpected type")
		}
	}
}
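
A usage sketch for the fallback above (IntSeq is an illustrative derived type): the exact slice takes the fast path, while the derived func type misses the any-based cases and lands on the reflection path:

type IntSeq func(func(int) bool)

func example() {
	AsSeq[int]([]int{1, 2, 3})(func(v int) bool { fmt.Println(v); return true })

	var s IntSeq = func(yield func(int) bool) { yield(42) }
	AsSeq[int](s)(func(v int) bool { fmt.Println(v); return true })
}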
