proposal: spec: sum types based on general interfaces #57644
Comments
Could you comment on why this restriction occurs? Is this simply to err on the side of caution initially and potentially remove this restriction in the future? Or is there a technical reason not to do this? |
The reason to not permit |
Is there a technical reason that the language could not also evolve to support |
In a vacuum, I'd prefer pretty much any other option, but since it's what generics use, it's what we should go with here and we should embrace it fully. Specifically,
@dsnet I think |
With the direct storage mechanism detailed in the post as an alternative to boxing, would it be possible for the zero-value not to be type Example interface {
int16 | string
} the zero value in memory would look like
I don't understand this comment, which may indicate that I'm missing something fundamental about the explanation. Why would pointers make any difference? If the above |
The example in the proposal is rather contrived, so I tried to imagine some real situations I've encountered where this new capability could be useful to express something that was harder to express before. Is the following also an example of something that this proposal would permit? type Success[T] struct {
Value T
}
type Failure struct {
Err error
}
type Result[T] interface {
Success[T] | Failure
}
func Example() Result[string] {
return Success[string]{"hello"}
} (NOTE WELL: I'm not meaning to imply that the above would be a good idea, but it's the example that came most readily to mind because I just happened to write something similar -- though somewhat more verbose -- to smuggle (result, error) tuples through a single generic type parameter yesterday. Outside of that limited situation I expect it would still be better to return Another example I thought of is Although I expect it would not be appropriate to change this retroactively for compatibility reasons, presumably a hypothetical green field version of that type could be defined like this instead: type Token interface {
Delim | bool | float64 | Number | string
// (json.Token also allows nil, but since that isn't a type I assume
// it wouldn't be named here and instead it would just be
// a nil value of type Token.)
} Given that the exact set of types here is finite, would we consider it to be a breaking change to add new types to this interface later? If not, that could presumably allow the following to compile by the compiler noticing that the // TokenString is a rather useless function that's just here to illustrate an
// exhaustive type switch...
func TokenString(t Token) string {
switch t := t.(type) {
case Delim:
return string(t)
case bool:
return strconv.FormatBool(t)
case float64:
return strconv.FormatFloat(t, 'g', -1, 64)
case Number:
return string(t)
case string:
return t
}
} I don't feel strongly either way about whether such sealed interfaces should have this special power, but it does seem like it needs to be decided either way before implementation because it would be hard to change that decision later without breaking some existing code. Even if this doesn't include a special rule for exhaustiveness, this still feels better in that it describes the range of EDIT: After posting this I realized that my type switch doesn't account for Finally, it seems like this would shrink the boilerplate required today to define what I might call a "sealed interface", by which I mean one which only accepts a fixed set of types defined in the same package as the interface. One way I've used this in the past is to define struct types that act as unique identifiers for particular kinds of objects but then have some functions that can accept a variety of different identifier types for a particular situation: type ResourceID struct {
Type string
Name string
}
type ModuleID struct {
Name string
}
type Targetable interface {
// Unexported method means that only types
// in this package can implement this interface.
targetable()
}
func (ResourceID) targetable() {}
func (ModuleID) targetable() {}
func Target(addr Targetable) {
// ...
} I think this proposal could reduce that to the following, if I've understood it correctly: type ResourceID struct {
Type string
Name string
}
type ModuleID struct {
Name string
}
type Targetable interface {
ResourceID | ModuleID
}
func Target(addr Targetable) {
// ...
}
If any of the examples I listed above don't actually fit what this proposal is proposing (aside from the question about exhaustive matching, which is just a question), please let me know! If they do, then I must admit I'm not 100% convinced that the small reduction in boilerplate is worth this complexity, but I am leaning towards 👍 because I think the updated examples above would be easier to read for a future maintainer who is less experienced with Go and so would benefit from a direct statement of my intent rather than having to infer the intent based on familiarity with idiom or with less common language features. |
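For reference, here is a runnable sketch of how the unexported-method version of Targetable behaves in Go today when consumed through a type switch. The Describe function and its return strings are hypothetical, added only for illustration. The switch needs a nil case, and it also needs a default, because pointer types such as *ModuleID satisfy the method-based interface too:

```go
package main

import "fmt"

type ResourceID struct{ Type, Name string }
type ModuleID struct{ Name string }

// Targetable is sealed by an unexported method (today's idiom).
type Targetable interface{ targetable() }

func (ResourceID) targetable() {}
func (ModuleID) targetable()   {}

// Describe is a hypothetical consumer of Targetable values.
func Describe(addr Targetable) string {
	switch a := addr.(type) {
	case ResourceID:
		return "resource " + a.Type + "." + a.Name
	case ModuleID:
		return "module " + a.Name
	case nil:
		// The zero value of any interface type is nil.
		return "no target"
	default:
		// Reachable today: *ResourceID and *ModuleID also have the
		// targetable method in their method sets.
		return fmt.Sprintf("unexpected %T", a)
	}
}

func main() {
	fmt.Println(Describe(ResourceID{Type: "kind", Name: "name"}))
	fmt.Println(Describe(ModuleID{Name: "m"}))
	fmt.Println(Describe(nil))
	fmt.Println(Describe(&ModuleID{Name: "m"}))
}
```

With the union form (ResourceID | ModuleID), the type set would hold exactly those two types, so the pointer case would presumably be rejected at compile time. The nil case would remain under the proposal as written.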
@dsnet Sure, we could permit |
@DeedleFake The alternative implementation is only an implementation issue, not a language issue. We shouldn't use that to change something about the language, like whether the value can be The reason pointer values matter is that given a value of the interface type, the current garbage collector implementation has to be able to very very quickly know which fields in that value are pointers. The current implementation does this by associating a bitmask of pointers with each type, such that a 1 in the bitmask means that the pointer-sized slot at that offset in the value always holds a pointer. |
@apparentlymart I think that everything you wrote is correct according to this proposal. Thanks. |
It would be, but I think it would be worth it. And I don't think it would be so strange as to completely preclude eliminating the extra oddness that would come from union types always being nilable. In fact, I'd go so far as to say that if this way of implementing unions has to have them be nilable, then a different way of implementing them should be found.
I was worried it was going to be the garbage collector... Ah well. |
A major problem is that type constraints work on static types while interfaces work on dynamic types of objects. This immediately prohibits this approach to do union types.
This works because the static type of |
@merykitty per my understanding of the proposal, I think for the dynamic form of what you wrote you'd be expected to write something like this:
type Addable interface {
int | float32
}
func Add(x, y Addable) Addable {
switch x := x.(type) {
case int:
return x + y.(int)
case float32:
return x + y.(float32)
default:
panic(fmt.Sprintf("unsupported Addable types %T + %T", x, y))
}
} Of course this would panic if used incorrectly, but I think that's a typical assumption for interface values since they inherently move the final type checking to runtime. I would agree that the above seems pretty unfortunate, but I would also say that this feels like a better use-case for type parameters than for interface values and so the generic form you wrote is the better technique for this (admittedly contrived) goal. |
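For comparison, a minimal sketch of the generic form, which compiles with current Go. The constraint name Addable is reused from the snippet above; the rest is illustrative. Because both arguments share one type parameter T, the compiler knows x and y have the same type, so + is well defined and no type switch or panic is needed:

```go
package main

import "fmt"

// Addable is used here as a type parameter constraint.
type Addable interface {
	int | float32
}

// Add requires x and y to have the same type T, checked at compile time.
func Add[T Addable](x, y T) T {
	return x + y
}

func main() {
	fmt.Println(Add(1, 2))
	fmt.Println(Add(float32(1.5), float32(2)))
	// Add(1, float32(2)) // compile error: mismatched types for T
}
```

The mixed-type call is rejected statically, whereas the interface-based version can only detect it at run time.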
@merykitty No, in your example, |
also, note that the type set never includes interfaces. So |
Is something like that going to be allowed? type IntOrStr interface {
int | string
}
func DoSth[T IntOrStr](x T) {
var a IntOrStr = x
_ = a
} |
Let's say I have these definitions. type I1 interface {
int | any
}
type I2 interface {
string | any
}
type I interface {
I1 | I2
} Would it be legal to have a variable of type |
@mateusz834 Can't see why not.
I think the answer to all of these is "yes". For the cases where you assign an interface value, the dynamic type/value of the |
FWIW my main issue with this proposal is that IMO union types should allow representing something like |
@ianlancetaylor Does the proposal as-is allow both type sets and functions in an interface? It would have a remarkable property not typically present in sum types where you could have a closed set of types along with the ability to have those types implement some common functions and be used as an interface. |
For reference, this has been suggested a few times in #19412 and #41716, starting with #19412 (comment). Requiring nil variants versus allowing source code order to affect semantics is the classic tension of sum types proposals.
The spelling of a type with no information beyond existence is usually
Yes, since the proposal is just to allow values of general interfaces less |
Thanks @zephyrtronium. Taking your feedback into account, and also realizing that it is easy to redefine types, then I think points (2) and (3) I raised are not issues. Type definitions can be used to give the same type different semantics for each case. For example: type ClaimPredicateUnconditional struct{}
type ClaimPredicateAnd []ClaimPredicate
type ClaimPredicateOr []ClaimPredicate
type ClaimPredicateNot ClaimPredicate
type ClaimPredicateBeforeAbsoluteTime Int64
type ClaimPredicateBeforeRelativeTime Int64
type ClaimPredicate interface {
ClaimPredicateUnconditional |
ClaimPredicateAnd |
ClaimPredicateOr |
ClaimPredicateNot |
ClaimPredicateBeforeAbsoluteTime |
ClaimPredicateBeforeRelativeTime
} In the main Go code base I work in we have 106 unions implemented as multi-field structs, which require a decent amount of care to use. I think this proposal would make using those unions easier to understand, probably on par in terms of effort to write. If tools like gopls went on to support features like pre-filling out the case statements of a switch based on the type sets, since it can know the full set, that would make writing code using them easier too. The costs of this proposal feel minimal. Any code using the sum type would experience the type as an interface and have nothing new to learn over that of interfaces. This is I think the biggest benefit of this proposal. |
To me, On the one hand, On the other hand, union Exhaustiveness in type |
I have been reading this thanks to @Merovius linking it from the [go-nuts] list. Seems to me the biggest argument against interfaces-as-sum-types is over zero values, i.e. that Go fundamentally requires zero values and that can't change, there is no consensus on how to arrive at a zero value for these sum types, and with others wanting sum types to not have zero values as they see zero values conflicting with the benefits they see sum-types providing. IOW, a classic catch-22. If I understand this wrong, please let me know. I think it would be great if this could become a feature of Go so I considered that catch-22 in hopes to resolve it and came up with something I think could work. The first aspect would be to require that these sum types not be able to be instantiated without providing an explicit value. That would mean some of the following would throw a compiler error:
type Identifier interface {
int | string
}
var widgetId Identifier // throws compile error
widgetId := Identifier(1) // compiles fine
widgetIds := make([]Identifier,3) // throws compile error
widgetIds := []Identifier{ // compiles fine
Identifier(123),
Identifier("happy"),
Identifier(456),
} Unless I miss some way in which a property can get a zero value, the above limitation would be sufficient to ensure that a sum type never had an opportunity to have a zero value (I ignored in my example returning an uninitialized value from a If simply disallowing sum types from being created if not initialized is not sufficient — because someone might use CGo or some other edge case to create an uninitialized sum type — then we would need a real zero value. That is where IMO reconsidering the untyped builtin zero (#61372) could fit in. A sum type could have a zero value of just So if it comes to pass that a variable or expression of type of a sum type has a zero value then using that variable or expression for anything other than assignment of a non-zero value or checking if it is equal to The The only real downside I see to this approach would be that you could not pre-create a slice or map with any elements using However, if for our 3rd aspect we allowed extending ids := make([]Identifier{0},10) And this sets a slice of 25 ids := make([]Identifier{""},25) So there it is. Please feel free to poke holes in this approach when and if you find any. P.S. We don't really even need |
@mikeschinkel Thanks. The idea of not permitting the type to be instantiated without a value has been suggested several times before in the various discussions of sum types. It has always been rejected. Zero values are built into the language too deeply. For one example--and just for one example, there are other problems--how do you handle a type assertion to the sum type if the type assertion fails? What value do you give to the first result of the type assertion? |
@ianlancetaylor — Thank you for acknowledging. I see your perspective in how my suggestion also results in a zero values concern.
I expect you meant that as a rhetorical question, but since you posed the question I hope you do not mind me at least answering it. If a type assertion fails in an assignment to a variable of a sum type then the variable would get the value of That seems reasonable to me, at least, because a failed type assertion is a failure so accessing the value of that variable is almost certainly a logic error anyway. Right? That scenario does bring up a question of whether or not a I do respect that you and others may view those constraints as not what Go should be, and I will be accepting of that if it is the final ruling. However, AFAICT, I still think the logic of my suggestion is valid, unless there is some other scenario that emerges that cannot be resolved in the same way as for failed type assertions. 🤷♂️ |
@mikeschinkel IMO your suggestion is now back to the point where every sum type has a zero value of That doesn't mean the suggestion isn't viable, it's just that it doesn't differ significantly from what we have been talking about so far. To me, that means FWIW that there is no need to disallow
This has been discussed above as well. It seems hard to impossible to me to disallow it, without drastic changes to Go's type system. Whether or not a variable is |
Admittedly there is not much difference, but there is one tangible difference; consistency. If we allowed that sum types could just be But yes, that is the only difference, however it could be the difference between someone objecting to sum types vs. supporting them. What percentage of people who would do each remains to be seen.
Yes, and the difference is that the combination of things — at least on this issue — have not been discussed prior AFAICT.
With the addition of initializers in my suggestion there is no reason to disallow
Yes, and that is where we disagree. My suggestion proposes making zero an exceptional case such that the vast majority of code would safely not deal with it because existence of a zero value and subsequent use would in itself be an exceptional case worthy of an immediate panic.
From a purity standpoint you are probably correct. But my understanding of Go's nature is that they have historically placed emphasis on pragmatism over purity. Otherwise there would have been no Respecting the existing nature of the Go language, I am arguing that since it is impossible to find a perfect solution, maybe instead we could be pragmatic and accept a really good one? The reason I think this approach could work is because there is only one way I have discovered thus far that a sum type variable could have a
Unless I am missing something, it would seem easy to determine if a variable is used without So I ask this: rather than discuss in abstract terms, can you or others identify places where the compiler could not easily identify when a sum type variable received a value of |
If we had sum types, we could return an var typeAsserted optional[MySumType] = nothing
typeAsserted, _ = anything.(MySumType) (In practice, you would usually not declare the destination separately, I just did it to highlight its type.) But if we can generalize it so |
I feel like that should have made clear that this was just one example. As far as I can tell, to name a few others, you have not yet talked about channel-receives, map-accesses, I'll also note that the suggestion to disallow uninitialized values came up in this discussion before and most of this list has been posted there as well. And while I appreciate that it is frustrating to be told that something you see as an easy solution is unworkable, I'd also ask for a little bit of trust that when people like Ian or I say things like "Zero values are built into the language too deeply", it's not just an off-the-cuff remark. We wouldn't say that, if we saw a realistic way to make it work. In particular, listing instances of where zero values are mentioned in the spec is not meant as a request to special case solutions to them, but as a demonstration of what we mean when we say "zero values are built into the language too deeply". |
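To make the "built in too deeply" point concrete, here is a small sketch of several places where current Go produces a zero value with no explicit initialization. The Shape and Square names are hypothetical stand-ins for a sum-like interface; the list is illustrative, not exhaustive:

```go
package main

import "fmt"

// Shape is an ordinary interface standing in for a would-be sum type.
type Shape interface{ Area() float64 }

type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

// zeroSources collects values that the language itself fills in.
func zeroSources() []Shape {
	var out []Shape

	// 1. Failed comma-ok type assertion yields the zero value.
	var x any = "not a shape"
	s, _ := x.(Shape)
	out = append(out, s)

	// 2. Map access with a missing key.
	m := map[string]Shape{}
	out = append(out, m["missing"])

	// 3. Receive from a closed channel.
	ch := make(chan Shape)
	close(ch)
	out = append(out, <-ch)

	// 4. Elements of a freshly made slice.
	out = append(out, make([]Shape, 1)[0])

	return out
}

func main() {
	for i, s := range zeroSources() {
		fmt.Println(i, s == nil)
	}
}
```

Each of the four values is nil. A rule forbidding uninitialized sum-type values would have to give every one of these constructs some other meaning.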
This might be a bit off-topic, but have zero value semantics like this been discussed?: func main() {
var m map[string]string
assert(m == zero)
assert(m == nil)
m = nil
assert(m != zero)
assert(m == nil)
m = zero
assert(m == zero)
assert(m == nil)
}
func assert(b bool) {
if !b {
panic("assertion failed")
}
} |
And FWIW
I assume that the pragmatic solution we will eventually adopt (if any) is to use union-element interfaces as variants and make their zero value |
@mrwonko Thanks. In these kinds of discussions, it is always possible to find a solution for any given problem. But it is also necessary to step back and consider the overall picture. Go is intended to be a reasonably simple, reasonably orthogonal language. When we add special cases we weaken those properties. This proposal is, I think, a somewhat simple, reasonably orthogonal, change that we could make. The question here is not how to complicate it to make it better. We're almost certainly not going to do that. Rather than make it more complicated, we will choose to make no change at all. The question here is whether to make this change at all--that is, whether the benefits of the change are worth adding more complexity to the language. Or perhaps we can find a way to make it more simple and more orthogonal. |
Odin lang is in many ways inspired by golang. Odin has Foo :: enum {
A,
B,
C,
D,
}
f := Foo.A https://odin-lang.org/docs/overview/#partial-switch It is true that if golang tries to implement sum types via extending/overloading
|
@ngortheone Note that this issue is specifically about using union-elements in interfaces as variants. There are other issues (my personal favorite is #54685) to discuss other ideas and #19412 as an umbrella issue for the general idea of variant types. I'll note that the vague notion of adding a new syntactical construct and type kind has been suggested a lot of times so far, so your suggestion isn't really novel. |
That probably means that the solution space to the sum type problem is small and the search has already exhausted all good options. The main question now is: Knowing all pros and cons of each solution will golang decide to go for any solution at all? |
Logically-speaking, why? I addressed the one example, and was looking forward to considering others.
I would tackle each of these, but from the tone of your arguments I don't feel like continuing what is evidently a contentious debate with you here on this issue.
I had searched for "disallow" and "initialized" on this page prior to my posting and they appeared nowhere. I searched again just now, but this time I opened up all the posts marked "off-topic" and found it was you telling @atdiar it wasn't possible, that culminated in his frustrated (your word) post before Robert Griesemer called for respectful discussion. However, nowhere in that dialog did anyone other than you — i.e. no one from the Go team — argue against the idea. So my takeaway is that you are asserting that if you already expressed an opinion against something that no one else should be able to discuss it? Just wanting to make sure I understand correctly.
No, it is absolutely not frustrating to be told something that what I presented as a strawman proposal is unworkable when objective and specific arguments against it are given. That is entirely the point of such a proposal to flesh out its feasibility.
What is frustrating instead is to be told, effectively "We have already considered every conceivable option and so you should just trust me that you have no value to offer here."
As George Bernard Shaw said "The single biggest problem in communication is the illusion that it has taken place." You assume the statement "Zero values are built into the language too deeply" is interpreted as you understand the phrase in a binary form exactly as you understand it without recognizing that others don't interpret that statement the same. From my perspective my proposal absolutely respected that statement; why else would I have included the concept of BTW, I really like how Ian engages in discussions on this forum. He always replies in a respectful manner, makes a statement when he needs to, but evidently doesn't feel the need to debate everyone who has a proposal, even if it is not one they will pursue. The Go team then ultimately makes their decisions and we all move forward. His approach makes everyone feel as if they can contribute, but is tactful when a discussion gets out of control and reins it in with a statement of intent. It would be a lot nicer in these forums if everyone were able to participate without any self-appointed gatekeepers. |
@mikeschinkel FWIW there is also #19412, which is more general, so contains more discussions of broader proposals than this one. Searching that casually brings up more discussion about these specific problems, involving a lot more people than me, including people on the Go team. So, apologies for writing "this discussion". It was imprecise. The general discourse about variants has been going on for a while and I don't always remember where all parts of it happened. |
Just to briefly underscore one point @Merovius has made a few times, I thought it might be helpful to re-post this snippet of Ian's original proposal text (from top comment above):
(And of course, given how bad GitHub issues are for long conversations, it's worth keeping in mind the benefits of scaling with many conversations in places outside of GitHub issues, such as the |
@thepudds, given your comment it is worth noting that while some people may see discussion as being a different proposal, others making suggestions see it as addressing ways to make the original proposal viable. Also, given the concept of scaling with many conversations, it would be respectful of and incumbent on those who have the time to seek out and follow many different discussions in many different places that not everyone is fully aware of to not seek to tamp down comments by others without at least first linking to their specific points from those other discussions, and especially before admonishing people for discussing things "that have already been addressed and resolved," but elsewhere. #fwiw |
I think it would be very helpful if the resolved concrete types were also possible to list using the reflect package. If they are, it would be possible to add functionality to json.Unmarshal to write to these interfaces. That is, this example would work, if json.Unmarshal would get the types
would print 42, and v would have the underlying type I don't think the json package functionality would have to be part of this exact proposal. There would have to be decisions made about error handling, for example. But I would expect some support in the
With that documentation, I suppose methods would also be checked, otherwise |
Also see #68710 (comment) |
Is there any active proposal about nil safety? If so, how might it interact with the concern raised above? |
@gonzojive not yet to my knowledge. That would require to think about the usual type constructors in terms of accepting zeroable arguments or not. nil being the zero of interface types it would become a compile time error to create slices of such union/sum types for instance. Easily resolved with a notation such as In fact, everything can be done. The question is whether that would be a good use of the complexity budget. Even, type assertions could be handled properly. w, ok:= v.(A) where A would not have untyped nil in its typeset meaning nil wouldn't be assignable, we could still decide that such interface zero value is nil. Just assignment of nil would not be possible. So Then it's about what to do in the branch Overall nil is not an issue. The issue is nil in interface values. Because interfaces don't differentiate between value and pointer types when we call a method and because checking an interface value for the nilness of its content is not very practical yet. Hindsight is always 20/20 unfortunately, I wish That would get rid of the faq entry on nil comparison, perhaps shorten error handling a tiny bit. But anyway, this is all related. That's why I still like this proposal because it seems that it's just a piece of the overall puzzle. |
This is a speculative issue based on the way that type parameter constraints are implemented. This is a discussion of a possible future language change, not one that will be adopted in the near future. This is a version of #41716 updated for the final implementation of generics in Go.
We currently permit type parameter constraints to embed a union of types (see https://go.dev/ref/spec#Interface_types). We propose that we permit an ordinary interface type to embed a union of terms, where each term is itself a type. (This proposal does not permit the underlying type syntax ~T to be used in an ordinary interface type, though of course that syntax is still valid for a type parameter constraint.)
That's really the entire proposal.
Embedding a union in an interface affects the interface's type set. As always, a variable of interface type may store a value of any type that is in its type set, or, equivalently, a value of any type in its type set implements the interface type. Inversely, a variable of interface type may not store a value of any type that is not in its type set. Embedding a union means that the interface is something akin to a sum type that permits values of any type listed in the union.
For example:
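A sketch of declarations consistent with the description that follows. The exact declarations are an assumption reconstructed from that description, and the helpers describe1 and describe2 are hypothetical. So that it compiles with current Go, I1 and I2 are used here as type parameter constraints, which have the same type sets the proposal would give them as ordinary interface types:

```go
package main

import "fmt"

type MyInt int
type MyOtherInt int
type MyFloat float64

// I1 lists defined types; exactly those types are in its type set.
type I1 interface {
	MyInt | MyFloat
}

// I2 lists predeclared types without ~, so defined types whose
// underlying type is int or float64 are not in its type set.
type I2 interface {
	int | float64
}

// describe1 accepts only types in I1's type set.
func describe1[T I1](v T) string { return fmt.Sprintf("%T is in I1", v) }

// describe2 accepts only types in I2's type set.
func describe2[T I2](v T) string { return fmt.Sprintf("%T is in I2", v) }

func main() {
	fmt.Println(describe1(MyInt(1)))
	fmt.Println(describe1(MyFloat(2.5)))
	fmt.Println(describe2(3))
	// describe1(MyOtherInt(1)) // compile error: MyOtherInt does not satisfy I1
	// describe2(MyInt(1))      // compile error: MyInt does not satisfy I2
}
```

Under the proposal, the same membership rules would decide which values may be stored in a variable of type I1 or I2.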
The types MyInt and MyFloat implement I1. The type MyOtherInt does not implement I1. None of MyInt, MyFloat, or MyOtherInt implement I2.
In all other ways an interface type with an embedded union would act exactly like an interface type. There would be no support for using operators with values of the interface type, even though that is permitted for type parameters when using such a type as a type parameter constraint. This is because in a generic function we know that two values of some type parameter are the same type, and may therefore be used with a binary operator such as +. With two values of some interface type, all we know is that both types appear in the type set, but they need not be the same type, and so + may not be well defined. (One could imagine a further extension in which + is permitted but panics if the values are not the same type, but there is no obvious reason why that would be useful in practice.)
In particular, the zero value of an interface type with an embedded union would be nil, just as for any interface type. So this is a form of sum type in which there is always another possible option, namely nil. Sum types in most languages do not work this way, and this may be a reason to not add this functionality to Go.
As an implementation note, we could in some cases use a different implementation for interfaces with an embedded union type. We could use a small code, typically a single byte, to indicate the type stored in the interface, with a zero indicating nil. We could store the values directly, rather than boxed. For example, I1 above could be stored as the equivalent of struct { code byte; value [8]byte } with the value field holding either an int or a float64 depending on the value of code. The advantage of this would be reducing memory allocations. It would only be possible when all the values stored do not include any pointers, or at least when all the pointers are in the same location relative to the start of the value. None of this would affect anything at the language level, though it might have some consequences for the reflect package.
As I said above, this is a speculative issue, opened here because it is an obvious extension of the generics implementation. In discussion here, please focus on the benefits and costs of this specific proposal. Discussion of sum types in general, or different proposals for sum types, should remain on #19412 or newer variants such as #54685. Thanks.
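The direct-storage layout from the implementation note can be simulated in user code. This is only a sketch of the idea, not how the runtime would do it. The type packed, the code values 1 and 2, the use of int64 in place of int, and the little-endian encoding are all assumptions made for illustration:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// packed mimics struct { code byte; value [8]byte }.
// Assumed codes: 0 = nil, 1 = int64, 2 = float64.
type packed struct {
	code  byte
	value [8]byte
}

// packInt stores an integer directly in the value bytes, with no boxing.
func packInt(v int64) packed {
	p := packed{code: 1}
	binary.LittleEndian.PutUint64(p.value[:], uint64(v))
	return p
}

// packFloat stores the IEEE 754 bits of a float64 in the value bytes.
func packFloat(f float64) packed {
	p := packed{code: 2}
	binary.LittleEndian.PutUint64(p.value[:], math.Float64bits(f))
	return p
}

// unpack decodes the stored value according to the code byte.
func (p packed) unpack() any {
	switch p.code {
	case 1:
		return int64(binary.LittleEndian.Uint64(p.value[:]))
	case 2:
		return math.Float64frombits(binary.LittleEndian.Uint64(p.value[:]))
	default:
		return nil // the zero value of packed decodes as nil
	}
}

func main() {
	fmt.Println(packInt(42).unpack())
	fmt.Println(packFloat(1.5).unpack())
	fmt.Println(packed{}.unpack())
}
```

Neither variant contains a pointer, which is the condition the note gives for this layout to be usable. The all-zero bytes of packed{} double as the nil case.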