Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: sigma types #52096

Closed
zephyrtronium opened this issue Apr 1, 2022 · 15 comments
Closed

proposal: Go 2: sigma types #52096

zephyrtronium opened this issue Apr 1, 2022 · 15 comments
Labels
FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Milestone

Comments

@zephyrtronium
Copy link
Contributor

zephyrtronium commented Apr 1, 2022

Author background

  • Would you consider yourself a novice, intermediate, or experienced Go programmer? Experienced.
  • What other languages do you have experience with? C, Java, Rust, Python, MATLAB, various flavors of assembly

Related proposals

  • Has this idea, or one like it, been proposed before? It resembles sum types (proposal: spec: add sum types / discriminated unions #19412) or typed enums (proposal: spec: add typed enum support #19814) in the capability of the type system, although it is more general.
    • If so, how does this proposal differ? The types proposed are more general than sum types, which are in turn more general than typed enums. In particular, the mechanisms in this proposal would greatly simplify certain scenarios encountered in marshaling and unmarshaling when compared to sum types.

Proposal

  • What is the proposed change?

Introduce a restricted version of sigma types into Go. Formally, sigma types are pairs (a: T, b: B(a)) where a is a value of type T and B is a function that constructs the type of b from the value of a. Less formally, they can be considered tagged unions where the tag is an accessible value of a type the programmer can select.

Sigma types are a central feature of dependent type systems, which make it possible to statically prove many facts about the safety of programs. Such systems allow type metaprogramming, which allows the type constructor B to depend on variable values of a. Because Go does not have metaprogramming, I propose the restriction that a must be one of a list of constants. I believe that sigma types will be useful despite that restriction.

To avoid an unhelpfully mathematical name, I propose to name these types "switch types." A switch type has a discriminator, which is a variable of any type that can represent constants. It also has a discriminant set, which is a fixed set of constant values of the same type as the discriminator. Each value in the discriminant set maps statically to a dependent type. A value of a switch type has its discriminator equal to one of the values in the discriminant set and a dependent value of the type to which that value maps.

The syntax I propose is as follows. We add a new kind of type literal, SwitchType, with the following EBNF:

SwitchType     = "switch" Type "{" { SwitchTypeElem ";" } "}" .
SwitchTypeElem = SwitchTypeCase ":" Type .
SwitchTypeCase = "case" ExpressionList | "default" .

The Type following switch specifies the type of the discriminator. Each case gives a list of values of the discriminator and the type corresponding to those values. The values in the case clauses must be constants representable by the discriminator type. These form the discriminant set. It is an error to specify the same value multiple times within a switch type definition. The default clause specifies the type when the discriminator is equal to the zero value; it is an error to both specify the zero value in a case clause and to use a default clause in the same switch type definition. It is additionally an error to specify neither a default clause nor the zero value in a case clause, which is to say that the zero value must appear exactly once in the discriminant set.

The zero value of a switch type is the zero value of the discriminator type and the zero value of the dependent type thereof. This is the reason the zero value must appear in the discriminant set.

Switch type literals are composite literals containing zero or one keyed elements. If a keyed element appears, the key must be a constant equal to a member of the discriminant set of the switch type, and the element must be a value of the type to which the switch type maps the key.

In order to inspect the dynamic state of a switch-typed value, we use syntax analogous to the same operations on interface types. These come in two flavors. First, we can use switch assertions, a primary expression:

SwitchAssertion = "." "(" Expression ")" .

The expression in a SwitchAssertion must be a constant in the discriminant set of the switch type. Like type assertions on interfaces, switch assertions can be used in contexts expecting one or two values. When there is only one result, the assertion panics if the discriminator is not equal to the expression. When there are two, the first is the zero value of the corresponding dynamic type, and the second is a bool indicating whether the assertion succeeded.

Second, we can use switches that mirror type switches, which I will call discriminator switches:

DiscrimSwitchStmt  = "switch" [ SimpleStmt ";" ] DiscrimSwitchGuard "{" { DiscrimCaseClause } "}" .
DiscrimSwitchGuard = [ identifier ":=" ] PrimaryExpr "." "(" "case" ")" .
DiscrimCaseClause  = DiscrimSwitchCase ":" StatementList .
DiscrimSwitchCase  = "case" ExpressionList | "default" .

The expressions in the case clauses in a discriminator switch must each be in the discriminant set of the switch type of the primary expression of the switch switch guard. Analogous to type switches on interfaces, if an assignment appears in the switch guard and all expressions in the case which matches the discriminator appear in the same case of the switch type definition, then the assigned variable has the type to which that case maps; otherwise, it has the switch type.

Lastly, specific to switch types, the special form x.(switch) for switch-typed x evaluates to the current value of the discriminator.

Two switch types are identical if their discriminant sets are equal and both map each discriminator value to identical types. Neither the discriminator nor the dependent value of a switch value is addressable. Switch types are comparable when each dependent type of the switch type is comparable (all types which can represent constants are comparable, so the discriminator implicitly is as well) and are equal when their discriminators and dependent values are respectively equal.

A minor syntactic ambiguity arises with this definition of switch types. This occurs where a switch type literal without a type name is written in a statement context. Because Go requires that every literal must be used, the only otherwise valid use of a switch type literal in statement context is, ostensibly, to call a method or perform a channel send on a value of one of the dependent types:

func anime() {
	switch string {
	case "anime": int
	default:      error
	}{}.("").Error()
}

Considering the awkwardness and uselessness of this expression, it is unlikely that it would ever arise in practice. Therefore, the parser assumes a switch statement when encountering the switch keyword in statement context. If it is necessary to write such an expression, the workaround is to wrap at least the type in parentheses.

The alignment requirement of a switch type is the maximum of the alignments of the discriminator type and all dependent types. No guarantees are made about the maximum size or layout of a switch type; a compiler may choose to put the discriminator first, last, in the middle, or spread across bits of the dependent types' storage, and it may or may not choose to rearrange or share storage for different dependent types.

In order to handle switch types dynamically, reflection requires a new Switch value of reflect.Kind. On reflect.Type, a Discriminants method returns a map[any]reflect.Type enumerating the discriminant set as keys and the dependent types to which they map as values. Lastly, a Discriminator method on reflect.Value returns the current value of the discriminator as a reflect.Value. (The existing Elem method returns the dependent value.)

Packages go/ast and go/types will need new types to describe switch types.

  • Who does this proposal help, and why? This proposal allows implementing the full behavior of typed enums and discriminated unions, so all of the uses described in proposal: spec: add sum types / discriminated unions #19412 and proposal: spec: add typed enum support #19814 are addressed (except uses depending on exhaustive checking of cases). It expands upon that with the flexibility to describe types varying with arbitrary values, notably strings, which arises from time to time in some JSON APIs.
  • Please also describe the change informally, as in a class teaching Go. This proposal adds the ability to write types similar to Rust enums that can have values of several different types, but with the added advantage that the dynamic type is controlled by an arbitrary value that can be any constant.
  • Is this change backward compatible? Insofar as changes to the type system can be, yes. A compiler for a version of Go that implements the proposed changes will also compile code written for versions that do not, with no changes in semantics. Other tools will need updates to support the changes to the AST and type system.
    • Show example code before and after the change.
    • Before
type Sum interface {
	isSum()
}

type MadokaVariant struct{}

func (MadokaVariant) isSum() {}

var _ Sum = MadokaVariant{}

type HomuraVariant struct {
	err error
}

func (*HomuraVariant) isSum() {}

var _ Sum = (*HomuraVariant)(nil)

func Anime(homura bool) Sum {
	if homura {
		return &HomuraVariant{errors.New("kyuubei")}
	}
	return MadokaVariant{}
}
  • After
type Sum switch bool {
	case false: struct{}
	case true:  error
}

func Anime(homura bool) Sum {
	if homura {
		return Sum{true: errors.New("kyuubei")}
	}
	return Sum{}
}
  • Before
type Maybe[T any] struct {
	just T
	ok bool
}

func (m Maybe[T]) Just() (T, bool) {
	return m.just, m.ok
}

func (m Maybe[T]) String() string {
	if just, ok := m.Just(); ok {
		return fmt.Sprintf("Just(%v)", just)
	}
	return "Nothing"
}
  • After
const Just = true

type Maybe[T any] switch bool {
	case Just: T
	default:   struct{}
}

func (m Maybe[T]) String() string {
	switch t := m.(case) {
	case Just:
		return fmt.Sprintf("Just(%v)", t)
	default:
		return "Nothing"
	}
}
  • Orthogonality: how does this change interact or overlap with existing features? While it is technically possible to approximate sum types using sealed interfaces, doing so requires a large amount of code to define new types and methods for each variant, the interface zero value of nil is always a variant, and it is impossible to discover the set of variants dynamically. While this proposal addresses all of these problems, the last is, in my opinion, the most significant: reflection-based unmarshaling algorithms cannot decode interface-typed variables in general. Switch types in particular would allow reflection to decode a particular variant based on the value of another field, which is helpful for certain APIs which return JSON like {madoka: "homura", kyuubei: 9} or {error: "anime"} from a single endpoint. This is in addition to the benefits of sum types, such as making applications like parsers much simpler to describe in types.
  • Is the goal of this change a performance improvement? No.

Costs

  • Would this change make Go easier or harder to learn, and why? Harder. I don't think dependent types are a particularly easy concept to learn. Typescript has a concept very similar to switch types which they call dependent types, so people familiar with that language may find the idea accessible. Once users learn them, they may find certain problems easier and more natural to express in safe code, lowering the cost of learning different approaches like sealed interfaces or custom unmarshaling for those situations.
  • What is the cost of this proposal? (Every language change has a cost). The main cost is in the addition to the language, which increases the burden associated with learning Go. Supporting switch types would also add a significant amount of code to the compiler, runtime, and any external tools which process Go code, as well as any code which uses reflect to handle all possible types.
  • How many tools (such as vet, gopls, gofmt, goimports, etc.) would be affected? Yes.
  • What is the compile time cost? I do not expect that adding a new type kind would dramatically increase compilation costs for any programs, although there may be some small increase in compilation time and object sizes. The cost would arise in the amount of compiler code required to support switch types, changes to export and object formats to represent them, &c.
  • What is the run time cost? The existence of switch types should not substantially impact the execution of most programs, except those which use reflection to handle types broadly, since the proposed implementation of reflection of switch types may be expensive. The runtime itself will need substantial updates to account for the new type kind.
@gopherbot gopherbot added this to the Proposal milestone Apr 1, 2022
@ianlancetaylor ianlancetaylor added LanguageChange Suggested changes to the Go language v2 An incompatible library change labels Apr 1, 2022
@ianlancetaylor
Copy link
Contributor

Seems related to the long discussion at #19412.

What is the zero value of a switch type? I couldn't see that in a quick read.

@zephyrtronium
Copy link
Contributor Author

zephyrtronium commented Apr 1, 2022

I must have forgotten to include it. The zero value of a switch type is the value having the zero value of the discriminator and the dependent type corresponding to it. This is the reason the discriminator's zero value must appear in the discriminant set. I've updated the proposal to mention this.

@beoran
Copy link

beoran commented Apr 1, 2022

So ', for your example, the zero value of type Sum is Sum{false, struct{}}?

@zephyrtronium
Copy link
Contributor Author

Correct. Though, to be pedantic, the zero value of Sum is Sum{false: struct{}{}}. As proposed, there are no unkeyed switch type literals.

@deltamualpha
Copy link

deltamualpha commented Apr 1, 2022

This may be an ignorant question, so please excuse me if I just don't fully understand, but:

In the "before" version of the Madoka/Homura example, Anime(true) returns a pointer to a HomuraVariant. If I typeswitch that Sum a la .(*HomuraVariant), I can grab the err field out again.

It's unclear to me how the "after" version actually makes those fields accessible. The Sum switch-type definition doesn't have an err field defined anywhere. Am I just missing something obvious?

@zephyrtronium
Copy link
Contributor Author

zephyrtronium commented Apr 1, 2022

The switch type Sum doesn't have a field named err, but it does have error as a dependent type. To access the error in the "before" variant, one can do Anime(true).(*HomuraVariant).err. To access it in the "after" variant, one would do simply Anime(true).(true). In both cases, replacing Anime(true) with Anime(false) would cause the subsequent assertion to panic. However, the switch type version is much more succinct and more directly expresses provable logic.

@zeebo
Copy link
Contributor

zeebo commented Apr 1, 2022

Are you allowed to define methods on these types?

@DeedleFake
Copy link

I don't know what to believe today, and the anime-themed examples are throwing me off. Just to be clear: This is a serious proposal, despite it being April Fool's? It sure looks like it's serious, minus the examples.

@zephyrtronium
Copy link
Contributor Author

@zeebo It would be legal to define methods on a named switch type, with the usual semantics. Somewhat related, it would also be legal to define a named switch type with type parameters. E.g.,

const Just = true

type Maybe[T any] switch bool {
	case Just: T
	default:   struct{}
}

func (m Maybe[T]) String() string {
	switch t := m.(case) {
	case Just:
		return fmt.Sprintf("Just(%v)", t)
	default:
		return "Nothing"
	}
}

Perhaps I should add this as an example in the proposal.

@zephyrtronium
Copy link
Contributor Author

@DeedleFake There is some history to using Madoka and Homura in place of foo and bar.

@zephyrtronium
Copy link
Contributor Author

More directly, I think this proposal would be a beneficial change for Go's type system, but I don't really think it is a feasible change to make to Go. I originally drafted it a few months ago, fully intending to submit it seriously, but it felt increasingly silly in its scope as I wrote it. So, I would describe this as an April Fools' joke hatched from the shell of a real proposal. Maybe it was hasty to conclude the proposal was too much!

To me, it feels like the switch types I've described have just the wrong amount of overlap with interface types. The semantics are similar enough to copy many of the same syntactic elements, but different enough that borrowing those things would mostly be confusing. Perhaps a substantial part of that is the name: I hear "switch type" and immediately think of "type switch." Another awkward part is that interfaces can now have type elements to describe a union of some set of types, although I don't believe there has been a proposal to allow values of such interface types yet.

I do think that dependent types are good, and I think it's productive to think about how something like sigma types would fit in Go. I especially feel the lack of sum types quite often in the kinds of code I write, so I think it is productive to think about things like sigma types which would fill those gaps. Hopefully there is some value in this proposal beyond a rather high-effort prank. At the very least, I think it accurately demonstrates the scope of changes that a concrete proposal stemming from #19412 is likely to require.

And, for what it's worth, the anime references are just the example names I like to use – they weren't intended to be part of the joke. :)

@deltamualpha
Copy link

Amusingly, in that case, a number of people on the slack commented approvingly about how through and well-considered this proposal was!

@beoran
Copy link

beoran commented Apr 2, 2022

Even if it is intended as a joke, I still think it is interesting. The good point of this "sum type" or union type proposal is that it is a tagged union with an explicitly typed tag, and a well defined zero value. Those are two benefits other similar proposals did not have.

Maybe you should reconsider and simplify your proposal to these two points, and then we might get to something that does fit well with Go, and that can be considered seriously.

@ianlancetaylor
Copy link
Contributor

Closing as dup of #19813.

@zephyrtronium
Copy link
Contributor Author

For those who expressed interest in this proposal: I filed #54685 as a serious proposal which refines the ideas here and avoids their most egregious problems. In particular, the name is more reasonable, and it integrates the type element syntax currently used in generic interfaces.

@golang golang locked and limited conversation to collaborators Aug 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange Suggested changes to the Go language Proposal v2 An incompatible library change
Projects
None yet
Development

No branches or pull requests

7 participants