Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: type constraint containing any used directly as type #68710

Closed
godcong opened this issue Aug 2, 2024 · 25 comments
Closed

cmd/compile: type constraint containing any used directly as type #68710

godcong opened this issue Aug 2, 2024 · 25 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. ExpertNeeded NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@godcong
Copy link

godcong commented Aug 2, 2024

Proposal Details

type AnyType interface {
	any | []uint8
}

type KeySet struct {
	Keys []AnyType 
}

some issue content about this:
golang-jwt/jwt#401

The defined generic type AnyType can be used directly as a type and is compiled.

If you delete any, you will be prompted with

cannot use type AnyType outside a type constraint: interface contains type constraints

I haven't found anything about it, so I don't know if it's intentional or not.

If so, could you provide some relevant information?

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 2, 2024
@seankhliao seankhliao changed the title cmd/compile: can compilation use any to bypass go's generic checking? cmd/compile: type constraint containing any used directly as type Aug 2, 2024
@prattmic
Copy link
Member

prattmic commented Aug 2, 2024

cc @griesemer

@prattmic
Copy link
Member

prattmic commented Aug 2, 2024

Simple case: https://go.dev/play/p/t8l45ebTndw

package main

import (
	"fmt"
)

type Foo interface {
	any
}

func main() {
	var f Foo = 42
	fmt.Println(f)
}

It seems that interface{ any } is treated the same as interface{}. I am also unsure whether this is intended or a bug.

@mateusz834
Copy link
Member

I think that interface{ any } being the same as interface{} makes sense, same as interface{io.Reader} is the same as io.Reader.

@prattmic
Copy link
Member

prattmic commented Aug 2, 2024

@mateusz834 Good point, thanks. With a single interface this is clearly an embedded interface.

Then the remaining question is with the original example, where any | []uint8 is effectively "simplified" to any, which then allows embedding.

@atdiar
Copy link

atdiar commented Aug 2, 2024

I think we might want a different definition of type identity for basic interface as opposed to union ones.

If we ever end up with union implemented as interfaces this will simplify discrimination.

In that sense, the union here should not be usable as a type yet, just like any union we have nowadays. (because crypto.PublicKey is a defined interface type and the union with the slice type creates a full-blown 'union' interface type.

That should be relevant when embedding union interface types.
The "union" part remains part of the type identity.

Probably a bug. (but something to think over)

@mateusz834
Copy link
Member

This might make #57644 unintuitive.

Consider:

type Foo interface {
	any | int
}

type Bar interface {
	Foo | string
}

func main() {
	var _ Bar = struct{}{}
}

@cuonglm
Copy link
Member

cuonglm commented Aug 2, 2024

Then the remaining question is with the original example, where any | []uint8 is effectively "simplified" to any, which then allows embedding.

IIUC, this is the right behavior because:

type AnyType interface {
	any | []uint8
}

is a Basic Interface. While:

type AnyType interface {
	[]uint8
}

is not, since its type sets can not be defined entirely by a list of methods.

@atdiar
Copy link

atdiar commented Aug 2, 2024

To add, the clearer semantics would be that Bar accepts either interface values or string values. It's definitely weird to say that, given our current definition of type set and since interfaces have a dynamic type which is never an interface.

But, if a is a regular type, it implements itself. Where it's unclear nowadays is whether a implements a | b because a | b would be an interface and that would come with implementing operations that only interface types have (assertions, comparisons : a might not be comparable etc)
In theory it does, so we would have to think about it.

That's the type identity part.

So I guess we might need to wrap union typed values before assigning them. (no need, the issue is simply about the definition of unions and disjointness, having to explicitly mention the intersection as a separate case)

@Merovius
Copy link
Contributor

Merovius commented Aug 2, 2024

I'll note that any (or rather, empty interfaces) is the only interface this works with. IMO this is pretty clearly a bug.

IIUC, this is the right behavior because: […] is a Basic Interface.

I don't think the spec is all that clear about whether it is actually a basic interface. It says

In its most basic form an interface specifies a (possibly empty) list of methods. The type set defined by such an interface is the set of types which implement all of those methods, and the corresponding method set consists exactly of the methods specified by the interface. Interfaces whose type sets can be defined entirely by a list of methods are called basic interfaces.

Now, on the one hand, the type set of any | []uint8 is "all types", which can be defined by the empty "list of methods", so sure, strictly speaking it fulfills the definition. But ISTM the intent here is a syntactical one, i.e. "if it is an interface which neither embeds another interface nor contains any other type elements, except listing methods". The section headings of "Basic interfaces", "Embedded interfaces" and "General interfaces" also make that clear.

I just can't imagine anyone thinking it is intended for any | []uint8 to be interpreted the way it is.

@zigo101
Copy link

zigo101 commented Aug 3, 2024

type AnyType interface {
	any | []uint8
}

is equivalent to and automatically simplified by compiler as

type AnyType interface {
	any
}

So, it is a basic interface type. Both spec and implementation have no problems here.

@atdiar
Copy link

atdiar commented Aug 3, 2024

@zigo101 that's arguable. The interface is defined as a set of methods that is empty AND a union of terms.

any | uint8 is equivalent to either any other type that is not uint8 OR uint8 only OR uint8 and any (which is still uint8 in terms of constraints).

So the union might not be simplifiable?
In that case it would not be a basic interface.

@cuonglm
Copy link
Member

cuonglm commented Aug 3, 2024

@zigo101 that's arguable. The interface is defined as a set of methods that is empty AND a union of terms.

any | uint8 is equivalent to either any other type that is not uint8 OR uint8 only OR uint8 and any (which is still uint8 in terms of constraints).

So the union might not be simplifiable?
In that case it would not be a basic interface.

any denotes all non-interface type, including uint8. So any | <anything> | ... is just any.

@atdiar
Copy link

atdiar commented Aug 3, 2024

@cuonglm I understand what you mean. But in terms of sets, the union is decomposable in disjoint sets.
That might be the definition that we need when/if interface types are involved.

Otherwise, that would mean that we could consider the type set of the union term {[]uint8} to be empty/uninhabited since all types satisfy any. (to be understood as it not bringing anything to the union, being a redundant term).
That would make terms of a union non commutable.

[edit] It's perhaps easier to see that it is not true if we replace the interface by a union where each term is a distinct element of the type set: we'd get after simplification: {any\{[]uint8}} ∪ {[]uint8}

The decomposition in disjoint sets should be preferable.

@godcong
Copy link
Author

godcong commented Aug 7, 2024

type AnyType interface {
	any | []uint8
}

One question.
Because of the use of |, it is possible to identify the behavior as a type definition.
At this point, should any be a union?

So this any, then, should be recognized as an

type AnyType interface {
	interface{ /* nothing */ } | []uint8
}

Or should be recognized as

type AnyType interface {
	interface { /* all types */ } | []uint8
}

Should this any not be recognized as an interface type.
If you want to add an interface type, I think it should be in the following format:

type AnyType interface {
	interface{ /* types all or nothing? */ } | []uint8
	any // interface type
}

@dr2chase dr2chase added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 7, 2024
@mknyszek mknyszek added this to the Backlog milestone Aug 7, 2024
@mknyszek
Copy link
Contributor

mknyszek commented Aug 7, 2024

Just since I don't think this has been stated yet, this is not new behavior -- the behavior described by OP appears to be present in Go 1.21+.

@zigo101
Copy link

zigo101 commented Aug 8, 2024

It has been present since Go 1.18.

@griesemer
Copy link
Contributor

I believe this is working as intended, even if the behavior is admittedly surprising:

  1. Interfaces define type sets.
  2. The type set of interface{ any | int } is the set of all types: any includes int.
  3. The set of all types is represented by interface{} (which is any).
  4. Hence interface{ any | int } is equivalent to any and therefore the code in question is valid.

Existing implementation restrictions based on the current implementation (which represents type sets via lists of types and lists of methods) may lead to the appearance that any | int is something else than any - but these type expressions are simplified into equivalent expressions, which is what we are seeing here.

Note also that the spec says:

Interfaces that are not basic may only be used as type constraints, or as elements of other interfaces used as constraints. They cannot be the types of values or variables, or components of other, non-interface types.

This restriction is not based on the presence or absence of union terms, it is only based on the whether an interface is basic or not. And basic interfaces are (per spec):

Interfaces whose type sets can be defined entirely by a list of methods are called basic interfaces.

Therefore, interface{ any | int } is a basic interface.

Closing as working as intended.

@griesemer
Copy link
Contributor

griesemer commented Oct 2, 2024

PS: @Merovius I'd be interested to hear what you think the correct behavior should be. Presumably you want a syntactic distinction? I suppose one could add a marker to interfaces when they include a union of sorts and then say that those are not basic interfaces. Not sure what kind of ripple effects that might have. It would also not be backward compatible, but perhaps that's such a rare situation (and maybe a bug, as you alluded to), that it would be ok to make the change.

The spec's grouping of the section on interfaces into three parts is essentially for didactic purposes. I think we could simply define general interfaces (and perhaps basic interfaces as interfaces that are representable as method sets only) and be done with it. But when we introduced generics, after having used basic interfaces for > 10 years, I was looking for a gradual path to explain the expansion of the interface definition, and connect the general interfaces with what we had in the past. The fact that we have what seems like three different kinds of interfaces is not "real" in the implementation. There's just one, and it's based on the notion of a type set. The implementation of these type sets is crude, which is why we have implementation restrictions. As you know better than I, representing type sets in full generality is complicated.

Feel free to re-open if you think the existing behavior is a mistake and you have good arguments.

@mateusz834
Copy link
Member

@griesemer then why

type a interface{ io.Reader | *os.File }

func main() {
	var b a
	_ = b
}

Leads to a compile error?

@griesemer
Copy link
Contributor

griesemer commented Oct 2, 2024

@mateusz834 io.Reader and *os.File are different types. They cannot be "merged". But even if they were just pure (unnamed) interfaces (or alias types), it won't work because we have an implementation restriction (spec):

Implementation restriction: A union (with more than one term) cannot contain the predeclared identifier comparable or interfaces that specify methods, or embed comparable or interfaces that specify methods.

Without the implementation restriction (if the compiler were more sophisticated in its type set computation) something like this might work - though you'd have to use two interfaces, not an *os.File, because otherwise you bring a concrete type into the game again and the merge won't work.

@atdiar
Copy link

atdiar commented Oct 6, 2024

I think @mateusz834 reasoning might also be that since *os.File implements io.Reader, the constraint should be equivalent to io.Reader similarly to what happens in this issue.
This is not what is observed.

I would be happier if this was reopened.
I think each union terms has to stand on its own (discriminated union by default) or we will have problems in the future.

(If we compute Card of the Union of two overlapping typesets, we end up with 1 == 0 otherwise.)
Also for union terms to be commutative.

@Merovius
Copy link
Contributor

Merovius commented Oct 6, 2024

@griesemer I don't have any real, firm arguments, except of a vague dissatisfaction with using non-constructive definitions in the spec. A lexical rule like "does contain a union-element node" is easy to understand and it is extremely clear how to implement it. A semantic rule like "can the type set be expressed as…" is "more squishy" (I don't have good words for what I mean) and it requires some reasoning to understand and implement. I'm not even 100% clear on how to interpret its implications for constraints with empty type sets, for example. I don't think there really are problems, to be clear, but I don't like that I feel I'd need to think to form a real opinion.

From what I can tell, the only case where there is a difference between a lexical rule and the rule-as-written is there is a union element containing an empty interface. And in that case, the interface is equivalent to the union just not being there. So, using a purely lexical rule seems simpler while not costing anything in expressive power. "What would happen instead" is that the compiler would reject using any interface (potentially recursively) containing a union element as a value, so people would either not do it, or delete the meaningless union.

On the other hand, you are of course right that changing it now would be backwards-incompatible. And the argument could be spun the other way around: The only complexity in understanding/implementing the rule is to throw out unions containing empty interfaces, which isn't that bad. At least if my understanding is correct. So yeah, I won't argue that changing anything is necessarily worth it either. I just like the lexical interpretation better and wish that's how it had gotten implemented in the first place.

@griesemer
Copy link
Contributor

Thanks, @Merovius, very sensible feedback, as always.

@zigo101
Copy link

zigo101 commented Oct 7, 2024

@griesemer

io.Reader and *os.File are different types. They cannot be "merged". But even if they were just pure (unnamed) interfaces (or alias types), it won't work because we have an implementation restriction (spec):

Implementation restriction: A union (with more than one term) cannot contain the predeclared identifier comparable or interfaces that specify methods, or embed comparable or interfaces that specify methods.

Without the implementation restriction (if the compiler were more sophisticated in its type set computation) something like this might work - though you'd have to use two interfaces, not an *os.File, because otherwise you bring a concrete type into the game again and the merge won't work.

I don't get what are the rules here. Do you mean the reason of why any | int can be merged is because any is an alias? But it looks for a defined type type Any any, Any | int can also be merged.

In my opinion, io.Reader | *os.File should be also able to be merged without the above quoted implementation restriction.

@Merovius
Copy link
Contributor

Merovius commented Oct 7, 2024

@mateusz834 FWIW if you are interested in the reasoning for rejecting interface { io.Reader | *os.File }, you can watch this talk or read the accompanying blog post.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. ExpertNeeded NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests