Go: extract and expose struct tags, interface method IDs #17357
This enables us to distinguish all database types in QL. Previously, two kinds of distinct types were impossible, or only sometimes possible, to distinguish in QL: structs with the same field names and types but differing tags, and interface types with matching method names and at least one non-exported method that are declared in different packages. With this change these types can be distinguished, and queries can examine struct field tags, e.g. to read JSON field name associations.
Longer review to follow.
* Added methods `StructTag.hasOwnFieldWithTag` and `Field.getTag`, which enable CodeQL queries to examine struct field tags.
* Added method `InterfaceType.getMethodTypeById`, which enables CodeQL queries to distinguish interfaces with matching non-exported method names that are declared in different packages, and are therefore incompatible.
- * Added methods `StructTag.hasOwnFieldWithTag` and `Field.getTag`, which enable CodeQL queries to examine struct field tags.
- * Added method `InterfaceType.getMethodTypeById`, which enables CodeQL queries to distinguish interfaces with matching non-exported method names that are declared in different packages, and are therefore incompatible.
+ * Added member predicates `StructTag.hasOwnFieldWithTag` and `Field.getTag`, which enable CodeQL queries to examine struct field tags.
+ * Added member predicate `InterfaceType.getMethodTypeById`, which enables CodeQL queries to distinguish interfaces with matching non-exported method names that are declared in different packages, and are therefore incompatible.
@@ -1150,6 +1150,20 @@ var ComponentTypesTable = NewTable("component_types",
EntityColumn(TypeType, "tp"),
).KeySet("parent", "index")

// ComponentTagsTable is the table associating composite types with their component types' tags
var ComponentTagsTable = NewTable("component_tags",
EntityColumn(CompositeType, "parent"),
- EntityColumn(CompositeType, "parent"),
+ EntityColumn(StructType, "parent"),
Tags only exist on fields of structs, so I don't see why we should make this table more general than that. (Various names should change as well, of course.)
Broadly looks good! Thank you for improving this and moving it out into its own PR. I just have a few suggestions in addition to @owen-mc's comments, which also make sense.
Also, to sanity check: in the PR description you discuss that part of the motivation here is to be able to distinguish types better. That makes sense and I found the relevant part of the Go specification for this in https://go.dev/ref/spec#Type_identity. For structs:
Two struct types are identical if they have the same sequence of fields, and if corresponding fields have the same names, and identical types, and identical tags. Non-exported field names from different packages are always different.
For interfaces:
Two interface types are identical if they define the same type set.
Looking over the tests here, I can see that the tests exercise the new functionality and that seems to behave as expected. Do the tests cover the new ability to decide (in)equality that you are hoping for? Could you comment on how the tests cover that?
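To make the struct rule quoted above concrete, here is a minimal Go sketch (not taken from the PR or its tests) that uses `reflect` to compare anonymous struct types differing only in their tags. The variable names and the helper `sameType` are illustrative assumptions.

```go
package main

import (
	"fmt"
	"reflect"
)

// sameType reports whether a and b have identical Go types.
// reflect.Type values compare equal exactly when the types are identical.
func sameType(a, b any) bool {
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

// Anonymous struct values with the same field name and field type.
// Only the tags differ between them (hypothetical sample data).
var (
	tagA = struct {
		Name string `json:"name"`
	}{}
	tagA2 = struct {
		Name string `json:"name"`
	}{}
	tagB = struct {
		Name string `json:"full_name"`
	}{}
	noTag = struct{ Name string }{}
)

func main() {
	fmt.Println(sameType(tagA, tagA2)) // true: same fields, same tags
	fmt.Println(sameType(tagA, tagB))  // false: tags differ
	fmt.Println(sameType(tagA, noTag)) // false: tagged vs untagged
	// The tag itself is available per field, e.g. the JSON name association.
	fmt.Println(reflect.TypeOf(tagB).Field(0).Tag.Get("json")) // full_name
}
```

This only illustrates the language-level rule; whether the extractor's type labels reflect the same distinction is a separate question for the PR's tests.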
// meth.Id() will be equal to meth.Name() for an exported method, or
// package-qualified otherwise.
Question: My understanding is that, in Go, exported methods always start with an upper-case character. Why did you go for this meth.Id() != meth.Name() check rather than checking the first character of the name? Did that not work or do you think this method is more reliable or has other advantages?
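For reference, the two checks being compared can be tried side by side with the standard `go/types` and `go/token` packages. This is a small sketch, not the extractor's code: the helper names `methodID` and `isPrivateByID` and the package path `example.com/a` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// methodID builds a method object called name, declared in a package with
// import path pkgPath, and returns its go/types Id().
func methodID(pkgPath, name string) string {
	pkg := types.NewPackage(pkgPath, "p")
	sig := types.NewSignatureType(nil, nil, nil, nil, nil, false)
	return types.NewFunc(token.NoPos, pkg, name, sig).Id()
}

// isPrivateByID mirrors the Id() != Name() comparison discussed above.
func isPrivateByID(pkgPath, name string) bool {
	return methodID(pkgPath, name) != name
}

func main() {
	for _, name := range []string{"Exported", "notExported"} {
		fmt.Printf("%-12s id=%q privateByID=%v exportedByName=%v\n",
			name,
			methodID("example.com/a", name),
			isPrivateByID("example.com/a", name),
			token.IsExported(name)) // checks the first character of the name
	}
}
```

On these samples both checks agree: `Exported` keeps its bare name as its id, while `notExported` gets the id `example.com/a.notExported`.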
StringColumn("tag"),
).KeySet("parent", "index")

// InterfacePrivateMethodIdsTable is the table associating interface types with their private method ids
If I understand correctly, this table includes entries for methods (by index in the interface) which are private. If so, I don't think the comment makes that very clear. The current comment suggests that every interface type has a private method id. How about:
- // InterfacePrivateMethodIdsTable is the table associating interface types with their private method ids
+ // InterfacePrivateMethodIdsTable is the table associating interface types with the indices of their private methods.
I agree it should be clearer. This is slightly more informative:
- // InterfacePrivateMethodIdsTable is the table associating interface types with their private method ids
+ // InterfacePrivateMethodIdsTable is the table associating interface types with the indices and ids of their private methods.
* different packages defines two distinct types, but they appear identical according to
* `getMethodType`.
*/
Type getMethodTypeById(string id) {
This is probably more a complaint about Go's terminology than anything else, but id doesn't seem like the best choice of name for this.
* For example, `interface { Exported() int; notExported() int }` declared in two
* different packages defines two distinct types, but they appear identical according to
* `getMethodType`.
It may be good to extend the example to include the ids of the methods (with sample packages) to show explicitly what you mean in the previous paragraph.
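One way to picture such an extended example is the following `go/types` sketch, which builds `interface { Exported() int; notExported() int }` in two sample packages and compares the results. The package paths `example.com/a` and `example.com/b` and all helper names are hypothetical, and this is not the doc comment the PR ended up with.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// ifaceIn builds interface { Exported() int; notExported() int } as if it
// were declared in the package with import path pkgPath.
func ifaceIn(pkgPath string) *types.Interface {
	pkg := types.NewPackage(pkgPath, "p")
	method := func(name string) *types.Func {
		result := types.NewTuple(types.NewVar(token.NoPos, pkg, "", types.Typ[types.Int]))
		sig := types.NewSignatureType(nil, nil, nil, nil, result, false)
		return types.NewFunc(token.NoPos, pkg, name, sig)
	}
	methods := []*types.Func{method("Exported"), method("notExported")}
	return types.NewInterfaceType(methods, nil).Complete()
}

// methodIDs lists the Id() of each method of the interface built in pkgPath.
func methodIDs(pkgPath string) []string {
	it := ifaceIn(pkgPath)
	ids := make([]string, 0, it.NumMethods())
	for i := 0; i < it.NumMethods(); i++ {
		ids = append(ids, it.Method(i).Id())
	}
	return ids
}

// identicalAcross reports whether the interface declared in pathX is
// identical to the same-looking interface declared in pathY.
func identicalAcross(pathX, pathY string) bool {
	return types.Identical(ifaceIn(pathX), ifaceIn(pathY))
}

func main() {
	fmt.Println(methodIDs("example.com/a"))
	fmt.Println(methodIDs("example.com/b"))
	fmt.Println(identicalAcross("example.com/a", "example.com/a")) // true
	fmt.Println(identicalAcross("example.com/a", "example.com/b")) // false
}
```

The method names match in both packages, but the id of `notExported` carries the package path, which is what makes the two interface types distinct.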
Should this file be called InterfaceMethodIds.ql?
Tests failing:
Retargeted this against
Good work spotting these problems and fixing them. A few small suggestions for improvement.
Also, shouldn't the label for struct types include the tag of each field? Since differing tags make it a different struct type? Ideally this would have a test as well. This could be done as a follow-up, but it also fits in pretty naturally with this PR.
@@ -1547,6 +1547,7 @@ func extractType(tw *trap.Writer, tp types.Type) trap.Label {
name = ""
}
extractComponentType(tw, lbl, i, name, field.Type())
dbscheme.ComponentTagsTable.Emit(tw, lbl, i, tp.Tag(i))
I think it would be slightly better to only emit this line when the tag is non-empty. This would save some space, at the cost of some slightly more complicated QL to access the tags. (Note that the spec says that empty tag is equivalent to no tag at all.) It would also make the upgrade script simpler - no need to make a table and fill it with (i, "") relations.
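The suggestion amounts to a guard around the emit call. The sketch below shows that shape against plain `go/types`, with a hypothetical `tagRow` type standing in for a row of the tags table; it does not use the extractor's `trap` or `dbscheme` packages, and the sample struct is invented for illustration.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// tagRow is a hypothetical stand-in for one (index, tag) row of a tags table.
type tagRow struct {
	Index int
	Tag   string
}

// nonEmptyTagRows returns one row per field of st whose tag is non-empty,
// skipping untagged fields entirely.
func nonEmptyTagRows(st *types.Struct) []tagRow {
	var rows []tagRow
	for i := 0; i < st.NumFields(); i++ {
		if tag := st.Tag(i); tag != "" {
			rows = append(rows, tagRow{Index: i, Tag: tag})
		}
	}
	return rows
}

// sampleStruct builds a struct with three fields: ID (tagged), Name
// (untagged) and Email (tagged).
func sampleStruct() *types.Struct {
	pkg := types.NewPackage("example.com/a", "a")
	fields := []*types.Var{
		types.NewField(token.NoPos, pkg, "ID", types.Typ[types.Int], false),
		types.NewField(token.NoPos, pkg, "Name", types.Typ[types.String], false),
		types.NewField(token.NoPos, pkg, "Email", types.Typ[types.String], false),
	}
	tags := []string{`json:"id"`, "", `json:"email"`}
	return types.NewStruct(fields, tags)
}

func main() {
	for _, row := range nonEmptyTagRows(sampleStruct()) {
		fmt.Printf("field %d: %s\n", row.Index, row.Tag)
	}
}
```

With this shape, a consumer has to treat a missing row as the empty tag, which is the extra QL complexity the comment mentions.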
This change is a prerequisite for (some approaches to) dealing with Go 1.23's more direct exposure of type aliases: once QL can distinguish all types that are distinct in the database, it can implement type matching up to aliasing, which the Go spec calls type identity.