protoc-gen-go: option to specify additional "used names" for generated messages #1320

joeycumines · 2021-05-16T13:36:32Z

I've implemented a plugin to generate ShallowCopy and ShallowClone methods for all of my messages. While I deliberately picked names that are unlikely to collide, I'd have preferred to use Copy and Clone, for example. I'd also like to be able to avoid naming collisions, in general, in order to address a wider range of use cases.

In consideration of the risks outlined in the implementation, I would like to propose an "opt-in" mechanism, whereby additional "used names" may be specified, via an additional plugin option.

The text was updated successfully, but these errors were encountered:

neild · 2021-05-17T16:36:58Z

Field name conflict resolution, as currently implemented, was a mistake. It doesn't scale with the addition of new reserved fields, it has many confusing edge cases, and it is not discoverable at all. We don't plan on ever adding new "used names"--to the contrary, we'd like to remove the ones that exist. I don't think we want to add a mechanism for extending it.

joeycumines · 2021-05-18T00:57:09Z

Sure @neild, I had figured as much after reading the comment I linked + others that cite historic reasons for certain behavior. Question though, can you explain what you mean by discoverable?

The conflict resolution algorithm is subtle and surprising (changing the order in which fields appear in the .proto source file can change the names of fields in generated code), and does not adapt well to adding new per-field methods such as setters.

I don't think it's particularly relevant, but the above leaves me with the impression that reserved "used names" (with names that aren't dynamically generated / need to remain fixed) would have been better grouped behind a single method, returning a type that could be extended later.

The main reason I've requested this specific solution is because I couldn't think of anything else that wouldn't potentially introduce breaking changes to existing users, since it's clear that modifying any of the "used names" has the potential to break usage of existing generated code. An option to add additional "used names" is the sort of thing I'd expect to be at least deprecated, if not outright removed, if the way that the Go names where generated was changed.

This is predicated on the idea that it's necessary to generate additional methods to achieve my goals, so I'll try explain those in more detail below, but feel free to ignore it. I will no doubt struggle to be succinct.

Broadly speaking, my goal is to generate code, to perform operations with behavior that is message specific. To do this, there are two problems I need to solve:

Where is the code generated?
How is the code mapped to each message type?

// ShallowClone returns a shallow copy of the receiver or nil if it's nil.
func (x *User) ShallowClone() (c *User)

The above is an example of one of the methods I am currently generating.

I've added the ShallowClone method to address the use case where it's desirable to "modify" fields of an existing message, without changing the original value. I am aware that is unsafe (and has been unsafe for a long time, with protobuf message types), but to demonstrate, it's intended to be somewhat equivalent to c := *x.

I've considered other solutions to support this use case, but they all have what I'd consider to be significant caveats. The closest to palatable would be something that behaves like proto.Clone. Another similar variant might be something like the example in this #1155. Another less well defined solution I've considered is to generate code that registers behavior for message types in global package state, owned by the plugin generating the code (extending the base behavior).

dsnet · 2021-05-18T02:40:46Z

We would like to one day remove the stateful name derivation logic as it is subtle and breaks the property where re-arranging fields in a protobuf message should be a safe operation. How we get to a world where we no longer have the "used names" logic is still to be determined, but you can imagine that it go something like:

Introduce the go_name feature (see protoc-gen-go: support go_name option to specify Go identifier #555),
Have protoc-gen-go start printing a warning that any name conflicts must use the go_name feature explicitly
Wait some period of time
Remove the "use names" logic from protoc-gen-go.

Given that our intention is to hopefully one day remove the "used names" logic, I don't think we will accept a feature that allows injecting names into that logic. I'm inclined to close this issue in favor of the pre-existing #555 issue.

joeycumines · 2021-05-18T04:42:17Z

Thanks for the link @dsnet.

Your proposal seems sensible to me, and is pretty much what I expected. It was made quite a long time ago, has it gained any momentum, recently?

I'd be curious to hear your thoughts on how naming collisions might be resolved, in cases where it might be problematic to have the plugin generating the message (e.g. protoc-gen-go) be unaware of additional message methods, either implemented manually, or generated by another plugin, e.g. one using protogen.

To be clear, I wouldn't expect protoc-gen-go to be aware of any other code.

The hypothetical problem case I am imagining is one where source is generated for .proto file that is externally managed / owned, and a new field is added, which subsequently generates a new go field and method, which collides with an existing method that was manually implemented, or generated by another plugin. Modifying the .proto file by hand may or may not be feasible. If it's not, then most problematic case would be one where it's not possible to change the existing name without breaking behavior, e.g. if go interfaces are used to handle messages in a generic way.

neild · 2021-05-18T06:28:28Z

I don't think it's particularly relevant, but the above leaves me with the impression that reserved "used names" (with names that aren't dynamically generated / need to remain fixed) would have been better grouped behind a single method, returning a type that could be extended later.

Yes, that would have been a better way to go. You can see this in the new proto.Message interface, which only includes a single method. (We'd probably also want a String method, for fmt package compatibility.)

There are other approaches that could have worked as well: Generate field names with a consistent prefix (e.g., m.F_Foo), make all fields unexported and access fields through methods with consistent prefixes (e.g., m.GetFoo()), name methods in a way that can't conflict (e.g., m.Proto_Marshal()), probably many others I'm not immediately thinking of.

It would be nice to have the chance to design the generated API afresh, but the cost of incompatibility is high.

If you're adding methods to generated types and want to avoid conflicts, I'd suggest including an underscore in the method names, since no generated field or method ever contains an underscore. For example, m.Proto_ShallowClone(). The underscore is not conventional Go style, of course.

joeycumines · 2021-05-18T06:56:54Z

If you're adding methods to generated types and want to avoid conflicts, I'd suggest including an underscore in the method names, since no generated field or method ever contains an underscore. For example, m.Proto_ShallowClone(). The underscore is not conventional Go style, of course.

That's... really good to know, thanks @neild. Since I'm only concerned with method name collisions, I think that resolves all of my concerns with a very simple change.

If it's not already, documenting it somewhere and making it as obvious as possible (as part of a sort of "plugin 101", in the protogen package comment, maybe?) would probably be a good idea.

I'll close this issue then, thank you both for your help.

joeycumines closed this as completed May 18, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

protoc-gen-go: option to specify additional "used names" for generated messages #1320

protoc-gen-go: option to specify additional "used names" for generated messages #1320

joeycumines commented May 16, 2021

neild commented May 17, 2021

joeycumines commented May 18, 2021

dsnet commented May 18, 2021

joeycumines commented May 18, 2021

neild commented May 18, 2021

joeycumines commented May 18, 2021

protoc-gen-go: option to specify additional "used names" for generated messages #1320

protoc-gen-go: option to specify additional "used names" for generated messages #1320

Comments

joeycumines commented May 16, 2021

neild commented May 17, 2021

joeycumines commented May 18, 2021

dsnet commented May 18, 2021

joeycumines commented May 18, 2021

neild commented May 18, 2021

joeycumines commented May 18, 2021