Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ResetVT does not work with oneof fields #144

Open
codesoap opened this issue Oct 3, 2024 · 2 comments
Open

ResetVT does not work with oneof fields #144

codesoap opened this issue Oct 3, 2024 · 2 comments

Comments

@codesoap
Copy link
Contributor

codesoap commented Oct 3, 2024

Problem

When using pools, the ResetVT method is able to reuse allocated memory of bytes. It looks like this with a minimal example:

syntax = "proto2";
package TEST;
option go_package = "./testpkg";

message Foo {
	required bytes raw = 1;
}
func (m *Foo) ResetVT() {
	if m != nil {
		f0 := m.Raw[:0]
		m.Reset()
		m.Raw = f0
	}
}

When the bytes are inside a oneof field, the memory is no longer reused:

syntax = "proto2";
package TEST;
option go_package = "./testpkg";

message Foo {
	oneof data {
		bytes raw = 1;
		bytes zlib_data = 2;
	}
}
func (m *Foo) ResetVT() {
	if m != nil {
		m.Reset()
	}
}

Use Case

This is a real problem for me when parsing open streetmap's PBF files, because they use this construct for large blobs of data that occur hundreds to thousands of times in a PBF file: https://github.com/openstreetmap/OSM-binary/blob/65e7e976f5c8e47f057a0d921639ea8e6309ef06/osmpbf/fileformat.proto#L38

Maybe it would have been better if they used a single bytes field and a separate type field, but I'm afraid the design is set in stone as the format is already widely used.

Suggested Goal

Unfortunately I don't understand the code well enough to provide a pull request, but this is how I would imagine the generated code to look like in order to fix the problem:

func (m *Foo) ResetVT() {
	if m != nil {
		var f0 isFoo_Data
		switch t := m.Data.(type) {
		case *Foo_Raw:
			t.Raw = t.Raw[:0]
			f0 = t
		case *Foo_ZlibData:
			t.ZlibData = t.ZlibData[:0]
			f0 = t
		}
		m.Reset()
		m.Data = f0
	}
}

Inside the UnmarshalVT method, the code would look something like this; As a first step I'm only reusing the memory, if the Data field previously had the same type:

...
			var v []byte
			if m.Data == nil {
				v = make([]byte, postIndex-iNdEx)
			} else {
				switch t := m.Data.(type) {
				case *Foo_Raw:
					if cap(t.Raw) < postIndex-iNdEx {
						v = make([]byte, postIndex-iNdEx)
					} else {
						v = t.Raw[:postIndex-iNdEx]
					}
				default:
					v = make([]byte, postIndex-iNdEx)
				}
			}
			copy(v, dAtA[iNdEx:postIndex])
			m.Data = &Foo_Raw{Raw: v}
...
@vmg
Copy link
Member

vmg commented Oct 7, 2024

This suggested generated code seems reasonable to me, but I do feel like it's gonna be tough to implement. We can merge this improvement if you get it working, and I think you're in the right direction. The right way to tackle this is starting with the generated code you're expecting to see and then tweaking the codegen until you get the desired results. :)

@codesoap
Copy link
Contributor Author

I'm afraid my pain is not high enough to attempt implementing the feature at the moment. Maybe I will find more motivation in the future. I'll leave it to you to close this issue for now or keep it open.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants