Proto schema registry refresh optimizations #1040

bojand · 2024-01-26T19:58:23Z

Iterative work to optimize how proto schemas are refreshed from the schema registry.

Changes

In schema Service we now keep a map of proto schemas and a map of proto file descriptors, both indexed by schema ID.
As we refresh the schemas, we determine which ones are new or have been updated, and merge those into the index. We only parse and compile these updated / new ones.
We also ensure that the removed ones are removed from the mapping.
Changed the return types of GetSchemasBySubject(), GetSchemas(), and GetSchemasIndividually() to pointer types.
gci seems to incorrectly sort the "maps" import. Updated the tooling hoping it would address the issue, but it didn't seem to help, so I commented that step out for now.
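
As a rough illustration of the merge described above, here is a minimal sketch. It is not the code from this PR: the `Schema` type, the `Index` type, the `Refresh` method, and the string used as a stand-in for a compiled file descriptor are all hypothetical, and change detection is assumed to be a simple field comparison.

```go
package main

import "fmt"

// Schema is a hypothetical stand-in for a schema returned by the registry.
type Schema struct {
	ID      int
	Version int
	Body    string
}

// Index keeps the last seen schemas and their compiled form, keyed by schema ID.
// The string value is a placeholder for a compiled proto file descriptor.
type Index struct {
	schemas  map[int]*Schema
	compiled map[int]string
}

func NewIndex() *Index {
	return &Index{schemas: map[int]*Schema{}, compiled: map[int]string{}}
}

// Refresh merges the latest schemas into the index. Only new or changed
// schemas are passed to compile; IDs missing from latest are removed.
// It returns how many schemas were compiled.
func (ix *Index) Refresh(latest []*Schema, compile func(*Schema) string) int {
	seen := make(map[int]struct{}, len(latest))
	compiledCount := 0
	for _, s := range latest {
		seen[s.ID] = struct{}{}
		old, ok := ix.schemas[s.ID]
		if ok && old.Version == s.Version && old.Body == s.Body {
			continue // unchanged, keep the existing compiled entry
		}
		ix.schemas[s.ID] = s
		ix.compiled[s.ID] = compile(s)
		compiledCount++
	}
	// Drop entries that no longer exist in the registry.
	// Deleting from a map while ranging over it is safe in Go.
	for id := range ix.schemas {
		if _, ok := seen[id]; !ok {
			delete(ix.schemas, id)
			delete(ix.compiled, id)
		}
	}
	return compiledCount
}

func main() {
	ix := NewIndex()
	compile := func(s *Schema) string { return "compiled:" + s.Body }

	first := ix.Refresh([]*Schema{{ID: 1, Body: "a"}, {ID: 2, Body: "b"}}, compile)
	second := ix.Refresh([]*Schema{{ID: 1, Body: "a"}, {ID: 3, Body: "c"}}, compile)
	fmt.Println(first, second, len(ix.compiled))
}
```

In the sketch, the second refresh compiles only schema 3, leaves schema 1 untouched, and removes schema 2 from both maps.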

I will try and add some tests for this soon.

weeco · 2024-01-26T23:48:02Z

backend/pkg/schema/client.go

+	errors := make([]error, len(subjectsRes.Subjects))
+	hasErrors := atomic.Bool{}


I think we should do errors := make([]error, 0, len(subjectsRes.Subjects)) and then we can check the length of errors to see if we have errors. I think in most cases we do not expect every single subject to return an error.

It feels a bit weird to return a slice of schemas and a slice of errors. At this point we can probably just refactor the function to return schema + error via a channel and handle the error and returned schema on the receiving side.

The intent is to be able to get all the schemas we successfully fetched, even if we failed to retrieve some subset of all the schemas in SR. So we return all the successfully fetched schemas and a collection of errors.
This is the case for both GetSchemasIndividually() and GetSchemas(). Then, in GetProtoDescriptors(), we can log all the failed schemas while still parsing and compiling the successfully retrieved ones, so we can potentially use them.

weeco · 2024-01-27T11:35:53Z

backend/pkg/schema/service.go

-			s.requestGroup.Forget(key)
-			return nil, fmt.Errorf("failed to get schema from registry: %w", err)
+	_, err, _ := s.requestGroup.Do(key, func() (any, error) {
+		schemasRes, errs := s.registryClient.GetSchemas(ctx, false)


As far as I can tell, schemasRes will be nil if every schema fetch errored (e.g. schema registry not reachable). I think this func will still work in that case, but it would be good to test the case where schemasRes is nil.

Rather than using singleflight we could use https://github.com/twmb/go-cache to simplify some things, but I assume you tried to keep as much of the original code as possible

bojand added 2 commits January 26, 2024 12:28

backend: optimize proto refresh to only compile changed schemas, use …

823ff51

…pointers throughout

backend: improve handling deletion of schemas better

5c572c1

weeco self-requested a review January 26, 2024 23:24

weeco reviewed Jan 26, 2024

bojand added 2 commits January 26, 2024 22:27

backend: use len of errors to detect if we have any errors

54d1951

backend: just return schemas and errors

a611352

weeco reviewed Jan 27, 2024

bojand added 2 commits January 27, 2024 14:21

backend: add check for nil or empty schemaRes in case of errors

33b800a

Merge branch 'master' into backend/proto_refresh_optimizations

fa483ab

weeco approved these changes Feb 5, 2024

bojand merged commit 50874a7 into master Feb 5, 2024
8 checks passed

bojand deleted the backend/proto_refresh_optimizations branch February 5, 2024 20:03

bojand restored the backend/proto_refresh_optimizations branch April 12, 2024 00:25

bojand deleted the backend/proto_refresh_optimizations branch April 12, 2024 00:45
