Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTP middleware can't identify RPC paths #326

Closed
akshayjshah opened this issue Jul 18, 2022 · 3 comments · Fixed by #480
Closed

HTTP middleware can't identify RPC paths #326

akshayjshah opened this issue Jul 18, 2022 · 3 comments · Fixed by #480
Labels
enhancement New feature or request

Comments

@akshayjshah
Copy link
Member

akshayjshah commented Jul 18, 2022

When using net/http middleware to wrap handlers and collect logs or metrics, it's nice to be able to parse out the protobuf package name, service name, and procedure name. However, it's potentially expensive to do this without an allowlist of known RPC endpoints: any script kiddie can generate zillions of invalid URLs, which may put a lot of load on observability systems (especially systems that treat each unique combination of tags as a timeseries).

@mattrobenolt brought this to our attention. Previously, he used grpc-go's generated ServiceDesc to build an allowlist. (This isn't what the gRPC folks want users to do with ServiceDesc, but that's beside the point.)

Today, users have a few options to solve this problem:

  • Emit logs & metrics from a connect interceptor. At that point, URLs have already been validated and Spec.Procedure is available. However, users may end up with one log line from HTTP middleware and another from a connect interceptor.
  • Use the generated service name constants to build a partial allowlist. This only validates the package.service portion of the URL, but that's likely enough to protect from most scanning attempts.
  • Write a separate plugin to generate an exact allowlist. This works, but is a bit of a pain.

If this is a common pain point, we could make this a bit easier by generating more code.

@akshayjshah akshayjshah added the enhancement New feature or request label Jul 18, 2022
@mattrobenolt
Copy link
Contributor

How I worked around this, was using the Unimplemented*Handler struct and reflect. It's not possible, afaict, to reflect the actual interface generated, but fortunately, the Unimplemented struct also covers the entire interface, so we can leverage that.

package main

import (
	"fmt"
	"reflect"
)

type Thing interface {
	A()
	B()
}

type UnimplementedThing struct{}

func (UnimplementedThing) A() {}
func (UnimplementedThing) B() {}

var _ Thing = (*UnimplementedThing)(nil)

func extractMethods(t any) []string {
	t2 := reflect.TypeOf(t)
	methods := make([]string, 0, t2.NumMethod())
	for i := 0; i < t2.NumMethod(); i++ {
		methods = append(methods, t2.Method(i).Name)
	}
	return methods
}

func main() {
	fmt.Println(extractMethods((*UnimplementedThing)(nil)))
}

This is just modeled after the generated connect-go code, so this is relatively easy to extract the method names.

The only information this doesn't extract is what type of handlers these are. If they're streaming vs unary, which is sometimes useful and existed in the ServiceDesc from grpc-go. I can expand this to do more introspection and get the request/response types to determine what kinds they are.

@akshayjshah akshayjshah changed the title HTTP middleware can't distinguish valid URLs from invalid HTTP middleware can't identify RPC paths Aug 17, 2022
@jhump
Copy link
Member

jhump commented Dec 16, 2022

FWIW, this can also be accomplished using proto reflection instead of Go reflection.

This approach is a bit more general since it handles cases where the protoc plugin has to mangle the method name when generating Go code. It's the proto method name that is used in the URI path, not the name of the Go method. These can vary if the original source used a lower-case method name; in that case, the protoc plugin capitalizes it so that it is an exported symbol.

// import "google.golang.org/protobuf/reflect/protoreflect"
// import "google.golang.org/protobuf/reflect/protoregistry"

// can use generated service name constant here, e.g. userv1.UserServiceName
descriptor, err := protoregistry.GlobalFiles.FindDescriptorByName("acme.user.v1.UserService")
if err != nil { return err }
svcDesc := descriptor.(protoreflect.ServiceDescriptor)

methodNames := make([]string, svcDesc.Methods().Len())
for i := 0; i < svcDesc.Methods().Len(); i++ {
    methodNames[i] = string(svcDesc.Methods().Get(i).FullName())
}

@akshayjshah
Copy link
Member Author

@jhump's got the right idea (@joshcarp has suggested a similar approach on other issues). I'm planning to document this approach and then close this issue.

akshayjshah added a commit that referenced this issue Mar 18, 2023
Generate constants for each RPC's fully-qualified name.

This is particularly useful in Connect interceptors and net/http
middleware, where users often want their code to behave differently
depending on the RPC being called. (For example, they may have a subset
of procedures exempted from authentication.) Without this PR, the best
approach we can offer is either hard-coding a string literal (fragile if
the package or RPC is moved, renamed, or deleted) or reflection using
the descriptor embedded in protoc-gen-go's output.

protoc-gen-go-grpc recently began doing something very similar to this,
so it seems reasonable for us to do the same.

Of the open issues, this fixes #326.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants