Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GRPC reflection support #2160

Merged
merged 9 commits into from
Oct 21, 2021
Merged

GRPC reflection support #2160

merged 9 commits into from
Oct 21, 2021

Conversation

inancgumus
Copy link
Member

@inancgumus inancgumus commented Oct 5, 2021

This PR aims to fix some issues that happened in #1987.

Todo:

  • Validate the reasoning for the double Invoke tests (removed the duplication)
  • Don't emit a metric from the reflect call
  • Check the grpc reflection with a script + k6 run
  • Add a check for Connect() plaintext case
  • Add useful error messages to *Client.reflect()
  • Reduce rb.Replacer.Replace duplication
  • Add missing checks to the TestClient table test
  • Refactor assertResponse for cross-platform testing
  • Add ConnectFloatTimeout test back
  • Fix the failing ReflectBadParam test
  • Fix conflict with the new metrics registry
  • Bring back the deleted LICENSE files
  • Compare and update Joshua's changes
  • Git squash with preserve commits

@inancgumus inancgumus self-assigned this Oct 5, 2021
@inancgumus inancgumus added this to the v0.35.0 milestone Oct 5, 2021
@inancgumus inancgumus linked an issue Oct 5, 2021 that may be closed by this pull request
@inancgumus inancgumus force-pushed the feature/grpc-reflection branch from dd1c8db to b6fa782 Compare October 5, 2021 11:36
@inancgumus inancgumus added the wip label Oct 6, 2021
@inancgumus inancgumus changed the title Add grpc reflection (WIP) [WIP] Add grpc reflection Oct 6, 2021
@inancgumus inancgumus force-pushed the feature/grpc-reflection branch from 7ba0243 to e0ea261 Compare October 6, 2021 14:56
@inancgumus
Copy link
Member Author

Hey @mstoykov, can you elaborate on this? Could you share a code snippet for what you're proposing?

https://github.com/grafana/k6/pull/1987/files/429d8c68978a05f2470d0447e21a7f8f28b82af3#r696691071

@mstoykov
Copy link
Contributor

@inancgumus , the current code there leads to metric emission for the reflect call. This, unfortunately:

  1. happens when the whole connection gets closed, so you will only see it if you have multiple iterations. If you have single iteration for example the emission of the metric happens after we cancel the main context (at least in my experiments and IIRC) so at least to me it seemed to be connected to the context getting canceled.
  2. More importantly at least @na-- didn't want this to be emitted at all and I am kind of with him
  3. it also happens without any of the tags ... because they aren't set yet - which is also how I found it as getTags was extended to return empty tags in case if they were missing.
    So my original proposal was to:
  4. create another context that you cancel after the reflect call is done ... hopefully making it emit a metric at that time. Because you canceled the context also it should then not actually push the metric.
  5. if we decide we want this emitted we should probably set some real tags for it, and if not I think you can set tags on the context here instead of changing getTags.
  6. maybe try to find out another way (without cancelling the context) to close the connection

@inancgumus
Copy link
Member Author

inancgumus commented Oct 13, 2021

Thanks, @mstoykov. OK, here are the notes of our call with Mihail:

  • We don't want to emit a metric from the reflect call (there is only a single metric emission: GRPCReqDuration)
  • Creating a separate context only for the reflect call can solve the issue
  • The GRPC connection closes on the first iteration — It should be closed when the whole test run ends
  • Test it in a k6 script with multiple iterations to see the problem

@mstoykov
Copy link
Contributor

The GRPC connection closes on the first iteration — It should be closed when the whole test run ends

I think there was some misunderstanding - I was talking about the fact the "reflect" connection get's closed at the end of the iteration instead of after it has finished, which is mostly based on the fact that it emits the metric at that time. You can check that at running the sample grpc reflect script with 1 iteration and 2 iteration and output metrics to a csv output for example in which case with 1 iteration there is 1grpc_duration as in:

metric_name,timestamp,metric_value,check,error,error_code,expected_response,group,method,name,proto,scenario,service,status,subproto,tls_version,url,extra_tags
grpc_req_duration,1634114141,589.771265,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,

And in the other case there are 3

metric_name,timestamp,metric_value,check,error,error_code,expected_response,group,method,name,proto,scenario,service,status,subproto,tls_version,url,extra_tags
grpc_req_duration,1634114155,139.154875,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,
grpc_req_duration,1634114155,142.120361,,,,,,,,,,,1,,,,
grpc_req_duration,1634114156,525.058802,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,

I have omitted the other samples emitted, but you can see how there is a grpc_req_duration with absolutely no tags and no useful information.

So we either want this to not be emitted at all - perfectly fine for now.Or (in the future) to have some meaningful tags.

@inancgumus
Copy link
Member Author

inancgumus commented Oct 13, 2021

The GRPC connection closes on the first iteration — It should be closed when the whole test run ends

I think there was some misunderstanding - I was talking about the fact the "reflect" connection get's closed at the end of the iteration instead of after it has finished, which is mostly based on the fact that it emits the metric at that time. You can check that at running the sample grpc reflect script with 1 iteration and 2 iteration and output metrics to a csv output for example in which case with 1 iteration there is 1grpc_duration as in:

But there is only one GRPC connection and the reflect uses that:

func (c *Client) reflect(ctxPtr *context.Context) error {
	client := reflectpb.NewServerReflectionClient(c.conn)

This is the one reflect uses:

// Client represents a gRPC client that can be used to make RPC requests
type Client struct {
	mds  map[string]protoreflect.MethodDescriptor
	conn *grpc.ClientConn // <----
}

And that connection (conn) is getting filled by the Connect method:

c.conn, err = grpc.DialContext(ctx, addr, opts...)

Thanks for the other explanations.

@mstoykov
Copy link
Contributor

Then I guess we will need something else if the same connection is used 🤷

@mstoykov
Copy link
Contributor

But you can add another context for the reflect "client" which seems to work from my quick test - it definitely didn't emit a metric as the context it will use for the PushIfNotDone call is the one that we cancel. I have no idea if this doesn't have some side effects we don't want though so still some investigation needed 🤷

diff --git a/js/modules/k6/grpc/client.go b/js/modules/k6/grpc/client.go
index e183a420..41950eb6 100644
--- a/js/modules/k6/grpc/client.go
+++ b/js/modules/k6/grpc/client.go
@@ -310,7 +310,9 @@ func (c *Client) Connect(ctxPtr *context.Context, addr string, params map[string
 // It is called in the connect function the first time the Client.Connect function is called.
 func (c *Client) reflect(ctxPtr *context.Context) error {
        client := reflectpb.NewServerReflectionClient(c.conn)
-       methodClient, err := client.ServerReflectionInfo(*ctxPtr)
+       ctx, cancel := context.WithCancel(*ctxPtr)
+       defer cancel()
+       methodClient, err := client.ServerReflectionInfo(ctx)
        if err != nil {
                return fmt.Errorf("can't get server info: %w", err)
        }

@inancgumus
Copy link
Member Author

inancgumus commented Oct 13, 2021

Thanks, @mstoykov, I'll try your patch, but I need to reproduce the issue here first.

@mstoykov
Copy link
Contributor

@inancgumus run with multiple iterations -i 2 instead of multiple users -u 10

@inancgumus
Copy link
Member Author

inancgumus commented Oct 13, 2021

@mstoykov Do you mean the third grpc_req_duration emission in the second command's output below?

This is for grpc.js with the last k6 release:

$ k6 run -i 2 --out csv=out.csv samples/grpc.js && (cat out.csv | grep grpc)
grpc_req_duration,1634128623,766.106973,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,
grpc_req_duration,1634128624,913.535841,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,
grpc_req_duration,1634128624,328.909177,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,

This is for grpc_reflect.js:

$ go run . run -i 3 --out csv=out.csv samples/grpc_reflection.js && (cat out.csv | grep grpc)
grpc_req_duration,1634128661,774.908707,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,
grpc_req_duration,1634128661,778.269285,,,,,,,,,,,1,,,,
grpc_req_duration,1634128662,651.006906,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,
grpc_req_duration,1634128662,652.396278,,,,,,,,,,,1,,,,
grpc_req_duration,1634128663,952.326306,,,,,,GetFeature,/main.RouteGuide/GetFeature,,default,main.RouteGuide,0,,,127.0.0.1:10000/main.RouteGuide/GetFeature,

There are also more of them, happens every time after the first iteration, interesting.

@mstoykov
Copy link
Contributor

@inancgumus - yes

@inancgumus
Copy link
Member Author

inancgumus commented Oct 13, 2021

@mstoykov Whenever a GRPC request (receive) ends, the *grpcstat.End event gets called in the HandleRPC method.

So:

  • When the reflect method communicates with the server for server reflection, it uses the same connection, causing *grpcstat.End to be called in the HandleRPC.
  • The same thing also happens when an Invoke method call happens.

Note: We can’t use another connection for this problem because Invoke calls need to happen again, and the result will be the same as before.

Solution 1

Canceling the context solves the problem. But as you said, it may create some other problems as I also witnessed. There happens many context canceled errors.

Solution 2

Instead of canceling the context, we may transport a value in the context to let the HandleRPC method know that the GRPC communication was due to the reflection. I tried the following and solved the problem without "context cancellation" errors.

type grpcContextKey int

const reflectionInProgress grpcContextKey = 1

func (c *Client) reflect(ctxPtr *context.Context) error {
	*ctxPtr = context.WithValue(*ctxPtr, reflectiontInProgress, true)
        ...
}
func (c *Client) HandleRPC(ctx context.Context, stat grpcstats.RPCStats) {
        ...
	case *grpcstats.End:
		inProgress, ok := ctx.Value(reflectionInProgress).(bool)
		if ok && inProgress {
                        // prevent the metric emission from the reflect calls
			return
		}
                ...
}

We can reset the context value to false here because the Invoke happens only after connection along with the reflect call.

func (c *Client) Invoke(...) (*Response, error) {
	*ctxPtr = context.WithValue(*ctxPtr, reflectionInProgress, false)

Other Ideas

  • Keep track of this with an additional field in the Client struct. Actually, the same idea but without Context.
  • Let people provide the methods they want to call beforehand in the init stage. So we can load those methods, maybe to a temporary proto file (or in-memory) that we’ll be compiling on the fly. And then automatically use the existing Load method and provide the file to it. I’m not sure how this can work with k6 instances, though.

@inancgumus
Copy link
Member Author

@na-- @imiric @codebien @yorugac What are your thoughts on this?

@imiric
Copy link
Contributor

imiric commented Oct 14, 2021

@inancgumus I'm not a fan of using the context to pass values like that or relying on some internal state, and would prefer the context cancellation approach if possible.

Where did you see context canceled errors? I tested Mihail's patch briefly and didn't see any.

One behavior I noticed in the *grpcstats.End case is that s.Error will only be non-nil when the connection is actually closing, i.e. at the very end. It could be for other reasons as well, but maybe we could check that error code and decide if we want to emit samples or not based on that?

That said, I'm not sure if we should "hide" the fact another RPC call is made for the reflection. It makes sense to show metrics for that call as well, so we should only avoid emitting empty samples.

@inancgumus
Copy link
Member Author

inancgumus commented Oct 14, 2021

@inancgumus I'm not a fan of using the context to pass values like that or relying on some internal state, and would prefer the context cancellation approach if possible.

@imiric Agreed. I didn't like it as well. It was emitting a metric because it was using a different context than the one the connection uses.

So now there is no need to create and cancel another context.

Fix: 927673d

@inancgumus inancgumus changed the title [WIP] Add grpc reflection GRPC reflection support Oct 20, 2021
@inancgumus inancgumus removed the wip label Oct 20, 2021
@inancgumus inancgumus force-pushed the feature/grpc-reflection branch from 4abd4fc to d5c3620 Compare October 20, 2021 14:02
joshcarp and others added 4 commits October 20, 2021 17:15
Author: joshcarp <joshuacarpeggiani@gmail.com>

Fix: Naming errors related with the new metric
registry like: GRPCReqDurationName

Co-Author: Inanc Gumus <m@inanc.io>
* reflect api error string
* refactor reflect method
* add more meaningful error messages
GRPC reflection was emitting empty metrics when enabled.
This fixes the problem by using the same GRPC connection
context with the reflection feature.
This commit removes the duplicated Invoke test
from grpc reflection tests.
@inancgumus inancgumus force-pushed the feature/grpc-reflection branch from d5c3620 to 3492e3b Compare October 20, 2021 14:16
@inancgumus inancgumus marked this pull request as ready for review October 20, 2021 14:18
@github-actions github-actions bot requested review from imiric and yorugac October 20, 2021 14:19
@inancgumus
Copy link
Member Author

It's not in a super polished state but let's discover if it's good enough as it is. This work fixes some issues previously discussed (see the first comment of this PR).

Copy link
Member

@na-- na-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ❤️ the tests!

Copy link
Contributor

@mstoykov mstoykov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

As I mention I still don't like the fact we don't return nil when there are not tags, but the alternative change looks to be for HandleRPC to handle that, so I guess this is the lesser everl 🤷

@@ -33,7 +33,7 @@ func withTags(ctx context.Context, tags reqtags) context.Context {
func getTags(ctx context.Context) reqtags {
v := ctx.Value(ctxKeyTags{})
if v == nil {
return nil
return make(map[string]string)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't like this change, but unfortunately by the looks of it - it will require some special handling in HandleRPC which likely will be the much more involved :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support to use gRPC Server Reflection
5 participants