Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for health service in gRPC server #9528

Merged
merged 2 commits into from
Jan 31, 2022

Conversation

shichao-an
Copy link
Contributor

@shichao-an shichao-an commented Jan 19, 2022

Description

Related Issue(s)

Fixes #9333

Checklist

  • Should this PR be backported?
  • Tests were added or are not required
  • Documentation was added or is not required

Deployment Notes

Signed-off-by: Shichao An <shichaoan.cn@gmail.com>
@vmg
Copy link
Collaborator

vmg commented Jan 24, 2022

This doesn't break anything, but I'm not exactly convinced that the generic GRPC health service API makes sense for Vitess, as our health status is already exposed via other APIs and it is more complex than what this generic API provides (most notably, Vitess distinguishes between serving/not serving -- which applies to queries but not the GRPC interface itself, and cannot be modelled with this API).

Is there a particular use case you have in mind for wiring up this API? If you're trying to health check a specific service in a Vitess cluster, I would recommend you use the GRPC APIs which we already provide and that relay much more relevant information about the actual health status of the Vitess cluster and its services. Cheers.

@shichao-an
Copy link
Contributor Author

This doesn't break anything, but I'm not exactly convinced that the generic GRPC health service API makes sense for Vitess, as our health status is already exposed via other APIs and it is more complex than what this generic API provides (most notably, Vitess distinguishes between serving/not serving -- which applies to queries but not the GRPC interface itself, and cannot be modelled with this API).

Is there a particular use case you have in mind for wiring up this API? If you're trying to health check a specific service in a Vitess cluster, I would recommend you use the GRPC APIs which we already provide and that relay much more relevant information about the actual health status of the Vitess cluster and its services. Cheers.

@vmg Our Service Networking infrastructure uses Envoy health check (link), which requires "grpc.health.v1.health" service endpoint and an optional service name (for example, "vtgateservice.Vitess") to send to this endpoint. Almost all our gRPC services have "grpc.health.v1.health" and that's why SN team made that assumption without supporting custom health endpoint for gRPC (they do support custom HTTP endpoint, but only for HTTP services). Without "grpc.health.v1.health", the Envoy HC simply fails.

On the higher level, it is for Envoy to determine load balancing semantics such as panic mode, so it doesn't really matter whether it is serving/not_serving from Vitess's perspective.

@vmg
Copy link
Collaborator

vmg commented Jan 25, 2022

Sounds good to me. Have you tested that this change makes your Envoy work in your infrastructure? From looking at the Godoc for the HealthCheck API, it seems like you may need to call SetServingStatus at least once to register a healthy service by name?

@shichao-an
Copy link
Contributor Author

Sounds good to me. Have you tested that this change makes your Envoy work in your infrastructure? From looking at the Godoc for the HealthCheck API, it seems like you may need to call SetServingStatus at least once to register a healthy service by name?

@vmg Need your input on this as well, since by default our gRPC applications use a custom health check that implements the standard Health service interface (something like this, health.RegisterHealthServer(grpcServer, customServer), where customServer implementing Check() and Watch() that returns serving/not_serving for HealthCheckRequest).

So if I only registered the default Server (as in this PR) like this healthpb.RegisterHealthServer(gserv, health.NewServer()) without using SetServingStatus or use a custom health check (like what we do internally), it will work for HealthCheckRequest whose input service is not set, and return SERVING (see this link.

I think for Vitess gRPC server (which is used by vtgate, vttablet, etc.), if we want to explicitly set serving services, we can simply use SetServingStatus. I wonder what services should I set as serving? Or should I simply set "grpc.health.v1.Health" as serving just as in

s.healthServer.SetServingStatus(healthServiceName, healthpb.HealthCheckResponse_SERVING)

@vmg
Copy link
Collaborator

vmg commented Jan 26, 2022

The code seen in that commit looks good to me. Iterating the services and calling SetServingStatus will work fine. Can you push that commit to this PR?

Signed-off-by: Shichao An <shichaoan.cn@gmail.com>
@shichao-an
Copy link
Contributor Author

The code seen in that commit looks good to me. Iterating the services and calling SetServingStatus will work fine. Can you push that commit to this PR?

Done. Here are the local tests I've done.

Make and start local docker:

make docker_local
make docker_run_local

Test gRPC calls using grpcurl

st-shichao1:~ shichao$ grpcurl  -plaintext localhost:15991 list
grpc.health.v1.Health
grpc.reflection.v1alpha.ServerReflection
vtgateservice.Vitess
st-shichao1:~ shichao$ grpcurl -plaintext localhost:15991 grpc.health.v1.Health.Check
{
  "status": "SERVING"
}
st-shichao1:~ shichao$ grpcurl -d '{ "service": "" }' -plaintext localhost:15991 grpc.health.v1.Health.Check
{
  "status": "SERVING"
}
st-shichao1:~ shichao$ grpcurl -d '{ "service": "grpc.health.v1.Health" }' -plaintext localhost:15991 grpc.health.v1.Health.Check
{
  "status": "SERVING"
}
st-shichao1:~ shichao$ grpcurl -d '{ "service": "vtgateservice.Vitess" }' -plaintext localhost:15991 grpc.health.v1.Health.Check
{
  "status": "SERVING"
}
st-shichao1:~ shichao$ grpcurl -d '{ "service": "grpc.reflection.v1alpha.ServerReflection" }' -plaintext localhost:15991 grpc.health.v1.Health.Check
{
  "status": "SERVING"
}
st-shichao1:~ shichao$ grpcurl -d '{ "service": "foobar" }' -plaintext localhost:15991 grpc.health.v1.Health.Check
ERROR:
  Code: NotFound
  Message: unknown service

@vmg vmg merged commit 69bdf93 into vitessio:main Jan 31, 2022
@shichao-an
Copy link
Contributor Author

@vmg Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Cluster management Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support gRPC health check service in VTGate
2 participants