[docs-only] ADR0029 - grpc in kubernetes #9488
base: master
Conversation
Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.
force-pushed from 0e9df5b to 4e66dec
When trying to use the … I'll add it to cs3org/reva#4744 and make the service names configurable in #9490 ... then we can test the behavior under load.
After digesting this, ponder on the thought that some services expose http ports as well as grpc ... we need to clarify how http requests are retried and load balanced as well. If grpc uses headless services and dns ... that might not mix with go micro http clients ...
@butonic after you investigated it, which option would you personally prefer?
force-pushed from b755c50 to 3667594
I no longer see a strict requirement to have a service registry for the two main deployment scenarios. For a bare metal deployment I'd prefer unix sockets for grpc, and for kubernetes I'd prefer DNS because the go grpc libs support balancing based on DNS. Even for docker (compose), unix sockets can be replaced with tcp connections to hostnames for setups that need to run some services in a dedicated container.

Now http requests also need to be load balanced and retried ... in kubernetes long running http connections would face the same problems as grpc: the client might try to send requests to a no longer or not yet ready / healthy service. But I haven't found a good resource on how to retry and load balance http connections in kubernetes based on the same dns magic that go-grpc does. Something like esiqveland/balancer, benschw/dns-clb-go, benschw/srv-lb ... but maintained? https://github.com/markdingo/cslb had a release in 2023
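For reference, a minimal sketch of what such a dns-based go-grpc client could look like; the target name and port are illustrative assumptions, not the ocis configuration:

```go
// Sketch: let grpc-go's built-in dns resolver balance RPCs across the A records
// of a kubernetes headless service. Service name and port are assumptions.
package main

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func newBalancedConn() (*grpc.ClientConn, error) {
	return grpc.NewClient(
		// "dns:///" forces the dns resolver; a headless service (clusterIP: None)
		// makes the lookup return one A record per ready pod.
		"dns:///storage-users.ocis.svc.cluster.local:9142",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		// pick_first is the default; round_robin spreads RPCs over all resolved pods.
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig": [{"round_robin":{}}]}`),
	)
}
```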
Signed-off-by: Jörn Friedrich Dreyer <jfd@butonic.de>
As far as I see, the DNS would act as our service registry. We'd still have a service registry, just not maintained by us (which could be good). While I assume this would work for kubernetes (I see some plans on how it could work), we'll also need to take into account other environments. Moreover, this should have a fully automated setup and tear down (or provide a simple command to do it); configuring DNS entries the way we want manually won't be for average people.

For the "client hammering the DNS", I think that would be a client behavior we could fix. I mean, once the client has resolved the DNS and we're connected to the target service, it's up to the client to decide whether to reuse the same connection or request a new connection to a different replica.

One big problem I see with this solution is that we'll need to do a migration. This seems like a big breaking change, and maybe a drawback big enough to discard the solution.
I agree that in a kubernetes environment using headless services and dns would act as the service registry for go-grpc clients. (I still need to better understand http clients.) I see four ways to run ocis:
For the first three deployment types unix sockets would suffice. In docker we can use hostnames with a tcp transport if we really need to spread the services over multiple containers. I don't see the necessity for a dedicated dns server; docker swarm also has a built-in DNS based service discovery. For kubernetes we can use dns and preconfigure all addresses using the helm charts.

IMNSHO we should aim for unix sockets and fewer processes / pods. We should move some tasks to dedicated containers for security, e.g. thumbnailers and content indexing. The current helm chart deploying every service in a dedicated container is just a waste of resources - AND fragile.

For grpc the go client package has evolved to a point where it can handle everything that is necessary: https://pkg.go.dev/google.golang.org/grpc#ClientConn
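As a sketch of the unix socket variant (the socket path is an assumption for illustration), grpc-go can dial such a target directly without any registry:

```go
// Sketch: dialing a gRPC service over a unix socket for bare metal /
// single-container deployments. No custom resolver or registry is needed.
package main

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func newLocalConn() (*grpc.ClientConn, error) {
	// grpc-go understands the "unix://" target scheme out of the box.
	return grpc.NewClient(
		"unix:///run/ocis/storage-users.sock", // hypothetical path
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
}
```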
Retries are a matter of configuration. But picking up new dns entries is ... a long-standing issue grpc/grpc#12295, with the two scenarios (existing pod goes down, new pod comes up) starting to be discussed in grpc/grpc#12295 (comment). Reading the thread it seems the default dns:// resolver will, by design, not pick up new pods unless we configure a MaxConnectionAge on the server side. The 'optimal' solution is to use a name resolution system that has notifications - aka the kubernetes API. cs3org/reva#4744 allows us to test and benchmark both: grpc go with …
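A hedged sketch of the two knobs mentioned here: server-side MaxConnectionAge plus a client-side retry policy in the service config. The method name and durations are illustrative assumptions, not the ocis defaults.

```go
// Server side: MaxConnectionAge forces clients to reconnect periodically, which
// makes the dns resolver re-resolve and discover new pods. Client side: a retry
// policy retries idempotent RPCs that hit a dying pod.
package main

import (
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"
)

func newServer() *grpc.Server {
	return grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
		MaxConnectionAge:      30 * time.Second, // rotate connections so DNS changes are picked up
		MaxConnectionAgeGrace: 10 * time.Second, // let in-flight RPCs finish before closing
	}))
}

const serviceConfig = `{
  "loadBalancingConfig": [{"round_robin":{}}],
  "methodConfig": [{
    "name": [{"service": "cs3.gateway.v1beta1.GatewayAPI"}],
    "retryPolicy": {
      "maxAttempts": 3,
      "initialBackoff": "0.1s",
      "maxBackoff": "1s",
      "backoffMultiplier": 2,
      "retryableStatusCodes": ["UNAVAILABLE"]
    }
  }]
}`

func newRetryingConn(target string) (*grpc.ClientConn, error) {
	return grpc.NewClient(target,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(serviceConfig),
	)
}
```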
It seems docker has an internal DNS server we could use. Assuming it has all the capabilities we need, we wouldn't need a custom DNS server. (I don't know how we can configure the DNS to provide the SRV records we need - or how we are going to register our services in the DNS otherwise; so we might still need a custom DNS we can configure at will)
If we're going down the dns route, I think it should work everywhere regardless of the deployment. This includes kubernetes with every service on an independent server, even though that deployment itself could be a bad idea. Then we could have "official" deployments with different sets of services on different servers. For docker, it seems that we aim for something like (only relevant content):
dig response for a different container in the same docker network:
I guess that should match the kubernetes setup, and whatever library we use for the connection should be able to work with it.
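As a rough illustration of what a client-side balancer would see if the docker or kubernetes DNS really serves SRV records (the service, proto and domain names are assumptions), a plain stdlib lookup is enough to inspect them:

```go
// Sketch: query SRV records the same way `dig SRV _grpc._tcp.<service>` would,
// to check what a DNS-based balancer could work with.
package main

import (
	"context"
	"fmt"
	"net"
)

func main() {
	_, records, err := net.DefaultResolver.LookupSRV(
		context.Background(), "grpc", "tcp", "storage-users.ocis.svc.cluster.local")
	if err != nil {
		panic(err)
	}
	for _, srv := range records {
		// Each ready instance shows up as its own target:port pair.
		fmt.Printf("%s:%d (priority %d, weight %d)\n", srv.Target, srv.Port, srv.Priority, srv.Weight)
	}
}
```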
To leverage the kubernetes pod state, we first used the go micro kubernetes registry implementation. When a pod fails the health or readiness probes, kubernetes will no longer
- send traffic to the pod via the kube-proxy, which handles the ClusterIP for a service,
- list the pod in DNS responses when the ClusterIP is disabled by setting it to `none`

When using the ClusterIP, HTTP/1.1 requests will be routed to a working pod.

This nice setup starts to fail with long-lived connections. The kube-proxy is connection based, causing requests with Keep-Alive to stick to the same pod for more than one request. Worse, HTTP/2 and in turn gRPC multiplex the connection. They will not pick up any changes to pods, which explains the symptoms:
are we using this at all? Isn't the Go micro registry returning pod IPs?
And from what I know, the nats-js-kv service registry doesn't have any insights into healthiness or readiness of services it tries to contact.
1. new pods will not be used because clients will reuse the existing gRPC connection
2. gRPC clients will still try to send traffic to killed pods because they have not picked up that the pod was killed, or the pod was killed a millisecond after the lookup was made.

An additional problem is that the health and readiness implementations of oCIS services do not always reflect the correct state of the service. One example is the storage-users service, which returns ready `true` while running a migration on startup.
From what I know all /healthz and /readyz endpoints are hardcoded to true. Which is funny because the debug server might be up before the actual service server.
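A minimal sketch, assuming we want the readiness endpoint to reflect real startup state rather than a hardcoded value; all names and the port here are hypothetical:

```go
// Sketch: a /readyz handler backed by an atomic flag that the service flips
// only after startup work such as a migration has finished. While it returns
// a non-2xx status, kubernetes keeps the pod out of endpoints / DNS answers.
package main

import (
	"net/http"
	"sync/atomic"
)

type readiness struct{ ready atomic.Bool }

func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	if r.ready.Load() {
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte("ok"))
		return
	}
	w.WriteHeader(http.StatusServiceUnavailable)
}

func main() {
	r := &readiness{}
	http.Handle("/readyz", r)

	go func() {
		runMigrations() // hypothetical startup work, e.g. a storage-users migration
		r.ready.Store(true)
	}()

	_ = http.ListenAndServe(":9229", nil)
}

func runMigrations() { /* placeholder for real startup work */ }
```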
I investigated #8589 and tried to sum up my findings in an ADR because it may have architectural consequences.