publishNotReadyAddresses only for headless not main service #817

Closed
der-eismann opened this issue Dec 7, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@der-eismann

Describe the bug
While migrating our Vault deployment to the Helm chart, I got a bit confused by your use of publishNotReadyAddresses in the services. Basically you have two choices:

  • enabled (default): Your vault.default.svc.cluster.local (or vault-active) service can have unready pods attached which means requests to the service can fail. This is bad because some integrations like spring-cloud-vault still have no retry mechanism and will fail.
  • disabled: Cluster formation will probably fail because unready pods are not available via the vault-internal headless service

IMHO only the headless service should have publishNotReadyAddresses enabled; all other services should have it disabled. So I'd like it either to be configurable per service or to have it fixed for some of the services.
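
A minimal sketch of that split, written as plain Service manifests rather than the chart's actual templates. The service names vault and vault-internal come from this issue, and 8200/8201 are Vault's usual API and cluster ports; the selector label and the port names are assumptions for illustration only.

```yaml
# Headless service used for cluster formation.
# Unready pods must be discoverable here so peers can find each other.
apiVersion: v1
kind: Service
metadata:
  name: vault-internal
spec:
  clusterIP: None                    # headless
  publishNotReadyAddresses: true
  selector:
    app.kubernetes.io/name: vault    # assumed label, adjust to your deployment
  ports:
    - name: http                     # assumed port name
      port: 8200
      targetPort: 8200
    - name: https-internal           # assumed port name
      port: 8201
      targetPort: 8201
---
# Client-facing service.
# Only ready pods should receive requests, so unready addresses stay unpublished.
apiVersion: v1
kind: Service
metadata:
  name: vault
spec:
  publishNotReadyAddresses: false
  selector:
    app.kubernetes.io/name: vault    # assumed label, adjust to your deployment
  ports:
    - name: http                     # assumed port name
      port: 8200
      targetPort: 8200
```

With this layout, clients without a retry mechanism are only routed to ready pods, while cluster formation through the headless service is unaffected.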

To Reproduce
Steps to reproduce the behavior:

  1. Install chart in HA mode
  2. Send requests to the default vault k8s service
  3. When one or multiple pods are unready it can happen that the requests go to the unready pod and return a server error or time out

Expected behavior
Requests to the default vault k8s service should only go to ready pods without it affecting the cluster formation via the headless service.

Environment

  • Kubernetes version: 1.23.13
    • Distribution or cloud vendor (OpenShift, EKS, GKE, AKS, etc.): self-hosted
    • Other configuration options or runtime services (istio, etc.): istio
  • vault-helm version: 0.23.0

Chart values:

server:
  enabled: true
  service:
    publishNotReadyAddresses: false
@der-eismann added the bug label Dec 7, 2022
@Matthias247

This seems to be addressed by #902 and the 0.25 release?

@tomhjp
Contributor

tomhjp commented Aug 4, 2023

Thanks for pointing this out - that's correct! Thanks for the detailed issue report too @der-eismann!

@tomhjp closed this as completed Aug 4, 2023
@rgarcia89

I would like to add a finding to this topic, since it seems that publishNotReadyAddresses is set to true for the vault-internal service because of this issue.

I am running a Kubernetes cluster with a default deny-all network policy that allows only whitelisted connections. For the Vault cluster, which consists of 3 pods, I have enabled communication between the pods on ports 8200 and 8201. Everything is functioning correctly so far. However, I log every detected denial, and this is where the issue arises: the vault-internal service also publishes non-ready pods, so when a pod enters the termination state it is not promptly removed from the headless service. The other Vault pods therefore keep trying to connect to it until it is finally terminated, and those attempts are logged as denials because the target pod is not removed from the service quickly enough. Normally that removal would happen as soon as a pod goes into the termination state.
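
For context, the peer rule described above could look roughly like the following. This is only a sketch: the policy name, namespace, and pod label are hypothetical, and a real deny-all setup would need further rules (DNS, clients, the Kubernetes API) that are left out here.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: vault-peer-traffic           # hypothetical name
  namespace: vault                   # assumed namespace
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: vault  # assumed pod label
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow other Vault pods to reach this pod on the API and cluster ports.
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: vault
      ports:
        - protocol: TCP
          port: 8200
        - protocol: TCP
          port: 8201
  egress:
    # Allow this pod to reach other Vault pods on the same ports.
    - to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: vault
      ports:
        - protocol: TCP
          port: 8200
        - protocol: TCP
          port: 8201
```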


4 participants