TLS Authentication in Kubernetes, Pulsar 2.6.1 - Broker crash loop on startup due to 401 in WorkerService.start(..) #84
Comments
After I swapped the brokerClient auth from TLS auth to token auth in the broker-configmap.yaml and proxy-configmap.yaml files, the cluster started just fine. So it seems that there's a problem in the broker client TLS auth.
I added TLS debugging. (I forgot to tag this issue in the commit.)
It appears (from the debug logs) that the TLS session is established.
That implies that the CN is blank. However, the TLS logs (see attached pulsarbroker.txt) show that a CN is clearly present.
Just before the exception is thrown, it appears that the broker is successfully able to establish a TLS session with Zookeeper, but then it gives this odd message:
and then loads a lot of certs, like:
Immediately after it loads those certs, it reports:
and then gets the 401 with:
…match superadmin names to ensure the principals are authorized.
I tried changing all the certs to use CNs that match roles specified as superAdmin roles, but I can't get beyond the exception:
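For anyone following the CN angle: as far as I understand it, Pulsar's TLS authentication takes the role from the CN of the client certificate's subject, and that role is then compared against the configured superuser roles. The sketch below mimics that comparison offline on a comma-separated subject string such as CN=broker-admin,O=example. The function names, the sample subjects, and the role list are all hypothetical, and the parsing is deliberately naive (it does not handle escaped commas), so treat it as a debugging aid rather than Pulsar's actual logic.

```python
from typing import Optional, Set


def common_name(subject_dn: str) -> Optional[str]:
    """Return the CN value from a comma-separated subject DN, or None."""
    for part in subject_dn.split(","):
        key, sep, value = part.strip().partition("=")
        # Accept only a non-empty CN; a blank CN is treated as missing.
        if sep and key.strip().upper() == "CN" and value.strip():
            return value.strip()
    return None


def is_superuser(subject_dn: str, superuser_roles: Set[str]) -> bool:
    """True when the certificate's CN is one of the configured superuser roles."""
    cn = common_name(subject_dn)
    return cn is not None and cn in superuser_roles


if __name__ == "__main__":
    roles = {"admin", "broker-admin"}  # hypothetical superuser roles
    print(is_superuser("CN=broker-admin,O=example", roles))
    print(is_superuser("O=example", roles))
```

If the CN extracted this way is present and listed as a superuser role but the broker still returns a 401, the mismatch is probably somewhere other than the certificate subject.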
Here's a complete broker log.
@devinbost Did you ever find a solution for this? I am running into the same problem you described in apache/pulsar#8536. It seems to me to be related to how the function worker is connecting to the broker, but it doesn't have anything to do with the helm chart itself.
@gubespam I ended up putting this on the shelf to work on higher priority items, but I suspect it's a configuration issue.
I did something like this:
in broker-configmap.yaml. The broker is now able to start up when functions are enabled.
The remaining problem is that when you deploy a function, the function worker that gets spawned uses a default functions_worker.yml and not the one generated by bin/gen-yml-from-env.py conf/functions_worker.yml in the StatefulSet. So it now gets an HTTP 401 Unauthorized, because it is trying to post to http://localhost:8080, which of course is wrong :)
I'm trying to debug this currently, and then plan to make a giant PR that enables mTLS.
I had a similar error happen to me. The cause was that the tokens generated by the scripts/pulsar/prepare_helm_release.sh script and stored in Kubernetes secrets were asymmetric when they should have been symmetric. This happened because I changed values.yaml to be symmetric and redeployed; redeploying doesn't overwrite the secrets if they already exist. I fixed this by manually deleting all of the Kubernetes secrets, re-running the prepare script, and reinstalling the helm chart. After doing that, everything worked properly.
I believe that #435 addresses this issue. Released in version 3.2.0 of the chart.
Copying from the apache/pulsar GitHub issue (apache/pulsar#8536):
Describe the bug
After configuring TLS Authentication in Pulsar 2.6.1 with this helm chart: https://github.com/devinbost/pulsar-helm-chart/tree/tls-auth
the broker gets stuck in a restart loop due to the WorkerService crashing with a 401 during the WorkerService.start(..) method execution.
Edit:
After debugging, the issue is that the data is still unreadable after the decrypt step, so something is misconfigured with the certs.
To Reproduce
Steps to reproduce the behavior:
Start minikube with an appropriate number of CPUs:
minikube start --memory=8192 --cpus=6 --cni=bridge
Run the following commands to set up the Kubernetes environment, tokens, certs, and keys:
Install the local helm chart with the values file specified:
helm install --values examples/values-minikube-with-tls-and-jwt.yaml pulsar-ci ./charts/pulsar/
After waiting for a time, get logs from the broker:
kubectl -n pulsar logs pulsar-ci-broker-0
The logs should demonstrate the problem.
Expected behavior
Decryption should be happening correctly, resulting in the correct auth headers passing when we execute a PUT on the function/assignments topic during broker start.
Environment
Code involved (edited)
When we create the brokerAdmin client, we use the pulsarWebServiceUrl: https://github.com/apache/pulsar/blob/master/pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/WorkerService.java#L146
The first PUT on the function assignment topic uses the brokerAdmin client here: https://github.com/apache/pulsar/blob/master/pulsar-functions/worker/src/main/java/org/apache/pulsar/functions/worker/WorkerService.java#L169
There must be a cert misconfiguration issue.