Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pulsar Manager Initialization and Securing of Admin Secrets #448

Closed

Conversation

Mortom123
Copy link
Contributor

@Mortom123 Mortom123 commented Jan 28, 2024

See #438 / #438 (comment) 3.PR

Also the Pulsar-Manager Admin interface is exposed on a different, non outward facing, service because of given concerns.

Motivation

Right now you need to initialize the pulsar-manager manually, this can be done as an init job.

Modifications

  1. Change Admin Port exposure of Pulsar Manager to second internal service
  2. Provide means to securely generate a password / use an existing one
  3. Provide a job that runs on init and conditions the Pulsar Manager

Verifying this change

Some things I also noted:

  1. I tried all combinations of BROKER_URL and BOOKIE_URL such that BOOKIE_URL references the Bookie service and one of its ports (as given in the values). However, none of them worked. I suspect there's a naming error on side of the Pulsar Manager
  2. There are some errors when starting the manager. Maybe part of this deployment is incorrect.
/pulsar-manager/startup.sh: 21: initdb: not found 
/pulsar-manager/startup.sh: 22: pg_ctl: not found 

@Mortom123
Copy link
Contributor Author

Mortom123 commented Jan 28, 2024

Please have a look @lhotari, this also addresses some of the issues you mentioned regarding the admin port exposure.
Thanks

{{/* create environment */}}
{{- if or (not .Values.tls.enabled) (not .Values.tls.broker.enabled) }}
BROKER_URL="http://{{ template "pulsar.fullname" . }}-{{ .Values.broker.component }}:{{ .Values.broker.ports.http }}"
BOOKIE_URL="http://{{ template "pulsar.fullname" . }}-{{ .Values.broker.component }}:{{ .Values.broker.ports.pulsar }}"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tried to use .Values.bookie.component with any of the ports under .Values.bookie.ports, but none worked. However, .Values.bookie.component worked with .Values.broker.ports.pulsar ???

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how it could work with broker's component and port. More comments below.

charts/pulsar/values.yaml Outdated Show resolved Hide resolved
Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address the review comments.

@lhotari
Copy link
Member

lhotari commented Jan 29, 2024

Thanks for the contribution @Mortom123 ! This will be a great improvement to Apache Pulsar Helm chart!

@Mortom123 Mortom123 force-pushed the feature/pulsar_manager_initialze branch from 3530530 to 3a1dc04 Compare January 29, 2024 15:21
@Mortom123
Copy link
Contributor Author

Mortom123 commented Jan 29, 2024

@lhotari I updated with your feedback and decided to also reduce the complexity of the secret by only use a single username/password for the UI / DB. I don't see much advantage of having separate ones there.

I allowed for manual overwrite of secret values, similar to kube-prometheus-stack, but the default (empty password) is a randomly generated password.
Maybe you have an idea how can test if the login/checking of the environment of the pulsar manager works in the CI pipeline?

I removed existingSecretName, because when injecting Secrets through secondary applications, I would argue that you have to provide a name for it either way. so might as well set it to the required name.

During this I also wondered why we have such flexibility for the names for of the components. Most of the things are setup using pulsar.fullname-component.name. Why don't we fix component.name and only allow for a changing fullname through e.g. other release names?

@lhotari lhotari mentioned this pull request Jan 29, 2024
1 task
@Mortom123 Mortom123 requested a review from lhotari January 30, 2024 08:13
@lhotari
Copy link
Member

lhotari commented Jan 31, 2024

I updated with your feedback and decided to also reduce the complexity of the secret by only use a single username/password for the UI / DB. I don't see much advantage of having separate ones there.

I think having a single username/password for both UI & DB is simply a very bad practice. It's better to have separate passwords. It's not that much additional overhead to add it now and follow good practices.

I allowed for manual overwrite of secret values, similar to kube-prometheus-stack, but the default (empty password) is a randomly generated password.

That part looks really good!

Maybe you have an idea how can test if the login/checking of the environment of the pulsar manager works in the CI pipeline?

Perhaps a simple HTTP request with curl & grep would do the job? That could be added in a similar way as the Pulsar Functions tests:

if [[ "$(ci::helm_values_for_deployment | yq .components.functions)" == "true" ]]; then
# test functions
ci::test_pulsar_function
fi

You would add this to .ci/chart_test.sh and .ci/helm.sh files. In addition there would need to be a test scenario in the GitHub Actions workflow matrix configured in https://github.com/apache/pulsar-helm-chart/blob/master/.github/workflows/pulsar-helm-chart-ci.yaml#L170 with respective values file in .ci/clusters directory.
When you run the CI jobs in your own fork (by enabling GitHub Actions) and opening a PR to your own fork, you can debug GitHub Actions by logging in by ssh. ssh access is only for your ssh public keys registered in GH, it's https://github.com/Mortom123.keys for you. There's a step in the build that will print the command for connecting with ssh. example from a build in my fork:
image
Once you have ssh access, you can debug things to see what problems there are in the CI environment.

I removed existingSecretName, because when injecting Secrets through secondary applications, I would argue that you have to provide a name for it either way. so might as well set it to the required name.

+1, pulsar-manager integration has been poor so I'm not concerned about the possible breaking change.

During this I also wondered why we have such flexibility for the names for of the components. Most of the things are setup using pulsar.fullname-component.name. Why don't we fix component.name and only allow for a changing fullname through e.g. other release names?

I guess this is some of the legacy baggage of the chart. As long as there's a backwards compatible way to improve things gradually, we can do that.

user: pulsar
password: pulsar
username: "pulsar"
password: "" # leave empty for random password
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linting failed on this line: 1337:18 [comments] too few spaces before comment

@Mortom123
Copy link
Contributor Author

@lhotari I added both credentials as well as CI tests. But the tests have some problems:

  1. Here the pulsar manager is not ready to accept connections before being hit by the curl. I tried to fix this by always waiting for jobs to complete in the pipeline. I don't know why we don't wait when installing anyway? Maybe you have some explanation?
  2. This also failed, because it takes some time for the job to be triggered (both manager + broker need to be fully ready).

There are now a couple of ways to fix this:

  1. Increase wait timeout in $install_args
  2. Revert the commit touching $install args and put an until loop before hitting the manager in a similar fashion as here.

I would prefer to wait for all jobs to complete during install. But what do you or @frankjkelly think?

@Mortom123 Mortom123 requested a review from lhotari February 5, 2024 12:45
@lhotari
Copy link
Member

lhotari commented Feb 13, 2024

Closing and re-opening to trigger a new build

@lhotari lhotari closed this Feb 13, 2024
@lhotari lhotari reopened this Feb 13, 2024
@lhotari
Copy link
Member

lhotari commented Feb 13, 2024

Closing and re-opening once again to get a fix for the linting job in CI.

@lhotari lhotari closed this Feb 13, 2024
@lhotari lhotari reopened this Feb 13, 2024
@Mortom123
Copy link
Contributor Author

@lhotari i think the build wont work yet. because the job that initializes the manager is not run in timed / waited for, see #448 (comment)

@lhotari
Copy link
Member

lhotari commented Feb 13, 2024

  1. I don't know why we don't wait when installing anyway? Maybe you have some explanation?

I guess the reason has been that --wait didn't work before adding MetalLB in a8c7745 . Services of type: LoadBalancer require a load balancer implementation in the k8s cluster. MetalLB can be used for "kind" which is used in CI.

Another reason for not using --wait might have been that it's possible to have more logging about k8s events and pod state when the helm install command returns and the waiting loop can do whatever it wants. That's just a guess that I'm making.

@Mortom123 Mortom123 force-pushed the feature/pulsar_manager_initialze branch from 603e8d3 to 15c0876 Compare February 13, 2024 14:42
@lhotari
Copy link
Member

lhotari commented Feb 13, 2024

@Mortom123 if you'd like to debug the issue, it's best to create a PR to your own fork since that will activate ssh access to the build VM.

if: ${{ github.repository != 'apache/pulsar-helm-chart' && github.event_name == 'pull_request' }}

It requires a pull request currently and won't work for manually triggered jobs in your fork.

@lhotari
Copy link
Member

lhotari commented Feb 13, 2024

I wonder if upgrading the image to v0.4.0 would help?

https://hub.docker.com/r/apachepulsar/pulsar-manager/tags

@Mortom123
Copy link
Contributor Author

Alright. I will try to do so. :) @lhotari

@lhotari
Copy link
Member

lhotari commented Feb 13, 2024

@Mortom123 have you figured out how to use the ssh access to the build? that makes the feedback loop much faster when you have a shell inside a build VM and you can try out different things and debug issues directly. For example, there's k9s available to view the kind k8s cluster.

@Mortom123 Mortom123 closed this Feb 13, 2024
@Mortom123 Mortom123 force-pushed the feature/pulsar_manager_initialze branch from 02058b7 to 1f20887 Compare February 13, 2024 16:43
@Mortom123
Copy link
Contributor Author

I will try debugging on my Repo for now and submit a clean PR once its done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants