Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Kibana basepath in associations #8053

Merged
merged 21 commits into from
Oct 3, 2024

Conversation

kvalliyurnatt
Copy link
Contributor

@kvalliyurnatt kvalliyurnatt commented Sep 18, 2024

fixes: #7909
Currently the operator does not look at the kibana basePath config and accordingly alter path for fleet, in this PR we have the operator checking the basePath config. Currently this only looks at the config in the KIbana spec, the basePath can also be set in the ENV of the KIbana pod, this is currently not being handled

TODO:

  • Test changes locally

@thbkrkr thbkrkr added the >bug Something isn't working label Sep 19, 2024
@botelastic botelastic bot removed the triage label Sep 19, 2024
@kvalliyurnatt kvalliyurnatt changed the title [WIP]: Handle Kibana basepath updates for fleet Handle Kibana basepath updates for fleet Sep 19, 2024
@kvalliyurnatt kvalliyurnatt marked this pull request as ready for review September 19, 2024 20:19
@naemono
Copy link
Contributor

naemono commented Sep 20, 2024

@kvalliyurnatt I'm going to take one final look at this today and it should be g2g.

return "", nil
}

kbucfgConfig, err := ucfg.NewFrom(kb.Spec.Config.Data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that one downside of that approach is that we are reading the expected config, not necessarily the one currently applied in the Kibana config Secret.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a great point ! We probably should look at the secret which would be the current applied config. I can make that change

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@barkbay but isn't the source of truth the kibana crd? @kvalliyurnatt and I were discussing and I'm not seeing much benefit of reading the secret vs the crd. Don't we treat the crd as the source of truth in nearly all of the controller, as we're talking eventual consistency here, right? Are there other places in the controller where we treat the secret as what should be applied across the stack? I'm trying to ensure we are consistent in how we handle this type of thing. Is there a situation you had in mind where one is better than the other?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@barkbay but isn't the source of truth the kibana crd

I consider the custom resources as the desired state. The current state can be different. It can be a temporary difference, the controller is making some progress and is implementing the desired state, or the controller can be "stuck" (bug in the controller, issue reaching the k8s api, problem in the Kibana resource itself..).

[...] I'm not seeing much benefit of reading the secret

If the Kibana controller can't reconcile the configuration we may use an incorrect view of the configuration, and then we may also fail to reconcile the Agent resource. As mention before it can be a temporary failure (Agent controller is faster than Kibana controller), or it can a permanent one (bug in the Kibana controller, error in the Kibana resource...). In all the cases it can be confusing to see Agent controller errors while the root cause could be the Kibana controller, this is the kind of situation I wanted to avoid by challenging the proposed implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not strongly leaning towards one implementation over the other, I have made the changes to read form secret, but @naemono let me know if you feel strongly to go with reading from the CR

I think this point below makes me lean more towards reading from the secret.

In all the cases it can be confusing to see Agent controller errors while the root cause could be the Kibana controller, this is the kind of situation I wanted to avoid by challenging the proposed implementation.

Copy link
Contributor

@naemono naemono left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this point below makes me lean more towards reading from the secret.

I don't disagree, but I wanted to have the discussion, and ensure that we are noting why we are doing so in the code, as it feels to me a slight (but not wrong) diversion for our SOP.

I'm good with the current approach, but do have some additional optimizations that are nice to have.

pkg/controller/agent/fleet.go Outdated Show resolved Hide resolved
pkg/controller/agent/fleet.go Outdated Show resolved Hide resolved
pkg/controller/agent/fleet.go Outdated Show resolved Hide resolved
@kvalliyurnatt
Copy link
Contributor Author

I tested it locally and I was able to verify that fleet tries to use the configured basepath

 "Post \"https://kibana-kb-http.kvalliy.svc:5601/monitoring/kibana/api/fleet/setup\": x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kibana-http\")"}]

I had some certificate issues locally, but I was able to check that the Kibana url included the basepath

@kvalliyurnatt kvalliyurnatt requested a review from pebrc September 24, 2024 13:11
Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the approach needs to be a bit more general as this PR fixes only the Fleet setup problem (which is the one where this issue was reported tbf)

My intuition would be to address this on the level of the association configuration where we convey the Kibana URL to the associated objects:

func getKibanaExternalURL(c k8s.Client, assoc commonv1.Association) (string, error) {
kibanaRef := assoc.AssociationRef()
if !kibanaRef.IsDefined() {
return "", nil
}
kb := kbv1.Kibana{}
if err := c.Get(context.Background(), kibanaRef.NamespacedName(), &kb); err != nil {
return "", err
}
serviceName := kibanaRef.ServiceName
if serviceName == "" {
serviceName = kbv1.HTTPService(kb.Name)
}
nsn := types.NamespacedName{Namespace: kb.Namespace, Name: serviceName}
return association.ServiceURL(c, nsn, kb.Spec.HTTP.Protocol())
}

Other associated applications like Beats and APM in older versions also talk to the Kibana API to install dashboards IIRC and that functionality would be broken with a base path that is non-empty as well.

Finally I think we should try to configure the readiness probe for the Kibana pods to respect the basePath setting:

func readinessProbe(useTLS bool) corev1.Probe {
scheme := corev1.URISchemeHTTP
if useTLS {
scheme = corev1.URISchemeHTTPS
}
return corev1.Probe{
FailureThreshold: 3,
InitialDelaySeconds: 10,
PeriodSeconds: 10,
SuccessThreshold: 1,
TimeoutSeconds: 5,
ProbeHandler: corev1.ProbeHandler{
HTTPGet: &corev1.HTTPGetAction{
Port: intstr.FromInt(network.HTTPPort),
Path: "/login",
Scheme: scheme,
},
},
}
}

In the orginal issue I said we should only look at the Kibana configuration but I wonder if it would be much more effort to cover the case where users have set up the basePath via an environment variable (not recommended but 🤷 )

@kvalliyurnatt
Copy link
Contributor Author

I think the approach needs to be a bit more general as this PR fixes only the Fleet setup problem (which is the one where this issue was reported tbf)

My intuition would be to address this on the level of the association configuration where we convey the Kibana URL to the associated objects:

func getKibanaExternalURL(c k8s.Client, assoc commonv1.Association) (string, error) {
kibanaRef := assoc.AssociationRef()
if !kibanaRef.IsDefined() {
return "", nil
}
kb := kbv1.Kibana{}
if err := c.Get(context.Background(), kibanaRef.NamespacedName(), &kb); err != nil {
return "", err
}
serviceName := kibanaRef.ServiceName
if serviceName == "" {
serviceName = kbv1.HTTPService(kb.Name)
}
nsn := types.NamespacedName{Namespace: kb.Namespace, Name: serviceName}
return association.ServiceURL(c, nsn, kb.Spec.HTTP.Protocol())
}

Other associated applications like Beats and APM in older versions also talk to the Kibana API to install dashboards IIRC and that functionality would be broken with a base path that is non-empty as well.

Finally I think we should try to configure the readiness probe for the Kibana pods to respect the basePath setting:

func readinessProbe(useTLS bool) corev1.Probe {
scheme := corev1.URISchemeHTTP
if useTLS {
scheme = corev1.URISchemeHTTPS
}
return corev1.Probe{
FailureThreshold: 3,
InitialDelaySeconds: 10,
PeriodSeconds: 10,
SuccessThreshold: 1,
TimeoutSeconds: 5,
ProbeHandler: corev1.ProbeHandler{
HTTPGet: &corev1.HTTPGetAction{
Port: intstr.FromInt(network.HTTPPort),
Path: "/login",
Scheme: scheme,
},
},
}
}

In the orginal issue I said we should only look at the Kibana configuration but I wonder if it would be much more effort to cover the case where users have set up the basePath via an environment variable (not recommended but 🤷 )

@pebrc I have made the requested changes, please take another look when you get a chance

@kvalliyurnatt kvalliyurnatt requested a review from pebrc September 27, 2024 18:25
Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to look at server.rewriteBasePath as well. Let me know if I missed something important here, but otherwise Kibana just ignores the basePath.

Also I wonder if would make sense to either modify an existing e2e test or add a new one to exercise this functionality.

pkg/controller/association/controller/apm_kibana.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
pkg/controller/association/controller/apm_kibana.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
@kvalliyurnatt kvalliyurnatt requested a review from pebrc September 30, 2024 23:03
@kvalliyurnatt
Copy link
Contributor Author

I think we need to look at server.rewriteBasePath as well. Let me know if I missed something important here, but otherwise Kibana just ignores the basePath.

Also I wonder if would make sense to either modify an existing e2e test or add a new one to exercise this functionality.

I modified the apm association test to test the Kibana basepath functionality

Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! a few nits and a finding around truthy values in Kibana conf. Otherwise good to go imo.

pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
test/e2e/test/kibana/builder.go Outdated Show resolved Hide resolved
pkg/controller/kibana/pod.go Outdated Show resolved Hide resolved
test/e2e/apm/association_test.go Show resolved Hide resolved
@pebrc
Copy link
Collaborator

pebrc commented Oct 2, 2024

buildkite test this -f p=gke,t=TestAPMKibanaAssociation

@kvalliyurnatt
Copy link
Contributor Author

buildkite test this -f p=gke,t=TestAPMKibanaAssociation

@kvalliyurnatt kvalliyurnatt requested a review from pebrc October 2, 2024 20:11
Copy link
Collaborator

@pebrc pebrc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM nice work!

@kvalliyurnatt kvalliyurnatt merged commit 08d2bf9 into elastic:main Oct 3, 2024
5 checks passed
@kvalliyurnatt kvalliyurnatt deleted the fix_kibana_basepath branch October 3, 2024 15:44
@rhr323 rhr323 changed the title Handle Kibana basepath updates for fleet Support Kibana basepath in associations Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug Something isn't working v2.15.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fleet server pod not created when Kibana is set with SERVER_BASEPATH
5 participants