Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Outlier detection (passive health checking) is enabled by default without a BTP #4854

Closed
lsjostro opened this issue Dec 5, 2024 · 6 comments · Fixed by #4856
Closed

Outlier detection (passive health checking) is enabled by default without a BTP #4854

lsjostro opened this issue Dec 5, 2024 · 6 comments · Fixed by #4856
Milestone

Comments

@lsjostro
Copy link
Contributor

lsjostro commented Dec 5, 2024

Description:
Passive health checking seems to be enabled by default without specifying a BTP. This might not always be preferred (even though passive health checks is super useful).

Repro steps:
Create a GW and HTTPRoute with no BTP and dump config, you will find "outlier_detection": {} present which enabled passive health with default values.

Environment:
EG 1.2.3

Logs:
look at the envoy_cluster_outlier_detection_ejections_active{} metric to see that outlier detection is active.

@lsjostro lsjostro added the triage label Dec 5, 2024
@arkodg
Copy link
Contributor

arkodg commented Dec 5, 2024

@arkodg arkodg added kind/bug Something isn't working help wanted Extra attention is needed cherrypick/release-v1.1.4 cherrypick/release-v1.2.4 and removed triage labels Dec 5, 2024
@arkodg arkodg added this to the v1.3.0-rc.1 milestone Dec 5, 2024
@lsjostro
Copy link
Contributor Author

lsjostro commented Dec 5, 2024

removing this line should be enough? https://github.com/envoyproxy/gateway/blob/main/internal/xds/translator/cluster.go#L106. + fixing the translator tests. OutlierDetection field will then be nil and not initiated and not included in the final config. I can open a PR if that helps?

@arkodg
Copy link
Contributor

arkodg commented Dec 5, 2024

looks like its been this way since the beginning

@lsjostro can you share your use case why you'd like to disable it ?

if we re changing behavior, this probably needs a bigger discussion @envoyproxy/gateway-maintainers

@arkodg arkodg removed help wanted Extra attention is needed cherrypick/release-v1.1.4 cherrypick/release-v1.2.4 kind/bug Something isn't working labels Dec 5, 2024
@lsjostro
Copy link
Contributor Author

lsjostro commented Dec 5, 2024

Yeah sure. First it's a bit misleading that you have passive health check enabled on all clusters even though you haven't explicit enabled it using a BTP. Second you might have tiers of Api gateways (proxyies). Envoy on first tier then another tier of envoys or other proxies before actually hitting the real service that might throw 500s. With outlier detection enabled on the first tier doesnt really make sense.

@lsjostro
Copy link
Contributor Author

lsjostro commented Dec 6, 2024

workaround until fixed.

---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyPatchPolicy
metadata:
  name: remove-outlier-detection
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: GatewayClass
    name: my-eg
  type: JSONPatch
  jsonPatches:
    - type: "type.googleapis.com/envoy.config.cluster.v3.Cluster"
      name: "httproute/some-namespace/backend-svc/rule/0"
      operation:
        op: remove
        path: "/outlier_detection"

@arkodg
Copy link
Contributor

arkodg commented Dec 6, 2024

thanks @lsjostro , I agree, disabled by default makes sense

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants