Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graduate GEP-2257 and GEP-1742 to standard #3210

Merged
merged 5 commits into from
Aug 9, 2024

Conversation

kflynn
Copy link
Contributor

@kflynn kflynn commented Jul 22, 2024

What type of PR is this?
/kind gep

REVIEWERS: this PR has been rebased to be best reviewed commit by commit:

  • 905d806: Actually move from experimental to standard
  • 767798b: Fix a couple of old typos in GEP-1742
  • 30d9acb: Clean up language in GEP-2257 around repeated units
  • dcc2438: Mention GEP-2257 in the CRD docstrings (I ran into this today for unrelated reasons) and clarify that if the backend request timeout isn't specified, behavior is implementation-specific.
  • fc64635: Run make generate

What this PR does / why we need it:
This PR moves GEPs 2257 and 1742 to Standard rather than Experimental. Conformant implementations of HTTPRouteRequestTimeout and HTTPRouteBackendTimeout as of Gateway API v1.0.0:

  • Cilium v1.15.0-pre.3
  • Contour v1.29.0
  • Envoy Gateway v0.6.0
  • Istio 1.2

(All of these implementations support both timeout fields. I'm referencing 1.0.0 because we don't have many 1.1.0 conformance reports yet. 😐)

Does this PR introduce a user-facing change?:

GEP-2257 Durations and HTTPRoute Timeouts are now standard.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/gep PRs related to Gateway Enhancement Proposal(GEP) cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 22, 2024
@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 22, 2024
@mikemorris
Copy link
Contributor

mikemorris commented Jul 23, 2024

I think it would be good to at least enumerate three conformant implementations in this PR before merging. From https://gateway-api.sigs.k8s.io/implementations/v1.1/ and https://gateway-api.sigs.k8s.io/implementations/v1.0/ it appears Istio, Cilium and Contour support both HTTPRouteRequestTimeout and HTTPRouteBackendTimeout, and Kong's kubernetes-ingress-controller supports HTTPRouteBackendTimeout - are we missing others who have implemented this so far?

@youngnick
Copy link
Contributor

Yes, this LGTM once we have the relevant implementations called out, with one caveat - the YAML files also need to have the status updated.

such parsers to `kube-rs`, `client-go`, and `client-python`, but it is not
necessary to gate the graduation of GEP-2257 on this work.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's not necessary to complete these parsers before the GEP freeze in 1 week, but it is necessary to complete these parsers before the end of the API refinement phase (likely +4 weeks per https://gateway-api.sigs.k8s.io/contributing/release-cycle/#3-api-refinement-and-documentation). We should definitely clean up this guidance before our next release.

Interested in what others think here though:
/cc @youngnick @mlavacca

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good from here, too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree as well

@kflynn
Copy link
Contributor Author

kflynn commented Jul 23, 2024

@mikemorris writes:

I think it would be good to at least enumerate three conformant implementations in this PR before merging. From https://gateway-api.sigs.k8s.io/implementations/v1.1/ and https://gateway-api.sigs.k8s.io/implementations/v1.0/ it appears Istio, Cilium and Contour support both HTTPRouteRequestTimeout and HTTPRouteBackendTimeout, and Kong's kubernetes-ingress-controller supports HTTPRouteBackendTimeout - are we missing others who have implemented this so far?

Updated the PR description. Envoy Gateway also implements both timeouts. Kong is the only one that claimed conformance with only one of the timeouts, and actually they don't claim to be conformant for v1.1.0 -- I'm curious about that, so I'm poking around a bit. 🙂

FTR I definitely like the idea that we should mention the conformant implementations for these kinds of GEP changes!

@kflynn
Copy link
Contributor Author

kflynn commented Jul 25, 2024

Yes, this LGTM once we have the relevant implementations called out, with one caveat - the YAML files also need to have the status updated.

🤦‍♂️ Yes, fixed. 🤦‍♂️

@mikemorris
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 30, 2024
@youngnick
Copy link
Contributor

I can't see the YAML file changes, did you miss pushing? Once those are in, this LGTM.

@kflynn
Copy link
Contributor Author

kflynn commented Jul 31, 2024

@youngnick writes:

I can't see the YAML file changes, did you miss pushing? Once those are in, this LGTM.

🤔 Do you see three commits in the PR? The last one (d868630) has the YAML changes.

(Not gonna lie, I've forgotten a push many a time. 🤦‍♂️ But in this specific case it looks OK from my end?)

@youngnick
Copy link
Contributor

Ah, I wasn't clear which YAML files I meant, sorry (now there's a perennial Kubernetes sentence for you!). There's also the GEP metadata YAML files stored in the geps/ directories that need updating.

kflynn added 5 commits July 31, 2024 22:01
Signed-off-by: Flynn <emissary@flynn.kodachi.com>
Signed-off-by: Flynn <emissary@flynn.kodachi.com>
… 2257 is experimental.

Signed-off-by: Flynn <emissary@flynn.kodachi.com>
…issing backendRequest timeout is implementation-specific.

Signed-off-by: Flynn <emissary@flynn.kodachi.com>
Signed-off-by: Flynn <emissary@flynn.kodachi.com>
@kflynn kflynn force-pushed the flynn/standard-timeouts branch from d868630 to fc64635 Compare August 1, 2024 02:03
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 1, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kflynn, mlavacca

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 1, 2024
@kflynn kflynn requested review from mlavacca and robscott August 1, 2024 02:04
@kflynn
Copy link
Contributor Author

kflynn commented Aug 1, 2024

@youngnick writes:

Ah, I wasn't clear which YAML files I meant, sorry (now there's a perennial Kubernetes sentence for you!). There's also the GEP metadata YAML files stored in the geps/ directories that need updating.

🤦‍♂️ Thanks for the catch, perennial confusion or no. This should be updated.

I've rebased this onto main, cleaned up the git history, and re-requested reviews. This is now best reviewed commit by commit; see the opening comment on the PR for details.

@mikemorris
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 2, 2024
@mpstefan
Copy link

mpstefan commented Aug 5, 2024

I know that this one is ready to go, but I wanted to mention again that nginx is unable to add this kind of timeout by default, so keeping this as extended would be much preferred.

For anyone curious, the nginx timeout directives are keeping track of timeouts not by HTTP request, but by the entire upstream - likely because it works with many different protocols and network layers. Which means if we set a "timeout" of 10s, and one request hangs, but the others continue and return from the upstream as normal, the hanging request would hang past 10 seconds because the upstream responded in time. We'd need to design our own module to keep track of a per http request timeout.

@kflynn
Copy link
Contributor Author

kflynn commented Aug 5, 2024

I know that this one is ready to go, but I wanted to mention again that nginx is unable to add this kind of timeout by default, so keeping this as extended would be much preferred.

Right -- we are indeed keeping it extended in this PR.

That said, this bit:

...if we set a "timeout" of 10s, and one request hangs, but the others continue and return from the upstream as normal...

is fascinating to me; I wonder if we need to update the diagram for NGINX. Are you talking about a single request from a client spawning multiple requests to upstream workloads? or multiple client requests going to the same upstream? or what, exactly? 🙂 (Please feel free to toss me a link to RTFM if that's warranted! 🙂)

@youngnick
Copy link
Contributor

Looks like all the feedback has been addressed, so I'm going to merge this one in - but I feel it's worth just clarifying for @mpstefan and anyone else who sees this issue what's happening in this one.

This one moves the GEPs for Durations and Timeouts to Standard , which is a measure of API stability. This means that we are guaranteeing that there will be no backwards-incompatible changes to these APIs forever (basically).

The Timeouts feature is still Extended though, which means that it's optional, but conformance tested. I understand that Nginx can't support this mode, and that's why it's staying as Extended for now (and there's a reasonable chance it will forever as well).

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 9, 2024
@k8s-ci-robot k8s-ci-robot merged commit a311340 into kubernetes-sigs:main Aug 9, 2024
8 checks passed
@kflynn kflynn deleted the flynn/standard-timeouts branch October 25, 2024 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/gep PRs related to Gateway Enhancement Proposal(GEP) lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants