Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Experimental] DeliverySpec Timeout #5148

Open
7 of 8 tasks
slinkydeveloper opened this issue Mar 25, 2021 · 17 comments · Fixed by #5149
Open
7 of 8 tasks

[Experimental] DeliverySpec Timeout #5148

slinkydeveloper opened this issue Mar 25, 2021 · 17 comments · Fixed by #5149
Assignees
Labels
area/delivery area/eventing The Eventing api group kind/feature-request priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. roadmap Issues for linking from the roadmap triage/accepted Issues which should be fixed (post-triage)
Milestone

Comments

@slinkydeveloper
Copy link
Contributor

slinkydeveloper commented Mar 25, 2021

Description
Sink services are usually very heterogeneous, each one with different response time characteristics, depending on their functionality, the way they're implemented, the guarantees the user wants. Because of that, timeouts of a request, when dispatching an event, may vary. DeliverySpec doesn't give the ability to the user to specify the timeout of the single request, so we usually just default to something like 10 seconds. This is not enough, and doesn't cover a lot of use cases where this parameter needs to be tuned.

This experimental feature proposed to add a new field to the DeliverySpec to define such timeout for each dispatched request.

Exit Criteria
DeliverySpec allows to configure the timeout of the single request.

Experimental flag name: delivery-timeout

Experimental feature stages plan

Below the proposed plan for the feature stages (this list implicitly includes the requirements defined in the process)

Affected WG

  • Event Delivery WG

Prior discussion

@slinkydeveloper
Copy link
Contributor Author

/assign

@matzew
Copy link
Member

matzew commented Mar 25, 2021

I think it is very reasonable to have some timeout notion configurable

/cc @lionelvillard

@slinkydeveloper
Copy link
Contributor Author

slinkydeveloper commented Mar 25, 2021

Spec PR: knative/specs#15
API PR: #5149

After these get accepted, I'll proceed with implementing it in broker and channel.

@lberk lberk added area/delivery area/eventing The Eventing api group priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Mar 29, 2021
@lberk lberk added this to the Backlog milestone Mar 29, 2021
@n3wscott
Copy link
Contributor

I believe this should be pushed to after we become v1.0 ready. It changes the spec of so many resources.

@slinkydeveloper
Copy link
Contributor Author

slinkydeveloper commented Mar 30, 2021

As explained here knative/specs#15 (comment), I think this is an important feature and we should do it before v1, and it's not a drastic change. It's trivial to implement for most components out there (at least the ones i know).

@n3wscott
Copy link
Contributor

n3wscott commented Apr 6, 2021

And as I said here: it has long lasting effects: knative/specs#15 (comment)

If this value changes in the core, then every CRD needs to be updated or they can not round trip with the timeout.

@slinkydeveloper
Copy link
Contributor Author

This is not a problem, we can deal with it in the weeks prior to the next release, we've plenty of time for it.

We're also going to remove v1beta1 APIs #5201, so it's even less a problem.

@n3wscott
Copy link
Contributor

n3wscott commented Apr 6, 2021

This is a very very real problem if v1 continues to be a sliding scale for anyone who has developed any component that exists outside the knative repos. We need to put a stop to the feature creep until we can cut a v1/GA version of Eventing.

@slinkydeveloper slinkydeveloper changed the title DeliverySpec Timeout [Experimental] DeliverySpec Timeout Apr 14, 2021
@slinkydeveloper
Copy link
Contributor Author

Refactored this issue as experimental feature, per proposal https://docs.google.com/document/d/1AoH0FyLeuHIg5snrlCKzELQv-NEA6bbjTU2Qv3zlW5k/edit?usp=sharing

@Shashankft9
Copy link
Member

Hey, I have recently been looking at the option of configuring timeouts based on the workload to which the cloudevent is going. I have a knative service which takes a bit of time to do some processing to give back the response (basically file conversion use case). Currently I am just directly using the knative service URL to do the work - where we can set the maxTimeout period (in serving), but I am evaluating how it will fit in the eventing part.
I believe without a configurable timeout option in channels, triggers, sources etc, the request will never actually happen successfully, so a feature like this would be really helpful!

@slinkydeveloper
Copy link
Contributor Author

Inclusion of this feature is approved! https://groups.google.com/g/knative-dev/c/7h1p0eBmUuo/m/lTzaSULKAQAJ

@devguyio devguyio moved this to In Progress in Eventing WG Roadmap Dec 1, 2021
@devguyio devguyio added the roadmap Issues for linking from the roadmap label Dec 2, 2021
@github-actions
Copy link

github-actions bot commented Mar 3, 2022

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 3, 2022
@pierDipi
Copy link
Member

pierDipi commented Mar 3, 2022

/remove-lifecycle stale

@knative-prow-robot knative-prow-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 3, 2022
@devguyio devguyio moved this from In Progress to Experimental Features in Eventing WG Roadmap Mar 3, 2022
@github-actions
Copy link

github-actions bot commented Jun 2, 2022

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 2, 2022
@lionelvillard lionelvillard added the triage/accepted Issues which should be fixed (post-triage) label Jun 2, 2022
@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 3, 2022
@pierDipi
Copy link
Member

pierDipi commented Aug 3, 2022

/assign

@pierDipi
Copy link
Member

pierDipi commented Aug 4, 2022

Feature promoted to beta in 1.7, so enabled by default.

@pierDipi pierDipi moved this from Alpha Features to Beta features in Eventing WG Roadmap Aug 10, 2022
@pierDipi pierDipi moved this from Beta features to In Progress in Eventing WG Roadmap Aug 17, 2022
@pierDipi pierDipi moved this from In Progress to Ready To Work in Eventing WG Roadmap Jun 14, 2023
@pierDipi pierDipi moved this from Ready To Work to In Progress in Eventing WG Roadmap Jun 14, 2023
@pfilaretov42
Copy link

pfilaretov42 commented Nov 6, 2024

Hi, I'm trying to set delivery timeout for kafka source and it seems it does not work.

Here is the kafka source with the delivery timeout set to 30s:

apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: source-knative-demo-kafka
  namespace: default
spec:
  consumerGroup: knative-group
  bootstrapServers:
    - my-cluster-kafka-bootstrap.kafka:9092
  topics:
    - knative-demo-kafka-source-topic
  sink:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: knative-demo
    uri: /events/long
  delivery:
    backoffDelay: PT3S
    backoffPolicy: linear
    retry: 2
    timeout: PT30S

And here are logs from kafka-source-dispatcher pod:

{"@timestamp":"2024-11-06T07:38:12.324Z","@version":"1","message":"Received a failure status code that is not 2xx. failed to send event to subscriber ...", ..., "statusCode":504}
{"@timestamp":"2024-11-06T07:38:30.328Z","@version":"1","message":"Received a failure status code that is not 2xx. failed to send event to subscriber ...",..., "statusCode":504}
{"@timestamp":"2024-11-06T07:38:51.332Z","@version":"1","message":"Received a failure status code that is not 2xx. failed to send event to subscriber ...",..., "statusCode":504}
{"@timestamp":"2024-11-06T07:38:51.332Z","@version":"1","message":"Failed to send event to subscriber ...", ...}
{"@timestamp":"2024-11-06T07:38:51.333Z","@version":"1","message":"Failed to send event to dead letter sink ...", ...}

As you can see the time between "Received a failure status code" messages is less than 30s.

Could you advice on how can I set the timeout for kafka source, please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/delivery area/eventing The Eventing api group kind/feature-request priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. roadmap Issues for linking from the roadmap triage/accepted Issues which should be fixed (post-triage)
Projects
Status: In Progress
Development

Successfully merging a pull request may close this issue.

10 participants