Skip to content

Fix request hanging after response start timeout expires#15899

Merged
knative-prow[bot] merged 1 commit intoknative:mainfrom
Fedosin:responseStartTimeoutDrained
Jun 5, 2025
Merged

Fix request hanging after response start timeout expires#15899
knative-prow[bot] merged 1 commit intoknative:mainfrom
Fedosin:responseStartTimeoutDrained

Conversation

@Fedosin
Copy link
Contributor

@Fedosin Fedosin commented May 26, 2025

Fixes #15352

Proposed Changes

When a response starts before the responseStartTimeout but the timeout still fires, the timeout handler would not properly continue processing the request. This caused requests to hang indefinitely if they started responding just before the responseStartTimeout expired.

The issue occurred because after handling the responseStartTimeout case, the select loop would continue but without properly waiting for the handler to complete. Setting responseStartTimeoutDrained ensures the timer is properly cleaned up and the loop continues to process other events (completion, overall timeout, or idle timeout).

Fixes requests that start responding before responseStartTimeout but take longer than responseStartTimeout to complete.

Release Note

NONE

When a response starts before the responseStartTimeout but the timeout
still fires, the timeout handler would not properly continue processing
the request. This caused requests to hang indefinitely if they started
responding just before the responseStartTimeout expired.

The issue occurred because after handling the responseStartTimeout case,
the select loop would continue but without properly waiting for the
handler to complete. Setting responseStartTimeoutDrained ensures the
timer is properly cleaned up and the loop continues to process other
events (completion, overall timeout, or idle timeout).

Fixes requests that start responding before responseStartTimeout but
take longer than responseStartTimeout to complete.
@knative-prow knative-prow bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 26, 2025
@knative-prow knative-prow bot requested review from gauron99 and skonto May 26, 2025 13:39
@codecov
Copy link

codecov bot commented May 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.97%. Comparing base (c36383e) to head (9ba1c61).
Report is 11 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15899      +/-   ##
==========================================
+ Coverage   80.94%   80.97%   +0.03%     
==========================================
  Files         210      210              
  Lines       16769    16771       +2     
==========================================
+ Hits        13573    13580       +7     
+ Misses       2844     2837       -7     
- Partials      352      354       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@dsimansk
Copy link
Contributor

/lgtm

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label May 26, 2025
Copy link
Member

@matzew matzew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@dsimansk
Copy link
Contributor

dsimansk commented Jun 5, 2025

/approve

@knative-prow
Copy link

knative-prow bot commented Jun 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dsimansk, Fedosin, matzew

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 5, 2025
@dsimansk
Copy link
Contributor

dsimansk commented Jun 5, 2025

/test unit-tests

@dsimansk
Copy link
Contributor

dsimansk commented Jun 5, 2025

https://prow.knative.dev/view/gs/knative-prow/pr-logs/pull/knative_serving/15899/unit-tests_serving_main/1930549514918170624

Captuing the failed unit tests run. It seems related as it's TimeoutTest failure. It might be intermittent though, let see in another run.

Or in worse case it's clashing with the other change: https://github.com/knative/serving/pull/15900/files

@knative-prow knative-prow bot merged commit cf4b1ae into knative:main Jun 5, 2025
68 checks passed
@dprotaso
Copy link
Member

dprotaso commented Jun 5, 2025

/lgtm

norman465 pushed a commit to src2img/knative-serving that referenced this pull request Jun 17, 2025
When a response starts before the responseStartTimeout but the timeout
still fires, the timeout handler would not properly continue processing
the request. This caused requests to hang indefinitely if they started
responding just before the responseStartTimeout expired.

The issue occurred because after handling the responseStartTimeout case,
the select loop would continue but without properly waiting for the
handler to complete. Setting responseStartTimeoutDrained ensures the
timer is properly cleaned up and the loop continues to process other
events (completion, overall timeout, or idle timeout).

Fixes requests that start responding before responseStartTimeout but
take longer than responseStartTimeout to complete.
SaschaSchwarze0 pushed a commit to src2img/knative-serving that referenced this pull request Jun 17, 2025
* Fix flakes in TestIdleTimeoutHandler (knative#15918)

* Correct tryTimeoutAndWriteError to write timeout regardless of prior writes (knative#15900)

Previously, the function comment suggested it would only write errors if nothing
had been written, but the implementation correctly only checks the timedOut flag.
This allows timeout errors to be written even after a response has started,
which is the desired behavior for handling slow responses.

- Fixed misleading function comment
- Updated test to match actual behavior
- Added comprehensive test coverage

* Fix request hanging after response start timeout expires (knative#15899)

When a response starts before the responseStartTimeout but the timeout
still fires, the timeout handler would not properly continue processing
the request. This caused requests to hang indefinitely if they started
responding just before the responseStartTimeout expired.

The issue occurred because after handling the responseStartTimeout case,
the select loop would continue but without properly waiting for the
handler to complete. Setting responseStartTimeoutDrained ensures the
timer is properly cleaned up and the loop continues to process other
events (completion, overall timeout, or idle timeout).

Fixes requests that start responding before responseStartTimeout but
take longer than responseStartTimeout to complete.

---------

Co-authored-by: Mike Fedosin <mfedosin@redhat.com>
isibeni pushed a commit to src2img/knative-serving that referenced this pull request Jun 23, 2025
* Fix flakes in TestIdleTimeoutHandler (knative#15918)

* Correct tryTimeoutAndWriteError to write timeout regardless of prior writes (knative#15900)

Previously, the function comment suggested it would only write errors if nothing
had been written, but the implementation correctly only checks the timedOut flag.
This allows timeout errors to be written even after a response has started,
which is the desired behavior for handling slow responses.

- Fixed misleading function comment
- Updated test to match actual behavior
- Added comprehensive test coverage

* Fix request hanging after response start timeout expires (knative#15899)

When a response starts before the responseStartTimeout but the timeout
still fires, the timeout handler would not properly continue processing
the request. This caused requests to hang indefinitely if they started
responding just before the responseStartTimeout expired.

The issue occurred because after handling the responseStartTimeout case,
the select loop would continue but without properly waiting for the
handler to complete. Setting responseStartTimeoutDrained ensures the
timer is properly cleaned up and the loop continues to process other
events (completion, overall timeout, or idle timeout).

Fixes requests that start responding before responseStartTimeout but
take longer than responseStartTimeout to complete.

---------

Co-authored-by: Mike Fedosin <mfedosin@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Knative Timeout Issues with Long-Running Requests

4 participants