[v24.2.x] CORE-8082 cloud_io: add missing error handling to #24080
the below tests from https://buildkite.com/redpanda/redpanda/builds/57850#01930ca1-436d-4a31-8472-4ad835437a5a have failed and will be retried
the below tests from https://buildkite.com/redpanda/redpanda/builds/57893#01931a48-90e4-4dfd-b982-b89bfe05b33d have failed and will be retried
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57850#01930ce4-eb4b-4808-ad55-fd762312e833:
non flaky failures in https://buildkite.com/redpanda/redpanda/builds/57893#01931a8b-4f31-46c7-aa82-ff40b8069da4:
Retry command for Build#57850: please wait until all jobs are finished before running the slash command
This is to allow passing in a `retry_chain_logger` which does not inherit from `ss::logger` but wraps it. (cherry picked from commit f388831)
The call to `drain_response_stream` may throw various transport related errors (see one example below of a Broken Pipe error observed in CI). These errors should be handled inside the `remote::download_object` method because the caller's expectation is that download-related errors are communicated via the `download_result` return type rather than through an exception. Some of these errors (like the broken pipe error below) could also be retried, whereas with the previous implementation they were not retried.

These exceptions are often ignored by the caller and may be printed as "Exceptional future ignored" log lines, which cause CI failures and are less useful for debugging.

The below is an example of one such ignored exceptional future in the remote partition finalizing background fibre:

```
INFO 2024-10-29 12:41:17,708 [shard 1:main] cloud_storage - [fiber474 kafka/fuzzy-operator-6356-dzxvff/4] - remote_partition.cc:1406 - Finalizing remote storage state...
DEBUG 2024-10-29 12:41:17,723 [shard 1:main] cloud_io - [fiber819~0|1|19984ms] - remote.cc:430 - Receive OK response from "37836c6f-30b0-482f-bb4e-0f3dffdb5cbe/meta/kafka/fuzzy-operator-6356-dzxvff/1_3447/manifest.bin"
WARN 2024-10-29 12:41:17,723 [shard 1:main] http - /37836c6f-30b0-482f-bb4e-0f3dffdb5cbe/meta/kafka/fuzzy-operator-6356-dzxvff/1_3447/manifest.bin - client.cc:414 - receive error std::__1::system_error (error generic:32, System error during SSL read: [error:FFFFFFFF80000020:system library::Broken pipe]: Broken pipe)
WARN 2024-10-29 12:41:17,723 [shard 1:main] seastar - Exceptional future ignored: std::__1::system_error (error generic:32, System error during SSL read: [error:FFFFFFFF80000020:system library::Broken pipe]: Broken pipe), backtrace: 0xa73be23 0xa392e05 0x360a6b8 0x9352157 0x360a71a 0xa48cc6f 0xa49045c 0xa4e77ca 0xa402f3f /opt/redpanda/lib/libc.so.6+0x961b6 /opt/redpanda/lib/libc.so.6+0x11839b
```

(cherry picked from commit ad14537)
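For readers unfamiliar with the pattern, here is a minimal, synchronous sketch of the idea described above: transport exceptions thrown while draining the response are caught inside the download routine and mapped to a result value, and retryable ones (such as a broken pipe) trigger another attempt. This is not the actual `remote::download_object` code. The enum values, the `is_retryable` helper, the `drain` callback, and the `max_attempts` parameter are all hypothetical names chosen for illustration, and the real implementation is asynchronous (Seastar futures) rather than a plain loop.

```cpp
#include <cassert>
#include <cerrno>
#include <exception>
#include <functional>
#include <system_error>

// Hypothetical stand-in for the result type; not the real Redpanda enum.
enum class download_result { success, timedout, failed };

// Assumption for this sketch: broken pipe and connection reset are retryable.
inline bool is_retryable(const std::system_error& e) {
    return e.code() == std::errc::broken_pipe
        || e.code() == std::errc::connection_reset;
}

// `drain` stands in for the step that reads the response body and may throw.
// Every exception is converted into a download_result, so nothing escapes
// to the caller as an unhandled (ignored) exception.
inline download_result
download_object(const std::function<void()>& drain, int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        try {
            drain();
            return download_result::success;
        } catch (const std::system_error& e) {
            if (!is_retryable(e)) {
                return download_result::failed;
            }
            // Retryable transport error: fall through to the next attempt.
        } catch (const std::exception&) {
            return download_result::failed;
        }
    }
    // Retry budget exhausted while only seeing retryable errors.
    return download_result::timedout;
}
```

With this shape, a caller only has to inspect the returned value; a transient `EPIPE` during the read is retried instead of surfacing as an ignored exceptional future.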
Force-pushed from 52f30a9 to 453ea58
force-push: no-op; rebased onto the latest v24.2.x and dropped the merge commit (52f30a9) from the branch
Retry command for Build#57893: please wait until all jobs are finished before running the slash command
Backport of PR #24059
Fixes #24076
Cherry pick conflicts:
`remote.cc` has moved to a different path