This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

enable the supervisor to handle longer service outages #1098


bmc-msft
Contributor

The current API retry code at the agent level fails if the service is down for more than 25 seconds. During longer outages, such as those seen in #1095, it would be useful for the supervisor to be more resilient while running existing tasks.

This PR makes the supervisor's poll-command and claim-command interactions with the service fallible at the transport layer (HTTPS), but not at the content layer (JSON).
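As a rough illustration of the transport/content split described above, here is a minimal sketch. It is not the code in `coordinator.rs`: the names `PollError`, `PollOutcome`, and `handle_poll` are hypothetical, and the sketch assumes one reading of the description, namely that a transport failure is tolerated and retried on a later poll, while a malformed response body is still reported as an error.

```rust
/// Hypothetical error type separating the two failure layers.
#[derive(Debug, PartialEq)]
enum PollError {
    /// No usable HTTP response was received (service down, timeout, etc.).
    Transport(String),
    /// A response arrived, but its body could not be decoded as expected.
    Content(String),
}

/// Hypothetical outcome of a single poll attempt, as seen by the caller.
#[derive(Debug, PartialEq)]
enum PollOutcome {
    /// The service returned a command to run.
    Command(String),
    /// The service responded, but had no pending command.
    NoCommand,
    /// The service could not be reached; the caller may poll again later.
    ServiceUnavailable,
}

/// Maps the raw result of a poll request onto an outcome.
/// Transport errors are absorbed so the caller can keep running tasks;
/// content errors are passed through unchanged.
fn handle_poll(raw: Result<Option<String>, PollError>) -> Result<PollOutcome, PollError> {
    match raw {
        Ok(Some(cmd)) => Ok(PollOutcome::Command(cmd)),
        Ok(None) => Ok(PollOutcome::NoCommand),
        // Assumption: an outage is not fatal, so it becomes a normal outcome.
        Err(PollError::Transport(_)) => Ok(PollOutcome::ServiceUnavailable),
        // Assumption: an undecodable body indicates a real bug, so it stays an error.
        Err(e @ PollError::Content(_)) => Err(e),
    }
}
```

With this shape, a caller that loops over `handle_poll` only stops on a content error. An outage of any length just produces repeated `ServiceUnavailable` outcomes.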

src/agent/onefuzz-supervisor/src/coordinator.rs: 4 review threads (outdated, resolved)
@bmc-msft bmc-msft requested a review from ranweiler July 21, 2021 19:44
src/agent/onefuzz-supervisor/src/coordinator.rs: 3 review threads (outdated, resolved)
demoray added 3 commits July 21, 2021 16:33
…g-tasks' of github.com:bmc-msft/onefuzz into enable-agent-to-handle-long-service-outage-while-running-tasks
@bmc-msft bmc-msft merged commit d50c5e0 into microsoft:main Jul 21, 2021
@bmc-msft bmc-msft deleted the enable-agent-to-handle-long-service-outage-while-running-tasks branch July 21, 2021 21:03
@ghost ghost locked as resolved and limited conversation to collaborators Aug 21, 2021
4 participants