Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify that Windows runner service has started successfully #236

Merged
merged 1 commit into from
Dec 19, 2019

Conversation

juliobbv
Copy link
Contributor

Before this change, the runner was sending the command to start the runner, but it never verified that the service had actually started successfully. Now it waits for a successful Running status.

Fixes #194

try
{
_term.WriteLine("Waiting for service to start...");
service.WaitForStatus(ServiceControllerStatus.Running, TimeSpan.FromSeconds(60));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

60 seconds seems super long to start a service. Is there any reason it would take more then 15?

Copy link
Contributor Author

@juliobbv juliobbv Dec 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can afford the timeout to be intentionally conservative. On my machine, the service usually takes ~1s/2s to run, but then I'm using a machine with an NVMe SSD with tons of RAM, but a machine with an HDD that's happening to be in the process of paging during that precise config moment may take at least an order of magnitude longer, and create a false positive that could result in an issue getting filed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 minutes used to be the default timeout for stopping windows services (maybe starting too?)

with a sluggish machine on startup things can be slow

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we've seen some surprisingly slow cloud machines cause timeouts in unit tests just launching simple processes. I've had to bump test timeouts from 10s to 20s before, and that's for a machine that has already finished booting up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, Eric's experience with process launching on slow machines has been my experience as well. There are some environments that are just really slow, and we can't control them.

I'll keep the timeout as it currently is, but we can change it later if we desire so.

Copy link
Collaborator

@thboop thboop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM minor thoughts

@juliobbv juliobbv merged commit 9e37732 into master Dec 19, 2019
AdamOlech pushed a commit to antmicro/runner that referenced this pull request Jan 28, 2021
@TingluoHuang TingluoHuang deleted the users/juliobbv/checkWinSvcStarts branch September 1, 2023 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Runner config says that agent service has been started successfully, but it's not the case
3 participants