Self-hosted runner on EC2 stalls at some point. #1463

@csenk

Describe the bug
For a reason unknown to me, our EC2-based self-hosted runners have stopped working. The boot process seems to stop at some point without any clear indication of what's wrong.

Runner Version and Platform

Version of your runner?
2.284.0

OS of the machine running the runner? OSX/Windows/Linux/...
Linux 4.14.243-185.433.amzn2.x86_64 #1 SMP Mon Aug 9 05:55:52 UTC 2021.

What's not working?

These are the last few log lines from the affected EC2 instance:

[...]
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper] Starting process:
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   File name: '/usr/bin/chmod'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Arguments: '755 "/home/ec2-user/actions-runner/svc.sh"'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Working directory: '/home/ec2-user/actions-runner'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Require exit code zero: 'True'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Encoding web name:  ; code page: ''
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Force kill process on cancellation: 'False'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Redirected STDIN: 'False'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Persist current code page: 'False'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   Keep redirected STDIN open: 'False'
[2021-11-04 13:37:26Z INFO ProcessInvokerWrapper]   High priority process: 'False'
[2021-11-04 13:37:27Z INFO ProcessInvokerWrapper] Updated oom_score_adj to 500 for PID: 8190.
[2021-11-04 13:37:27Z INFO ProcessInvokerWrapper] Process started with process id 8190, waiting for process exit.
[2021-11-04 13:37:27Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.
[2021-11-04 13:37:27Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.
[2021-11-04 13:37:27Z INFO ProcessInvokerWrapper] Finished process 8190 with exit code 0, and elapsed time 00:00:00.0010024.
[2021-11-04 13:37:27Z INFO Listener] Runner execution has finished with return code 0

However, in one of yesterday's runs everything looked fine, and the runner continued past this same point:

[...]
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper] Starting process:
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   File name: '/usr/bin/chmod'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Arguments: '755 "/home/ec2-user/actions-runner/svc.sh"'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Working directory: '/home/ec2-user/actions-runner'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Require exit code zero: 'True'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Encoding web name:  ; code page: ''
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Force kill process on cancellation: 'False'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Redirected STDIN: 'False'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Persist current code page: 'False'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   Keep redirected STDIN open: 'False'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper]   High priority process: 'False'
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper] Updated oom_score_adj to 500 for PID: 8191.
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper] Process started with process id 8191, waiting for process exit.
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper] STDOUT/STDERR stream read finished.
[2021-11-03 15:46:07Z INFO ProcessInvokerWrapper] Finished process 8191 with exit code 0, and elapsed time 00:00:00.0007144.
[2021-11-03 15:46:07Z INFO Listener] Runner execution has finished with return code 0
[2021-11-03 15:46:39Z INFO HostContext] No proxy settings were found based on environmental variables (http_proxy/https_proxy/HTTP_PROXY/HTTPS_PROXY)
[2021-11-03 15:46:39Z INFO HostContext] Well known directory 'Bin': '/home/ec2-user/actions-runner/bin'
[2021-11-03 15:46:39Z INFO HostContext] Well known directory 'Root': '/home/ec2-user/actions-runner'
[2021-11-03 15:46:39Z INFO HostContext] Well known config file 'Credentials': '/home/ec2-user/actions-runner/.credentials'
[2021-11-03 15:46:39Z INFO Listener] Runner is built for Linux (X64) - linux-x64.
[2021-11-03 15:46:39Z INFO Listener] RuntimeInformation: Linux 4.14.243-185.433.amzn2.x86_64 #1 SMP Mon Aug 9 05:55:52 UTC 2021.
[2021-11-03 15:46:39Z INFO Listener] Version: 2.284.0
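The difference between the two logs above (the stalled one ends at "Runner execution has finished with return code 0", the healthy one keeps going) can be checked mechanically across instances. Below is a minimal sketch in Python, not part of the runner or of the Terraform module. The function name `stalls_after_finish` and the idea of treating that log line as a stall marker are assumptions made for illustration only.

```python
import re

# Marker copied from the logs above: the last line written by the stalled run.
FINISH_MARKER = "Runner execution has finished with return code 0"

# Matches timestamped runner log lines such as
# "[2021-11-04 13:37:27Z INFO Listener] ...". Placeholder lines like "[...]"
# and blank lines do not match and are ignored.
LOG_LINE = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}Z \w+ \w+\] ")


def stalls_after_finish(log_text: str) -> bool:
    """Return True if no timestamped log line follows the last FINISH_MARKER line."""
    lines = [ln for ln in log_text.splitlines() if LOG_LINE.match(ln)]
    marker_positions = [i for i, ln in enumerate(lines) if FINISH_MARKER in ln]
    if not marker_positions:
        # The marker never appeared, so this is not the pattern described here.
        return False
    return marker_positions[-1] == len(lines) - 1
```

Given the text of a log file collected from an instance, `stalls_after_finish(text)` returns `True` for logs shaped like the first excerpt and `False` for logs shaped like the second. It only detects the pattern; it does not explain why nothing is logged afterwards.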

Do you have any ideas?

We are using https://github.com/philips-labs/terraform-aws-github-runner to provision the runners, in case that helps.

Hope you can help me, and thanks in advance :)

Labels: bug
