Checkout bricks a self-hosted runner and cannot recover #1148
Comments
Depending on how this is addressed, it could also fix other issues, e.g. #933, since that submodule corruption issue is also fixed by just deleting the repo and allowing the runner to do a fresh clone (#988 (comment)). For example, as a broad workaround, checkout could give up on reusing the existing git repository if any command throws a fault, then delete the repository and check it out from scratch.
Some time ago a fix for this was introduced in #964, but it doesn't seem to solve the issue. I might be wrong.
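A rough sketch of that "delete and clone from scratch" fallback, written as a standalone shell helper rather than as a change to the action itself. The function names `repo_is_healthy` and `checkout_with_fallback` are hypothetical, and the health probe simply reuses the `git config --local --get remote.origin.url` call from the issue description; it assumes the workspace folder does not sit inside some other git repository.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: succeeds only if $1 exists and git can read its
# local config, which is the same call that fails in the issue report.
repo_is_healthy() {
  local dir="$1"
  [ -d "$dir" ] && git -C "$dir" config --local --get remote.origin.url >/dev/null 2>&1
}

# Hypothetical helper: reuse the checkout in $2 when it looks healthy,
# otherwise delete the folder and clone $1 from scratch (one attempt).
checkout_with_fallback() {
  local url="$1" dir="$2"
  if repo_is_healthy "$dir"; then
    git -C "$dir" fetch --quiet origin
    return
  fi
  echo "checkout at $dir is unusable; deleting it and cloning from scratch" >&2
  rm -rf -- "$dir"
  git clone --quiet "$url" "$dir"
}

# Usage (placeholder values):
#   checkout_with_fallback "https://example.com/org/repo.git" "$HOME/work/repo"
```

The fallback runs at most once, so a folder that cannot be deleted still surfaces as a failure instead of looping.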
We are using checkout v3 and this still seems to be an issue.
Hi, also running into this issue.
Does anyone have a workaround for this?
Hi, how do I fix this on the runners? Please send me your text.
I have been using the following workaround while waiting for the fix:
This at least prevents the runner from being bricked if
Good workaround, thanks!
I just wanted to add that I ran into this one today:
It then went ahead and gobbled up all the remaining jobs in the entire queue and failed them all with the same error. Edit: Seems like the above is an unrelated issue to what is mentioned in the first post, this time there was some random
Happened again in a big way today :(
@Ajaydip I tried your workaround and it didn't work for me, it always skips the action?
I did modify it a little bit... I was hoping to be able to recover and run the rest of the pipeline unaffected without having to put an Edit:
Any update on this? Does anyone have a working workaround?
@bryanjtc the workaround above works OK, just note my Edit about using ‘outcome’ not ‘conclusion’ for testing whether to retry.
Something went wrong, and all of our self-hosted runners checked out bad `.git` folders or somehow corrupted them. It happened on around 13 of our runners at the same time. I think it was a random occurrence, because I had to manually log in and delete the repository folder, and then it was fine.

Here are our logs:

In this case, checkout seems to be bailing fatally, i.e. after the error `fatal: --local can only be used inside a git repository`, the Actions run ends immediately with a fault and won't try to continue.

This effectively bricked the runner, because any job the bad runner picked up would fail instantly. Not only that, but the bad runner would take all the jobs in the queue and fail them almost instantly, which unfortunately messed up our job history quite a bit.
Since the resolution was simply to log in and delete the offending folder, it would be nice if checkout automatically deleted the folder and retried once.
It seems like it tried this:
I am not sure why that didn't work, since I was able to log in and just `rm` the folder fine as the same user. In any case, all 13 runners failed to delete the folder automatically.

To reproduce, I would suggest:

`git config --local --get remote.origin.url` fails
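One possible way to get a folder into that failing state for testing is sketched below. This is an assumption, not the confirmed cause: an empty `.git` directory stands in for whatever corruption actually happened on the runners, and the helper names `simulate_corrupt_checkout` and `probe_checkout` are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: create a throwaway folder whose .git directory is
# empty, so git no longer recognises the folder as a repository.
simulate_corrupt_checkout() {
  local dir
  dir="$(mktemp -d)"
  mkdir "$dir/.git"
  echo "$dir"
}

# Hypothetical helper: run the same config lookup that fails in the issue.
# GIT_CEILING_DIRECTORIES keeps git from discovering an enclosing
# repository above the test folder.
probe_checkout() {
  GIT_CEILING_DIRECTORIES="$(dirname "$1")" \
    git -C "$1" config --local --get remote.origin.url
}

# Usage:
#   dir="$(simulate_corrupt_checkout)"
#   probe_checkout "$dir" || echo "probe failed with exit code $?"
```

Pointing a job at a folder prepared this way should show whether checkout recovers or fails the run.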