[content-init] workspace start is slow, waits between "awaiting seccomp fd" and "supervisor: workspace content available"
#13256
Comments
I think this is the same issue as #12345
Apparently, not only is it slow, but it can stall for up to an hour and then fail to boot. Perhaps a different cause, but the symptoms are similar. We can see
but before reaching
supervisor got SIGTERM
I'm not sure who sent SIGTERM to the supervisor 🤔
However, the fact that there is a log from the supervisor means that ring2 must have succeeded in starting the supervisor itself.
@kylos101 Should we mark the issue
👋 @utam0k generally, blocked is used when an issue is actively being worked on (in progress) but we cannot proceed. In this case, I think leaving it in the inbox is okay (because we'll see recent comments to get context).
@utam0k I defer to you, you are the expert. 😄 Worst case, we can always reopen. 😸
Bug description
👋 sometimes workspace starts can take many minutes, and the delay (in some cases) is in between here and here.
Here is a link for the entire set of logs. You'll see the delay the customer experienced, which was approximately 40 minutes (confirmed in webapp logs, where we see the browser connection here).
Definition of done
Add logging (workspacekit, ws-daemon, maybe runInitializer), so we can understand why we might be stalling for so long. The logging should likely use exponential backoff, so that if it's running repeatedly for 40 minutes, it isn't too verbose.
Steps to reproduce
Unclear
Workspace affected
instance id 5740c71b-c6ea-4735-ae15-93daacc7982e
Expected behavior
The delay should be way less than ~40 minutes.
Example repository
n/a
Anything else?
Run this query to view all instances for this workspace. Some have a null minutes-to-start value, others take a really long time, and the bulk are short (under 2 minutes).
Node performance at the time of the event seemed okay.
Front conversations