[ws-manager] stuck in stopping: "container with exit code 1" ("cannot connect to daemon") #5271

Closed
geropl opened this issue Aug 19, 2021 · 2 comments · Fixed by #5396
Labels
component: ws-manager groundwork: awaiting deployment priority: highest (user impact) Directly user impacting team: workspace Issue belongs to the Workspace team type: bug Something isn't working

Comments

@geropl
Member

geropl commented Aug 19, 2021

Bug description

This bug has very likely been hidden behind a mitigation for another problem for some time. We don't properly stop workspaces when supervisor "cannot connect to daemon" and the container exits with code 1: the instance stays stuck in the stopping phase.

Event trace: https://www.notion.so/gitpod/Task-Force-ws-daemon-restarts-b659846b2fad457eb55ec91c61d82eba#e516e9f48b584e2e8076eb3976b5f55b
Log query to identify cases: https://cloudlogging.app.goo.gl/ENj21hoLRyey9fYB7

DB query to (very likely) identify cases:

SELECT wsi.*
FROM d_b_workspace_instance AS wsi
JOIN d_b_workspace AS ws
    ON ws.id = wsi.workspaceId
WHERE wsi.phasePersisted = 'stopping'
    AND status->>'$.conditions.failed' = ''
    AND status->>'$.conditions.deployed' = 'true'
    AND status->>'$.conditions.pullingImages' = 'false'
ORDER BY wsi.creationTime DESC
LIMIT 500;

Resulting instanceIds (mostly, but not exclusively, prebuilds):

'f793f072-2803-44cd-8dda-d4cd00017040'
'9172cc9c-34d7-4eb9-9bd9-d9ebbcb98a74'
'b2dd3e17-4769-46e2-803c-c8fe42f00b65'
'b9dc16c6-9074-49a1-af1c-e836293082c7'
'dcd382a5-7461-4796-90ba-d24d5fb1ed86'
'b865cd98-3e66-4a9b-9891-bc1ff9eed892'
'cb9af383-5c72-4323-87bd-13fc5f3a14d2'
'0e509ee0-2519-459d-a703-351a3299524a'
'3e6903c1-804a-45d1-825a-2c3116911d91'
'b1cde0cd-f6f9-457e-a40d-84ebc388360d'
'a8461714-5668-4347-8c14-151e36dc1f5f'
'893e0021-9f1e-4ce1-889f-239f1e253a8a'
'8ef138a0-328a-4b74-aabf-8b1f57fc77af'
'220383e7-75e8-4b61-a427-1577069c721c'
'a24f17ff-e9c9-45d5-8e84-fb6e85471bde'
'85a3effc-14bd-4238-86d7-aa12b913d647'
'f9843d9a-03f1-4ebc-a987-efeb269ea526'
'a6d565ac-3a85-4ddc-8574-f0015db0126c'
'1cf02f55-4a72-4842-b546-b1e2c1f27a54'
'd34618ba-43fc-4aa5-9889-aaa7c579b024'
'af1976de-2ca0-474b-a9ee-eb229e7b07fd'
'a8f7597d-bc22-4e14-ab88-4380286c8c93'
'05683213-88bb-4351-b2ff-5432173b0b69'
'389a4a82-2719-4275-a3d9-6404449c4666'
'7c13d016-ef7f-407b-9308-25a89b875748'
'9b6751cc-1ad5-46b2-aeb9-01741ebb9688'
'62d1fe30-dadd-4403-ae3e-ef9111fbec83'
'2d53622b-b448-4939-ba5c-35442c8d76c0'
'a97fbef3-0f87-4c7f-8ebd-91f3323d1dfc'
'2fbfb0b9-2d97-4cae-ab6f-4cb9455245ec'
'48f13cc4-6311-48c2-950a-4b7ddedc3037'
'48f705e2-6a9a-4f86-89e1-688a34581e54'
'a9913e22-f573-4b3a-9051-290f6f8420d4'
'3f4495cb-ccba-497f-87cf-96e3822cfd7e'
'065bacf9-c1d0-4efc-a095-a242e1d76c3f'
'ddb6e58f-9056-4070-812d-099bc6a3ff46'
'fe2cdb60-3f9e-4226-8b50-d3cc2fdd143a'
'b6ee9c1f-9146-4348-b822-c07c8cc599d4'
'b00f2c83-afb4-4c7f-87f4-770a22b6bc53'
'44c3682a-af52-43ac-a3ee-183fc1ee7bbc'
'f351d38e-a04e-45c6-9479-dd5ce7636407'
'f3e50daf-06cd-4f29-9d3f-958e8b7aa3ee'
'4256e4cc-fa81-4f8e-b291-984b895876f7'
'becccbd5-6530-4e9f-b867-0902b31d82ad'
'34b829c1-661e-4b44-b995-5d7da583a3d5'
'4a8c77b3-6344-4b9e-aeba-e3d2c2cb2222'
'5108dea2-a528-4363-bd53-7a8a5addbd90'
'217a6724-7fb6-4a69-b385-557de20beb88'
'bc3db915-e7fd-4c8f-9d43-dde8211fd0d4'
'ed62b021-f7a1-44d3-a6a0-800d8e672f00'
'd4e5de19-294a-4afc-aac1-62590cd651f7'
'7e156655-b02c-42de-b491-b9eac42704fd'
'96776aa6-291d-4c2f-9a65-ce6358aec569'
'dcdca2db-981c-4bac-8c01-e85440837c1d'
'be0b7593-f3cd-4406-ad99-57fddd057139'
'417fb057-3e1e-4e51-ae6a-d9065c2df134'
'93501c01-b743-4ebf-915e-445926ec4cdd'
'593a15fd-3e67-4348-ae04-d1b41269f2e0'
'9193ddfe-9fbf-44b7-abbe-f131dc31456b'
'447a7785-1a6d-427b-ab6f-4195b9051f78'
'6776821e-f03b-4ee3-b176-ea20f0bd3278'
'06809466-c08d-49d1-bcd5-53c1a0be7a7c'
'ed5183da-7f57-4839-b47e-fcce6002ce2b'
'0f70187d-3250-4073-9c4b-97534031a463'
'166b3652-9a0e-4f0f-93ba-f65b5177b6da'
'2987d0c3-d172-409d-908a-cb07a3946a29'
'dc09bbd2-7c5c-4d0f-ab31-ef68ef4b09ee'
'3de11dac-35d2-474d-b026-e66283895b88'
'd2435293-2629-4abd-aafd-29c1584e7d03'
'e3077188-ab86-427c-8994-9f3a051466c5'
'92317c4e-bb22-4de1-9f78-c276542282e6'
'4ca070ce-ae24-4042-bcd5-f0788ae714e9'
'2e268554-2967-48ad-876a-f52b3f313543'
'bfc0a13a-5fc2-46d7-ae18-27f839e91f2a'
'9bfb4e7f-3e7e-4e06-9bba-d8837a376794'
'aefc5bf5-cce8-4554-bcd9-e6729386d59a'
'70f6ded4-b92d-41bd-bdd1-85081c9d191d'
'5d9a6546-3113-4145-bdc9-77f62f584b0d'
'5fdca6f6-393b-4a49-b16d-169cc468a2dc'
'ca112ad6-8b4e-47aa-9c87-f88d7309adae'
'8e59c362-3d3a-4c1b-9c01-8c1daaef9beb'
'9df1209a-c322-498c-9327-7dc1f02b50d1'
'c7c8b20d-f2df-4aed-b3e5-04bf6fd056de'
'1ca0c6f5-866d-45c0-a878-05be0c638899'
'98d19d75-45f5-4510-bb03-ef31c833df0f'
'b46809fe-31ca-4041-93a2-aa1c364df63e'
'0bc584e5-3c65-40d6-823c-f0deddc6a719'
'0326586f-53ad-4877-b635-f6765abe6f31'
'f779c1ec-6f2f-4ae7-9297-cec5fbdabd60'
'6731395f-b030-4e95-b1e0-8fc5ce666a54'
'd298339a-bb73-4207-9efe-6e993ccc6375'
'948ccb77-db1e-407f-9d27-65bbc99b550c'
'13196cd8-07ea-418b-91a5-69569061ab8d'
'f37b7aa1-b02c-448a-9523-2c4838978052'
'8b114fc7-991c-48eb-9f1c-eabb03256d8b'
'f74d81e7-d752-43e0-a02a-d06f4872a1cd'
'5c8a2ea4-fe9d-491a-8133-4f15dfb6901d'
'89921dbf-d4b2-4a10-9687-4500aa1a4f87'
'81554eff-4e4a-4d91-a721-d71e72d5d41c'
'bbd13ac2-dae7-482d-a326-066d265ca940'
'e9f6858b-901a-4e99-b14d-459c8812d6fe'
'96e5c4d7-84d2-4b85-a4a0-64780fbc3fe7'
'3b06e267-6b5c-4424-93b6-c3e2ede4dbca'
'98a2d985-cd42-4b1a-9d7b-e7d6ea693530'
'fbd2c03e-28f4-4960-86a7-cbc408e4ba68'
'6d58521b-90fc-4234-aff0-4234e3504f8b'
'fd57244d-b2a8-4a76-96b4-b1abc8d6fb87'
'264644eb-ee86-4f65-b154-12d586d77382'
'483ba505-df69-4ee1-aba6-8f7fb39870a1'
'6d08c61f-c7df-4237-99c6-02724b51478a'
'9bb6fe19-7f7a-44cc-ac20-6d9f6e384198'
'8b629a4f-f42b-44ba-af26-11bd72896af4'
'4deefcc4-62db-47ad-b79e-3690351191ea'
'73389c9e-9e30-419e-a467-686df1c12eb6'
'0eb4bca7-a1ab-4d8f-9981-a8b7c5b52e7c'
'e11dc4a5-0ca4-4b85-ae8a-af089aadaaf8'
'fd2b37af-54d6-4030-ba19-cf50e6259e94'
'8ade676d-d3f9-451b-9179-cbbc6737c136'
'3a19df35-cb97-4b0b-b601-2a415a35939a'
'96a8bb35-4f56-4e6f-9d12-cf87248b7413'
'02851916-ebde-46d0-a8d8-1d1cdd8d4603'
'e068892c-c678-4c68-b233-26d26ffc0da6'
'dccff272-755b-4ce4-a965-f2e0679bd745'
'3872f22d-f166-4555-a32c-9922ae724617'
'961f164e-8e92-4313-a922-c05b1e35ae93'
'f22acdbe-30c4-47f0-a2c3-6378efeaf520'
'86ff5fa0-c76a-4e6e-813a-f9ac7519e54e'
'7144d454-0755-438e-9674-68f5ec080825'
'26be1520-5ffc-4414-bbb3-f36c58191b8d'
'4c2a445f-5569-4d1c-831b-d7a2e1d7861d'
'96933a5b-be2d-429a-8c75-8611b7a815c7'
'c898a71b-1498-48ff-ac40-25e3c34f881a'
'a9831c34-7ec1-4d96-902a-d77aee7e80aa'
'3fe5cb62-5a99-4084-a96d-1f53317c7658'
'16f8701c-566b-4af6-baa3-1418743507ea'
'724e3d8e-ba0b-4613-afdc-f9dceadbd8cc'
'12b82c59-f3ce-4afd-a852-19c0557e379e'
'346ac7e6-9b69-4931-a26a-6c814adefb7c'
'5fdf21d6-1dde-4645-8948-35f5b268f048'
'5ec1c7d2-df4f-421f-bf20-7eca108950cc'
'495c71bb-96c6-46ec-92c5-baeac4687b52'
'd92943e4-808c-487c-a5bd-7d6b6e237068'
'587062ef-ec71-4d64-ae3c-8a463fe7cd12'
'bea68e5c-4ef7-4106-90c5-fb57aa4331be'
'5219baa8-5a02-40ea-bf7d-436232019395'
'b8c793be-ba32-4a47-85cd-b59affdb01ce'
'e4614673-2e18-4ad1-9abc-5e0402f9cdd9'
'6efa5a68-b3fb-464a-b9ae-e734ef78ef64'
'4acc63d7-9152-4ac8-b2f3-d5f86b123f52'
'f16e49a2-9e32-446a-a554-133dd5d34610'
'250a49ce-b09c-4907-bb6f-c8a591018f99'
'8d5e2cde-9ccd-4638-8bf1-8b1e91a8dee1'
'cb20f82c-8cbf-4292-9ba0-0ea38fafe2e4'
'132a4265-f7b1-42f0-b9f1-70bc8f884e2a'
'2ff9f29f-b335-4a01-89b6-8c3b21d7d2fa'
'0b30c81a-bf0b-4502-96b9-12b954e16f30'
'104d00f8-fe1b-4218-acc2-e7f3ebf9531a'
'36c8a1e0-3212-43bc-a857-2028cf557511'
'adb450f9-2d71-42ec-b7fd-59061e12e094'
'37443db1-9c55-462b-8c7b-cd932a1de1a4'
'054558d5-95fb-4d38-9b98-bc58e6cd1fc2'
'67f8ac9f-35cc-44e5-b6bd-e2e3c512a78b'
'e4f5604a-8a99-422f-9a9d-34d63ced2e2c'

Steps to reproduce

See above.

Expected behavior

No response

Example repository

No response

Anything else?

No response

@geropl geropl added type: bug Something isn't working component: ws-manager priority: highest (user impact) Directly user impacting labels Aug 19, 2021
@JanKoehnlein
Contributor

/schedule

@csweichel
Contributor

/assign


4 participants