Pods unable to be created on windows nodes #109

Closed
lizhuqi opened this issue Apr 26, 2021 · 28 comments
Labels
Windows on Kubernetes Windows Containers using Kubernetes

Comments

@lizhuqi

lizhuqi commented Apr 26, 2021

It seems that Windows has trouble handling concurrent container operations: when several containers are started at the same time on a Windows node, some of them fail to start.

27m Warning FailedCreatePodSandBox pod/[POD_ONE] Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "[POD_ONE]": Error response from daemon: hcsshim::PrepareLayer - failed failed in Win32: The device is not ready. (0x15)
24m Warning FailedMount pod/[POD_ONE] Unable to attach or mount volumes: unmounted volumes=[workspace-volume default-token-xyz12], unattached volumes=[workspace-volume default-token-xyz12l]: timed out waiting for the condition
30m Normal Scheduled pod/[POD_TWO] Successfully assigned cje/[POD_TWO] to [NODE_ID]
29m Warning FailedCreatePodSandBox pod/[POD_TWO] Failed to create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "[POD_TWO]": Error response from daemon: hcsshim::PrepareLayer - failed failed in Win32: The device is not ready. (0x15)
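
To see how many pods are hitting this particular failure, the event output can be filtered for the `hcsshim::PrepareLayer` error with code `0x15`. The following is a minimal sketch, assuming the events are available as plain text lines in the format shown above; the function name `count_prepare_layer_failures` is made up for illustration.

```python
import re
from collections import Counter

# Matches the FailedCreatePodSandBox warning shown above and captures the
# pod name that follows "pod/". The pattern is an assumption based on the
# event lines quoted in this issue, not on any documented log format.
PREPARE_LAYER_RE = re.compile(
    r"FailedCreatePodSandBox\s+pod/(?P<pod>\S+).*hcsshim::PrepareLayer.*\(0x15\)"
)


def count_prepare_layer_failures(event_lines):
    """Return a Counter mapping pod name to the number of matching failures."""
    failures = Counter()
    for line in event_lines:
        match = PREPARE_LAYER_RE.search(line)
        if match:
            failures[match.group("pod")] += 1
    return failures


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): pipe event output into this script on stdin.
    for pod, count in count_prepare_layer_failures(sys.stdin).items():
        print(f"{pod}: {count}")
```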

@ghost ghost added the triage New and needs attention label Apr 26, 2021
@vrapolinario vrapolinario added Windows on Kubernetes Windows Containers using Kubernetes and removed triage New and needs attention labels Apr 26, 2021
@vrapolinario
Contributor

Hi @lizhuqi, can you please provide more details on the issue you're seeing? For example: is this a managed K8s (AKS, EKS)? What version of K8s? What is the K8s node config?

@marosset
Member

marosset commented Apr 27, 2021

Can you include a few more details such as

  • Pod spec
  • number of containers starting concurrently
  • Node size (reserved)
  • Any other repro steps.
  • OS version

Thanks!

@lizhuqi
Author

lizhuqi commented Apr 27, 2021

I noticed that the cluster with the issue is still using 10.0.17763.1577. I will ask the owner to upgrade to 10.0.17763.1757 first and then add more details.

@zhiweiv

zhiweiv commented May 13, 2021

This is also reported in microsoft/hcsshim#919.
The VM spec is E2_V4 (2 cores/16 GB). We run bare pods as job workloads, and there are always about 20 pod creations/terminations happening concurrently.
The pod spec:

    resources:
      requests:
        cpu: 50m
        memory: 500Mi
      limits:
        cpu: 1000m
        memory: 4Gi
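
For anyone trying to reproduce this, the workload described above can be approximated by submitting about 20 bare pods at once. The following is a rough sketch under stated assumptions: only the resource requests and limits come from the pod spec above, while the pod names, the container name, and the image (`example.invalid/job-image:tag`) are hypothetical placeholders.

```python
import json


def make_bare_pod(index, image="example.invalid/job-image:tag"):
    """Build one bare pod manifest; name and image are placeholders."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"repro-job-{index}"},
        "spec": {
            "restartPolicy": "Never",
            # Schedule onto Windows nodes only.
            "nodeSelector": {"kubernetes.io/os": "windows"},
            "containers": [
                {
                    "name": "job",
                    "image": image,
                    # Resource values copied from the pod spec above.
                    "resources": {
                        "requests": {"cpu": "50m", "memory": "500Mi"},
                        "limits": {"cpu": "1000m", "memory": "4Gi"},
                    },
                }
            ],
        },
    }


def make_pod_list(count=20):
    """Wrap `count` pods in a v1 List so they can be submitted together."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [make_bare_pod(i) for i in range(count)],
    }


if __name__ == "__main__":
    print(json.dumps(make_pod_list(), indent=2))
```

The JSON output can be piped to `kubectl apply -f -`. Whether the pods actually start concurrently on one node depends on scheduling, so this only approximates the reported load.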

@immuzz

immuzz commented Jun 9, 2021

Is this on AKS or some other distro or DIY K8s?

@zhiweiv

zhiweiv commented Jun 10, 2021

For me, it reproduces on both AKS and AKS-Engine.

@ghost

ghost commented Jul 10, 2021

This issue has been open for 30 days with no updates.
@immuzz, please provide an update or close this issue.

@ghost

ghost commented Aug 11, 2021

This issue has been open for 30 days with no updates.
@immuzz, please provide an update or close this issue.

@zhiweiv

zhiweiv commented Aug 13, 2021

There is a fix: microsoft/hcsshim#1091 (comment).
Per microsoft/hcsshim#919 (comment), I guess it will still take a long time for it to be available in AKS.

@AbelHu

AbelHu commented Aug 16, 2021

@zhiweiv, I think the fix needs to be picked up by moby first, and then we can upgrade the moby version on the AKS side.
@immuzz, do you know who can speed up getting the new hcsshim release with this fix into the next moby version?

@zhiweiv

zhiweiv commented Aug 16, 2021 via email

Hope it can be merged into AKS Windows containerd ASAP too.

@AbelHu

AbelHu commented Aug 17, 2021

AKS will pick it up after containerd 1.5.x has this fix.
kubernetes/kubernetes#101908 (comment)

@AbelHu

AbelHu commented Aug 18, 2021

@zhiweiv FYI: this fix will be backported to 1.4.x

@zhiweiv

zhiweiv commented Aug 31, 2021

The fix is in the latest hcsshim release: https://github.com/microsoft/hcsshim/releases/tag/v0.8.21. Is the next step a moby update? Can anyone push this further?

@immuzz

immuzz commented Sep 1, 2021

I don't know if this will be fixed in Moby. Have you tried this with containerd?

@zhiweiv

zhiweiv commented Sep 30, 2021

@AbelHu
containerd updated hcsshim in https://github.com/containerd/containerd/releases/tag/v1.4.10. Is there any plan for an AKS Windows containerd update?

@AbelHu

AbelHu commented Sep 30, 2021

@kevpar, could you help confirm whether the hcsshim issue is fixed in the latest containerd package (v0.0.42) used by AKS?

@zhiweiv

zhiweiv commented Oct 1, 2021

@AbelHu
I've often seen that when you reference Windows containerd, you use a 0.0.x version. Is it an internal build of containerd at MS?

How does it differ from regular containerd?

@AbelHu

AbelHu commented Oct 1, 2021

In AKS, Windows containerd is still in public preview, so we use an internal package.

@AbelHu

AbelHu commented Oct 1, 2021

I think that @immuzz should be able to give more details on the difference.

@zhiweiv

zhiweiv commented Oct 28, 2021

Windows Containerd on AKS is going GA per Azure/AKS#1976 (comment), any update on this?

@AbelHu

AbelHu commented Oct 28, 2021

@zhiweiv it should be coming soon. Current plan is next month.

@ghost

ghost commented Nov 27, 2021

This issue has been open for 30 days with no updates.
@immuzz, please provide an update or close this issue.

@zhiweiv

zhiweiv commented Dec 23, 2021

AKS Windows Containerd has been updated to 1.5.8.

@ghost

ghost commented Jan 22, 2022

This issue has been open for 30 days with no updates.
@immuzz, please provide an update or close this issue.

@ghost

ghost commented Feb 22, 2022

This issue has been open for 30 days with no updates.
@immuzz, please provide an update or close this issue.

@cwilhit
Contributor

cwilhit commented Mar 7, 2022

@zhiweiv, with containerd now GA in AKS, can you let me know if you are still seeing this issue? Otherwise I will mark this issue as resolved.

@zhiweiv
Copy link

zhiweiv commented Mar 8, 2022

You can close this issue; it hasn't occurred since containerd was updated to 1.5.8.
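
For readers checking their own nodes, the container runtime version that Kubernetes reports for a node (for example `containerd://1.5.8`) can be compared against the versions mentioned in this thread. This is only a sketch: the baselines are the versions reported here (1.4.10 as the 1.4.x release that updated hcsshim, and 1.5.8 as the version after which the issue stopped for the reporter), not an authoritative list of fixed releases, and the function names are made up for illustration.

```python
import re

# Baselines taken from this thread only; treat them as assumptions.
REPORTED_VERSIONS = {(1, 4): (1, 4, 10), (1, 5): (1, 5, 8)}


def parse_runtime_version(runtime_version):
    """Parse a string like 'containerd://1.5.8' into ('containerd', (1, 5, 8))."""
    match = re.fullmatch(r"(\w+)://(\d+)\.(\d+)\.(\d+).*", runtime_version)
    if not match:
        raise ValueError(f"unrecognized runtime version: {runtime_version!r}")
    name = match.group(1)
    version = tuple(int(match.group(i)) for i in (2, 3, 4))
    return name, version


def at_or_above_reported(runtime_version):
    """Compare a runtime version against the versions named in this thread.

    Returns True if the runtime is containerd at or above the version reported
    for its release line, False if it is below it or is not containerd, and
    None if the thread says nothing about that release line.
    """
    name, version = parse_runtime_version(runtime_version)
    if name != "containerd":
        return False
    baseline = REPORTED_VERSIONS.get(version[:2])
    if baseline is None:
        return None
    return version >= baseline
```

The input string corresponds to the `containerRuntimeVersion` field in a node's status, which `kubectl get nodes -o wide` also displays.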

@cwilhit cwilhit closed this as completed Mar 8, 2022
7 participants