resolve handling of RootFsInfo() error on kubelet start #19948
Comments
@liggitt since b61c00c is already in 3.11 and the upstream PR kubernetes/kubernetes#65595 does roughly the same thing do we want to
Automatic merge from submit-queue (batch tested with PRs 60150, 65467, 65487, 65595, 65374). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

kubelet: feature gate LSI capacity calculation

Currently, if `cm.cadvisorInterface.RootFsInfo()` fails, the whole kubelet bails. This can happen if `/var/lib/kubelet` is on a tmpfs or bind mount (this is the case for some of our CI envs, see openshift/origin#19948). In the short term, we could work around this by disabling the LSI feature gate if the capacity calculation were protected by the gate, but currently it isn't. This PR adds the gate check around setting the ephemeral storage capacity.

@liggitt @derekwaynecarr @dashpole

Whether this should be fatal might be a separate discussion. If it isn't fatal, it seems it would just prevent pods that have an ephemeral storage request from being scheduled.

/sig node
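The gating described in that PR can be sketched in Go roughly as follows. This is a minimal, self-contained illustration and not the upstream patch: `fsInfo`, `rootFsInfoFunc`, `capacity`, `initialCapacity`, and `failingRootFs` are hypothetical names invented here, the boolean `lsiEnabled` stands in for the real feature gate check, and the function type stands in for `cm.cadvisorInterface.RootFsInfo()`.

```go
package main

import (
	"errors"
	"fmt"
)

// fsInfo is a hypothetical stand-in for the filesystem info returned by cAdvisor.
type fsInfo struct {
	CapacityBytes uint64
}

// rootFsInfoFunc is a hypothetical stand-in for cm.cadvisorInterface.RootFsInfo().
type rootFsInfoFunc func() (fsInfo, error)

// capacity maps a resource name to a size in bytes (illustrative only).
type capacity map[string]uint64

const ephemeralStorage = "ephemeral-storage"

// initialCapacity only consults the root filesystem when the (assumed) LSI
// gate is enabled, so a failing lookup cannot abort startup while the gate is off.
func initialCapacity(lsiEnabled bool, rootFs rootFsInfoFunc) (capacity, error) {
	c := capacity{}
	if !lsiEnabled {
		// Gate disabled: skip the lookup entirely.
		return c, nil
	}
	info, err := rootFs()
	if err != nil {
		// Gate enabled: the failure is still surfaced to the caller.
		return nil, fmt.Errorf("getting root filesystem info: %w", err)
	}
	c[ephemeralStorage] = info.CapacityBytes
	return c, nil
}

// failingRootFs simulates a lookup that fails, e.g. on a tmpfs or bind mount.
func failingRootFs() (fsInfo, error) {
	return fsInfo{}, errors.New("simulated root filesystem lookup failure")
}

func main() {
	c, err := initialCapacity(false, failingRootFs)
	fmt.Println("gate off:", c, err)

	c, err = initialCapacity(true, failingRootFs)
	fmt.Println("gate on:", c, err)
}
```

With the gate off, the failing lookup is never called, which is the short-term workaround the PR description has in mind. With the gate on, the error is still returned to the caller.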
if we don't plan to enable the feature in 3.11, it's largely moot. if we do, then we should revert/pick that PR, right?
agreed. I'll keep this in mind if we decide to go forward with LSI in 3.11 (which would be a stretch, I think)
@openshift-bot: Closing this issue.
follow-up from #19137
b61c00c made the RootFsInfo() lookup failure on kubelet startup non-fatal because it was blocking CI (it fails when run on tmpfs/bindmounts I think)
need to determine the impact of continuing the kubelet even without the EphemeralStorageCapacityFromFsInfo results
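As a rough illustration of the non-fatal handling under discussion, the sketch below logs a failed root filesystem lookup and continues without an ephemeral storage capacity. It is not the code from b61c00c: `fsInfo`, `ephemeralCapacityOrSkip`, and `tmpfsLikeRootFs` are hypothetical names, and the function argument stands in for the `RootFsInfo()` call.

```go
package main

import (
	"errors"
	"log"
)

// fsInfo is a hypothetical stand-in for the filesystem info returned by cAdvisor.
type fsInfo struct {
	CapacityBytes uint64
}

// ephemeralCapacityOrSkip treats a failed lookup as non-fatal: it logs the
// error and reports that no capacity is available instead of aborting.
func ephemeralCapacityOrSkip(rootFs func() (fsInfo, error)) (uint64, bool) {
	info, err := rootFs()
	if err != nil {
		log.Printf("root filesystem lookup failed, continuing without ephemeral storage capacity: %v", err)
		return 0, false
	}
	return info.CapacityBytes, true
}

// tmpfsLikeRootFs simulates the failing lookup seen in the CI environments.
func tmpfsLikeRootFs() (fsInfo, error) {
	return fsInfo{}, errors.New("simulated root filesystem lookup failure")
}

func main() {
	bytes, ok := ephemeralCapacityOrSkip(tmpfsLikeRootFs)
	log.Printf("capacity=%d available=%t", bytes, ok)
}
```

The open question is what the node should do when no capacity is available. One of the comments suggests that pods with an ephemeral storage request would simply not be scheduled onto such a node.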