Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

skip image pull for addons that are skipped #2603

Closed
neolit123 opened this issue Nov 11, 2021 · 21 comments · Fixed by kubernetes/kubernetes#114534
Closed

skip image pull for addons that are skipped #2603

neolit123 opened this issue Nov 11, 2021 · 21 comments · Fixed by kubernetes/kubernetes#114534
Assignees
Labels
kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/design Categorizes issue or PR as related to design. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.
Milestone

Comments

@neolit123
Copy link
Member

neolit123 commented Nov 11, 2021

during init, if the user passes a InitConfiguration with skipped addons (e.g. skipPhases: - addon/kube-proxy) the preflight phase still pulls the addon container image.

we can support skipping the image for the addon in init, but supporting this is in join and upgrade is a bit complicated because these commands would not know if init skipped the addon. they'd have to look at ConfigMaps in the cluster before making the decision to skip image pull.

opening this issue for discussion if this is something that we want to implement and how complicated it would be.

storing the addon state in ClusterConfiguration would make sense, but if addons will become external we wouldn't want to add named fields in ClusterConfiguration for addon availability control.

current workarounds for users that don't want to download addon image Foo:

  • use --ignore-preflight-errors=ImagePull for init, join, upgrade.
  • pre-download the addon image Foo using kubeadm config images pull to avoid errors (e.g. if there is no connection to gcr.io).
@neolit123 neolit123 added kind/design Categorizes issue or PR as related to design. priority/backlog Higher priority than priority/awaiting-more-evidence. labels Nov 11, 2021
@neolit123 neolit123 added this to the Next milestone Nov 11, 2021
@wangyysde
Copy link
Member

/cc @wangyysde

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 27, 2022
@RA489
Copy link
Contributor

RA489 commented Mar 4, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 4, 2022
@pacoxu
Copy link
Member

pacoxu commented May 11, 2022

  • init: we can skip according to the skipPhases
  • join/upgrade: we can check if the add-on exists. For instance, we can skip kube-proxy if DaemonSet kube-proxy is not found. (Is it tricky or a general way?)

@neolit123
Copy link
Member Author

neolit123 commented May 11, 2022

Yep, like i said in the OP

but supporting this is in join and upgrade is a bit complicated because these commands would not know if init skipped the addon. they'd have to look at ConfigMaps in the cluster before making the decision to skip image pull.

Upgrade already does skip if the CMs for the addons are not present.

On problem for join i think is that it requires constructing a client to check CMs way too early. The image pull is one of the earliest things in the join phases. That is problematic.

@wangyysde
Copy link
Member

  1. How about always skip image pull for addons when init/join/upgrade? Because addons be deployed with deployment or daemonSet, the image of addons be pullled after init/join/upgrade. :-)
  2. How about skip image pull for addons on PreflightPhase ,then pull them in reateKubeProxyAddon and coreDNSAddon function?

@wangyysde
Copy link
Member

/cc

@neolit123
Copy link
Member Author

neolit123 commented Jun 7, 2022

sorry for missing your message.

How about always skip image pull for addons when init/join/upgrade? Because addons be deployed with deployment or daemonSet, the image of addons be pullled after init/join/upgrade. :-)

this can work, i guess. technically the same applies to the images for the control plane in static pods.
the kubelet will pull them and init/join/upgrade does not have to.
one problem here is that the purpose of preflight is to "prepull" the images, so that later users don't have to look at kubelet or "kubectl describe" logs to understand why a component is not starting... :\

How about skip image pull for addons on PreflightPhase ,then pull them in reateKubeProxyAddon and coreDNSAddon function?

this can also work and i think i like this option more, because the error will happen in kubeadm and users don't have to "look at kubelet or "kubectl describe" logs".

it would be a matter of excluding kube-proxy and coredns from https://github.com/kubernetes/kubernetes/blob/79cef12276522232edb444f4493226466664327a/cmd/kubeadm/app/images/images.go#L90-L111 and adding separate pull checks for control plane proxy and dns.
https://github.com/kubernetes/kubernetes/blob/79cef12276522232edb444f4493226466664327a/cmd/kubeadm/app/preflight/checks.go#L1063

thoughts from others?
let's not proceed with changes / PRs until we get agreement.

a partial workaround today is the following:

  • prepull images with "kubeadm config images list/pull"
  • use "InitConfiguration.NodeRegistration.ImagePullPolicy" and "JoinConfiguration.NodeRegistration.ImagePullPolicy" == "Never"
    problem with that is that we have no "UpgradeConfiguration" to apply the policy during upgrade and the addon images will still be pulled.

@neolit123 neolit123 added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jun 7, 2022
@chendave
Copy link
Member

but if addons will become external we wouldn't want to add named fields in ClusterConfiguration for addon availability control

The addons are kube-proxy and coredns, does kubeadm supports the "external" addons today? and If the new field is only used to track what's the addon is enabled, what's the problem will be if the external addons is added into the ClusterConfiguration?

@chendave
Copy link
Member

I am +1 on moving the image pulling of the addons to the phases which they belong to, since now that kubeadm has phases innovated, each phase should be idempotent and shouldn't be coupled with each other. I think this is also the reason why I want to move the cleanup to it's phases (kubernetes/kubernetes#110972).

@neolit123
Copy link
Member Author

neolit123 commented Aug 22, 2022

The addons are kube-proxy and coredns, does kubeadm supports the "external" addons today? and If the new field is only used to track what's the addon is enabled, what's the problem will be if the external addons is added into the ClusterConfiguration?

no, kubeadm does not support external addons.
if we want the external addon support and externalize the current addons, that would need a kep and some discussion.

@neolit123
Copy link
Member Author

I am +1 on moving the image pulling of the addons to the phases which they belong to

as mentioned it would disagree with the preflight purpose, but that's likely the best we can do.

arguably 'config images' should continue working with all images.

@chendave
Copy link
Member

no, kubeadm does not support external addons.

so that storing the addon state in ClusterConfiguration as you said should be acceptable, at least we cannot foresee is there any problem with this approach atm.

for this issue, storing the addon state in ClusterConfiguration + move image pulling into it's phases looks like the right approach to go!

arguably 'config images' should continue working with all images.

agree!

@neolit123
Copy link
Member Author

i would like to read a kep if we are touching the cluster config. the exact approach is not clear to me at this stage.

@neolit123 neolit123 modified the milestones: Next, v1.26 Aug 25, 2022
@chendave
Copy link
Member

chendave commented Sep 2, 2022

/assign

I or @ruquanzhao will send a google doc for review in next few weeks.

@ruquanzhao
Copy link
Member

/assign
I write a document of this. Hope to get comments from everybody. @neolit123 @pacoxu @wangyysde @chendave
https://docs.google.com/document/d/1mf3dlFFrHVDRLOUYFy4AaRJqiCLcv9VcVS7cxrmDcSs/edit?usp=sharing

@chendave chendave removed their assignment Sep 29, 2022
@chendave
Copy link
Member

@ruquanzhao thanks for the doc, left some question or comments for you in the doc

@neolit123 neolit123 modified the milestones: v1.26, v1.27 Nov 21, 2022
AndiDog added a commit to giantswarm/cluster-aws that referenced this issue Jan 11, 2023
….15 tries to pull the official image incorrectly

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades.
AndiDog added a commit to giantswarm/cluster-aws that referenced this issue Jan 11, 2023
…s to pull the official image incorrectly

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades. CAPI propagates this value before attempting the node upgrade.
AndiDog added a commit to giantswarm/cluster-aws that referenced this issue Jan 11, 2023
…s to pull the official image incorrectly

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades. CAPI propagates this value before attempting the node upgrade.
AndiDog added a commit to giantswarm/cluster-aws that referenced this issue Jan 11, 2023
…s to pull the official image incorrectly (#197)

While kubeadm is buggy (kubernetes/kubernetes#114978) and tries to download the CoreDNS image despite us skipping that addon (kubernetes/kubeadm#2603), we try to use the new official repository already. This fixes `kubeadm join` for new Kubernetes versions and therefore avoids stuck node upgrades. CAPI propagates this value before attempting the node upgrade.
@SataQiu
Copy link
Member

SataQiu commented Feb 16, 2023

/kind api-change

@k8s-ci-robot k8s-ci-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Feb 16, 2023
@neolit123 neolit123 modified the milestones: v1.27, v1.28 Apr 17, 2023
@chendave chendave mentioned this issue Jun 19, 2023
26 tasks
@neolit123 neolit123 modified the milestones: v1.28, v1.29 Jul 21, 2023
@kotyara85
Copy link

@ruquanzhao does this still work? I’m using kubeadm v1.30.3 and init doesn’t pull images but join does. Because of that my deployment stuck. Thanks

@ruquanzhao
Copy link
Member

@ruquanzhao does this still work? I’m using kubeadm v1.30.3 and init doesn’t pull images but join does. Because of that my deployment stuck. Thanks

this feature is in milestone 1.31, so you may need to use v1.31.

@pacoxu
Copy link
Member

pacoxu commented Sep 24, 2024

@ruquanzhao does this still work? I’m using kubeadm v1.30.3 and init doesn’t pull images but join does. Because of that my deployment stuck. Thanks

this feature is in milestone 1.31, so you may need to use v1.31.

To make it clear, the feature PR kubernetes/kubernetes#114534 was merged in v1.29 for v1beta4(at that time, kubeadm only enabled v1beta3), but v1beta4 is officially released in v1.31, see https://kubernetes.io/blog/2024/08/23/kubernetes-1-31-kubeadm-v1beta4/ or kubernetes/kubernetes#125029.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/design Categorizes issue or PR as related to design. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants