Update containerd to version 1.6.x in LTS stream #916

Closed
ccojocar opened this issue Dec 20, 2022 · 23 comments
Labels
kind/bug Something isn't working

Comments

@ccojocar

Description

Support for containerd 1.5 was removed in Kubernetes 1.26 (https://kubernetes.io/blog/2022/11/18/upcoming-changes-in-kubernetes-1-26/#cri-api-removal), but the latest version of Flatcar, 3033.3.8, still ships containerd 1.5.x.

core@localhost ~ $ sudo ctr version
Client:
  Version:  1.5.13
  Revision: d0d56c1a4ace8bae8c7c98d28ba98f0537ebe704
  Go version: go1.17.8

Server:
  Version:  1.5.13
  Revision: d0d56c1a4ace8bae8c7c98d28ba98f0537ebe704
  UUID: 8fc0d883-cc26-4d00-b47b-d406257b9bef

The kubelet service exits with the following error:

Dec 20 18:21:59 localhost kubelet[3881]: E1220 18:21:59.850722    3881 run.go:74] "command failed" err="failed to run Kubelet: validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///run/containerd/con>
Dec 20 18:21:59 localhost systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
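
The check behind this failure can be sketched as a small shell helper that reads `ctr version` output and compares the server version against a minimum. This is only an illustration: the function names are hypothetical, the 1.6.0 minimum is an assumption taken from the issue title and the linked Kubernetes blog post, and `sort -V` assumes a sort implementation with version ordering (for example GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helpers (not part of Flatcar or containerd).

# version_at_least VERSION MINIMUM
# Succeeds when VERSION >= MINIMUM, relying on `sort -V` for version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# containerd_server_version
# Reads `ctr version` output on stdin and prints the Version field that
# follows the "Server:" line.
containerd_server_version() {
    awk '/^Server:/ { in_server = 1; next }
         in_server && $1 == "Version:" { print $2; exit }'
}

# Example usage on a node (assumed minimum: 1.6.0):
#   v=$(sudo ctr version | containerd_server_version)
#   if version_at_least "$v" 1.6.0; then
#       echo "containerd $v should offer the CRI v1 API"
#   else
#       echo "containerd $v is too old for Kubernetes 1.26"
#   fi
```

With the output shown above, the server version is 1.5.13, so the comparison against 1.6.0 fails, which matches the kubelet error.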

Impact

Kubernetes 1.26 doesn't run with the current version of Flatcar, 3033.3.8.

Environment and steps to reproduce

Expected behavior

Kubernetes 1.26 should run with the current version of Flatcar.

Additional information

@pothos
Member

pothos commented Dec 20, 2022

3033.3.8 is the Flatcar LTS-2022 channel and probably won't get containerd 1.6 until LTS-2023.

You have the following options:

@ccojocar
Author

@pothos Thanks for the clarification. I have now switched to the stable channel, but there are some issues with SSH when Vagrant boots up.

The SSH configuration is the same as for LTS, which works properly:

  config.ssh.connect_timeout = 30
  config.ssh.username = 'core'
  config.ssh.forward_agent = true
  config.ssh.insert_key = true
  config.ssh.keep_alive = true

This is the log from vagrant:

DEBUG ssh: Checking key permissions: /Users/gcojocar/linux/.vagrant.d/insecure_private_key
 INFO ssh: Attempting SSH connection...
 INFO ssh: Attempting to connect to SSH...
 INFO ssh:   - Host: 127.0.0.1
 INFO ssh:   - Port: 2222
 INFO ssh:   - Username: core
 INFO ssh:   - Password? false
 INFO ssh:   - Key Path: ["/Users/gcojocar/linux/.vagrant.d/insecure_private_key"]
DEBUG ssh:   - connect_opts: {:auth_methods=>["none", "hostbased", "publickey"], :config=>false, :forward_agent=>false, :send_env=>false, :keys_only=>true, :verify_host_key=>:never, :password=>nil, :port=>2222, :timeout=>30, :user_known_hosts_file=>[], :verbose=>:debug, :logger=>#<Logger:0x00007fafd5560608 @level=0, @progname=nil, @default_formatter=#<Logger::Formatter:0x00007fafd55605e0 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x00007fafd5560590 @shift_period_suffix=nil, @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<StringIO:0x00007fafd5560658>, @binmode=false, @mon_data=#<Monitor:0x00007fafd5560568>, @mon_data_owner_object_id=117280>>, :keys=>["/Users/gcojocar/linux/.vagrant.d/insecure_private_key"], :remote_user=>"core"}
 INFO subprocess: Starting process: ["/usr/local/bin/VBoxManage", "showvminfo", "9201d38c-109e-4075-ab88-ced1192e2ce6", "--machinereadable"]
DEBUG subprocess: Command not in installer, not touching env vars.
 INFO subprocess: Command not in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0
 INFO subprocess: Starting process: ["/usr/local/bin/VBoxManage", "showvminfo", "9201d38c-109e-4075-ab88-ced1192e2ce6", "--machinereadable"]
DEBUG subprocess: Command not in installer, not touching env vars.
 INFO subprocess: Command not in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0
 INFO subprocess: Starting process: ["/usr/local/bin/VBoxManage", "showvminfo", "9201d38c-109e-4075-ab88-ced1192e2ce6", "--machinereadable"]
DEBUG subprocess: Command not in installer, not touching env vars.
 INFO subprocess: Command not in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0

Any idea what might be different? Thanks.

@tormath1
Contributor

Hi @ccojocar,

What's your Vagrant version? Is it equal to or greater than 2.2.16?

@pothos
Member

pothos commented Dec 21, 2022

Sounds like an issue with deprecated crypto variants. Maybe check that the key variant you use for the host key or login key is allowed on both sides.

@ccojocar
Author

What's your Vagrant version? Is it equal to or greater than 2.2.16?

Vagrant is at the latest version, 2.3.4.

@ccojocar
Author

Sounds like an issue with deprecated crypto variants. Maybe check that the key variant you use for the host key or login key is allowed on both sides.

Do you know which algorithms were deprecated? What should I use? Thanks.

@ccojocar
Author

@pothos @tormath1 It seems even vanilla Flatcar doesn't work with Vagrant. I added the stable Vagrant box and then ran the following commands:

vagrant init flatcar-stable
vagrant up

It gets stuck during the default SSH configuration:

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Importing base box 'flatcar-stable'...
==> default: Matching MAC address for NAT networking...
==> default: Setting the name of the VM: kubernetes-sigs_default_1671620378343_52155
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: core
    default: SSH auth method: private key

I also tried using my own SSH key, with the same result. This is the configuration I use:

  config.vm.provision "file", "source": "./id_ed25519.pub", "destination": "/home/core/.ssh/authorized_keys"
  config.ssh.connect_timeout = 30
  config.ssh.username = 'core'
  config.ssh.forward_agent = true
  config.ssh.insert_key = false
  config.ssh.keep_alive = true
  config.ssh.private_key_path = "./id_ed25519"

It remains stuck in the same place.

@tormath1
Contributor

Which Flatcar .box are you using? That looks similar to: #158 (comment) now.

@ccojocar
Author

@tormath1
Contributor

@ccojocar
Author

This box (https://lts.release.flatcar-linux.net/amd64-usr/current/flatcar_production_vagrant.box) works with SSH, so something in the Flatcar configuration must have broken it.

@ccojocar
Author

@tormath1 It is the same behaviour with flatcar_production_vagrant_virtualbox:

Bringing machine 'default' up with 'virtualbox' provider...
==> default: Box 'flatcar-stable' could not be found. Attempting to find and install...
    default: Box Provider: virtualbox
    default: Box Version: >= 0
==> default: Box file was not detected as metadata. Adding it directly...
==> default: Adding box 'flatcar-stable' (v0) for provider: virtualbox
    default: Downloading: https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_vagrant_virtualbox.box
==> default: Successfully added box 'flatcar-stable' (v0) for 'virtualbox'!
==> default: Importing base box 'flatcar-stable'...
==> default: Matching MAC address for NAT networking...
==> default: Setting the name of the VM: security-profiles-operator_default_1671629780003_31543
==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
==> default: Forwarding ports...
    default: 22 (guest) => 2222 (host) (adapter 1)
==> default: Running 'pre-boot' VM customizations...
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: core
    default: SSH auth method: private key

@ccojocar
Author

It seems that Flatcar stable doesn't boot up properly. It is stuck here with a lot of systemd errors:

[Screenshot: VirtualBox console showing systemd errors during boot]

Unfortunately, I cannot extract the full log from the VirtualBox console.

@tormath1
Contributor

tormath1 commented Dec 21, 2022

@ccojocar after some debugging, I think I found the issue: Vagrant support appears to have been dropped somewhere between Ignition 0.33 (LTS) and Ignition 2.14 (stable).
coreos/ignition@26828f9

The Vagrant provider was previously a no-op, but now Ignition simply fails when OEM_ID=vagrant.

EDIT: Manually adding flatcar.oem.id=metal to the kernel command line lets the instance boot. It's a workaround, but we might consider adding these no-op providers back to Ignition.
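
To confirm from inside a booted instance that the workaround took effect, the `flatcar.oem.id` values on the kernel command line can be listed. The helper below is a minimal sketch; its name is hypothetical, and it only reports what is on the command line without deciding which value takes precedence if the parameter appears more than once.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flatcar or Ignition).

# oem_ids_from_cmdline
# Reads a kernel command line on stdin and prints every value given to
# flatcar.oem.id=, one per line, in the order they appear.
oem_ids_from_cmdline() {
    tr ' ' '\n' | sed -n 's/^flatcar\.oem\.id=//p'
}

# Example usage on the instance:
#   oem_ids_from_cmdline < /proc/cmdline
```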

@ccojocar
Author

@tormath1 Thanks for looking into this. Is it possible to add this kernel option from the Vagrantfile?

ccojocar changed the title from "Update containerd to version 1.6.x" to "Update containerd to version 1.6.x in LTS stream" on Dec 21, 2022

@tormath1
Contributor

@ccojocar I don't think so, as this box relies on VirtualBox (see also: https://www.virtualbox.org/pipermail/vbox-dev/2011-July/009910.html).
I'll try to build a Flatcar image this week with the Ignition patch and will provide the link so you can test it. Once merged, it could be available in the January releases.
In the meantime, I would recommend either manually setting the kernel argument or exploring the other solutions proposed by @pothos.

@ccojocar
Author

@tormath1 Thanks, sounds good!

@tormath1
Contributor

tormath1 commented Dec 22, 2022

@ccojocar Hi, here's the box from the CI if you want to give it a try:

vagrant box add http://bincache.flatcar-linux.net/images/amd64/9999.0.0+tormath1-ignition-vagrant/flatcar_production_vagrant.box --name flatcar-main
vagrant init flatcar-main
vagrant up

@ccojocar
Author

@tormath1 I tried this box with our integration tests: SSH works fine, and Kubernetes 1.26 installed successfully. I am looking forward to a release.

Thanks for fixing this!

ccojocar added a commit to ccojocar/security-profiles-operator that referenced this issue Dec 22, 2022
A temporary box needs to be used until Flatcar will release of new version of stable.

See flatcar/Flatcar#916
k8s-ci-robot pushed a commit to kubernetes-sigs/security-profiles-operator that referenced this issue Dec 23, 2022
A temporary box needs to be used until Flatcar will release of new version of stable.

See flatcar/Flatcar#916
@invidian
Member

invidian commented Jan 2, 2023

I also got hit by the Vagrant issue. Thanks for looking into it @tormath1! BTW, it seems a separate issue could be created for Vagrant, as this one is about containerd in LTS.

@saschagrunert

@ccojocar @tormath1 looks like the image is gone from http://bincache.flatcar-linux.net/images/amd64/9999.0.0+tormath1-ignition-vagrant/flatcar_production_vagrant.box

Would it be possible to re-upload it? Its absence breaks our e2e tests. :)

@tormath1
Contributor

@saschagrunert Hi, this image was a temporary image produced by the CI. The fix is now available in the latest alpha and beta releases: https://beta.release.flatcar-linux.net/amd64-usr/3446.1.0/flatcar_production_vagrant.box (https://www.flatcar.org/releases#release-3446.1.0).
As a side note, the beta channel is often considered stable enough to run a few beta nodes in production workloads.
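
For test setups that pin a box, the download URL can be assembled from the channel and version. The sketch below follows the URL pattern of the links quoted in this thread; the function name and the `flatcar-beta` box name are hypothetical, and only the beta 3446.1.0 URL is confirmed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flatcar or Vagrant).

# flatcar_vagrant_box_url CHANNEL VERSION
# Prints the Vagrant box URL for a release channel (e.g. beta) and a
# version (e.g. 3446.1.0 or current), following the pattern used in this thread.
flatcar_vagrant_box_url() {
    printf 'https://%s.release.flatcar-linux.net/amd64-usr/%s/flatcar_production_vagrant.box\n' "$1" "$2"
}

# Example usage, mirroring the earlier `vagrant box add` commands:
#   vagrant box add "$(flatcar_vagrant_box_url beta 3446.1.0)" --name flatcar-beta
#   vagrant init flatcar-beta
#   vagrant up
```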

saschagrunert added a commit to saschagrunert/security-profiles-operator that referenced this issue Jan 13, 2023
As per
flatcar/Flatcar#916 (comment),
this should fix the image location for the test.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert

Thank you for the hint @tormath1, I was looking at stable.release.flatcar-linux.net, so I guess we now switch to the beta channel. 🙏

@pothos pothos closed this as completed Jan 13, 2023
saschagrunert added a commit to saschagrunert/security-profiles-operator that referenced this issue Jan 16, 2023
As per
flatcar/Flatcar#916 (comment),
this should fix the image location for the test.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
saschagrunert added a commit to kubernetes-sigs/security-profiles-operator that referenced this issue Jan 17, 2023
As per
flatcar/Flatcar#916 (comment),
this should fix the image location for the test.

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>