v0.8.0 plugin crashes #1105

Closed
mvrk69 opened this issue Sep 30, 2024 · 16 comments

@mvrk69

mvrk69 commented Sep 30, 2024

System Information

Linux distribution

Fedora 40

Terraform version

Terraform v1.9.6
on linux_amd64
+ provider registry.terraform.io/dmacvicar/libvirt v0.8.0

Description of Issue/Question

Setup

00-provider.tf
terraform {
  required_providers {
    libvirt = {
      source = "dmacvicar/libvirt"
    }
  }
}

provider "libvirt" {
  uri = "qemu+ssh://root@${var.compute_host}/system?keyfile=${var.keyfile}&sshauth=privkey"
}


01-variables.tf
variable "compute_host" {
  description = "Compute Hostname"
  type        = string
}

variable "keyfile" {
  description = "Compute host ssh key"
  type        = string
}

variable "datastore" {
  description = "Datastore"
  type        = string
}


02-variables-values.tfvars
compute_host = "xxx.xxx.xxx.xxx"
keyfile = "/home/user/.ssh/id_rsa"
datastore = "vms"


Steps to Reproduce Issue

terraform plan -var-file="02-variables-values.tfvars"

Planning failed. Terraform encountered an error while generating this plan.


│ Error: Plugin did not respond

│ with provider["registry.terraform.io/dmacvicar/libvirt"],
│ on 00-provider.tf line 13, in provider "libvirt":
│ 13: provider "libvirt" {

│ The plugin encountered an error, and failed to respond to the plugin.(*GRPCProvider).ConfigureProvider call. The plugin logs may contain more details.

Stack trace from the terraform-provider-libvirt_v0.8.0 plugin:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x79911c]

goroutine 12 [running]:
github.com/kevinburke/ssh_config.(*Config).Get(0x0, {0xc00004e910, 0xb}, {0xe88d73, 0x8})
github.com/kevinburke/ssh_config@v1.2.0/config.go:343 +0x5c
github.com/dmacvicar/terraform-provider-libvirt/libvirt/uri.(*ConnectionURI).dialHost(0xc000140170, {0xc00004e910, 0xb}, 0x0, 0x0)
github.com/dmacvicar/terraform-provider-libvirt/libvirt/uri/ssh.go:166 +0x259
github.com/dmacvicar/terraform-provider-libvirt/libvirt/uri.(*ConnectionURI).dialSSH(0xc000140170)
github.com/dmacvicar/terraform-provider-libvirt/libvirt/uri/ssh.go:129 +0x1c5
github.com/dmacvicar/terraform-provider-libvirt/libvirt/uri.(*ConnectionURI).Dial(0xc000140170)
github.com/dmacvicar/terraform-provider-libvirt/libvirt/uri/connection_uri.go:81 +0x3d
github.com/digitalocean/go-libvirt/socket.(*Socket).Connect(0xc000144d20)
github.com/digitalocean/go-libvirt@v0.0.0-20240916165608-bff44a349d9d/socket/socket.go:141 +0xbd
github.com/digitalocean/go-libvirt.(*Libvirt).ConnectToURI(0xc0001194a0, {0xc00003cfa8, 0xe})
github.com/digitalocean/go-libvirt@v0.0.0-20240916165608-bff44a349d9d/libvirt.go:287 +0x2c
github.com/dmacvicar/terraform-provider-libvirt/libvirt.(*Config).Client(0xc0000f2450?)
github.com/dmacvicar/terraform-provider-libvirt/libvirt/config.go:36 +0x215
github.com/dmacvicar/terraform-provider-libvirt/libvirt.providerConfigure(0xc000438300)
github.com/dmacvicar/terraform-provider-libvirt/libvirt/provider.go:71 +0x147
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Provider).Configure(0xc0002ae770, {0xfd7228, 0xc000395b90}, 0xc000144af0)
github.com/hashicorp/terraform-plugin-sdk/v2@v2.34.0/helper/schema/provider.go:359 +0x1bb
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ConfigureProvider(0xc000392870, {0xfd7228?, 0xc000394900?}, 0xc0001483a0)
github.com/hashicorp/terraform-plugin-sdk/v2@v2.34.0/helper/schema/grpc_provider.go:616 +0x3c5
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).Configure(0xc0002a55e0, {0xfd7228?, 0xc000127e30?}, 0xc0001448c0)
github.com/hashicorp/terraform-plugin-go@v0.24.0/tfprotov5/tf5server/server.go:587 +0x342
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_Configure_Handler({0xe5ad20, 0xc0002a55e0}, {0xfd7228, 0xc000127e30}, 0xc0001eb700, 0x0)
github.com/hashicorp/terraform-plugin-go@v0.24.0/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:491 +0x1a6
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001e7000, {0xfd7228, 0xc000127da0}, {0xfdc420, 0xc0004f2000}, 0xc0004617a0, 0xc000395800, 0x15da1f0, 0x0)
google.golang.org/grpc@v1.66.2/server.go:1394 +0xe2b
google.golang.org/grpc.(*Server).handleStream(0xc0001e7000, {0xfdc420, 0xc0004f2000}, 0xc0004617a0)
google.golang.org/grpc@v1.66.2/server.go:1805 +0xe8b
google.golang.org/grpc.(*Server).serveStreams.func2.1()
google.golang.org/grpc@v1.66.2/server.go:1029 +0x7f
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 35
google.golang.org/grpc@v1.66.2/server.go:1040 +0x125

Error: The terraform-provider-libvirt_v0.8.0 plugin crashed!

This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.


Additional information:

Do you have SELinux or Apparmor/Firewall enabled? Some special configuration?
NO
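
For readers following the trace: the top frame shows (*Config).Get being called with a receiver of 0x0, that is, on a nil *Config. The sketch below reproduces that failure mode in isolation. The Config type, safeGet, and callGet are hypothetical stand-ins written for illustration; they are not the real github.com/kevinburke/ssh_config API or the provider's code.

```go
package main

import "fmt"

// Config is a hypothetical stand-in for a parsed ssh config.
// It is NOT the real github.com/kevinburke/ssh_config type.
type Config struct {
	values map[string]string
}

// Get reads a field through the receiver, so it panics with a nil pointer
// dereference when c is nil.
func (c *Config) Get(key string) string {
	return c.values[key]
}

// safeGet treats a nil config as "no config file present" and returns the
// fallback instead of dereferencing the pointer.
func safeGet(c *Config, key, fallback string) string {
	if c == nil {
		return fallback
	}
	if v, ok := c.values[key]; ok {
		return v
	}
	return fallback
}

// callGet reports whether calling Get on c panicked.
func callGet(c *Config, key string) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = c.Get(key)
	return false
}

func main() {
	var missing *Config // what a failed config load might leave behind
	fmt.Println("Get on nil config panicked:", callGet(missing, "HostName"))
	fmt.Println("safeGet on nil config:", safeGet(missing, "HostName", "localhost"))
}
```

In a provider plugin nothing recovers the panic, which would explain why Terraform only reports "Plugin did not respond".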

@mvrk69
Author

mvrk69 commented Oct 2, 2024

Nothing. I don't have a ~/.ssh/config; I'm using the default OS ssh settings.

@memetb
Contributor

memetb commented Oct 3, 2024

@dmacvicar I will inspect this more closely, but we should consider whether upgrading the ssh_config dependency itself was the root cause.

What I'm reading from that stack trace is that the crash is in that module.

@mvrk69 are you able to get debug logs please?

TF_LOG=DEBUG terraform apply

@memetb
Contributor

memetb commented Oct 3, 2024

@mvrk69 I have tried to replicate this on a Debian Linux box, to no avail. The debug logs will be crucial for tracking this down.

@mvrk69
Author

mvrk69 commented Oct 3, 2024

terraform_debug.log

@memetb
Contributor

memetb commented Oct 3, 2024

Ok, the wheels start falling off the wagon here.

I will work on a PR for this asap.

@mvrk69: if you run touch ~/.ssh/config, this error will likely go away. Can you confirm?

@mvrk69
Author

mvrk69 commented Oct 4, 2024

Yup, with ~/.ssh/config in place it works and doesn't crash anymore.

@johnbent

Just want to chime in that I was seeing this same crash with the same stack trace. I did the 'touch ~/.ssh/config' and now the crash goes away but I'm encountering another error.

I'm on CentOS 8, running
OpenTofu v1.8.3
on linux_amd64
And seeing the same behavior with
Terraform v1.9.7

The error I'm seeing now is:
│ Error: failed to connect: failed to connect to remote host 'localhost': ssh: handshake failed: knownhosts: key mismatch

Doing ssh from the command-line does work so there shouldn't be a key mismatch.

It also works if I force the previous libvirt release ("0.7.6"). Here is a minimal reproducer

`
terraform {
  required_providers {
    libvirt = {
      source  = "dmacvicar/libvirt"
      version = "0.8.0" # this crashes if no ~/.ssh/config. it fails connection with host key mismatch with empty ~/.ssh/config
      #version = "0.7.6" # this works
    }
  }
}

provider "libvirt" {
  alias = "remotehost"
  uri   = "qemu+ssh://root@localhost/system"
}

resource "libvirt_volume" "remotehost-qcow2" {
  provider = libvirt.remotehost
  name     = "remotehost-qcow2"
  pool     = "default"
  format   = "qcow2"
  size     = 100000
}
`

Thanks for the great modules and the friendly community licensing!

@memetb
Contributor

memetb commented Oct 13, 2024

@johnbent I've had the misfortune of dealing with this. I've even commented in the code here

So I would bet you that if you manually set the HostKeyAlgorithms value in your .ssh/config file to the correct value, it will work.

See: golang/go#29286

Quite honestly, this has been a nasty user experience and the solution isn't a simple one. I'm open to approaches the community thinks are good, but the short answer is that the failure is opaque and occurs inside go's implementation of the ssh key exchange.

The problem is that the eager failure of the underlying library forces you to know what the server is configured for, which is a terrible abstraction leak. There are insane alternatives, like trying each entry in the list of host key algorithms until one succeeds.

I'm open to thoughts from the community.

@johnbent

@memetb , sorry to hear it is such a mess. I wish I had suggestions but this is all Greek to me. Thank goodness for ChatGPT which helped me figure out your instructions at least. :)

So, I did update my ssh config to be:

Host localhost
  IdentityFile ~/.ssh/id_rsa
  StrictHostKeyChecking yes
  HostKeyAlgorithms ecdsa-sha2-nistp256

And now it seems to work. Thanks!

@memetb
Contributor

memetb commented Oct 14, 2024

Glad to hear it works.

The behaviour you're seeing is baked into go's implementation of ssh. IMO, you shouldn't have to do it that way from a security perspective (after all, my understanding of the English words "Host key algorithms" is that any of the algorithms listed should succeed). Unfortunately, the way it is done right now, it immediately fails when an algorithm that doesn't match is found.
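
To make the mismatch concrete, here is a deliberately simplified model of the behaviour described above. It is not the golang.org/x/crypto/ssh implementation: negotiate and checkKnownHost are hypothetical helpers, and the default preference order and the server's key list are assumptions chosen for illustration. The idea is that the server presents the first host key type in the client's preference list that it has, and the known_hosts check then compares that single key type against what is on record.

```go
package main

import "fmt"

// negotiate returns the first algorithm in the client's preference list that
// the server also offers, or "" if there is none. (Simplified model.)
func negotiate(clientPrefs, serverKeys []string) string {
	offered := make(map[string]bool, len(serverKeys))
	for _, k := range serverKeys {
		offered[k] = true
	}
	for _, alg := range clientPrefs {
		if offered[alg] {
			return alg
		}
	}
	return ""
}

// checkKnownHost models a known_hosts lookup where only one key type is on
// record for the host. Any other presented type is reported as a mismatch.
func checkKnownHost(knownType, presentedType string) error {
	if presentedType == "" {
		return fmt.Errorf("no common host key algorithm")
	}
	if presentedType != knownType {
		return fmt.Errorf("key mismatch: known_hosts has %s, server presented %s",
			knownType, presentedType)
	}
	return nil
}

func main() {
	// Assumed server key types and known_hosts entry, for illustration only.
	server := []string{"ssh-ed25519", "ecdsa-sha2-nistp256", "rsa-sha2-512"}
	known := "ecdsa-sha2-nistp256"

	// Assumed default client order: the server presents ed25519 first.
	defaults := []string{"ssh-ed25519", "ecdsa-sha2-nistp256", "rsa-sha2-512"}
	fmt.Println("default prefs:", checkKnownHost(known, negotiate(defaults, server)))

	// Pinning HostKeyAlgorithms to the recorded type makes the check pass.
	pinned := []string{"ecdsa-sha2-nistp256"}
	fmt.Println("pinned prefs: ", checkKnownHost(known, negotiate(pinned, server)))
}
```

Under these assumptions, the server does hold a key that known_hosts would accept, but it is never presented unless the client's list is narrowed. That would be consistent with the HostKeyAlgorithms workaround reported in this thread.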

dmacvicar pushed a commit that referenced this issue Oct 19, 2024
* update log output to indicate intentions moving forward

* if we have failed to read the ssh_config file, ignore it entirely

* add defensive code path that dies if for some reason even this fails

* added this to prevent double warnings in case of missing file

* allow for sshcfg to be nil, which implies no ssh_config found

* update keypath configuration to allow for nil sshcfg

* update HostName configuration to allow for nil sshcfg

* update StrictHostKeyChecking to allow for nil sshcfg

* update UserKnownHostsFile to allow for nil sshcfg

* update HostKeyAlgorithms to allow for nil sshcfg

* remove ProxyCommand from main codepath - while still warning of use

* update ProxyJump config to allow for nil sshcfg

* update user config to allow for nil sshcfg while also simplifying

the code would previously get the current user (current execution's context)
whereas the cfg.User value already takes the ConnectionURI's set username which
should be sufficient as a default value check.

* fix spelling mistake

* why commit once when twice would suffice
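
The commit notes above describe ignoring an unreadable ssh_config and letting sshcfg be nil. A minimal sketch of that pattern, under stated assumptions, could look like the following. The sshConfig type, loadSSHConfig, and get are hypothetical names, the flat key/value parsing ignores Host blocks, and none of this is the provider's actual code.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// sshConfig is a hypothetical, flat key/value view of an ssh config file.
// Real ssh_config files have Host blocks and patterns; this sketch ignores them.
type sshConfig struct {
	values map[string]string
}

// loadSSHConfig returns nil when the file cannot be opened, so a missing
// config is ignored (with a warning) rather than treated as fatal.
func loadSSHConfig(path string) *sshConfig {
	f, err := os.Open(path)
	if err != nil {
		fmt.Fprintf(os.Stderr, "warning: ignoring ssh config %s: %v\n", path, err)
		return nil
	}
	defer f.Close()

	cfg := &sshConfig{values: map[string]string{}}
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		// Keep "Key Value..." lines, skip blanks and comments.
		if len(fields) >= 2 && !strings.HasPrefix(fields[0], "#") {
			cfg.values[strings.ToLower(fields[0])] = strings.Join(fields[1:], " ")
		}
	}
	return cfg
}

// get is safe to call on a nil *sshConfig and falls back to def.
func (c *sshConfig) get(key, def string) string {
	if c == nil {
		return def
	}
	if v, ok := c.values[strings.ToLower(key)]; ok {
		return v
	}
	return def
}

func main() {
	// Assumed path that does not exist, to exercise the nil branch.
	cfg := loadSSHConfig("/nonexistent/ssh_config")
	fmt.Println("StrictHostKeyChecking:", cfg.get("StrictHostKeyChecking", "yes"))
}
```

Putting the nil check inside the accessor means each setting (keypath, HostName, UserKnownHostsFile, and so on) gets the same fallback behaviour without a guard at every call site.
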
@memetb
Contributor

memetb commented Oct 22, 2024

@mvrk69 can you confirm the new release has fixed this?

@mvrk69
Author

mvrk69 commented Oct 23, 2024

Hi, yes, it seems to be fixed now.

I tried v0.8.1, removed ~/.ssh/config, and it didn't crash when I ran terraform plan.

@memetb
Contributor

memetb commented Oct 23, 2024

Thanks, please go ahead and close the issue then.

@mvrk69 mvrk69 closed this as completed Oct 23, 2024
@mattsn0w

HostKeyAlgorithms ecdsa-sha2-nistp256

I am re-deploying a set of VMs on a machine (x86_64 Ubuntu Server 20.04.6) and hit the error "ssh: handshake failed: knownhosts: key mismatch" on both 0.8.0 and 0.8.1. Setting the HostKeyAlgorithms (HKA) option in ~/.ssh/config resolves the issue for me.

Let me know if you would like any debug output from the environment.

@memetb
Contributor

memetb commented Oct 24, 2024

@mattsn0w this is "expected and proper behaviour", however I realize it may not be a good user experience.

As mentioned above, there is also a problem with the go library implementation itself: it fails on the first host key mismatch, whereas it should succeed on any match in the list (see here).

In particular, I find this a poor abstraction leak because it means I can't upgrade HKA on servers transparently: the clients need to know where the server is at...

Do you have any thoughts on how this would be a better UX for you as a user of the library? For instance, I can see a minor improvement which would be to make the error explicitly mention "HostKeyAlgorithms" as a place to check.
