Innernet fails with "Decode error occurred: Failed to parse message with type 16" #303

Closed
odrling opened this issue Feb 24, 2024 · 11 comments · Fixed by #324
Labels
bug Something isn't working

Comments

@odrling
Contributor

odrling commented Feb 24, 2024

innernet still sets up the interface correctly, or at least it appears to: the hosts are still pingable, though I may be missing something.

❯ /usr/local/bin/innernet -vv --mtu 1420 up japanet7 --interval 60 --daemon
[*] bringing up interface japanet7.
[D] set address 10.100.0.6/16 on interface japanet7
[D] set interface japanet7 up with mtu 1420
[D] route 10.100.0.6/16 already existed.
[*] fetching state for japanet7 from server...
[D] [ureq::stream] connecting to 10.100.0.2:61820 at 10.100.0.2:61820
[D] [ureq::stream] created stream: Stream(TcpStream { addr: 10.100.0.6:4204, peer: 10.100.0.2:61820, fd: 4 })
[D] [ureq::unit] sending request GET http://10.100.0.2:61820/v1/user/state
[D] [ureq::unit] writing prelude: GET /v1/user/state HTTP/1.1
Host: 10.100.0.2:61820
User-Agent: ureq/2.9.6
Accept: */*
X-Innernet-Server-Key: REDACTED
[D] [ureq::response] Body entirely buffered (length: 2196)
[D] [ureq::pool] adding stream to pool: http|10.100.0.2|61820 -> Stream(TcpStream { addr: 10.100.0.6:4204, peer: 10.100.0.2:61820, fd: 4 })
[D] [ureq::unit] response 200 to GET http://10.100.0.2:61820/v1/user/state
[D] [wireguard_control::backends::kernel] get_by_name: got 1 response message(s) from netlink request
[D] [wireguard_control::backends::kernel] get_by_name: parsed wireguard device japanet7 with 4 peer(s)

[*] updated interface japanet7

[D] [ureq::stream] dropping stream: Stream(TcpStream { addr: 10.100.0.6:4204, peer: 10.100.0.2:61820, fd: 4 })

[E] Decode error occurred: Failed to parse message with type 16

NOTE: this also happens with no options set on the command line; that is just the command taken straight from my service file.

innernet show also fails with just this error:

❯ innernet -vv show

[E] Decode error occurred: Failed to parse message with type 16

innernet fetch doesn't consider the interface to be up.

❯ innernet fetch japanet7

[E] Interface is not up. Use 'innernet up japanet7' instead

The issue seems to have appeared on my server after a kernel update (vendored SBC kernel 5.10 → mainline 6.8). A friend also got this error after updating to Alpine Linux 3.19, which would be a kernel change from 6.1 to 6.6.

❯ innernet --version
innernet 1.6.1

strohel added the bug label on Feb 24, 2024
@odrling
Contributor Author

odrling commented Feb 26, 2024

I've now looked into it a bit more. This happens when calling get_local_addrs, and the error comes from https://github.com/rust-netlink/netlink-packet-core/blob/91e71b69fe8d94a8ae7e2748b0443272c6c3307c/src/message.rs#L116.
It seems related to k3s, as the error does not occur when I stop k3s on my server.
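
For context on the number in the error: assuming it is the message type from the netlink header, type 16 in the kernel's rtnetlink protocol is RTM_NEWLINK, the message that describes a network interface. A minimal sketch of that mapping follows. The values are taken from linux/rtnetlink.h; the function name `rtnetlink_type_name` is hypothetical and is not part of innernet or netlink-packet-core.

```rust
/// Map an rtnetlink message type number to its kernel constant name.
/// Only a few common types are listed; the values come from
/// <linux/rtnetlink.h>. This helper is illustrative only and does not
/// exist in innernet or in the netlink-packet-* crates.
fn rtnetlink_type_name(message_type: u16) -> Option<&'static str> {
    match message_type {
        16 => Some("RTM_NEWLINK"),  // link (interface) description
        17 => Some("RTM_DELLINK"),
        18 => Some("RTM_GETLINK"),
        20 => Some("RTM_NEWADDR"),  // address description
        21 => Some("RTM_DELADDR"),
        22 => Some("RTM_GETADDR"),
        24 => Some("RTM_NEWROUTE"), // route description
        _ => None,                  // not covered by this sketch
    }
}
```

Calling `rtnetlink_type_name(16)` returns `Some("RTM_NEWLINK")`, which suggests the decoder is failing on a link description rather than on an address message.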

@strohel
Member

strohel commented Feb 26, 2024

@odrling thanks for the investigation. What is k3s?

@odrling
Contributor Author

odrling commented Feb 26, 2024

It's a Kubernetes implementation https://github.com/k3s-io/k3s

@strohel
Member

strohel commented Feb 26, 2024

Interesting. Could it be affecting the kernel's netlink responses somehow? Maybe they are then extended with some virtualization/namespace/cgroup info?
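
To illustrate why extra information in a response could matter: netlink attributes are encoded as type-length-value records (a 16-bit length, a 16-bit type, then the payload, padded to 4 bytes). A parser that rejects types it does not know will fail on the whole message, while one that keeps them as raw bytes will not. The sketch below shows the tolerant approach. It is an assumption about the general failure mode, not the code in netlink-packet-route or the eventual fix, and `RawAttr` and `parse_attrs` are hypothetical names.

```rust
/// One netlink attribute: its type tag plus the raw payload bytes.
/// Hypothetical type for illustration; not from any netlink crate.
#[derive(Debug, PartialEq)]
struct RawAttr<'a> {
    kind: u16,
    payload: &'a [u8],
}

/// Walk a buffer of netlink attributes laid out like the kernel's
/// `struct nlattr`: u16 length (header included), u16 type, payload,
/// then padding up to a 4-byte boundary. Fields use native byte order.
/// Unknown types are returned as raw bytes instead of causing an error.
fn parse_attrs(mut buf: &[u8]) -> Result<Vec<RawAttr<'_>>, String> {
    const HEADER_LEN: usize = 4;
    let mut attrs = Vec::new();
    while !buf.is_empty() {
        if buf.len() < HEADER_LEN {
            return Err(format!("truncated header: {} byte(s) left", buf.len()));
        }
        let len = u16::from_ne_bytes([buf[0], buf[1]]) as usize;
        let kind = u16::from_ne_bytes([buf[2], buf[3]]);
        if len < HEADER_LEN || len > buf.len() {
            return Err(format!("invalid attribute length {len}"));
        }
        attrs.push(RawAttr { kind, payload: &buf[HEADER_LEN..len] });
        // Advance to the next attribute, honoring 4-byte alignment.
        let aligned = (len + 3) & !3;
        buf = &buf[aligned.min(buf.len())..];
    }
    Ok(attrs)
}
```

With this shape, a caller can match on the `kind` values it understands and ignore the rest, so a newer kernel adding attributes does not break decoding.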

@Santonclause

I'm experiencing the same issue, and it started after installing k3s. Stopping the k3s service does not help.

@odrling
Contributor Author

odrling commented Sep 5, 2024

After looking into it more, it seems to be related to this upstream issue: rust-netlink/netlink-packet-route#54

Upgrading to 0.18+ would probably fix this, but it seems they changed the API in 0.18, so it's not just the flip of a switch.

odrling added a commit to odrling/innernet that referenced this issue Oct 14, 2024
odrling added a commit to odrling/innernet that referenced this issue Oct 21, 2024
strohel pushed a commit that referenced this issue Oct 24, 2024
@Santonclause

Santonclause commented Oct 29, 2024

The update did not solve the problem for me:

> sudo innernet list
[E] Decode error occurred: Failed to parse message with type 16
> sudo innernet --version
innernet 1.6.1

Is there anything I can do to troubleshoot this?

I also run k3s on that server.

@strohel
Member

strohel commented Oct 29, 2024

@Santonclause sorry for the Captain Obvious question, but are you using a recent enough build from git main that has the fix? It hasn't been released in any version yet.

(unfortunately innernet --version returns 1.6.1 for both the last released version and current git main, so that cannot be used to tell which is which)

@Santonclause

> @Santonclause sorry for the Captain Obvious question, but are you using a recent enough build from git main that has the fix? It hasn't been released in any version yet.
>
> (unfortunately innernet --version returns 1.6.1 for both the last released version and current git main, so that cannot be used to tell which is which)

@strohel, I used the 1.6.1 release from https://github.com/tonarino/innernet/releases and built from that. I assume this is not correct, then? Should I just clone main and build from source?

@odrling
Contributor Author

odrling commented Oct 29, 2024

@Santonclause Yes, you should try cloning the repo and building from the main branch.
The issue got closed when the fix was merged a few days ago and it is not yet in a release.

I used k3s as a test case when I wrote the fix, so you should not run into this error with a build from main.

@Santonclause

Thank you @odrling! Yes, I just rebuilt from the main branch and the issue is fixed. Thank you for the contribution!

sqrtsanta pushed a commit to targetaidev/innernet that referenced this issue Nov 14, 2024