This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Version 2.7.0 doesn't work on armv7 #3842

Closed
alexgorbatchev opened this issue Aug 11, 2020 · 13 comments

Comments

@alexgorbatchev

What you expected to happen?

Weave image should start on Raspberry Pi 4.

What happened?

  • Docker fails to start with exec user process caused "exec format error"
  • Kubernetes pods fail to start with Illegal instruction (core dumped)

How to reproduce it?

$ sudo docker run weaveworks/weave
sudo: unable to resolve host cluster-node-06: No address associated with hostname
Unable to find image 'weaveworks/weave:latest' locally
latest: Pulling from weaveworks/weave
e99479a55eec: Pull complete
b804f4074063: Pull complete
cea6abc40570: Pull complete
1c2a82708fcf: Pull complete
cfc44e9357af: Pull complete
65ba54c6d6f1: Pull complete
Digest: sha256:d5c94652f363336f6daaac6d0ca751e93e63f50298c59e354f8a0270407ff1d0
Status: Downloaded newer image for weaveworks/weave:latest
standard_init_linux.go:211: exec user process caused "exec format error"
HypriotOS/armv7: pirate@cluster-node-06 in ~


$ sudo docker run weaveworks/weave:2.6.5
sudo: unable to resolve host cluster-node-06: No address associated with hostname
Unable to find image 'weaveworks/weave:2.6.5' locally
2.6.5: Pulling from weaveworks/weave
e99479a55eec: Already exists
39cdb2a7fd10: Pull complete
cc790d3c80d6: Pull complete
06d20421373e: Pull complete
621686069f24: Pull complete
44bef99a19a2: Pull complete
Digest: sha256:a4ca9a2a74f97b318376956151dfe3172a5b66a9ad24ae0d849db4cabebfe745
Status: Downloaded newer image for weaveworks/weave:2.6.5
FATA: 2020/08/11 05:44:21.731568 All system IDs are blank
HypriotOS/armv7: pirate@cluster-node-06 in ~

Anything else we need to know?

Versions:

$ weave version
n/a



$ docker version
Client: Docker Engine - Community
 Version:           19.03.11
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        42e35e6
 Built:             Mon Jun  1 09:20:15 2020
 OS/Arch:           linux/arm
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.8
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.17
  Git commit:       afacb8b
  Built:            Wed Mar 11 01:29:22 2020
  OS/Arch:          linux/arm
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683



$ uname -a
Linux cluster-node-06 4.19.118-v7l+ #1311 SMP Mon Apr 27 14:26:42 BST 2020 armv7l GNU/Linux



$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.6-beta.0", GitCommit:"e7f962ba86f4ce7033828210ca3556393c377bcc", GitTreeState:"clean", BuildDate:"2020-01-15T08:26:26Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.6", GitCommit:"dff82dc0de47299ab66c83c626e08b245ab19037", GitTreeState:"clean", BuildDate:"2020-07-15T16:51:04Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/arm"}
@bboreham
Contributor

Your first example is running weaveworks/weave:latest not 2.7.0.
I tried following the installation instructions and I got 2.7.0.

The weaveworks/weave-arm:latest image seems to match the amd64 image, which is certainly a concern, but not the one you reported.
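
One way to check whether a pulled image matches the host is to compare the architecture Docker records for the image with the machine name the kernel reports. The sketch below assumes the Docker CLI is installed and the image is already pulled. The function names `machine_to_docker_arch` and `check_image_arch` are made up for illustration, and the mapping covers only common cases.

```shell
# Map a kernel machine name (as printed by `uname -m`) to the architecture
# string Docker uses. Only common values are covered; anything else is
# passed through unchanged.
machine_to_docker_arch() {
  case "$1" in
    x86_64)        echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    armv6l|armv7l) echo arm ;;
    *)             echo "$1" ;;
  esac
}

# Hypothetical helper: report whether an already-pulled image matches the host.
# Returns 0 on match, 1 on mismatch, 2 if the image could not be inspected.
check_image_arch() {
  image=$1
  host_arch=$(machine_to_docker_arch "$(uname -m)")
  image_arch=$(docker image inspect --format '{{.Architecture}}' "$image") || return 2
  if [ "$image_arch" = "$host_arch" ]; then
    echo "ok: $image is $image_arch"
  else
    echo "mismatch: $image is $image_arch, host is $host_arch"
    return 1
  fi
}
```

For example, `check_image_arch weaveworks/weave:latest` on the armv7 node. A mismatch would be consistent with the "exec format error" in the report; it would not by itself explain an "Illegal instruction" from an image of the right architecture.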

@sfxworks

? Latest currently matches 2.7.0
I ran kubectl create -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')", which uses these images:

root@raspberrypi:~# curl -L "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')" | grep image
              image: 'docker.io/weaveworks/weave-kube:2.7.0'
              image: 'docker.io/weaveworks/weave-npc:2.7.0'

This fails the same way: running kubectl logs -f (weave-pod) -c weave shows Illegal instruction (core dumped).

Current workaround is to use the 2.6.5 image.
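
A sketch of that workaround, assuming the manifest served at the URL above only needs its image tags changed. This thread does not confirm that the 2.7.0 manifest is otherwise compatible with 2.6.5, and `pin_weave_images` is a hypothetical helper name.

```shell
# Rewrite the weave-kube and weave-npc image tags in a manifest read from
# stdin, pinning both to the tag given as the first argument.
pin_weave_images() {
  sed -E "s#(weaveworks/weave-(kube|npc)):[0-9A-Za-z._-]+#\1:$1#g"
}

# Example (not run here; applies the rewritten manifest to the cluster):
#   curl -L "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')" \
#     | pin_weave_images 2.6.5 \
#     | kubectl apply -f -
```

Rewriting the manifest rather than editing the running DaemonSet keeps the pinned version in a file that can be reviewed and reapplied.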

@bboreham
Contributor

Can you run docker history weaveworks/weave:2.7.0 on a machine where it is failing and post the output please.

@qbasicer

qbasicer commented Aug 17, 2020

I'm seeing this as well when moving from 2.6.2 -> 2.7.0.

kube-system     weave-net-7kgbv                                                   2/2     Running            0          7m26s   10.71.0.219   k8node2    <none>           <none>
kube-system     weave-net-bw7z2                                                   1/2     CrashLoopBackOff   6          7m26s   10.71.0.217   k8master   <none>           <none>
kube-system     weave-net-hmm8n                                                   1/2     CrashLoopBackOff   6          7m26s   10.71.0.218   k8node1    <none>           <none>

(k8node2 is aarch64, k8master & k8node1 are armv7)

qbasicer@k8master:~$ kubectl describe pod weave-net-bw7z2 -n kube-system
Name:                 weave-net-bw7z2
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 k8master/10.71.0.217
Start Time:           Mon, 17 Aug 2020 14:54:13 -0400
Labels:               controller-revision-hash=8489f68cdc
                      name=weave-net
                      pod-template-generation=1
Annotations:          <none>
Status:               Running
IP:                   10.71.0.217
IPs:
  IP:           10.71.0.217
Controlled By:  DaemonSet/weave-net
Containers:
  weave:
    Container ID:  docker://5935853f93bd91b194970a39c22aaa1011a97453126e5fee85f8a44c65083556
    Image:         docker.io/weaveworks/weave-kube:2.7.0
    Image ID:      docker-pullable://weaveworks/weave-kube@sha256:f5488cc0e18f0df33aace12f13b5d7c479e3202ae4baf3971b7572d9c9e8fa0a
    Port:          <none>
    Host Port:     <none>
    Command:
      /home/weave/launch.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    132
      Started:      Mon, 17 Aug 2020 15:00:38 -0400
      Finished:     Mon, 17 Aug 2020 15:00:38 -0400
    Ready:          False
    Restart Count:  6
    Requests:
      cpu:      50m
      memory:   100Mi
    Readiness:  http-get http://127.0.0.1:6784/status delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      HOSTNAME:   (v1:spec.nodeName)
    Mounts:
      /host/etc from cni-conf (rw)
      /host/home from cni-bin2 (rw)
      /host/opt from cni-bin (rw)
      /host/var/lib/dbus from dbus (rw)
      /lib/modules from lib-modules (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from weave-net-token-rkq7q (ro)
      /weavedb from weavedb (rw)
  weave-npc:
    Container ID:   docker://76aa01b07961d87e1f0be68fb2b1bf9db418dcfd2db6b791e3d521594e197593
    Image:          docker.io/weaveworks/weave-npc:2.7.0
    Image ID:       docker-pullable://weaveworks/weave-npc@sha256:4ae4204ead601e63447f8d004c6fd4490a44155d68cce11503eedf94f4b4c4ce
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 17 Aug 2020 14:54:34 -0400
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     50m
      memory:  100Mi
    Environment:
      HOSTNAME:   (v1:spec.nodeName)
    Mounts:
      /run/xtables.lock from xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from weave-net-token-rkq7q (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  weavedb:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/weave
    HostPathType:
  cni-bin:
    Type:          HostPath (bare host directory volume)
    Path:          /opt
    HostPathType:
  cni-bin2:
    Type:          HostPath (bare host directory volume)
    Path:          /home
    HostPathType:
  cni-conf:
    Type:          HostPath (bare host directory volume)
    Path:          /etc
    HostPathType:
  dbus:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/dbus
    HostPathType:
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  FileOrCreate
  weave-net-token-rkq7q:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  weave-net-token-rkq7q
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     :NoSchedule
                 :NoExecute
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  <unknown>              default-scheduler  Successfully assigned kube-system/weave-net-bw7z2 to k8master
  Normal   Pulled     8m3s                   kubelet, k8master  Container image "docker.io/weaveworks/weave-npc:2.7.0" already present on machine
  Normal   Created    8m2s                   kubelet, k8master  Created container weave-npc
  Normal   Started    8m1s                   kubelet, k8master  Started container weave-npc
  Normal   Pulled     6m22s (x5 over 8m15s)  kubelet, k8master  Container image "docker.io/weaveworks/weave-kube:2.7.0" already present on machine
  Normal   Created    6m21s (x5 over 8m6s)   kubelet, k8master  Created container weave
  Normal   Started    6m20s (x5 over 8m3s)   kubelet, k8master  Started container weave
  Warning  BackOff    3m4s (x23 over 7m56s)  kubelet, k8master  Back-off restarting failed container
qbasicer@k8master:~$ kubectl logs weave-net-bw7z2 weave -n kube-system
Illegal instruction (core dumped)
qbasicer@k8master:~$
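
The exit code 132 in the pod description and the "Illegal instruction" log line describe the same event. By the usual convention, an exit status above 128 means the process was killed by signal number (status - 128), and signal 4 is SIGILL. A small sketch of that arithmetic, using the hypothetical helper name `exit_code_to_signal`:

```shell
# Print the signal name behind a "128 + signal number" exit status.
# `kill -l N` prints the name of signal N without the SIG prefix.
exit_code_to_signal() {
  code=$1
  if [ "$code" -gt 128 ]; then
    kill -l "$((code - 128))"
  else
    echo "not a signal exit"
  fi
}

exit_code_to_signal 132   # 132 - 128 = 4, prints ILL on Linux
```

The same calculation maps 139 to SEGV (signal 11), the status a segmentation fault would produce.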

weaveworks/weave:2.7.0 wasn't on my system, but it looks like it's actually using weaveworks/weave-kube:2.7.0.

qbasicer@k8master:~$ docker history weaveworks/weave-kube:2.7.0
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
f58a4b249316        12 days ago         /bin/sh -c #(nop)  LABEL org.opencontainers.…   0B
<missing>           12 days ago         /bin/sh -c #(nop)  ARG revision                 0B
<missing>           12 days ago         /bin/sh -c #(nop)  ENTRYPOINT ["/home/weave/…   0B
<missing>           12 days ago         /bin/sh -c #(nop) ADD multi:2e1725d49d05bc74…   25.7MB
<missing>           12 days ago         /bin/sh -c #(nop)  LABEL maintainer=Weavewor…   0B
<missing>           12 days ago         /bin/sh -c #(nop)  LABEL org.opencontainers.…   0B
<missing>           12 days ago         /bin/sh -c #(nop)  ARG revision                 0B
<missing>           12 days ago         /bin/sh -c #(nop) WORKDIR /home/weave           0B
<missing>           12 days ago         /bin/sh -c #(nop)  ENTRYPOINT ["/home/weave/…   0B
<missing>           12 days ago         /bin/sh -c #(nop) ADD file:841615bdd8b70809d…   0B
<missing>           12 days ago         /bin/sh -c #(nop) ADD file:2c70ccb528207d724…   10.6MB
<missing>           12 days ago         /bin/sh -c #(nop) ADD multi:4a95425ab021dcf1…   36.9MB
<missing>           12 days ago         /bin/sh -c apk add --update     curl     ipt…   9.53MB
<missing>           2 months ago        /bin/sh -c #(nop) COPY file:ed2b07e44c7bdea9…   3.28MB
<missing>           2 months ago        /bin/sh -c #(nop)  LABEL works.weave.role=sy…   0B
<missing>           6 months ago        /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>           6 months ago        /bin/sh -c #(nop) ADD file:78bb3e8b6b95733f2…   4.01MB
qbasicer@k8master:~$

I pulled it regardless (probably the same image with multiple tags?)

qbasicer@k8master:~$ docker history weaveworks/weave:2.7.0
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
8e3df95470ab        12 days ago         /bin/sh -c #(nop)  LABEL org.opencontainers.…   0B
<missing>           12 days ago         /bin/sh -c #(nop)  ARG revision                 0B
<missing>           12 days ago         /bin/sh -c #(nop) WORKDIR /home/weave           0B
<missing>           12 days ago         /bin/sh -c #(nop)  ENTRYPOINT ["/home/weave/…   0B
<missing>           12 days ago         /bin/sh -c #(nop) ADD file:841615bdd8b70809d…   0B
<missing>           12 days ago         /bin/sh -c #(nop) ADD file:2c70ccb528207d724…   10.6MB
<missing>           12 days ago         /bin/sh -c #(nop) ADD multi:4a95425ab021dcf1…   36.9MB
<missing>           12 days ago         /bin/sh -c apk add --update     curl     ipt…   9.53MB
<missing>           2 months ago        /bin/sh -c #(nop) COPY file:ed2b07e44c7bdea9…   3.28MB
<missing>           2 months ago        /bin/sh -c #(nop)  LABEL works.weave.role=sy…   0B
<missing>           6 months ago        /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>           6 months ago        /bin/sh -c #(nop) ADD file:78bb3e8b6b95733f2…   4.01MB
qbasicer@k8master:~$

@sfxworks

Yeah, using kubeadm mine was also showing weaveworks/weave-kube:2.7.0. Same results as above.

@FoxRomeo

Hi
I have the same problem with 2.7.0 (and running fine with 2.6.5).
(RPi3 / Hypriot v1.12.3 / Kernel 5.4.51-v7+ / k8s 1.18.8)

I could pin it down to weaveworks/weave-kube:2.7.0 /home/weave/weaver

On ARM systems I just see the illegal instruction, but with x86 qemu-arm and strace I could locate it here:
rt_sigaction(SIGRT_31, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_31, {sa_handler=0x6004f960, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x6015c3d0}, NULL, 8) = 0
arch_prctl(ARCH_SET_GS, 0x7fd08d3cf000) = 0
mprotect(0x62574000, 4096, PROT_NONE) = 0
write(2, "qemu: unhandled CPU exception 0x"..., 45qemu: unhandled CPU exception 0x2 - aborting
) = 45

while weaver from 2.6.5 does this here:
rt_sigaction(SIGRT_31, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_31, {sa_handler=0x6004f960, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x6015c3d0}, NULL, 8) = 0
arch_prctl(ARCH_SET_GS, 0x7f3f2c529000) = 0
mprotect(0x62574000, 4096, PROT_NONE) = 0
mmap(0x7f3f2e9c0000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f3f2e9c0000
uname({sysname="Linux", nodename="XxX", ...}) = 0
set_tid_address(0x7f3f2e9c0068) = 9532

As 2.6.5 is working, the question is what changed in this part of the code (or in how the code was compiled).
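
For this kind of inspection, the weaver binary can be copied out of the image without starting the container. This is a sketch assuming the Docker CLI and an image that is already present locally. `extract_from_image` is a hypothetical helper, and the `DOCKER` variable exists only so the command can be swapped out for a dry run.

```shell
# Copy a single file out of an image by creating (not starting) a container,
# copying from it, and removing it again.
#   $1 = image, $2 = path inside the image, $3 = destination on the host
extract_from_image() {
  image=$1; path=$2; dest=$3
  cid=$("${DOCKER:-docker}" create "$image") || return 1
  "${DOCKER:-docker}" cp "$cid:$path" "$dest"
  status=$?
  "${DOCKER:-docker}" rm "$cid" >/dev/null
  return "$status"
}

# Example (not run here):
#   extract_from_image weaveworks/weave-kube:2.7.0 /home/weave/weaver ./weaver-2.7.0
#   extract_from_image weaveworks/weave-kube:2.6.5 /home/weave/weaver ./weaver-2.6.5
```

With both versions extracted side by side, the two binaries can be traced under the same conditions and the outputs compared.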

docker inspect weaveworks/weave-kube:2.7.0
[
{
"Id": "sha256:f58a4b2493168581989a2464e299705e50d8202b09ccd6c24f7f15e4041552ed",
"RepoTags": [
"weaveworks/weave-kube:2.7.0"
],
"RepoDigests": [
"weaveworks/weave-kube@sha256:f5488cc0e18f0df33aace12f13b5d7c479e3202ae4baf3971b7572d9c9e8fa0a"
],
"Parent": "",
"Comment": "",
"Created": "2020-08-05T10:57:36.937080898Z",
"Container": "14c920e6fa53a1700ae3895d778f0d58ea55b8f3ccba02903c3794c843b9c922",

@fastlorenzo

fastlorenzo commented Aug 25, 2020

Same error as @FoxRomeo with RPi4 5.4.51-v7l+ (k8s: v1.18.8), works with weave 2.6.5

@boeremak

New RPi4 cluster: after installing Weave, all my Weave pods were in CrashLoopBackOff with exit code 132.

Installed 2.6.5 instead of 2.7.0 and everything is working.

@alexgorbatchev
Author

As a side note, I switched my RPi4 cluster to Ubuntu 20 server arm64 and found compatibility to be much better across the board. Strongly recommend.

@obeyler

obeyler commented Nov 14, 2020

same issue on Rpi3, same workaround (use 2.6.5)

@bboreham
Contributor

I fired up a 'C1' machine on Scaleway and tried to run 2.7.0; it is an ARM 32-bit binary but crashed immediately with segmentation fault on startup. I don't get an illegal instruction.
Same story with 2.8.0 just released.
It apparently doesn't even get as far as the Go runtime.

I rebuilt the Weave Net daemon (weaver) natively on the ARM host, and it also crashes with segfault.
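
Whether a binary really is an ARM 32-bit executable can be confirmed from its ELF header, for example with `file` or `readelf -h`. Where neither tool is available, the sketch below reads the `e_machine` field directly. It assumes a little-endian ELF file and recognises only a few machine types; `elf_machine` is a hypothetical name.

```shell
# Print the target machine of an ELF file by reading the 16-bit e_machine
# field at byte offset 18 of the header (little-endian byte order assumed).
elf_machine() {
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  [ "$magic" = "7f454c46" ] || { echo "not an ELF file"; return 1; }
  machine=$(od -An -tx1 -j18 -N2 "$1" | tr -d ' \n')
  case "$machine" in
    0300) echo "i386" ;;      # EM_386     = 3
    2800) echo "arm" ;;       # EM_ARM     = 40
    3e00) echo "x86-64" ;;    # EM_X86_64  = 62
    b700) echo "aarch64" ;;   # EM_AARCH64 = 183
    *)    echo "unknown ($machine)" ;;
  esac
}
```

A result of `arm` only says the file targets 32-bit ARM. It does not say which ARM instruction set extensions the code inside requires.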

@bboreham
Contributor

If I take away the linker flags for static linking it doesn't crash.
The code from v2.6.5 built with Go 1.15.6 doesn't crash.

After about a day of bisecting I came to the conclusion that it is the newer version of the Kubernetes client-go library which is triggering the crash, somehow in conjunction with CGo and static linking on ARM32.
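
The thread does not say how the bisection was carried out. If a search like this is automated with `git bisect run`, one detail matters for a crash of this kind: `git bisect run` aborts on exit codes above 127, so a test script has to translate a signal death (132 for illegal instruction, 139 for segmentation fault) into an ordinary failure code. A minimal sketch, with `bisect_verdict` as a hypothetical helper:

```shell
# Translate build and run results into the exit codes `git bisect run`
# understands: 0 = good, 1..127 (except 125) = bad, 125 = skip this commit.
#   $1 = exit status of the build step
#   $2 = exit status of running the freshly built binary
bisect_verdict() {
  build_status=$1
  run_status=$2
  if [ "$build_status" -ne 0 ]; then
    echo 125   # could not build: tell bisect to skip this commit
  elif [ "$run_status" -eq 0 ]; then
    echo 0     # ran cleanly: good
  else
    echo 1     # any failure, including crashes such as 132 or 139: bad
  fi
}
```

A test script would run the build and the binary, then finish with `exit "$(bisect_verdict "$build_status" "$run_status")"`.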

The good news is this code can easily be moved to the kube-utils program where it doesn't crash.
I have pushed an image weaveworks/weave-kube-arm:git-ad621c807e2b to DockerHub built from the code at #3885

If someone tries that out please comment here how it went.

@bboreham
Contributor

That change is now released in Weave Net 2.8.1 and I have tested weave launch on a Scaleway ARM machine.


8 participants