
piraeus-controller-0 cannot connect to etcd #20

Closed
mpepping opened this issue Mar 11, 2020 · 2 comments

@mpepping

On deploying the Getting started example, the piraeus-controller-0 ends in a CrashLoopBackOff, with the error message:

ERROR LINSTOR/Controller - SYSTEM - No connection to ETCD server [Report number 5E69444E-00000-000000]

Etcd is deployed and accessible via its service. Suggestions on how to debug or resolve this issue are welcome.

Logging of the piraeus-controller-0 pod:

kubectl -n kube-system logs -f piraeus-controller-0
LINSTOR, Module Controller
Version:            1.4.2 (974dfcad291e1f683941ada3d7e7337821060349)
Build time:         2020-01-27T11:15:32+00:00
Java Version:       11
Java VM:            Debian, Version 11.0.6+10-post-Debian-1deb10u1
Operating system:   Linux, Version 3.10.0-1062.9.1.el7.x86_64
Environment:        amd64, 1 processors, 247 MiB memory reserved for allocations

System components initialization in progress

20:04:31.119 [main] INFO  LINSTOR/Controller - SYSTEM - Log directory set to: '/var/log/linstor-controller'
20:04:31.295 [main] INFO  LINSTOR/Controller - SYSTEM - Linstor configuration file loaded from '/etc/linstor/linstor.toml'.
20:04:31.296 [Main] INFO  LINSTOR/Controller - SYSTEM - Loading API classes started.
20:04:32.691 [Main] INFO  LINSTOR/Controller - SYSTEM - API classes loading finished: 1350ms
20:04:32.692 [Main] INFO  LINSTOR/Controller - SYSTEM - Dependency injection started.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.inject.internal.cglib.core.$ReflectUtils$1 (file:/usr/share/linstor-server/lib/guice-4.2.2.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
WARNING: Please consider reporting this to the maintainers of com.google.inject.internal.cglib.core.$ReflectUtils$1
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20:04:36.029 [Main] INFO  LINSTOR/Controller - SYSTEM - Dependency injection finished: 3337ms
20:04:37.003 [Main] INFO  LINSTOR/Controller - SYSTEM - Initializing authentication subsystem
20:04:37.533 [Main] INFO  LINSTOR/Controller - SYSTEM - Initializing the etcd database
20:04:49.655 [Main] ERROR LINSTOR/Controller - SYSTEM - No connection to ETCD server [Report number 5E69444E-00000-000000]

20:04:49.657 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Shutdown in progress
20:04:49.658 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Shutting down service instance 'ETCDDatabaseService' of type ETCDDatabaseService
20:04:49.662 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Waiting for service instance 'ETCDDatabaseService' to complete shutdown
20:04:49.662 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Shutting down service instance 'TaskScheduleService' of type TaskScheduleService
20:04:49.662 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Waiting for service instance 'TaskScheduleService' to complete shutdown
20:04:49.662 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Shutting down service instance 'TimerEventService' of type TimerEventService
20:04:49.663 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Waiting for service instance 'TimerEventService' to complete shutdown
20:04:49.663 [Thread-0] INFO  LINSTOR/Controller - SYSTEM - Shutdown complete
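
The report number in that last message refers to a LINSTOR error report written to the log directory shown at startup. Dumping it usually reveals the underlying connection exception; a sketch, assuming the container stays up long enough to exec into and that the report filename matches the number (the exact name may differ):

kubectl -n kube-system exec piraeus-controller-0 -- ls /var/log/linstor-controller
kubectl -n kube-system exec piraeus-controller-0 -- cat /var/log/linstor-controller/ErrorReport-5E69444E-00000-000000.log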

Etcd and other piraeus services start OK:

kube-system     piraeus-controller-0                       0/1     Error       1          66s
kube-system     piraeus-csi-controller-0                   5/5     Running     0          60m
kube-system     piraeus-csi-node-nstql                     2/2     Running     0          60m
kube-system     piraeus-csi-node-qv6rv                     2/2     Running     0          60m
kube-system     piraeus-csi-node-swzmv                     2/2     Running     0          60m
kube-system     piraeus-etcd-0                             1/1     Running     0          60m
kube-system     piraeus-etcd-1                             1/1     Running     0          60m
kube-system     piraeus-etcd-2                             1/1     Running     0          60m
kube-system     piraeus-node-6gqhs                         0/1     Init:0/1    1          60m
kube-system     piraeus-node-k64d5                         0/1     Init:0/1    1          60m
kube-system     piraeus-node-qczk6                         0/1     Init:0/1    1          60m

Etcd is accessible via the svc:

# etcdctl --endpoints=http://10.43.119.148:2379 cluster-health
member 150e6aaeb9e31b16 is healthy: got healthy result from http://piraeus-etcd-0.piraeus-etcd:2379
member ad8612fde261bb43 is healthy: got healthy result from http://piraeus-etcd-2.piraeus-etcd:2379
member e41ccd855ab0c8bf is healthy: got healthy result from http://piraeus-etcd-1.piraeus-etcd:2379
cluster is healthy
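
One more check that might narrow things down: the controller reaches etcd through the service DNS name rather than the cluster IP, so resolving the name and querying the /health endpoint from a throwaway pod in the same namespace exercises the same path. A sketch (service name taken from the member URLs above; the busybox image is an assumption):

kubectl -n kube-system run etcd-test --rm -it --restart=Never --image=busybox -- nslookup piraeus-etcd
kubectl -n kube-system run etcd-test --rm -it --restart=Never --image=busybox -- wget -qO- http://piraeus-etcd:2379/health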
@mariusrugan

@mpepping can I suggest rolling back to the old (legacy) iptables? I have a hunch I've seen this before, not with this particular error, but with a connectivity failure involving CoreDNS.

With Debian 10 as my k8s worker, I was running this:

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
update-alternatives --set arptables /usr/sbin/arptables-legacy
update-alternatives --set ebtables /usr/sbin/ebtables-legacy

For example: rancher/rke#1788

Not sure if you can try this without killing the container, just to debug.
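
If it helps, you can check which backend a node is currently using before switching; the nftables-based binary reports "nf_tables" in its version string, while the legacy one reports "legacy":

iptables --version
update-alternatives --display iptables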

@mpepping

Thanks for the suggestion, @mariusrugan. Swapping out firewalld for (legacy) iptables resolved the issue.
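
For anyone hitting the same thing: the kernel string in the log above suggests CentOS/RHEL 7 worker nodes, where the swap is roughly the following (a sketch under that assumption, not the exact commands used here):

systemctl stop firewalld
systemctl disable firewalld
yum install -y iptables-services
systemctl enable iptables
systemctl start iptables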
