Skip to content
This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Weave NPC Container Crashes and Keeps Restarting #3247

Closed
alok87 opened this issue Feb 27, 2018 · 14 comments
Closed

Weave NPC Container Crashes and Keeps Restarting #3247

alok87 opened this issue Feb 27, 2018 · 14 comments

Comments

@alok87
Copy link
Contributor

alok87 commented Feb 27, 2018

What you expected to happen?

I expected weave-npc container to not crash and keep running

What happened?

Our weave npc container keeps crashing with the error show below. And Kubernetes restarts it.

INFO: 2018/02/27 10:29:15.704124 EVENT UpdateNamespace {"metadata":{"creationTimestamp":"2018-02-27T07:24:28Z","deletionTimestamp":"2018-02-27T10:28:23Z","labels":{"creator":"sujan.a"},"name":"helium","resourceVersion":"43844474","selfLink":"/api/v1/namespaces/helium","uid":"3eab7ba3-1b8f-11e8-b606-022ac66104c6"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Terminating"}} {"metadata":{"creationTimestamp":"2018-02-27T07:24:28Z","deletionTimestamp":"2018-02-27T10:28:23Z","labels":{"creator":"sujan.a"},"name":"helium","resourceVersion":"43844836","selfLink":"/api/v1/namespaces/helium","uid":"3eab7ba3-1b8f-11e8-b606-022ac66104c6"},"spec":{},"status":{"phase":"Terminating"}}
INFO: 2018/02/27 10:29:15.718956 EVENT DeleteNamespace {"metadata":{"creationTimestamp":"2018-02-27T07:24:28Z","deletionTimestamp":"2018-02-27T10:28:23Z","labels":{"creator":"sujan.a"},"name":"helium","resourceVersion":"43844836","selfLink":"/api/v1/namespaces/helium","uid":"3eab7ba3-1b8f-11e8-b606-022ac66104c6"},"spec":{},"status":{"phase":"Terminating"}}
DEBU: 2018/02/27 10:29:15.719339 removing default rule in namespace: helium, set weave-IdwzZLfj(%b(2#AY)mt/Me/~X
INFO: 2018/02/27 10:29:15.725883 destroying ipset: &npc.selectorSpec{key:"", selector:labels.internalSelector{}, dst:false, ipsetType:"hash:ip", ipsetName:"weave-V7d@]hQsm@w.FTL;i6Lv]yC8g", nsName:"helium"}
INFO: 2018/02/27 10:31:00.766750 EVENT AddNamespace {"metadata":{"creationTimestamp":"2018-02-27T10:31:00Z","labels":{"creator":"sujan.a"},"name":"helium","resourceVersion":"43845453","selfLink":"/api/v1/namespaces/helium","uid":"4e0aabf6-1ba9-11e8-900b-027f607c7b80"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}
FATA: 2018/02/27 10:31:00.769529 ipset [create weave-IdwzZLfj(%b(2#AY)mt/Me/~X hash:ip comment] failed: ipset v6.32: Set cannot be created: set with the same name already exists
: exit status 1

How to reproduce it?

Not sure how to reproduce it.

Anything else we need to know?

Cloud provider: aws
Kubernetes version: 1.7.6
Weave version: 2.2.0 https://github.com/weaveworks/weave/releases/tag/v2.2.0

@bboreham
Copy link
Contributor

Is that the whole log? There should be a line like this:

	Got list of ipsets: 

@alok87
Copy link
Contributor Author

alok87 commented Feb 27, 2018

@bboreham here is the log for it. Do you need more? huge log it is

INFO: 2018/02/27 10:13:26.512991 Serving /metrics on :6781
Tue Feb 27 10:13:26 2018 <5> ulogd.c:843 building new pluginstance stack: 'log1:NFLOG,base1:BASE,pcap1:PCAP'
DEBU: 2018/02/27 10:13:26.533660 Got list of ipsets: [weave-local-pods weave-+~.n|O}pR=Gb0*T3D?JTmrpGe weave-9Yu/#*JVl^#=tUJ7gk=H$(3=I weave-GDpD%ws;mA;QfL(_6uR=ov5ZI weave-I^@_pae1G]s/58:BZkY(kS;b7 weave-^^ED$nC#|$[AGt}n|$gE{?6W$ weave-6oo[aWLTJi$j;)c~oTv*K;h#t weave-sKf^a;H8z2c;f^X[eK~+Ws;;h weave-{|rO=lfO)@{FrEV8W$K9TW0[] weave-:cW?j*CoJb@k4E]n_f;8n9B;z weave-~%|?v:1wby|]7VB_eQ@p)??qA weave-nJv$i]NEFO~V^1KcDC*6@XxY9 weave-^Z9nhl:@/^l]i6#ZG/BmhYJk7 weave-GffHO0)*H3ezJCl3l#ovJuaIc weave-Y;%)wta+muf}/JjIO]15xHYYg weave-4fgi6zYQgH^:e=V=b%Z?^om=4 weave-%ma0yYb$l+}*LsF!3MV8_+r$F weave-Lw9{.PCg/]g{2:0n~^JJ?YH7H weave-Z]@fL:xRVAfS{JVk?jwC#Gl/T weave-4(~$MwirMmoNB){]Qfk4*W+ZO weave-p:oF57O*$dU1l7.fJw}?kB(]w weave-0EHD/vdN#O4]V?o4Tx7kS;APH weave-4vtqMI+kx/2]jD%_c0S%thO%V weave-Up;6Y+^4UTUS3T4*1C:LxqoCr weave-KkL3mEJ5Yjwg?4GvO(Ws2Gg$t weave-oeX/A4Ab^wUC[|iq]$IF+zlj$ weave-Op1DdkWPr}wFb{dY0bB3Y48Ey weave-d7iM9G1@b[~#(8*hY6i)Qbblp weave-yLCZG1rKs(7gt1q(pYlXVa_!_ weave-I^*nP1mv?;Ojir]0!B=WE~:ax weave-AN4rA+^/*464r%f=gQjSBU[|R weave-tJ*!:Ul4Gr91D%M5gAr^Vx(@U weave-MzxjTY2e!I8_Al?{7BNFT}dWp weave-9xLCl$e5%c/ZuTd^}k7@T4/9$ weave-837a4W4YnsSMy;g/29evzU3qT weave-(c7D)^Xs3lE%m:b#yeRs}W0Rg weave-+QB1_1[;Q3(KFxAS4(NW9Uw)o weave-RXRn$wQZ8S+hbJDPui.~9Cedc weave-_75A@$[j8U?:Mhq_bjEH;)RFQ weave-(6IjSeImE6)O3bfQJY(o[xQQ) weave-kiCQj*+7!l]#H*G2HMw7rX3d% weave-9nOoIgG97Sh+raGQ0foc2uQ%7 weave-X]N+Z(@Yrh($BwO|@v$Yvk8SS weave-.u!IZdBEqMTxZBrswV]x1%YCr weave-r.i@hNne.8L3HP92~lBF)GwT[ weave-rFI9qQgI)Jr.4z4J|5?RZgq7O weave-|ZGIZsi@j}WLpVbjT+lt#W.K+ weave-c:N6lY:fD*=oPD_zMTQqj=2wR weave-#Rp$^^eLQ?7!6];3;kS|]KcPE weave-qHq{iugL7*aEB6NzLBeOMS5em weave-i{PrkcB!)O(x+nEKr??Fv{zoF weave-a;+i_l+cFDaJyQvSiQtU(WMVl weave-E.1.0W^NGSp]0_t5WwH/]gX@L weave-k?Z;25^M}|1s7P3|H9i;*;MhG weave-*@iqKuS=*m;[2lC8wL|$6!!.9 weave-3=rjE.g*h;W]EU|TIzA=aC2d8 weave-1$XQ)fP+z$pX/05^U^mU%tj7= weave-F#?LrsT[K^73+TB^h.}Yam:on weave-lOT$=D7?ITKbbQ|J;3TG](t1U weave-hzhtZ[OF|CWqyyh0X4dL^+Sjz weave-+WvY9b)%Cg!q0j.OMWBw2)yY) weave-JMjt!mzlHOM8$Q6eG$BvNluRT weave-*GtRdi:d$DuMC.e!gkUfXdcrH weave-33.f1Fb|rX@|$]|2**Vm$=;.B weave-A/R=FloN829_:VV/o:M]b[ulw weave-!r~@EqpuMvP6SQ_.nN![$@{_U weave-UsY7aJ5#(~tZ%a+@dJy;r.WOn weave-l0se?~Y)wTsbl6x==~K]|/Ayy weave-?Q@g^;+9F2+@DM78]keL=8xmZ weave-d54FCY_WBzVk}}^gQYAsUHjWg weave-$uE~#S!D~Znnr:$w;Ge=_mnBm weave-39$*j7^vSTzBK8lA!^zDT3mqC weave-DBH?MkOr4JGX/^:HfU:o0}E?V weave-g$DhsR$Xb9WV)#|N7P5H3ppOZ weave-IdwzZLfj(%b(2#AY)mt/Me/~X weave-V7d@]hQsm@w.FTL;i6Lv]yC8g weave-mYJyAQ1(Yt5JD?FgJxHey{Z9c weave-SlIdqaO5Mxy(U#^y(Z;!QOh|G weave-):c0F6{KJAyO^~4R^Cb[@|LRE weave-m!w.w{kK:d8bGJyp2?C}^gZ}6 weave-SpmnUYN8K)#H:z3~=6U00i5dN weave-G7+$X1%uPtZ:_d#)QECoj^~fl weave-NX8_0]]e2Y+RGL]yrla!RU})( weave-XWOm|;M~1^X;]%[GNlCgY=!XF weave-Tb1;!h}%M4sfBH[.1S@0xL~BN weave-.#wkTr8EJKfno!r/o+@8pGi}U weave-?b%zl9GIe0AET1(QI^7NWe*fO weave-iuZcey(5DeXbzgRFs8Szo]+@p weave-QWA4JP?KtVUjpEz(|~;FFQPvH weave-_TduLy=tnh)+F*.w:~]*W(^]U weave-@{3|L4b(Bsw6rjiVP~6(fgEOr weave-a_qt/eJH+WQ+?WR;gOi4tM?$[ weave-|TN:D7P$v1uIZp8#xK%|FZ~=F weave-AIz(|3hp_byp*=n%5bFZRU:|l weave-:WHWkWO+*$[F8NaUg9U0$_{8q weave-b9:JU#]Wqh14e]}k62SXyCQHs weave-/*2ut6/CpH]^/?7wLn{i9dPey weave-q}vZH;Kvlh9x[nKsHhh#/.;dc weave-[GXL[;!jPD#T%bjt+V+:?OT+n weave-/0wZrjy0Q14=LEg8C8|J?x;1D weave-.Rt*RU5r9?WV+g0KfT:W!Lh7l weave-!r7rQu^{](tOH$;3YdwynWXMm weave-F?.h:xEiT7Vn?e[I;_M(^_RPy weave-jqR*Xd]{Kx/ajvxtSR*WhDQ|{ weave-SEgaFOSe/|Ch8AsEuPd2_SamA weave-#U[66Q|cnItyG$MQ=B)^0~fIF weave-.s+66[*c[~G9J?ImxH)7L_ihI weave-}==8%MIb$vA;;b$9+VLiic3aZ weave-^e]r!!ssJ^OrD2s[8PdePu/XN weave-BFZqv5BQPgq1wT*vU6koBa^2Z weave-D_zW{aq_aL3l}nRyJetGTP{jf weave-W%#5mNVptS9fMBFYhG)a+cU;t weave-AH333MxAhOcH*u~XQn6RVM^Ee weave-w.=PmBQ#eAb:k(9djLsgG]SqC weave-5g(+0QJoblLw.Fg1*Aj=gt.sJ weave-z}7j7IpnOZ5e4bTQpw(=2SBb$ weave-ITvbjGF+Y)#Q9vNm+]Vm[gv3s weave-($P)_v9il:o+xA0$IxOw[gTCt weave-gUO@tmp6Zz|=i$PIH4^@_WR32 weave-==^!fcI972/GPqVn@d|tLx^U8 weave-Q[kx4H97H6K8kxuUfNwY7PopD weave-r%[?sDO(f/hy:K8j;4~tYZMz= weave-k9xio.+xLyzB;2S*Al@K9=fEz weave-DS|9pwj[J400|H4sHmxb!jPWf weave-I7o7Q7NS*%lg{4Amg0qP{B6xw weave-3{4YhI5xch=:(LHV*Bf$x1phe weave-qsCERL8vO9C{+~lpt3x/f+}KX weave-D{xf#xh./E3y]LH8h*s(mFVMt weave-]$NacR83(D.8/0uXt@6QY_7GQ weave-aL+tk=@a=9:wIr6#({skhOe|$ weave-5IuC4/W@DAhUhFD=SSLi]j{xR weave-(6r=xf{kI{t_59H)wbZP@_#sM weave-Vb%*:X+IeQuEpwI9tmIz.prW4 weave-YrMIEHf{B{n#XB0Uw[97Q0u:r]
DEBU: 2018/02/27 10:13:26.533692 Flushing ipset 'weave-local-pods'
DEBU: 2018/02/27 10:13:26.534689 Flushing ipset 'weave-+~.n|O}pR=Gb0*T3D?JTmrpGe'
DEBU: 2018/02/27 10:13:26.536412 Flushing ipset 'weave-9Yu/#*JVl^#=tUJ7gk=H$(3=I'
DEBU: 2018/02/27 10:13:26.538032 Flushing ipset 'weave-GDpD%ws;mA;QfL(_6uR=ov5ZI'
DEBU: 2018/02/27 10:13:26.539568 Flushing ipset 'weave-I^@_pae1G]s/58:BZkY(kS;b7'
DEBU: 2018/02/27 10:13:26.541078 Flushing ipset 'weave-^^ED$nC#|$[AGt}n|$gE{?6W$'
DEBU: 2018/02/27 10:13:26.542533 Flushing ipset 'weave-6oo[aWLTJi$j;)c~oTv*K;h#t'
DEBU: 2018/02/27 10:13:26.543955 Flushing ipset 'weave-sKf^a;H8z2c;f^X[eK~+Ws;;h'
DEBU: 2018/02/27 10:13:26.545596 Flushing ipset 'weave-{|rO=lfO)@{FrEV8W$K9TW0[]'
DEBU: 2018/02/27 10:13:26.547054 Flushing ipset 'weave-:cW?j*CoJb@k4E]n_f;8n9B;z'
DEBU: 2018/02/27 10:13:26.548534 Flushing ipset 'weave-~%|?v:1wby|]7VB_eQ@p)??qA'
DEBU: 2018/02/27 10:13:26.550013 Flushing ipset 'weave-nJv$i]NEFO~V^1KcDC*6@XxY9'
DEBU: 2018/02/27 10:13:26.551485 Flushing ipset 'weave-^Z9nhl:@/^l]i6#ZG/BmhYJk7'
DEBU: 2018/02/27 10:13:26.552932 Flushing ipset 'weave-GffHO0)*H3ezJCl3l#ovJuaIc'
DEBU: 2018/02/27 10:13:26.554389 Flushing ipset 'weave-Y;%)wta+muf}/JjIO]15xHYYg'
DEBU: 2018/02/27 10:13:26.555831 Flushing ipset 'weave-4fgi6zYQgH^:e=V=b%Z?^om=4'
DEBU: 2018/02/27 10:13:26.557303 Flushing ipset 'weave-%ma0yYb$l+}*LsF!3MV8_+r$F'
DEBU: 2018/02/27 10:13:26.558374 Flushing ipset 'weave-Lw9{.PCg/]g{2:0n~^JJ?YH7H'
DEBU: 2018/02/27 10:13:26.559846 Flushing ipset 'weave-Z]@fL:xRVAfS{JVk?jwC#Gl/T'
DEBU: 2018/02/27 10:13:26.561039 Flushing ipset 'weave-4(~$MwirMmoNB){]Qfk4*W+ZO'
DEBU: 2018/02/27 10:13:26.562194 Flushing ipset 'weave-p:oF57O*$dU1l7.fJw}?kB(]w'
DEBU: 2018/02/27 10:13:26.563325 Flushing ipset 'weave-0EHD/vdN#O4]V?o4Tx7kS;APH'
DEBU: 2018/02/27 10:13:26.564516 Flushing ipset 'weave-4vtqMI+kx/2]jD%_c0S%thO%V'
DEBU: 2018/02/27 10:13:26.565614 Flushing ipset 'weave-Up;6Y+^4UTUS3T4*1C:LxqoCr'
DEBU: 2018/02/27 10:13:26.566693 Flushing ipset 'weave-KkL3mEJ5Yjwg?4GvO(Ws2Gg$t'
DEBU: 2018/02/27 10:13:26.567782 Flushing ipset 'weave-oeX/A4Ab^wUC[|iq]$IF+zlj$'
DEBU: 2018/02/27 10:13:26.568927 Flushing ipset 'weave-Op1DdkWPr}wFb{dY0bB3Y48Ey'
DEBU: 2018/02/27 10:13:26.570008 Flushing ipset 'weave-d7iM9G1@b[~#(8*hY6i)Qbblp'
DEBU: 2018/02/27 10:13:26.571065 Flushing ipset 'weave-yLCZG1rKs(7gt1q(pYlXVa_!_'
DEBU: 2018/02/27 10:13:26.572074 Flushing ipset 'weave-I^*nP1mv?;Ojir]0!B=WE~:ax'
DEBU: 2018/02/27 10:13:26.573123 Flushing ipset 'weave-AN4rA+^/*464r%f=gQjSBU[|R'```

@bboreham
Copy link
Contributor

Yes, I know the log is huge; it's huge because we logged everything we would need to debug complex issues. And now you have an issue and I can't debug it unless I can see everything I need to see.

In particular, the one ipset from your error message should have been flushed and then destroyed, with each step logged. But it would be more efficient for you just to upload the whole log.

@alok87
Copy link
Contributor Author

alok87 commented Feb 27, 2018

@bboreham here is the log for the ipset which failed to create as it existed

$ cat weave-npc.log | grep "weave-IdwzZLfj(%b(2#AY)mt/Me/~X"
DEBU: 2018/02/27 10:31:02.420832 Got list of ipsets: [weave-local-pods weave-sKf^a;H8z2c;f^X[eK~+Ws;;h weave-{|rO=lfO)@{FrEV8W$K9TW0[] weave-+~.n|O}pR=Gb0*T3D?JTmrpGe weave-9Yu/#*JVl^#=tUJ7gk=H$(3=I weave-GDpD%ws;mA;QfL(_6uR=ov5ZI weave-I^@_pae1G]s/58:BZkY(kS;b7 weave-^^ED$nC#|$[AGt}n|$gE{?6W$ weave-6oo[aWLTJi$j;)c~oTv*K;h#t weave-GffHO0)*H3ezJCl3l#ovJuaIc weave-Y;%)wta+muf}/JjIO]15xHYYg weave-:cW?j*CoJb@k4E]n_f;8n9B;z weave-~%|?v:1wby|]7VB_eQ@p)??qA weave-nJv$i]NEFO~V^1KcDC*6@XxY9 weave-^Z9nhl:@/^l]i6#ZG/BmhYJk7 weave-0EHD/vdN#O4]V?o4Tx7kS;APH weave-4vtqMI+kx/2]jD%_c0S%thO%V weave-4fgi6zYQgH^:e=V=b%Z?^om=4 weave-%ma0yYb$l+}*LsF!3MV8_+r$F weave-Lw9{.PCg/]g{2:0n~^JJ?YH7H weave-Z]@fL:xRVAfS{JVk?jwC#Gl/T weave-4(~$MwirMmoNB){]Qfk4*W+ZO weave-p:oF57O*$dU1l7.fJw}?kB(]w weave-oeX/A4Ab^wUC[|iq]$IF+zlj$ weave-Op1DdkWPr}wFb{dY0bB3Y48Ey weave-Up;6Y+^4UTUS3T4*1C:LxqoCr weave-KkL3mEJ5Yjwg?4GvO(Ws2Gg$t weave-Vb%*:X+IeQuEpwI9tmIz.prW4 weave-YrMIEHf{B{n#XB0Uw[97Q0u:r weave-(c7D)^Xs3lE%m:b#yeRs}W0Rg weave-+QB1_1[;Q3(KFxAS4(NW9Uw)o weave-RXRn$wQZ8S+hbJDPui.~9Cedc weave-_75A@$[j8U?:Mhq_bjEH;)RFQ weave-(6IjSeImE6)O3bfQJY(o[xQQ) weave-kiCQj*+7!l]#H*G2HMw7rX3d% weave-d7iM9G1@b[~#(8*hY6i)Qbblp weave-yLCZG1rKs(7gt1q(pYlXVa_!_ weave-I^*nP1mv?;Ojir]0!B=WE~:ax weave-AN4rA+^/*464r%f=gQjSBU[|R weave-tJ*!:Ul4Gr91D%M5gAr^Vx(@U weave-MzxjTY2e!I8_Al?{7BNFT}dWp weave-9xLCl$e5%c/ZuTd^}k7@T4/9$ weave-837a4W4YnsSMy;g/29evzU3qT weave-#Rp$^^eLQ?7!6];3;kS|]KcPE weave-qHq{iugL7*aEB6NzLBeOMS5em weave-9nOoIgG97Sh+raGQ0foc2uQ%7 weave-X]N+Z(@Yrh($BwO|@v$Yvk8SS weave-.u!IZdBEqMTxZBrswV]x1%YCr weave-r.i@hNne.8L3HP92~lBF)GwT[ weave-rFI9qQgI)Jr.4z4J|5?RZgq7O weave-oBtFDT4dLlasSe@yL@0zv(KVB weave-|ZGIZsi@j}WLpVbjT+lt#W.K+ weave-c:N6lY:fD*=oPD_zMTQqj=2wR weave-E.1.0W^NGSp]0_t5WwH/]gX@L weave-k?Z;25^M}|1s7P3|H9i;*;MhG weave-i{PrkcB!)O(x+nEKr??Fv{zoF weave-a;+i_l+cFDaJyQvSiQtU(WMVl weave-*GtRdi:d$DuMC.e!gkUfXdcrH weave-33.f1Fb|rX@|$]|2**Vm$=;.B weave-*@iqKuS=*m;[2lC8wL|$6!!.9 weave-3=rjE.g*h;W]EU|TIzA=aC2d8 weave-1$XQ)fP+z$pX/05^U^mU%tj7= weave-F#?LrsT[K^73+TB^h.}Yam:on weave-lOT$=D7?ITKbbQ|J;3TG](t1U weave-hzhtZ[OF|CWqyyh0X4dL^+Sjz weave-+WvY9b)%Cg!q0j.OMWBw2)yY) weave-JMjt!mzlHOM8$Q6eG$BvNluRT weave-DBH?MkOr4JGX/^:HfU:o0}E?V weave-g$DhsR$Xb9WV)#|N7P5H3ppOZ weave-IdwzZLfj(%b(2#AY)mt/Me/~X weave-mYJyAQ1(Yt5JD?FgJxHey{Z9c weave-SlIdqaO5Mxy(U#^y(Z;!QOh|G weave-):c0F6{KJAyO^~4R^Cb[@|LRE weave-m!w.w{kK:d8bGJyp2?C}^gZ}6 weave-A/R=FloN829_:VV/o:M]b[ulw weave-!r~@EqpuMvP6SQ_.nN![$@{_U weave-UsY7aJ5#(~tZ%a+@dJy;r.WOn weave-l0se?~Y)wTsbl6x==~K]|/Ayy weave-?Q@g^;+9F2+@DM78]keL=8xmZ weave-d54FCY_WBzVk}}^gQYAsUHjWg weave-$uE~#S!D~Znnr:$w;Ge=_mnBm weave-39$*j7^vSTzBK8lA!^zDT3mqC weave-?b%zl9GIe0AET1(QI^7NWe*fO weave-iuZcey(5DeXbzgRFs8Szo]+@p weave-SpmnUYN8K)#H:z3~=6U00i5dN weave-G7+$X1%uPtZ:_d#)QECoj^~fl weave-NX8_0]]e2Y+RGL]yrla!RU})( weave-XWOm|;M~1^X;]%[GNlCgY=!XF weave-Tb1;!h}%M4sfBH[.1S@0xL~BN weave-.#wkTr8EJKfno!r/o+@8pGi}U weave-/*2ut6/CpH]^/?7wLn{i9dPey weave-q}vZH;Kvlh9x[nKsHhh#/.;dc weave-[GXL[;!jPD#T%bjt+V+:?OT+n weave-/0wZrjy0Q14=LEg8C8|J?x;1D weave-QWA4JP?KtVUjpEz(|~;FFQPvH weave-_TduLy=tnh)+F*.w:~]*W(^]U weave-@{3|L4b(Bsw6rjiVP~6(fgEOr weave-a_qt/eJH+WQ+?WR;gOi4tM?$[ weave-|TN:D7P$v1uIZp8#xK%|FZ~=F weave-AIz(|3hp_byp*=n%5bFZRU:|l weave-:WHWkWO+*$[F8NaUg9U0$_{8q weave-b9:JU#]Wqh14e]}k62SXyCQHs weave-SEgaFOSe/|Ch8AsEuPd2_SamA weave-#U[66Q|cnItyG$MQ=B)^0~fIF weave-.Rt*RU5r9?WV+g0KfT:W!Lh7l weave-!r7rQu^{](tOH$;3YdwynWXMm weave-F?.h:xEiT7Vn?e[I;_M(^_RPy weave-jqR*Xd]{Kx/ajvxtSR*WhDQ|{ weave-5IuC4/W@DAhUhFD=SSLi]j{xR weave-(6r=xf{kI{t_59H)wbZP@_#sM weave-D_zW{aq_aL3l}nRyJetGTP{jf weave-W%#5mNVptS9fMBFYhG)a+cU;t weave-.s+66[*c[~G9J?ImxH)7L_ihI weave-}==8%MIb$vA;;b$9+VLiic3aZ weave-^e]r!!ssJ^OrD2s[8PdePu/XN weave-BFZqv5BQPgq1wT*vU6koBa^2Z weave-Q[kx4H97H6K8kxuUfNwY7PopD weave-r%[?sDO(f/hy:K8j;4~tYZMz= weave-AH333MxAhOcH*u~XQn6RVM^Ee weave-w.=PmBQ#eAb:k(9djLsgG]SqC weave-5g(+0QJoblLw.Fg1*Aj=gt.sJ weave-z}7j7IpnOZ5e4bTQpw(=2SBb$ weave-ITvbjGF+Y)#Q9vNm+]Vm[gv3s weave-($P)_v9il:o+xA0$IxOw[gTCt weave-gUO@tmp6Zz|=i$PIH4^@_WR32 weave-==^!fcI972/GPqVn@d|tLx^U8 weave-]$NacR83(D.8/0uXt@6QY_7GQ weave-aL+tk=@a=9:wIr6#({skhOe|$ weave-k9xio.+xLyzB;2S*Al@K9=fEz weave-DS|9pwj[J400|H4sHmxb!jPWf weave-I7o7Q7NS*%lg{4Amg0qP{B6xw weave-3{4YhI5xch=:(LHV*Bf$x1phe weave-qsCERL8vO9C{+~lpt3x/f+}KX weave-D{xf#xh./E3y]LH8h*s(mFVMt weave-GoNL4on7RW%(SE0[Og*[!Wxkx weave-VtEc1z26x}^HDPV_/p6_VZ;S5]
DEBU: 2018/02/27 10:31:02.486043 Flushing ipset 'weave-IdwzZLfj(%b(2#AY)mt/Me/~X'
DEBU: 2018/02/27 10:31:02.618128 Destroying ipset 'weave-IdwzZLfj(%b(2#AY)mt/Me/~X'
DEBU: 2018/02/27 10:31:03.783818 ensuring rule for DefaultAllow in namespace: helium, set weave-IdwzZLfj(%b(2#AY)mt/Me/~X
INFO: 2018/02/27 10:32:33.566858 adding entry 100.127.128.21 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 8451bbb5-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:33.566923 added entry 100.127.128.21 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 8451bbb5-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:33.586126 adding entry 100.127.128.23 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 846739d6-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:33.586166 added entry 100.127.128.23 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 846739d6-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:42.111952 adding entry 100.126.0.27 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84f8752f-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:42.112031 added entry 100.126.0.27 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84f8752f-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:43.271726 adding entry 100.105.128.3 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84ee5f0e-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:43.271764 added entry 100.105.128.3 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84ee5f0e-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:50.526241 adding entry 100.127.128.14 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 83ab3eee-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 10:32:50.526300 added entry 100.127.128.14 to weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 83ab3eee-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:34.685378 deleting entry 100.126.0.27 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84f8752f-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:34.685491 deleted entry 100.126.0.27 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84f8752f-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:34.764965 deleting entry 100.127.128.14 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 83ab3eee-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:34.764986 deleted entry 100.127.128.14 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 83ab3eee-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:35.539020 deleting entry 100.105.128.3 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84ee5f0e-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:35.539124 deleted entry 100.105.128.3 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 84ee5f0e-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:35.793132 deleting entry 100.127.128.23 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 846739d6-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:35.793231 deleted entry 100.127.128.23 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 846739d6-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:35.936119 deleting entry 100.127.128.21 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 8451bbb5-1ba9-11e8-ac8d-0238e90e49f8
INFO: 2018/02/27 12:04:35.936226 deleted entry 100.127.128.21 from weave-IdwzZLfj(%b(2#AY)mt/Me/~X of 8451bbb5-1ba9-11e8-ac8d-0238e90e49f8
DEBU: 2018/02/27 12:04:48.502525 removing default rule in namespace: helium, set weave-IdwzZLfj(%b(2#AY)mt/Me/~X
FATA: 2018/02/27 12:06:36.278271 ipset [create weave-IdwzZLfj(%b(2#AY)mt/Me/~X hash:ip comment] failed: ipset v6.32: Set cannot be created: set with the same name already exists```

@alok87
Copy link
Contributor Author

alok87 commented Feb 28, 2018

@bboreham The weave-npc crashes because it is trying to create a ipSet which already exists. It already exists because the last deletion did not clear the ipSet. The ipSet mentioned here

var ipset ipset.Name
is not getting deleted.

I was able to reproduce the issue by doing below.

  • kubectl create ns aks
INFO: 2018/02/28 04:29:58.907284 EVENT AddNamespace {"metadata":{"creationTimestamp":"2018-02-28T04:29:58Z","name":"aks","resourceVersion":"44123710","selfLink":"/api/v1/namespaces/aks","uid":"08fa2c09-1c40-11e8-ac8d-0238e90e49f8"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}
INFO: 2018/02/28 04:29:58.908863 creating ipset: &npc.selectorSpec{key:"", selector:labels.internalSelector{}, dst:false, ipsetType:"hash:ip", ipsetName:"weave-usb0iYnjHs@=O7u8u/9gTbt%b", nsName:"aks"}
DEBU: 2018/02/28 04:29:58.910842 ensuring rule for DefaultAllow in namespace: aks, set weave-#[w{=fp4BYyel2u|{f;WvV5Wr
  • Now the ipsets created are -
/ #  ipset list --name --output plain | grep "weave-usb0iYnjHs@=O7u8u/9gTbt%b"
weave-usb0iYnjHs@=O7u8u/9gTbt%b
/ # ipset list --name --output plain | grep "WvV5Wr"
weave-#[w{=fp4BYyel2u|{f;WvV5Wr
  • kubectl delete ns aks
INFO: 2018/02/28 04:31:55.178666 EVENT UpdateNamespace {"metadata":{"creationTimestamp":"2018-02-28T04:29:58Z","deletionTimestamp":"2018-02-28T04:31:49Z","name":"aks","resourceVersion":"44124032","selfLink":"/api/v1/namespaces/aks","uid":"08fa2c09-1c40-11e8-ac8d-0238e90e49f8"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Terminating"}} {"metadata":{"creationTimestamp":"2018-02-28T04:29:58Z","deletionTimestamp":"2018-02-28T04:31:49Z","name":"aks","resourceVersion":"44124052","selfLink":"/api/v1/namespaces/aks","uid":"08fa2c09-1c40-11e8-ac8d-0238e90e49f8"},"spec":{},"status":{"phase":"Terminating"}}
INFO: 2018/02/28 04:31:55.189275 EVENT DeleteNamespace {"metadata":{"creationTimestamp":"2018-02-28T04:29:58Z","deletionTimestamp":"2018-02-28T04:31:49Z","name":"aks","resourceVersion":"44124052","selfLink":"/api/v1/namespaces/aks","uid":"08fa2c09-1c40-11e8-ac8d-0238e90e49f8"},"spec":{},"status":{"phase":"Terminating"}}
DEBU: 2018/02/28 04:31:55.189299 removing default rule in namespace: aks, set weave-#[w{=fp4BYyel2u|{f;WvV5Wr
INFO: 2018/02/28 04:31:55.191673 destroying ipset: &npc.selectorSpec{key:"", selector:labels.internalSelector{}, dst:false, ipsetType:"hash:ip", ipsetName:"weave-usb0iYnjHs@=O7u8u/9gTbt%b", nsName:"aks"}
  • Now both the created ipsets should have got deleted. But only one was deleted. This is the problem.
/ # ipset list --name --output plain | grep "weave-usb0iYnjHs@=O7u8u/9gTbt%b"
/ # ipset list --name --output plain | grep "WvV5Wr"
weave-#[w{=fp4BYyel2u|{f;WvV5Wr
  • kubectl create ns aks On creating the same namespace again, weave-npc crashed with below error
FATA: 2018/02/28 04:36:13.919533 ipset [create weave-#[w{=fp4BYyel2u|{f;WvV5Wr hash:ip comment] failed: ipset v6.32: Set cannot be created: set with the same name already exists
: exit status 1

Now we need to find out why one of the two ipSet does not get deleted on deleting the namespace

@alok87
Copy link
Contributor Author

alok87 commented Feb 28, 2018

@bboreham When namespace is getting terminated and at that time if we try destroying the ipSet it fails. But as soon as the namespace gets terminated same command to destroy works.

  • Time when the namespace aks is in Terminating state and we try to delete ipSet. It fails
/ # ipset destroy "weave-#[w{=fp4BYyel2u|{f;WvV5Wr"
ipset v6.32: Set cannot be destroyed: it is in use by a kernel component
  • Time when the namespace aks is Terminated and we try to delete ipSet It works
/ # ipset destroy "weave-#[w{=fp4BYyel2u|{f;WvV5Wr"

@bboreham
Copy link
Contributor

That's great troubleshooting, @alok87 !

Is that Set cannot be destroyed message in the weave-npc log?

@alok87
Copy link
Contributor Author

alok87 commented Feb 28, 2018

@bboreham I do not see any such msg in the log. I think weave npc is not logging it or is it?
$ kubectl logs weave-net-xmc1z -n kube-system -c weave-npc | grep -i "Set cannot be destroyed"

@bboreham
Copy link
Contributor

Do you mean they keep restarting every time you delete a namespace, or they restart again and again without pause?

If the problem is triggered by a particular namespace delete event, why would the pods keep on restarting?

@alok87
Copy link
Contributor Author

alok87 commented Feb 28, 2018

@bboreham So we have a case where we create and delete namespace multiple times during the day. Whenever a namespace is deleted alls fine weave-npc does not restart. But whenever we create the namespace again with the same name, the weave-npc crashes with the error as mentioned above. And it then starts.

@bboreham
Copy link
Contributor

OK, just wanted to check the crash was associated with a particular event.

@petergardfjall
Copy link

We are seeing the exact same issue here. Just wanted to corroborate @alok87's finding.

The following sequence of commands brings down all weave pods:

$ kubectl create namespace bar
namespace "bar" created
$ kubectl delete namespace bar
namespace "bar" deleted
$ kubectl create namespace bar
namespace "bar" created

The logs from a weave pod while running those commands:

> kubectl logs -f -n kube-system weave-net-tmn4w
...
INFO: 2018/03/06 11:49:55.571679 EVENT AddNamespace {"metadata":{"creationTimestamp":"2018-03-06T11:49:55Z","name":"bar","resourceVersion":"854","selfLink":"/api/v1/namespaces/bar","uid":"7d1b1ff2-2134-11e8-bb6d-06183adadd3c"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}
INFO: 2018/03/06 11:49:55.572882 creating ipset: &npc.selectorSpec{key:"", selector:labels.internalSelector{}, dst:false, ipsetType:"hash:ip", ipsetName:"weave-GZ9Fvvaooldr$$yR~U(:7{8+c", nsName:"bar"}
DEBU: 2018/03/06 11:49:55.574080 ensuring rule for DefaultAllow in namespace: bar, set weave-9wNFw9*GQIuh7[IXl3S;uIVMt
INFO: 2018/03/06 11:50:03.341839 EVENT UpdateNamespace {"metadata":{"creationTimestamp":"2018-03-06T11:49:55Z","name":"bar","resourceVersion":"854","selfLink":"/api/v1/namespaces/bar","uid":"7d1b1ff2-2134-11e8-bb6d-06183adadd3c"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}} {"metadata":{"creationTimestamp":"2018-03-06T11:49:55Z","deletionTimestamp":"2018-03-06T11:50:03Z","name":"bar","resourceVersion":"868","selfLink":"/api/v1/namespaces/bar","uid":"7d1b1ff2-2134-11e8-bb6d-06183adadd3c"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Terminating"}}
INFO: 2018/03/06 11:50:08.437497 EVENT UpdateNamespace {"metadata":{"creationTimestamp":"2018-03-06T11:49:55Z","deletionTimestamp":"2018-03-06T11:50:03Z","name":"bar","resourceVersion":"868","selfLink":"/api/v1/namespaces/bar","uid":"7d1b1ff2-2134-11e8-bb6d-06183adadd3c"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Terminating"}} {"metadata":{"creationTimestamp":"2018-03-06T11:49:55Z","deletionTimestamp":"2018-03-06T11:50:03Z","name":"bar","resourceVersion":"878","selfLink":"/api/v1/namespaces/bar","uid":"7d1b1ff2-2134-11e8-bb6d-06183adadd3c"},"spec":{},"status":{"phase":"Terminating"}}
INFO: 2018/03/06 11:50:08.441199 EVENT DeleteNamespace {"metadata":{"creationTimestamp":"2018-03-06T11:49:55Z","deletionTimestamp":"2018-03-06T11:50:03Z","name":"bar","resourceVersion":"878","selfLink":"/api/v1/namespaces/bar","uid":"7d1b1ff2-2134-11e8-bb6d-06183adadd3c"},"spec":{},"status":{"phase":"Terminating"}}
DEBU: 2018/03/06 11:50:08.441221 removing default rule in namespace: bar, set weave-9wNFw9*GQIuh7[IXl3S;uIVMt
INFO: 2018/03/06 11:50:08.442720 destroying ipset: &npc.selectorSpec{key:"", selector:labels.internalSelector{}, dst:false, ipsetType:"hash:ip", ipsetName:"weave-GZ9Fvvaooldr$$yR~U(:7{8+c", nsName:"bar"}
INFO: 2018/03/06 11:50:19.278110 EVENT AddNamespace {"metadata":{"creationTimestamp":"2018-03-06T11:50:19Z","name":"bar","resourceVersion":"894","selfLink":"/api/v1/namespaces/bar","uid":"8b3c9114-2134-11e8-bb6d-06183adadd3c"},"spec":{"finalizers":["kubernetes"]},"status":{"phase":"Active"}}
FATA: 2018/03/06 11:50:19.279333 ipset [create weave-9wNFw9*GQIuh7[IXl3S;uIVMt hash:ip comment] failed: ipset v6.32: Set cannot be created: set with the same name already exists
: exit status 1

@alok87
Copy link
Contributor Author

alok87 commented Mar 6, 2018

@petergardfjall Yes it is the same issue, the fix PR #3250 is under approval. You can use this image alok88/weave-npc:del_default_ipset we have built from the PR #3250 branch as a temporary fix. It works for us in staging cluster where we were facing the issue and weave-npc container has stopped restarting after that.

@brb brb closed this as completed in 42e4b9a Mar 8, 2018
brb added a commit that referenced this issue Mar 8, 2018
Delete defaultAllowIPSet on namespace delete

Fix #3247

Duplicate merge commit as the PR got merged to "master" instead of "2.2"
@brb
Copy link
Contributor

brb commented Mar 12, 2018

FYI, the fix has been released in Weave Net 2.2.1: https://github.com/weaveworks/weave/releases/tag/v2.2.1

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants