flux 1.11.0 no longer syncs without ClusterRole #1830
Curses, I did not intend this to be the case with #1442, though I admit I wasn't very diligent about trying out this scenario. Where exactly does it come to a halt, when it's not given a ClusterRole? (what do the logs say?) |
#1830, which should fix this, is complete but pending review |
Hey, thanks for the responses.
Without a ClusterRole:
The last line is spammed forever after. After adding the first set of permissions, I updated the repo and tried to
I always get the following after a restart with the tag behind:
The repo tag never moved and nothing was applied. I added that resource, killed the pod, and repeated until I added the
#1668 I assume? |
Brill, thanks for that @zeeZ, most helpful! |
You might have to stick to v1.10.1 for now @zeeZ -- sorry about that :-/ |
Yeah, sorry |
Now I am thinking that #1668 by itself won't be enough since it doesn't prevent flux from attempting to list cluster-scoped resources. We need to think about this. |
@zeeZ The fix will be included in the next release. For now, you can test whether your issue is definitely fixed by using the image. Please reopen this issue if it isn't fixed. |
@2opremio I actually checked out your branch earlier. With no config change from 1.10.1 to yours, sync worked as expected, thank you. What remains is the following, but it didn't have any impact for me as there are no CRDs managed by flux:
This is repeated every second |
Fantastic! I will look into fixing that as well |
@zeeZ Are you getting any other errors? (even if not repeated) |
No further errors after adding a watch/list CRD cluster role. |
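For reference, a minimal ClusterRole along those lines might look like the following sketch (the `flux-crd-reader` name is illustrative; the `flux` service account and `flux-system` namespace are taken from the error message quoted later in this thread):

```yaml
# Sketch: grant only list/watch on CRDs at cluster scope, so Flux's
# discovery can sync without cluster-wide read access to everything.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: flux-crd-reader        # illustrative name
rules:
  - apiGroups: ["apiextensions.k8s.io"]
    resources: ["customresourcedefinitions"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: flux-crd-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flux-crd-reader
subjects:
  - kind: ServiceAccount
    name: flux
    namespace: flux-system
```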
Great, I will try to get a fix for that early next week |
I've created a sample repo of some of the things I did to lock down Flux, maybe it can be of some use: I believe that's as far as I can go without Helm or GC enabled. Removing any of the rules defined will produce some kind of error during common operations, though I haven't played around with it enough to be able to tell where sync is actually affected and what is just noise. |
I've taken a look at the remaining recurring error. It's a tricky one because the error is reported from deep inside client-go's reflector:

```go
func (r *Reflector) Run(stopCh <-chan struct{}) {
	glog.V(3).Infof("Starting reflector %v (%s) from %s", r.expectedType, r.resyncPeriod, r.name)
	wait.Until(func() {
		if err := r.ListAndWatch(stopCh); err != nil {
			utilruntime.HandleError(err)
		}
	}, r.period, stopCh)
}
```

I see a bunch of options:
I dealt with a similar problem in Scope before, going for (2), but the error handling wasn't so deep down in the call stack. |
Yes; adapting parts of client-go is usually a quixotic enterprise. If it's much more complicated than the solution in weaveworks/scope, I'd say it's not worth it. Can we mute glog by doing flag.Parse with some fake command-line options? I'm grasping at straws .. (it's probably better to do 3. instead) |
I went for (3) in the end |
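Judging from the `caller=main.go:175 type="internal kubernetes error"` lines quoted further down, the fix routes client-go's internally-handled errors through Flux's own logger. A minimal sketch of that technique, assuming a go-kit logger (`setupErrorLogging` is an illustrative name, not the actual Flux function):

```go
package main

import (
	"os"

	"github.com/go-kit/kit/log"
	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
)

// setupErrorLogging replaces client-go's default error handlers so that
// errors raised via utilruntime.HandleError (see Reflector.Run above)
// reach our structured logger instead of glog.
func setupErrorLogging(logger log.Logger) {
	utilruntime.ErrorHandlers = []func(error){
		func(err error) {
			logger.Log("type", "internal kubernetes error", "err", err.Error())
		},
	}
}

func main() {
	logger := log.NewLogfmtLogger(os.Stderr)
	setupErrorLogging(logger)
	// ... construct clients and start informers/reflectors as usual ...
}
```

Since `Reflector.Run` reports list/watch failures via `utilruntime.HandleError` instead of returning them, the `ErrorHandlers` slice is the hook client-go itself provides for intercepting them.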
@zeeZ It should be fixed now. I would appreciate it if you could give it a try ( |
After removing the CRD role I still get a constant stream of

```
ts=2019-03-18T21:05:54.062786645Z caller=main.go:175 type="internal kubernetes error" err="github.com/weaveworks/flux/cluster/kubernetes/cached_disco.go:100: Failed to list *v1beta1.CustomResourceDefinition: customresourcedefinitions.apiextensions.k8s.io is forbidden: User \"system:serviceaccount:flux-system:flux\" cannot list resource \"customresourcedefinitions\" in API group \"apiextensions.k8s.io\" at the cluster scope"
```

I did some digging around the IsForbidden || IsNotFound workaround you added, but it seems ReasonForError returns StatusReasonUnknown. I'm not familiar with the K8s source, but I believe what we're dealing with here is not a metav1 error but a more generic one: https://github.com/kubernetes/client-go/blob/7d04d0e2a0a1a4d4a1cd6baa432a2301492e4e65/tools/cache/reflector.go#L251
While it stings a bit, I can live with allowing CRD listing. My initial issue was with list access to *everything* in the cluster, which has been resolved thanks to you. Perhaps documentation could be added with the minimum privileges Flux needs in order to operate properly, though I suspect that would be complicated with Helm and GC. Maybe a more restricted minimal example next to deploy? On a positive note, at least it is not *silently* firing a request every second that may add up for each instance you run ;) |
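To illustrate the classification problem (a self-contained sketch, not Flux code): the reflector wraps the original `*StatusError` in a plain `fmt.Errorf` at the line linked above, which discards the type that `IsForbidden` and `ReasonForError` rely on:

```go
package main

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

func main() {
	// A genuine API "forbidden" error is a *StatusError and classifies cleanly.
	gr := schema.GroupResource{Group: "apiextensions.k8s.io", Resource: "customresourcedefinitions"}
	apiErr := apierrors.NewForbidden(gr, "", fmt.Errorf("RBAC denied"))
	fmt.Println(apierrors.IsForbidden(apiErr)) // true

	// Wrapping it in fmt.Errorf (as the reflector does) loses the type,
	// so ReasonForError falls back to StatusReasonUnknown ("").
	wrapped := fmt.Errorf("Failed to list *v1beta1.CustomResourceDefinition: %v", apiErr)
	fmt.Println(apierrors.IsForbidden(wrapped))           // false
	fmt.Printf("%q\n", apierrors.ReasonForError(wrapped)) // ""
}
```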
Crap, sorry about that. I need to do some further thinking. |
I run flux with explicit permissions, as limited as possible, with only a single namespaced Role and --k8s-namespace-whitelist set. After upgrading to 1.11.0 it no longer syncs unless it is able to list virtually everything in the cluster.
This is the ClusterRole I created from sync-loop errors before it was able to sync again. You can tell where I gave up:
The FAQ answers "Can I restrict the namespaces that Flux can see" with "yes, experimental". Sadly, this is no longer the case.
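As a rough illustration of the setup described in the first paragraph (a sketch under assumptions: the `demo` namespace and object names are illustrative, not from the issue; the `flux` service account and `flux-system` namespace come from the errors above):

```yaml
# Sketch: a single namespaced Role instead of a ClusterRole, bound to
# Flux's service account, so Flux only has rights inside one namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: flux
  namespace: demo            # illustrative target namespace
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flux
  namespace: demo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: flux
subjects:
  - kind: ServiceAccount
    name: flux
    namespace: flux-system
```

This would be paired with --k8s-namespace-whitelist=demo on the daemon so it only queries that namespace.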
Also name-dropping #1217 and #1471.