Resource grows too big with multiple operators for same resource #372

Closed
kopf-archiver bot opened this issue Aug 18, 2020 · 1 comment

kopf-archiver bot commented Aug 18, 2020

An issue by akojima at 2020-06-08 22:55:50+00:00
Original URL: zalando-incubator/kopf#372
 

Long story short

I'm writing an operator that uses the "multiple operators on the same resource" feature in 0.27, but I found an issue and I'm not sure whether it's a bug or whether I'm doing something wrong.

The annotations used to persist kopf diffbase and progress data end up recursively included, causing the resource object to grow very large.

Description

If there are 2 kopf operators that manage the same resource, with custom persistence settings, they will include the annotations stored by themselves and the other operator in their own snapshot, escaping it while doing so. The result is that annotations end up including the entire history of changes until the resource grows too big and updates start failing with error 422 Unprocessable Entity.
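
The growth described above can be illustrated without a cluster. This is only a rough sketch, not kopf's actual code: it assumes each operator serializes the spec plus every annotation except its own into its own annotation, and the helper names snapshot and simulate are hypothetical.

```python
import json


def snapshot(body, own_key):
    """Serialize spec and annotations, dropping only the operator's own annotation."""
    others = {
        key: value
        for key, value in body["metadata"]["annotations"].items()
        if key != own_key
    }
    return json.dumps({"spec": body["spec"], "metadata": {"annotations": others}})


def simulate(rounds):
    """Let op1 and op2 take turns storing their snapshot; record the body size."""
    body = {"spec": {"field": "value"}, "metadata": {"annotations": {}}}
    sizes = []
    for _ in range(rounds):
        for op in ("op1", "op2"):
            key = f"{op}/last-handled-configuration"
            # The other operator's annotation is embedded (and escaped) here.
            body["metadata"]["annotations"][key] = snapshot(body, key)
        sizes.append(len(json.dumps(body)))
    return sizes


sizes = simulate(5)
print(sizes)  # each round is larger than the previous one
```

Every round re-escapes the previously stored snapshot, which is why the backslashes in the kubectl output below multiply so quickly.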

The code snippet to reproduce the issue

Write the following code into op1.py and op2.py

import kopf

@kopf.on.create('zalando.org', 'v1', 'kopfexamples')
def create_fn(spec, **kwargs):
    print(f"{__name__}: And here we are! Creating: {spec}")


@kopf.on.startup()
def on_startup(settings: kopf.OperatorSettings, **_):
    settings.persistence.progress_storage = kopf.AnnotationsProgressStorage(prefix=__name__)
    settings.persistence.diffbase_storage = kopf.AnnotationsDiffBaseStorage(name=__name__+'/last-handled-configuration')
    settings.persistence.finalizer = __name__+"/kopf-finalizer"
The exact command to reproduce the issue
kopf run op1.py &
kopf run op2.py &
kubectl create -f obj.yaml
kubectl get kopfexamples.zalando.org -oyaml
The full output of the command that failed
$ kopf run op1.py 
[2020-06-08 15:30:19,511] kopf.activities.star [INFO    ] Activity 'on_startup' succeeded.
[2020-06-08 15:30:19,512] kopf.reactor.activit [INFO    ] Initial authentication has been initiated.
[2020-06-08 15:30:19,520] kopf.activities.auth [INFO    ] Activity 'login_via_client' succeeded.
[2020-06-08 15:30:19,520] kopf.reactor.activit [INFO    ] Initial authentication has finished.
[2020-06-08 15:30:19,534] kopf.engines.peering [WARNING ] Default peering object not found, falling back to the standalone mode.
op1: And here we are! Creating: {'duration': '1m', 'field': 'value', 'items': ['item1', 'item2']}
[2020-06-08 15:30:32,316] kopf.objects         [INFO    ] [default/kopf-example-1] Handler 'create_fn' succeeded.
[2020-06-08 15:30:32,316] kopf.objects         [INFO    ] [default/kopf-example-1] All handlers succeeded for creation.


$ kopf run op2.py 
[2020-06-08 15:30:58,358] kopf.activities.star [INFO    ] Activity 'on_startup' succeeded.
[2020-06-08 15:30:58,358] kopf.reactor.activit [INFO    ] Initial authentication has been initiated.
[2020-06-08 15:30:58,372] kopf.activities.auth [INFO    ] Activity 'login_via_client' succeeded.
[2020-06-08 15:30:58,372] kopf.reactor.activit [INFO    ] Initial authentication has finished.
[2020-06-08 15:30:58,386] kopf.engines.peering [WARNING ] Default peering object not found, falling back to the standalone mode.
op2: And here we are! Creating: {'duration': '1m', 'field': 'value', 'items': ['item1', 'item2']}
[2020-06-08 15:30:58,494] kopf.objects         [INFO    ] [default/kopf-example-1] Handler 'create_fn' succeeded.
[2020-06-08 15:30:58,494] kopf.objects         [INFO    ] [default/kopf-example-1] All handlers succeeded for creation.
[2020-06-08 15:36:00,786] kopf.reactor.queuein [ERROR   ] functools.partial(<function process_resource_event at 0x7fc0d01f8ef0>, lifecycle=<function asap at 0x7fc0f0896c20>, registry=<kopf.toolkits.legacy_registries.SmartGlobalRegistry object at 0x7fc0e17f9550>, settings=OperatorSettings(logging=LoggingSettings(), posting=PostingSettings(enabled=True, level=20), watching=WatchingSettings(server_timeout=None, client_timeout=None, connect_timeout=None, reconnect_backoff=0.1), batching=BatchingSettings(worker_limit=None, idle_timeout=5.0, batch_window=0.1, exit_timeout=2.0), execution=ExecutionSettings(executor=<concurrent.futures.thread.ThreadPoolExecutor object at 0x7fc0e17f9350>, _max_workers=None), background=BackgroundSettings(cancellation_polling=60, instant_exit_timeout=None, instant_exit_zero_time_cycles=10), persistence=PersistenceSettings(finalizer='op2/kopf-finalizer', progress_storage=<kopf.storage.progress.AnnotationsProgressStorage object at 0x7fc0e1808790>, diffbase_storage=<kopf.storage.diffbase.AnnotationsDiffBaseStorage object at 0x7fc0e17f9490>)), memories=<kopf.structs.containers.ResourceMemories object at 0x7fc0e17f9b90>, resource=Resource(group='zalando.org', version='v1', plural='kopfexamples'), event_queue=<Queue at 0x7fc0e17f9d10 maxsize=0 _getters[1] tasks=2>) failed with an exception. Ignoring the event.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/reactor/queueing.py", line 187, in worker
    await processor(raw_event=raw_event, replenished=replenished)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/reactor/processing.py", line 204, in process_resource_event
    replenished=replenished,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/reactor/processing.py", line 227, in apply_reaction_outcomes
    await patching.patch_obj(resource=resource, patch=patch, body=body)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/clients/auth.py", line 45, in wrapper
    return await fn(*args, **kwargs, context=context)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/clients/patching.py", line 59, in patch_obj
    raise_for_status=True,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client.py", line 504, in _request
    await resp.start(conn)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 860, in start
    self._continue = None
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/helpers.py", line 596, in __exit__
    raise asyncio.TimeoutError from None
concurrent.futures._base.TimeoutError

[2020-06-08 15:38:15,025] kopf.reactor.queuein [ERROR   ] functools.partial(<function process_resource_event at 0x7fc0d01f8ef0>, lifecycle=<function asap at 0x7fc0f0896c20>, registry=<kopf.toolkits.legacy_registries.SmartGlobalRegistry object at 0x7fc0e17f9550>, settings=OperatorSettings(logging=LoggingSettings(), posting=PostingSettings(enabled=True, level=20), watching=WatchingSettings(server_timeout=None, client_timeout=None, connect_timeout=None, reconnect_backoff=0.1), batching=BatchingSettings(worker_limit=None, idle_timeout=5.0, batch_window=0.1, exit_timeout=2.0), execution=ExecutionSettings(executor=<concurrent.futures.thread.ThreadPoolExecutor object at 0x7fc0e17f9350>, _max_workers=None), background=BackgroundSettings(cancellation_polling=60, instant_exit_timeout=None, instant_exit_zero_time_cycles=10), persistence=PersistenceSettings(finalizer='op2/kopf-finalizer', progress_storage=<kopf.storage.progress.AnnotationsProgressStorage object at 0x7fc0e1808790>, diffbase_storage=<kopf.storage.diffbase.AnnotationsDiffBaseStorage object at 0x7fc0e17f9490>)), memories=<kopf.structs.containers.ResourceMemories object at 0x7fc0e17f9b90>, resource=Resource(group='zalando.org', version='v1', plural='kopfexamples'), event_queue=<Queue at 0x7fc0e17f9d10 maxsize=0 _getters[1] tasks=2>) failed with an exception. Ignoring the event.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/reactor/queueing.py", line 187, in worker
    await processor(raw_event=raw_event, replenished=replenished)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/reactor/processing.py", line 204, in process_resource_event
    replenished=replenished,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/reactor/processing.py", line 227, in apply_reaction_outcomes
    await patching.patch_obj(resource=resource, patch=patch, body=body)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/clients/auth.py", line 45, in wrapper
    return await fn(*args, **kwargs, context=context)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/kopf/clients/patching.py", line 59, in patch_obj
    raise_for_status=True,
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client.py", line 588, in _request
    resp.raise_for_status()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/aiohttp/client_reqrep.py", line 946, in raise_for_status
    headers=self.headers)
aiohttp.client_exceptions.ClientResponseError: 422, message='Unprocessable Entity', url=URL('https://kubernetes.docker.internal:6443/apis/zalando.org/v1/namespaces/default/kopfexamples/kopf-example-1')
(snip)

$ kubectl get kopfexamples.zalando.org -oyaml 
apiVersion: v1
items:
- apiVersion: zalando.org/v1
  kind: KopfExample
  metadata:
    annotations:
      op1/last-handled-configuration: '{"spec": {"duration": "1m", "field": "value",
        "items": ["item1", "item2"]}, "metadata": {"labels": {"somelabel": "somevalue"},
        "annotations": {"op2/last-handled-configuration": "{\"spec\": {\"duration\":
        \"1m\", \"field\": \"value\", \"items\": [\"item1\", \"item2\"]}, \"metadata\":
        {\"labels\": {\"somelabel\": \"somevalue\"}, \"annotations\": {\"op1/last-handled-configuration\":
        \"{\\\"spec\\\": {\\\"duration\\\": \\\"1m\\\", \\\"field\\\": \\\"value\\\",
        \\\"items\\\": [\\\"item1\\\", \\\"item2\\\"]}, \\\"metadata\\\": {\\\"labels\\\":
        {\\\"somelabel\\\": \\\"somevalue\\\"}, \\\"annotations\\\": {\\\"op2/last-handled-configuration\\\":
        \\\"{\\\\\\\"spec\\\\\\\": {\\\\\\\"duration\\\\\\\": \\\\\\\"1m\\\\\\\",
        \\\\\\\"field\\\\\\\": \\\\\\\"value\\\\\\\", \\\\\\\"items\\\\\\\": [\\\\\\\"item1\\\\\\\",
        \\\\\\\"item2\\\\\\\"]}, \\\\\\\"metadata\\\\\\\": {\\\\\\\"labels\\\\\\\":
        {\\\\\\\"somelabel\\\\\\\": \\\\\\\"somevalue\\\\\\\"}, \\\\\\\"annotations\\\\\\\":
        {\\\\\\\"op1/last-handled-configuration\\\\\\\": \\\\\\\"{\\\\\\\\\\\\\\\"spec\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\"duration\\\\\\\\\\\\\\\": \\\\\\\\\\\\\\\"1m\\\\\\\\\\\\\\\",
        \\\\\\\\\\\\\\\"field\\\\\\\\\\\\\\\": \\\\\\\\\\\\\\\"value\\\\\\\\\\\\\\\",
        \\\\\\\\\\\\\\\"items\\\\\\\\\\\\\\\": [\\\\\\\\\\\\\\\"item1\\\\\\\\\\\\\\\",
        \\\\\\\\\\\\\\\"item2\\\\\\\\\\\\\\\"]}, \\\\\\\\\\\\\\\"metadata\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\"labels\\\\\\\\\\\\\\\": {\\\\\\\\\\\\\\\"somelabel\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\"somevalue\\\\\\\\\\\\\\\"}, \\\\\\\\\\\\\\\"annotations\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\"op2/last-handled-configuration\\\\\\\\\\\\\\\": \\\\\\\\\\\\\\\"{\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"spec\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"duration\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"1m\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\", \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"field\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"value\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\", \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"items\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        [\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"item1\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\", \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"item2\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"]},
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"metadata\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"labels\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\": {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"somelabel\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"somevalue\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"},
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"annotations\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"op1/last-handled-configuration\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"{\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"spec\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"duration\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"1m\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\",
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"field\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"value\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\",
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"items\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        [\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"item1\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\",
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"item2\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"]},
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"metadata\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"labels\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"somelabel\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"somevalue\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"},
        \\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"annotations\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":
        {\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\"op2/last-handled-configuration\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\":

(snip)

$ kubectl get kopfexamples.zalando.org kopf-example-1 -oyaml|wc 
     250     401  165593

Environment

  • Kopf version: 0.27
  • Kubernetes version: v1.16.6-beta.0 (docker-desktop for macos)
  • Python version: 3.7.5
  • OS/platform: macos
Python packages installed
aiohttp==3.6.1
aiojobs==0.2.2
astroid==2.0.4
async-timeout==3.0.1
attrs==19.2.0
autopep8==1.4.2
cachetools==3.1.1
certifi==2019.9.11
chardet==3.0.4
clang==6.0.0.2
Click==7.0
colored==1.4.0
google-auth==1.6.3
idna==2.8
iso8601==0.1.12
isort==4.3.4
kopf==0.27
kubernetes==10.0.1
lazy-object-proxy==1.3.1
mccabe==0.6.1
multidict==4.5.2
oauthlib==3.1.0
pip==20.1.1
prompt-toolkit==2.0.6
pyasn1==0.4.7
pyasn1-modules==0.2.7
pycodestyle==2.4.0
pykube-ng==19.10.0
pylint==2.1.1
python-dateutil==2.8.0
PyYAML==5.1.2
requests==2.22.0
requests-oauthlib==1.2.0
rsa==4.0
setuptools==42.0.2
six==1.11.0
typing-extensions==3.7.4.2
urllib3==1.25.6
wcwidth==0.1.7
websocket-client==0.56.0
wheel==0.33.6
wrapt==1.10.11
yarl==1.3.0

Commented by akojima at 2020-06-09 03:33:36+00:00
 

Also, is there a way to make kopf write state data only in resources that are of actual interest to the operator? For example, I have Pod event handlers that are filtered by label but all pods in the cluster get annotated by kopf.


Commented by nolar at 2020-06-11 12:21:27+00:00
 

akojima For the latter question: That's strange. If you use labels= or annotations= or when= or any other filters on the handlers, the unrelated objects should not be touched in any way. If that is the case, it is a bug.

In particular, the @kopf.on.event handlers should not cause Kopf to store any annotations, as they are low-level "spying" handlers.

Can you please report this as a separate issue?


For the former (main) question: That behaviour is explainable, and maybe is desired. Here is what happens:

Since the two operators are unaware of each other, they do not purge each other's annotations from the snapshots. The other operator's annotations look just like any other annotations, including human-added ones.

The same would be true for the status fields or any other ways of storing the state on the resource itself — as long as they are indeed unaware of each other.

The logic of cleaning up the resource's body is here: https://github.com/nolar/kopf/blob/0.27/kopf/storage/progress.py#L233-L235 and in other clear() methods in the same module.

Another bit of it is here: https://github.com/nolar/kopf/blob/0.27/kopf/storage/diffbase.py#L78-L79

As you can see, either the operator's own prefixes are removed, or the kopf.zalando.org/-prefixed annotations are removed (regardless of the operator's identity). Everything else is considered essential.

To work around this problem you have to make them aware of each other. There are three ways:

  • Always keep kopf.zalando.org prefix.
  • Have the same prefix for all your operators (but: beware of conflicting handler names then).
  • Implement your own progress storage by inheriting from kopf.AnnotationsProgressStorage and extending its clear() method with a few more cleanups for the "related" operators, so that both of them become aware of each other.
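
The cleanup for the third option might look roughly like the sketch below. It is written as a plain function so it runs without kopf installed; the name drop_related_annotations and the prefixes op1 and op2 are hypothetical. A subclass of kopf.AnnotationsProgressStorage could presumably call it from its overridden clear() after the parent's clear(), but the exact method signature should be checked against the kopf version in use.

```python
import copy


def drop_related_annotations(essence, related_prefixes):
    """Return a copy of the essence without annotations of related operators."""
    cleaned = copy.deepcopy(essence)
    annotations = cleaned.get("metadata", {}).get("annotations", {})
    for name in list(annotations):
        # Match "<prefix>/..." so that unrelated annotations are left alone.
        if any(name.startswith(f"{prefix}/") for prefix in related_prefixes):
            del annotations[name]
    return cleaned


essence = {
    "spec": {"field": "value"},
    "metadata": {
        "annotations": {
            "op1/last-handled-configuration": "{...}",
            "op2/create_fn": "{...}",
            "team/owner": "alice",  # a human-added annotation, must survive
        }
    },
}
cleaned = drop_related_annotations(essence, ["op1", "op2"])
print(sorted(cleaned["metadata"]["annotations"]))
```

The function works on a deep copy so that the caller's object is not modified; only the snapshot loses the related operators' annotations.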

An operator does not actually include its own data directly (op1's data by the op1 operator). Rather, op1 includes op2's data, which in turn includes op1's data.


On a side-note: this is indeed a weird behaviour. I see how confusing it can be. And I am open to any suggestions to make it smoother.

Perhaps, I can add a tiny marker to the content of all progress-/diffbase-related annotations of all Kopf-based operators, and exclude such annotations regardless of their prefixes.

Technically, the content of the annotations is not promised to be a json. It can be any string in any format/notation/syntax. JSON is only used for debuggability. So, adding any kind of markers seems acceptable.


Commented by akojima at 2020-06-11 16:18:06+00:00
 

nolar Thank you for the thorough explanation.

I filed issue #374 for the secondary issue.

As for the main issue, always keeping the default kopf.zalando.org prefix across all operators (plus unique handler names) seems like the simplest workaround.

I also have the impression that sometimes a ping-pong effect is triggered with multiple operators, where one handler notices the other operator adding annotations, then adding data of its own, which is noticed and handled the same way by the other operator. This loops with a flood of messages in --verbose mode, until the patch requests start failing because the resources can't grow any longer.

I switched to a single operator model for my project so this doesn't immediately affect me anymore, but still, anyone using 2 Kopf operators in the same cluster would be affected.

The marker to identify data added by Kopf sounds like a good idea to me.

Another idea would be to use a fixed key pre-prefix/suffix (e.g. myprefix/handler_name-kopf.private instead of myprefix/handler_name).

But generally speaking, what we want is the ability to advertise certain fields as being private to a specific operator instance and thus of no interest to anyone else, regardless of what framework is used to write them. Ideally there would be a standard way of doing that in all of Kubernetes-land, but since there isn't one (as far as I know, I'm new to Kubernetes tho), the only way I can think of would be a configurable list of prefixes or suffixes to filter out known private field patterns.


Commented by nolar at 2020-06-12 08:02:24+00:00
 

Yes, it is a known issue. Or a known feature. If 2 operators use the same kopf.zalando.org/last-handled-configuration annotation, they will be triggering each other's reactions — like ping-pong indeed — and noticing the changes happening (the annotation is used as a base value for diff calculation).

We had this issue in our apps too. The only proper solution is to have different operator identities (this is why this feature was added). But — back to the original issue — the operators should be somehow aware of each other and/or ignore each other even if unaware.

I'll take it as a priority for version 0.28, together with K8s API throttling (it is quite easy to kill the cluster with all those 422s and ping-pongs going way too fast; see #351).

@kopf-archiver kopf-archiver bot closed this as completed Aug 18, 2020
@kopf-archiver kopf-archiver bot changed the title from "[archival placeholder]" to "Resource grows too big with multiple operators for same resource" Aug 19, 2020
@kopf-archiver kopf-archiver bot added the bug label Aug 19, 2020
@kopf-archiver kopf-archiver bot reopened this Aug 19, 2020
@nolar nolar self-assigned this Sep 4, 2020

nolar commented Sep 9, 2020

The issue is addressed in #539: all Kopf-originating annotations, regardless of the specific operators or their identities, will be ignored now. This required a little injection of Kopf's own identity, so it would be impossible to hide that the operators are Kopf-based, if it was a concern at all.

The reproduction is shown there too. Tested manually: both reproduced and fixed.
