Prevent loss of events under stress-load or sync-blockers #732
A potential race condition is described in #718: when an "idle timeout" happens in a worker, it takes a moment before the worker's stream is deleted; the multiplexer can add new events during this tiny time window, and such events will be lost. (UPD: Not fully correct: the multiplexer does nothing until control is given back to the asyncio event loop; still, in some cases the timeout happens even if the queue is filled.)
Similarly in #784: the queue-getting timeout happens because of lengthy sync operations in the operator's callback filters executed in the framework's stack (this can be simulated with `time.sleep(1)`). They were blocking the asyncio loop long enough for an event to arrive at the I/O socket, be noticed, and be queued immediately once control returned to the asyncio loop, but too late for the worker's queue-getting call, which had already timed out. This resulted in the worker exiting with `backlog.qsize() == 1`.
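The blocking effect itself is easy to demonstrate with a plain asyncio loop (a self-contained illustration, not Kopf code): a timer comparable to the worker's queue-getting timeout cannot fire while a sync call blocks the loop, so it fires only after control returns.

```python
import asyncio
import time

async def main():
    loop = asyncio.get_running_loop()
    start = loop.time()
    fired = []
    # A 50 ms timer: a stand-in for the worker's queue-getting timeout.
    loop.call_later(0.05, lambda: fired.append(loop.time() - start))
    time.sleep(0.3)             # a sync blocker, as in the operator's filters
    await asyncio.sleep(0.01)   # give control back; the overdue timer fires now
    return fired[0]

delay = asyncio.run(main())
print(f'the 0.05s timer actually fired after {delay:.2f}s')
```

The timer is hopelessly late: it fires only after the sync blocker releases the loop, just as the queued event becomes visible only after the worker's timeout has already expired.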
This PR hypothetically fixes the issue by doing an extra check on the queue's emptiness before actually exiting the worker. However, the fix is difficult to verify, as the issue is difficult to reproduce.

UPD: Both the issue and the fix are confirmed by manual tests. However, there are no automated tests: it seems impossible to simulate a scenario where `queue.get()` times out while the queue is non-empty, neither with pure Python nor with mocks.

Locks are intentionally not used to avoid locking overhead: acquiring locks is quite expensive for this performance-critical part of the code, and the issue happens only in rare circumstances.
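The shape of the fix can be sketched as follows. This is a minimal illustration, not Kopf's actual code; the `FlakyQueue` helper is invented here to artificially fake the spurious timeout, since (as noted) the real race cannot be simulated:

```python
import asyncio

class FlakyQueue(asyncio.Queue):
    """An asyncio.Queue whose first get() raises TimeoutError despite holding
    an item: an artificial stand-in for the non-simulatable race."""
    async def get(self):
        if not getattr(self, '_tripped', False):
            self._tripped = True
            raise asyncio.TimeoutError
        return await super().get()

async def worker(backlog: asyncio.Queue, idle_timeout: float = 0.1):
    processed = []
    while True:
        try:
            event = await asyncio.wait_for(backlog.get(), timeout=idle_timeout)
        except asyncio.TimeoutError:
            # The extra check: a timeout alone does not prove emptiness.
            # No locks are taken; a plain emptiness check is cheap enough.
            if backlog.empty():
                break       # truly idle: exit the worker
            continue        # spurious timeout: loop again and drain the backlog
        processed.append(event)
    return processed

async def main():
    backlog = FlakyQueue()
    backlog.put_nowait('event-2')   # queued just before the (faked) timeout
    return await worker(backlog)

print(asyncio.run(main()))  # the event is drained, not lost: ['event-2']
```

Without the `backlog.empty()` re-check, the first (faked) timeout would end the worker with the event still queued; with it, the worker loops around, drains the event, and only exits on a genuinely idle timeout.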
A manual repro:
Here, we use namespaces as the resource whose events will be lost (they are easier to create from the command line: no YAMLs are needed). Specifically, the namespace "simulated-123" is supposed to get 2 events:

- The 1st event is never lost, as it creates a worker. We intentionally give control back to asyncio to ensure that the worker is started (because the bug is in the worker).
- The 2nd event is lost if the sync delay in the async event loop exceeds the worker's idle timeout. If it is below it (taking the first `asyncio.sleep()` into account too), then the event is not lost.

Before the fix:
After the fix:
When diving deeper, additional `print()` statements in `kopf._core.reactor.queueing.worker()` can show that the `finally:` clause is executed with `backlog.qsize() == 1` before the fix, i.e. the queue is not truly empty on a timeout of getting from it.
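For illustration, the pre-fix shape of the worker can be mimicked with an artificially faked timeout (the `FlakyQueue` helper is invented here; this is not Kopf's actual code, and the real race, as noted, cannot be simulated):

```python
import asyncio

class FlakyQueue(asyncio.Queue):
    """An asyncio.Queue whose first get() raises TimeoutError despite holding
    an item: an artificial stand-in for the non-simulatable race."""
    async def get(self):
        if not getattr(self, '_tripped', False):
            self._tripped = True
            raise asyncio.TimeoutError
        return await super().get()

async def buggy_worker(backlog: asyncio.Queue):
    try:
        while True:
            # Pre-fix logic: a get() timeout is treated as proof of idleness.
            event = await backlog.get()
            print(f'processing {event!r}')
    except asyncio.TimeoutError:
        pass
    finally:
        # This is what the extra print() statements reveal before the fix:
        print(f'finally: backlog.qsize() == {backlog.qsize()}')

backlog = FlakyQueue()
backlog.put_nowait('event-2')       # queued, but never seen by the worker
asyncio.run(buggy_worker(backlog))  # prints: finally: backlog.qsize() == 1
```

The worker exits its `finally:` clause with one event still in the backlog, which is exactly the lost-event symptom the debug prints exposed.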