Commit 65be478
bpf: Restore trylock for ringbuf in NMI
Ritesh reported in [0] occurrences of timeouts in cases where the AA
heuristic should have reliably triggered. There are a few known cases
where an NMI can interrupt lower contexts in situations where AA
detection does not trigger, which leads to timeouts. One of these is an
NMI landing between the cmpxchg in the fast path and the creation of the
lock entry. Another is a chain of NMI waiters which do not cause
AA on their CPUs, but keep losing to other waiters which can queue and
skip ahead of them in the contention chain. More details are available
in the link below.
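The first case above can be pictured with a small user-space model. This is only a sketch under assumptions: the names `aa_lock`, `aa_held_table`, `aa_fast_cmpxchg`, `aa_record_held`, and `aa_is_held_by_this_cpu` are hypothetical and are not the kernel's identifiers. It shows that when taking the lock word and recording ownership are two separate steps, an AA check that runs between them sees no entry for a lock the same CPU already holds.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define AA_MAX_HELD 4	/* hypothetical table capacity */

struct aa_lock {
	atomic_int val;	/* 0 = free, 1 = locked */
};

/* Hypothetical per-CPU table of locks this CPU currently holds. */
struct aa_held_table {
	int cnt;
	const struct aa_lock *locks[AA_MAX_HELD];
};

/* Step 1 of the fast path: take the lock word with a single cmpxchg. */
static bool aa_fast_cmpxchg(struct aa_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Step 2 of the fast path: record the lock in the held-locks table. */
static void aa_record_held(struct aa_held_table *held, const struct aa_lock *lock)
{
	if (held->cnt < AA_MAX_HELD)
		held->locks[held->cnt++] = lock;
}

/* AA check: is this lock already recorded as held by this CPU? */
static bool aa_is_held_by_this_cpu(const struct aa_held_table *held,
				   const struct aa_lock *lock)
{
	for (int i = 0; i < held->cnt; i++)
		if (held->locks[i] == lock)
			return true;
	return false;
}
```

If an interrupting context calls `aa_is_held_by_this_cpu()` after `aa_fast_cmpxchg()` has succeeded but before `aa_record_held()` has run, the check returns false even though the lock is held on the same CPU. In this model, a waiter in that window has no way to detect the AA condition and can only give up once its timeout expires.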
For the short term, restore the trylock fallback in case of NMIs for
the ring buffer. This can be lifted once a more reasonable fallback in case
of NMIs is agreed upon that does not cause long timeouts for the common
case.
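The shape of such a fallback can be sketched in user-space C. This is an illustration under assumptions, not the actual patch: `rb_lock`, `rb_demo`, `rb_trylock`, `rb_lock_bounded`, `rb_reserve`, and `RB_MAX_SPINS` are hypothetical names, the `in_nmi` flag is passed by the caller rather than queried from the kernel, and the bounded spin merely stands in for a lock acquisition that can time out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RB_MAX_SPINS 1000UL	/* hypothetical stand-in for a lock timeout */

struct rb_lock {
	atomic_int val;		/* 0 = free, 1 = held */
	unsigned long attempts;	/* attempt counter; demo only, not thread-safe */
};

/* Single acquisition attempt: never spins, returns true on success. */
static bool rb_trylock(struct rb_lock *l)
{
	int expected = 0;

	l->attempts++;
	return atomic_compare_exchange_strong_explicit(&l->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Bounded spin: returns 0 on success, -1 once the spin budget is used up. */
static int rb_lock_bounded(struct rb_lock *l)
{
	for (unsigned long i = 0; i < RB_MAX_SPINS; i++)
		if (rb_trylock(l))
			return 0;
	return -1;
}

static void rb_unlock(struct rb_lock *l)
{
	atomic_store_explicit(&l->val, 0, memory_order_release);
}

struct rb_demo {
	struct rb_lock lock;
	size_t prod_pos;	/* next free offset */
	size_t size;		/* total capacity in bytes */
};

/* Reserve len bytes; returns the offset, or -1 if the lock or space is unavailable. */
static long rb_reserve(struct rb_demo *rb, size_t len, bool in_nmi)
{
	long off = -1;

	if (in_nmi) {
		/* NMI path: one attempt only, fail fast instead of waiting. */
		if (!rb_trylock(&rb->lock))
			return -1;
	} else if (rb_lock_bounded(&rb->lock)) {
		/* Normal path: wait, but give up after the spin budget. */
		return -1;
	}

	if (len <= rb->size - rb->prod_pos) {
		off = (long)rb->prod_pos;
		rb->prod_pos += len;
	}
	rb_unlock(&rb->lock);
	return off;
}
```

In this model, an NMI-context caller that finds the lock held fails after a single attempt, while a non-NMI caller spends its whole spin budget before failing. The trade-off is that an NMI-context reservation can fail under ordinary contention rather than only in a genuine deadlock.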
[0]: https://lore.kernel.org/bpf/CAH6OuBTjG+N=+GGwcpOUbeDN563oz4iVcU3rbse68egp9wj9_A@mail.gmail.com
Reported-by: Ritesh Oedayrajsingh Varma <ritesh@superluminal.eu>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
1 parent 79baa40, commit 65be478
1 file changed, 5 insertions(+), 1 deletion(-)