Commit 65be478

bpf: Restore trylock for ringbuf in NMI
Ritesh reported in [0] occurrences of timeouts in cases where the AA heuristic should have reliably triggered. There are a few known cases where an NMI can interrupt lower contexts in situations where AA detection will not trigger, leading to timeouts. One of these is an NMI landing between the cmpxchg in the fast path and the creation of the lock entry. The other can be a chain of NMI waiters which do not cause AA on their CPUs, but keep losing to other waiters which can queue and skip ahead of them in the contention chain. More details are available in the link below.

For the short term, restore the trylock fallback in case of NMIs for the ring buffer. This can be lifted once a more reasonable fallback in case of NMIs is agreed upon that does not cause long timeouts for the common case.

[0]: https://lore.kernel.org/bpf/CAH6OuBTjG+N=+GGwcpOUbeDN563oz4iVcU3rbse68egp9wj9_A@mail.gmail.com

Reported-by: Ritesh Oedayrajsingh Varma <ritesh@superluminal.eu>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
1 parent 79baa40 commit 65be478

1 file changed

+5
-1
lines changed

kernel/bpf/ringbuf.c

Lines changed: 5 additions & 1 deletion
@@ -474,8 +474,12 @@ static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
 
 	cons_pos = smp_load_acquire(&rb->consumer_pos);
 
-	if (raw_res_spin_lock_irqsave(&rb->spinlock, flags))
+	if (in_nmi()) {
+		if (!raw_res_spin_trylock_irqsave(&rb->spinlock, flags))
+			return NULL;
+	} else if (raw_res_spin_lock_irqsave(&rb->spinlock, flags)) {
 		return NULL;
+	}
 
 	pend_pos = rb->pending_pos;
 	prod_pos = rb->producer_pos;
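
The control flow of the patched branch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: `demo_lock`, `demo_in_nmi`, `demo_trylock`, `demo_lock_bounded`, and `demo_reserve` are hypothetical stand-ins invented for illustration, built on C11 `atomic_flag` rather than the resilient spinlock. It only models the decision made in the diff: a caller in NMI context makes a single lock attempt and gives up immediately, while any other caller waits up to a bound and fails on timeout.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for rb->spinlock; NOT the kernel's resilient spinlock. */
struct demo_lock {
	atomic_flag held;
};

/* Hypothetical stand-in for in_nmi(); the caller sets it by hand. */
static bool demo_in_nmi;

/* Single attempt, never waits. Returns true if the lock was taken. */
static bool demo_trylock(struct demo_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/*
 * Bounded wait. Returns 0 on success and -1 once max_spins attempts
 * have failed, loosely modelling a lock that can report a timeout.
 */
static int demo_lock_bounded(struct demo_lock *l, unsigned long max_spins)
{
	for (unsigned long i = 0; i < max_spins; i++) {
		if (demo_trylock(l))
			return 0;
	}
	return -1;
}

static void demo_unlock(struct demo_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Placeholder for the memory a successful reservation would hand out. */
static char demo_slot[64];

/* Same branch shape as the patched __bpf_ringbuf_reserve(). */
static void *demo_reserve(struct demo_lock *l)
{
	if (demo_in_nmi) {
		/* NMI context: one attempt only, fail fast on contention. */
		if (!demo_trylock(l))
			return NULL;
	} else if (demo_lock_bounded(l, 1000)) {
		/* Other contexts: wait up to the bound, then fail. */
		return NULL;
	}

	/* The critical section (position bookkeeping) would go here. */
	demo_unlock(l);
	return demo_slot;
}
```

In this model, a reservation attempted with `demo_in_nmi` set while the lock is already held returns `NULL` after one attempt instead of spinning, which is the behaviour the commit restores for the ring buffer in the short term.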
