
Pull bpf/for-next #189

Merged
merged 49 commits into from
Apr 29, 2024

Conversation


@htejun htejun commented Apr 29, 2024

No description provided.

Benjamin Tissoires and others added 30 commits April 23, 2024 18:31
To be able to add workqueues and reuse most of the timer code, we need
to make bpf_hrtimer more generic.

There is no code change except that the new struct gets a new u64 flags
attribute. We are still below 2 cache lines, so this shouldn't impact
the currently running code.

The ordering is also changed: everything related to the async callback
is now at the top of bpf_hrtimer.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-1-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
No code change except for the new flags argument being stored in the
local data struct.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-2-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In the same way we have a generic __bpf_async_init(), we also need
to share code between timer and workqueue for the set_callback call.

We just add an unused flags parameter, as it will be used for workqueues.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-3-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
For the same reason as with most bpf_timer* functions, we need almost the
same for workqueues.
So extract the generic part out of it so that bpf_wq_cancel_and_free() can
reuse it.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-4-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Mostly a copy/paste from the bpf_timer API, without the initialization
and free, as they will be done in a separate patch.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-5-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
cp include/uapi/linux/bpf.h tools/include/uapi/linux/bpf.h

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-6-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When a kfunc is declared with a KF_ARG_PTR_TO_MAP, we should have
reg->map_ptr set to a non-NULL value; otherwise, that means that the
underlying type is not a map.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-7-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Introduce support for KF_ARG_PTR_TO_WORKQUEUE. The kfuncs will take bpf_wq
as an argument, and the verifier will recognize it as a workqueue argument.
The bpf_wq_kern cast can happen inside the kfunc, but taking bpf_wq as the
argument makes life easier for users who work with the non-kern type in BPF
progs.

Duplicate process_timer_func into process_wq_func.
The meta argument is only needed to ensure that bpf_wq_init's workqueue and
map arguments come from the same map (the map_uid logic is necessary for
correct inner-map handling), so also amend check_kfunc_args() to
match what the equivalent helper-function check does.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-8-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Currently bpf_wq_cancel_and_free() is just a placeholder as there is
no memory allocation for bpf_wq just yet.

Again, this duplicates the bpf_timer approach.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-9-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We simply try, for all supported map types, whether we can store/load a bpf_wq.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-10-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We need to teach the verifier about the second argument, which is declared
as void * but is actually of type KF_ARG_PTR_TO_MAP. We could have dropped
this extra case had we declared the second argument as struct bpf_map *,
but that would mean users have to do extra casting for their programs to
compile.

We also need to duplicate the timer code for checking whether the map
argument matches the provided workqueue.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-11-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This allows testing whether allocation/free works.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-12-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
To support sleepable async callbacks, we need to tell push_async_cb()
whether the cb is sleepable or not.

The verifier now detects that we are in bpf_wq_set_callback_impl and
can allow a sleepable callback to happen.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-13-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
We assign the callback and set everything up.
The actual tests of these callbacks will be done when bpf_wq_start() is
available.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-14-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Again, a copy/paste from bpf_timer_start().

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-15-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This allows testing whether allocation/free works.

Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-16-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Benjamin Tissoires says:

====================
Introduce bpf_wq

This is a followup of sleepable bpf_timer[0].

When discussing sleepable bpf_timer, it was thought that we should give
bpf_wq a try, as the two APIs are similar but distinct enough to
justify a new one.

So here it is.

I tried to keep as much common code as possible in kernel/bpf/helpers.c,
but I couldn't avoid some code duplication in kernel/bpf/verifier.c.

This series introduces a basic bpf_wq support:
- creation is supported
- assignment is supported
- running a simple bpf_wq is also supported.

We will probably need to extend the API further with:
- a full delayed_work API (can be piggybacked on top with a correct
  flag)
- bpf_wq_cancel() <- apparently not, this is shooting ourselves in the
  foot
- bpf_wq_cancel_sync() (for sleepable programs)
- documentation
---

For reference, the use cases I have in mind:

---

Basically, I need to be able to defer a HID-BPF program for the
following reasons (from the aforementioned patch):
1. defer an event:
   Sometimes we receive an out-of-proximity event, but the device cannot
   be trusted enough, and we need to ensure that we won't receive another
   one in the following n milliseconds. So we need to wait those n
   milliseconds, and then, if appropriate, re-inject that event into the stack.

2. inject new events in reaction to one given event:
   We might want to transform one given event into several. This is the
   case for macro keys where a single key press is supposed to send
   a sequence of key presses. But this could also be used to patch a
   faulty behavior, if a device forgets to send a release event.

3. communicate with the device in reaction to one event:
   We might want to communicate back to the device after a given event.
   For example a device might send us an event saying that it came back
   from sleeping state and needs to be re-initialized.

Currently we can achieve that by keeping a userspace program around,
raising a bpf event, and letting that userspace program inject the events
and commands.
However, we are keeping that program alive as a daemon just for
scheduling commands. There is no logic in it, so it doesn't really justify
an actual userspace wakeup, and a kernel workqueue seems simpler to handle.

bpf_timers currently run in a soft IRQ context; this patch
series implements a sleepable context for them.

Cheers,
Benjamin

[0] https://lore.kernel.org/all/20240408-hid-bpf-sleepable-v6-0-0499ddd91b94@kernel.org/

Changes in v2:
- took previous review into account
- mainly dropped BPF_F_WQ_SLEEPABLE
- Link to v1: https://lore.kernel.org/r/20240416-bpf_wq-v1-0-c9e66092f842@kernel.org

====================

Link: https://lore.kernel.org/r/20240420-bpf_wq-v2-0-6c986a5a741f@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
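To make the flow concrete, here is a minimal sketch of how the pieces
introduced in this series fit together in a BPF program. It loosely follows
the shape of the wq selftests; the kfunc declarations are assumed to come
from bpf_experimental.h as added here, and exact signatures may differ:

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>
  #include "bpf_experimental.h"    /* assumed: bpf_wq kfunc declarations */

  /* Like a bpf_timer, a bpf_wq must live in a map value. */
  struct elem {
      struct bpf_wq work;
  };

  struct {
      __uint(type, BPF_MAP_TYPE_ARRAY);
      __uint(max_entries, 1);
      __type(key, int);
      __type(value, struct elem);
  } array SEC(".maps");

  /* Runs later from a kernel workqueue, i.e. in sleepable context. */
  static int wq_cb(void *map, int *key, struct bpf_wq *wq)
  {
      return 0;
  }

  SEC("tc")
  int schedule_work(void *ctx)
  {
      struct elem *val;
      int key = 0;

      val = bpf_map_lookup_elem(&array, &key);
      if (!val)
          return 0;
      /* init ties the wq to its owning map; then set callback and start */
      if (bpf_wq_init(&val->work, &array, 0))
          return 0;
      if (bpf_wq_set_callback(&val->work, wq_cb, 0))
          return 0;
      bpf_wq_start(&val->work, 0);
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";
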
__bpf_prog_enter_sleepable_recur() does a recursion check that is not
applicable to the wq callback. The callback function is part of a bpf
program, and that bpf prog might be running on the same cpu, so the
recursion check would incorrectly prevent the callback from running. The
code could call __bpf_prog_enter_sleepable(), but run_ctx would be fake;
hence, use explicit rcu_read_lock_trace(); migrate_disable(); to address
this problem. Another reason to open-code this is that __bpf_prog_enter*
are not available in !JIT configs.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404241719.IIGdpAku-lkp@intel.com/
Closes: https://lore.kernel.org/oe-kbuild-all/202404241811.FFV4Bku3-lkp@intel.com/
Fixes: eb48f6c ("bpf: wq: add bpf_wq_init")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
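For reference, the open-coded run context described above has roughly this
shape (a sketch only; the real code lives in kernel/bpf/helpers.c, and the
locals here are illustrative):

  static void bpf_wq_work(struct work_struct *work)
  {
      /* ... resolve callback_fn, map, key and value from the work item ... */

      rcu_read_lock_trace();    /* instead of __bpf_prog_enter_sleepable_recur() */
      migrate_disable();

      callback_fn((u64)(long)map, (u64)(long)key, (u64)(long)value, 0, 0);

      migrate_enable();
      rcu_read_unlock_trace();
  }
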
Introduce two new BPF kfuncs, bpf_preempt_disable and
bpf_preempt_enable. These kfuncs allow disabling preemption in BPF
programs. Nesting is allowed, since the intended use cases include
building native BPF spin locks without kernel helper involvement. Apart
from that, this can be used with per-CPU data structures for cases where
programs (or userspace) may preempt one another. Currently, while
per-CPU access is stable, whether it will be consistent is not
guaranteed, as only migration is disabled for BPF programs.

Global functions are disallowed from being called, but support for them
will be added as a follow-up, not just for preempt kfuncs but for
rcu_read_lock kfuncs as well. Static subprog calls are permitted. Sleepable
helpers and kfuncs are disallowed in non-preemptible regions.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240424031315.2757363-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
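As a usage sketch (with the kfunc declarations assumed to be the ones added
by this patch), a program can make a per-CPU read-modify-write consistent
like this:

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>

  /* Assumed: kfuncs introduced by this patch. */
  void bpf_preempt_disable(void) __ksym;
  void bpf_preempt_enable(void) __ksym;

  struct {
      __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
      __uint(max_entries, 1);
      __type(key, int);
      __type(value, __u64);
  } counters SEC(".maps");

  SEC("tc")
  int bump(void *ctx)
  {
      __u64 *cnt;
      int key = 0;

      bpf_preempt_disable();
      cnt = bpf_map_lookup_elem(&counters, &key);
      if (cnt)
          (*cnt)++;            /* cannot be preempted mid-update now */
      bpf_preempt_enable();    /* verifier requires balanced enables */
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";
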
Add tests for nested cases, nested count preservation upon different
subprog calls that disable/enable preemption, and test sleepable helper
call in non-preemptible regions.

182/1   preempt_lock/preempt_lock_missing_1:OK
182/2   preempt_lock/preempt_lock_missing_2:OK
182/3   preempt_lock/preempt_lock_missing_3:OK
182/4   preempt_lock/preempt_lock_missing_3_minus_2:OK
182/5   preempt_lock/preempt_lock_missing_1_subprog:OK
182/6   preempt_lock/preempt_lock_missing_2_subprog:OK
182/7   preempt_lock/preempt_lock_missing_2_minus_1_subprog:OK
182/8   preempt_lock/preempt_balance:OK
182/9   preempt_lock/preempt_balance_subprog_test:OK
182/10  preempt_lock/preempt_global_subprog_test:OK
182/11  preempt_lock/preempt_sleepable_helper:OK
182     preempt_lock:OK
Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240424031315.2757363-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Kumar Kartikeya Dwivedi says:

====================
Introduce bpf_preempt_{disable,enable}

This set introduces two kfuncs, bpf_preempt_disable and
bpf_preempt_enable, which are wrappers around preempt_disable and
preempt_enable in the kernel. These functions allow a BPF program to
have code sections where preemption is disabled. There are multiple use
cases that are served by such a feature, a few are listed below:

1. Writing safe per-CPU algorithms/data structures that work correctly
   across different contexts.
2. Writing safe per-CPU allocators similar to bpf_memalloc on top of
   array/arena memory blobs.
3. Writing locking algorithms in BPF programs natively.

Note that a local_irq_disable/enable equivalent is also needed for proper
IRQ context protection, but that is a more involved change and will be
sent later.

While bpf_preempt_{disable,enable} is not sufficient for all of these
usage scenarios on its own, it is still necessary.

The same effect as these kfuncs can in some sense be already achieved
using the bpf_spin_lock or rcu_read_lock APIs, therefore from the
standpoint of kernel functionality exposure in the verifier, this is
well understood territory.

Note that these helpers do allow calling kernel helpers and kfuncs from
within the non-preemptible region (unless sleepable). Otherwise, any
locks built using the preemption helpers will be as limited as
existing bpf_spin_lock.

Nesting is allowed by keeping a counter that tracks the remaining enables
required to be performed. A similar approach can be applied to
rcu_read_locks in a follow-up.

Changelog
=========
v1: https://lore.kernel.org/bpf/20240423061922.2295517-1-memxor@gmail.com

 * Move kfunc BTF ID declarations above css task kfunc for
   !CONFIG_CGROUPS config (Alexei)
 * Add test case for global function call in non-preemptible region
   (Jiri)
====================

Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240424031315.2757363-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
As Martin mentioned in a review comment, there is an existing bug where
orig_netns_fd is leaked in the later "goto fail;" case after
open("/proc/self/ns/net") in open_netns() in network_helpers.c. This
patch adds "close(token->orig_netns_fd);" before "free(token);" to
fix it.

Fixes: a303388 ("selftests/bpf: Move open_netns() and close_netns() into network_helpers.c")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/a104040b47c3c34c67f3f125cdfdde244a870d3c.1713868264.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
ASSERT helpers defined in test_progs.h shouldn't be used in public
functions like open_netns() and close_netns(), since they depend on
test__fail(), which is defined in test_progs.c. Public functions may be
used not only in test_progs.c but also in other tests like test_sock_addr.c
in the next commit.

This patch replaces the ASSERT helpers in open_netns() and close_netns()
in network_helpers.c with log_err() to decouple the dependencies, then
uses ASSERT_OK_PTR() to check the return values of all open_netns() calls.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/d1dad22b2ff4909af3f8bfd0667d046e235303cb.1713868264.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Include network_helpers.h in test_sock_addr.c and use the newly added
public helper start_server_addr() instead of the locally defined function
start_server(). This avoids duplicate code.

In order to use functions defined in network_helpers.c in test_sock_addr.c,
the Makefile needs to be updated and <linux/err.h> needs to be included in
network_helpers.h to avoid compilation errors.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/3101f57bde5502383eb41723c8956cc26be06893.1713868264.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
This patch uses the public network helper connect_to_addr() exported in
network_helpers.h instead of the locally defined function connect_to_server()
in test_sock_addr.c. This avoids duplicate code.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f263797712d93fdfaf2943585c5dfae56714a00b.1713868264.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
This patch uses the public helper make_sockaddr() exported in network_helpers.h
instead of the locally defined function mk_sockaddr() in test_sock_addr.c.
This avoids duplicate code.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/1473e189d6ca1a3925de4c5354d191a14eca0f3f.1713868264.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Geliang Tang says:

====================
This patchset uses more network helpers in test_sock_addr.c, but
first of all, patch 2 is needed to make network_helpers.c independent
of test_progs.c. Then network_helpers.h can be included in
test_sock_addr.c without compile errors.

Patch 1 and patch 2 address Martin's comments for the previous series
too.

v2:
 - Only a few minor cleanups to patch 5.
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The wq test was missing the destroy(skel) part, which was causing bpf progs
to stay loaded. That was causing test_progs to complain with a
"Failed to unload bpf_testmod.ko from kernel: -11" message. Adding
destroy() wasn't enough, though, since the wq callback may be delayed, so
loop on the unload of bpf_testmod if errno is EAGAIN.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Fixes: 8290dba ("selftests/bpf: wq: add bpf_wq_start() checks")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
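The retry described here might look like the following sketch;
unload_bpf_testmod() is the existing selftests helper, while the retry
bound and delay are illustrative:

  #include <errno.h>
  #include <unistd.h>
  #include "testing_helpers.h"    /* unload_bpf_testmod() */

  /* wq callbacks may still be in flight when the test ends, so retry
   * the module unload while it keeps failing with EAGAIN.
   */
  static int unload_testmod_with_retry(void)
  {
      int i;

      for (i = 0; i < 10; i++) {
          if (!unload_bpf_testmod(false /* verbose */))
              return 0;
          if (errno != EAGAIN)
              break;
          usleep(100 * 1000);    /* let delayed wq callbacks finish */
      }
      return -1;
  }
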
The commit d56b63c ("bpf: add support for bpf_wq user type")
changes the number of supported fields to 11; sync the comment accordingly.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240424054526.8031-1-haiyue.wang@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add crypto API support to BPF to be able to decrypt or encrypt packets
in TC/XDP BPF programs. Special care should be taken with the
initialization part of a crypto algo, because crypto allocation doesn't
work with preemption disabled; it can only be run in a sleepable BPF
program. Async crypto is also not supported because of the very same
issue: TC/XDP BPF programs are not sleepable.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://lore.kernel.org/r/20240422225024.2847039-2-vadfed@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
vvfedorenko and others added 19 commits April 24, 2024 16:01
Implement skcipher crypto in BPF crypto framework.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20240422225024.2847039-3-vadfed@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add simple tc hook selftests to show how to work with the new crypto
BPF API. Some tricky dynptr initialization is used to provide an empty iv
dynptr. A simple AES-ECB algo is used to demonstrate encryption and
decryption of fixed-size buffers.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://lore.kernel.org/r/20240422225024.2847039-4-vadfed@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
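The "tricky dynptr initialization" boils down to creating a zero-sized
dynptr to stand in for the IV that ECB does not need. A hedged sketch of
that part, with the kfunc declarations and buffer names assumed rather than
taken verbatim from the selftest:

  int bpf_dynptr_from_skb(struct __sk_buff *skb, __u64 flags,
                          struct bpf_dynptr *ptr) __ksym;
  int bpf_crypto_decrypt(struct bpf_crypto_ctx *ctx,
                         const struct bpf_dynptr *src,
                         const struct bpf_dynptr *dst,
                         const struct bpf_dynptr *siv) __ksym;

  __u8 dst_buf[16];    /* one AES block, for illustration */

  static __always_inline int decrypt_one_block(struct __sk_buff *skb,
                                               struct bpf_crypto_ctx *ctx)
  {
      struct bpf_dynptr psrc, pdst, iv;

      bpf_dynptr_from_skb(skb, 0, &psrc);                      /* ciphertext */
      bpf_dynptr_from_mem(dst_buf, sizeof(dst_buf), 0, &pdst); /* plaintext out */
      bpf_dynptr_from_mem(dst_buf, 0, 0, &iv);                 /* zero size: empty IV */

      return bpf_crypto_decrypt(ctx, &psrc, &pdst, &iv);
  }
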
Some simple benchmarks are added to understand the baseline of
performance.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://lore.kernel.org/r/20240422225024.2847039-5-vadfed@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Vadim Fedorenko says:

====================
This series introduces crypto kfuncs to let BPF programs utilize the
kernel crypto subsystem. Crypto operations are made pluggable to avoid
extensive growth of the kernel when they're not needed. Only skcipher is
added within this series, but it can easily be extended to other types
of operations. No hardware offload is supported, as it needs a sleepable
context, which is not available for TC or XDP programs. At the same time,
the crypto context initialization kfunc can only run in a sleepable
context; that's why it should be run separately, storing the result in a map.

Selftests show the common way to implement crypto actions in BPF
programs. Benchmark is also added to have a baseline.
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
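A hedged sketch of the initialization path described above: the ctx is
created once from a sleepable program and stashed in a map as a kptr for
the TC/XDP side to use. Struct fields and kfunc names are as introduced by
this series, but treat the exact layout as illustrative:

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>

  /* Assumed kfunc declarations from this series. */
  struct bpf_crypto_ctx *bpf_crypto_ctx_create(const struct bpf_crypto_params *params,
                                               __u32 params__sz, int *err) __ksym;
  void bpf_crypto_ctx_release(struct bpf_crypto_ctx *ctx) __ksym;

  struct ctx_value {
      struct bpf_crypto_ctx __kptr *ctx;
  };

  struct {
      __uint(type, BPF_MAP_TYPE_ARRAY);
      __uint(max_entries, 1);
      __type(key, int);
      __type(value, struct ctx_value);
  } crypto_ctx_map SEC(".maps");

  SEC("syscall")    /* sleepable, so ctx creation is allowed here */
  int create_ctx(void *args)
  {
      struct bpf_crypto_params params = {
          .type = "skcipher",
          .algo = "ecb(aes)",
          .key_len = 16,        /* .key would carry real key material */
      };
      struct bpf_crypto_ctx *cctx, *old;
      struct ctx_value *v;
      int err = 0, key = 0;

      cctx = bpf_crypto_ctx_create(&params, sizeof(params), &err);
      if (!cctx)
          return err;

      v = bpf_map_lookup_elem(&crypto_ctx_map, &key);
      if (!v) {
          bpf_crypto_ctx_release(cctx);
          return -1;
      }
      old = bpf_kptr_xchg(&v->ctx, cctx);
      if (old)
          bpf_crypto_ctx_release(old);
      return 0;
  }

  char LICENSE[] SEC("license") = "GPL";
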
Add bpf_guard_preempt() macro that uses newly introduced
bpf_preempt_disable/enable() kfuncs to guard a critical section.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240424225529.16782-1-alexei.starovoitov@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
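Usage sketch: the macro pairs the disable with an automatic enable when the
enclosing scope is left (via a cleanup attribute), so early returns cannot
leak a non-preemptible section:

  SEC("tc")
  int guarded(void *ctx)
  {
      bpf_guard_preempt();    /* preemption disabled from here on */

      /* ... critical section, e.g. consistent per-CPU updates ... */

      return 0;               /* re-enabled automatically at scope exit */
  }
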
Add a testcase for the ring_buffer__consume_n() API.

The test produces multiple samples in a ring buffer, using a
sys_getpid() fentry prog, and consumes them from user-space in batches,
rather than consuming all of them greedily, like ring_buffer__consume()
does.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/lkml/CAEf4BzaR4zqUpDmj44KNLdpJ=Tpa97GrvzuzVNO5nM6b7oWd1w@mail.gmail.com
Link: https://lore.kernel.org/bpf/20240425140627.112728-1-andrea.righi@canonical.com
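For illustration, a user-space consumer built on this API might look like
the following sketch (ring buffer setup elided; handle_event is a stand-in
callback registered earlier via ring_buffer__new()):

  #include <bpf/libbpf.h>

  static int handle_event(void *ctx, void *data, size_t size)
  {
      /* process one sample */
      return 0;
  }

  /* Drain at most batch_sz records per call, instead of greedily
   * consuming everything like ring_buffer__consume() does.
   */
  static int drain_in_batches(struct ring_buffer *rb, size_t batch_sz)
  {
      int n;

      while ((n = ring_buffer__consume_n(rb, batch_sz)) > 0)
          ;    /* up to batch_sz handle_event() calls per iteration */

      return n;    /* 0 once empty, negative on error */
  }
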
Test case dummy_st_ops/dummy_init_ret_value passes NULL as the first
parameter of the test_1() function. Mark this parameter as nullable to
make the verifier aware of this possibility.
Otherwise, the NULL check in the test_1() code:

      SEC("struct_ops/test_1")
      int BPF_PROG(test_1, struct bpf_dummy_ops_state *state)
      {
            if (!state)
                    return ...;

            ... access state ...
      }

might be removed by the verifier, thus triggering a NULL pointer
dereference under certain conditions.

Reported-by: Jose E. Marchesi <jemarch@gnu.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240424012821.595216-2-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
As reported by Jose E. Marchesi in off-list discussion, GCC and LLVM
generate slightly different code for dummy_st_ops_success/test_1():

  SEC("struct_ops/test_1")
  int BPF_PROG(test_1, struct bpf_dummy_ops_state *state)
  {
  	int ret;

  	if (!state)
  		return 0xf2f3f4f5;

  	ret = state->val;
  	state->val = 0x5a;
  	return ret;
  }

  GCC-generated                  LLVM-generated
  ----------------------------   ---------------------------
  0: r1 = *(u64 *)(r1 + 0x0)     0: w0 = -0xd0c0b0b
  1: if r1 == 0x0 goto 5f        1: r1 = *(u64 *)(r1 + 0x0)
  2: r0 = *(s32 *)(r1 + 0x0)     2: if r1 == 0x0 goto 6f
  3: *(u32 *)(r1 + 0x0) = 0x5a   3: r0 = *(u32 *)(r1 + 0x0)
  4: exit                        4: w2 = 0x5a
  5: r0 = -0xd0c0b0b             5: *(u32 *)(r1 + 0x0) = r2
  6: exit                        6: exit

If the 'state' argument is not marked as nullable in
net/bpf/bpf_dummy_struct_ops.c, the verifier would assume that
'r1 == 0x0' is never true:
- for the GCC version, this means that instructions #5-6 would be
  marked as dead and removed;
- for the LLVM version, all instructions would be marked as live.

The test dummy_st_ops/dummy_init_ret_value actually sets the 'state'
parameter to NULL.

Therefore, when the 'state' argument is not marked as nullable,
the GCC-generated version of the code would trigger a NULL pointer
dereference at instruction #3.

This patch updates the test_1() test case to always follow a shape
similar to the GCC-generated version above, in order to verify whether
the 'state' nullability is marked correctly.

Reported-by: Jose E. Marchesi <jemarch@gnu.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240424012821.595216-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
dummy_st_ops.test_2 and dummy_st_ops.test_sleepable do not have their
'state' parameter marked as nullable. Update dummy_st_ops.c to avoid
passing NULL for such parameters, as the next patch will allow the
kernel to enforce this restriction.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240424012821.595216-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When doing BPF_PROG_TEST_RUN for bpf_dummy_struct_ops programs,
reject execution when NULL is passed for non-nullable params.
For programs with non-nullable params, the verifier assumes that
such params are never NULL and thus might optimize out NULL checks.

Suggested-by: Kui-Feng Lee <sinquersw@gmail.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240424012821.595216-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Check if BPF_PROG_TEST_RUN for bpf_dummy_struct_ops programs
rejects execution if NULL is passed for non-nullable parameter.

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240424012821.595216-6-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Eduard Zingerman says:

====================
check bpf_dummy_struct_ops program params for test runs

When doing BPF_PROG_TEST_RUN for bpf_dummy_struct_ops programs,
execution should be rejected when NULL is passed for non-nullable
params, because for such params verifier assumes that such params are
never NULL and thus might optimize out NULL checks.

This problem was reported by Jose E. Marchesi in off-list discussion.
The code generated by GCC for dummy_st_ops_success/test_1() function
differs from LLVM variant in a way that allows verifier to remove the
NULL check. The test dummy_st_ops/dummy_init_ret_value actually sets
the 'state' parameter to NULL, thus GCC-generated version of the test
triggers NULL pointer dereference when BPF program is executed.

This patch-set addresses the issue in the following steps:
- patch #1 marks bpf_dummy_struct_ops.test_1 parameter as nullable,
  for verifier to have correct assumptions about test_1() programs;
- patch #2 modifies dummy_st_ops/dummy_init_ret_value to trigger NULL
  dereference with both GCC and LLVM (if patch #1 is not applied);
- patch #3 adjusts a few dummy_st_ops test cases to avoid passing NULL
  for 'state' parameter of test_2() and test_sleepable() functions,
  as parameters of these functions are not marked as nullable;
- patch #4 adjusts bpf_dummy_struct_ops to reject test execution of
  programs if NULL is passed for non-nullable parameter;
- patch #5 adds a test to verify logic from patch #4.
====================

Link: https://lore.kernel.org/r/20240424012821.595216-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Two important arguments in RTT estimation, mrtt and srtt, are passed to
tcp_bpf_rtt(), so that bpf programs get more information about RTT
computation in BPF_SOCK_OPS_RTT_CB.

The difference between bpf_sock_ops->srtt_us and the srtt here is that the
former is the old rtt before the update, while the srtt passed by
tcp_bpf_rtt() is the value after the update.

Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240425161724.73707-2-lulie@linux.alibaba.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
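A sketch of reading the new values from a sock_ops program, assuming
args[0]/args[1] carry the mrtt/srtt values passed to tcp_bpf_rtt() (see the
selftest in the next commit for the authoritative layout):

  #include <vmlinux.h>
  #include <bpf/bpf_helpers.h>

  SEC("sockops")
  int rtt_observer(struct bpf_sock_ops *skops)
  {
      __u32 mrtt, srtt;

      if (skops->op != BPF_SOCK_OPS_RTT_CB)
          return 1;

      mrtt = skops->args[0];    /* assumed: measured rtt for this ack */
      srtt = skops->args[1];    /* assumed: smoothed rtt, after the update */

      /* skops->srtt_us still holds the pre-update (old) srtt */
      bpf_printk("mrtt=%u srtt=%u old_srtt=%u", mrtt, srtt, skops->srtt_us);
      return 1;
  }

  char LICENSE[] SEC("license") = "GPL";
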
Because srtt and mrtt_us are added as args in bpf_sock_ops at
BPF_SOCK_OPS_RTT_CB, a simple check is added to make sure they are both
non-zero.

$ ./test_progs -t tcp_rtt
  #373     tcp_rtt:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240425161724.73707-3-lulie@linux.alibaba.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>

Philo Lu says:

====================
These provide more information about tcp RTT estimation. The selftest for
BPF_SOCK_OPS_RTT_CB is extended for the added args.

changelogs
-> v1:
- extend rtt selftest for added args (suggested by Stanislav)
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
The proposed intro paragraph text is derived from the first paragraph
of the IETF BPF WG charter at https://datatracker.ietf.org/wg/bpf/about/

Signed-off-by: Dave Thaler <dthaler1968@gmail.com>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20240422190942.24658-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The definition of bpf_tail_call_static in tools/lib/bpf/bpf_helpers.h
is guarded by a preprocessor check to ensure that clang is recent
enough to support it.  This patch updates the guard so the function is
compiled when using GCC 13 or later as well.

Tested in bpf-next master. No regressions.

Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240426145158.14409-1-jose.marchesi@oracle.com
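A hypothetical shape for the relaxed guard (the exact condition used by the
patch may differ):

  /* bpf_helpers.h sketch: compile bpf_tail_call_static() for a clang
   * recent enough to support it, or for GCC 13 and later.
   */
  #if (defined(__clang__) && __clang_major__ >= 8) || \
      (!defined(__clang__) && defined(__GNUC__) && __GNUC__ >= 13)
  /* ... definition of bpf_tail_call_static() ... */
  #endif
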
This patch elaborates on the use of PC by expanding the PC acronym,
explaining the units, and the relative position to which the offset
applies.

Signed-off-by: Dave Thaler <dthaler1968@googlemail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20240426231126.5130-1-dthaler1968@gmail.com
@htejun htejun merged commit 73f4013 into sched_ext Apr 29, 2024
1 check passed
@htejun htejun deleted the htejun/pull-bpf-for-next branch April 29, 2024 19:24