Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

riscv: dma-noncoherent.c: Add hartid return as pointer #3

Closed
wants to merge 219 commits into from

Conversation

Haxk20
Copy link

@Haxk20 Haxk20 commented Oct 27, 2022

This was changed to return hartid as pointer in second argument and normal return value is used for normal return code.

Found the issue while building with the defconfig provided.

jernejsk and others added 30 commits June 6, 2022 23:17
# Conflicts:
#	arch/riscv/include/asm/errata_list.h
# Conflicts:
#	drivers/clk/sunxi-ng/ccu-sun50i-h6-r.c
RTL8723DS is another version of the RTL8723 WiFi + Bluetooth chip. It is
already supported by the hci_uart/btrtl driver. Document the compatible.

Series-to: Marcel Holtmann <marcel@holtmann.org>
Series-to: Johan Hedberg <johan.hedberg@gmail.com>
Series-to: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Series-to: David S. Miller <davem@davemloft.net>
Series-to: Eric Dumazet <edumazet@google.com>
Series-to: Jakub Kicinski <kuba@kernel.org>
Series-to: Paolo Abeni <pabeni@redhat.com>
Series-cc: linux-bluetooth@vger.kernel.org

Signed-off-by: Samuel Holland <samuel@sholland.org>
Currently, the driver can be unbound via sysfs. Because it does not call
of_platform_depopulate, unbinding the driver will ...

Signed-off-by: Samuel Holland <samuel@sholland.org>
@Haxk20 Haxk20 changed the base branch from d1/all to d1/wip October 28, 2022 16:50
This was changed to return hartid as pointer in second argument
and normal return value is used for normal return code.

Follow this change here so we dont error out.

Signed-off-by: Martin Botka <martin.botka@somainline.org>
smaeul pushed a commit that referenced this pull request Nov 3, 2022
With lockdeps enabled, we get the following warning:

======================================================
WARNING: possible circular locking dependency detected
------------------------------------------------------
kworker/u12:1/53 is trying to acquire lock:
ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
but task is already holding lock:
ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394

which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #1 (ect_mutex){+.+.}-{4:4}:
       __mutex_lock_common+0xd8/0xe60
       mutex_lock_nested+0x44/0x50
       cti_add_assoc_to_csdev+0x4c/0x184
       coresight_register+0x2f0/0x314
       tmc_probe+0x33c/0x414

-> #0 (coresight_mutex){+.+.}-{4:4}:
       __lock_acquire+0x1a20/0x32d0
       lock_acquire+0x160/0x308
       __mutex_lock_common+0xd8/0xe60
       mutex_lock_nested+0x44/0x50
       coresight_set_assoc_ectdev_mutex+0x3c/0x5c
       cti_update_conn_xrefs+0x6c/0xf8
       cti_probe+0x33c/0x394

other info that might help us debug this:
 Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(ect_mutex);
                               lock(coresight_mutex);
                               lock(ect_mutex);
  lock(coresight_mutex);
 *** DEADLOCK ***

4 locks held by kworker/u12:1/53:
 #0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
 #1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
 #2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
 #3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394

To fix the same, call cti_add_assoc_to_csdev without the holding
coresight_mutex and confine the locking while setting the associated
ect / cti device using coresight_set_assoc_ectdev_mutex().

Fixes: 177af82 ("coresight: cti: Enable CTI associated with devices")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220721130329.3787211-1-sudeep.holla@arm.com
@smaeul
Copy link
Owner

smaeul commented Nov 5, 2022

Thanks for the pull request. This was fixed upstream.

@smaeul smaeul closed this Nov 5, 2022
@Haxk20 Haxk20 deleted the d1-nezha branch November 6, 2022 10:39
smaeul pushed a commit that referenced this pull request Feb 5, 2023
Commit 9130b8d ("SUNRPC: allow for upcalls for the same uid
but different gss service") introduced `auth` argument to
__gss_find_upcall(), but in gss_pipe_downcall() it was left as NULL
since it (and auth->service) was not (yet) determined.

When multiple upcalls with the same uid and different service are
ongoing, it could happen that __gss_find_upcall(), which returns the
first match found in the pipe->in_downcall list, could not find the
correct gss_msg corresponding to the downcall we are looking for.
Moreover, it might return a msg which is not sent to rpc.gssd yet.

We could see mount.nfs process hung in D state with multiple mount.nfs
are executed in parallel.  The call trace below is of CentOS 7.9
kernel-3.10.0-1160.24.1.el7.x86_64 but we observed the same hang w/
elrepo kernel-ml-6.0.7-1.el7.

PID: 71258  TASK: ffff91ebd4be0000  CPU: 36  COMMAND: "mount.nfs"
 #0 [ffff9203ca3234f8] __schedule at ffffffffa3b8899f
 #1 [ffff9203ca323580] schedule at ffffffffa3b88eb9
 #2 [ffff9203ca323590] gss_cred_init at ffffffffc0355818 [auth_rpcgss]
 #3 [ffff9203ca323658] rpcauth_lookup_credcache at ffffffffc0421ebc
[sunrpc]
 #4 [ffff9203ca3236d8] gss_lookup_cred at ffffffffc0353633 [auth_rpcgss]
 #5 [ffff9203ca3236e8] rpcauth_lookupcred at ffffffffc0421581 [sunrpc]
 #6 [ffff9203ca323740] rpcauth_refreshcred at ffffffffc04223d3 [sunrpc]
 torvalds#7 [ffff9203ca3237a0] call_refresh at ffffffffc04103dc [sunrpc]
 torvalds#8 [ffff9203ca3237b8] __rpc_execute at ffffffffc041e1c9 [sunrpc]
 torvalds#9 [ffff9203ca323820] rpc_execute at ffffffffc0420a48 [sunrpc]

The scenario is like this. Let's say there are two upcalls for
services A and B, A -> B in pipe->in_downcall, B -> A in pipe->pipe.

When rpc.gssd reads pipe to get the upcall msg corresponding to
service B from pipe->pipe and then writes the response, in
gss_pipe_downcall the msg corresponding to service A will be picked
because only uid is used to find the msg and it is before the one for
B in pipe->in_downcall.  And the process waiting for the msg
corresponding to service A will be woken up.

Actual scheduing of that process might be after rpc.gssd processes the
next msg.  In rpc_pipe_generic_upcall it clears msg->errno (for A).
The process is scheduled to see gss_msg->ctx == NULL and
gss_msg->msg.errno == 0, therefore it cannot break the loop in
gss_create_upcall and is never woken up after that.

This patch adds a simple check to ensure that a msg which is not
sent to rpc.gssd yet is not chosen as the matching upcall upon
receiving a downcall.

Signed-off-by: minoura makoto <minoura@valinux.co.jp>
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@nec.com>
Tested-by: Hiroshi Shimamoto <h-shimamoto@nec.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 9130b8d ("SUNRPC: allow for upcalls for same uid but different gss service")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
…ata and not linked with libtraceevent

When we have a perf.data file with tracepoints, such as:

  # perf evlist -f
  probe_perf:lzma_decompress_to_file
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
  #

We end up segfaulting when using perf built with NO_LIBTRACEEVENT=1 by
trying to find an evsel with a NULL 'event_name' variable:

  (gdb) run report --stdio -f
  Starting program: /root/bin/perf report --stdio -f

  Program received signal SIGSEGV, Segmentation fault.
  0x000000000055219d in find_evsel (evlist=0xfda7b0, event_name=0x0) at util/sort.c:2830
  warning: Source file is more recent than executable.
  2830		if (event_name[0] == '%') {
  Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-11.fc36.x86_64 cyrus-sasl-lib-2.1.27-18.fc36.x86_64 elfutils-debuginfod-client-0.188-3.fc36.x86_64 elfutils-libelf-0.188-3.fc36.x86_64 elfutils-libs-0.188-3.fc36.x86_64 glibc-2.35-20.fc36.x86_64 keyutils-libs-1.6.1-4.fc36.x86_64 krb5-libs-1.19.2-12.fc36.x86_64 libbrotli-1.0.9-7.fc36.x86_64 libcap-2.48-4.fc36.x86_64 libcom_err-1.46.5-2.fc36.x86_64 libcurl-7.82.0-12.fc36.x86_64 libevent-2.1.12-6.fc36.x86_64 libgcc-12.2.1-4.fc36.x86_64 libidn2-2.3.4-1.fc36.x86_64 libnghttp2-1.51.0-1.fc36.x86_64 libpsl-0.21.1-5.fc36.x86_64 libselinux-3.3-4.fc36.x86_64 libssh-0.9.6-4.fc36.x86_64 libstdc++-12.2.1-4.fc36.x86_64 libunistring-1.0-1.fc36.x86_64 libunwind-1.6.2-2.fc36.x86_64 libxcrypt-4.4.33-4.fc36.x86_64 libzstd-1.5.2-2.fc36.x86_64 numactl-libs-2.0.14-5.fc36.x86_64 opencsd-1.2.0-1.fc36.x86_64 openldap-2.6.3-1.fc36.x86_64 openssl-libs-3.0.5-2.fc36.x86_64 slang-2.3.2-11.fc36.x86_64 xz-libs-5.2.5-9.fc36.x86_64 zlib-1.2.11-33.fc36.x86_64
  (gdb) bt
  #0  0x000000000055219d in find_evsel (evlist=0xfda7b0, event_name=0x0) at util/sort.c:2830
  #1  0x0000000000552416 in add_dynamic_entry (evlist=0xfda7b0, tok=0xffb6eb "trace", level=2) at util/sort.c:2976
  #2  0x0000000000552d26 in sort_dimension__add (list=0xf93e00 <perf_hpp_list>, tok=0xffb6eb "trace", evlist=0xfda7b0, level=2) at util/sort.c:3193
  #3  0x0000000000552e1c in setup_sort_list (list=0xf93e00 <perf_hpp_list>, str=0xffb6eb "trace", evlist=0xfda7b0) at util/sort.c:3227
  #4  0x00000000005532fa in __setup_sorting (evlist=0xfda7b0) at util/sort.c:3381
  #5  0x0000000000553cdc in setup_sorting (evlist=0xfda7b0) at util/sort.c:3608
  #6  0x000000000042eb9f in cmd_report (argc=0, argv=0x7fffffffe470) at builtin-report.c:1596
  torvalds#7  0x00000000004aee7e in run_builtin (p=0xf64ca0 <commands+288>, argc=3, argv=0x7fffffffe470) at perf.c:330
  torvalds#8  0x00000000004af0f2 in handle_internal_command (argc=3, argv=0x7fffffffe470) at perf.c:384
  torvalds#9  0x00000000004af241 in run_argv (argcp=0x7fffffffe29c, argv=0x7fffffffe290) at perf.c:428
  torvalds#10 0x00000000004af5fc in main (argc=3, argv=0x7fffffffe470) at perf.c:562
  (gdb)

So check if we have tracepoint events in add_dynamic_entry() and bail
out instead:

  # perf report --stdio -f
  This perf binary isn't linked with libtraceevent, can't process probe_perf:lzma_decompress_to_file
  Error:
  Unknown --sort key: `trace'
  #

Fixes: 378ef0f ("perf build: Use libtraceevent from the system")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/Y7MDb7kRaHZB6APC@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
…ed_text_end" symbol on s/390

The test case perf lock contention dumps core on s390. Run the following
commands:

  # ./perf lock record -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

      Total time: 2.799 [sec]
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
  #
  # ./perf lock contention
  Segmentation fault (core dumped)
  #

The function call stack is lengthy, here are the top 5 functions:

  # gdb ./perf core.24048
  GNU gdb (GDB) Fedora Linux 12.1-6.fc37
  Core was generated by `./perf lock contention'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356
         3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);

 (gdb) where
  #0  0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356
  #1  0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957
  #2  0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586
  #3  0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004
  #4  0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254
  #5  0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
  .....

The issue is in function machine__is_lock_function() in file
./util/machine.c lines 3355:

   /* should not fail from here */
   sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap);
   machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)

On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
access and generates the core dump.

The reason why __sched_text_end is not in the symbol list on s390 is
simple:

When the symbol list is created at perf start up with function calls

  dso__load
  +--> dso__load_vmlinux_path
       +--> dso__load_vmlinux
            +--> dso__load_sym
	         +--> dso__load_sym_internal (reads kernel symbols)
		 +--> symbols__fixup_end
		 +--> symbols__fixup_duplicate

The issue is in function symbols__fixup_duplicate(). It deletes all
symbols with have the same address. On s390:

  # nm -g  ~/linux/vmlinux| fgrep c68390
  0000000000c68390 T __cpuidle_text_start
  0000000000c68390 T __sched_text_end
  #

two symbols have identical addresses and __sched_text_end is considered
duplicate (in ascending sort order) and removed from the symbol list.
Therefore it is missing and an invalid pointer reference occurs.  The
code checks for symbol __sched_text_start and when it exists assumes
symbol __sched_text_end is also in the symbol table. However this is not
the case on s390.

Same situation exists for symbol __lock_text_start:

0000000000c68770 T __cpuidle_text_end
0000000000c68770 T __lock_text_start

This symbol is also removed from the symbol table but used in function
machine__is_lock_function().

To fix this and keep duplicate symbols in the symbol table, set
symbol_conf.allow_aliases to true. This prevents the removal of
duplicate symbols in function symbols__fixup_duplicate().

Output After:

 # ./perf lock contention
 contended total wait  max wait  avg wait    type   caller

        48   124.39 ms 123.99 ms   2.59 ms rwsem:W unlink_anon_vmas+0x24a
        47    83.68 ms  83.26 ms   1.78 ms rwsem:W free_pgtables+0x132
         5    41.22 us  10.55 us   8.24 us rwsem:W free_pgtables+0x140
         4    40.12 us  20.55 us  10.03 us rwsem:W copy_process+0x1ac8
 #

Fixes: 0d2997f ("perf lock: Look up callchain for the contended locks")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
The 'TCA_MPLS_LABEL' attribute is of 'NLA_U32' type, but has a
validation type of 'NLA_VALIDATE_FUNCTION'. This is an invalid
combination according to the comment above 'struct nla_policy':

"
Meaning of `validate' field, use via NLA_POLICY_VALIDATE_FN:
   NLA_BINARY           Validation function called for the attribute.
   All other            Unused - but note that it's a union
"

This can trigger the warning [1] in nla_get_range_unsigned() when
validation of the attribute fails. Despite being of 'NLA_U32' type, the
associated 'min'/'max' fields in the policy are negative as they are
aliased by the 'validate' field.

Fix by changing the attribute type to 'NLA_BINARY' which is consistent
with the above comment and all other users of NLA_POLICY_VALIDATE_FN().
As a result, move the length validation to the validation function.

No regressions in MPLS tests:

 # ./tdc.py -f tc-tests/actions/mpls.json
 [...]
 # echo $?
 0

[1]
WARNING: CPU: 0 PID: 17743 at lib/nlattr.c:118
nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
Modules linked in:
CPU: 0 PID: 17743 Comm: syz-executor.0 Not tainted 6.1.0-rc8 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
RIP: 0010:nla_get_range_unsigned+0x1d8/0x1e0 lib/nlattr.c:117
[...]
Call Trace:
 <TASK>
 __netlink_policy_dump_write_attr+0x23d/0x990 net/netlink/policy.c:310
 netlink_policy_dump_write_attr+0x22/0x30 net/netlink/policy.c:411
 netlink_ack_tlv_fill net/netlink/af_netlink.c:2454 [inline]
 netlink_ack+0x546/0x760 net/netlink/af_netlink.c:2506
 netlink_rcv_skb+0x1b7/0x240 net/netlink/af_netlink.c:2546
 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6109
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1921
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg net/socket.c:734 [inline]
 ____sys_sendmsg+0x38f/0x500 net/socket.c:2482
 ___sys_sendmsg net/socket.c:2536 [inline]
 __sys_sendmsg+0x197/0x230 net/socket.c:2565
 __do_sys_sendmsg net/socket.c:2574 [inline]
 __se_sys_sendmsg net/socket.c:2572 [inline]
 __x64_sys_sendmsg+0x42/0x50 net/socket.c:2572
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Link: https://lore.kernel.org/netdev/CAO4mrfdmjvRUNbDyP0R03_DrD_eFCLCguz6OxZ2TYRSv0K9gxA@mail.gmail.com/
Fixes: 2a2ea50 ("net: sched: add mpls manipulation actions to TC")
Reported-by: Wei Chen <harperchen1110@gmail.com>
Tested-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230107171004.608436-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
There are some codepaths that do not initialize rx->link_sta properly. This
causes a crash in places which assume that rx->link_sta is valid if rx->sta
is valid.
One known instance is triggered by __ieee80211_rx_h_amsdu being called from
fast-rx. It results in a crash like this one:

 BUG: kernel NULL pointer dereference, address: 00000000000000a8
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT SMP PTI
 CPU: 1 PID: 506 Comm: mt76-usb-rx phy Tainted: G            E      6.1.0-debian64x+1.7 #3
 Hardware name: ZOTAC ZBOX-ID92/ZBOX-IQ01/ZBOX-ID92/ZBOX-IQ01, BIOS B220P007 05/21/2014
 RIP: 0010:ieee80211_deliver_skb+0x62/0x1f0 [mac80211]
 Code: 00 48 89 04 24 e8 9e a7 c3 df 89 c0 48 03 1c c5 a0 ea 39 a1 4c 01 6b 08 48 ff 03 48
       83 7d 28 00 74 11 48 8b 45 30 48 63 55 44 <48> 83 84 d0 a8 00 00 00 01 41 8b 86 c0
       11 00 00 8d 50 fd 83 fa 01
 RSP: 0018:ffff999040803b10 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: ffffb9903f496480 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffff999040803ce0 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8d21828ac900
 R13: 000000000000004a R14: ffff8d2198ed89c0 R15: ffff8d2198ed8000
 FS:  0000000000000000(0000) GS:ffff8d24afe80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00000000000000a8 CR3: 0000000429810002 CR4: 00000000001706e0
 Call Trace:
  <TASK>
  __ieee80211_rx_h_amsdu+0x1b5/0x240 [mac80211]
  ? ieee80211_prepare_and_rx_handle+0xcdd/0x1320 [mac80211]
  ? __local_bh_enable_ip+0x3b/0xa0
  ieee80211_prepare_and_rx_handle+0xcdd/0x1320 [mac80211]
  ? prepare_transfer+0x109/0x1a0 [xhci_hcd]
  ieee80211_rx_list+0xa80/0xda0 [mac80211]
  mt76_rx_complete+0x207/0x2e0 [mt76]
  mt76_rx_poll_complete+0x357/0x5a0 [mt76]
  mt76u_rx_worker+0x4f5/0x600 [mt76_usb]
  ? mt76_get_min_avg_rssi+0x140/0x140 [mt76]
  __mt76_worker_fn+0x50/0x80 [mt76]
  kthread+0xed/0x120
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x22/0x30

Since the initialization of rx->link and rx->link_sta is rather convoluted
and duplicated in many places, clean it up by using a helper function to
set it.

Fixes: ccdde7c ("wifi: mac80211: properly implement MLO key handling")
Fixes: b320d6c ("wifi: mac80211: use correct rx link_sta instead of default")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20221230200747.19040-1-nbd@nbd.name
[remove unnecessary rx->sta->sta.mlo check]
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
Current imc-pmu code triggers a WARNING with CONFIG_DEBUG_ATOMIC_SLEEP
and CONFIG_PROVE_LOCKING enabled, while running a thread_imc event.

Command to trigger the warning:
  # perf stat -e thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/ sleep 5

   Performance counter stats for 'sleep 5':

                   0      thread_imc/CPM_CS_FROM_L4_MEM_X_DPTEG/

         5.002117947 seconds time elapsed

         0.000131000 seconds user
         0.001063000 seconds sys

Below is snippet of the warning in dmesg:

  BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
  in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2869, name: perf-exec
  preempt_count: 2, expected: 0
  4 locks held by perf-exec/2869:
   #0: c00000004325c540 (&sig->cred_guard_mutex){+.+.}-{3:3}, at: bprm_execve+0x64/0xa90
   #1: c00000004325c5d8 (&sig->exec_update_lock){++++}-{3:3}, at: begin_new_exec+0x460/0xef0
   #2: c0000003fa99d4e0 (&cpuctx_lock){-...}-{2:2}, at: perf_event_exec+0x290/0x510
   #3: c000000017ab8418 (&ctx->lock){....}-{2:2}, at: perf_event_exec+0x29c/0x510
  irq event stamp: 4806
  hardirqs last  enabled at (4805): [<c000000000f65b94>] _raw_spin_unlock_irqrestore+0x94/0xd0
  hardirqs last disabled at (4806): [<c0000000003fae44>] perf_event_exec+0x394/0x510
  softirqs last  enabled at (0): [<c00000000013c404>] copy_process+0xc34/0x1ff0
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  CPU: 36 PID: 2869 Comm: perf-exec Not tainted 6.2.0-rc2-00011-g1247637727f2 torvalds#61
  Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
  Call Trace:
    dump_stack_lvl+0x98/0xe0 (unreliable)
    __might_resched+0x2f8/0x310
    __mutex_lock+0x6c/0x13f0
    thread_imc_event_add+0xf4/0x1b0
    event_sched_in+0xe0/0x210
    merge_sched_in+0x1f0/0x600
    visit_groups_merge.isra.92.constprop.166+0x2bc/0x6c0
    ctx_flexible_sched_in+0xcc/0x140
    ctx_sched_in+0x20c/0x2a0
    ctx_resched+0x104/0x1c0
    perf_event_exec+0x340/0x510
    begin_new_exec+0x730/0xef0
    load_elf_binary+0x3f8/0x1e10
  ...
  do not call blocking ops when !TASK_RUNNING; state=2001 set at [<00000000fd63e7cf>] do_nanosleep+0x60/0x1a0
  WARNING: CPU: 36 PID: 2869 at kernel/sched/core.c:9912 __might_sleep+0x9c/0xb0
  CPU: 36 PID: 2869 Comm: sleep Tainted: G        W          6.2.0-rc2-00011-g1247637727f2 torvalds#61
  Hardware name: 8375-42A POWER9 0x4e1202 opal:v7.0-16-g9b85f7d961 PowerNV
  NIP:  c000000000194a1c LR: c000000000194a18 CTR: c000000000a78670
  REGS: c00000004d2134e0 TRAP: 0700   Tainted: G        W           (6.2.0-rc2-00011-g1247637727f2)
  MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 48002824  XER: 00000000
  CFAR: c00000000013fb64 IRQMASK: 1

The above warning triggered because the current imc-pmu code uses mutex
lock in interrupt disabled sections. The function mutex_lock()
internally calls __might_resched(), which will check if IRQs are
disabled and in case IRQs are disabled, it will trigger the warning.

Fix the issue by changing the mutex lock to spinlock.

Fixes: 8f95faa ("powerpc/powernv: Detect and create IMC device")
Reported-by: Michael Petlan <mpetlan@redhat.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
[mpe: Fix comments, trim oops in change log, add reported-by tags]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20230106065157.182648-1-kjain@linux.ibm.com
smaeul pushed a commit that referenced this pull request Feb 5, 2023
This fixes the following trace caused by attempting to lock
cmd_sync_work_lock while holding the rcu_read_lock:

kworker/u3:2/212 is trying to lock:
ffff888002600910 (&hdev->cmd_sync_work_lock){+.+.}-{3:3}, at:
hci_cmd_sync_queue+0xad/0x140
other info that might help us debug this:
context-{4:4}
4 locks held by kworker/u3:2/212:
 #0: ffff8880028c6530 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
 process_one_work+0x4dc/0x9a0
 #1: ffff888001aafde0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
 at: process_one_work+0x4dc/0x9a0
 #2: ffff888002600070 (&hdev->lock){+.+.}-{3:3}, at:
 hci_cc_le_set_cig_params+0x64/0x4f0
 #3: ffffffffa5994b00 (rcu_read_lock){....}-{1:2}, at:
 hci_cc_le_set_cig_params+0x2f9/0x4f0

Fixes: 26afbd8 ("Bluetooth: Add initial implementation of CIS connections")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
…ress

Bug report and analysis from Ding Hui.

During iSCSI session logout, if another task accesses the shost ipaddress
attr, we can get a KASAN UAF report like this:

[  276.942144] BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x78/0xe0
[  276.942535] Write of size 4 at addr ffff8881053b45b8 by task cat/4088
[  276.943511] CPU: 2 PID: 4088 Comm: cat Tainted: G            E      6.1.0-rc8+ #3
[  276.943997] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  276.944470] Call Trace:
[  276.944943]  <TASK>
[  276.945397]  dump_stack_lvl+0x34/0x48
[  276.945887]  print_address_description.constprop.0+0x86/0x1e7
[  276.946421]  print_report+0x36/0x4f
[  276.947358]  kasan_report+0xad/0x130
[  276.948234]  kasan_check_range+0x35/0x1c0
[  276.948674]  _raw_spin_lock_bh+0x78/0xe0
[  276.949989]  iscsi_sw_tcp_host_get_param+0xad/0x2e0 [iscsi_tcp]
[  276.951765]  show_host_param_ISCSI_HOST_PARAM_IPADDRESS+0xe9/0x130 [scsi_transport_iscsi]
[  276.952185]  dev_attr_show+0x3f/0x80
[  276.953005]  sysfs_kf_seq_show+0x1fb/0x3e0
[  276.953401]  seq_read_iter+0x402/0x1020
[  276.954260]  vfs_read+0x532/0x7b0
[  276.955113]  ksys_read+0xed/0x1c0
[  276.955952]  do_syscall_64+0x38/0x90
[  276.956347]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  276.956769] RIP: 0033:0x7f5d3a679222
[  276.957161] Code: c0 e9 b2 fe ff ff 50 48 8d 3d 32 c0 0b 00 e8 a5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
[  276.958009] RSP: 002b:00007ffc864d16a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  276.958431] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5d3a679222
[  276.958857] RDX: 0000000000020000 RSI: 00007f5d3a4fe000 RDI: 0000000000000003
[  276.959281] RBP: 00007f5d3a4fe000 R08: 00000000ffffffff R09: 0000000000000000
[  276.959682] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000020000
[  276.960126] R13: 0000000000000003 R14: 0000000000000000 R15: 0000557a26dada58
[  276.960536]  </TASK>
[  276.961357] Allocated by task 2209:
[  276.961756]  kasan_save_stack+0x1e/0x40
[  276.962170]  kasan_set_track+0x21/0x30
[  276.962557]  __kasan_kmalloc+0x7e/0x90
[  276.962923]  __kmalloc+0x5b/0x140
[  276.963308]  iscsi_alloc_session+0x28/0x840 [scsi_transport_iscsi]
[  276.963712]  iscsi_session_setup+0xda/0xba0 [libiscsi]
[  276.964078]  iscsi_sw_tcp_session_create+0x1fd/0x330 [iscsi_tcp]
[  276.964431]  iscsi_if_create_session.isra.0+0x50/0x260 [scsi_transport_iscsi]
[  276.964793]  iscsi_if_recv_msg+0xc5a/0x2660 [scsi_transport_iscsi]
[  276.965153]  iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi]
[  276.965546]  netlink_unicast+0x4d5/0x7b0
[  276.965905]  netlink_sendmsg+0x78d/0xc30
[  276.966236]  sock_sendmsg+0xe5/0x120
[  276.966576]  ____sys_sendmsg+0x5fe/0x860
[  276.966923]  ___sys_sendmsg+0xe0/0x170
[  276.967300]  __sys_sendmsg+0xc8/0x170
[  276.967666]  do_syscall_64+0x38/0x90
[  276.968028]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  276.968773] Freed by task 2209:
[  276.969111]  kasan_save_stack+0x1e/0x40
[  276.969449]  kasan_set_track+0x21/0x30
[  276.969789]  kasan_save_free_info+0x2a/0x50
[  276.970146]  __kasan_slab_free+0x106/0x190
[  276.970470]  __kmem_cache_free+0x133/0x270
[  276.970816]  device_release+0x98/0x210
[  276.971145]  kobject_cleanup+0x101/0x360
[  276.971462]  iscsi_session_teardown+0x3fb/0x530 [libiscsi]
[  276.971775]  iscsi_sw_tcp_session_destroy+0xd8/0x130 [iscsi_tcp]
[  276.972143]  iscsi_if_recv_msg+0x1bf1/0x2660 [scsi_transport_iscsi]
[  276.972485]  iscsi_if_rx+0x198/0x4b0 [scsi_transport_iscsi]
[  276.972808]  netlink_unicast+0x4d5/0x7b0
[  276.973201]  netlink_sendmsg+0x78d/0xc30
[  276.973544]  sock_sendmsg+0xe5/0x120
[  276.973864]  ____sys_sendmsg+0x5fe/0x860
[  276.974248]  ___sys_sendmsg+0xe0/0x170
[  276.974583]  __sys_sendmsg+0xc8/0x170
[  276.974891]  do_syscall_64+0x38/0x90
[  276.975216]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

We can easily reproduce by two tasks:
1. while :; do iscsiadm -m node --login; iscsiadm -m node --logout; done
2. while :; do cat \
/sys/devices/platform/host*/iscsi_host/host*/ipaddress; done

            iscsid              |        cat
--------------------------------+---------------------------------------
|- iscsi_sw_tcp_session_destroy |
  |- iscsi_session_teardown     |
    |- device_release           |
      |- iscsi_session_release  ||- dev_attr_show
        |- kfree                |  |- show_host_param_
                                |             ISCSI_HOST_PARAM_IPADDRESS
                                |    |- iscsi_sw_tcp_host_get_param
                                |      |- r/w tcp_sw_host->session (UAF)
  |- iscsi_host_remove          |
  |- iscsi_host_free            |

Fix the above bug by splitting the session removal into 2 parts:

 1. removal from iSCSI class which includes sysfs and removal from host
    tracking.

 2. freeing of session.

During iscsi_tcp host and session removal we can remove the session from
sysfs then remove the host from sysfs. At this point we know userspace is
not accessing the kernel via sysfs so we can free the session and host.

Link: https://lore.kernel.org/r/20230117193937.21244-2-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Acked-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
The referenced commit changed the error code returned by the kernel
when preventing a non-established socket from attaching the ktls
ULP. Before to such a commit, the user-space got ENOTCONN instead
of EINVAL.

The existing self-tests depend on such error code, and the change
caused a failure:

  RUN           global.non_established ...
 tls.c:1673:non_established:Expected errno (22) == ENOTCONN (107)
 non_established: Test failed at step #3
          FAIL  global.non_established

In the unlikely event existing applications do the same, address
the issue by restoring the prior error code in the above scenario.

Note that the only other ULP performing similar checks at init
time - smc_ulp_ops - also fails with ENOTCONN when trying to attach
the ULP to a non-established socket.

Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 2c02d41 ("net/ulp: prevent ULP without clone op from entering the LISTEN status")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/7bb199e7a93317fb6f8bf8b9b2dc71c18f337cde.1674042685.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
smaeul pushed a commit that referenced this pull request Feb 5, 2023
During EEH error injection testing, a deadlock was encountered in the tg3
driver when tg3_io_error_detected() was attempting to cancel outstanding
reset tasks:

crash> foreach UN bt
...
PID: 159    TASK: c0000000067c6000  CPU: 8   COMMAND: "eehd"
...
 #5 [c00000000681f990] __cancel_work_timer at c00000000019fd18
 #6 [c00000000681fa30] tg3_io_error_detected at c00800000295f098 [tg3]
 torvalds#7 [c00000000681faf0] eeh_report_error at c00000000004e25c
...

PID: 290    TASK: c000000036e5f800  CPU: 6   COMMAND: "kworker/6:1"
...
 #4 [c00000003721fbc0] rtnl_lock at c000000000c940d8
 #5 [c00000003721fbe0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c00000003721fc60] process_one_work at c00000000019e5c4
...

PID: 296    TASK: c000000037a65800  CPU: 21  COMMAND: "kworker/21:1"
...
 #4 [c000000037247bc0] rtnl_lock at c000000000c940d8
 #5 [c000000037247be0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c000000037247c60] process_one_work at c00000000019e5c4
...

PID: 655    TASK: c000000036f49000  CPU: 16  COMMAND: "kworker/16:2"
...:1

 #4 [c0000000373ebbc0] rtnl_lock at c000000000c940d8
 #5 [c0000000373ebbe0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c0000000373ebc60] process_one_work at c00000000019e5c4
...

Code inspection shows that both tg3_io_error_detected() and
tg3_reset_task() attempt to acquire the RTNL lock at the beginning of
their code blocks.  If tg3_reset_task() should happen to execute between
the times when tg3_io_error_deteced() acquires the RTNL lock and
tg3_reset_task_cancel() is called, a deadlock will occur.

Moving tg3_reset_task_cancel() call earlier within the code block, prior
to acquiring RTNL, prevents this from happening, but also exposes another
deadlock issue where tg3_reset_task() may execute AFTER
tg3_io_error_detected() has executed:

crash> foreach UN bt
PID: 159    TASK: c0000000067d2000  CPU: 9   COMMAND: "eehd"
...
 #4 [c000000006867a60] rtnl_lock at c000000000c940d8
 #5 [c000000006867a80] tg3_io_slot_reset at c0080000026c2ea8 [tg3]
 #6 [c000000006867b00] eeh_report_reset at c00000000004de88
...
PID: 363    TASK: c000000037564000  CPU: 6   COMMAND: "kworker/6:1"
...
 #3 [c000000036c1bb70] msleep at c000000000259e6c
 #4 [c000000036c1bba0] napi_disable at c000000000c6b848
 #5 [c000000036c1bbe0] tg3_reset_task at c0080000026d942c [tg3]
 #6 [c000000036c1bc60] process_one_work at c00000000019e5c4
...

This issue can be avoided by aborting tg3_reset_task() if EEH error
recovery is already in progress.

Fixes: db84bf4 ("tg3: tg3_reset_task() needs to use rtnl_lock to synchronize")
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20230124185339.225806-1-drc@linux.vnet.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants