Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] master from torvalds:master #96

Merged
merged 19 commits into from
Jul 7, 2020
Merged

[pull] master from torvalds:master #96

merged 19 commits into from
Jul 7, 2020

Conversation

pull[bot]
Copy link

@pull pull bot commented Jul 7, 2020

See Commits and Changes for more details.


Created by pull[bot]. Want to support this open source service? Please star it : )

richardweinberger and others added 19 commits June 15, 2020 19:39
ns_find_operation() returns 0 on success.

Fixes: 052a7a5 ("mtd: rawnand: nandsim: Clean error handling")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200615113404.25447-1-richard@nod.at
Check and set master panic write flag so that low level drivers
can use it to take required action to ensure oops data gets written
to assigned mtdoops device partition.

Fixes: 9f897bf ("mtd: Add flag to indicate panic_write")
Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200615155134.32007-1-kdasu.kdev@gmail.com
Trap handler for syscall tracing reads EFA (Exception Fault Address),
in case strace wants PC of trap instruction (EFA is not part of pt_regs
as of current code).

However this EFA read is racy as it happens after dropping to pure
kernel mode (re-enabling interrupts). A taken interrupt could
context-switch, trigger a different task's trap, clobbering EFA for this
execution context.

Fix this by reading EFA early, before re-enabling interrupts. A slight
side benefit is de-duplication of FAKE_RET_FROM_EXCPN in trap handler.
The trap handler is common to both ARCompact and ARCv2 builds too.

This just came out of code rework/review and no real problem was reported
but is clearly a potential problem specially for strace.

Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
kernel build system used to add -mcpu for each ARC ISA as default.
These days there are versions and varaints of ARC HS cores some of which
have specific -mcpu options to fine tune / optimize generated code.

So allow users/external build systems to specify their own -mcpu

This will be used in future patches for HSDK-4xD board support which
uses specific -mcpu to utilize dual issue scheduling of the core.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[abrodkin/vgupta: rewrote changelog]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
On HS cores, loop buffer (LPB) is programmable in runtime and can
be optionally disabled.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Convert fall through comments to the pseudo-keyword which is now the
preferred way.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Under somewhat convoluted conditions, it is possible to attempt to
release an extent_buffer that is under io, which triggers a BUG_ON in
btrfs_release_extent_buffer_pages.

This relies on a few different factors. First, extent_buffer reads done
as readahead for searching use WAIT_NONE, so they free the local extent
buffer reference while the io is outstanding. However, they should still
be protected by TREE_REF. However, if the system is doing signficant
reclaim, and simultaneously heavily accessing the extent_buffers, it is
possible for releasepage to race with two concurrent readahead attempts
in a way that leaves TREE_REF unset when the readahead extent buffer is
released.

Essentially, if two tasks race to allocate a new extent_buffer, but the
winner who attempts the first io is rebuffed by a page being locked
(likely by the reclaim itself) then the loser will still go ahead with
issuing the readahead. The loser's call to find_extent_buffer must also
race with the reclaim task reading the extent_buffer's refcount as 1 in
a way that allows the reclaim to re-clear the TREE_REF checked by
find_extent_buffer.

The following represents an example execution demonstrating the race:

            CPU0                                                         CPU1                                           CPU2
reada_for_search                                            reada_for_search
  readahead_tree_block                                        readahead_tree_block
    find_create_tree_block                                      find_create_tree_block
      alloc_extent_buffer                                         alloc_extent_buffer
                                                                  find_extent_buffer // not found
                                                                  allocates eb
                                                                  lock pages
                                                                  associate pages to eb
                                                                  insert eb into radix tree
                                                                  set TREE_REF, refs == 2
                                                                  unlock pages
                                                              read_extent_buffer_pages // WAIT_NONE
                                                                not uptodate (brand new eb)
                                                                                                            lock_page
                                                                if !trylock_page
                                                                  goto unlock_exit // not an error
                                                              free_extent_buffer
                                                                release_extent_buffer
                                                                  atomic_dec_and_test refs to 1
        find_extent_buffer // found
                                                                                                            try_release_extent_buffer
                                                                                                              take refs_lock
                                                                                                              reads refs == 1; no io
          atomic_inc_not_zero refs to 2
          mark_buffer_accessed
            check_buffer_tree_ref
              // not STALE, won't take refs_lock
              refs == 2; TREE_REF set // no action
    read_extent_buffer_pages // WAIT_NONE
                                                                                                              clear TREE_REF
                                                                                                              release_extent_buffer
                                                                                                                atomic_dec_and_test refs to 1
                                                                                                                unlock_page
      still not uptodate (CPU1 read failed on trylock_page)
      locks pages
      set io_pages > 0
      submit io
      return
    free_extent_buffer
      release_extent_buffer
        dec refs to 0
        delete from radix tree
        btrfs_release_extent_buffer_pages
          BUG_ON(io_pages > 0)!!!

We observe this at a very low rate in production and were also able to
reproduce it in a test environment by introducing some spurious delays
and by introducing probabilistic trylock_page failures.

To fix it, we apply check_tree_ref at a point where it could not
possibly be unset by a competing task: after io_pages has been
incremented. All the codepaths that clear TREE_REF check for io, so they
would not be able to clear it after this point until the io is done.

Stack trace, for reference:
[1417839.424739] ------------[ cut here ]------------
[1417839.435328] kernel BUG at fs/btrfs/extent_io.c:4841!
[1417839.447024] invalid opcode: 0000 [#1] SMP
[1417839.502972] RIP: 0010:btrfs_release_extent_buffer_pages+0x20/0x1f0
[1417839.517008] Code: ed e9 ...
[1417839.558895] RSP: 0018:ffffc90020bcf798 EFLAGS: 00010202
[1417839.570816] RAX: 0000000000000002 RBX: ffff888102d6def0 RCX: 0000000000000028
[1417839.586962] RDX: 0000000000000002 RSI: ffff8887f0296482 RDI: ffff888102d6def0
[1417839.603108] RBP: ffff88885664a000 R08: 0000000000000046 R09: 0000000000000238
[1417839.619255] R10: 0000000000000028 R11: ffff88885664af68 R12: 0000000000000000
[1417839.635402] R13: 0000000000000000 R14: ffff88875f573ad0 R15: ffff888797aafd90
[1417839.651549] FS:  00007f5a844fa700(0000) GS:ffff88885f680000(0000) knlGS:0000000000000000
[1417839.669810] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1417839.682887] CR2: 00007f7884541fe0 CR3: 000000049f609002 CR4: 00000000003606e0
[1417839.699037] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[1417839.715187] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[1417839.731320] Call Trace:
[1417839.737103]  release_extent_buffer+0x39/0x90
[1417839.746913]  read_block_for_search.isra.38+0x2a3/0x370
[1417839.758645]  btrfs_search_slot+0x260/0x9b0
[1417839.768054]  btrfs_lookup_file_extent+0x4a/0x70
[1417839.778427]  btrfs_get_extent+0x15f/0x830
[1417839.787665]  ? submit_extent_page+0xc4/0x1c0
[1417839.797474]  ? __do_readpage+0x299/0x7a0
[1417839.806515]  __do_readpage+0x33b/0x7a0
[1417839.815171]  ? btrfs_releasepage+0x70/0x70
[1417839.824597]  extent_readpages+0x28f/0x400
[1417839.833836]  read_pages+0x6a/0x1c0
[1417839.841729]  ? startup_64+0x2/0x30
[1417839.849624]  __do_page_cache_readahead+0x13c/0x1a0
[1417839.860590]  filemap_fault+0x6c7/0x990
[1417839.869252]  ? xas_load+0x8/0x80
[1417839.876756]  ? xas_find+0x150/0x190
[1417839.884839]  ? filemap_map_pages+0x295/0x3b0
[1417839.894652]  __do_fault+0x32/0x110
[1417839.902540]  __handle_mm_fault+0xacd/0x1000
[1417839.912156]  handle_mm_fault+0xaa/0x1c0
[1417839.921004]  __do_page_fault+0x242/0x4b0
[1417839.930044]  ? page_fault+0x8/0x30
[1417839.937933]  page_fault+0x1e/0x30
[1417839.945631] RIP: 0033:0x33c4bae
[1417839.952927] Code: Bad RIP value.
[1417839.960411] RSP: 002b:00007f5a844f7350 EFLAGS: 00010206
[1417839.972331] RAX: 000000000000006e RBX: 1614b3ff6a50398a RCX: 0000000000000000
[1417839.988477] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002
[1417840.004626] RBP: 00007f5a844f7420 R08: 000000000000006e R09: 00007f5a94aeccb8
[1417840.020784] R10: 00007f5a844f7350 R11: 0000000000000000 R12: 00007f5a94aecc79
[1417840.036932] R13: 00007f5a94aecc78 R14: 00007f5a94aecc90 R15: 00007f5a94aecc40

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 7f9fe61 ("btrfs: improve global reserve stealing logic"),
added in the 5.8 merge window, introduced another leak for the space_info's
reclaim_size counter. This is very often triggered by the test cases
generic/269 and generic/416 from fstests, producing a stack trace like the
following during unmount:

[37079.155499] ------------[ cut here ]------------
[37079.156844] WARNING: CPU: 2 PID: 2000423 at fs/btrfs/block-group.c:3422 btrfs_free_block_groups+0x2eb/0x300 [btrfs]
[37079.158090] Modules linked in: dm_snapshot btrfs dm_thin_pool (...)
[37079.164440] CPU: 2 PID: 2000423 Comm: umount Tainted: G        W         5.7.0-rc7-btrfs-next-62 #1
[37079.165422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), (...)
[37079.167384] RIP: 0010:btrfs_free_block_groups+0x2eb/0x300 [btrfs]
[37079.168375] Code: bd 58 ff ff ff 00 4c 8d (...)
[37079.170199] RSP: 0018:ffffaa53875c7de0 EFLAGS: 00010206
[37079.171120] RAX: ffff98099e701cf8 RBX: ffff98099e2d4000 RCX: 0000000000000000
[37079.172057] RDX: 0000000000000001 RSI: ffffffffc0acc5b1 RDI: 00000000ffffffff
[37079.173002] RBP: ffff98099e701cf8 R08: 0000000000000000 R09: 0000000000000000
[37079.173886] R10: 0000000000000000 R11: 0000000000000000 R12: ffff98099e701c00
[37079.174730] R13: ffff98099e2d5100 R14: dead000000000122 R15: dead000000000100
[37079.175578] FS:  00007f4d7d0a5840(0000) GS:ffff9809ec600000(0000) knlGS:0000000000000000
[37079.176434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37079.177289] CR2: 0000559224dcc000 CR3: 000000012207a004 CR4: 00000000003606e0
[37079.178152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[37079.178935] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[37079.179675] Call Trace:
[37079.180419]  close_ctree+0x291/0x2d1 [btrfs]
[37079.181162]  generic_shutdown_super+0x6c/0x100
[37079.181898]  kill_anon_super+0x14/0x30
[37079.182641]  btrfs_kill_super+0x12/0x20 [btrfs]
[37079.183371]  deactivate_locked_super+0x31/0x70
[37079.184012]  cleanup_mnt+0x100/0x160
[37079.184650]  task_work_run+0x68/0xb0
[37079.185284]  exit_to_usermode_loop+0xf9/0x100
[37079.185920]  do_syscall_64+0x20d/0x260
[37079.186556]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
[37079.187197] RIP: 0033:0x7f4d7d2d9357
[37079.187836] Code: eb 0b 00 f7 d8 64 89 01 48 (...)
[37079.189180] RSP: 002b:00007ffee4e0d368 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[37079.189845] RAX: 0000000000000000 RBX: 00007f4d7d3fb224 RCX: 00007f4d7d2d9357
[37079.190515] RDX: ffffffffffffff78 RSI: 0000000000000000 RDI: 0000559224dc5c90
[37079.191173] RBP: 0000559224dc1970 R08: 0000000000000000 R09: 00007ffee4e0c0e0
[37079.191815] R10: 0000559224dc7b00 R11: 0000000000000246 R12: 0000000000000000
[37079.192451] R13: 0000559224dc5c90 R14: 0000559224dc1a80 R15: 0000559224dc1ba0
[37079.193096] irq event stamp: 0
[37079.193729] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[37079.194379] hardirqs last disabled at (0): [<ffffffff97ab8935>] copy_process+0x755/0x1ea0
[37079.195033] softirqs last  enabled at (0): [<ffffffff97ab8935>] copy_process+0x755/0x1ea0
[37079.195700] softirqs last disabled at (0): [<0000000000000000>] 0x0
[37079.196318] ---[ end trace b32710d864dea887 ]---

In the past commit d611add ("btrfs: fix reclaim counter leak of
space_info objects") fixed similar cases. That commit however has a date
more recent (April 7 2020) then the commit mentioned before (March 13
2020), however it was merged in kernel 5.7 while the older commit, which
introduces a new leak, was merged only in the 5.8 merge window. So the
leak sneaked in unnoticed.

Fix this by making steal_from_global_rsv() remove the ticket using the
helper remove_ticket(), which decrements the reclaim_size counter of the
space_info object.

Fixes: 7f9fe61 ("btrfs: improve global reserve stealing logic")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Eric reported an issue where mounting -o recovery with a fuzzed fs
resulted in a kernel panic.  This is because we tried to free the tree
node, except it was an error from the read.  Fix this by properly
resetting the tree_root->node == NULL in this case.  The panic was the
following

  BTRFS warning (device loop0): failed to read tree root
  BUG: kernel NULL pointer dereference, address: 000000000000001f
  RIP: 0010:free_extent_buffer+0xe/0x90 [btrfs]
  Call Trace:
   free_root_extent_buffers.part.0+0x11/0x30 [btrfs]
   free_root_pointers+0x1a/0xa2 [btrfs]
   open_ctree+0x1776/0x18a5 [btrfs]
   btrfs_mount_root.cold+0x13/0xfa [btrfs]
   ? selinux_fs_context_parse_param+0x37/0x80
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   fc_mount+0xe/0x30
   vfs_kern_mount.part.0+0x71/0x90
   btrfs_mount+0x147/0x3e0 [btrfs]
   ? cred_has_capability+0x7c/0x120
   ? legacy_get_tree+0x27/0x40
   legacy_get_tree+0x27/0x40
   vfs_get_tree+0x25/0xb0
   do_mount+0x735/0xa40
   __x64_sys_mount+0x8e/0xd0
   do_syscall_64+0x4d/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Nik says: this is problematic only if we fail on the last iteration of
the loop as this results in init_tree_roots returning err value with
tree_root->node = -ERR. Subsequently the caller does: fail_tree_roots
which calls free_root_pointers on the bogus value.

Reported-by: Eric Sandeen <sandeen@redhat.com>
Fixes: b8522a1 ("btrfs: Factor out tree roots initialization during mount")
CC: stable@vger.kernel.org # 5.5+
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add details how the pointer gets dereferenced ]
Signed-off-by: David Sterba <dsterba@suse.com>
Removing IFX0102 from tpm_tis was not a right move because both tpm_tis
and tpm_infineon use the same device ID. Revert the commit and add a
remark about a bug caused by commit 93e1b7d ("[PATCH] tpm: add HID
module parameter").

Fixes: e918e57 ("tpm_tis: Remove the HID IFX0102")
Reported-by: Peter Huewe <peterhuewe@gmx.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
This MIPS driver does not support COMPILE_TEST yet and failed to build
under my radar.

Replace 'mtd' chich is not defined in the scope of xway_nand_remove()
by nand_to_mtd(chip). The mistake has been added in the long series
dropping nand_release().

Tested with a 7.3.0 MIPS GCC toolchain built with Buildroot.

Fixes: 9fdd78f ("mtd: rawnand: xway: Stop using nand_release()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20200626065511.16424-1-miquel.raynal@bootlin.com
…linux-tpmdd

Pull tpm fix from Jarkko Sakkinen:
 "Revert commit e918e57 ("tpm_tis: Remove the HID IFX0102").

  Removing IFX0102 from tpm_tis was not a right move because both
  tpm_tis and tpm_infineon use the same device ID.

  A real fix requires quirks added to both drivers. It can probably wait
  until v5.9 as the bug has existed since 2006"

* tag 'tpmdd-next-v5.8-rc5' of git://git.infradead.org/users/jjs/linux-tpmdd:
  Revert commit e918e57 ("tpm_tis: Remove the HID IFX0102")
…git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - User build systems to pass -mcpu

 - Fix potential EFA clobber in syscall handler

 - Fix ARCompact 2 levels of interrupts build

 - Detect newer HS CPU releases

 - misc other fixes

* tag 'arc-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARCv2: support loop buffer (LPB) disabling
  ARC: build: remove deprecated toggle for arc700 builds
  ARC: build: allow users to specify -mcpu
  ARCv2: boot log: detect newer/upconing HS3x/HS4x releases
  ARC: elf: use right ELF_ARCH
  ARC: [arcompact] fix bitrot with 2 levels of interrupt
  ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE
…nel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - regression fix of a leak in global block reserve accounting

 - fix a (hard to hit) race of readahead vs releasepage that could lead
   to crash

 - convert all remaining uses of comment fall through annotations to the
   pseudo keyword

 - fix crash when mounting a fuzzed image with -o recovery

* tag 'for-5.8-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: reset tree root pointer after error in init_tree_roots
  btrfs: fix reclaim_size counter leak after stealing from global reserve
  btrfs: fix fatal extent_buffer readahead vs releasepage race
  btrfs: convert comments to fallthrough annotations
…ux/kernel/git/mtd/linux

Pull MTD fixes from Miquel Raynal:
 "MTD:
   - Set a missing master partition panic write flag

  Raw NAND:
   - Fix build issue in the xway driver
   - Fix a wrong return code"

* tag 'mtd/fixes-for-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: xway: Fix build issue
  mtd: set master partition panic write flag
  nandsim: Fix return code testing of ns_find_operation()
@pull pull bot added the ⤵️ pull label Jul 7, 2020
@pull pull bot merged commit 6d12075 into bergwolf:master Jul 7, 2020
pull bot pushed a commit that referenced this pull request Nov 27, 2020
syzkaller found that with CONFIG_DEBUG_KOBJECT_RELEASE=y, releasing a
struct slave device could result in the following splat:

  kobject: 'bonding_slave' (00000000cecdd4fe): kobject_release, parent 0000000074ceb2b2 (delayed 1000)
  bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
  ------------[ cut here ]------------
  ODEBUG: free active (active state 0) object type: timer_list hint: workqueue_select_cpu_near kernel/workqueue.c:1549 [inline]
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x98 kernel/workqueue.c:1600
  WARNING: CPU: 1 PID: 842 at lib/debugobjects.c:485 debug_print_object+0x180/0x240 lib/debugobjects.c:485
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 842 Comm: kworker/u4:4 Tainted: G S                5.9.0-rc8+ #96
  Hardware name: linux,dummy-virt (DT)
  Workqueue: netns cleanup_net
  Call trace:
   dump_backtrace+0x0/0x4d8 include/linux/bitmap.h:239
   show_stack+0x34/0x48 arch/arm64/kernel/traps.c:142
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x174/0x1f8 lib/dump_stack.c:118
   panic+0x360/0x7a0 kernel/panic.c:231
   __warn+0x244/0x2ec kernel/panic.c:600
   report_bug+0x240/0x398 lib/bug.c:198
   bug_handler+0x50/0xc0 arch/arm64/kernel/traps.c:974
   call_break_hook+0x160/0x1d8 arch/arm64/kernel/debug-monitors.c:322
   brk_handler+0x30/0xc0 arch/arm64/kernel/debug-monitors.c:329
   do_debug_exception+0x184/0x340 arch/arm64/mm/fault.c:864
   el1_dbg+0x48/0xb0 arch/arm64/kernel/entry-common.c:65
   el1_sync_handler+0x170/0x1c8 arch/arm64/kernel/entry-common.c:93
   el1_sync+0x80/0x100 arch/arm64/kernel/entry.S:594
   debug_print_object+0x180/0x240 lib/debugobjects.c:485
   __debug_check_no_obj_freed lib/debugobjects.c:967 [inline]
   debug_check_no_obj_freed+0x200/0x430 lib/debugobjects.c:998
   slab_free_hook mm/slub.c:1536 [inline]
   slab_free_freelist_hook+0x190/0x210 mm/slub.c:1577
   slab_free mm/slub.c:3138 [inline]
   kfree+0x13c/0x460 mm/slub.c:4119
   bond_free_slave+0x8c/0xf8 drivers/net/bonding/bond_main.c:1492
   __bond_release_one+0xe0c/0xec8 drivers/net/bonding/bond_main.c:2190
   bond_slave_netdev_event drivers/net/bonding/bond_main.c:3309 [inline]
   bond_netdev_event+0x8f0/0xa70 drivers/net/bonding/bond_main.c:3420
   notifier_call_chain+0xf0/0x200 kernel/notifier.c:83
   __raw_notifier_call_chain kernel/notifier.c:361 [inline]
   raw_notifier_call_chain+0x44/0x58 kernel/notifier.c:368
   call_netdevice_notifiers_info+0xbc/0x150 net/core/dev.c:2033
   call_netdevice_notifiers_extack net/core/dev.c:2045 [inline]
   call_netdevice_notifiers net/core/dev.c:2059 [inline]
   rollback_registered_many+0x6a4/0xec0 net/core/dev.c:9347
   unregister_netdevice_many.part.0+0x2c/0x1c0 net/core/dev.c:10509
   unregister_netdevice_many net/core/dev.c:10508 [inline]
   default_device_exit_batch+0x294/0x338 net/core/dev.c:10992
   ops_exit_list.isra.0+0xec/0x150 net/core/net_namespace.c:189
   cleanup_net+0x44c/0x888 net/core/net_namespace.c:603
   process_one_work+0x96c/0x18c0 kernel/workqueue.c:2269
   worker_thread+0x3f0/0xc30 kernel/workqueue.c:2415
   kthread+0x390/0x498 kernel/kthread.c:292
   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:925

This is a potential use-after-free if the sysfs nodes are being accessed
whilst removing the struct slave, so wait for the object destruction to
complete before freeing the struct slave itself.

Fixes: 07699f9 ("bonding: add sysfs /slave dir for bond slave devices.")
Fixes: a068aab ("bonding: Fix reference count leak in bond_sysfs_slave_add.")
Cc: Qiushi Wu <wu000273@umn.edu>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jamie Iles <jamie@nuviainc.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20201120142827.879226-1-jamie@nuviainc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
pull bot pushed a commit that referenced this pull request Jul 3, 2021
The "auxtrace_info" and "auxtrace" functions are not set in "tool" member of
"annotate". As a result, perf annotate does not support parsing itrace data.

Before:

  # perf record -e arm_spe_0/branch_filter=1/ -a sleep 1
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 20.874 MB perf.data ]
  # perf annotate --stdio
  Error:
  The perf.data data has no samples!

Solution:

1. Add itrace options in help,
2. Set hook functions of "id_index", "auxtrace_info" and "auxtrace" in perf_tool.

After:

  # perf record --all-user -e arm_spe_0/branch_filter=1/ ls
  Couldn't synthesize bpf events.
  perf.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.010 MB perf.data ]
  # perf annotate --stdio
   Percent |      Source code & Disassembly of libc-2.28.so for branch-miss (1 samples, percent: local period)
  ------------------------------------------------------------------------------------------------------------
           :
           :
           :
           :           Disassembly of section .text:
           :
           :           0000000000066180 <__getdelim@@GLIBC_2.17>:
      0.00 :   66180:  stp     x29, x30, [sp, #-96]!
      0.00 :   66184:  cmp     x0, #0x0
      0.00 :   66188:  ccmp    x1, #0x0, #0x4, ne  // ne = any
      0.00 :   6618c:  mov     x29, sp
      0.00 :   66190:  stp     x24, x25, [sp, #56]
      0.00 :   66194:  stp     x26, x27, [sp, #72]
      0.00 :   66198:  str     x28, [sp, #88]
      0.00 :   6619c:  b.eq    66450 <__getdelim@@GLIBC_2.17+0x2d0>  // b.none
      0.00 :   661a0:  stp     x22, x23, [x29, #40]
      0.00 :   661a4:  mov     x22, x1
      0.00 :   661a8:  ldr     w1, [x3]
      0.00 :   661ac:  mov     w23, w2
      0.00 :   661b0:  stp     x20, x21, [x29, #24]
      0.00 :   661b4:  mov     x20, x3
      0.00 :   661b8:  mov     x21, x0
      0.00 :   661bc:  tbnz    w1, #15, 66360 <__getdelim@@GLIBC_2.17+0x1e0>
      0.00 :   661c0:  ldr     x0, [x3, #136]
      0.00 :   661c4:  ldr     x2, [x0, #8]
      0.00 :   661c8:  str     x19, [x29, #16]
      0.00 :   661cc:  mrs     x19, tpidr_el0
      0.00 :   661d0:  sub     x19, x19, #0x700
      0.00 :   661d4:  cmp     x2, x19
      0.00 :   661d8:  b.eq    663f0 <__getdelim@@GLIBC_2.17+0x270>  // b.none
      0.00 :   661dc:  mov     w1, #0x1                        // #1
      0.00 :   661e0:  ldaxr   w2, [x0]
      0.00 :   661e4:  cmp     w2, #0x0
      0.00 :   661e8:  b.ne    661f4 <__getdelim@@GLIBC_2.17+0x74>  // b.any
      0.00 :   661ec:  stxr    w3, w1, [x0]
      0.00 :   661f0:  cbnz    w3, 661e0 <__getdelim@@GLIBC_2.17+0x60>
      0.00 :   661f4:  b.ne    66448 <__getdelim@@GLIBC_2.17+0x2c8>  // b.any
      0.00 :   661f8:  ldr     x0, [x20, #136]
      0.00 :   661fc:  ldr     w1, [x20]
      0.00 :   66200:  ldr     w2, [x0, #4]
      0.00 :   66204:  str     x19, [x0, #8]
      0.00 :   66208:  add     w2, w2, #0x1
      0.00 :   6620c:  str     w2, [x0, #4]
      0.00 :   66210:  tbnz    w1, #5, 66388 <__getdelim@@GLIBC_2.17+0x208>
      0.00 :   66214:  ldr     x19, [x29, #16]
      0.00 :   66218:  ldr     x0, [x21]
      0.00 :   6621c:  cbz     x0, 66228 <__getdelim@@GLIBC_2.17+0xa8>
      0.00 :   66220:  ldr     x0, [x22]
      0.00 :   66224:  cbnz    x0, 6623c <__getdelim@@GLIBC_2.17+0xbc>
      0.00 :   66228:  mov     x0, #0x78                       // #120
      0.00 :   6622c:  str     x0, [x22]
      0.00 :   66230:  bl      20710 <malloc@plt>
      0.00 :   66234:  str     x0, [x21]
      0.00 :   66238:  cbz     x0, 66428 <__getdelim@@GLIBC_2.17+0x2a8>
      0.00 :   6623c:  ldr     x27, [x20, #8]
      0.00 :   66240:  str     x19, [x29, #16]
      0.00 :   66244:  ldr     x19, [x20, #16]
      0.00 :   66248:  sub     x19, x19, x27
      0.00 :   6624c:  cmp     x19, #0x0
      0.00 :   66250:  b.le    66398 <__getdelim@@GLIBC_2.17+0x218>
      0.00 :   66254:  mov     x25, #0x0                       // #0
      0.00 :   66258:  b       662d8 <__getdelim@@GLIBC_2.17+0x158>
      0.00 :   6625c:  nop
      0.00 :   66260:  add     x24, x19, x25
      0.00 :   66264:  ldr     x3, [x22]
      0.00 :   66268:  add     x26, x24, #0x1
      0.00 :   6626c:  ldr     x0, [x21]
      0.00 :   66270:  cmp     x3, x26
      0.00 :   66274:  b.cs    6629c <__getdelim@@GLIBC_2.17+0x11c>  // b.hs, b.nlast
      0.00 :   66278:  lsl     x3, x3, #1
      0.00 :   6627c:  cmp     x3, x26
      0.00 :   66280:  csel    x26, x3, x26, cs  // cs = hs, nlast
      0.00 :   66284:  mov     x1, x26
      0.00 :   66288:  bl      206f0 <realloc@plt>
      0.00 :   6628c:  cbz     x0, 66438 <__getdelim@@GLIBC_2.17+0x2b8>
      0.00 :   66290:  str     x0, [x21]
      0.00 :   66294:  ldr     x27, [x20, #8]
      0.00 :   66298:  str     x26, [x22]
      0.00 :   6629c:  mov     x2, x19
      0.00 :   662a0:  mov     x1, x27
      0.00 :   662a4:  add     x0, x0, x25
      0.00 :   662a8:  bl      87390 <explicit_bzero@@GLIBC_2.25+0x50>
      0.00 :   662ac:  ldr     x0, [x20, #8]
      0.00 :   662b0:  add     x19, x0, x19
      0.00 :   662b4:  str     x19, [x20, #8]
      0.00 :   662b8:  cbnz    x28, 66410 <__getdelim@@GLIBC_2.17+0x290>
      0.00 :   662bc:  mov     x0, x20
      0.00 :   662c0:  bl      73b80 <__underflow@@GLIBC_2.17>
      0.00 :   662c4:  cmn     w0, #0x1
      0.00 :   662c8:  b.eq    66410 <__getdelim@@GLIBC_2.17+0x290>  // b.none
      0.00 :   662cc:  ldp     x27, x19, [x20, #8]
      0.00 :   662d0:  mov     x25, x24
      0.00 :   662d4:  sub     x19, x19, x27
      0.00 :   662d8:  mov     x2, x19
      0.00 :   662dc:  mov     w1, w23
      0.00 :   662e0:  mov     x0, x27
      0.00 :   662e4:  bl      807b0 <memchr@@GLIBC_2.17>
      0.00 :   662e8:  cmp     x0, #0x0
      0.00 :   662ec:  mov     x28, x0
      0.00 :   662f0:  sub     x0, x0, x27
      0.00 :   662f4:  csinc   x19, x19, x0, eq  // eq = none
      0.00 :   662f8:  mov     x0, #0x7fffffffffffffff         // #9223372036854775807
      0.00 :   662fc:  sub     x0, x0, x25
      0.00 :   66300:  cmp     x19, x0
      0.00 :   66304:  b.lt    66260 <__getdelim@@GLIBC_2.17+0xe0>  // b.tstop
      0.00 :   66308:  adrp    x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
      0.00 :   6630c:  ldr     x0, [x0, #3624]
      0.00 :   66310:  mrs     x2, tpidr_el0
      0.00 :   66314:  ldr     x19, [x29, #16]
      0.00 :   66318:  mov     w3, #0x4b                       // #75
      0.00 :   6631c:  ldr     w1, [x20]
      0.00 :   66320:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   66324:  str     w3, [x2, x0]
      0.00 :   66328:  tbnz    w1, #15, 66340 <__getdelim@@GLIBC_2.17+0x1c0>
      0.00 :   6632c:  ldr     x0, [x20, #136]
      0.00 :   66330:  ldr     w1, [x0, #4]
      0.00 :   66334:  sub     w1, w1, #0x1
      0.00 :   66338:  str     w1, [x0, #4]
      0.00 :   6633c:  cbz     w1, 663b8 <__getdelim@@GLIBC_2.17+0x238>
      0.00 :   66340:  mov     x0, x24
      0.00 :   66344:  ldr     x28, [sp, #88]
      0.00 :   66348:  ldp     x20, x21, [x29, #24]
      0.00 :   6634c:  ldp     x22, x23, [x29, #40]
      0.00 :   66350:  ldp     x24, x25, [sp, #56]
      0.00 :   66354:  ldp     x26, x27, [sp, #72]
      0.00 :   66358:  ldp     x29, x30, [sp], #96
      0.00 :   6635c:  ret
    100.00 :   66360:  tbz     w1, #5, 66218 <__getdelim@@GLIBC_2.17+0x98>
      0.00 :   66364:  ldp     x20, x21, [x29, #24]
      0.00 :   66368:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6636c:  ldp     x22, x23, [x29, #40]
      0.00 :   66370:  mov     x0, x24
      0.00 :   66374:  ldp     x24, x25, [sp, #56]
      0.00 :   66378:  ldp     x26, x27, [sp, #72]
      0.00 :   6637c:  ldr     x28, [sp, #88]
      0.00 :   66380:  ldp     x29, x30, [sp], #96
      0.00 :   66384:  ret
      0.00 :   66388:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6638c:  ldr     x19, [x29, #16]
      0.00 :   66390:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66394:  nop
      0.00 :   66398:  mov     x0, x20
      0.00 :   6639c:  bl      73b80 <__underflow@@GLIBC_2.17>
      0.00 :   663a0:  cmn     w0, #0x1
      0.00 :   663a4:  b.eq    66438 <__getdelim@@GLIBC_2.17+0x2b8>  // b.none
      0.00 :   663a8:  ldp     x27, x19, [x20, #8]
      0.00 :   663ac:  sub     x19, x19, x27
      0.00 :   663b0:  b       66254 <__getdelim@@GLIBC_2.17+0xd4>
      0.00 :   663b4:  nop
      0.00 :   663b8:  str     xzr, [x0, #8]
      0.00 :   663bc:  ldxr    w2, [x0]
      0.00 :   663c0:  stlxr   w3, w1, [x0]
      0.00 :   663c4:  cbnz    w3, 663bc <__getdelim@@GLIBC_2.17+0x23c>
      0.00 :   663c8:  cmp     w2, #0x1
      0.00 :   663cc:  b.le    66340 <__getdelim@@GLIBC_2.17+0x1c0>
      0.00 :   663d0:  mov     x1, #0x81                       // #129
      0.00 :   663d4:  mov     x2, #0x1                        // #1
      0.00 :   663d8:  mov     x3, #0x0                        // #0
      0.00 :   663dc:  mov     x8, #0x62                       // #98
      0.00 :   663e0:  svc     #0x0
      0.00 :   663e4:  ldp     x20, x21, [x29, #24]
      0.00 :   663e8:  ldp     x22, x23, [x29, #40]
      0.00 :   663ec:  b       66370 <__getdelim@@GLIBC_2.17+0x1f0>
      0.00 :   663f0:  ldr     w2, [x0, #4]
      0.00 :   663f4:  add     w2, w2, #0x1
      0.00 :   663f8:  str     w2, [x0, #4]
      0.00 :   663fc:  tbz     w1, #5, 66214 <__getdelim@@GLIBC_2.17+0x94>
      0.00 :   66400:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   66404:  ldr     x19, [x29, #16]
      0.00 :   66408:  b       66330 <__getdelim@@GLIBC_2.17+0x1b0>
      0.00 :   6640c:  nop
      0.00 :   66410:  ldr     x0, [x21]
      0.00 :   66414:  strb    wzr, [x0, x24]
      0.00 :   66418:  ldr     w1, [x20]
      0.00 :   6641c:  ldr     x19, [x29, #16]
      0.00 :   66420:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66424:  nop
      0.00 :   66428:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6642c:  ldr     w1, [x20]
      0.00 :   66430:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66434:  nop
      0.00 :   66438:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   6643c:  ldr     w1, [x20]
      0.00 :   66440:  ldr     x19, [x29, #16]
      0.00 :   66444:  b       66328 <__getdelim@@GLIBC_2.17+0x1a8>
      0.00 :   66448:  bl      e3ba0 <pthread_setcanceltype@@GLIBC_2.17+0x30>
      0.00 :   6644c:  b       661f8 <__getdelim@@GLIBC_2.17+0x78>
      0.00 :   66450:  adrp    x0, 17f000 <sys_sigabbrev@@GLIBC_2.17+0x320>
      0.00 :   66454:  ldr     x0, [x0, #3624]
      0.00 :   66458:  mrs     x1, tpidr_el0
      0.00 :   6645c:  mov     w2, #0x16                       // #22
      0.00 :   66460:  mov     x24, #0xffffffffffffffff        // #-1
      0.00 :   66464:  str     w2, [x1, x0]
      0.00 :   66468:  b       66370 <__getdelim@@GLIBC_2.17+0x1f0>
      0.00 :   6646c:  ldr     w1, [x20]
      0.00 :   66470:  mov     x4, x0
      0.00 :   66474:  tbnz    w1, #15, 6648c <__getdelim@@GLIBC_2.17+0x30c>
      0.00 :   66478:  ldr     x0, [x20, #136]
      0.00 :   6647c:  ldr     w1, [x0, #4]
      0.00 :   66480:  sub     w1, w1, #0x1
      0.00 :   66484:  str     w1, [x0, #4]
      0.00 :   66488:  cbz     w1, 66494 <__getdelim@@GLIBC_2.17+0x314>
      0.00 :   6648c:  mov     x0, x4
      0.00 :   66490:  bl      20e40 <gnu_get_libc_version@@GLIBC_2.17+0x130>
      0.00 :   66494:  str     xzr, [x0, #8]
      0.00 :   66498:  ldxr    w2, [x0]
      0.00 :   6649c:  stlxr   w3, w1, [x0]
      0.00 :   664a0:  cbnz    w3, 66498 <__getdelim@@GLIBC_2.17+0x318>
      0.00 :   664a4:  cmp     w2, #0x1
      0.00 :   664a8:  b.le    6648c <__getdelim@@GLIBC_2.17+0x30c>
      0.00 :   664ac:  mov     x1, #0x81                       // #129
      0.00 :   664b0:  mov     x2, #0x1                        // #1
      0.00 :   664b4:  mov     x3, #0x0                        // #0
      0.00 :   664b8:  mov     x8, #0x62                       // #98
      0.00 :   664bc:  svc     #0x0
      0.00 :   664c0:  b       6648c <__getdelim@@GLIBC_2.17+0x30c>

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210615091704.259202-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pull bot pushed a commit that referenced this pull request Jun 2, 2023
The ice driver caches next_to_clean value at the beginning of
ice_clean_rx_irq() in order to remember the first buffer that has to be
freed/recycled after main Rx processing loop. The end boundary is
indicated by first descriptor of frame that Rx processing loop has ended
its duties. Note that if mentioned loop ended in the middle of gathering
multi-buffer frame, next_to_clean would be pointing to the descriptor in
the middle of the frame BUT freeing/recycling stage will stop at the
first descriptor. This means that next iteration of ice_clean_rx_irq()
will miss the (first_desc, next_to_clean - 1) entries.

 When running various 9K MTU workloads, such splats were observed:

[  540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  540.787787] #PF: supervisor read access in kernel mode
[  540.793002] #PF: error_code(0x0000) - not-present page
[  540.798218] PGD 0 P4D 0
[  540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G        W          6.3.0-rc7+ #96
[  540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[  540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
[  540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
[  540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
[  540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
[  540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
[  540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
[  540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
[  540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
[  540.890253] FS:  00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
[  540.898465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
[  540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  540.926032] PKRU: 55555554
[  540.928790] Call Trace:
[  540.931276]  <TASK>
[  540.933418]  ice_napi_poll+0x4ca/0x6d0 [ice]
[  540.937804]  ? __pfx_ice_napi_poll+0x10/0x10 [ice]
[  540.942716]  napi_busy_loop+0xd7/0x320
[  540.946537]  xsk_recvmsg+0x143/0x170
[  540.950178]  sock_recvmsg+0x99/0xa0
[  540.953729]  __sys_recvfrom+0xa8/0x120
[  540.957543]  ? do_futex+0xbd/0x1d0
[  540.961008]  ? __x64_sys_futex+0x73/0x1d0
[  540.965083]  __x64_sys_recvfrom+0x20/0x30
[  540.969155]  do_syscall_64+0x38/0x90
[  540.972796]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  540.977934] RIP: 0033:0x7f6de5f27934

To fix this, set cached_ntc to first_desc so that at the end, when
freeing/recycling buffers, descriptors from first to ntc are not missed.

Fixes: 2fba7dc ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230531154457.3216621-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
pull bot pushed a commit that referenced this pull request Nov 4, 2023
Update perf JSON files with spelling fixes by Colin Ian King
<colin.i.king@gmail.com> contributed in:

intel/perfmon#96 "Fix various spelling mistakes and typos as found using codespell #96"

This is added on top of the spelling mistakes and release number
updates in:

  intel/perfmon#98 "EMR, SPR, CLX, SKX, BDX, HSX, BDW-DE, WSM-EP*, NHM-*, JKT, IVT : Release event updates"

Some additional spelling fixes reported by Edward Baker
<edward.baker@intel.com> are added on top of this.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230829001730.1352769-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pull bot pushed a commit that referenced this pull request Jul 16, 2024
The distributor and PMR/RPR can present different views of the interrupt
priority space dependent upon the values of GICD_CTLR.DS and
SCR_EL3.FIQ. Currently we treat the distributor's view of the priority
space as canonical, and when the two differ we change the way we handle
values in the PMR/RPR, using the `gic_nonsecure_priorities` static key
to decide what to do.

This approach works, but it's sub-optimal. When using pseudo-NMI we
manipulate the distributor rarely, and we manipulate the PMR/RPR
registers very frequently in code spread out throughout the kernel (e.g.
local_irq_{save,restore}()). It would be nicer if we could use fixed
values for the PMR/RPR, and dynamically choose the values programmed
into the distributor.

This patch changes the GICv3 driver and arm64 code accordingly. PMR
values are chosen at compile time, and the GICv3 driver determines the
appropriate values to program into the distributor at boot time. This
removes the need for the `gic_nonsecure_priorities` static key and
results in smaller and better generated code for saving/restoring the
irqflags.

Before this patch, local_irq_disable() compiles to:

| 0000000000000000 <outlined_local_irq_disable>:
|    0:   d503201f        nop
|    4:   d50343df        msr     daifset, #0x3
|    8:   d65f03c0        ret
|    c:   d503201f        nop
|   10:   d2800c00        mov     x0, #0x60                       // #96
|   14:   d5184600        msr     icc_pmr_el1, x0
|   18:   d65f03c0        ret
|   1c:   d2801400        mov     x0, #0xa0                       // #160
|   20:   17fffffd        b       14 <outlined_local_irq_disable+0x14>

After this patch, local_irq_disable() compiles to:

| 0000000000000000 <outlined_local_irq_disable>:
|    0:   d503201f        nop
|    4:   d50343df        msr     daifset, #0x3
|    8:   d65f03c0        ret
|    c:   d2801800        mov     x0, #0xc0                       // #192
|   10:   d5184600        msr     icc_pmr_el1, x0
|   14:   d65f03c0        ret

... with 3 fewer instructions per call.

For defconfig + CONFIG_PSEUDO_NMI=y, this results in a minor saving of
~4K of text, and will make it easier to make further improvements to the
way we manipulate irqflags and DAIF bits.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240617111841.2529370-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
pull bot pushed a commit that referenced this pull request Sep 22, 2024
This command allows users to quickly retrieve a stacktrace using a handle
obtained from a memory coredump.

Example output:
(gdb) lx-stack_depot_lookup 0x00c80300
   0xffff8000807965b4 <kmem_cache_alloc_noprof+660>:    mov     x20, x0
   0xffff800081a077d8 <kmem_cache_oob_alloc+76>:        mov     x1, x0
   0xffff800081a079a0 <test_version_show+100>:  cbnz    w0, 0xffff800081a07968 <test_version_show+44>
   0xffff800082f4a3fc <kobj_attr_show+60>:      ldr     x19, [sp, #16]
   0xffff800080a0fb34 <sysfs_kf_seq_show+460>:  ldp     x3, x4, [sp, #96]
   0xffff800080a0a550 <kernfs_seq_show+296>:    ldp     x19, x20, [sp, #16]
   0xffff8000808e7b40 <seq_read_iter+836>:      mov     w5, w0
   0xffff800080a0b8ac <kernfs_fop_read_iter+804>:       mov     x23, x0
   0xffff800080914a48 <copy_splice_read+972>:   mov     x6, x0
   0xffff8000809151c4 <do_splice_read+348>:     ldr     x21, [sp, #32]

Link: https://lkml.kernel.org/r/20240723064902.124154-5-kuan-ying.lee@canonical.com
Signed-off-by: Kuan-Ying Lee <kuan-ying.lee@canonical.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kieran Bingham <kbingham@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants