Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Oversun's MFC and CEC fixes #76

Merged
merged 2 commits into from
Jan 18, 2015
Merged

Oversun's MFC and CEC fixes #76

merged 2 commits into from
Jan 18, 2015

Conversation

gripped
Copy link

@gripped gripped commented Jan 18, 2015

Explained here http://forum.odroid.com/viewtopic.php?f=95&t=8512
and here http://forum.odroid.com/viewtopic.php?f=95&t=8541

I have built it and CEC is working and video no longer freezes.

mdrjr added a commit that referenced this pull request Jan 18, 2015
Oversun's MFC and CEC fixes
@mdrjr mdrjr merged commit 0834510 into hardkernel:odroidxu3-3.10.y Jan 18, 2015
@mdrjr
Copy link
Collaborator

mdrjr commented Jan 18, 2015

Thank you so much for putting it together ! Really.

Merged!

@gripped gripped deleted the xbmc-fixes branch January 18, 2015 21:08
Dmole pushed a commit to Dmole/linux that referenced this pull request Oct 29, 2016
commit 05af40e upstream.

This commit fixes a stack corruption in the pseries specific code dealing
with the huge pages.

In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments
to the hypervisor is not large enough. This leads to a stack corruption
where a previously saved register could be corrupted leading to unexpected
result in the caller, like the following panic:

  Oops: Kernel access of bad area, sig: 11 [hardkernel#1]
  SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: virtio_balloon ip_tables x_tables autofs4
  virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio
  CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 hardkernel#76
  task: c000000005394880 task.stack: c000000005570000
  NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000
  REGS: c000000005573820 TRAP: 0300   Not tainted  (4.8.0)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 84822884  XER: 20000000
  CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1
  GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964
  GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104
  GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5
  GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000
  GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000
  GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000
  GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964
  GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0
  NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470
  LR [c00000000027bf64] zap_huge_pmd+0x44/0x470
  Call Trace:
  [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable)
  [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0
  [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120
  [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0
  [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0
  [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0
  [c000000005573e30] [c000000000009560] system_call+0x38/0x108
  Instruction dump:
  fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1
  7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378

Most of the time, the bug is surfacing in a caller up in the stack from
__pSeries_lpar_hugepage_invalidate() which is quite confusing.

This bug is pending since v3.11 but was hidden if a caller of the
caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped
register (r18 in this case) in the stack and is not using it until
restoring it. GCC 6.2.0 seems to raise it more frequently.

This commit also change the definition of the parameter buffer in
pSeries_lpar_flush_hash_range() to rely on the global define
PLPAR_HCALL9_BUFSIZE (no functional change here).

Fixes: 1a52728 ("powerpc: Optimize hugepage invalidate")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmole pushed a commit to Dmole/linux that referenced this pull request Oct 29, 2016
commit 05af40e upstream.

This commit fixes a stack corruption in the pseries specific code dealing
with the huge pages.

In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments
to the hypervisor is not large enough. This leads to a stack corruption
where a previously saved register could be corrupted leading to unexpected
result in the caller, like the following panic:

  Oops: Kernel access of bad area, sig: 11 [hardkernel#1]
  SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: virtio_balloon ip_tables x_tables autofs4
  virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio
  CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 hardkernel#76
  task: c000000005394880 task.stack: c000000005570000
  NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000
  REGS: c000000005573820 TRAP: 0300   Not tainted  (4.8.0)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 84822884  XER: 20000000
  CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1
  GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964
  GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104
  GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5
  GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000
  GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000
  GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000
  GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964
  GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0
  NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470
  LR [c00000000027bf64] zap_huge_pmd+0x44/0x470
  Call Trace:
  [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable)
  [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0
  [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120
  [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0
  [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0
  [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0
  [c000000005573e30] [c000000000009560] system_call+0x38/0x108
  Instruction dump:
  fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1
  7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378

Most of the time, the bug is surfacing in a caller up in the stack from
__pSeries_lpar_hugepage_invalidate() which is quite confusing.

This bug is pending since v3.11 but was hidden if a caller of the
caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped
register (r18 in this case) in the stack and is not using it until
restoring it. GCC 6.2.0 seems to raise it more frequently.

This commit also change the definition of the parameter buffer in
pSeries_lpar_flush_hash_range() to rely on the global define
PLPAR_HCALL9_BUFSIZE (no functional change here).

Fixes: 1a52728 ("powerpc: Optimize hugepage invalidate")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mdrjr pushed a commit that referenced this pull request May 17, 2019
[ Upstream commit bd18bff ]

A VMEnter that VMFails (as opposed to VMExits) does not touch host
state beyond registers that are explicitly noted in the VMFail path,
e.g. EFLAGS.  Host state does not need to be loaded because VMFail
is only signaled for consistency checks that occur before the CPU
starts to load guest state, i.e. there is no need to restore any
state as nothing has been modified.  But in the case where a VMFail
is detected by hardware and not by KVM (due to deferring consistency
checks to hardware), KVM has already loaded some amount of guest
state.  Luckily, "loaded" only means loaded to KVM's software model,
i.e. vmcs01 has not been modified.  So, unwind our software model to
the pre-VMEntry host state.

Not restoring host state in this VMFail path leads to a variety of
failures because we end up with stale data in vcpu->arch, e.g. CR0,
CR4, EFER, etc... will all be out of sync relative to vmcs01.  Any
significant delta in the stale data is all but guaranteed to crash
L1, e.g. emulation of SMEP, SMAP, UMIP, WP, etc... will be wrong.

An alternative to this "soft" reload would be to load host state from
vmcs12 as if we triggered a VMExit (as opposed to VMFail), but that is
wildly inconsistent with respect to the VMX architecture, e.g. an L1
VMM with separate VMExit and VMFail paths would explode.

Note that this approach does not mean KVM is 100% accurate with
respect to VMX hardware behavior, even at an architectural level
(the exact order of consistency checks is microarchitecture specific).
But 100% emulation accuracy isn't the goal (with this patch), rather
the goal is to be consistent in the information delivered to L1, e.g.
a VMExit should not fall-through VMENTER, and a VMFail should not jump
to HOST_RIP.

This technically reverts commit "5af4157388ad (KVM: nVMX: Fix mmu
context after VMLAUNCH/VMRESUME failure)", but retains the core
aspects of that patch, just in an open coded form due to the need to
pull state from vmcs01 instead of vmcs12.  Restoring host state
resolves a variety of issues introduced by commit "4f350c6dbcb9
(kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly)",
which remedied the incorrect behavior of treating VMFail like VMExit
but in doing so neglected to restore arch state that had been modified
prior to attempting nested VMEnter.

A sample failure that occurs due to stale vcpu.arch state is a fault
of some form while emulating an LGDT (due to emulated UMIP) from L1
after a failed VMEntry to L3, in this case when running the KVM unit
test test_tpr_threshold_values in L1.  L0 also hits a WARN in this
case due to a stale arch.cr4.UMIP.

L1:
  BUG: unable to handle kernel paging request at ffffc90000663b9e
  PGD 276512067 P4D 276512067 PUD 276513067 PMD 274efa067 PTE 8000000271de2163
  Oops: 0009 [#1] SMP
  CPU: 5 PID: 12495 Comm: qemu-system-x86 Tainted: G        W         4.18.0-rc2+ #2
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:native_load_gdt+0x0/0x10

  ...

  Call Trace:
   load_fixmap_gdt+0x22/0x30
   __vmx_load_host_state+0x10e/0x1c0 [kvm_intel]
   vmx_switch_vmcs+0x2d/0x50 [kvm_intel]
   nested_vmx_vmexit+0x222/0x9c0 [kvm_intel]
   vmx_handle_exit+0x246/0x15a0 [kvm_intel]
   kvm_arch_vcpu_ioctl_run+0x850/0x1830 [kvm]
   kvm_vcpu_ioctl+0x3a1/0x5c0 [kvm]
   do_vfs_ioctl+0x9f/0x600
   ksys_ioctl+0x66/0x70
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x4f/0x100
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

L0:
  WARNING: CPU: 2 PID: 3529 at arch/x86/kvm/vmx.c:6618 handle_desc+0x28/0x30 [kvm_intel]
  ...
  CPU: 2 PID: 3529 Comm: qemu-system-x86 Not tainted 4.17.2-coffee+ #76
  Hardware name: Intel Corporation Kabylake Client platform/KBL S
  RIP: 0010:handle_desc+0x28/0x30 [kvm_intel]

  ...

  Call Trace:
   kvm_arch_vcpu_ioctl_run+0x863/0x1840 [kvm]
   kvm_vcpu_ioctl+0x3a1/0x5c0 [kvm]
   do_vfs_ioctl+0x9f/0x5e0
   ksys_ioctl+0x66/0x70
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x49/0xf0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 5af4157 (KVM: nVMX: Fix mmu context after VMLAUNCH/VMRESUME failure)
Fixes: 4f350c6 (kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly)
Cc: Jim Mattson <jmattson@google.com>
Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim KrÄmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mdrjr pushed a commit that referenced this pull request May 7, 2024
commit 50449ca upstream.

On arm64 machines, swsusp_save() faults if it attempts to access
MEMBLOCK_NOMAP memory ranges. This can be reproduced in QEMU using UEFI
when booting with rodata=off debug_pagealloc=off and CONFIG_KFENCE=n:

  Unable to handle kernel paging request at virtual address ffffff8000000000
  Mem abort info:
    ESR = 0x0000000096000007
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x07: level 3 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
    CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
  swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000eeb0b000
  [ffffff8000000000] pgd=180000217fff9803, p4d=180000217fff9803, pud=180000217fff9803, pmd=180000217fff8803, pte=0000000000000000
  Internal error: Oops: 0000000096000007 [#1] SMP
  Internal error: Oops: 0000000096000007 [#1] SMP
  Modules linked in: xt_multiport ipt_REJECT nf_reject_ipv4 xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_filter bpfilter rfkill at803x snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg dwmac_generic stmmac_platform snd_hda_codec stmmac joydev pcs_xpcs snd_hda_core phylink ppdev lp parport ramoops reed_solomon ip_tables x_tables nls_iso8859_1 vfat multipath linear amdgpu amdxcp drm_exec gpu_sched drm_buddy hid_generic usbhid hid radeon video drm_suballoc_helper drm_ttm_helper ttm i2c_algo_bit drm_display_helper cec drm_kms_helper drm
  CPU: 0 PID: 3663 Comm: systemd-sleep Not tainted 6.6.2+ #76
  Source Version: 4e22ed63a0a48e7a7cff9b98b7806d8d4add7dc0
  Hardware name: Greatwall GW-XXXXXX-XXX/GW-XXXXXX-XXX, BIOS KunLun BIOS V4.0 01/19/2021
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : swsusp_save+0x280/0x538
  lr : swsusp_save+0x280/0x538
  sp : ffffffa034a3fa40
  x29: ffffffa034a3fa40 x28: ffffff8000001000 x27: 0000000000000000
  x26: ffffff8001400000 x25: ffffffc08113e248 x24: 0000000000000000
  x23: 0000000000080000 x22: ffffffc08113e280 x21: 00000000000c69f2
  x20: ffffff8000000000 x19: ffffffc081ae2500 x18: 0000000000000000
  x17: 6666662074736420 x16: 3030303030303030 x15: 3038666666666666
  x14: 0000000000000b69 x13: ffffff9f89088530 x12: 00000000ffffffea
  x11: 00000000ffff7fff x10: 00000000ffff7fff x9 : ffffffc08193f0d0
  x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : 0000000000000001
  x5 : ffffffa0fff09dc8 x4 : 0000000000000000 x3 : 0000000000000027
  x2 : 0000000000000000 x1 : 0000000000000000 x0 : 000000000000004e
  Call trace:
   swsusp_save+0x280/0x538
   swsusp_arch_suspend+0x148/0x190
   hibernation_snapshot+0x240/0x39c
   hibernate+0xc4/0x378
   state_store+0xf0/0x10c
   kobj_attr_store+0x14/0x24

The reason is swsusp_save() -> copy_data_pages() -> page_is_saveable()
-> kernel_page_present() assuming that a page is always present when
can_set_direct_map() is false (all of rodata_full,
debug_pagealloc_enabled() and arm64_kfence_can_set_direct_map() false),
irrespective of the MEMBLOCK_NOMAP ranges. Such MEMBLOCK_NOMAP regions
should not be saved during hibernation.

This problem was introduced by changes to the pfn_valid() logic in
commit a7d9f30 ("arm64: drop pfn_valid_within() and simplify
pfn_valid()").

Similar to other architectures, drop the !can_set_direct_map() check in
kernel_page_present() so that page_is_savable() skips such pages.

Fixes: a7d9f30 ("arm64: drop pfn_valid_within() and simplify pfn_valid()")
Cc: <stable@vger.kernel.org> # 5.14.x
Suggested-by: Mike Rapoport <rppt@kernel.org>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Co-developed-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: xiongxin <xiongxin@kylinos.cn>
Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240417025248.386622-1-tianyaxiong@kylinos.cn
[catalin.marinas@arm.com: rework commit message]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mdrjr pushed a commit that referenced this pull request Sep 10, 2024
…hunk()

commit 10d9d8c upstream.

[BUG]
There is an internal report that KASAN is reporting use-after-free, with
the following backtrace:

  BUG: KASAN: slab-use-after-free in btrfs_check_read_bio+0xa68/0xb70 [btrfs]
  Read of size 4 at addr ffff8881117cec28 by task kworker/u16:2/45
  CPU: 1 UID: 0 PID: 45 Comm: kworker/u16:2 Not tainted 6.11.0-rc2-next-20240805-default+ #76
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Workqueue: btrfs-endio btrfs_end_bio_work [btrfs]
  Call Trace:
   dump_stack_lvl+0x61/0x80
   print_address_description.constprop.0+0x5e/0x2f0
   print_report+0x118/0x216
   kasan_report+0x11d/0x1f0
   btrfs_check_read_bio+0xa68/0xb70 [btrfs]
   process_one_work+0xce0/0x12a0
   worker_thread+0x717/0x1250
   kthread+0x2e3/0x3c0
   ret_from_fork+0x2d/0x70
   ret_from_fork_asm+0x11/0x20

  Allocated by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   __kasan_slab_alloc+0x7d/0x80
   kmem_cache_alloc_noprof+0x16e/0x3e0
   mempool_alloc_noprof+0x12e/0x310
   bio_alloc_bioset+0x3f0/0x7a0
   btrfs_bio_alloc+0x2e/0x50 [btrfs]
   submit_extent_page+0x4d1/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Freed by task 20917:
   kasan_save_stack+0x37/0x60
   kasan_save_track+0x10/0x30
   kasan_save_free_info+0x37/0x50
   __kasan_slab_free+0x4b/0x60
   kmem_cache_free+0x214/0x5d0
   bio_free+0xed/0x180
   end_bbio_data_read+0x1cc/0x580 [btrfs]
   btrfs_submit_chunk+0x98d/0x1880 [btrfs]
   btrfs_submit_bio+0x33/0x70 [btrfs]
   submit_one_bio+0xd4/0x130 [btrfs]
   submit_extent_page+0x3ea/0xdb0 [btrfs]
   btrfs_do_readpage+0x8b4/0x12a0 [btrfs]
   btrfs_readahead+0x29a/0x430 [btrfs]
   read_pages+0x1a7/0xc60
   page_cache_ra_unbounded+0x2ad/0x560
   filemap_get_pages+0x629/0xa20
   filemap_read+0x335/0xbf0
   vfs_read+0x790/0xcb0
   ksys_read+0xfd/0x1d0
   do_syscall_64+0x6d/0x140
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

[CAUSE]
Although I cannot reproduce the error, the report itself is good enough
to pin down the cause.

The call trace is the regular endio workqueue context, but the
free-by-task trace is showing that during btrfs_submit_chunk() we
already hit a critical error, and is calling btrfs_bio_end_io() to error
out.  And the original endio function called bio_put() to free the whole
bio.

This means a double freeing thus causing use-after-free, e.g.:

1. Enter btrfs_submit_bio() with a read bio
   The read bio length is 128K, crossing two 64K stripes.

2. The first run of btrfs_submit_chunk()

2.1 Call btrfs_map_block(), which returns 64K
2.2 Call btrfs_split_bio()
    Now there are two bios, one referring to the first 64K, the other
    referring to the second 64K.
2.3 The first half is submitted.

3. The second run of btrfs_submit_chunk()

3.1 Call btrfs_map_block(), which by somehow failed
    Now we call btrfs_bio_end_io() to handle the error

3.2 btrfs_bio_end_io() calls the original endio function
    Which is end_bbio_data_read(), and it calls bio_put() for the
    original bio.

    Now the original bio is freed.

4. The submitted first 64K bio finished
   Now we call into btrfs_check_read_bio() and tries to advance the bio
   iter.
   But since the original bio (thus its iter) is already freed, we
   trigger the above use-after free.

   And even if the memory is not poisoned/corrupted, we will later call
   the original endio function, causing a double freeing.

[FIX]
Instead of calling btrfs_bio_end_io(), call btrfs_orig_bbio_end_io(),
which has the extra check on split bios and do the proper refcounting
for cloned bios.

Furthermore there is already one extra btrfs_cleanup_bio() call, but
that is duplicated to btrfs_orig_bbio_end_io() call, so remove that
label completely.

Reported-by: David Sterba <dsterba@suse.com>
Fixes: 852eee6 ("btrfs: allow btrfs_submit_bio to split bios")
CC: stable@vger.kernel.org # 6.6+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants