Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please release the kernel source of mi4c #34

Closed
FoeniX233 opened this issue Dec 30, 2015 · 3 comments
Closed

Please release the kernel source of mi4c #34

FoeniX233 opened this issue Dec 30, 2015 · 3 comments

Comments

@FoeniX233
Copy link

We need more choice ! Please hear the sound of mi fans !!!

@talpachen
Copy link

We need more choice ! Please hear the sound of mi fans !!!

@sarvesvara
Copy link

As far the latest (last week) update on 2-Mar-2016... it will be released next. I don't know the meaning of 'next'.

Link: http://en.miui.com/thread-235127-1-1.html

@bgcngm
Copy link

bgcngm commented Mar 20, 2016

It has been released (under libra-l-oss branch). You can close this now.

bgcngm pushed a commit to Mi5Devs/android_kernel_xiaomi_msm8996 that referenced this issue Oct 8, 2016
[ Upstream commit ecf5fc6 ]

Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  #1 schedule at ffffffff815ab76e
  #2 schedule_timeout at ffffffff815ae5e5
  MiCode#3 io_schedule_timeout at ffffffff815aad6a
  MiCode#4 bit_wait_io at ffffffff815abfc6
  MiCode#5 __wait_on_bit at ffffffff815abda5
  MiCode#6 wait_on_page_bit at ffffffff8111fd4f
  MiCode#7 shrink_page_list at ffffffff81135445
  MiCode#8 shrink_inactive_list at ffffffff81135845
  MiCode#9 shrink_lruvec at ffffffff81135ead
 MiCode#10 shrink_zone at ffffffff811360c3
 MiCode#11 shrink_zones at ffffffff81136eff
 MiCode#12 do_try_to_free_pages at ffffffff8113712f
 MiCode#13 try_to_free_mem_cgroup_pages at ffffffff811372be
 MiCode#14 try_charge at ffffffff81189423
 MiCode#15 mem_cgroup_try_charge at ffffffff8118c6f5
 MiCode#16 __add_to_page_cache_locked at ffffffff8112137d
 MiCode#17 add_to_page_cache_lru at ffffffff81121618
 MiCode#18 pagecache_get_page at ffffffff8112170b
 MiCode#19 grow_dev_page at ffffffff811c8297
 MiCode#20 __getblk_slow at ffffffff811c91d6
 MiCode#21 __getblk_gfp at ffffffff811c92c1
 MiCode#22 ext4_ext_grow_indepth at ffffffff8124565c
 MiCode#23 ext4_ext_create_new_leaf at ffffffff81246ca8
 MiCode#24 ext4_ext_insert_extent at ffffffff81246f09
 MiCode#25 ext4_ext_map_blocks at ffffffff8124a848
 MiCode#26 ext4_map_blocks at ffffffff8121a5b7
 MiCode#27 mpage_map_one_extent at ffffffff8121b1fa
 MiCode#28 mpage_map_and_submit_extent at ffffffff8121f07b
 MiCode#29 ext4_writepages at ffffffff8121f6d5
 MiCode#30 do_writepages at ffffffff8112c490
 MiCode#31 __filemap_fdatawrite_range at ffffffff81120199
 MiCode#32 filemap_flush at ffffffff8112041c
 MiCode#33 ext4_alloc_da_blocks at ffffffff81219da1
 MiCode#34 ext4_rename at ffffffff81229b91
 MiCode#35 ext4_rename2 at ffffffff81229e32
 MiCode#36 vfs_rename at ffffffff811a08a5
 MiCode#37 SYSC_renameat2 at ffffffff811a3ffc
 MiCode#38 sys_renameat2 at ffffffff811a408e
 MiCode#39 sys_rename at ffffffff8119e51e
 MiCode#40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

Cc: stable@vger.kernel.org # 3.9+
[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
bgcngm pushed a commit to Mi5Devs/android_kernel_xiaomi_msm8996 that referenced this issue Oct 8, 2016
[ Upstream commit f377554 ]

The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.

This can probuce the following warning:

 [ INFO: suspicious RCU usage. ]
 4.4.0-00006-g0fe53e8-dirty MiCode#34 Tainted: G S
 -------------------------------
 include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!

 other info that might help us debug this:

 RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
 no locks held by swapper/8/0.

 stack backtrace:
  CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty MiCode#34
  Call Trace:
  [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
  [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
  [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
  [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
  [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
  [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
  [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
  [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
  [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
  [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
  [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
  [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14

This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.

Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.

Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.

Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org

Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Fixes: 97e1c18 ("tracing: Kernel Tracepoints")
Cc: stable@vger.kernel.org # v2.6.28+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
bgcngm pushed a commit to Mi5Devs/android_kernel_xiaomi_msm8996 that referenced this issue Oct 8, 2016
[ Upstream commit 7cafc0b ]

We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.

Otherwise we can get an OOPS that looks like this:

ld-linux.so.2(36808): Kernel bad sw trap 5 [#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 MiCode#34
task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
 [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c

The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code.  First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.

The userland register window fill handler is:

	add	%sp, STACK_BIAS + 0x00, %g1;		\
	ldxa	[%g1 + %g0] ASI, %l0;			\
	mov	0x08, %g2;				\
	mov	0x10, %g3;				\
	ldxa	[%g1 + %g2] ASI, %l1;			\
	mov	0x18, %g5;				\
	ldxa	[%g1 + %g3] ASI, %l2;			\
	ldxa	[%g1 + %g5] ASI, %l3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %l4;			\
	ldxa	[%g1 + %g2] ASI, %l5;			\
	ldxa	[%g1 + %g3] ASI, %l6;			\
	ldxa	[%g1 + %g5] ASI, %l7;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i0;			\
	ldxa	[%g1 + %g2] ASI, %i1;			\
	ldxa	[%g1 + %g3] ASI, %i2;			\
	ldxa	[%g1 + %g5] ASI, %i3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i4;			\
	ldxa	[%g1 + %g2] ASI, %i5;			\
	ldxa	[%g1 + %g3] ASI, %i6;			\
	ldxa	[%g1 + %g5] ASI, %i7;			\
	restored;					\
	retry; nop; nop; nop; nop;			\
	b,a,pt	%xcc, fill_fixup_dax;			\
	b,a,pt	%xcc, fill_fixup_mna;			\
	b,a,pt	%xcc, fill_fixup;

And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took.  In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for.  It just always branches to the last instruction in
the parent trap's handler.

For example, for a regular fault, the code goes:

winfix_trampoline:
	rdpr	%tpc, %g3
	or	%g3, 0x7c, %g3
	wrpr	%g3, %tnpc
	done

All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.

On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons.  The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).

This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary.  Now if you look at them, we'll see at the end:

	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;

And oops, all three cases are handled like a fault.

This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.

So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.

So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.

Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
AndropaX pushed a commit to AndropaX/android_kernel_xiaomi_msm8992 that referenced this issue Dec 16, 2016
commit f377554 upstream.

The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.

This can probuce the following warning:

 [ INFO: suspicious RCU usage. ]
 4.4.0-00006-g0fe53e8-dirty MiCode#34 Tainted: G S
 -------------------------------
 include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!

 other info that might help us debug this:

 RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
 no locks held by swapper/8/0.

 stack backtrace:
  CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty MiCode#34
  Call Trace:
  [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
  [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
  [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
  [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
  [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
  [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
  [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
  [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
  [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
  [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
  [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
  [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14

This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.

Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.

Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.

Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org

Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Fixes: 97e1c18 ("tracing: Kernel Tracepoints")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AndropaX pushed a commit to AndropaX/android_kernel_xiaomi_msm8992 that referenced this issue Jul 10, 2017
commit 0a8fd13 upstream.

When checking a new device's descriptors, the USB core does not check
for duplicate endpoint addresses.  This can cause a problem when the
sysfs files for those endpoints are created; trying to create multiple
files with the same name will provoke a WARNING:

WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
sysfs: cannot create duplicate filename
'/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
Kernel panic - not syncing: panic_on_warn set ...

CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ MiCode#34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
 ffff88006bee64c8 ffffffff81f96b8a ffffffff00000001 1ffff1000d7dcc2c
 ffffed000d7dcc24 0000000000000001 0000000041b58ab3 ffffffff8598b510
 ffffffff81f968f8 ffffffff850fee20 ffffffff85cff020 dffffc0000000000
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
 [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
 [<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
 [<ffffffff819e70ca>] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
 [<ffffffff819e7308>] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
 [<     inline     >] create_dir lib/kobject.c:71
 [<ffffffff81fa1b07>] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
 [<     inline     >] kobject_add_varg lib/kobject.c:366
 [<ffffffff81fa2479>] kobject_add+0x139/0x220 lib/kobject.c:411
 [<ffffffff82737a63>] device_add+0x353/0x1660 drivers/base/core.c:1088
 [<ffffffff82738d8d>] device_register+0x1d/0x20 drivers/base/core.c:1206
 [<ffffffff82cb77d3>] usb_create_ep_devs+0x163/0x260 drivers/usb/core/endpoint.c:195
 [<ffffffff82c9f27b>] create_intf_ep_devs+0x13b/0x200 drivers/usb/core/message.c:1030
 [<ffffffff82ca39d3>] usb_set_configuration+0x1083/0x18d0 drivers/usb/core/message.c:1937
 [<ffffffff82cc9e2e>] generic_probe+0x6e/0xe0 drivers/usb/core/generic.c:172
 [<ffffffff82caa7fa>] usb_probe_device+0xaa/0xe0 drivers/usb/core/driver.c:263

This patch prevents the problem by checking for duplicate endpoint
addresses during enumeration and skipping any duplicates.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
mihirshah006 pushed a commit to mihirshah006/Xiaomi_Kernel_OpenSource that referenced this issue Jul 25, 2017
commit ecf5fc6 upstream.

Nikolay has reported a hang when a memcg reclaim got stuck with the
following backtrace:

PID: 18308  TASK: ffff883d7c9b0a30  CPU: 1   COMMAND: "rsync"
  #0 __schedule at ffffffff815ab152
  premaca#1 schedule at ffffffff815ab76e
  premaca#2 schedule_timeout at ffffffff815ae5e5
  MiCode#3 io_schedule_timeout at ffffffff815aad6a
  MiCode#4 bit_wait_io at ffffffff815abfc6
  MiCode#5 __wait_on_bit at ffffffff815abda5
  MiCode#6 wait_on_page_bit at ffffffff8111fd4f
  MiCode#7 shrink_page_list at ffffffff81135445
  MiCode#8 shrink_inactive_list at ffffffff81135845
  MiCode#9 shrink_lruvec at ffffffff81135ead
 MiCode#10 shrink_zone at ffffffff811360c3
 MiCode#11 shrink_zones at ffffffff81136eff
 MiCode#12 do_try_to_free_pages at ffffffff8113712f
 MiCode#13 try_to_free_mem_cgroup_pages at ffffffff811372be
 MiCode#14 try_charge at ffffffff81189423
 MiCode#15 mem_cgroup_try_charge at ffffffff8118c6f5
 MiCode#16 __add_to_page_cache_locked at ffffffff8112137d
 MiCode#17 add_to_page_cache_lru at ffffffff81121618
 MiCode#18 pagecache_get_page at ffffffff8112170b
 MiCode#19 grow_dev_page at ffffffff811c8297
 MiCode#20 __getblk_slow at ffffffff811c91d6
 MiCode#21 __getblk_gfp at ffffffff811c92c1
 MiCode#22 ext4_ext_grow_indepth at ffffffff8124565c
 MiCode#23 ext4_ext_create_new_leaf at ffffffff81246ca8
 MiCode#24 ext4_ext_insert_extent at ffffffff81246f09
 MiCode#25 ext4_ext_map_blocks at ffffffff8124a848
 MiCode#26 ext4_map_blocks at ffffffff8121a5b7
 MiCode#27 mpage_map_one_extent at ffffffff8121b1fa
 MiCode#28 mpage_map_and_submit_extent at ffffffff8121f07b
 MiCode#29 ext4_writepages at ffffffff8121f6d5
 MiCode#30 do_writepages at ffffffff8112c490
 MiCode#31 __filemap_fdatawrite_range at ffffffff81120199
 MiCode#32 filemap_flush at ffffffff8112041c
 MiCode#33 ext4_alloc_da_blocks at ffffffff81219da1
 MiCode#34 ext4_rename at ffffffff81229b91
 MiCode#35 ext4_rename2 at ffffffff81229e32
 MiCode#36 vfs_rename at ffffffff811a08a5
 MiCode#37 SYSC_renameat2 at ffffffff811a3ffc
 MiCode#38 sys_renameat2 at ffffffff811a408e
 MiCode#39 sys_rename at ffffffff8119e51e
 MiCode#40 system_call_fastpath at ffffffff815afa89

Dave Chinner has properly pointed out that this is a deadlock in the
reclaim code because ext4 doesn't submit pages which are marked by
PG_writeback right away.

The heuristic was introduced by commit e62e384 ("memcg: prevent OOM
with too many dirty pages") and it was applied only when may_enter_fs
was specified.  The code has been changed by c3b94f4 ("memcg:
further prevent OOM with too many dirty pages") which has removed the
__GFP_FS restriction with a reasoning that we do not get into the fs
code.  But this is not sufficient apparently because the fs doesn't
necessarily submit pages marked PG_writeback for IO right away.

ext4_bio_write_page calls io_submit_add_bh but that doesn't necessarily
submit the bio.  Instead it tries to map more pages into the bio and
mpage_map_one_extent might trigger memcg charge which might end up
waiting on a page which is marked PG_writeback but hasn't been submitted
yet so we would end up waiting for something that never finishes.

Fix this issue by replacing __GFP_IO by may_enter_fs check (for case 2)
before we go to wait on the writeback.  The page fault path, which is
the only path that triggers memcg oom killer since 3.12, shouldn't
require GFP_NOFS and so we shouldn't reintroduce the premature OOM
killer issue which was originally addressed by the heuristic.

As per David Chinner the xfs is doing similar thing since 2.6.15 already
so ext4 is not the only affected filesystem.  Moreover he notes:

: For example: IO completion might require unwritten extent conversion
: which executes filesystem transactions and GFP_NOFS allocations. The
: writeback flag on the pages can not be cleared until unwritten
: extent conversion completes. Hence memory reclaim cannot wait on
: page writeback to complete in GFP_NOFS context because it is not
: safe to do so, memcg reclaim or otherwise.

[tytso@mit.edu: corrected the control flow]
Fixes: c3b94f4 ("memcg: further prevent OOM with too many dirty pages")
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[@MSF-Jarvis: Fix conflicts from "mm: vmscan: stall page reclaim after a list of pages have been processed" ]

Change-Id: I09aa7c565388b4b323034d5c71a463f4fb175462
mihirshah006 pushed a commit to mihirshah006/Xiaomi_Kernel_OpenSource that referenced this issue Jul 25, 2017
commit f377554 upstream.

The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.

This can probuce the following warning:

 [ INFO: suspicious RCU usage. ]
 4.4.0-00006-g0fe53e8-dirty MiCode#34 Tainted: G S
 -------------------------------
 include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!

 other info that might help us debug this:

 RCU used illegally from offline CPU!  rcu_scheduler_active = 1, debug_locks = 1
 no locks held by swapper/8/0.

 stack backtrace:
  CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S              4.4.0-00006-g0fe53e8-dirty MiCode#34
  Call Trace:
  [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
  [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
  [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
  [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
  [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
  [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
  [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
  [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
  [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
  [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
  [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
  [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14

This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.

Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.

Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.

Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org

Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Fixes: 97e1c18 ("tracing: Kernel Tracepoints")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mihirshah006 pushed a commit to mihirshah006/Xiaomi_Kernel_OpenSource that referenced this issue Jul 25, 2017
commit 0a8fd13 upstream.

When checking a new device's descriptors, the USB core does not check
for duplicate endpoint addresses.  This can cause a problem when the
sysfs files for those endpoints are created; trying to create multiple
files with the same name will provoke a WARNING:

WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
sysfs: cannot create duplicate filename
'/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
Kernel panic - not syncing: panic_on_warn set ...

CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ MiCode#34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
 ffff88006bee64c8 ffffffff81f96b8a ffffffff00000001 1ffff1000d7dcc2c
 ffffed000d7dcc24 0000000000000001 0000000041b58ab3 ffffffff8598b510
 ffffffff81f968f8 ffffffff850fee20 ffffffff85cff020 dffffc0000000000
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
 [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
 [<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
 [<ffffffff819e70ca>] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
 [<ffffffff819e7308>] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
 [<     inline     >] create_dir lib/kobject.c:71
 [<ffffffff81fa1b07>] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
 [<     inline     >] kobject_add_varg lib/kobject.c:366
 [<ffffffff81fa2479>] kobject_add+0x139/0x220 lib/kobject.c:411
 [<ffffffff82737a63>] device_add+0x353/0x1660 drivers/base/core.c:1088
 [<ffffffff82738d8d>] device_register+0x1d/0x20 drivers/base/core.c:1206
 [<ffffffff82cb77d3>] usb_create_ep_devs+0x163/0x260 drivers/usb/core/endpoint.c:195
 [<ffffffff82c9f27b>] create_intf_ep_devs+0x13b/0x200 drivers/usb/core/message.c:1030
 [<ffffffff82ca39d3>] usb_set_configuration+0x1083/0x18d0 drivers/usb/core/message.c:1937
 [<ffffffff82cc9e2e>] generic_probe+0x6e/0xe0 drivers/usb/core/generic.c:172
 [<ffffffff82caa7fa>] usb_probe_device+0xaa/0xe0 drivers/usb/core/driver.c:263

This patch prevents the problem by checking for duplicate endpoint
addresses during enumeration and skipping any duplicates.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Willy Tarreau <w@1wt.eu>
maxprzemo pushed a commit to maxprzemo/Xiaomi_Kernel_OpenSource that referenced this issue Feb 18, 2018
[ Upstream commit 7cafc0b ]

We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.

Otherwise we can get an OOPS that looks like this:

ld-linux.so.2(36808): Kernel bad sw trap 5 [MiCode#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 MiCode#34
task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
 [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c

The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code.  First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.

The userland register window fill handler is:

	add	%sp, STACK_BIAS + 0x00, %g1;		\
	ldxa	[%g1 + %g0] ASI, %l0;			\
	mov	0x08, %g2;				\
	mov	0x10, %g3;				\
	ldxa	[%g1 + %g2] ASI, %l1;			\
	mov	0x18, %g5;				\
	ldxa	[%g1 + %g3] ASI, %l2;			\
	ldxa	[%g1 + %g5] ASI, %l3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %l4;			\
	ldxa	[%g1 + %g2] ASI, %l5;			\
	ldxa	[%g1 + %g3] ASI, %l6;			\
	ldxa	[%g1 + %g5] ASI, %l7;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i0;			\
	ldxa	[%g1 + %g2] ASI, %i1;			\
	ldxa	[%g1 + %g3] ASI, %i2;			\
	ldxa	[%g1 + %g5] ASI, %i3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i4;			\
	ldxa	[%g1 + %g2] ASI, %i5;			\
	ldxa	[%g1 + %g3] ASI, %i6;			\
	ldxa	[%g1 + %g5] ASI, %i7;			\
	restored;					\
	retry; nop; nop; nop; nop;			\
	b,a,pt	%xcc, fill_fixup_dax;			\
	b,a,pt	%xcc, fill_fixup_mna;			\
	b,a,pt	%xcc, fill_fixup;

And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took.  In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for.  It just always branches to the last instruction in
the parent trap's handler.

For example, for a regular fault, the code goes:

winfix_trampoline:
	rdpr	%tpc, %g3
	or	%g3, 0x7c, %g3
	wrpr	%g3, %tnpc
	done

All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.

On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons.  The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).

This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary.  Now if you look at them, we'll see at the end:

	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;

And oops, all three cases are handled like a fault.

This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.

So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.

So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.

Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
maxprzemo pushed a commit to maxprzemo/Xiaomi_Kernel_OpenSource that referenced this issue Feb 18, 2018
[ Upstream commit bb1107f ]

Andrey Konovalov has reported the following warning triggered by the
syzkaller fuzzer.

  WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ MiCode#34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    __alloc_pages_slowpath mm/page_alloc.c:3511
    __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
    alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
    alloc_pages include/linux/gfp.h:469
    kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
    kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
    kmalloc_large include/linux/slab.h:422
    __kmalloc+0x210/0x2d0 mm/slub.c:3723
    kmalloc include/linux/slab.h:495
    ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
    new_sync_write fs/read_write.c:499
    __vfs_write+0x483/0x760 fs/read_write.c:512
    vfs_write+0x170/0x4e0 fs/read_write.c:560
    SYSC_write fs/read_write.c:607
    SyS_write+0xfb/0x230 fs/read_write.c:599
    entry_SYSCALL_64_fastpath+0x1f/0xc2

The issue is caused by a lack of size check for the request size in
ep_write_iter which should be fixed.  It, however, points to another
problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
resulting page allocator request might be MAX_ORDER which is too large
(see __alloc_pages_slowpath).

The same applies to the SLOB allocator which allows even larger sizes.
Make sure that they are capped properly and never request more than
MAX_ORDER order.

Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Goayandi pushed a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 2, 2018
[ Upstream commit 7cafc0b ]

We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.

Otherwise we can get an OOPS that looks like this:

ld-linux.so.2(36808): Kernel bad sw trap 5 [MiCode#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 MiCode#34
task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
 [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c

The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code.  First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.

The userland register window fill handler is:

	add	%sp, STACK_BIAS + 0x00, %g1;		\
	ldxa	[%g1 + %g0] ASI, %l0;			\
	mov	0x08, %g2;				\
	mov	0x10, %g3;				\
	ldxa	[%g1 + %g2] ASI, %l1;			\
	mov	0x18, %g5;				\
	ldxa	[%g1 + %g3] ASI, %l2;			\
	ldxa	[%g1 + %g5] ASI, %l3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %l4;			\
	ldxa	[%g1 + %g2] ASI, %l5;			\
	ldxa	[%g1 + %g3] ASI, %l6;			\
	ldxa	[%g1 + %g5] ASI, %l7;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i0;			\
	ldxa	[%g1 + %g2] ASI, %i1;			\
	ldxa	[%g1 + %g3] ASI, %i2;			\
	ldxa	[%g1 + %g5] ASI, %i3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i4;			\
	ldxa	[%g1 + %g2] ASI, %i5;			\
	ldxa	[%g1 + %g3] ASI, %i6;			\
	ldxa	[%g1 + %g5] ASI, %i7;			\
	restored;					\
	retry; nop; nop; nop; nop;			\
	b,a,pt	%xcc, fill_fixup_dax;			\
	b,a,pt	%xcc, fill_fixup_mna;			\
	b,a,pt	%xcc, fill_fixup;

And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took.  In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for.  It just always branches to the last instruction in
the parent trap's handler.

For example, for a regular fault, the code goes:

winfix_trampoline:
	rdpr	%tpc, %g3
	or	%g3, 0x7c, %g3
	wrpr	%g3, %tnpc
	done

All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.

On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons.  The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).

This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary.  Now if you look at them, we'll see at the end:

	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;

And oops, all three cases are handled like a fault.

This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.

So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.

So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.

Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Goayandi pushed a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 3, 2018
[ Upstream commit 7cafc0b ]

We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.

Otherwise we can get an OOPS that looks like this:

ld-linux.so.2(36808): Kernel bad sw trap 5 [MiCode#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 MiCode#34
task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
 [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c

The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code.  First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.

The userland register window fill handler is:

	add	%sp, STACK_BIAS + 0x00, %g1;		\
	ldxa	[%g1 + %g0] ASI, %l0;			\
	mov	0x08, %g2;				\
	mov	0x10, %g3;				\
	ldxa	[%g1 + %g2] ASI, %l1;			\
	mov	0x18, %g5;				\
	ldxa	[%g1 + %g3] ASI, %l2;			\
	ldxa	[%g1 + %g5] ASI, %l3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %l4;			\
	ldxa	[%g1 + %g2] ASI, %l5;			\
	ldxa	[%g1 + %g3] ASI, %l6;			\
	ldxa	[%g1 + %g5] ASI, %l7;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i0;			\
	ldxa	[%g1 + %g2] ASI, %i1;			\
	ldxa	[%g1 + %g3] ASI, %i2;			\
	ldxa	[%g1 + %g5] ASI, %i3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i4;			\
	ldxa	[%g1 + %g2] ASI, %i5;			\
	ldxa	[%g1 + %g3] ASI, %i6;			\
	ldxa	[%g1 + %g5] ASI, %i7;			\
	restored;					\
	retry; nop; nop; nop; nop;			\
	b,a,pt	%xcc, fill_fixup_dax;			\
	b,a,pt	%xcc, fill_fixup_mna;			\
	b,a,pt	%xcc, fill_fixup;

And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took.  In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for.  It just always branches to the last instruction in
the parent trap's handler.

For example, for a regular fault, the code goes:

winfix_trampoline:
	rdpr	%tpc, %g3
	or	%g3, 0x7c, %g3
	wrpr	%g3, %tnpc
	done

All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.

On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons.  The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).

This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary.  Now if you look at them, we'll see at the end:

	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;

And oops, all three cases are handled like a fault.

This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.

So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.

So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.

Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Goayandi pushed a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 4, 2018
[ Upstream commit bb1107f ]

Andrey Konovalov has reported the following warning triggered by the
syzkaller fuzzer.

  WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ MiCode#34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    __alloc_pages_slowpath mm/page_alloc.c:3511
    __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
    alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
    alloc_pages include/linux/gfp.h:469
    kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
    kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
    kmalloc_large include/linux/slab.h:422
    __kmalloc+0x210/0x2d0 mm/slub.c:3723
    kmalloc include/linux/slab.h:495
    ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
    new_sync_write fs/read_write.c:499
    __vfs_write+0x483/0x760 fs/read_write.c:512
    vfs_write+0x170/0x4e0 fs/read_write.c:560
    SYSC_write fs/read_write.c:607
    SyS_write+0xfb/0x230 fs/read_write.c:599
    entry_SYSCALL_64_fastpath+0x1f/0xc2

The issue is caused by a lack of size check for the request size in
ep_write_iter which should be fixed.  It, however, points to another
problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
resulting page allocator request might be MAX_ORDER which is too large
(see __alloc_pages_slowpath).

The same applies to the SLOB allocator which allows even larger sizes.
Make sure that they are capped properly and never request more than
MAX_ORDER order.

Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Goayandi pushed a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 13, 2018
[ Upstream commit 7cafc0b ]

We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.

Otherwise we can get an OOPS that looks like this:

ld-linux.so.2(36808): Kernel bad sw trap 5 [MiCode#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 MiCode#34
task: fff8000303be5c60 ti: fff8000301344000 task.ti: fff8000301344000
TSTATE: 0000004410001601 TPC: 0000000000a1a784 TNPC: 0000000000a1a788 Y: 00000002    Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0: fff8000024fc8248 g1: 0000000000db04dc g2: 0000000000000000 g3: 0000000000000001
g4: fff8000303be5c60 g5: fff800030e672000 g6: fff8000301344000 g7: 0000000000000001
o0: 0000000000b95ee8 o1: 000000000000012b o2: 0000000000000000 o3: 0000000200b9b358
o4: 0000000000000000 o5: fff8000301344040 sp: fff80003013475c1 ret_pc: 0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0: 00000000000007ff l1: 0000000000000000 l2: 000000000000005f l3: 0000000000000000
l4: fff8000301347e98 l5: fff8000024ff3060 l6: 0000000000000000 l7: 0000000000000000
i0: fff8000301347f60 i1: 0000000000102400 i2: 0000000000000000 i3: 0000000000000000
i4: 0000000000000000 i5: 0000000000000000 i6: fff80003013476a1 i7: 0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
 [0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c

The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code.  First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.

The userland register window fill handler is:

	add	%sp, STACK_BIAS + 0x00, %g1;		\
	ldxa	[%g1 + %g0] ASI, %l0;			\
	mov	0x08, %g2;				\
	mov	0x10, %g3;				\
	ldxa	[%g1 + %g2] ASI, %l1;			\
	mov	0x18, %g5;				\
	ldxa	[%g1 + %g3] ASI, %l2;			\
	ldxa	[%g1 + %g5] ASI, %l3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %l4;			\
	ldxa	[%g1 + %g2] ASI, %l5;			\
	ldxa	[%g1 + %g3] ASI, %l6;			\
	ldxa	[%g1 + %g5] ASI, %l7;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i0;			\
	ldxa	[%g1 + %g2] ASI, %i1;			\
	ldxa	[%g1 + %g3] ASI, %i2;			\
	ldxa	[%g1 + %g5] ASI, %i3;			\
	add	%g1, 0x20, %g1;				\
	ldxa	[%g1 + %g0] ASI, %i4;			\
	ldxa	[%g1 + %g2] ASI, %i5;			\
	ldxa	[%g1 + %g3] ASI, %i6;			\
	ldxa	[%g1 + %g5] ASI, %i7;			\
	restored;					\
	retry; nop; nop; nop; nop;			\
	b,a,pt	%xcc, fill_fixup_dax;			\
	b,a,pt	%xcc, fill_fixup_mna;			\
	b,a,pt	%xcc, fill_fixup;

And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took.  In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for.  It just always branches to the last instruction in
the parent trap's handler.

For example, for a regular fault, the code goes:

winfix_trampoline:
	rdpr	%tpc, %g3
	or	%g3, 0x7c, %g3
	wrpr	%g3, %tnpc
	done

All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.

On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons.  The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).

This is executed inline via the FILL_*_RTRAP handlers.  rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary.  Now if you look at them, we'll see at the end:

	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;
	    ba,a,pt    %xcc, user_rtt_fill_fixup;

And oops, all three cases are handled like a fault.

This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.

So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.

So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.

Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Goayandi pushed a commit to Goayandi/android_kernel_xiaomi_cappu that referenced this issue Apr 14, 2018
[ Upstream commit bb1107f ]

Andrey Konovalov has reported the following warning triggered by the
syzkaller fuzzer.

  WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ MiCode#34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    __alloc_pages_slowpath mm/page_alloc.c:3511
    __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
    alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
    alloc_pages include/linux/gfp.h:469
    kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
    kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
    kmalloc_large include/linux/slab.h:422
    __kmalloc+0x210/0x2d0 mm/slub.c:3723
    kmalloc include/linux/slab.h:495
    ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
    new_sync_write fs/read_write.c:499
    __vfs_write+0x483/0x760 fs/read_write.c:512
    vfs_write+0x170/0x4e0 fs/read_write.c:560
    SYSC_write fs/read_write.c:607
    SyS_write+0xfb/0x230 fs/read_write.c:599
    entry_SYSCALL_64_fastpath+0x1f/0xc2

The issue is caused by a lack of size check for the request size in
ep_write_iter which should be fixed.  It, however, points to another
problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
resulting page allocator request might be MAX_ORDER which is too large
(see __alloc_pages_slowpath).

The same applies to the SLOB allocator which allows even larger sizes.
Make sure that they are capped properly and never request more than
MAX_ORDER order.

Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AxelBlaz3 pushed a commit to AxelBlaz3/Xiaomi_Kernel_OpenSource that referenced this issue May 8, 2018
commit 0a8fd13 upstream.

When checking a new device's descriptors, the USB core does not check
for duplicate endpoint addresses.  This can cause a problem when the
sysfs files for those endpoints are created; trying to create multiple
files with the same name will provoke a WARNING:

WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
sysfs: cannot create duplicate filename
'/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
Kernel panic - not syncing: panic_on_warn set ...

CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ MiCode#34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
 ffff88006bee64c8 ffffffff81f96b8a ffffffff00000001 1ffff1000d7dcc2c
 ffffed000d7dcc24 0000000000000001 0000000041b58ab3 ffffffff8598b510
 ffffffff81f968f8 ffffffff850fee20 ffffffff85cff020 dffffc0000000000
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
 [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
 [<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
 [<ffffffff819e70ca>] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
 [<ffffffff819e7308>] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
 [<     inline     >] create_dir lib/kobject.c:71
 [<ffffffff81fa1b07>] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
 [<     inline     >] kobject_add_varg lib/kobject.c:366
 [<ffffffff81fa2479>] kobject_add+0x139/0x220 lib/kobject.c:411
 [<ffffffff82737a63>] device_add+0x353/0x1660 drivers/base/core.c:1088
 [<ffffffff82738d8d>] device_register+0x1d/0x20 drivers/base/core.c:1206
 [<ffffffff82cb77d3>] usb_create_ep_devs+0x163/0x260 drivers/usb/core/endpoint.c:195
 [<ffffffff82c9f27b>] create_intf_ep_devs+0x13b/0x200 drivers/usb/core/message.c:1030
 [<ffffffff82ca39d3>] usb_set_configuration+0x1083/0x18d0 drivers/usb/core/message.c:1937
 [<ffffffff82cc9e2e>] generic_probe+0x6e/0xe0 drivers/usb/core/generic.c:172
 [<ffffffff82caa7fa>] usb_probe_device+0xaa/0xe0 drivers/usb/core/driver.c:263

This patch prevents the problem by checking for duplicate endpoint
addresses during enumeration and skipping any duplicates.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pix106 pushed a commit to pix106/android_kernel_xiaomi that referenced this issue Oct 1, 2020
[ Upstream commit 96298f6 ]

According to Core Spec Version 5.2 | Vol 3, Part A 6.1.5,
the incoming L2CAP_ConfigReq should be handled during
OPEN state.

The section below shows the btmon trace when running
L2CAP/COS/CFD/BV-12-C before and after this change.

=== Before ===
...
> ACL Data RX: Handle 256 flags 0x02 dlen 12                MiCode#22
      L2CAP: Connection Request (0x02) ident 2 len 4
        PSM: 1 (0x0001)
        Source CID: 65
< ACL Data TX: Handle 256 flags 0x00 dlen 16                MiCode#23
      L2CAP: Connection Response (0x03) ident 2 len 8
        Destination CID: 64
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 256 flags 0x00 dlen 12                MiCode#24
      L2CAP: Configure Request (0x04) ident 2 len 4
        Destination CID: 65
        Flags: 0x0000
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#25
        Num handles: 1
        Handle: 256
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#26
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 16                MiCode#27
      L2CAP: Configure Request (0x04) ident 3 len 8
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00                                            ..
< ACL Data TX: Handle 256 flags 0x00 dlen 18                MiCode#28
      L2CAP: Configure Response (0x05) ident 3 len 10
        Source CID: 65
        Flags: 0x0000
        Result: Success (0x0000)
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 672
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#29
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 14                MiCode#30
      L2CAP: Configure Response (0x05) ident 2 len 6
        Source CID: 64
        Flags: 0x0000
        Result: Success (0x0000)
> ACL Data RX: Handle 256 flags 0x02 dlen 20                MiCode#31
      L2CAP: Configure Request (0x04) ident 3 len 12
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00 91 02 11 11                                ......
< ACL Data TX: Handle 256 flags 0x00 dlen 14                MiCode#32
      L2CAP: Command Reject (0x01) ident 3 len 6
        Reason: Invalid CID in request (0x0002)
        Destination CID: 64
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#33
        Num handles: 1
        Handle: 256
        Count: 1
...
=== After ===
...
> ACL Data RX: Handle 256 flags 0x02 dlen 12               MiCode#22
      L2CAP: Connection Request (0x02) ident 2 len 4
        PSM: 1 (0x0001)
        Source CID: 65
< ACL Data TX: Handle 256 flags 0x00 dlen 16               MiCode#23
      L2CAP: Connection Response (0x03) ident 2 len 8
        Destination CID: 64
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 256 flags 0x00 dlen 12               MiCode#24
      L2CAP: Configure Request (0x04) ident 2 len 4
        Destination CID: 65
        Flags: 0x0000
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#25
        Num handles: 1
        Handle: 256
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#26
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 16               MiCode#27
      L2CAP: Configure Request (0x04) ident 3 len 8
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00                                            ..
< ACL Data TX: Handle 256 flags 0x00 dlen 18               MiCode#28
      L2CAP: Configure Response (0x05) ident 3 len 10
        Source CID: 65
        Flags: 0x0000
        Result: Success (0x0000)
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 672
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#29
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 14               MiCode#30
      L2CAP: Configure Response (0x05) ident 2 len 6
        Source CID: 64
        Flags: 0x0000
        Result: Success (0x0000)
> ACL Data RX: Handle 256 flags 0x02 dlen 20               MiCode#31
      L2CAP: Configure Request (0x04) ident 3 len 12
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00 91 02 11 11                                .....
< ACL Data TX: Handle 256 flags 0x00 dlen 18               MiCode#32
      L2CAP: Configure Response (0x05) ident 3 len 10
        Source CID: 65
        Flags: 0x0000
        Result: Success (0x0000)
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 672
< ACL Data TX: Handle 256 flags 0x00 dlen 12               MiCode#33
      L2CAP: Configure Request (0x04) ident 3 len 4
        Destination CID: 65
        Flags: 0x0000
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#34
        Num handles: 1
        Handle: 256
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#35
        Num handles: 1
        Handle: 256
        Count: 1
...

Signed-off-by: Howard Chung <howardchung@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
mi-code pushed a commit that referenced this issue Oct 22, 2022
When checking a new device's descriptors, the USB core does not check
for duplicate endpoint addresses.  This can cause a problem when the
sysfs files for those endpoints are created; trying to create multiple
files with the same name will provoke a WARNING:

WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
sysfs: cannot create duplicate filename
'/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
Kernel panic - not syncing: panic_on_warn set ...

CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
 ffff88006bee64c8 ffffffff81f96b8a ffffffff00000001 1ffff1000d7dcc2c
 ffffed000d7dcc24 0000000000000001 0000000041b58ab3 ffffffff8598b510
 ffffffff81f968f8 ffffffff850fee20 ffffffff85cff020 dffffc0000000000
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
 [<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
 [<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
 [<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
 [<ffffffff819e70ca>] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
 [<ffffffff819e7308>] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
 [<     inline     >] create_dir lib/kobject.c:71
 [<ffffffff81fa1b07>] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
 [<     inline     >] kobject_add_varg lib/kobject.c:366
 [<ffffffff81fa2479>] kobject_add+0x139/0x220 lib/kobject.c:411
 [<ffffffff82737a63>] device_add+0x353/0x1660 drivers/base/core.c:1088
 [<ffffffff82738d8d>] device_register+0x1d/0x20 drivers/base/core.c:1206
 [<ffffffff82cb77d3>] usb_create_ep_devs+0x163/0x260 drivers/usb/core/endpoint.c:195
 [<ffffffff82c9f27b>] create_intf_ep_devs+0x13b/0x200 drivers/usb/core/message.c:1030
 [<ffffffff82ca39d3>] usb_set_configuration+0x1083/0x18d0 drivers/usb/core/message.c:1937
 [<ffffffff82cc9e2e>] generic_probe+0x6e/0xe0 drivers/usb/core/generic.c:172
 [<ffffffff82caa7fa>] usb_probe_device+0xaa/0xe0 drivers/usb/core/driver.c:263

This patch prevents the problem by checking for duplicate endpoint
addresses during enumeration and skipping any duplicates.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hectorvax pushed a commit to hectorvax/Xiaomi_Kernel_OpenSource that referenced this issue Jan 3, 2023
[ Upstream commit 96298f6 ]

According to Core Spec Version 5.2 | Vol 3, Part A 6.1.5,
the incoming L2CAP_ConfigReq should be handled during
OPEN state.

The section below shows the btmon trace when running
L2CAP/COS/CFD/BV-12-C before and after this change.

=== Before ===
...
> ACL Data RX: Handle 256 flags 0x02 dlen 12                MiCode#22
      L2CAP: Connection Request (0x02) ident 2 len 4
        PSM: 1 (0x0001)
        Source CID: 65
< ACL Data TX: Handle 256 flags 0x00 dlen 16                MiCode#23
      L2CAP: Connection Response (0x03) ident 2 len 8
        Destination CID: 64
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 256 flags 0x00 dlen 12                MiCode#24
      L2CAP: Configure Request (0x04) ident 2 len 4
        Destination CID: 65
        Flags: 0x0000
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#25
        Num handles: 1
        Handle: 256
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#26
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 16                MiCode#27
      L2CAP: Configure Request (0x04) ident 3 len 8
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00                                            ..
< ACL Data TX: Handle 256 flags 0x00 dlen 18                MiCode#28
      L2CAP: Configure Response (0x05) ident 3 len 10
        Source CID: 65
        Flags: 0x0000
        Result: Success (0x0000)
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 672
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#29
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 14                MiCode#30
      L2CAP: Configure Response (0x05) ident 2 len 6
        Source CID: 64
        Flags: 0x0000
        Result: Success (0x0000)
> ACL Data RX: Handle 256 flags 0x02 dlen 20                MiCode#31
      L2CAP: Configure Request (0x04) ident 3 len 12
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00 91 02 11 11                                ......
< ACL Data TX: Handle 256 flags 0x00 dlen 14                MiCode#32
      L2CAP: Command Reject (0x01) ident 3 len 6
        Reason: Invalid CID in request (0x0002)
        Destination CID: 64
        Source CID: 65
> HCI Event: Number of Completed Packets (0x13) plen 5      MiCode#33
        Num handles: 1
        Handle: 256
        Count: 1
...
=== After ===
...
> ACL Data RX: Handle 256 flags 0x02 dlen 12               MiCode#22
      L2CAP: Connection Request (0x02) ident 2 len 4
        PSM: 1 (0x0001)
        Source CID: 65
< ACL Data TX: Handle 256 flags 0x00 dlen 16               MiCode#23
      L2CAP: Connection Response (0x03) ident 2 len 8
        Destination CID: 64
        Source CID: 65
        Result: Connection successful (0x0000)
        Status: No further information available (0x0000)
< ACL Data TX: Handle 256 flags 0x00 dlen 12               MiCode#24
      L2CAP: Configure Request (0x04) ident 2 len 4
        Destination CID: 65
        Flags: 0x0000
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#25
        Num handles: 1
        Handle: 256
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#26
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 16               MiCode#27
      L2CAP: Configure Request (0x04) ident 3 len 8
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00                                            ..
< ACL Data TX: Handle 256 flags 0x00 dlen 18               MiCode#28
      L2CAP: Configure Response (0x05) ident 3 len 10
        Source CID: 65
        Flags: 0x0000
        Result: Success (0x0000)
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 672
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#29
        Num handles: 1
        Handle: 256
        Count: 1
> ACL Data RX: Handle 256 flags 0x02 dlen 14               MiCode#30
      L2CAP: Configure Response (0x05) ident 2 len 6
        Source CID: 64
        Flags: 0x0000
        Result: Success (0x0000)
> ACL Data RX: Handle 256 flags 0x02 dlen 20               MiCode#31
      L2CAP: Configure Request (0x04) ident 3 len 12
        Destination CID: 64
        Flags: 0x0000
        Option: Unknown (0x10) [hint]
        01 00 91 02 11 11                                .....
< ACL Data TX: Handle 256 flags 0x00 dlen 18               MiCode#32
      L2CAP: Configure Response (0x05) ident 3 len 10
        Source CID: 65
        Flags: 0x0000
        Result: Success (0x0000)
        Option: Maximum Transmission Unit (0x01) [mandatory]
          MTU: 672
< ACL Data TX: Handle 256 flags 0x00 dlen 12               MiCode#33
      L2CAP: Configure Request (0x04) ident 3 len 4
        Destination CID: 65
        Flags: 0x0000
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#34
        Num handles: 1
        Handle: 256
        Count: 1
> HCI Event: Number of Completed Packets (0x13) plen 5     MiCode#35
        Num handles: 1
        Handle: 256
        Count: 1
...

Signed-off-by: Howard Chung <howardchung@google.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants