Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Working #1

Merged
merged 2,437 commits into from
Apr 23, 2021
Merged

Working #1

merged 2,437 commits into from
Apr 23, 2021

Conversation

aled99
Copy link
Owner

@aled99 aled99 commented Apr 23, 2021

No description provided.

Alexey Dobriyan and others added 30 commits April 15, 2021 17:58
[ Upstream commit 5999b9e5b1f8a2f5417b755130919b3ac96f5550 ]

Only half of the file is under include guard because terminating #endif
is placed too early.

Link: https://lore.kernel.org/r/YE4snvoW1SuwcXAn@localhost.localdomain
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 2e5848a3d86f03024ae096478bdb892ab3d79131 ]

request_irq() wont accept a name which contains slash so we need to
repalce it with something else -- otherwise it will trigger a warning
and the entry in /proc/irq/ will not be created
since the .name might be used by userspace and we don't want to break
userspace, so we are changing the parameters passed to request_irq()

[    1.630764] name 'pci-das1602/16'
[    1.630950] WARNING: CPU: 0 PID: 181 at fs/proc/generic.c:180 __xlate_proc_name+0x93/0xb0
[    1.634009] RIP: 0010:__xlate_proc_name+0x93/0xb0
[    1.639441] Call Trace:
[    1.639976]  proc_mkdir+0x18/0x20
[    1.641946]  request_threaded_irq+0xfe/0x160
[    1.642186]  cb_pcidas_auto_attach+0xf4/0x610 [cb_pcidas]

Suggested-by: Ian Abbott <abbotti@mev.co.uk>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Link: https://lore.kernel.org/r/20210315195914.4801-1-ztong0001@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d2d106fe3badfc3bf0dd3899d1c3f210c7203eab ]

request_irq() wont accept a name which contains slash so we need to
repalce it with something else -- otherwise it will trigger a warning
and the entry in /proc/irq/ will not be created
since the .name might be used by userspace and we don't want to break
userspace, so we are changing the parameters passed to request_irq()

[    1.565966] name 'pci-das6402/16'
[    1.566149] WARNING: CPU: 0 PID: 184 at fs/proc/generic.c:180 __xlate_proc_name+0x93/0xb0
[    1.568923] RIP: 0010:__xlate_proc_name+0x93/0xb0
[    1.574200] Call Trace:
[    1.574722]  proc_mkdir+0x18/0x20
[    1.576629]  request_threaded_irq+0xfe/0x160
[    1.576859]  auto_attach+0x60a/0xc40 [cb_pcidas64]

Suggested-by: Ian Abbott <abbotti@mev.co.uk>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Link: https://lore.kernel.org/r/20210315195814.4692-1-ztong0001@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dbf54a9534350d6aebbb34f5c1c606b81a4f35dd ]

Simple-card/audio-graph-card drivers do not handle MCLK clock when it
is specified in the codec device node. The expectation here is that,
the codec should actually own up the MCLK clock and do necessary setup
in the driver.

Suggested-by: Mark Brown <broonie@kernel.org>
Suggested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Link: https://lore.kernel.org/r/1615829492-8972-3-git-send-email-spujar@nvidia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5de2055d31ea88fd9ae9709ac95c372a505a60fa ]

The use_ww_ctx flag is passed to mutex_optimistic_spin(), but the
function doesn't use it. The frequent use of the (use_ww_ctx && ww_ctx)
combination is repetitive.

In fact, ww_ctx should not be used at all if !use_ww_ctx.  Simplify
ww_mutex code by dropping use_ww_ctx from mutex_optimistic_spin() an
clear ww_ctx if !use_ww_ctx. In this way, we can replace (use_ww_ctx &&
ww_ctx) by just (ww_ctx).

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Link: https://lore.kernel.org/r/20210316153119.13802-2-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 5dccdc5a1916d4266edd251f20bbbb113a5c495f ]

In ext4_rename(), when RENAME_WHITEOUT failed to add new entry into
directory, it ends up dropping new created whiteout inode under the
running transaction. After commit <9b88f9fb0d2> ("ext4: Do not iput inode
under running transaction"), we follow the assumptions that evict() does
not get called from a transaction context but in ext4_rename() it breaks
this suggestion. Although it's not a real problem, better to obey it, so
this patch add inode to orphan list and stop transaction before final
iput().

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20210303131703.330415-2-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e862a3e4088070de352fdafe9bd9e3ae0a95a33c ]

This ensure that previous association attempts do not leave stale statuses
on subsequent attempts.

This fixes the WARN_ON(!cr->bss)) from __cfg80211_connect_result() when
connecting to an AP after a previous connection failure (e.g. where EAP fails
due to incorrect psk but association succeeded). In some scenarios, indeed,
brcmf_is_linkup() was reporting a link up event too early due to stale
BRCMF_VIF_STATUS_ASSOC_SUCCESS bit, thus reporting to cfg80211 a connection
result with a zeroed bssid (vif->profile.bssid is still empty), causing the
WARN_ON due to the call to cfg80211_get_bss() with the empty bssid.

Signed-off-by: Luca Pesce <luca.pesce@vimar.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1608807119-21785-1-git-send-email-luca.pesce@vimar.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 09078368d516918666a0122f2533dc73676d3d7e ]

ieee80211_find_sta_by_ifaddr() must be called under the RCU lock and
the resulting pointer is only valid under RCU lock as well.

Fix ath10k_wmi_tlv_op_pull_peer_stats_info() to hold RCU lock before it
calls ieee80211_find_sta_by_ifaddr() and release it when the resulting
pointer is no longer needed.

This problem was found while reviewing code to debug RCU warn from
ath10k_wmi_tlv_parse_peer_stats_info().

Link: https://lore.kernel.org/linux-wireless/7230c9e5-2632-b77e-c4f9-10eca557a5bb@linuxfoundation.org/
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210210212107.40373-1-skhan@linuxfoundation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 8a28af7a3e85ddf358f8c41e401a33002f7a9587 ]

The aq_nic_start function can fail in a variety of cases which leaves
the device in broken state.

An example case where the start function fails is the
request_threaded_irq which can be interrupted, resulting in a EINTR
result. This can be manually triggered by bringing the link up (e.g. ip
link set up) and triggering a SIGINT on the initiating process (e.g.
Ctrl+C). This would put the device into a half configured state.
Subsequently bringing the link up again would cause the napi_enable to
BUG.

In order to correctly clean up the failed attempt to start a device call
aq_nic_stop.

Signed-off-by: Nathan Rossi <nathan.rossi@digi.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39935dccb21c60f9bbf1bb72d22ab6fd14ae7705 ]

If a DDP broadcast packet is sent out to a non-gateway target, it is
also looped back. There is a potential for the loopback device to have a
longer hardware header length than the original target route's device,
which can result in the skb not being created with enough room for the
loopback device's hardware header. This patch fixes the issue by
determining that a loopback will be necessary prior to allocating the
skb, and if so, ensuring the skb has enough room.

This was discovered while testing a new driver that creates a LocalTalk
network interface (LTALK_HLEN = 1). It caused an skb_under_panic.

Signed-off-by: Doug Brown <doug@schmorgal.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 62e69bc419772638369eff8ff81340bde8aceb61 ]

lmc set sc->lmc_media pointer when there is a matching device.
However, when no matching device is found, this pointer is NULL
and the following dereference will result in a null-ptr-deref.

To fix this issue, unregister the hdlc device and return an error.

[    4.569359] BUG: KASAN: null-ptr-deref in lmc_init_one.cold+0x2b6/0x55d [lmc]
[    4.569748] Read of size 8 at addr 0000000000000008 by task modprobe/95
[    4.570102]
[    4.570187] CPU: 0 PID: 95 Comm: modprobe Not tainted 5.11.0-rc7 #94
[    4.570527] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-preb4
[    4.571125] Call Trace:
[    4.571261]  dump_stack+0x7d/0xa3
[    4.571445]  kasan_report.cold+0x10c/0x10e
[    4.571667]  ? lmc_init_one.cold+0x2b6/0x55d [lmc]
[    4.571932]  lmc_init_one.cold+0x2b6/0x55d [lmc]
[    4.572186]  ? lmc_mii_readreg+0xa0/0xa0 [lmc]
[    4.572432]  local_pci_probe+0x6f/0xb0
[    4.572639]  pci_device_probe+0x171/0x240
[    4.572857]  ? pci_device_remove+0xe0/0xe0
[    4.573080]  ? kernfs_create_link+0xb6/0x110
[    4.573315]  ? sysfs_do_create_link_sd.isra.0+0x76/0xe0
[    4.573598]  really_probe+0x161/0x420
[    4.573799]  driver_probe_device+0x6d/0xd0
[    4.574022]  device_driver_attach+0x82/0x90
[    4.574249]  ? device_driver_attach+0x90/0x90
[    4.574485]  __driver_attach+0x60/0x100
[    4.574694]  ? device_driver_attach+0x90/0x90
[    4.574931]  bus_for_each_dev+0xe1/0x140
[    4.575146]  ? subsys_dev_iter_exit+0x10/0x10
[    4.575387]  ? klist_node_init+0x61/0x80
[    4.575602]  bus_add_driver+0x254/0x2a0
[    4.575812]  driver_register+0xd3/0x150
[    4.576021]  ? 0xffffffffc0018000
[    4.576202]  do_one_initcall+0x84/0x250
[    4.576411]  ? trace_event_raw_event_initcall_finish+0x150/0x150
[    4.576733]  ? unpoison_range+0xf/0x30
[    4.576938]  ? ____kasan_kmalloc.constprop.0+0x84/0xa0
[    4.577219]  ? unpoison_range+0xf/0x30
[    4.577423]  ? unpoison_range+0xf/0x30
[    4.577628]  do_init_module+0xf8/0x350
[    4.577833]  load_module+0x3fe6/0x4340
[    4.578038]  ? vm_unmap_ram+0x1d0/0x1d0
[    4.578247]  ? ____kasan_kmalloc.constprop.0+0x84/0xa0
[    4.578526]  ? module_frob_arch_sections+0x20/0x20
[    4.578787]  ? __do_sys_finit_module+0x108/0x170
[    4.579037]  __do_sys_finit_module+0x108/0x170
[    4.579278]  ? __ia32_sys_init_module+0x40/0x40
[    4.579523]  ? file_open_root+0x200/0x200
[    4.579742]  ? do_sys_open+0x85/0xe0
[    4.579938]  ? filp_open+0x50/0x50
[    4.580125]  ? exit_to_user_mode_prepare+0xfc/0x130
[    4.580390]  do_syscall_64+0x33/0x40
[    4.580586]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    4.580859] RIP: 0033:0x7f1a724c3cf7
[    4.581054] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 f8 48 89 f7 48 89 d6 48 891
[    4.582043] RSP: 002b:00007fff44941c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    4.582447] RAX: ffffffffffffffda RBX: 00000000012ada70 RCX: 00007f1a724c3cf7
[    4.582827] RDX: 0000000000000000 RSI: 00000000012ac9e0 RDI: 0000000000000003
[    4.583207] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000001
[    4.583587] R10: 00007f1a72527300 R11: 0000000000000246 R12: 00000000012ac9e0
[    4.583968] R13: 0000000000000000 R14: 00000000012acc90 R15: 0000000000000001
[    4.584349] ==================================================================

Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit febf22565549ea7111e7d45e8f2d64373cc66b11 upstream.

We found a recording issue on a Dell AIO, users plug a headset-mic and
select headset-mic from UI, but can't record any sound from
headset-mic. The root cause is the determine_headset_type() returns a
wrong type, e.g. users plug a ctia type headset, but that function
returns omtp type.

On this machine, the internal mic is not connected to the codec, the
"Input Source" is headset mic by default. And when users plug a
headset, the determine_headset_type() will be called immediately, the
codec on this AIO is alc274, the delay time for this codec in the
determine_headset_type() is only 80ms, the delay is too short to
correctly determine the headset type, the fail rate is nearly 99% when
users plug the headset with the normal speed.

Other codecs set several hundred ms delay time, so here I change the
delay time to 850ms for alc2x4 series, after this change, the fail
rate is zero unless users plug the headset slowly on purpose.

Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210320091542.6748-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e54f30befa7990b897189b44a56c1138c6bfdbb5 upstream.

We found the alc_update_headset_mode() is not called on some machines
when unplugging the headset, as a result, the mode of the
ALC_HEADSET_MODE_UNPLUGGED can't be set, then the current_headset_type
is not cleared, if users plug a differnt type of headset next time,
the determine_headset_type() will not be called and the audio jack is
set to the headset type of previous time.

On the Dell machines which connect the dmic to the PCH, if we open
the gnome-sound-setting and unplug the headset, this issue will
happen. Those machines disable the auto-mute by ucm and has no
internal mic in the input source, so the update_headset_mode() will
not be called by cap_sync_hook or automute_hook when unplugging, and
because the gnome-sound-setting is opened, the codec will not enter
the runtime_suspend state, so the update_headset_mode() will not be
called by alc_resume when unplugging. In this case the
hp_automute_hook is called when unplugging, so add
update_headset_mode() calling to this function.

Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20210320091542.6748-2-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9dfacc54a8661bc8be6e08cffee59596ec59f263 upstream.

pm_runtime_put_suppliers() must not decrement rpm_active unless the
consumer is suspended. That is because, otherwise, it could suspend
suppliers for an active consumer.

That can happen as follows:

 static int driver_probe_device(struct device_driver *drv, struct device *dev)
 {
	int ret = 0;

	if (!device_is_registered(dev))
		return -ENODEV;

	dev->can_match = true;
	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
		 drv->bus->name, __func__, dev_name(dev), drv->name);

	pm_runtime_get_suppliers(dev);
	if (dev->parent)
		pm_runtime_get_sync(dev->parent);

 At this point, dev can runtime suspend so rpm_put_suppliers() can run,
 rpm_active becomes 1 (the lowest value).

	pm_runtime_barrier(dev);
	if (initcall_debug)
		ret = really_probe_debug(dev, drv);
	else
		ret = really_probe(dev, drv);

 Probe callback can have runtime resumed dev, and then runtime put
 so dev is awaiting autosuspend, but rpm_active is 2.

	pm_request_idle(dev);

	if (dev->parent)
		pm_runtime_put(dev->parent);

	pm_runtime_put_suppliers(dev);

 Now pm_runtime_put_suppliers() will put the supplier
 i.e. rpm_active 2 -> 1, but consumer can still be active.

	return ret;
 }

Fix by checking the runtime status. For any status other than
RPM_SUSPENDED, rpm_active can be considered to be "owned" by
rpm_[get/put]_suppliers() and pm_runtime_put_suppliers() need do nothing.

Reported-by: Asutosh Das <asutoshd@codeaurora.org>
Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0c33442f7203704aef345647e14c2fb86071001 upstream.

rpm_active indicates how many times the supplier usage_count has been
incremented. Consequently it must be updated after pm_runtime_get_sync() of
the supplier, not before.

Fixes: 4c06c4e6cf63 ("driver core: Fix possible supplier PM-usage counter imbalance")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9deb193af69d3fd6dd8e47f292b67c805a787010 upstream.

Commit cbc3b92ce037 fixed an issue to modify the macros of the stack trace
event so that user space could parse it properly. Originally the stack
trace format to user space showed that the called stack was a dynamic
array. But it is not actually a dynamic array, in the way that other
dynamic event arrays worked, and this broke user space parsing for it. The
update was to make the array look to have 8 entries in it. Helper
functions were added to make it parse it correctly, as the stack was
dynamic, but was determined by the size of the event stored.

Although this fixed user space on how it read the event, it changed the
internal structure used for the stack trace event. It changed the array
size from [0] to [8] (added 8 entries). This increased the size of the
stack trace event by 8 words. The size reserved on the ring buffer was the
size of the stack trace event plus the number of stack entries found in
the stack trace. That commit caused the amount to be 8 more than what was
needed because it did not expect the caller field to have any size. This
produced 8 entries of garbage (and reading random data) from the stack
trace event:

          <idle>-0       [002] d... 1976396.837549: <stack trace>
 => trace_event_raw_event_sched_switch
 => __traceiter_sched_switch
 => __schedule
 => schedule_idle
 => do_idle
 => cpu_startup_entry
 => secondary_startup_64_no_verify
 => 0xc8c5e150ffff93de
 => 0xffff93de
 => 0
 => 0
 => 0xc8c5e17800000000
 => 0x1f30affff93de
 => 0x00000004
 => 0x200000000

Instead, subtract the size of the caller field from the size of the event
to make sure that only the amount needed to store the stack trace is
reserved.

Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/

Cc: stable@vger.kernel.org
Fixes: cbc3b92ce037 ("tracing: Set kernel_stack's caller size properly")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e720e7d0e983bf05de80b231bccc39f1487f0f16 upstream.

There are code paths that rely on zero_pfn to be fully initialized
before core_initcall.  For example, wq_sysfs_init() is a core_initcall
function that eventually results in a call to kernel_execve, which
causes a page fault with a subsequent mmput.  If zero_pfn is not
initialized by then it may not get cleaned up properly and result in an
error:

  BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1

Here is an analysis of the race as seen on a MIPS device. On this
particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until
initialized, at which point it becomes PFN 5120:

  1. wq_sysfs_init calls into kobject_uevent_env at core_initcall:
       kobject_uevent_env+0x7e4/0x7ec
       kset_register+0x68/0x88
       bus_register+0xdc/0x34c
       subsys_virtual_register+0x34/0x78
       wq_sysfs_init+0x1c/0x4c
       do_one_initcall+0x50/0x1a8
       kernel_init_freeable+0x230/0x2c8
       kernel_init+0x10/0x100
       ret_from_kernel_thread+0x14/0x1c

  2. kobject_uevent_env() calls call_usermodehelper_exec() which executes
     kernel_execve asynchronously.

  3. Memory allocations in kernel_execve cause a page fault, bumping the
     MM reference counter:
       add_mm_counter_fast+0xb4/0xc0
       handle_mm_fault+0x6e4/0xea0
       __get_user_pages.part.78+0x190/0x37c
       __get_user_pages_remote+0x128/0x360
       get_arg_page+0x34/0xa0
       copy_string_kernel+0x194/0x2a4
       kernel_execve+0x11c/0x298
       call_usermodehelper_exec_async+0x114/0x194

  4. In case zero_pfn has not been initialized yet, zap_pte_range does
     not decrement the MM_ANONPAGES RSS counter and the BUG message is
     triggered shortly afterwards when __mmdrop checks the ref counters:
       __mmdrop+0x98/0x1d0
       free_bprm+0x44/0x118
       kernel_execve+0x160/0x1d8
       call_usermodehelper_exec_async+0x114/0x194
       ret_from_kernel_thread+0x14/0x1c

To avoid races such as described above, initialize init_zero_pfn at
early_initcall level.  Depending on the architecture, ZERO_PAGE is
either constant or gets initialized even earlier, at paging_init, so
there is no issue with initializing zero_pfn earlier.

Link: https://lkml.kernel.org/r/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niAXOpqEbt7w@mail.gmail.com
Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: stable@vger.kernel.org
Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e61b84f9d3ddfba73091f9fbc940caae1c9eb22 upstream.

Offset calculation wasn't correct as start addresses are in pfn
not in bytes.

CC: stable@vger.kernel.org
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3512fb67093fabdf27af303066627b921ee9bd8 upstream.

The page table of AMDGPU requires an alignment to CPU page so we should
check ioctl parameters for it.  Return -EINVAL if some parameter is
unaligned to CPU page, instead of corrupt the page table sliently.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e46d1b78a03d52306f21f77a4e4a144b6d31486 upstream.

syzbot is reporting NULL pointer dereference at reiserfs_security_init()
[1], for commit ab17c4f ("reiserfs: fixup xattr_root caching")
is assuming that REISERFS_SB(s)->xattr_root != NULL in
reiserfs_xattr_jcreate_nblocks() despite that commit made
REISERFS_SB(sb)->priv_root != NULL && REISERFS_SB(s)->xattr_root == NULL
case possible.

I guess that commit 6cb4aff ("reiserfs: fix oops while creating
privroot with selinux enabled") wanted to check xattr_root != NULL
before reiserfs_xattr_jcreate_nblocks(), for the changelog is talking
about the xattr root.

  The issue is that while creating the privroot during mount
  reiserfs_security_init calls reiserfs_xattr_jcreate_nblocks which
  dereferences the xattr root. The xattr root doesn't exist, so we get
  an oops.

Therefore, update reiserfs_xattrs_initialized() to check both the
privroot and the xattr root.

Link: https://syzkaller.appspot.com/bug?id=8abaedbdeb32c861dc5340544284167dd0e46cde # [1]
Reported-and-tested-by: syzbot <syzbot+690cb1e51970435f9775@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 6cb4aff ("reiserfs: fix oops while creating privroot with selinux enabled")
Acked-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Jan Kara <jack@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c971af25cda94afe71617790826a86253e88eab0 upstream.

The restore in resume should match to suspend which only set for RK3288
SoCs pinctrl.

Fixes: 8dca933 ("pinctrl: rockchip: save and restore gpio6_c6 pinmux in suspend/resume")
Reviewed-by: Jianqun Xu <jay.xu@rock-chips.com>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Wang Panzhenzhuan <randy.wang@rock-chips.com>
Signed-off-by: Jianqun Xu <jay.xu@rock-chips.com>
Link: https://lore.kernel.org/r/20210223100725.269240-1-jay.xu@rock-chips.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9570d4a5efd04479b3cd09c39b571eb031d94f4 ]

Add stubs for extcon_register_notifier_all() function for !CONFIG_EXTCON
case.  This is useful for compile testing and for drivers which use
EXTCON but do not require it (therefore do not depend on CONFIG_EXTCON).

Fixes: 815429b ("extcon: Add new extcon_register_notifier_all() to monitor all external connectors")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit d3bdd1c3140724967ca4136755538fa7c05c2b4e ]

When devm_kcalloc() fails, we should execute device_unregister()
to unregister edev->dev from system.

Fixes: 046050f ("extcon: Update the prototype of extcon_register_notifier() with enum extcon")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 829933ef05a951c8ff140e814656d73e74915faf ]

For each device, the nosy driver allocates a pcilynx structure.
A use-after-free might happen in the following scenario:

 1. Open nosy device for the first time and call ioctl with command
    NOSY_IOC_START, then a new client A will be malloced and added to
    doubly linked list.
 2. Open nosy device for the second time and call ioctl with command
    NOSY_IOC_START, then a new client B will be malloced and added to
    doubly linked list.
 3. Call ioctl with command NOSY_IOC_START for client A, then client A
    will be readded to the doubly linked list. Now the doubly linked
    list is messed up.
 4. Close the first nosy device and nosy_release will be called. In
    nosy_release, client A will be unlinked and freed.
 5. Close the second nosy device, and client A will be referenced,
    resulting in UAF.

The root cause of this bug is that the element in the doubly linked list
is reentered into the list.

Fix this bug by adding a check before inserting a client.  If a client
is already in the linked list, don't insert it.

The following KASAN report reveals it:

   BUG: KASAN: use-after-free in nosy_release+0x1ea/0x210
   Write of size 8 at addr ffff888102ad7360 by task poc
   CPU: 3 PID: 337 Comm: poc Not tainted 5.12.0-rc5+ #6
   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
   Call Trace:
     nosy_release+0x1ea/0x210
     __fput+0x1e2/0x840
     task_work_run+0xe8/0x180
     exit_to_user_mode_prepare+0x114/0x120
     syscall_exit_to_user_mode+0x1d/0x40
     entry_SYSCALL_64_after_hwframe+0x44/0xae

   Allocated by task 337:
     nosy_open+0x154/0x4d0
     misc_open+0x2ec/0x410
     chrdev_open+0x20d/0x5a0
     do_dentry_open+0x40f/0xe80
     path_openat+0x1cf9/0x37b0
     do_filp_open+0x16d/0x390
     do_sys_openat2+0x11d/0x360
     __x64_sys_open+0xfd/0x1a0
     do_syscall_64+0x33/0x40
     entry_SYSCALL_64_after_hwframe+0x44/0xae

   Freed by task 337:
     kfree+0x8f/0x210
     nosy_release+0x158/0x210
     __fput+0x1e2/0x840
     task_work_run+0xe8/0x180
     exit_to_user_mode_prepare+0x114/0x120
     syscall_exit_to_user_mode+0x1d/0x40
     entry_SYSCALL_64_after_hwframe+0x44/0xae

   The buggy address belongs to the object at ffff888102ad7300 which belongs to the cache kmalloc-128 of size 128
   The buggy address is located 96 bytes inside of 128-byte region [ffff888102ad7300, ffff888102ad7380)

[ Modified to use 'list_empty()' inside proper lock  - Linus ]

Link: https://lore.kernel.org/lkml/1617433116-5930-1-git-send-email-zheyuma97@gmail.com/
Reported-and-tested-by: 马哲宇 (Zheyu Ma) <zheyuma97@gmail.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Cc: Greg Kroah-Hartman <greg@kroah.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 1cc5ed25bdade86de2650a82b2730108a76de20c upstream.

Fix shift out-of-bounds in vhci_hub_control() SetPortFeature handling.

UBSAN: shift-out-of-bounds in drivers/usb/usbip/vhci_hcd.c:605:42
shift exponent 768 is too large for 32-bit type 'int'

Reported-by: syzbot+3dea30b047f41084de66@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20210324230654.34798-1-skhan@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bd860493f81eb2a46173f6f5e44cc38331c8dbd upstream.

This LTE modem (M.2 card) has a bug in its power management:
there is some kind of race condition for U3 wake-up between the host and
the device. The modem firmware sometimes crashes/locks when both events
happen at the same time and the modem fully drops off the USB bus (and
sometimes re-enumerates, sometimes just gets stuck until the next
reboot).

Tested with the modem wired to the XHCI controller on an AMD 3015Ce
platform. Without the patch, the modem dropped of the USB bus 5 times in
3 days. With the quirk, it stayed connected for a week while the
'runtime_suspended_time' counter incremented as excepted.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Link: https://lore.kernel.org/r/20210319124802.2315195-1-vpalatin@chromium.org
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92af4fc6ec331228aca322ca37c8aea7b150a151 upstream.

Pinephone running on Allwinner A64 fails to suspend with USB devices
connected as reported by Bhushan Shah <bshah@kde.org>. Reverting
commit 5fbf7a253470 ("usb: musb: fix idling for suspend after
disconnect interrupt") fixes the issue.

Let's add suspend checks also for suspend after disconnect interrupt
quirk handling like we already do elsewhere.

Fixes: 5fbf7a253470 ("usb: musb: fix idling for suspend after disconnect interrupt")
Reported-by: Bhushan Shah <bshah@kde.org>
Tested-by: Bhushan Shah <bshah@kde.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210324071142.42264-1-tony@atomide.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f978a30c9bb12dab1302d0f06951ee290f5e600 upstream.

The MediaTek 0.96 xHCI controller on some platforms does not
support bulk stream even HCCPARAMS says supporting, due to MaxPSASize
is set a default value 1 by mistake, here use XHCI_BROKEN_STREAMS
quirk to fix it.

Fixes: 94a631d ("usb: xhci-mtk: check hcc_params after adding primary hcd")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1616482975-17841-4-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08dff274edda54310d6f1cf27b62fddf0f8d146e upstream.

Counting break events is nice but we should actually report them to
the tty layer.

Fixes: 5a6a62b ("cdc-acm: add TIOCMIWAIT")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20210311133714.31881-1-oneukum@suse.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
…tint

commit 6069e3e927c8fb3a1947b07d1a561644ea960248 upstream.

We have a cycle of callbacks scheduling works which submit
URBs with thos callbacks. This needs to be blocked, stopped
and unblocked to untangle the circle.

The issue leads to faults like:

[   55.068392] Unable to handle kernel paging request at virtual address 6b6b6c03
[   55.075624] pgd = be866494
[   55.078335] [6b6b6c03] *pgd=00000000
[   55.081924] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[   55.087238] Modules linked in: ppp_async crc_ccitt ppp_generic slhc
xt_TCPMSS xt_tcpmss xt_hl nf_log_ipv6 nf_log_ipv4 nf_log_common
xt_policy xt_limit xt_conntrack xt_tcpudp xt_pkttype ip6table_mangle
iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
iptable_mangle ip6table_filter ip6_tables iptable_filter ip_tables
des_generic md5 sch_fq_codel cdc_mbim cdc_wdm cdc_ncm usbnet mii
cdc_acm usb_storage ip_tunnel xfrm_user xfrm6_tunnel tunnel6
xfrm4_tunnel tunnel4 esp6 esp4 ah6 ah4 xfrm_algo xt_LOG xt_LED
xt_comment x_tables ipv6
[   55.134954] CPU: 0 PID: 82 Comm: kworker/0:2 Tainted: G
   T 5.8.17 #1
[   55.142526] Hardware name: Freescale i.MX7 Dual (Device Tree)
[   55.148304] Workqueue: events acm_softint [cdc_acm]
[   55.153196] PC is at kobject_get+0x10/0xa4
[   55.157302] LR is at usb_get_dev+0x14/0x1c
[   55.161402] pc : [<8047c06c>]    lr : [<80560448>]    psr: 20000193
[   55.167671] sp : bca39ea8  ip : 00007374  fp : bf6cbd80
[   55.172899] r10: 00000000  r9 : bdd92284  r8 : bdd92008
[   55.178128] r7 : 6b6b6b6b  r6 : fffffffe  r5 : 60000113  r4 : 6b6b6be3
[   55.184658] r3 : 6b6b6b6b  r2 : 00000111  r1 : 00000000  r0 : 6b6b6be3
[   55.191191] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM Segment none
[   55.198417] Control: 10c5387d  Table: bcf0c06a  DAC: 00000051
[   55.204168] Process kworker/0:2 (pid: 82, stack limit = 0x9bdd2a89)
[   55.210439] Stack: (0xbca39ea8 to 0xbca3a000)
[   55.214805] 9ea0:                   bf6cbd80 80769a50 6b6b6b6b 80560448 bdeb0500 8056bfe8
[   55.222991] 9ec0: 00000002 b76da000 00000000 bdeb0500 bdd92448 bca38000 bdeb0510 8056d69c
[   55.231177] 9ee0: bca38000 00000000 80c050fc 00000000 bca39f44 09d42015 00000000 00000001
[   55.239363] 9f00: bdd92448 bdd92438 bdd92000 7f1158c4 bdd92448 bca2ee00 bf6cbd80 bf6cef00
[   55.247549] 9f20: 00000000 00000000 00000000 801412d8 bf6cbd98 80c03d00 bca2ee00 bf6cbd80
[   55.255735] 9f40: bca2ee14 bf6cbd98 80c03d00 00000008 bca38000 80141568 00000000 80c446ae
[   55.263921] 9f60: 00000000 bc9ed880 bc9f0700 bca38000 bc117eb4 80141524 bca2ee00 bc9ed8a4
[   55.272107] 9f80: 00000000 80147cc8 00000000 bc9f0700 80147b84 00000000 00000000 00000000
[   55.280292] 9fa0: 00000000 00000000 00000000 80100148 00000000 00000000 00000000 00000000
[   55.288477] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   55.296662] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[   55.304860] [<8047c06c>] (kobject_get) from [<80560448>] (usb_get_dev+0x14/0x1c)
[   55.312271] [<80560448>] (usb_get_dev) from [<8056bfe8>] (usb_hcd_unlink_urb+0x50/0xd8)
[   55.320286] [<8056bfe8>] (usb_hcd_unlink_urb) from [<8056d69c>] (usb_kill_urb.part.0+0x44/0xd0)
[   55.329004] [<8056d69c>] (usb_kill_urb.part.0) from [<7f1158c4>] (acm_softint+0x4c/0x10c [cdc_acm])
[   55.338082] [<7f1158c4>] (acm_softint [cdc_acm]) from [<801412d8>] (process_one_work+0x19c/0x3e8)
[   55.346969] [<801412d8>] (process_one_work) from [<80141568>] (worker_thread+0x44/0x4dc)
[   55.355072] [<80141568>] (worker_thread) from [<80147cc8>] (kthread+0x144/0x180)
[   55.362481] [<80147cc8>] (kthread) from [<80100148>] (ret_from_fork+0x14/0x2c)
[   55.369706] Exception stack(0xbca39fb0 to 0xbca39ff8)

Tested-by: Bruno Thomsen <bruno.thomsen@gmail.com>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210311130126.15972-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 17, 2021
[ Upstream commit 2acf15b94d5b8ea8392c4b6753a6ffac3135cd78 ]

Our syzcaller report a NULL pointer dereference:

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 116e95067 P4D 116e95067 PUD 1080b5067 PMD 0
Oops: 0010 [#1] SMP KASAN
CPU: 7 PID: 592 Comm: a.out Not tainted 5.13.0-next-20210629-dirty #67
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-p4
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffff888114e779b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 1ffff110229cef39 RCX: ffffffffaa67e1aa
RDX: 0000000000000000 RSI: ffff88810a58ee00 RDI: ffff8881233180b0
RBP: ffffffffac38e9c0 R08: ffffffffaa67e17e R09: 0000000000000001
R10: ffffffffb91c5557 R11: fffffbfff7238aaa R12: ffff88810a58ee00
R13: ffff888114e77aa0 R14: 0000000000000000 R15: ffff8881233180b0
FS:  00007f946163c480(0000) GS:ffff88839f1c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000001099c1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __lookup_slow+0x116/0x2d0
 ? page_put_link+0x120/0x120
 ? __d_lookup+0xfc/0x320
 ? d_lookup+0x49/0x90
 lookup_one_len+0x13c/0x170
 ? __lookup_slow+0x2d0/0x2d0
 ? reiserfs_schedule_old_flush+0x31/0x130
 reiserfs_lookup_privroot+0x64/0x150
 reiserfs_fill_super+0x158c/0x1b90
 ? finish_unfinished+0xb10/0xb10
 ? bprintf+0xe0/0xe0
 ? __mutex_lock_slowpath+0x30/0x30
 ? __kasan_check_write+0x20/0x30
 ? up_write+0x51/0xb0
 ? set_blocksize+0x9f/0x1f0
 mount_bdev+0x27c/0x2d0
 ? finish_unfinished+0xb10/0xb10
 ? reiserfs_kill_sb+0x120/0x120
 get_super_block+0x19/0x30
 legacy_get_tree+0x76/0xf0
 vfs_get_tree+0x49/0x160
 ? capable+0x1d/0x30
 path_mount+0xacc/0x1380
 ? putname+0x97/0xd0
 ? finish_automount+0x450/0x450
 ? kmem_cache_free+0xf8/0x5a0
 ? putname+0x97/0xd0
 do_mount+0xe2/0x110
 ? path_mount+0x1380/0x1380
 ? copy_mount_options+0x69/0x140
 __x64_sys_mount+0xf0/0x190
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

This is because 'root_inode' is initialized with wrong mode, and
it's i_op is set to 'reiserfs_special_inode_operations'. Thus add
check for 'root_inode' to fix the problem.

Link: https://lore.kernel.org/r/20210702040743.1918552-1-yukuai3@huawei.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit e8675d291ac007e1c636870db880f837a9ea112a ]

Our syzkaller trigger the "BUG_ON(!list_empty(&inode->i_wb_list))" in
clear_inode:

  kernel BUG at fs/inode.c:519!
  Internal error: Oops - BUG: 0 [#1] SMP
  Modules linked in:
  Process syz-executor.0 (pid: 249, stack limit = 0x00000000a12409d7)
  CPU: 1 PID: 249 Comm: syz-executor.0 Not tainted 4.19.95
  Hardware name: linux,dummy-virt (DT)
  pstate: 80000005 (Nzcv daif -PAN -UAO)
  pc : clear_inode+0x280/0x2a8
  lr : clear_inode+0x280/0x2a8
  Call trace:
    clear_inode+0x280/0x2a8
    ext4_clear_inode+0x38/0xe8
    ext4_free_inode+0x130/0xc68
    ext4_evict_inode+0xb20/0xcb8
    evict+0x1a8/0x3c0
    iput+0x344/0x460
    do_unlinkat+0x260/0x410
    __arm64_sys_unlinkat+0x6c/0xc0
    el0_svc_common+0xdc/0x3b0
    el0_svc_handler+0xf8/0x160
    el0_svc+0x10/0x218
  Kernel panic - not syncing: Fatal exception

A crash dump of this problem show that someone called __munlock_pagevec
to clear page LRU without lock_page: do_mmap -> mmap_region -> do_munmap
-> munlock_vma_pages_range -> __munlock_pagevec.

As a result memory_failure will call identify_page_state without
wait_on_page_writeback.  And after truncate_error_page clear the mapping
of this page.  end_page_writeback won't call sb_clear_inode_writeback to
clear inode->i_wb_list.  That will trigger BUG_ON in clear_inode!

Fix it by checking PageWriteback too to help determine should we skip
wait_on_page_writeback.

Link: https://lkml.kernel.org/r/20210604084705.3729204-1-yangerkun@huawei.com
Fixes: 0bc1f8b ("hwpoison: fix the handling path of the victimized page frame that belong to non-LRU")
Signed-off-by: yangerkun <yangerkun@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit ea6932d70e223e02fea3ae20a4feff05d7c1ea9a ]

There is a panic in socket ioctl cmd SIOCGSKNS when NET_NS is not enabled.
The reason is that nsfs tries to access ns->ops but the proc_ns_operations
is not implemented in this case.

[7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[7.670268] pgd = 32b54000
[7.670544] [00000010] *pgd=00000000
[7.671861] Internal error: Oops: 5 [#1] SMP ARM
[7.672315] Modules linked in:
[7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
[7.673309] Hardware name: Generic DT based system
[7.673642] PC is at nsfs_evict+0x24/0x30
[7.674486] LR is at clear_inode+0x20/0x9c

The same to tun SIOCGSKNS command.

To fix this problem, we make get_net_ns() return -EINVAL when NET_NS is
disabled. Meanwhile move it to right place net/core/net_namespace.c.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Fixes: c62cce2 ("net: add an ioctl to get a socket network namespace")
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit b31d9d6d7abbf6483b871b6370bc31c930d53f54 upstream.

when system is doing s4, the process of xhci_resume may be as below:
1、xhci_mem_cleanup
2、xhci_init->xhci_mem_init->xhci_mem_cleanup(when memory is not enough).
xhci_mem_cleanup will be executed twice when system is out of memory.
xhci->port_caps is freed in xhci_mem_cleanup,but it isn't set to NULL.
It will be freed twice when xhci_mem_cleanup is called the second time.

We got following bug when system resumes from s4:

kernel BUG at mm/slub.c:309!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
CPU: 0 PID: 5929 Tainted: G S   W   5.4.96-arm64-desktop #1
pc : __slab_free+0x5c/0x424
lr : kfree+0x30c/0x32c

Call trace:
 __slab_free+0x5c/0x424
 kfree+0x30c/0x32c
 xhci_mem_cleanup+0x394/0x3cc
 xhci_mem_init+0x9ac/0x1070
 xhci_init+0x8c/0x1d0
 xhci_resume+0x1cc/0x5fc
 xhci_plat_resume+0x64/0x70
 platform_pm_thaw+0x28/0x60
 dpm_run_callback+0x54/0x24c
 device_resume+0xd0/0x200
 async_resume+0x24/0x60
 async_run_entry_fn+0x44/0x110
 process_one_work+0x1f0/0x490
 worker_thread+0x5c/0x450
 kthread+0x158/0x160
 ret_from_fork+0x10/0x24

Original patch that caused this issue was backported to 4.4 stable,
so this should be backported to 4.4 stabe as well.

Fixes: cf0ee7c60c89 ("xhci: Fix memory leak when caching protocol extended capability PSI tables - take 2")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Jiantao Zhang <water.zhangjiantao@huawei.com>
Signed-off-by: Tao Xue <xuetao09@huawei.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210617150354.1512157-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit fb312ac5ccb007e843f982b38d4d6886ba4b32f2 upstream.

I got this crash more times during debugging of PCIe controller and crash
happens somehow at the time when PCIe kernel code started link retraining (as
part of ASPM code) when at the same time PCIe link went down and ath9k probably
executed hw reset procedure.

Currently I'm not able to reproduce this issue as it looks like to be
some race condition between link training, ASPM, link down and reset
path. And as always, race conditions which depends on more input
parameters are hard to reproduce as it depends on precise timings.

But it is clear that pointers are zero in this case and should be
properly filled as same code pattern is used in ath9k_stop() function.
Anyway I was able to reproduce this crash by manually triggering ath
reset worker prior putting card up. I created simple patch to export
reset functionality via debugfs and use it to "simulate" of triggering
reset.    s proved that NULL-pointer dereference issue is there.

Function ath9k_hw_reset() is dereferencing chan structure pointer, so it
needs to be non-NULL pointer.

Function ath9k_stop() already contains code which sets ah->curchan to valid
non-NULL pointer prior calling ath9k_hw_reset() function.

Add same code pattern also into ath_reset_internal() function to prevent
kernel NULL pointer dereference in ath9k_hw_reset() function.

This change fixes kernel NULL pointer dereference in ath9k_hw_reset() which
is caused by calling ath9k_hw_reset() from ath_reset_internal() with NULL
chan structure.

    [   45.334305] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
    [   45.344417] Mem abort info:
    [   45.347301]   ESR = 0x96000005
    [   45.350448]   EC = 0x25: DABT (current EL), IL = 32 bits
    [   45.356166]   SET = 0, FnV = 0
    [   45.359350]   EA = 0, S1PTW = 0
    [   45.362596] Data abort info:
    [   45.365756]   ISV = 0, ISS = 0x00000005
    [   45.369735]   CM = 0, WnR = 0
    [   45.372814] user pgtable: 4k pages, 39-bit VAs, pgdp=000000000685d000
    [   45.379663] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
    [   45.388856] Internal error: Oops: 96000005 [#1] SMP
    [   45.393897] Modules linked in: ath9k ath9k_common ath9k_hw
    [   45.399574] CPU: 1 PID: 309 Comm: kworker/u4:2 Not tainted 5.12.0-rc2-dirty #785
    [   45.414746] Workqueue: phy0 ath_reset_work [ath9k]
    [   45.419713] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
    [   45.425910] pc : ath9k_hw_reset+0xc4/0x1c48 [ath9k_hw]
    [   45.431234] lr : ath9k_hw_reset+0xc0/0x1c48 [ath9k_hw]
    [   45.436548] sp : ffffffc0118dbca0
    [   45.439961] x29: ffffffc0118dbca0 x28: 0000000000000000
    [   45.445442] x27: ffffff800dee4080 x26: 0000000000000000
    [   45.450923] x25: ffffff800df9b9d8 x24: 0000000000000000
    [   45.456404] x23: ffffffc0115f6000 x22: ffffffc008d0d408
    [   45.461885] x21: ffffff800dee5080 x20: ffffff800df9b9d8
    [   45.467366] x19: 0000000000000000 x18: 0000000000000000
    [   45.472846] x17: 0000000000000000 x16: 0000000000000000
    [   45.478326] x15: 0000000000000010 x14: ffffffffffffffff
    [   45.483807] x13: ffffffc0918db94f x12: ffffffc011498720
    [   45.489289] x11: 0000000000000003 x10: ffffffc0114806e0
    [   45.494770] x9 : ffffffc01014b2ec x8 : 0000000000017fe8
    [   45.500251] x7 : c0000000ffffefff x6 : 0000000000000001
    [   45.505733] x5 : 0000000000000000 x4 : 0000000000000000
    [   45.511213] x3 : 0000000000000000 x2 : ffffff801fece870
    [   45.516693] x1 : ffffffc00eded000 x0 : 000000000000003f
    [   45.522174] Call trace:
    [   45.524695]  ath9k_hw_reset+0xc4/0x1c48 [ath9k_hw]
    [   45.529653]  ath_reset_internal+0x1a8/0x2b8 [ath9k]
    [   45.534696]  ath_reset_work+0x2c/0x40 [ath9k]
    [   45.539198]  process_one_work+0x210/0x480
    [   45.543339]  worker_thread+0x5c/0x510
    [   45.547115]  kthread+0x12c/0x130
    [   45.550445]  ret_from_fork+0x10/0x1c
    [   45.554138] Code: 910922c2 9117e021 95ff0398 b4000294 (b9400a61)
    [   45.560430] ---[ end trace 566410ba90b50e8b ]---
    [   45.565193] Kernel panic - not syncing: Oops: Fatal exception in interrupt
    [   45.572282] SMP: stopping secondary CPUs
    [   45.576331] Kernel Offset: disabled
    [   45.579924] CPU features: 0x00040002,0000200c
    [   45.584416] Memory Limit: none
    [   45.587564] Rebooting in 3 seconds..

Signed-off-by: Pali Rohár <pali@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210402122653.24014-1-pali@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 1a4520090681853e6b850cbe54b27247a013e0e5 ]

In 'bt878_irq', the driver calls 'tasklet_schedule', but this tasklet is
set in 'dvb_bt8xx_load_card' of another driver 'dvb-bt8xx'.
However, this two drivers are separate. The user may not load the
'dvb-bt8xx' driver when loading the 'bt8xx' driver, that is, the tasklet
has not been initialized when 'tasklet_schedule' is called, so it is
necessary to check whether the tasklet is initialized in 'bt878_probe'.

Fix this by adding a check at the end of bt878_probe.

The KASAN's report reveals it:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 800000006aab2067 P4D 800000006aab2067 PUD 6b2ea067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN PTI
CPU: 2 PID: 8724 Comm: syz-executor.0 Not tainted 4.19.177-
gdba4159c14ef-dirty #40
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-
gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:          (null)
Code: Bad RIP value.
RSP: 0018:ffff88806c287ea0 EFLAGS: 00010246
RAX: fffffbfff1b01774 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff1b01775 RDI: 0000000000000000
RBP: ffff88806c287f00 R08: fffffbfff1b01774 R09: fffffbfff1b01774
R10: 0000000000000001 R11: fffffbfff1b01773 R12: 0000000000000000
R13: ffff88806c29f530 R14: ffffffff8d80bb88 R15: ffffffff8d80bb90
FS:  00007f6b550e6700(0000) GS:ffff88806c280000(0000) knlGS:
0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000005ec98000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 tasklet_action_common.isra.17+0x141/0x420 kernel/softirq.c:522
 tasklet_action+0x50/0x70 kernel/softirq.c:540
 __do_softirq+0x224/0x92c kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:372 [inline]
 irq_exit+0x15a/0x180 kernel/softirq.c:412
 exiting_irq arch/x86/include/asm/apic.h:535 [inline]
 do_IRQ+0x123/0x1e0 arch/x86/kernel/irq.c:260
 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:670
 </IRQ>
RIP: 0010:__do_sys_interrupt kernel/sys.c:2593 [inline]
RIP: 0010:__se_sys_interrupt kernel/sys.c:2584 [inline]
RIP: 0010:__x64_sys_interrupt+0x5b/0x80 kernel/sys.c:2584
Code: ba 00 04 00 00 48 c7 c7 c0 99 31 8c e8 ae 76 5e 01 48 85 c0 75 21 e8
14 ae 24 00 48 c7 c3 c0 99 31 8c b8 0c 00 00 00 0f 01 c1 <31> db e8 fe ad
24 00 48 89 d8 5b 5d c3 48 c7 c3 ea ff ff ff eb ec
RSP: 0018:ffff888054167f10 EFLAGS: 00000212 ORIG_RAX: ffffffffffffffde
RAX: 000000000000000c RBX: ffffffff8c3199c0 RCX: ffffc90001ca6000
RDX: 000000000000001a RSI: ffffffff813478fc RDI: ffffffff8c319dc0
RBP: ffff888054167f18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000080 R11: fffffbfff18633b7 R12: ffff888054167f58
R13: ffff88805f638000 R14: 0000000000000000 R15: 0000000000000000
 do_syscall_64+0xb0/0x4e0 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4692a9
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6b550e5c48 EFLAGS: 00000246 ORIG_RAX: 000000000000014f
RAX: ffffffffffffffda RBX: 000000000077bf60 RCX: 00000000004692a9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000140
RBP: 00000000004cf7eb R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000077bf60
R13: 0000000000000000 R14: 000000000077bf60 R15: 00007fff55a1dca0
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
CR2: 0000000000000000
---[ end trace 68e5849c3f77cbb6 ]---
RIP: 0010:          (null)
Code: Bad RIP value.
RSP: 0018:ffff88806c287ea0 EFLAGS: 00010246
RAX: fffffbfff1b01774 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 1ffffffff1b01775 RDI: 0000000000000000
RBP: ffff88806c287f00 R08: fffffbfff1b01774 R09: fffffbfff1b01774
R10: 0000000000000001 R11: fffffbfff1b01773 R12: 0000000000000000
R13: ffff88806c29f530 R14: ffffffff8d80bb88 R15: ffffffff8d80bb90
FS:  00007f6b550e6700(0000) GS:ffff88806c280000(0000) knlGS:
0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000005ec98000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Reported-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 45c8ddd06c4b729c56a6083ab311bfbd9643f4a6 ]

Before referencing 'host->data', the driver needs to check whether it is
null pointer, otherwise it will cause a null pointer reference.

This log reveals it:

[   29.355199] BUG: kernel NULL pointer dereference, address:
0000000000000014
[   29.357323] #PF: supervisor write access in kernel mode
[   29.357706] #PF: error_code(0x0002) - not-present page
[   29.358088] PGD 0 P4D 0
[   29.358280] Oops: 0002 [#1] PREEMPT SMP PTI
[   29.358595] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.12.4-
g70e7f0549188-dirty #102
[   29.359164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   29.359978] RIP: 0010:via_sdc_isr+0x21f/0x410
[   29.360314] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00
10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43
18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77
[   29.361661] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046
[   29.362042] RAX: 0000000000000000 RBX: ffff888107d77880
RCX: 0000000000000000
[   29.362564] RDX: 0000000000000000 RSI: ffffffff835d20bb
RDI: 00000000ffffffff
[   29.363085] RBP: ffffc90000118ed8 R08: 0000000000000001
R09: 0000000000000001
[   29.363604] R10: 0000000000000000 R11: 0000000000000001
R12: 0000000000008600
[   29.364128] R13: ffff888107d779c8 R14: ffffc90009c00200
R15: 0000000000008000
[   29.364651] FS:  0000000000000000(0000) GS:ffff88817bc80000(0000)
knlGS:0000000000000000
[   29.365235] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   29.365655] CR2: 0000000000000014 CR3: 0000000005a2e000
CR4: 00000000000006e0
[   29.366170] DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
[   29.366683] DR3: 0000000000000000 DR6: 00000000fffe0ff0
DR7: 0000000000000400
[   29.367197] Call Trace:
[   29.367381]  <IRQ>
[   29.367537]  __handle_irq_event_percpu+0x53/0x3e0
[   29.367916]  handle_irq_event_percpu+0x35/0x90
[   29.368247]  handle_irq_event+0x39/0x60
[   29.368632]  handle_fasteoi_irq+0xc2/0x1d0
[   29.368950]  __common_interrupt+0x7f/0x150
[   29.369254]  common_interrupt+0xb4/0xd0
[   29.369547]  </IRQ>
[   29.369708]  asm_common_interrupt+0x1e/0x40
[   29.370016] RIP: 0010:native_safe_halt+0x17/0x20
[   29.370360] Code: 07 0f 00 2d db 80 43 00 f4 5d c3 0f 1f 84 00 00 00
00 00 8b 05 c2 37 e5 01 55 48 89 e5 85 c0 7e 07 0f 00 2d bb 80 43 00 fb
f4 <5d> c3 cc cc cc cc cc cc cc 55 48 89 e5 e8 67 53 ff ff 8b 0d f9 91
[   29.371696] RSP: 0018:ffffc9000008fe90 EFLAGS: 00000246
[   29.372079] RAX: 0000000000000000 RBX: 0000000000000002
RCX: 0000000000000000
[   29.372595] RDX: 0000000000000000 RSI: ffffffff854f67a4
RDI: ffffffff85403406
[   29.373122] RBP: ffffc9000008fe90 R08: 0000000000000001
R09: 0000000000000001
[   29.373646] R10: 0000000000000000 R11: 0000000000000001
R12: ffffffff86009188
[   29.374160] R13: 0000000000000000 R14: 0000000000000000
R15: ffff888100258000
[   29.374690]  default_idle+0x9/0x10
[   29.374944]  arch_cpu_idle+0xa/0x10
[   29.375198]  default_idle_call+0x6e/0x250
[   29.375491]  do_idle+0x1f0/0x2d0
[   29.375740]  cpu_startup_entry+0x18/0x20
[   29.376034]  start_secondary+0x11f/0x160
[   29.376328]  secondary_startup_64_no_verify+0xb0/0xbb
[   29.376705] Modules linked in:
[   29.376939] Dumping ftrace buffer:
[   29.377187]    (ftrace buffer empty)
[   29.377460] CR2: 0000000000000014
[   29.377712] ---[ end trace 51a473dffb618c47 ]---
[   29.378056] RIP: 0010:via_sdc_isr+0x21f/0x410
[   29.378380] Code: ff ff e8 84 aa d0 fd 66 45 89 7e 28 66 41 f7 c4 00
10 75 56 e8 72 aa d0 fd 66 41 f7 c4 00 c0 74 10 e8 65 aa d0 fd 48 8b 43
18 <c7> 40 14 ac ff ff ff e8 55 aa d0 fd 48 89 df e8 ad fb ff ff e9 77
[   29.379714] RSP: 0018:ffffc90000118e98 EFLAGS: 00010046
[   29.380098] RAX: 0000000000000000 RBX: ffff888107d77880
RCX: 0000000000000000
[   29.380614] RDX: 0000000000000000 RSI: ffffffff835d20bb
RDI: 00000000ffffffff
[   29.381134] RBP: ffffc90000118ed8 R08: 0000000000000001
R09: 0000000000000001
[   29.381653] R10: 0000000000000000 R11: 0000000000000001
R12: 0000000000008600
[   29.382176] R13: ffff888107d779c8 R14: ffffc90009c00200
R15: 0000000000008000
[   29.382697] FS:  0000000000000000(0000) GS:ffff88817bc80000(0000)
knlGS:0000000000000000
[   29.383277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   29.383697] CR2: 0000000000000014 CR3: 0000000005a2e000
CR4: 00000000000006e0
[   29.384223] DR0: 0000000000000000 DR1: 0000000000000000
DR2: 0000000000000000
[   29.384736] DR3: 0000000000000000 DR6: 00000000fffe0ff0
DR7: 0000000000000400
[   29.385260] Kernel panic - not syncing: Fatal exception in interrupt
[   29.385882] Dumping ftrace buffer:
[   29.386135]    (ftrace buffer empty)
[   29.386401] Kernel Offset: disabled
[   29.386656] Rebooting in 1 seconds..

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Link: https://lore.kernel.org/r/1622727200-15808-1-git-send-email-zheyuma97@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 85e8b032d6ebb0f698a34dd22c2f13443d905888 ]

syzbot complained in neigh_reduce(), because rcu_read_lock_bh()
is treated differently than rcu_read_lock()

WARNING: suspicious RCU usage
5.13.0-rc6-syzkaller #0 Not tainted
-----------------------------
include/net/addrconf.h:313 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by kworker/0:0/5:
 #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
 #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
 #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
 #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:617 [inline]
 #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:644 [inline]
 #0: ffff888011064d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x871/0x1600 kernel/workqueue.c:2247
 #1: ffffc90000ca7da8 ((work_completion)(&port->wq)){+.+.}-{0:0}, at: process_one_work+0x8a5/0x1600 kernel/workqueue.c:2251
 #2: ffffffff8bf795c0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1da/0x3130 net/core/dev.c:4180

stack backtrace:
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.13.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events ipvlan_process_multicast
Call Trace:
 __dump_stack lib/dump_stack.c:79 [inline]
 dump_stack+0x141/0x1d7 lib/dump_stack.c:120
 __in6_dev_get include/net/addrconf.h:313 [inline]
 __in6_dev_get include/net/addrconf.h:311 [inline]
 neigh_reduce drivers/net/vxlan.c:2167 [inline]
 vxlan_xmit+0x34d5/0x4c30 drivers/net/vxlan.c:2919
 __netdev_start_xmit include/linux/netdevice.h:4944 [inline]
 netdev_start_xmit include/linux/netdevice.h:4958 [inline]
 xmit_one net/core/dev.c:3654 [inline]
 dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3670
 __dev_queue_xmit+0x2133/0x3130 net/core/dev.c:4246
 ipvlan_process_multicast+0xa99/0xd70 drivers/net/ipvlan/ipvlan_core.c:287
 process_one_work+0x98d/0x1600 kernel/workqueue.c:2276
 worker_thread+0x64c/0x1120 kernel/workqueue.c:2422
 kthread+0x3b1/0x4a0 kernel/kthread.c:313
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

Fixes: f564f45 ("vxlan: add ipv6 proxy support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 4a754d7637026b42b0c9ba5787ad5ee3bc2ff77f ]

The "dev->port[i].mp.mpi" is set to NULL during mlx5_ib_unbind_slave_port()
execution, however that field is needed to add device to unaffiliated list.

Such flow causes to the following kernel panic while unloading mlx5_ib
module in multi-port mode, hence the device should be added to the list
prior to unbind call.

 RPC: Unregistered rdma transport module.
 RPC: Unregistered rdma backchannel transport module.
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] SMP NOPTI
 CPU: 4 PID: 1904 Comm: modprobe Not tainted 5.13.0-rc7_for_upstream_min_debug_2021_06_24_12_08 #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 RIP: 0010:mlx5_ib_cleanup_multiport_master+0x18b/0x2d0 [mlx5_ib]
 Code: 00 04 0f 85 c4 00 00 00 48 89 df e8 ef fa ff ff 48 8b 83 40 0d 00 00 48 8b 15 b9 e8 05 00 4a 8b 44 28 20 48 89 05 ad e8 05 00 <48> c7 00 d0 57 c5 a0 48 89 50 08 48 89 02 39 ab 88 0a 00 00 0f 86
 RSP: 0018:ffff888116ee3df8 EFLAGS: 00010296
 RAX: 0000000000000000 RBX: ffff8881154f6000 RCX: 0000000000000080
 RDX: ffffffffa0c557d0 RSI: ffff88810b69d200 RDI: 000000000002d8a0
 RBP: 0000000000000002 R08: ffff888110780408 R09: 0000000000000000
 R10: ffff88812452e1c0 R11: fffffffffff7e028 R12: 0000000000000000
 R13: 0000000000000080 R14: ffff888102c58000 R15: 0000000000000000
 FS:  00007f884393a740(0000) GS:ffff8882f5a00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 00000001249f6004 CR4: 0000000000370ea0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  mlx5_ib_stage_init_cleanup+0x16/0xd0 [mlx5_ib]
  __mlx5_ib_remove+0x33/0x90 [mlx5_ib]
  mlx5r_remove+0x22/0x30 [mlx5_ib]
  auxiliary_bus_remove+0x18/0x30
  __device_release_driver+0x177/0x220
  driver_detach+0xc4/0x100
  bus_remove_driver+0x58/0xd0
  auxiliary_driver_unregister+0x12/0x20
  mlx5_ib_cleanup+0x13/0x897 [mlx5_ib]
  __x64_sys_delete_module+0x154/0x230
  ? exit_to_user_mode_prepare+0x104/0x140
  do_syscall_64+0x3f/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f8842e095c7
 Code: 73 01 c3 48 8b 0d d9 48 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a9 48 2c 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffc68f6e758 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 00005638207929c0 RCX: 00007f8842e095c7
 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000563820792a28
 RBP: 00005638207929c0 R08: 00007ffc68f6d701 R09: 0000000000000000
 R10: 00007f8842e82880 R11: 0000000000000206 R12: 0000563820792a28
 R13: 0000000000000001 R14: 0000563820792a28 R15: 00007ffc68f6fb40
 Modules linked in: xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm ib_umad mlx5_ib(-) mlx4_ib ib_uverbs ib_core mlx4_en mlx4_core mlx5_core ptp pps_core [last unloaded: rpcrdma]
 CR2: 0000000000000000
 ---[ end trace a0bb7e20804e9e9b ]---

Fixes: 7ce6095e3bff ("RDMA/mlx5: Don't add slave port to unaffiliated list")
Link: https://lore.kernel.org/r/899ac1b33a995be5ec0e16a4765c4e43c2b1ba5b.1624956444.git.leonro@nvidia.com
Reviewed-by: Itay Aveksis <itayav@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 6a45ece4c9af473555f01f0f8b97eba56e3c7d0d ]

io_remap_pfn_range() will trigger a BUG_ON if it encounters a
populated pte within the mapping range.  This can occur because we map
the entire vma on fault and multiple faults can be blocked behind the
vma_lock.  This leads to traces like the one reported below.

We can use our vma_list to test whether a given vma is mapped to avoid
this issue.

[ 1591.733256] kernel BUG at mm/memory.c:2177!
[ 1591.739515] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 1591.747381] Modules linked in: vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O)
[ 1591.760536] CPU: 2 PID: 227 Comm: lcore-worker-2 Tainted: G O 5.11.0-rc3+ #1
[ 1591.770735] Hardware name:  , BIOS HixxxxFPGA 1P B600 V121-1
[ 1591.778872] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--)
[ 1591.786134] pc : remap_pfn_range+0x214/0x340
[ 1591.793564] lr : remap_pfn_range+0x1b8/0x340
[ 1591.799117] sp : ffff80001068bbd0
[ 1591.803476] x29: ffff80001068bbd0 x28: 0000042eff6f0000
[ 1591.810404] x27: 0000001100910000 x26: 0000001300910000
[ 1591.817457] x25: 0068000000000fd3 x24: ffffa92f1338e358
[ 1591.825144] x23: 0000001140000000 x22: 0000000000000041
[ 1591.832506] x21: 0000001300910000 x20: ffffa92f141a4000
[ 1591.839520] x19: 0000001100a00000 x18: 0000000000000000
[ 1591.846108] x17: 0000000000000000 x16: ffffa92f11844540
[ 1591.853570] x15: 0000000000000000 x14: 0000000000000000
[ 1591.860768] x13: fffffc0000000000 x12: 0000000000000880
[ 1591.868053] x11: ffff0821bf3d01d0 x10: ffff5ef2abd89000
[ 1591.875932] x9 : ffffa92f12ab0064 x8 : ffffa92f136471c0
[ 1591.883208] x7 : 0000001140910000 x6 : 0000000200000000
[ 1591.890177] x5 : 0000000000000001 x4 : 0000000000000001
[ 1591.896656] x3 : 0000000000000000 x2 : 0168044000000fd3
[ 1591.903215] x1 : ffff082126261880 x0 : fffffc2084989868
[ 1591.910234] Call trace:
[ 1591.914837]  remap_pfn_range+0x214/0x340
[ 1591.921765]  vfio_pci_mmap_fault+0xac/0x130 [vfio_pci]
[ 1591.931200]  __do_fault+0x44/0x12c
[ 1591.937031]  handle_mm_fault+0xcc8/0x1230
[ 1591.942475]  do_page_fault+0x16c/0x484
[ 1591.948635]  do_translation_fault+0xbc/0xd8
[ 1591.954171]  do_mem_abort+0x4c/0xc0
[ 1591.960316]  el0_da+0x40/0x80
[ 1591.965585]  el0_sync_handler+0x168/0x1b0
[ 1591.971608]  el0_sync+0x174/0x180
[ 1591.978312] Code: eb1b027f 540000c0 f9400022 b4fffe02 (d4210000)

Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking")
Reported-by: Zeng Tao <prime.zeng@hisilicon.com>
Suggested-by: Zeng Tao <prime.zeng@hisilicon.com>
Link: https://lore.kernel.org/r/162497742783.3883260.3282953006487785034.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 6a1e5a4af17e440dd82a58a2c5f40ff17a82b722 ]

When 'nicstar_init_one' fails, 'ns_init_card_error' will be executed for
error handling, but the correct memory free function should be used,
otherwise it will cause an error. Since 'card->rsq.org' and
'card->tsq.org' are allocated using 'dma_alloc_coherent' function, they
should be freed using 'dma_free_coherent'.

Fix this by using 'dma_free_coherent' instead of 'kfree'

This log reveals it:

[    3.440294] kernel BUG at mm/slub.c:4206!
[    3.441059] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[    3.441430] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty #141
[    3.441986] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[    3.442780] RIP: 0010:kfree+0x26a/0x300
[    3.443065] Code: e8 3a c3 b9 ff e9 d6 fd ff ff 49 8b 45 00 31 db a9 00 00 01 00 75 4d 49 8b 45 00 a9 00 00 01 00 75 0a 49 8b 45 08 a8 01 75 02 <0f> 0b 89 d9 b8 00 10 00 00 be 06 00 00 00 48 d3 e0 f7 d8 48 63 d0
[    3.443396] RSP: 0000:ffffc90000017b70 EFLAGS: 00010246
[    3.443396] RAX: dead000000000100 RBX: 0000000000000000 RCX: 0000000000000000
[    3.443396] RDX: 0000000000000000 RSI: ffffffff85d3df94 RDI: ffffffff85df38e6
[    3.443396] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000001
[    3.443396] R10: 0000000000000000 R11: 0000000000000001 R12: ffff888107dc0000
[    3.443396] R13: ffffea00001f0100 R14: ffff888101a8bf00 R15: ffff888107dc0160
[    3.443396] FS:  0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000
[    3.443396] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.443396] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0
[    3.443396] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    3.443396] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    3.443396] Call Trace:
[    3.443396]  ns_init_card_error+0x12c/0x220
[    3.443396]  nicstar_init_one+0x10d2/0x1130
[    3.443396]  local_pci_probe+0x4a/0xb0
[    3.443396]  pci_device_probe+0x126/0x1d0
[    3.443396]  ? pci_device_remove+0x100/0x100
[    3.443396]  really_probe+0x27e/0x650
[    3.443396]  driver_probe_device+0x84/0x1d0
[    3.443396]  ? mutex_lock_nested+0x16/0x20
[    3.443396]  device_driver_attach+0x63/0x70
[    3.443396]  __driver_attach+0x117/0x1a0
[    3.443396]  ? device_driver_attach+0x70/0x70
[    3.443396]  bus_for_each_dev+0xb6/0x110
[    3.443396]  ? rdinit_setup+0x40/0x40
[    3.443396]  driver_attach+0x22/0x30
[    3.443396]  bus_add_driver+0x1e6/0x2a0
[    3.443396]  driver_register+0xa4/0x180
[    3.443396]  __pci_register_driver+0x77/0x80
[    3.443396]  ? uPD98402_module_init+0xd/0xd
[    3.443396]  nicstar_init+0x1f/0x75
[    3.443396]  do_one_initcall+0x7a/0x3d0
[    3.443396]  ? rdinit_setup+0x40/0x40
[    3.443396]  ? rcu_read_lock_sched_held+0x4a/0x70
[    3.443396]  kernel_init_freeable+0x2a7/0x2f9
[    3.443396]  ? rest_init+0x2c0/0x2c0
[    3.443396]  kernel_init+0x13/0x180
[    3.443396]  ? rest_init+0x2c0/0x2c0
[    3.443396]  ? rest_init+0x2c0/0x2c0
[    3.443396]  ret_from_fork+0x1f/0x30
[    3.443396] Modules linked in:
[    3.443396] Dumping ftrace buffer:
[    3.443396]    (ftrace buffer empty)
[    3.458593] ---[ end trace 3c6f8f0d8ef59bcd ]---
[    3.458922] RIP: 0010:kfree+0x26a/0x300
[    3.459198] Code: e8 3a c3 b9 ff e9 d6 fd ff ff 49 8b 45 00 31 db a9 00 00 01 00 75 4d 49 8b 45 00 a9 00 00 01 00 75 0a 49 8b 45 08 a8 01 75 02 <0f> 0b 89 d9 b8 00 10 00 00 be 06 00 00 00 48 d3 e0 f7 d8 48 63 d0
[    3.460499] RSP: 0000:ffffc90000017b70 EFLAGS: 00010246
[    3.460870] RAX: dead000000000100 RBX: 0000000000000000 RCX: 0000000000000000
[    3.461371] RDX: 0000000000000000 RSI: ffffffff85d3df94 RDI: ffffffff85df38e6
[    3.461873] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000001
[    3.462372] R10: 0000000000000000 R11: 0000000000000001 R12: ffff888107dc0000
[    3.462871] R13: ffffea00001f0100 R14: ffff888101a8bf00 R15: ffff888107dc0160
[    3.463368] FS:  0000000000000000(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000
[    3.463949] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.464356] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0
[    3.464856] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    3.465356] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    3.465860] Kernel panic - not syncing: Fatal exception
[    3.466370] Dumping ftrace buffer:
[    3.466616]    (ftrace buffer empty)
[    3.466871] Kernel Offset: disabled
[    3.467122] Rebooting in 1 seconds..

Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 0ea9fd001a14ebc294f112b0361a4e601551d508 ]

Rfkill block and unblock Intel USB Bluetooth [8087:0026] may make it
stops working:
[  509.691509] Bluetooth: hci0: HCI reset during shutdown failed
[  514.897584] Bluetooth: hci0: MSFT filter_enable is already on
[  530.044751] usb 3-10: reset full-speed USB device number 5 using xhci_hcd
[  545.660350] usb 3-10: device descriptor read/64, error -110
[  561.283530] usb 3-10: device descriptor read/64, error -110
[  561.519682] usb 3-10: reset full-speed USB device number 5 using xhci_hcd
[  566.686650] Bluetooth: hci0: unexpected event for opcode 0x0500
[  568.752452] Bluetooth: hci0: urb 0000000096cd309b failed to resubmit (113)
[  578.797955] Bluetooth: hci0: Failed to read MSFT supported features (-110)
[  586.286565] Bluetooth: hci0: urb 00000000c522f633 failed to resubmit (113)
[  596.215302] Bluetooth: hci0: Failed to read MSFT supported features (-110)

Or kernel panics because other workqueues already freed skb:
[ 2048.663763] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 2048.663775] #PF: supervisor read access in kernel mode
[ 2048.663779] #PF: error_code(0x0000) - not-present page
[ 2048.663782] PGD 0 P4D 0
[ 2048.663787] Oops: 0000 [#1] SMP NOPTI
[ 2048.663793] CPU: 3 PID: 4491 Comm: rfkill Tainted: G        W         5.13.0-rc1-next-20210510+ #20
[ 2048.663799] Hardware name: HP HP EliteBook 850 G8 Notebook PC/8846, BIOS T76 Ver. 01.01.04 12/02/2020
[ 2048.663801] RIP: 0010:__skb_ext_put+0x6/0x50
[ 2048.663814] Code: 8b 1b 48 85 db 75 db 5b 41 5c 5d c3 be 01 00 00 00 e8 de 13 c0 ff eb e7 be 02 00 00 00 e8 d2 13 c0 ff eb db 0f 1f 44 00 00 55 <8b> 07 48 89 e5 83 f8 01 74 14 b8 ff ff ff ff f0 0f c1
07 83 f8 01
[ 2048.663819] RSP: 0018:ffffc1d105b6fd80 EFLAGS: 00010286
[ 2048.663824] RAX: 0000000000000000 RBX: ffff9d9ac5649000 RCX: 0000000000000000
[ 2048.663827] RDX: ffffffffc0d1daf6 RSI: 0000000000000206 RDI: 0000000000000000
[ 2048.663830] RBP: ffffc1d105b6fd98 R08: 0000000000000001 R09: ffff9d9ace8ceac0
[ 2048.663834] R10: ffff9d9ace8ceac0 R11: 0000000000000001 R12: ffff9d9ac5649000
[ 2048.663838] R13: 0000000000000000 R14: 00007ffe0354d650 R15: 0000000000000000
[ 2048.663843] FS:  00007fe02ab19740(0000) GS:ffff9d9e5f8c0000(0000) knlGS:0000000000000000
[ 2048.663849] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2048.663853] CR2: 0000000000000000 CR3: 0000000111a52004 CR4: 0000000000770ee0
[ 2048.663856] PKRU: 55555554
[ 2048.663859] Call Trace:
[ 2048.663865]  ? skb_release_head_state+0x5e/0x80
[ 2048.663873]  kfree_skb+0x2f/0xb0
[ 2048.663881]  btusb_shutdown_intel_new+0x36/0x60 [btusb]
[ 2048.663905]  hci_dev_do_close+0x48c/0x5e0 [bluetooth]
[ 2048.663954]  ? __cond_resched+0x1a/0x50
[ 2048.663962]  hci_rfkill_set_block+0x56/0xa0 [bluetooth]
[ 2048.664007]  rfkill_set_block+0x98/0x170
[ 2048.664016]  rfkill_fop_write+0x136/0x1e0
[ 2048.664022]  vfs_write+0xc7/0x260
[ 2048.664030]  ksys_write+0xb1/0xe0
[ 2048.664035]  ? exit_to_user_mode_prepare+0x37/0x1c0
[ 2048.664042]  __x64_sys_write+0x1a/0x20
[ 2048.664048]  do_syscall_64+0x40/0xb0
[ 2048.664055]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2048.664060] RIP: 0033:0x7fe02ac23c27
[ 2048.664066] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 2048.664070] RSP: 002b:00007ffe0354d638 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 2048.664075] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe02ac23c27
[ 2048.664078] RDX: 0000000000000008 RSI: 00007ffe0354d650 RDI: 0000000000000003
[ 2048.664081] RBP: 0000000000000000 R08: 0000559b05998440 R09: 0000559b05998440
[ 2048.664084] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
[ 2048.664086] R13: 0000000000000000 R14: ffffffff00000000 R15: 00000000ffffffff

So move the shutdown callback to a place where workqueues are either
flushed or cancelled to resolve the issue.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit 93aa71ad7379900e61c8adff6a710a4c18c7c99b upstream.

Commit 66a834d09293 ("scsi: core: Fix error handling of scsi_host_alloc()")
changed the allocation logic to call put_device() to perform host cleanup
with the assumption that IDA removal and stopping the kthread would
properly be performed in scsi_host_dev_release(). However, in the unlikely
case that the error handler thread fails to spawn, shost->ehandler is set
to ERR_PTR(-ENOMEM).

The error handler cleanup code in scsi_host_dev_release() will call
kthread_stop() if shost->ehandler != NULL which will always be the case
whether the kthread was successfully spawned or not. In the case that it
failed to spawn this has the nasty side effect of trying to dereference an
invalid pointer when kthread_stop() is called. The following splat provides
an example of this behavior in the wild:

scsi host11: error handler thread failed to spawn, error = -4
Kernel attempted to read user page (10c) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x0000010c
Faulting instruction address: 0xc00000000818e9a8
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvscsi(+) scsi_transport_srp dm_multipath dm_mirror dm_region
 hash dm_log dm_mod fuse overlay squashfs loop
CPU: 12 PID: 274 Comm: systemd-udevd Not tainted 5.13.0-rc7 #1
NIP:  c00000000818e9a8 LR: c0000000089846e8 CTR: 0000000000007ee8
REGS: c000000037d12ea0 TRAP: 0300   Not tainted  (5.13.0-rc7)
MSR:  800000000280b033 &lt;SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE&gt;  CR: 28228228
XER: 20040001
CFAR: c0000000089846e4 DAR: 000000000000010c DSISR: 40000000 IRQMASK: 0
GPR00: c0000000089846e8 c000000037d13140 c000000009cc1100 fffffffffffffffc
GPR04: 0000000000000001 0000000000000000 0000000000000000 c000000037dc0000
GPR08: 0000000000000000 c000000037dc0000 0000000000000001 00000000fffff7ff
GPR12: 0000000000008000 c00000000a049000 c000000037d13d00 000000011134d5a0
GPR16: 0000000000001740 c0080000190d0000 c0080000190d1740 c000000009129288
GPR20: c000000037d13bc0 0000000000000001 c000000037d13bc0 c0080000190b7898
GPR24: c0080000190b7708 0000000000000000 c000000033bb2c48 0000000000000000
GPR28: c000000046b28280 0000000000000000 000000000000010c fffffffffffffffc
NIP [c00000000818e9a8] kthread_stop+0x38/0x230
LR [c0000000089846e8] scsi_host_dev_release+0x98/0x160
Call Trace:
[c000000033bb2c48] 0xc000000033bb2c48 (unreliable)
[c0000000089846e8] scsi_host_dev_release+0x98/0x160
[c00000000891e960] device_release+0x60/0x100
[c0000000087e55c4] kobject_release+0x84/0x210
[c00000000891ec78] put_device+0x28/0x40
[c000000008984ea4] scsi_host_alloc+0x314/0x430
[c0080000190b38bc] ibmvscsi_probe+0x54/0xad0 [ibmvscsi]
[c000000008110104] vio_bus_probe+0xa4/0x4b0
[c00000000892a860] really_probe+0x140/0x680
[c00000000892aefc] driver_probe_device+0x15c/0x200
[c00000000892b63c] device_driver_attach+0xcc/0xe0
[c00000000892b740] __driver_attach+0xf0/0x200
[c000000008926f28] bus_for_each_dev+0xa8/0x130
[c000000008929ce4] driver_attach+0x34/0x50
[c000000008928fc0] bus_add_driver+0x1b0/0x300
[c00000000892c798] driver_register+0x98/0x1a0
[c00000000810eb60] __vio_register_driver+0x80/0xe0
[c0080000190b4a30] ibmvscsi_module_init+0x9c/0xdc [ibmvscsi]
[c0000000080121d0] do_one_initcall+0x60/0x2d0
[c000000008261abc] do_init_module+0x7c/0x320
[c000000008265700] load_module+0x2350/0x25b0
[c000000008265cb4] __do_sys_finit_module+0xd4/0x160
[c000000008031110] system_call_exception+0x150/0x2d0
[c00000000800d35c] system_call_common+0xec/0x278

Fix this be nulling shost->ehandler when the kthread fails to spawn.

Link: https://lore.kernel.org/r/20210701195659.3185475-1-tyreld@linux.ibm.com
Fixes: 66a834d09293 ("scsi: core: Fix error handling of scsi_host_alloc()")
Cc: stable@vger.kernel.org
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit 704adfb5a9978462cd861f170201ae2b5e3d3a80 upstream.

The histogram logic was allowing events with char * pointers to be used as
normal strings. But it was easy to crash the kernel with:

 # echo 'hist:keys=filename' > events/syscalls/sys_enter_openat/trigger

And open some files, and boom!

 BUG: unable to handle page fault for address: 00007f2ced0c3280
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 1173fa067 P4D 1173fa067 PUD 1171b6067 PMD 1171dd067 PTE 0
 Oops: 0000 [#1] PREEMPT SMP
 CPU: 6 PID: 1810 Comm: cat Not tainted 5.13.0-rc5-test+ #61
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01
v03.03 07/14/2016
 RIP: 0010:strlen+0x0/0x20
 Code: f6 82 80 2a 0b a9 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2a 0b
a9 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74
10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3

 RSP: 0018:ffffbdbf81567b50 EFLAGS: 00010246
 RAX: 0000000000000003 RBX: ffff93815cdb3800 RCX: ffff9382401a22d0
 RDX: 0000000000000100 RSI: 0000000000000000 RDI: 00007f2ced0c3280
 RBP: 0000000000000100 R08: ffff9382409ff074 R09: ffffbdbf81567c98
 R10: ffff9382409ff074 R11: 0000000000000000 R12: ffff9382409ff074
 R13: 0000000000000001 R14: ffff93815a744f00 R15: 00007f2ced0c3280
 FS:  00007f2ced0f8580(0000) GS:ffff93825a800000(0000)
knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f2ced0c3280 CR3: 0000000107069005 CR4: 00000000001706e0
 Call Trace:
  event_hist_trigger+0x463/0x5f0
  ? find_held_lock+0x32/0x90
  ? sched_clock_cpu+0xe/0xd0
  ? lock_release+0x155/0x440
  ? kernel_init_free_pages+0x6d/0x90
  ? preempt_count_sub+0x9b/0xd0
  ? kernel_init_free_pages+0x6d/0x90
  ? get_page_from_freelist+0x12c4/0x1680
  ? __rb_reserve_next+0xe5/0x460
  ? ring_buffer_lock_reserve+0x12a/0x3f0
  event_triggers_call+0x52/0xe0
  ftrace_syscall_enter+0x264/0x2c0
  syscall_trace_enter.constprop.0+0x1ee/0x210
  do_syscall_64+0x1c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Where it triggered a fault on strlen(key) where key was the filename.

The reason is that filename is a char * to user space, and the histogram
code just blindly dereferenced it, with obvious bad results.

I originally tried to use strncpy_from_user/kernel_nofault() but found
that there's other places that its dereferenced and not worth the effort.

Just do not allow "char *" to act like strings.

Link: https://lkml.kernel.org/r/20210715000206.025df9d2@rorschach.local.home

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Tom Zanussi <zanussi@kernel.org>
Fixes: 79e577c ("tracing: Support string type key properly")
Fixes: 5967bd5c4239 ("tracing: Let filter_assign_type() detect FILTER_PTR_STRING")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit 54f93336d000229f72c26d8a3f69dd256b744528 upstream.

We get a bug during ltp can_filter test as following.

===========================================
[60919.264984] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[60919.265223] PGD 8000003dda726067 P4D 8000003dda726067 PUD 3dda727067 PMD 0
[60919.265443] Oops: 0000 [#1] SMP PTI
[60919.265550] CPU: 30 PID: 3638365 Comm: can_filter Kdump: loaded Tainted: G        W         4.19.90+ #1
[60919.266068] RIP: 0010:selinux_socket_sock_rcv_skb+0x3e/0x200
[60919.293289] RSP: 0018:ffff8d53bfc03cf8 EFLAGS: 00010246
[60919.307140] RAX: 0000000000000000 RBX: 000000000000001d RCX: 0000000000000007
[60919.320756] RDX: 0000000000000001 RSI: ffff8d5104a8ed00 RDI: ffff8d53bfc03d30
[60919.334319] RBP: ffff8d9338056800 R08: ffff8d53bfc29d80 R09: 0000000000000001
[60919.347969] R10: ffff8d53bfc03ec0 R11: ffffb8526ef47c98 R12: ffff8d53bfc03d30
[60919.350320] perf: interrupt took too long (3063 > 2500), lowering kernel.perf_event_max_sample_rate to 65000
[60919.361148] R13: 0000000000000001 R14: ffff8d53bcf90000 R15: 0000000000000000
[60919.361151] FS:  00007fb78b6b3600(0000) GS:ffff8d53bfc00000(0000) knlGS:0000000000000000
[60919.400812] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[60919.413730] CR2: 0000000000000010 CR3: 0000003e3f784006 CR4: 00000000007606e0
[60919.426479] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[60919.439339] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[60919.451608] PKRU: 55555554
[60919.463622] Call Trace:
[60919.475617]  <IRQ>
[60919.487122]  ? update_load_avg+0x89/0x5d0
[60919.498478]  ? update_load_avg+0x89/0x5d0
[60919.509822]  ? account_entity_enqueue+0xc5/0xf0
[60919.520709]  security_sock_rcv_skb+0x2a/0x40
[60919.531413]  sk_filter_trim_cap+0x47/0x1b0
[60919.542178]  ? kmem_cache_alloc+0x38/0x1b0
[60919.552444]  sock_queue_rcv_skb+0x17/0x30
[60919.562477]  raw_rcv+0x110/0x190 [can_raw]
[60919.572539]  can_rcv_filter+0xbc/0x1b0 [can]
[60919.582173]  can_receive+0x6b/0xb0 [can]
[60919.591595]  can_rcv+0x31/0x70 [can]
[60919.600783]  __netif_receive_skb_one_core+0x5a/0x80
[60919.609864]  process_backlog+0x9b/0x150
[60919.618691]  net_rx_action+0x156/0x400
[60919.627310]  ? sched_clock_cpu+0xc/0xa0
[60919.635714]  __do_softirq+0xe8/0x2e9
[60919.644161]  do_softirq_own_stack+0x2a/0x40
[60919.652154]  </IRQ>
[60919.659899]  do_softirq.part.17+0x4f/0x60
[60919.667475]  __local_bh_enable_ip+0x60/0x70
[60919.675089]  __dev_queue_xmit+0x539/0x920
[60919.682267]  ? finish_wait+0x80/0x80
[60919.689218]  ? finish_wait+0x80/0x80
[60919.695886]  ? sock_alloc_send_pskb+0x211/0x230
[60919.702395]  ? can_send+0xe5/0x1f0 [can]
[60919.708882]  can_send+0xe5/0x1f0 [can]
[60919.715037]  raw_sendmsg+0x16d/0x268 [can_raw]

It's because raw_setsockopt() concurrently with
unregister_netdevice_many(). Concurrent scenario as following.

	cpu0						cpu1
raw_bind
raw_setsockopt					unregister_netdevice_many
						unlist_netdevice
dev_get_by_index				raw_notifier
raw_enable_filters				......
can_rx_register
can_rcv_list_find(..., net->can.rx_alldev_list)

......

sock_close
raw_release(sock_a)

......

can_receive
can_rcv_filter(net->can.rx_alldev_list, ...)
raw_rcv(skb, sock_a)
BUG

After unlist_netdevice(), dev_get_by_index() return NULL in
raw_setsockopt(). Function raw_enable_filters() will add sock
and can_filter to net->can.rx_alldev_list. Then the sock is closed.
Followed by, we sock_sendmsg() to a new vcan device use the same
can_filter. Protocol stack match the old receiver whose sock has
been released on net->can.rx_alldev_list in can_rcv_filter().
Function raw_rcv() uses the freed sock. UAF BUG is triggered.

We can find that the key issue is that net_device has not been
protected in raw_setsockopt(). Use rtnl_lock to protect net_device
in raw_setsockopt().

Fixes: c18ce10 ("[CAN]: Add raw protocol")
Link: https://lore.kernel.org/r/20210722070819.1048263-1-william.xuanziyang@huawei.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 6206b7981a36476f4695d661ae139f7db36a802d ]

Liajian reported a bug_on hit on a ThunderX2 arm64 server with FastLinQ
QL41000 ethernet controller:
 BUG: scheduling while atomic: kworker/0:4/531/0x00000200
  [qed_probe:488()]hw prepare failed
  kernel BUG at mm/vmalloc.c:2355!
  Internal error: Oops - BUG: 0 [#1] SMP
  CPU: 0 PID: 531 Comm: kworker/0:4 Tainted: G W 5.4.0-77-generic #86-Ubuntu
  pstate: 00400009 (nzcv daif +PAN -UAO)
 Call trace:
  vunmap+0x4c/0x50
  iounmap+0x48/0x58
  qed_free_pci+0x60/0x80 [qed]
  qed_probe+0x35c/0x688 [qed]
  __qede_probe+0x88/0x5c8 [qede]
  qede_probe+0x60/0xe0 [qede]
  local_pci_probe+0x48/0xa0
  work_for_cpu_fn+0x24/0x38
  process_one_work+0x1d0/0x468
  worker_thread+0x238/0x4e0
  kthread+0xf0/0x118
  ret_from_fork+0x10/0x18

In this case, qed_hw_prepare() returns error due to hw/fw error, but in
theory work queue should be in process context instead of interrupt.

The root cause might be the unpaired spin_{un}lock_bh() in
_qed_mcp_cmd_and_union(), which causes botton half is disabled incorrectly.

Reported-by: Lijian Zhang <Lijian.Zhang@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit a17ad0961706244dce48ec941f7e476a38c0e727 ]

In some cases skb head could be locked and entire header
data is pulled from skb. When skb_zerocopy() called in such cases,
following BUG is triggered. This patch fixes it by copying entire
skb in such cases.
This could be optimized incase this is performance bottleneck.

---8<---
kernel BUG at net/core/skbuff.c:2961!
invalid opcode: 0000 [#1] SMP PTI
CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           OE     5.4.0-77-generic #86-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:skb_zerocopy+0x37a/0x3a0
RSP: 0018:ffffbcc70013ca38 EFLAGS: 00010246
Call Trace:
 <IRQ>
 queue_userspace_packet+0x2af/0x5e0 [openvswitch]
 ovs_dp_upcall+0x3d/0x60 [openvswitch]
 ovs_dp_process_packet+0x125/0x150 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 netdev_port_receive+0x87/0x130 [openvswitch]
 netdev_frame_hook+0x4b/0x60 [openvswitch]
 __netif_receive_skb_core+0x2b4/0xc90
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x18/0x60
 process_backlog+0xa9/0x160
 net_rx_action+0x142/0x390
 __do_softirq+0xe1/0x2d6
 irq_exit+0xae/0xb0
 do_IRQ+0x5a/0xf0
 common_interrupt+0xf/0xf

Code that triggered BUG:
int
skb_zerocopy(struct sk_buff *to, struct sk_buff *from, int len, int hlen)
{
        int i, j = 0;
        int plen = 0; /* length of skb->head fragment */
        int ret;
        struct page *page;
        unsigned int offset;

        BUG_ON(!from->head_frag && !hlen);

Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit 894c9ef9780c5cf2f143415e867ee39a33ecb75d upstream.

Configuring an instance's parallel mask without any online CPUs...

  echo 2 > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
  echo 0 > /sys/devices/system/cpu/cpu1/online

...makes tcrypt mode=215 crash like this:

  divide error: 0000 [#1] SMP PTI
  CPU: 4 PID: 283 Comm: modprobe Not tainted 5.4.0-rc8-padata-doc-v2+ #2
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014
  RIP: 0010:padata_do_parallel+0x114/0x300
  Call Trace:
   pcrypt_aead_encrypt+0xc0/0xd0 [pcrypt]
   crypto_aead_encrypt+0x1f/0x30
   do_mult_aead_op+0x4e/0xdf [tcrypt]
   test_mb_aead_speed.constprop.0.cold+0x226/0x564 [tcrypt]
   do_test+0x28c2/0x4d49 [tcrypt]
   tcrypt_mod_init+0x55/0x1000 [tcrypt]
   ...

cpumask_weight() in padata_cpu_hash() returns 0 because the mask has no
CPUs.  The problem is __padata_remove_cpu() checks for valid masks too
early and so doesn't mark the instance PADATA_INVALID as expected, which
would have made padata_do_parallel() return error before doing the
division.

Fix by introducing a second padata CPU hotplug state before
CPUHP_BRINGUP_CPU so that __padata_remove_cpu() sees the online mask
without @cpu.  No need for the second argument to padata_replace() since
@cpu is now already missing from the online mask.

Fixes: 33e5445 ("padata: Handle empty padata cpumasks")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
commit 5648c073c33d33a0a19d0cb1194a4eb88efe2b71 upstream.

Add the following Telit FD980 composition 0x1056:

Cfg #1: mass storage
Cfg #2: rndis, tty, adb, tty, tty, tty, tty

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Link: https://lore.kernel.org/r/20210803194711.3036-1-dnlplm@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
[ Upstream commit 2acf15b94d5b8ea8392c4b6753a6ffac3135cd78 ]

Our syzcaller report a NULL pointer dereference:

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 116e95067 P4D 116e95067 PUD 1080b5067 PMD 0
Oops: 0010 [#1] SMP KASAN
CPU: 7 PID: 592 Comm: a.out Not tainted 5.13.0-next-20210629-dirty #67
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-p4
RIP: 0010:0x0
Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
RSP: 0018:ffff888114e779b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 1ffff110229cef39 RCX: ffffffffaa67e1aa
RDX: 0000000000000000 RSI: ffff88810a58ee00 RDI: ffff8881233180b0
RBP: ffffffffac38e9c0 R08: ffffffffaa67e17e R09: 0000000000000001
R10: ffffffffb91c5557 R11: fffffbfff7238aaa R12: ffff88810a58ee00
R13: ffff888114e77aa0 R14: 0000000000000000 R15: ffff8881233180b0
FS:  00007f946163c480(0000) GS:ffff88839f1c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000001099c1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 __lookup_slow+0x116/0x2d0
 ? page_put_link+0x120/0x120
 ? __d_lookup+0xfc/0x320
 ? d_lookup+0x49/0x90
 lookup_one_len+0x13c/0x170
 ? __lookup_slow+0x2d0/0x2d0
 ? reiserfs_schedule_old_flush+0x31/0x130
 reiserfs_lookup_privroot+0x64/0x150
 reiserfs_fill_super+0x158c/0x1b90
 ? finish_unfinished+0xb10/0xb10
 ? bprintf+0xe0/0xe0
 ? __mutex_lock_slowpath+0x30/0x30
 ? __kasan_check_write+0x20/0x30
 ? up_write+0x51/0xb0
 ? set_blocksize+0x9f/0x1f0
 mount_bdev+0x27c/0x2d0
 ? finish_unfinished+0xb10/0xb10
 ? reiserfs_kill_sb+0x120/0x120
 get_super_block+0x19/0x30
 legacy_get_tree+0x76/0xf0
 vfs_get_tree+0x49/0x160
 ? capable+0x1d/0x30
 path_mount+0xacc/0x1380
 ? putname+0x97/0xd0
 ? finish_automount+0x450/0x450
 ? kmem_cache_free+0xf8/0x5a0
 ? putname+0x97/0xd0
 do_mount+0xe2/0x110
 ? path_mount+0x1380/0x1380
 ? copy_mount_options+0x69/0x140
 __x64_sys_mount+0xf0/0x190
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

This is because 'root_inode' is initialized with wrong mode, and
it's i_op is set to 'reiserfs_special_inode_operations'. Thus add
check for 'root_inode' to fix the problem.

Link: https://lore.kernel.org/r/20210702040743.1918552-1-yukuai3@huawei.com
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
Subsequent to commit 4d7f8d7 ("Revert "sched: Prevent raising SCHED_SOFTIRQ when CPU is !active"")

A CPU can be paused while it is idle with it's tick stopped.
nohz_balance_exit_idle() should be called from the local CPU,
so it can't be called during pause which can happen remotely.
This results in paused CPU participating in the nohz idle balance,
which should be avoided. This can be done by calling

Fix this issue by calling nohz_balance_exit_idle() from the paused
CPU when it exits and enters idle again. This lazy approach avoids
waking the CPU from idle during pause.

[@0ctobot: This is an attempt to silence the following dmesg WARN_ON:
[    6.081902] ------------[ cut here ]------------
[    6.081909] (flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK
[    6.081922] WARNING: CPU: 6 PID: 0 at _nohz_idle_balance+0x408/0x410
[    6.081926] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G S                4.19.202-NeutrinoKernel-lothal-EXP #1
[    6.081928] Hardware name: Qualcomm Technologies, Inc. kona MTP 19805 20809 (DT)
[    6.081931] pstate: 60400005 (nZCv daif +PAN -UAO)
[    6.081933] pc : _nohz_idle_balance+0x408/0x410
[    6.081935] lr : _nohz_idle_balance+0x400/0x410
[    6.081937] sp : ffffff8010033df0
[    6.081938] x29: ffffff8010033e30 x28: ffffff9d37e75040
[    6.081940] x27: ffffff9d37e572c0 x26: ffffff9d35c96050
[    6.081942] x25: 0000000000000007 x24: 0000000000000001
[    6.081944] x23: ffffff9d37e76000 x22: fffffffcb39e2a00
[    6.081946] x21: 0000000000000006 x20: 00000000ffff8d30
[    6.081947] x19: 0000000000000000 x18: ffffff8010035038
[    6.081949] x17: 0000000000000008 x16: 0000000000000000
[    6.081951] x15: 0000000000000008 x14: ffffff9d37f6dc18
[    6.081953] x13: 0000000000000004 x12: ffffff9d37e76000
[    6.081954] x11: 0000000000000000 x10: 0000000000000000
[    6.081956] x9 : a2d84b4c1c6fa900 x8 : a2d84b4c1c6fa900
[    6.081957] x7 : 000000000000002d x6 : ffffff9d38115e65
[    6.081959] x5 : ffffff9d38113500 x4 : 0000000000000000
[    6.081961] x3 : 000000000000004b x2 : 0000000000000000
[    6.081962] x1 : ffffff9d3811352d x0 : 000000000000002d
[    6.081964] Call trace:
[    6.081966] _nohz_idle_balance+0x408/0x410
[    6.081968] run_rebalance_domains+0x78/0xa8
[    6.081971] __do_softirq+0x144/0x2f4
[    6.081975] irq_exit+0x170/0x184
[    6.081978] scheduler_ipi+0x160/0x1d4
[    6.081980] handle_IPI+0x94/0x3e0
[    6.081982] gic_handle_irq.17896+0x100/0x180
[    6.081984] el1_irq+0xc4/0x140
[    6.081988] lpm_cpuidle_enter+0x61c/0x6d0
[    6.081991] cpuidle_enter_state+0x1dc/0x368
[    6.081993] do_idle+0x67c/0x74c
[    6.081995] cpu_startup_entry+0x58/0x5c
[    6.081997] secondary_start_kernel+0x13c/0x140
[    6.081999] ---[ end trace 6de171506e562a81 ]---]

Bug: 180530906
Change-Id: Ia2dfd9c9cac9b0f37c55a9256b9d5f3141ca0421
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Aug 22, 2021
…stats

In rmnet_shs_wq_refresh_dl_mrkr_stats(), `tdiff` is used as a divisor
and is set to `rmnet_shs_wq_tnsec - tbl_p->l_epoch`. However, the
function called before it, rmnet_shs_wq_refresh_total_stats(), resets
`tbl_p->l_epoch` to `rmnet_shs_wq_tnsec`, resulting in `tdiff` always
being zero inside rmnet_shs_wq_refresh_dl_mrkr_stats(). This results in
a 100% consistent divide-by-zero exception:

[32.709616] Unexpected kernel BRK exception at EL1
[32.709626] Internal error: ptrace BRK handler: f20003e8 [#1] PREEMPT SMP
[32.709639] Process kworker/3:1 (pid: 79, stack limit = 0x0000000068506c9d)
[32.709647] CPU: 3 PID: 79 Comm: kworker/3:1 Tainted: G S      W         4.19.202-NeutrinoKernel-lothal-EXP #1
[32.709651] Hardware name: Qualcomm Technologies, Inc. kona MTP 19805 20809 (DT)
[32.709664] Workqueue: rmnet_shs_wq rmnet_shs_wq_process_wq
[32.709670] pstate: 60c00085 (nZCv daIf +PAN +UAO)
[32.709676] pc : rmnet_shs_wq_update_stats+0x3c8/0x5c0
[32.709680] lr : rmnet_shs_wq_update_stats+0x230/0x5c0
[32.709684] sp : ffffff801054bce0
[32.709687] x29: ffffff801054bce0 x28: ffffffa9fe579cdd
[32.709692] x27: ffffffa9fe57ada0 x26: ffffffa9fe57b1e0
[32.709697] x25: 0000000000000008 x24: ffffffa9fdc15920
[32.709702] x23: ffffffa9fe57e6b0 x22: 0000000000000007
[32.709706] x21: 0000000000000000 x20: 0000000000000000
[32.709711] x19: ffffffa9fddd4840 x18: 0000000000000000
[32.709716] x17: 0000000000000000 x16: 0000000000000005
[32.709720] x15: 0000000000000000 x14: 0000000000000179
[32.709725] x13: 0000000000000000 x12: 0000000000000000
[32.709730] x11: 0000000000000000 x10: 0000000000000000
[32.709734] x9 : 0000000000000000 x8 : 0000000000000000
[32.709739] x7 : 0000000000000000 x6 : 0000000000000000
[32.709743] x5 : 0000000000000000 x4 : ffffffa9fe57b218
[32.709748] x3 : fffffff2f1d31000 x2 : 0000000000000000
[32.709752] x1 : 0000000000000000 x0 : 0000000000000007
[32.709758] Call trace:
[32.709764]  rmnet_shs_wq_update_stats+0x3c8/0x5c0
[32.709768]  rmnet_shs_wq_process_wq+0x54/0xd0
[32.709775]  process_one_work+0x22c/0x3c0
[32.709779]  worker_thread+0x188/0x5e0
[32.709784]  kthread+0x130/0x160
[32.709790]  ret_from_fork+0x10/0x1c
[32.709797] Code: f9449860 f9022f60 f9449c60 f9023360 (d4207d00)
[32.709802] ---[ end trace 1ae09793f9d923fb ]---

Reordering the function calls fixes the issue. It looks like this issue
was hidden by Clang's dead store elimination depending on whether or not
tracepoints were enabled, since the quotients in question are only ever
used as arguments to trace_rmnet_shs_wq_high().

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 18, 2021
…stats

In rmnet_shs_wq_refresh_dl_mrkr_stats(), `tdiff` is used as a divisor
and is set to `rmnet_shs_wq_tnsec - tbl_p->l_epoch`. However, the
function called before it, rmnet_shs_wq_refresh_total_stats(), resets
`tbl_p->l_epoch` to `rmnet_shs_wq_tnsec`, resulting in `tdiff` always
being zero inside rmnet_shs_wq_refresh_dl_mrkr_stats(). This results in
a 100% consistent divide-by-zero exception:

[32.709616] Unexpected kernel BRK exception at EL1
[32.709626] Internal error: ptrace BRK handler: f20003e8 [#1] PREEMPT SMP
[32.709639] Process kworker/3:1 (pid: 79, stack limit = 0x0000000068506c9d)
[32.709647] CPU: 3 PID: 79 Comm: kworker/3:1 Tainted: G S      W         4.19.202-NeutrinoKernel-lothal-EXP #1
[32.709651] Hardware name: Qualcomm Technologies, Inc. kona MTP 19805 20809 (DT)
[32.709664] Workqueue: rmnet_shs_wq rmnet_shs_wq_process_wq
[32.709670] pstate: 60c00085 (nZCv daIf +PAN +UAO)
[32.709676] pc : rmnet_shs_wq_update_stats+0x3c8/0x5c0
[32.709680] lr : rmnet_shs_wq_update_stats+0x230/0x5c0
[32.709684] sp : ffffff801054bce0
[32.709687] x29: ffffff801054bce0 x28: ffffffa9fe579cdd
[32.709692] x27: ffffffa9fe57ada0 x26: ffffffa9fe57b1e0
[32.709697] x25: 0000000000000008 x24: ffffffa9fdc15920
[32.709702] x23: ffffffa9fe57e6b0 x22: 0000000000000007
[32.709706] x21: 0000000000000000 x20: 0000000000000000
[32.709711] x19: ffffffa9fddd4840 x18: 0000000000000000
[32.709716] x17: 0000000000000000 x16: 0000000000000005
[32.709720] x15: 0000000000000000 x14: 0000000000000179
[32.709725] x13: 0000000000000000 x12: 0000000000000000
[32.709730] x11: 0000000000000000 x10: 0000000000000000
[32.709734] x9 : 0000000000000000 x8 : 0000000000000000
[32.709739] x7 : 0000000000000000 x6 : 0000000000000000
[32.709743] x5 : 0000000000000000 x4 : ffffffa9fe57b218
[32.709748] x3 : fffffff2f1d31000 x2 : 0000000000000000
[32.709752] x1 : 0000000000000000 x0 : 0000000000000007
[32.709758] Call trace:
[32.709764]  rmnet_shs_wq_update_stats+0x3c8/0x5c0
[32.709768]  rmnet_shs_wq_process_wq+0x54/0xd0
[32.709775]  process_one_work+0x22c/0x3c0
[32.709779]  worker_thread+0x188/0x5e0
[32.709784]  kthread+0x130/0x160
[32.709790]  ret_from_fork+0x10/0x1c
[32.709797] Code: f9449860 f9022f60 f9449c60 f9023360 (d4207d00)
[32.709802] ---[ end trace 1ae09793f9d923fb ]---

Reordering the function calls fixes the issue. It looks like this issue
was hidden by Clang's dead store elimination depending on whether or not
tracepoints were enabled, since the quotients in question are only ever
used as arguments to trace_rmnet_shs_wq_high().

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 18, 2021
…spend()

Add irq request check condition before enabling swrm interrupt
This will make sure to only enabled interrupt request when it is in disabled state.

This Fixes:
W         : ------------[ cut here ]------------
W         : Unbalanced enable for IRQ 502
W         : WARNING: CPU: 4 PID: 81 at kernel/irq/manage.c:621 enable_irq+0x98/0xf0
I         : Modules linked in:
I         : CPU: 4 PID: 81 Comm: kworker/4:1 Tainted: G S                4.19.197-IMMENSITY-g09a924c384cb #1
I Hardware name: Qualcomm Technologies, Inc. xiaomi alioth (DT)
I Workqueue: pm pm_runtime_work
I pstate  : 60c00085 (nZCv daIf +PAN +UAO)
I pc      : enable_irq+0x98/0xf0
I lr      : enable_irq+0x98/0xf0
I sp      : ffffff800885bbb0
I         : x29: ffffff800885bbc0 x28: ffffffa6fea0db38
I         : x27: 0000000000000002 x26: 0000000000000000
I         : x25: ffffffd1991ef37d x24: 0000000000000000
I         : x23: ffffffd1991edca1 x22: ffffffd199555410
I         : x21: ffffffd1991ef248 x20: 00000000000001f6
I         : x19: ffffffd18cc7b400 x18: ffffffd1b4e9f048
I         : x17: 0000000000000000 x16: 0000000000000000
I         : x15: 0000000000000086 x14: 0000000000000030
I         : x13: 0000000000049754 x12: 0000000000000000
I         : x11: 0000000000000000 x10: 0000000000000007
I         : x9 : 060ca0f25e42ae00 x8 : 060ca0f25e42ae00
I         : x7 : 0000000000000000 x6 : ffffffa6fed3f8e5
I         : x5 : 00000000001b68dc x4 : 000000000000000e
I         : x3 : 0000000000000032 x2 : 0000000000000007
I         : x1 : 0000000000000007 x0 : 000000000000001d
I Call trace:
I         : enable_irq+0x98/0xf0
I         : swrm_runtime_suspend+0x390/0x47c
I         : pm_generic_runtime_suspend+0x28/0x3c
I         : __rpm_callback+0x12c/0x218
I         : rpm_suspend+0x420/0x7cc
I         : pm_runtime_work+0x98/0xa8
I         : process_one_work+0x228/0x3f4
I         : worker_thread+0x264/0x4b0
I         : kthread+0x13c/0x158
I         : ret_from_fork+0x10/0x18
W         : ---[ end trace 56c9cc0df5ea202b ]---

Change-Id: Ic539bfc8d595faf530361d32e0be4ce9009fec08
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 18, 2021
Subsequent to commit 4d7f8d7 ("Revert "sched: Prevent raising SCHED_SOFTIRQ when CPU is !active"")

A CPU can be paused while it is idle with it's tick stopped.
nohz_balance_exit_idle() should be called from the local CPU,
so it can't be called during pause which can happen remotely.
This results in paused CPU participating in the nohz idle balance,
which should be avoided. This can be done by calling

Fix this issue by calling nohz_balance_exit_idle() from the paused
CPU when it exits and enters idle again. This lazy approach avoids
waking the CPU from idle during pause.

[@0ctobot: This is an attempt to silence the following dmesg WARN_ON:
[    6.081902] ------------[ cut here ]------------
[    6.081909] (flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK
[    6.081922] WARNING: CPU: 6 PID: 0 at _nohz_idle_balance+0x408/0x410
[    6.081926] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G S                4.19.202-NeutrinoKernel-lothal-EXP #1
[    6.081928] Hardware name: Qualcomm Technologies, Inc. kona MTP 19805 20809 (DT)
[    6.081931] pstate: 60400005 (nZCv daif +PAN -UAO)
[    6.081933] pc : _nohz_idle_balance+0x408/0x410
[    6.081935] lr : _nohz_idle_balance+0x400/0x410
[    6.081937] sp : ffffff8010033df0
[    6.081938] x29: ffffff8010033e30 x28: ffffff9d37e75040
[    6.081940] x27: ffffff9d37e572c0 x26: ffffff9d35c96050
[    6.081942] x25: 0000000000000007 x24: 0000000000000001
[    6.081944] x23: ffffff9d37e76000 x22: fffffffcb39e2a00
[    6.081946] x21: 0000000000000006 x20: 00000000ffff8d30
[    6.081947] x19: 0000000000000000 x18: ffffff8010035038
[    6.081949] x17: 0000000000000008 x16: 0000000000000000
[    6.081951] x15: 0000000000000008 x14: ffffff9d37f6dc18
[    6.081953] x13: 0000000000000004 x12: ffffff9d37e76000
[    6.081954] x11: 0000000000000000 x10: 0000000000000000
[    6.081956] x9 : a2d84b4c1c6fa900 x8 : a2d84b4c1c6fa900
[    6.081957] x7 : 000000000000002d x6 : ffffff9d38115e65
[    6.081959] x5 : ffffff9d38113500 x4 : 0000000000000000
[    6.081961] x3 : 000000000000004b x2 : 0000000000000000
[    6.081962] x1 : ffffff9d3811352d x0 : 000000000000002d
[    6.081964] Call trace:
[    6.081966] _nohz_idle_balance+0x408/0x410
[    6.081968] run_rebalance_domains+0x78/0xa8
[    6.081971] __do_softirq+0x144/0x2f4
[    6.081975] irq_exit+0x170/0x184
[    6.081978] scheduler_ipi+0x160/0x1d4
[    6.081980] handle_IPI+0x94/0x3e0
[    6.081982] gic_handle_irq.17896+0x100/0x180
[    6.081984] el1_irq+0xc4/0x140
[    6.081988] lpm_cpuidle_enter+0x61c/0x6d0
[    6.081991] cpuidle_enter_state+0x1dc/0x368
[    6.081993] do_idle+0x67c/0x74c
[    6.081995] cpu_startup_entry+0x58/0x5c
[    6.081997] secondary_start_kernel+0x13c/0x140
[    6.081999] ---[ end trace 6de171506e562a81 ]---]

Bug: 180530906
Change-Id: Ia2dfd9c9cac9b0f37c55a9256b9d5f3141ca0421
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 19, 2021
…stats

In rmnet_shs_wq_refresh_dl_mrkr_stats(), `tdiff` is used as a divisor
and is set to `rmnet_shs_wq_tnsec - tbl_p->l_epoch`. However, the
function called before it, rmnet_shs_wq_refresh_total_stats(), resets
`tbl_p->l_epoch` to `rmnet_shs_wq_tnsec`, resulting in `tdiff` always
being zero inside rmnet_shs_wq_refresh_dl_mrkr_stats(). This results in
a 100% consistent divide-by-zero exception:

[32.709616] Unexpected kernel BRK exception at EL1
[32.709626] Internal error: ptrace BRK handler: f20003e8 [#1] PREEMPT SMP
[32.709639] Process kworker/3:1 (pid: 79, stack limit = 0x0000000068506c9d)
[32.709647] CPU: 3 PID: 79 Comm: kworker/3:1 Tainted: G S      W         4.19.202-NeutrinoKernel-lothal-EXP #1
[32.709651] Hardware name: Qualcomm Technologies, Inc. kona MTP 19805 20809 (DT)
[32.709664] Workqueue: rmnet_shs_wq rmnet_shs_wq_process_wq
[32.709670] pstate: 60c00085 (nZCv daIf +PAN +UAO)
[32.709676] pc : rmnet_shs_wq_update_stats+0x3c8/0x5c0
[32.709680] lr : rmnet_shs_wq_update_stats+0x230/0x5c0
[32.709684] sp : ffffff801054bce0
[32.709687] x29: ffffff801054bce0 x28: ffffffa9fe579cdd
[32.709692] x27: ffffffa9fe57ada0 x26: ffffffa9fe57b1e0
[32.709697] x25: 0000000000000008 x24: ffffffa9fdc15920
[32.709702] x23: ffffffa9fe57e6b0 x22: 0000000000000007
[32.709706] x21: 0000000000000000 x20: 0000000000000000
[32.709711] x19: ffffffa9fddd4840 x18: 0000000000000000
[32.709716] x17: 0000000000000000 x16: 0000000000000005
[32.709720] x15: 0000000000000000 x14: 0000000000000179
[32.709725] x13: 0000000000000000 x12: 0000000000000000
[32.709730] x11: 0000000000000000 x10: 0000000000000000
[32.709734] x9 : 0000000000000000 x8 : 0000000000000000
[32.709739] x7 : 0000000000000000 x6 : 0000000000000000
[32.709743] x5 : 0000000000000000 x4 : ffffffa9fe57b218
[32.709748] x3 : fffffff2f1d31000 x2 : 0000000000000000
[32.709752] x1 : 0000000000000000 x0 : 0000000000000007
[32.709758] Call trace:
[32.709764]  rmnet_shs_wq_update_stats+0x3c8/0x5c0
[32.709768]  rmnet_shs_wq_process_wq+0x54/0xd0
[32.709775]  process_one_work+0x22c/0x3c0
[32.709779]  worker_thread+0x188/0x5e0
[32.709784]  kthread+0x130/0x160
[32.709790]  ret_from_fork+0x10/0x1c
[32.709797] Code: f9449860 f9022f60 f9449c60 f9023360 (d4207d00)
[32.709802] ---[ end trace 1ae09793f9d923fb ]---

Reordering the function calls fixes the issue. It looks like this issue
was hidden by Clang's dead store elimination depending on whether or not
tracepoints were enabled, since the quotients in question are only ever
used as arguments to trace_rmnet_shs_wq_high().

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 19, 2021
…spend()

Add irq request check condition before enabling swrm interrupt
This will make sure to only enabled interrupt request when it is in disabled state.

This Fixes:
W         : ------------[ cut here ]------------
W         : Unbalanced enable for IRQ 502
W         : WARNING: CPU: 4 PID: 81 at kernel/irq/manage.c:621 enable_irq+0x98/0xf0
I         : Modules linked in:
I         : CPU: 4 PID: 81 Comm: kworker/4:1 Tainted: G S                4.19.197-IMMENSITY-g09a924c384cb #1
I Hardware name: Qualcomm Technologies, Inc. xiaomi alioth (DT)
I Workqueue: pm pm_runtime_work
I pstate  : 60c00085 (nZCv daIf +PAN +UAO)
I pc      : enable_irq+0x98/0xf0
I lr      : enable_irq+0x98/0xf0
I sp      : ffffff800885bbb0
I         : x29: ffffff800885bbc0 x28: ffffffa6fea0db38
I         : x27: 0000000000000002 x26: 0000000000000000
I         : x25: ffffffd1991ef37d x24: 0000000000000000
I         : x23: ffffffd1991edca1 x22: ffffffd199555410
I         : x21: ffffffd1991ef248 x20: 00000000000001f6
I         : x19: ffffffd18cc7b400 x18: ffffffd1b4e9f048
I         : x17: 0000000000000000 x16: 0000000000000000
I         : x15: 0000000000000086 x14: 0000000000000030
I         : x13: 0000000000049754 x12: 0000000000000000
I         : x11: 0000000000000000 x10: 0000000000000007
I         : x9 : 060ca0f25e42ae00 x8 : 060ca0f25e42ae00
I         : x7 : 0000000000000000 x6 : ffffffa6fed3f8e5
I         : x5 : 00000000001b68dc x4 : 000000000000000e
I         : x3 : 0000000000000032 x2 : 0000000000000007
I         : x1 : 0000000000000007 x0 : 000000000000001d
I Call trace:
I         : enable_irq+0x98/0xf0
I         : swrm_runtime_suspend+0x390/0x47c
I         : pm_generic_runtime_suspend+0x28/0x3c
I         : __rpm_callback+0x12c/0x218
I         : rpm_suspend+0x420/0x7cc
I         : pm_runtime_work+0x98/0xa8
I         : process_one_work+0x228/0x3f4
I         : worker_thread+0x264/0x4b0
I         : kthread+0x13c/0x158
I         : ret_from_fork+0x10/0x18
W         : ---[ end trace 56c9cc0df5ea202b ]---

Change-Id: Ic539bfc8d595faf530361d32e0be4ce9009fec08
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 19, 2021
Subsequent to commit 4d7f8d7 ("Revert "sched: Prevent raising SCHED_SOFTIRQ when CPU is !active"")

A CPU can be paused while it is idle with it's tick stopped.
nohz_balance_exit_idle() should be called from the local CPU,
so it can't be called during pause which can happen remotely.
This results in paused CPU participating in the nohz idle balance,
which should be avoided. This can be done by calling

Fix this issue by calling nohz_balance_exit_idle() from the paused
CPU when it exits and enters idle again. This lazy approach avoids
waking the CPU from idle during pause.

[@0ctobot: This is an attempt to silence the following dmesg WARN_ON:
[    6.081902] ------------[ cut here ]------------
[    6.081909] (flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK
[    6.081922] WARNING: CPU: 6 PID: 0 at _nohz_idle_balance+0x408/0x410
[    6.081926] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G S                4.19.202-NeutrinoKernel-lothal-EXP #1
[    6.081928] Hardware name: Qualcomm Technologies, Inc. kona MTP 19805 20809 (DT)
[    6.081931] pstate: 60400005 (nZCv daif +PAN -UAO)
[    6.081933] pc : _nohz_idle_balance+0x408/0x410
[    6.081935] lr : _nohz_idle_balance+0x400/0x410
[    6.081937] sp : ffffff8010033df0
[    6.081938] x29: ffffff8010033e30 x28: ffffff9d37e75040
[    6.081940] x27: ffffff9d37e572c0 x26: ffffff9d35c96050
[    6.081942] x25: 0000000000000007 x24: 0000000000000001
[    6.081944] x23: ffffff9d37e76000 x22: fffffffcb39e2a00
[    6.081946] x21: 0000000000000006 x20: 00000000ffff8d30
[    6.081947] x19: 0000000000000000 x18: ffffff8010035038
[    6.081949] x17: 0000000000000008 x16: 0000000000000000
[    6.081951] x15: 0000000000000008 x14: ffffff9d37f6dc18
[    6.081953] x13: 0000000000000004 x12: ffffff9d37e76000
[    6.081954] x11: 0000000000000000 x10: 0000000000000000
[    6.081956] x9 : a2d84b4c1c6fa900 x8 : a2d84b4c1c6fa900
[    6.081957] x7 : 000000000000002d x6 : ffffff9d38115e65
[    6.081959] x5 : ffffff9d38113500 x4 : 0000000000000000
[    6.081961] x3 : 000000000000004b x2 : 0000000000000000
[    6.081962] x1 : ffffff9d3811352d x0 : 000000000000002d
[    6.081964] Call trace:
[    6.081966] _nohz_idle_balance+0x408/0x410
[    6.081968] run_rebalance_domains+0x78/0xa8
[    6.081971] __do_softirq+0x144/0x2f4
[    6.081975] irq_exit+0x170/0x184
[    6.081978] scheduler_ipi+0x160/0x1d4
[    6.081980] handle_IPI+0x94/0x3e0
[    6.081982] gic_handle_irq.17896+0x100/0x180
[    6.081984] el1_irq+0xc4/0x140
[    6.081988] lpm_cpuidle_enter+0x61c/0x6d0
[    6.081991] cpuidle_enter_state+0x1dc/0x368
[    6.081993] do_idle+0x67c/0x74c
[    6.081995] cpu_startup_entry+0x58/0x5c
[    6.081997] secondary_start_kernel+0x13c/0x140
[    6.081999] ---[ end trace 6de171506e562a81 ]---]

Bug: 180530906
Change-Id: Ia2dfd9c9cac9b0f37c55a9256b9d5f3141ca0421
Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Adam W. Willis <return.of.octobot@gmail.com>
aled99 pushed a commit that referenced this pull request Sep 19, 2021
This fixes the following CFI violation when the rmnet_perf module is
loaded:

CFI failure (target: [<ffffff9cddd181a4>] rmnet_perf_core_deaggregate+0x0/0x2c4):
------------[ cut here ]------------
WARNING: CPU: 1 PID: 0 at rmnet_rx_handler+0x240/0x270
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G S      W       4.14.186 #1
Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 MTP 18865 19863 14 15 (DT)
task: 0000000098c067f6 task.stack: 00000000289c42de
pc : rmnet_rx_handler+0x240/0x270
lr : rmnet_rx_handler+0x240/0x270
sp : ffffff801000bd00 pstate : 60400145
x29: ffffff801000bd00 x28: ffffff9cdc68e798
x27: ffffffe5ed28e090 x26: 0000000000000000
x25: ffffff9cdc68e9cc x24: ffffffe42fd6b900
x23: ffffff9cde829f30 x22: ffffff9cddd181a4
x21: ffffffe5f40fd100 x20: ffffffe5dfb95000
x19: ffffffe42fd6b900 x18: 0000000000010000
x17: 0000000000000008 x16: 0000000000000000
x15: 0000000000000008 x14: ffffff9cde85d990
x13: 0000000005000000 x12: 00ff00ff00000000
x11: ffffffffffffffff x10: 0000000000000008
x9 : 99d99e2e2d2e1900 x8 : 99d99e2e2d2e1900
x7 : 0000000000000000 x6 : ffffffe5f52091f1
x5 : 0000000000000000 x4 : 0000000000000000
x3 : fffffffffffffffc x2 : 0000000000000000
x1 : 0000000000000008 x0 : 0000000000000051
\x0aPC: 0xffffff9cdd12b8bc:
b8bc  f900051f aa1503e0 aa1403e1 940001f9 b4fffe60 aa0003f6 aa1403e1 94000015
b8dc  eb1602bf 54ffff01 17ffffef 900091e0 91188000 aa1603e1 aa1603e2 97d8aa73
b8fc  d4210000 17ffffcb aa1503e0 97d9465c 17ffffd2 aa1503e0 97d94659 17ffffdd
b91c  aa1303e0 528001c1 aa1503e2 94303be3 d10183ff a9027bfd f9001bf7 a90457f6
\x0aLR: 0xffffff9cdd12b8bc:
b8bc  f900051f aa1503e0 aa1403e1 940001f9 b4fffe60 aa0003f6 aa1403e1 94000015
b8dc  eb1602bf 54ffff01 17ffffef 900091e0 91188000 aa1603e1 aa1603e2 97d8aa73
b8fc  d4210000 17ffffcb aa1503e0 97d9465c 17ffffd2 aa1503e0 97d94659 17ffffdd
b91c  aa1303e0 528001c1 aa1503e2 94303be3 d10183ff a9027bfd f9001bf7 a90457f6
\x0aSP: 0xffffff801000bcc0:
bcc0  dd12b8fc ffffff9c 60400145 00000000 1000bca8 ffffff80 dd12b828 ffffff9c
bce0  ffffffff 0000007f 2d2e1900 99d99e2e 1000bd00 ffffff80 dd12b8fc ffffff9c
bd00  1000bd60 ffffff80 ddd4eafc ffffff9c 2fd6b900 ffffffe4 de84eec0 ffffff9c
bd20  00000000 00000000 ed28e000 ffffffe5 00000000 00000001 00000000 00000000

Call trace:
rmnet_rx_handler+0x240/0x270
__netif_receive_skb_core+0x50c/0xba0
process_backlog+0x1e4/0x3d0
net_rx_action+0x134/0x4f4
__do_softirq+0x16c/0x344
irq_exit+0x16c/0x178
handle_IPI+0x220/0x2e0
gic_handle_irq.16379+0xa8/0x180
el1_irq+0xb0/0x124
lpm_cpuidle_enter+0x33c/0x358
cpuidle_enter_state+0x220/0x400
do_idle+0x430/0x5f0
cpu_startup_entry+0x74/0x78
__cpu_disable+0x0/0xf0
---[ end trace 6e7b287874dec53e ]---

Reported-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>
Signed-off-by: Carlos Ayrton Lopez Arroyo <15030201@itcelaya.edu.mx>
aled99 pushed a commit that referenced this pull request Sep 19, 2021
This fixes the following CFI violation when the rmnet_shs module is
loaded:

CFI failure (target: [<ffffff9cddd1e27c>] rmnet_shs_assign+0x0/0x9d0):
------------[ cut here ]------------
WARNING: CPU: 1 PID: 0 at rmnet_deliver_skb+0x224/0x24c
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G S      W       4.14.186 #1
Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 MTP 18865 19863 14 15 (DT)
task: 0000000098c067f6 task.stack: 00000000289c42de
pc : rmnet_deliver_skb+0x224/0x24c
lr : rmnet_deliver_skb+0x224/0x24c
sp : ffffff801000bc10 pstate : 60400145
x29: ffffff801000bc10 x28: ffffff9cdc68e798
x27: ffffffe5ed28e090 x26: 0000000000000000
x25: 0000000000000000 x24: ffffffe585b41ca8
x23: 0000000000000001 x22: ffffff9cddd1e27c
x21: ffffffe5f40fd100 x20: ffffffe5dfb95000
x19: ffffffe4eff9d500 x18: 0000000000000002
x17: 000000000000009c x16: 000000000000009c
x15: 0000000000000068 x14: 0000000000000082
x13: ffffff9cdefaec08 x12: 0000000000000004
x11: 00000000ffffffff x10: ffffffe5f5200000
x9 : 99d99e2e2d2e1900 x8 : 99d99e2e2d2e1900
x7 : 0000000000000000 x6 : ffffffe5f5209fc2
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 0000000000003a29 x2 : 0000000000000001
x1 : 0000000000000000 x0 : 0000000000000046
\x0aPC: 0xffffff9cdd12b3fc:
b3fc  a9424ff4 a94157f6 a8c37bfd d65f03c0 91246100 aa1303e1 9431af1a a9424ff4
b41c  a94157f6 a8c37bfd d65f03c0 900091e0 91188000 aa1603e1 aa1603e2 97d8aba3
b43c  d4210000 17ffff9e aa1503e0 97d9478c 17ffffa5 aa1503e0 aa0803f6 97d94788
b45c  aa1603e8 17ffffac d10103ff a9017bfd a90257f6 a9034ff4 910043fd aa0003f3
\x0aLR: 0xffffff9cdd12b3fc:
b3fc  a9424ff4 a94157f6 a8c37bfd d65f03c0 91246100 aa1303e1 9431af1a a9424ff4
b41c  a94157f6 a8c37bfd d65f03c0 900091e0 91188000 aa1603e1 aa1603e2 97d8aba3
b43c  d4210000 17ffff9e aa1503e0 97d9478c 17ffffa5 aa1503e0 aa0803f6 97d94788
b45c  aa1603e8 17ffffac d10103ff a9017bfd a90257f6 a9034ff4 910043fd aa0003f3
\x0aSP: 0xffffff801000bbd0:
bbd0  dd12b43c ffffff9c 60400145 00000000 1000bbb8 ffffff80 dd12b2b4 ffffff9c
bbf0  ffffffff 0000007f 2d2e1900 99d99e2e 1000bc10 ffffff80 dd12b43c ffffff9c
bc10  1000bc40 ffffff80 ddd18d48 ffffff9c 00000040 00000000 ddd18338 ffffff9c
bc30  eff9d500 ffffffe4 85b41c18 ffffffe5 1000bc50 ffffff80 ddd1a390 ffffff9c

Call trace:
rmnet_deliver_skb+0x224/0x24c
rmnet_perf_core_send_skb+0x138/0x140
rmnet_perf_opt_flush_single_flow_node+0x624/0x668
rmnet_perf_core_deaggregate+0x194/0x2c4
rmnet_rx_handler+0x17c/0x270
__netif_receive_skb_core+0x50c/0xba0
process_backlog+0x1e4/0x3d0
net_rx_action+0x134/0x4f4
__do_softirq+0x16c/0x344
irq_exit+0x16c/0x178
handle_IPI+0x220/0x2e0
gic_handle_irq.16379+0xa8/0x180
el1_irq+0xb0/0x124
lpm_cpuidle_enter+0x33c/0x358
cpuidle_enter_state+0x220/0x400
do_idle+0x430/0x5f0
cpu_startup_entry+0x74/0x78
__cpu_disable+0x0/0xf0
---[ end trace 6e7b287874dec53f ]---

Reported-by: Adam W. Willis <return.of.octobot@gmail.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Diab Neiroukh <lazerl0rd@thezest.dev>
Signed-off-by: Carlos Ayrton Lopez Arroyo <15030201@itcelaya.edu.mx>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.