Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

4.14 hdmi #3

Merged
merged 675 commits into from
Jun 4, 2018
Merged

4.14 hdmi #3

merged 675 commits into from
Jun 4, 2018

Conversation

d3adme4t
Copy link
Owner

@d3adme4t d3adme4t commented Jun 4, 2018

No description provided.

dcaratti and others added 30 commits May 30, 2018 07:52
[ Upstream commit 1f110e7 ]

when the following command

 # tc action add action sample rate 100 group 100 index 100

is run for the first time, and psample_group_get(100) fails to create a
new group, tcf_sample_cleanup() calls psample_group_put(NULL), thus
causing the following error:

 BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
 IP: psample_group_put+0x15/0x71 [psample]
 PGD 8000000075775067 P4D 8000000075775067 PUD 7453c067 PMD 0
 Oops: 0002 [#1] SMP PTI
 Modules linked in: act_sample(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core mbcache jbd2 crct10dif_pclmul snd_hwdep crc32_pclmul snd_seq ghash_clmulni_intel pcbc snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev pcspkr i2c_piix4 soundcore virtio_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_net ata_piix virtio_console virtio_blk libata serio_raw crc32c_intel virtio_pci i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_tunnel_key]
 CPU: 2 PID: 5740 Comm: tc Tainted: G            E    4.16.0-rc4.act_vlan.orig+ #403
 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
 RIP: 0010:psample_group_put+0x15/0x71 [psample]
 RSP: 0018:ffffb8a80032f7d0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000024
 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc06d93c0
 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
 R10: 00000000bd003000 R11: ffff979fba04aa59 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: ffff979fbba3f22c
 FS:  00007f7638112740(0000) GS:ffff979fbfd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000001c CR3: 00000000734ea001 CR4: 00000000001606e0
 Call Trace:
  __tcf_idr_release+0x79/0xf0
  tcf_sample_init+0x125/0x1d0 [act_sample]
  tcf_action_init_1+0x2cc/0x430
  tcf_action_init+0xd3/0x1b0
  tc_ctl_action+0x18b/0x240
  rtnetlink_rcv_msg+0x29c/0x310
  ? _cond_resched+0x15/0x30
  ? __kmalloc_node_track_caller+0x1b9/0x270
  ? rtnl_calcit.isra.28+0x100/0x100
  netlink_rcv_skb+0xd2/0x110
  netlink_unicast+0x17c/0x230
  netlink_sendmsg+0x2cd/0x3c0
  sock_sendmsg+0x30/0x40
  ___sys_sendmsg+0x27a/0x290
  ? filemap_map_pages+0x34a/0x3a0
  ? __handle_mm_fault+0xbfd/0xe20
  __sys_sendmsg+0x51/0x90
  do_syscall_64+0x6e/0x1a0
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
 RIP: 0033:0x7f7637523ba0
 RSP: 002b:00007fff0473ef58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 00007fff0473f080 RCX: 00007f7637523ba0
 RDX: 0000000000000000 RSI: 00007fff0473efd0 RDI: 0000000000000003
 RBP: 000000005aaaac80 R08: 0000000000000002 R09: 0000000000000000
 R10: 00007fff0473e9e0 R11: 0000000000000246 R12: 0000000000000000
 R13: 00007fff0473f094 R14: 0000000000000001 R15: 0000000000669f60
 Code: be 02 00 00 00 48 89 df e8 a9 fe ff ff e9 7c ff ff ff 0f 1f 40 00 0f 1f 44 00 00 53 48 89 fb 48 c7 c7 c0 93 6d c0 e8 db 20 8c ef <83> 6b 1c 01 74 10 48 c7 c7 c0 93 6d c0 ff 14 25 e8 83 83 b0 5b
 RIP: psample_group_put+0x15/0x71 [psample] RSP: ffffb8a80032f7d0
 CR2: 000000000000001c

Fix it in tcf_sample_cleanup(), ensuring that calls to psample_group_put(p)
are done only when p is not NULL.

Fixes: cadb9c9 ("net/sched: act_sample: Fix error path in init")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fc04fdb ]

batadv_check_unicast_ttvn may redirect a packet to itself or another
originator. This involves rewriting the ttvn and the destination address in
the batadv unicast header. These field were not yet pulled (with skb rcsum
update) and thus any change to them also requires a change in the receive
checksum.

Reported-by: Matthias Schiffer <mschiffer@universe-factory.net>
Fixes: a73105b ("batman-adv: improved client announcement mechanism")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd1df24 ]

This re-introduces the effect of commit a324523 ("vti4:
Don't count header length twice.") which was accidentally
reverted by merge commit f895f0c ("Merge branch 'master' of
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec").

The commit message from Steffen Klassert said:

    We currently count the size of LL_MAX_HEADER and struct iphdr
    twice for vti4 devices, this leads to a wrong device mtu.
    The size of LL_MAX_HEADER and struct iphdr is already counted in
    ip_tunnel_bind_dev(), so don't do it again in vti_tunnel_init().

And this is still the case now: ip_tunnel_bind_dev() already
accounts for the header length of the link layer (not
necessarily LL_MAX_HEADER, if the output device is found), plus
one IP header.

For example, with a vti device on top of veth, with MTU of 1500,
the existing implementation would set the initial vti MTU to
1332, accounting once for LL_MAX_HEADER (128, included in
hard_header_len by vti) and twice for the same IP header (once
from hard_header_len, once from ip_tunnel_bind_dev()).

It should instead be 1480, because ip_tunnel_bind_dev() is able
to figure out that the output device is veth, so no additional
link layer header is attached, and will properly count one
single IP header.

The existing issue had the side effect of avoiding PMTUD for
most xfrm policies, by arbitrarily lowering the initial MTU.
However, the only way to get a consistent PMTU value is to let
the xfrm PMTU discovery do its course, and commit d6af1a3
("vti: Add pmtu handling to vti_xmit.") now takes care of local
delivery cases where the application ignores local socket
notifications.

Fixes: b9959fd ("vti: switch to new ip tunnel code")
Fixes: f895f0c ("Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 24fc797 ]

Otherwise, it's possible to specify invalid MTU values directly
on creation of a link (via 'ip link add'). This is already
prevented on subsequent MTU changes by commit b96f9af
("ipv4/6: use core net MTU range checking").

Fixes: c544193 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 03080e5 ]

Don't hardcode a MTU value on vti tunnel initialization,
ip_tunnel_newlink() is able to deal with this already. See also
commit ffc2b6e ("ip_gre: fix IFLA_MTU ignored on NEWLINK").

Fixes: 1181412 ("net/ipv4: VTI support new module for ip_vti.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f8a554b ]

We shouldn't allow a tunnel to have IP_MAX_MTU as MTU, because
another IPv6 header is going on top of our packets. Without this
patch, we might end up building packets bigger than IP_MAX_MTU.

Fixes: b96f9af ("ipv4/6: use core net MTU range checking")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 19125cb ]

After switching to a new channel, driver schedules session protection
time event in order to hear the beacon on the new channel.
The duration of the protection is two beacon intervals.
However, since we start to switch slightly before beacon with count 1, in
case we don't hear (or AP doesn't transmit) the very first beacon on the
new channel the protection ends without hearing any beacon at all.
At this stage the switch is not complete, the queues are closed and the
interface doesn't have quota yet or TBTT events. As the result, we are
stuck forever waiting for iwl_mvm_post_channel_switch() to be called.

Fix this by increasing the protection time to be 3 beacon intervals and
in addition drop the connection if the time event ends before we got any
beacon.

Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b38790 ]

When a queue is reserved for aggregation, the queue id is assigned
to the tid_data. This is fine since iwl_mvm_sta_tx_agg_oper()
takes care of allocating the queue before actual tx starts.
When the reservation is cancelled (e.g. when the AP declined the
aggregation request) the tid_data is not cleared. As a result,
following tx for this tid was trying to use an unallocated queue.

Fix this by setting the txq_id for the tid to invalid when unreserving
the queue.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit df65c8d ]

If the driver failed to resume from D3, it is possible that it has
no valid aux station. In such case, fw restart will end up in sending
station related commands with an invalid station id, which will
result in an assert.

Fix this by allocating a new station id for the aux station if it
does not have a valid id even in the case of fw restart.

Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4a6d2e5 ]

When starting aggregation, the code checks the status of the queue
allocated to the aggregation tid, which might not yet be allocated
and thus the queue index may be invalid.
Fix this by reserving a new queue in case the queue id is invalid.

While at it, clean up some unreachable code (a condition that is
already handled earlier) and remove all the non-DQA comments since
non-DQA mode is no longer supported.

Fixes: cf961e1 ("iwlwifi: mvm: support dqa-mode agg on non-shared queue")
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 192b4af ]

Since commit 846c7df ("drm/atomic: Try to preserve the crtc enabled
state in drm_atomic_remove_fb, v2."), removing the last framebuffer will
no longer disable the corresponding pipeline, which causes the KMS core
to complain about leaked connectors on driver unbind.

Fix this by calling drm_atomic_helper_shutdown() on driver unbind, which
will cause all display pipelines to be shut down and therefore drop the
extra references on the connectors.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c917e0f ]

When a perf_event is attached to parent cgroup, it should count events
for all children cgroups:

   parent_group   <---- perf_event
     \
      - child_group  <---- process(es)

However, in our tests, we found this perf_event cannot report reliable
results. Here is an example case:

  # create cgroups
  mkdir -p /sys/fs/cgroup/p/c
  # start perf for parent group
  perf stat -e instructions -G "p"

  # on another console, run test process in child cgroup:
  stressapptest -s 2 -M 1000 & echo $! > /sys/fs/cgroup/p/c/cgroup.procs

  # after the test process is done, stop perf in the first console shows

       <not counted>      instructions              p

The instruction should not be "not counted" as the process runs in the
child cgroup.

We found this is because perf_event->cgrp and cpuctx->cgrp are not
identical, thus perf_event->cgrp are not updated properly.

This patch fixes this by updating perf_cgroup properly for ancestor
cgroup(s).

Reported-by: Ephraim Park <ephiepark@fb.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <jolsa@redhat.com>
Cc: <kernel-team@fb.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20180312165943.1057894-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9b9322d ]

The commit "regulatory: add NUL to request alpha2" increases the length of
alpha2 to 3. This causes a regression on brcmfmac, because
brcmf_cfg80211_reg_notifier() expect valid ISO3166 codes in the complete
array. So fix this accordingly.

Fixes: 657308f ("regulatory: add NUL to request alpha2")
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 825d487 ]

Some filesystems have timestamps with coarse precision that may allow
for a recently built object file to have the same timestamp as the
updated time on one of its dependency files. When that happens, the
object file doesn't get rebuilt as it should.

This is especially the case on filesystems that don't have sub-second
time precision, such as ext3 or Ext4 with 128B inodes.

Let's prevent that by making sure updated dependency files have a newer
timestamp than the first file we created (i.e. autoksyms.h.tmpnew).

Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5f3e3b8 ]

The option size check is using optval instead of optlen
causing the set option call to fail. Use the correct
field, optlen, for size check.

Fixes: 6a21dfc ("RDMA/ucma: Limit possible option size")
Signed-off-by: Chien Tin Tung <chien.tin.tung@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c3594f2 ]

QPs that were configured with ack timeout value lower than 1
msec will not implement re-transmission timeout.
This means that if a packet / ACK were dropped, the QP
will not retransmit this packet.

This can lead to an application hang.

Fixes: cecbcdd ("qedr: Add support for QP verbs")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b15606f ]

Return code wasn't set properly when CNQ allocation failed.
This only affect error message logging, currently user will
receive an error message that says the qedr driver load failed
with rc '0', instead of ENOMEM

Fixes: ec72fce ("qedr: Add support for RoCE HW init")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit caf61b1 ]

Once the FW is transitioned to error, FLUSH cqes can be received.
We want the driver to be aware of the fact that QP is already in error.

Without this fix, a user may see false error messages in the dmesg log,
mentioning that a FLUSH cqe was received while QP is not in error state.

Fixes: cecbcdd ("qedr: Add support for QP verbs")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bbc09e7 ]

when the following command sequence is entered

 # tc action add action bpf bytecode '4,40 0 0 12,31 0 1 2048,6 0 0 262144,6 0 0 0' index 100
 RTNETLINK answers: Invalid argument
 We have an error talking to the kernel
 # tc action add action bpf bytecode '4,40 0 0 12,21 0 1 2048,6 0 0 262144,6 0 0 0' index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel

act_bpf correctly refuses to install the first TC rule, because 31 is not
a valid instruction. However, it refuses to install the second TC rule,
even if the BPF code is correct. Furthermore, it's no more possible to
install any other rule having the same value of 'index' until act_bpf
module is unloaded/inserted again. After the idr has been reserved, call
tcf_idr_release() instead of tcf_idr_cleanup(), to fix this issue.

Fixes: 65a206c ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 60e10b3 ]

if the kernel fails to duplicate 'sdata', creation of a new action fails
with -ENOMEM. However, subsequent attempts to install the same action
using the same value of 'index' systematically fail with -ENOSPC, and
that value of 'index' will no more be usable by act_simple, until rmmod /
insmod of act_simple.ko is done:

 # tc actions add action simple sdata hello index 100
 # tc actions list action simple

        action order 0: Simple <hello>
         index 100 ref 1 bind 0
 # tc actions flush action simple
 # tc actions add action simple sdata hello index 100
 RTNETLINK answers: Cannot allocate memory
 We have an error talking to the kernel
 # tc actions flush action simple
 # tc actions add action simple sdata hello index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 # tc actions add action simple sdata hello index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 ...

Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup().

Fixes: 65a206c ("net/sched: Change act_api and act_xxx modules to use IDR")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5bf7f81 ]

tcf_act_police_init() can fail after the idr has been successfully
reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
subsequent attempts to configure a police rule using the same idr value
systematiclly fail with -ENOSPC:

 # tc action add action police rate 1000 burst 1000 drop index 100
 RTNETLINK answers: Cannot allocate memory
 We have an error talking to the kernel
 # tc action add action police rate 1000 burst 1000 drop index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 # tc action add action police rate 1000 burst 1000 drop index 100
 RTNETLINK answers: No space left on device
 ...

Fix this in the error path of tcf_act_police_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().

Fixes: 65a206c ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94fa3f9 ]

tcf_pedit_init() can fail to allocate 'keys' after the idr has been
successfully reserved. When this happens, subsequent attempts to configure
a pedit rule using the same idr value systematically fail with -ENOSPC:

 # tc action add action pedit munge ip ttl set 63 index 100
 RTNETLINK answers: Cannot allocate memory
 We have an error talking to the kernel
 # tc action add action pedit munge ip ttl set 63 index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 # tc action add action pedit munge ip ttl set 63 index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 ...

Fix this in the error path of tcf_act_pedit_init(), calling
tcf_idr_release() in place of tcf_idr_cleanup().

Fixes: 65a206c ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1e46ef1 ]

__tcf_ipt_init() can fail after the idr has been successfully reserved.
When this happens, subsequent attempts to configure xt/ipt rules using
the same idr value systematically fail with -ENOSPC:

 # tc action add action xt -j LOG --log-prefix test1 index 100
 tablename: mangle hook: NF_IP_POST_ROUTING
         target:  LOG level warning prefix "test1" index 100
 RTNETLINK answers: Cannot allocate memory
 We have an error talking to the kernel
 Command "(null)" is unknown, try "tc actions help".
 # tc action add action xt -j LOG --log-prefix test1 index 100
 tablename: mangle hook: NF_IP_POST_ROUTING
         target:  LOG level warning prefix "test1" index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 Command "(null)" is unknown, try "tc actions help".
 # tc action add action xt -j LOG --log-prefix test1 index 100
 tablename: mangle hook: NF_IP_POST_ROUTING
         target:  LOG level warning prefix "test1" index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 ...

Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
to avoid NULL pointer dereference.

Fixes: 65a206c ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f29cdfb ]

tcf_skbmod_init() can fail after the idr has been successfully reserved.
When this happens, every subsequent attempt to configure skbmod rules
using the same idr value will systematically fail with -ENOSPC, unless
the first attempt was done using the 'replace' keyword:

 # tc action add action skbmod swap mac index 100
 RTNETLINK answers: Cannot allocate memory
 We have an error talking to the kernel
 # tc action add action skbmod swap mac index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 # tc action add action skbmod swap mac index 100
 RTNETLINK answers: No space left on device
 We have an error talking to the kernel
 ...

Fix this in tcf_skbmod_init(), ensuring that tcf_idr_release() is called
on the error path when the idr has been reserved, but not yet inserted.
Also, don't test 'ovr' in the error path, to avoid a 'replace' failure
implicitly become a 'delete' that leaks refcount in act_skbmod module:

 # rmmod act_skbmod; modprobe act_skbmod
 # tc action add action skbmod swap mac index 100
 # tc action add action skbmod swap mac continue index 100
 RTNETLINK answers: File exists
 We have an error talking to the kernel
 # tc action replace action skbmod swap mac continue index 100
 RTNETLINK answers: Cannot allocate memory
 We have an error talking to the kernel
 # tc action list action skbmod
 #
 # rmmod  act_skbmod
 rmmod: ERROR: Module act_skbmod is in use

Fixes: 65a206c ("net/sched: Change act_api and act_xxx modules to use IDR")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 40013ff ]

We have a functional dependency on the FIXED_PHY MDIO bus because we register
fixed PHY devices "the old way" which only works if the code that does this has
had a chance to run before the fixed MDIO bus is probed. Make sure we account
for that and have dsa_loop_bdinfo.o be either built-in or modular depending on
whether CONFIG_FIXED_PHY reflects that too.

Fixes: 98cd155 ("net: dsa: Mock-up driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5a9f698 ]

The original ast driver cannot display properly if the resolution is 1280x800 and the pixel clock is 83.5MHz.
Here is the update to fix it.

Signed-off-by: Y.C. Chen <yc_chen@aspeedtech.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8970a63 ]

Alexander reported a use of uninitialized memory in __mpol_equal(),
which is caused by incorrect use of preferred_node.

When mempolicy in mode MPOL_PREFERRED with flags MPOL_F_LOCAL, it uses
numa_node_id() instead of preferred_node, however, __mpol_equal() uses
preferred_node without checking whether it is MPOL_F_LOCAL or not.

[akpm@linux-foundation.org: slight comment tweak]
Link: http://lkml.kernel.org/r/4ebee1c2-57f6-bcb8-0e2d-1833d1ee0bb7@huawei.com
Fixes: fc36b8d ("mempolicy: use MPOL_F_LOCAL to Indicate Preferred Local Policy")
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Reported-by: Alexander Potapenko <glider@google.com>
Tested-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d3c335 ]

Commit 2516035 ("mm, thp: remove __GFP_NORETRY from khugepaged and
madvised allocations") changed the page allocator to no longer detect
thp allocations based on __GFP_NORETRY.

It did not, however, modify the mem cgroup try_charge() path to avoid
oom kill for either khugepaged collapsing or thp faulting.  It is never
expected to oom kill a process to allocate a hugepage for thp; reclaim
is governed by the thp defrag mode and MADV_HUGEPAGE, but allocations
(and charging) should fallback instead of oom killing processes.

Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1803191409420.124411@chino.kir.corp.google.com
Fixes: 2516035 ("mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations")
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9a3fb9f ]

A recent commit introduced a new struct xfrm_trans_cb
that is used with the sk_buff control buffer. Unfortunately
it placed the structure in front of the control buffer and
overlooked that the IPv4/IPv6 control buffer is still needed
for some layer 4 protocols. As a result the IPv4/IPv6 control
buffer is overwritten with this structure. Fix this by setting
a apropriate header in front of the structure.

Fixes acf568e ("xfrm: Reinject transport-mode packets ...")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 871bef2 ]

Add a testcase for probe event argument syntax which
ensures the kprobe_events interface correctly parses
given event arguments.

Link: http://lkml.kernel.org/r/152129033679.31874.12705519603869152799.stgit@devbox

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Jun 15, 2018
[ Upstream commit 2bbea6e ]

when mounting an ISO filesystem sometimes (very rarely)
the system hangs because of a race condition between two tasks.

PID: 6766   TASK: ffff88007b2a6dd0  CPU: 0   COMMAND: "mount"
 #0 [ffff880078447ae0] __schedule at ffffffff8168d605
 #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49
 #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995
 #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef
 #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod]
 #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50
 #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3
 #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs]
 #8 [ffff880078447da8] mount_bdev at ffffffff81202570
 #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs]
#10 [ffff880078447e28] mount_fs at ffffffff81202d09
#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f
#12 [ffff880078447ea8] do_mount at ffffffff81220fee
#13 [ffff880078447f28] sys_mount at ffffffff812218d6
#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49
    RIP: 00007fd9ea914e9a  RSP: 00007ffd5d9bf648  RFLAGS: 00010246
    RAX: 00000000000000a5  RBX: ffffffff81698c49  RCX: 0000000000000010
    RDX: 00007fd9ec2bc210  RSI: 00007fd9ec2bc290  RDI: 00007fd9ec2bcf30
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000010
    R10: 00000000c0ed0001  R11: 0000000000000206  R12: 00007fd9ec2bc040
    R13: 00007fd9eb6b2380  R14: 00007fd9ec2bc210  R15: 00007fd9ec2bcf30
    ORIG_RAX: 00000000000000a5  CS: 0033  SS: 002b

This task was trying to mount the cdrom.  It allocated and configured a
super_block struct and owned the write-lock for the super_block->s_umount
rwsem. While exclusively owning the s_umount lock, it called
sr_block_ioctl and waited to acquire the global sr_mutex lock.

PID: 6785   TASK: ffff880078720fb0  CPU: 0   COMMAND: "systemd-udevd"
 #0 [ffff880078417898] __schedule at ffffffff8168d605
 #1 [ffff880078417900] schedule at ffffffff8168dc59
 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605
 #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838
 #4 [ffff8800784179d0] down_read at ffffffff8168cde0
 #5 [ffff8800784179e8] get_super at ffffffff81201cc7
 #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de
 #7 [ffff880078417a40] flush_disk at ffffffff8123a94b
 #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50
 #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom]
#10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod]
#11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86
#12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65
#13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b
#14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7
#15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf
#16 [ffff880078417d00] do_last at ffffffff8120d53d
#17 [ffff880078417db0] path_openat at ffffffff8120e6b2
#18 [ffff880078417e48] do_filp_open at ffffffff8121082b
frank-w#19 [ffff880078417f18] do_sys_open at ffffffff811fdd33
frank-w#20 [ffff880078417f70] sys_open at ffffffff811fde4e
frank-w#21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49
    RIP: 00007f29438b0c20  RSP: 00007ffc76624b78  RFLAGS: 00010246
    RAX: 0000000000000002  RBX: ffffffff81698c49  RCX: 0000000000000000
    RDX: 00007f2944a5fa70  RSI: 00000000000a0800  RDI: 00007f2944a5fa70
    RBP: 00007f2944a5f540   R8: 0000000000000000   R9: 0000000000000020
    R10: 00007f2943614c40  R11: 0000000000000246  R12: ffffffff811fde4e
    R13: ffff880078417f78  R14: 000000000000000c  R15: 00007f2944a4b010
    ORIG_RAX: 0000000000000002  CS: 0033  SS: 002b

This task tried to open the cdrom device, the sr_block_open function
acquired the global sr_mutex lock. The call to check_disk_change()
then saw an event flag indicating a possible media change and tried
to flush any cached data for the device.
As part of the flush, it tried to acquire the super_block->s_umount
lock associated with the cdrom device.
This was the same super_block as created and locked by the previous task.

The first task acquires the s_umount lock and then the sr_mutex_lock;
the second task acquires the sr_mutex_lock and then the s_umount lock.

This patch fixes the issue by moving check_disk_change() out of
cdrom_open() and let the caller take care of it.

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Jun 15, 2018
[ Upstream commit d546b67 ]

spin_lock/unlock was used instead of spin_un/lock_irq
in a procedure used in process space, on a spinlock
which can be grabbed in an interrupt.

This caused the stack trace below to be displayed (on kernel
4.17.0-rc1 compiled with Lock Debugging enabled):

[  154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[  154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G          I
[  154.675856] -----------------------------------------------------
[  154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[  154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
[  154.700927]
and this task is already holding:
[  154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
[  154.718028] which would create a new lock dependency:
[  154.723705]  (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
[  154.731922]
but this new dependency connects a SOFTIRQ-irq-safe lock:
[  154.740798]  (&(&cq->lock)->rlock){..-.}
[  154.740800]
... which became SOFTIRQ-irq-safe at:
[  154.752163]   _raw_spin_lock_irqsave+0x3e/0x50
[  154.757163]   mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
[  154.762554]   ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
...
to a SOFTIRQ-irq-unsafe lock:
[  154.815603]  (&(&qp_table->lock)->rlock){+.+.}
[  154.815604]
... which became SOFTIRQ-irq-unsafe at:
[  154.827718] ...
[  154.827720]   _raw_spin_lock+0x35/0x50
[  154.833912]   mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
[  154.839302]   mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]

Since mlx4_qp_lookup() is called only in process space, we can
simply replace the spin_un/lock calls with spin_un/lock_irq calls.

Fixes: 6dc06c0 ("net/mlx4: Fix the check in attaching steering rules")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
@frank-w frank-w deleted the 4.14-hdmi branch July 21, 2018 13:36
@frank-w frank-w restored the 4.14-hdmi branch November 20, 2018 11:20
@frank-w frank-w deleted the 4.14-hdmi branch November 22, 2018 12:24
@frank-w frank-w restored the 4.14-hdmi branch December 17, 2018 13:28
@frank-w frank-w deleted the 4.14-hdmi branch April 26, 2019 17:55
d3adme4t pushed a commit that referenced this pull request Jan 18, 2020
As virtual_context_destroy() may be called from a request signal, it may
be called from inside an irq-off section, and so we need to do a full
save/restore of the irq state rather than blindly re-enable irqs upon
unlocking.

<4> [110.024262] WARNING: inconsistent lock state
<4> [110.024277] 5.4.0-rc8-CI-CI_DRM_7489+ #1 Tainted: G     U
<4> [110.024292] --------------------------------
<4> [110.024305] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
<4> [110.024323] kworker/0:0/5 [HC0[0]:SC0[0]:HE1:SE1] takes:
<4> [110.024338] ffff88826a0c7a18 (&(&rq->lock)->rlock){?.-.}, at: i915_request_retire+0x221/0x930 [i915]
<4> [110.024592] {IN-HARDIRQ-W} state was registered at:
<4> [110.024612]   lock_acquire+0xa7/0x1c0
<4> [110.024627]   _raw_spin_lock_irqsave+0x33/0x50
<4> [110.024788]   intel_engine_breadcrumbs_irq+0x38c/0x600 [i915]
<4> [110.024808]   irq_work_run_list+0x49/0x70
<4> [110.024824]   irq_work_run+0x26/0x50
<4> [110.024839]   smp_irq_work_interrupt+0x44/0x1e0
<4> [110.024855]   irq_work_interrupt+0xf/0x20
<4> [110.024871]   __do_softirq+0xb7/0x47f
<4> [110.024885]   irq_exit+0xba/0xc0
<4> [110.024898]   do_IRQ+0x83/0x160
<4> [110.024910]   ret_from_intr+0x0/0x1d
<4> [110.024922] irq event stamp: 172864
<4> [110.024938] hardirqs last  enabled at (172863): [<ffffffff819ea214>] _raw_spin_unlock_irq+0x24/0x50
<4> [110.024963] hardirqs last disabled at (172864): [<ffffffff819e9fba>] _raw_spin_lock_irq+0xa/0x40
<4> [110.024988] softirqs last  enabled at (172812): [<ffffffff81c00385>] __do_softirq+0x385/0x47f
<4> [110.025012] softirqs last disabled at (172797): [<ffffffff810b829a>] irq_exit+0xba/0xc0
<4> [110.025031]
other info that might help us debug this:
<4> [110.025049]  Possible unsafe locking scenario:

<4> [110.025065]        CPU0
<4> [110.025075]        ----
<4> [110.025084]   lock(&(&rq->lock)->rlock);
<4> [110.025099]   <Interrupt>
<4> [110.025109]     lock(&(&rq->lock)->rlock);
<4> [110.025124]
 *** DEADLOCK ***

<4> [110.025144] 4 locks held by kworker/0:0/5:
<4> [110.025156]  #0: ffff88827588f528 ((wq_completion)events){+.+.}, at: process_one_work+0x1de/0x620
<4> [110.025187]  #1: ffffc9000006fe78 ((work_completion)(&engine->retire_work)){+.+.}, at: process_one_work+0x1de/0x620
<4> [110.025219]  #2: ffff88825605e270 (&kernel#2){+.+.}, at: engine_retire+0x57/0xe0 [i915]
<4> [110.025405]  #3: ffff88826a0c7a18 (&(&rq->lock)->rlock){?.-.}, at: i915_request_retire+0x221/0x930 [i915]
<4> [110.025634]
stack backtrace:
<4> [110.025653] CPU: 0 PID: 5 Comm: kworker/0:0 Tainted: G     U            5.4.0-rc8-CI-CI_DRM_7489+ #1
<4> [110.025675] Hardware name:  /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017
<4> [110.025856] Workqueue: events engine_retire [i915]
<4> [110.025872] Call Trace:
<4> [110.025891]  dump_stack+0x71/0x9b
<4> [110.025907]  mark_lock+0x49a/0x500
<4> [110.025926]  ? print_shortest_lock_dependencies+0x200/0x200
<4> [110.025946]  mark_held_locks+0x49/0x70
<4> [110.025962]  ? _raw_spin_unlock_irq+0x24/0x50
<4> [110.025978]  lockdep_hardirqs_on+0xa2/0x1c0
<4> [110.025995]  _raw_spin_unlock_irq+0x24/0x50
<4> [110.026171]  virtual_context_destroy+0xc5/0x2e0 [i915]
<4> [110.026376]  __active_retire+0xb4/0x290 [i915]
<4> [110.026396]  dma_fence_signal_locked+0x9e/0x1b0
<4> [110.026613]  i915_request_retire+0x451/0x930 [i915]
<4> [110.026766]  retire_requests+0x4d/0x60 [i915]
<4> [110.026919]  engine_retire+0x63/0xe0 [i915]

Fixes: b1e3177 ("drm/i915: Coordinate i915_active with its own mutex")
Fixes: 6d06779 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205145934.663183-1-chris@chris-wilson.co.uk
(cherry picked from commit 6f7ac82)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
d3adme4t pushed a commit that referenced this pull request Jan 18, 2020
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
searching. Some issues have been observed in case of range overlapping:

Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by user
but one remains in the name table forever. This corrupts the table and
that service becomes dummy i.e. no real port.
E.g.

                /
           {22, 22}
              /
             /
   --->  {10, 50}
           /  \
          /    \
    {10, 30}  {20, 60}

The node {10, 30} cannot be removed since the rbtree searching stops at
the node's ancestor i.e. {10, 50}, so starting from it will never reach
the finding node.

Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:

        {20, 60}             {10, 50} <--
          /  \                 /  \
         /    \               /    \
    {10, 50}  NIL <--       NIL  {20, 60}

          (a)                    (b)

Now, try to send some data to service {30}, there will be two results:
(a): Failed, no route to host.
(b): Ok.

The reason is that the rbtree searching will stop at the pointing node
as shown above.

Case #3: Same as case #2b above but if the data sending's scope is
local and the {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other {20, 60} is for example on
the local node which should be able to get the data.

The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field to each node - named 'max',
which is the largest 'upper' of that node subtree. The 'max' value for
each subtrees will be propagated correctly whenever a node is inserted/
removed or the tree is rebalanced by the augmented rbtree callbacks.

By this way, we can change the rbtree searching appoarch to solve the
issues above. Another benefit from this is that we can now improve the
searching for a next range matching e.g. in case of multicast, so get
rid of the unneeded looping over all nodes in the tree.

Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
d3adme4t pushed a commit that referenced this pull request Jan 18, 2020
We got another syzbot report [1] that tells us we must use
write_lock_irq()/write_unlock_irq() to avoid possible deadlock.

[1]

WARNING: inconsistent lock state
5.5.0-rc1-syzkaller #0 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage.
syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
{HARDIRQ-ON-W} state was registered at:
  lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
  __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
  _raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319
  sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657
  tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489
  tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585
  tiocsetd drivers/tty/tty_io.c:2337 [inline]
  tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597
  vfs_ioctl fs/ioctl.c:47 [inline]
  file_ioctl fs/ioctl.c:545 [inline]
  do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732
  ksys_ioctl+0xab/0xd0 fs/ioctl.c:749
  __do_sys_ioctl fs/ioctl.c:756 [inline]
  __se_sys_ioctl fs/ioctl.c:754 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754
  do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
irq event stamp: 3946
hardirqs last  enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last  enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199
hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42
softirqs last  enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline]
softirqs last  enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222
softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline]
softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(disc_data_lock);
  <Interrupt>
    lock(disc_data_lock);

 *** DEADLOCK ***

5 locks held by syz-executor826/9605:
 #0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
 #1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413
 #2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
 #2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116
 #3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823
 #4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288

stack backtrace:
CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101
 valid_state kernel/locking/lockdep.c:3112 [inline]
 mark_lock_irq kernel/locking/lockdep.c:3309 [inline]
 mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666
 mark_usage kernel/locking/lockdep.c:3554 [inline]
 __lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909
 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
 _raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223
 sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
 sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402
 tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536
 tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50
 tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387
 uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104
 serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761
 serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834
 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline]
 serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850
 serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126
 __handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149
 handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189
 handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206
 handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830
 generic_handle_irq_desc include/linux/irqdesc.h:156 [inline]
 do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250
 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607
 </IRQ>
RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline]
RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579
Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7
RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7
RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd
RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000
RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899
R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138
R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000
 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline]
 __mutex_lock_common kernel/locking/mutex.c:962 [inline]
 __mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121
 tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
 tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665
 __fput+0x2ff/0x890 fs/file_table.c:280
 ____fput+0x16/0x20 fs/file_table.c:313
 task_work_run+0x145/0x1c0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x8e7/0x2ef0 kernel/exit.c:797
 do_group_exit+0x135/0x360 kernel/exit.c:895
 __do_sys_exit_group kernel/exit.c:906 [inline]
 __se_sys_exit_group kernel/exit.c:904 [inline]
 __x64_sys_exit_group+0x44/0x50 kernel/exit.c:904
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43fef8
Code: Bad RIP value.
RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8
RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0
R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001
R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000

Fixes: 6e4e2f8 ("6pack,mkiss: fix lock inconsistency")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
d3adme4t pushed a commit that referenced this pull request Oct 6, 2020
When closing and freeing the source device we could end up doing our
final blkdev_put() on the bdev, which will grab the bd_mutex.  As such
we want to be holding as few locks as possible, so move this call
outside of the dev_replace->lock_finishing_cancel_unmount lock.  Since
we're modifying the fs_devices we need to make sure we're holding the
uuid_mutex here, so take that as well.

There's a report from syzbot probably hitting one of the cases where
the bd_mutex and device_list_mutex are taken in the wrong order, however
it's not with device replace, like this patch fixes. As there's no
reproducer available so far, we can't verify the fix.

https://lore.kernel.org/lkml/000000000000fc04d105afcf86d7@google.com/
dashboard link: https://syzkaller.appspot.com/bug?extid=84a0634dc5d21d488419

  WARNING: possible circular locking dependency detected
  5.9.0-rc5-syzkaller #0 Not tainted
  ------------------------------------------------------
  syz-executor.0/6878 is trying to acquire lock:
  ffff88804c17d780 (&bdev->bd_mutex){+.+.}-{3:3}, at: blkdev_put+0x30/0x520 fs/block_dev.c:1804

  but task is already holding lock:
  ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #4 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
	 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
	 __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
	 btrfs_finish_chunk_alloc+0x281/0xf90 fs/btrfs/volumes.c:5255
	 btrfs_create_pending_block_groups+0x2f3/0x700 fs/btrfs/block-group.c:2109
	 __btrfs_end_transaction+0xf5/0x690 fs/btrfs/transaction.c:916
	 find_free_extent_update_loop fs/btrfs/extent-tree.c:3807 [inline]
	 find_free_extent+0x23b7/0x2e60 fs/btrfs/extent-tree.c:4127
	 btrfs_reserve_extent+0x166/0x460 fs/btrfs/extent-tree.c:4206
	 cow_file_range+0x3de/0x9b0 fs/btrfs/inode.c:1063
	 btrfs_run_delalloc_range+0x2cf/0x1410 fs/btrfs/inode.c:1838
	 writepage_delalloc+0x150/0x460 fs/btrfs/extent_io.c:3439
	 __extent_writepage+0x441/0xd00 fs/btrfs/extent_io.c:3653
	 extent_write_cache_pages.constprop.0+0x69d/0x1040 fs/btrfs/extent_io.c:4249
	 extent_writepages+0xcd/0x2b0 fs/btrfs/extent_io.c:4370
	 do_writepages+0xec/0x290 mm/page-writeback.c:2352
	 __writeback_single_inode+0x125/0x1400 fs/fs-writeback.c:1461
	 writeback_sb_inodes+0x53d/0xf40 fs/fs-writeback.c:1721
	 wb_writeback+0x2ad/0xd40 fs/fs-writeback.c:1894
	 wb_do_writeback fs/fs-writeback.c:2039 [inline]
	 wb_workfn+0x2dc/0x13e0 fs/fs-writeback.c:2080
	 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
	 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
	 kthread+0x3b5/0x4a0 kernel/kthread.c:292
	 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

  -> #3 (sb_internal#2){.+.+}-{0:0}:
	 percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
	 __sb_start_write+0x234/0x470 fs/super.c:1672
	 sb_start_intwrite include/linux/fs.h:1690 [inline]
	 start_transaction+0xbe7/0x1170 fs/btrfs/transaction.c:624
	 find_free_extent_update_loop fs/btrfs/extent-tree.c:3789 [inline]
	 find_free_extent+0x25e1/0x2e60 fs/btrfs/extent-tree.c:4127
	 btrfs_reserve_extent+0x166/0x460 fs/btrfs/extent-tree.c:4206
	 cow_file_range+0x3de/0x9b0 fs/btrfs/inode.c:1063
	 btrfs_run_delalloc_range+0x2cf/0x1410 fs/btrfs/inode.c:1838
	 writepage_delalloc+0x150/0x460 fs/btrfs/extent_io.c:3439
	 __extent_writepage+0x441/0xd00 fs/btrfs/extent_io.c:3653
	 extent_write_cache_pages.constprop.0+0x69d/0x1040 fs/btrfs/extent_io.c:4249
	 extent_writepages+0xcd/0x2b0 fs/btrfs/extent_io.c:4370
	 do_writepages+0xec/0x290 mm/page-writeback.c:2352
	 __writeback_single_inode+0x125/0x1400 fs/fs-writeback.c:1461
	 writeback_sb_inodes+0x53d/0xf40 fs/fs-writeback.c:1721
	 wb_writeback+0x2ad/0xd40 fs/fs-writeback.c:1894
	 wb_do_writeback fs/fs-writeback.c:2039 [inline]
	 wb_workfn+0x2dc/0x13e0 fs/fs-writeback.c:2080
	 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
	 worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
	 kthread+0x3b5/0x4a0 kernel/kthread.c:292
	 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294

  -> #2 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}:
	 __flush_work+0x60e/0xac0 kernel/workqueue.c:3041
	 wb_shutdown+0x180/0x220 mm/backing-dev.c:355
	 bdi_unregister+0x174/0x590 mm/backing-dev.c:872
	 del_gendisk+0x820/0xa10 block/genhd.c:933
	 loop_remove drivers/block/loop.c:2192 [inline]
	 loop_control_ioctl drivers/block/loop.c:2291 [inline]
	 loop_control_ioctl+0x3b1/0x480 drivers/block/loop.c:2257
	 vfs_ioctl fs/ioctl.c:48 [inline]
	 __do_sys_ioctl fs/ioctl.c:753 [inline]
	 __se_sys_ioctl fs/ioctl.c:739 [inline]
	 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
	 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (loop_ctl_mutex){+.+.}-{3:3}:
	 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
	 __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
	 lo_open+0x19/0xd0 drivers/block/loop.c:1893
	 __blkdev_get+0x759/0x1aa0 fs/block_dev.c:1507
	 blkdev_get fs/block_dev.c:1639 [inline]
	 blkdev_open+0x227/0x300 fs/block_dev.c:1753
	 do_dentry_open+0x4b9/0x11b0 fs/open.c:817
	 do_open fs/namei.c:3251 [inline]
	 path_openat+0x1b9a/0x2730 fs/namei.c:3368
	 do_filp_open+0x17e/0x3c0 fs/namei.c:3395
	 do_sys_openat2+0x16d/0x420 fs/open.c:1168
	 do_sys_open fs/open.c:1184 [inline]
	 __do_sys_open fs/open.c:1192 [inline]
	 __se_sys_open fs/open.c:1188 [inline]
	 __x64_sys_open+0x119/0x1c0 fs/open.c:1188
	 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&bdev->bd_mutex){+.+.}-{3:3}:
	 check_prev_add kernel/locking/lockdep.c:2496 [inline]
	 check_prevs_add kernel/locking/lockdep.c:2601 [inline]
	 validate_chain kernel/locking/lockdep.c:3218 [inline]
	 __lock_acquire+0x2a96/0x5780 kernel/locking/lockdep.c:4426
	 lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006
	 __mutex_lock_common kernel/locking/mutex.c:956 [inline]
	 __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
	 blkdev_put+0x30/0x520 fs/block_dev.c:1804
	 btrfs_close_bdev fs/btrfs/volumes.c:1117 [inline]
	 btrfs_close_bdev fs/btrfs/volumes.c:1107 [inline]
	 btrfs_close_one_device fs/btrfs/volumes.c:1133 [inline]
	 close_fs_devices.part.0+0x1a4/0x800 fs/btrfs/volumes.c:1161
	 close_fs_devices fs/btrfs/volumes.c:1193 [inline]
	 btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179
	 close_ctree+0x688/0x6cb fs/btrfs/disk-io.c:4149
	 generic_shutdown_super+0x144/0x370 fs/super.c:464
	 kill_anon_super+0x36/0x60 fs/super.c:1108
	 btrfs_kill_super+0x38/0x50 fs/btrfs/super.c:2265
	 deactivate_locked_super+0x94/0x160 fs/super.c:335
	 deactivate_super+0xad/0xd0 fs/super.c:366
	 cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118
	 task_work_run+0xdd/0x190 kernel/task_work.c:141
	 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
	 exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
	 exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
	 syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &bdev->bd_mutex --> sb_internal#2 --> &fs_devs->device_list_mutex

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&fs_devs->device_list_mutex);
				 lock(sb_internal#2);
				 lock(&fs_devs->device_list_mutex);
    lock(&bdev->bd_mutex);

   *** DEADLOCK ***

  3 locks held by syz-executor.0/6878:
   #0: ffff88809070c0e0 (&type->s_umount_key#70){++++}-{3:3}, at: deactivate_super+0xa5/0xd0 fs/super.c:365
   #1: ffffffff8a5b37a8 (uuid_mutex){+.+.}-{3:3}, at: btrfs_close_devices+0x23/0x1f0 fs/btrfs/volumes.c:1178
   #2: ffff8880908cfce0 (&fs_devs->device_list_mutex){+.+.}-{3:3}, at: close_fs_devices.part.0+0x2e/0x800 fs/btrfs/volumes.c:1159

  stack backtrace:
  CPU: 0 PID: 6878 Comm: syz-executor.0 Not tainted 5.9.0-rc5-syzkaller #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
  Call Trace:
   __dump_stack lib/dump_stack.c:77 [inline]
   dump_stack+0x198/0x1fd lib/dump_stack.c:118
   check_noncircular+0x324/0x3e0 kernel/locking/lockdep.c:1827
   check_prev_add kernel/locking/lockdep.c:2496 [inline]
   check_prevs_add kernel/locking/lockdep.c:2601 [inline]
   validate_chain kernel/locking/lockdep.c:3218 [inline]
   __lock_acquire+0x2a96/0x5780 kernel/locking/lockdep.c:4426
   lock_acquire+0x1f3/0xae0 kernel/locking/lockdep.c:5006
   __mutex_lock_common kernel/locking/mutex.c:956 [inline]
   __mutex_lock+0x134/0x10e0 kernel/locking/mutex.c:1103
   blkdev_put+0x30/0x520 fs/block_dev.c:1804
   btrfs_close_bdev fs/btrfs/volumes.c:1117 [inline]
   btrfs_close_bdev fs/btrfs/volumes.c:1107 [inline]
   btrfs_close_one_device fs/btrfs/volumes.c:1133 [inline]
   close_fs_devices.part.0+0x1a4/0x800 fs/btrfs/volumes.c:1161
   close_fs_devices fs/btrfs/volumes.c:1193 [inline]
   btrfs_close_devices+0x95/0x1f0 fs/btrfs/volumes.c:1179
   close_ctree+0x688/0x6cb fs/btrfs/disk-io.c:4149
   generic_shutdown_super+0x144/0x370 fs/super.c:464
   kill_anon_super+0x36/0x60 fs/super.c:1108
   btrfs_kill_super+0x38/0x50 fs/btrfs/super.c:2265
   deactivate_locked_super+0x94/0x160 fs/super.c:335
   deactivate_super+0xad/0xd0 fs/super.c:366
   cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118
   task_work_run+0xdd/0x190 kernel/task_work.c:141
   tracehook_notify_resume include/linux/tracehook.h:188 [inline]
   exit_to_user_mode_loop kernel/entry/common.c:163 [inline]
   exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:190
   syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:265
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x460027
  RSP: 002b:00007fff59216328 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
  RAX: 0000000000000000 RBX: 0000000000076035 RCX: 0000000000460027
  RDX: 0000000000403188 RSI: 0000000000000002 RDI: 00007fff592163d0
  RBP: 0000000000000333 R08: 0000000000000000 R09: 000000000000000b
  R10: 0000000000000005 R11: 0000000000000246 R12: 00007fff59217460
  R13: 0000000002df2a60 R14: 0000000000000000 R15: 00007fff59217460

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
[ add syzbot reference ]
Signed-off-by: David Sterba <dsterba@suse.com>
d3adme4t pushed a commit that referenced this pull request Oct 6, 2020
…kernel/git/kvmarm/kvmarm into kvm-master

KVM/arm64 fixes for 5.9, take #3

- Fix synchronization of VTTBR update on TLB invalidation for nVHE systems
d3adme4t pushed a commit that referenced this pull request May 10, 2021
When running the latest kernel on an sc7180 with KASAN I got this
splat:
  BUG: KASAN: slab-out-of-bounds in a6xx_gpu_init+0x618/0x644
  Read of size 4 at addr ffffff8088f36100 by task kworker/7:1/58
  CPU: 7 PID: 58 Comm: kworker/7:1 Not tainted 5.11.0+ #3
  Hardware name: Google Lazor (rev1 - 2) with LTE (DT)
  Workqueue: events deferred_probe_work_func
  Call trace:
   dump_backtrace+0x0/0x3a8
   show_stack+0x24/0x30
   dump_stack+0x174/0x1e0
   print_address_description+0x70/0x2e4
   kasan_report+0x178/0x1bc
   __asan_report_load4_noabort+0x44/0x50
   a6xx_gpu_init+0x618/0x644
   adreno_bind+0x26c/0x438

This is because the speed bin is defined like this:
  gpu_speed_bin: gpu_speed_bin@1d2 {
    reg = <0x1d2 0x2>;
    bits = <5 8>;
  };

As you can see the "length" is 2 bytes. That means that the nvmem
subsystem allocates only 2 bytes. The GPU code, however, was casting
the pointer allocated by nvmem to a (u32 *) and dereferencing. That's
not so good.

Let's fix this to just use the nvmem_cell_read_u16() accessor function
which simplifies things and also gets rid of the splat.

Let's also put an explicit conversion from little endian in place just
to make things clear. The nvmem subsystem today is assuming little
endian and this makes it clear. Specifically, the way the above sc7180
cell is interpreted:

NVMEM:
 +--------+--------+--------+--------+--------+
 | ...... | 0x1d3  | 0x1d2  | ...... | 0x000  |
 +--------+--------+--------+--------+--------+
              ^       ^
             msb     lsb

You can see that the least significant data is at the lower address
which is little endian.

NOTE: someone who is truly paying attention might wonder about me
picking the "u16" version of this accessor instead of the "u8" (since
the value is 8 bits big) or the u32 version (just for fun). At the
moment you need to pick the accessor that exactly matches the length
the cell was specified as in the device tree. Hopefully future
patches to the nvmem subsystem will fix this.

Fixes: fe7952c ("drm/msm: Add speed-bin support to a618 gpu")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Rob Clark <robdclark@chromium.org>
d3adme4t pushed a commit that referenced this pull request May 10, 2021
I got several memory leak reports from Asan with a simple command.  It
was because VDSO is not released due to the refcount.  Like in
__dsos_addnew_id(), it should put the refcount after adding to the list.

  $ perf record true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]

  =================================================================
  ==692599==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 439 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256
    #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132
    #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347
    #4 0x559bce50826c in map__new util/map.c:175
    #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #6 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #7 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #8 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #10 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #11 0x559bce519bea in perf_session__process_events util/session.c:2297
    #12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    frank-w#19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    frank-w#20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169
    #2 0x559bce50821b in map__new util/map.c:168
    #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #4 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #8 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #9 0x559bce519bea in perf_session__process_events util/session.c:2297
    #10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
d3adme4t pushed a commit that referenced this pull request May 10, 2021
Patch fixes the bug:
BUG: kernel NULL pointer dereference, address: 0000000000000050
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 4137 Comm: uvc-gadget Tainted: G           OE     5.10.0-next-20201214+ #3
Hardware name: ASUS All Series/Q87T, BIOS 0908 07/22/2014
RIP: 0010:cdnsp_remove_request+0xe9/0x530 [cdnsp_udc_pci]
Code: 01 00 00 31 f6 48 89 df e8 64 d4 ff ff 48 8b 43 08 48 8b 13 45 31 f6 48 89 42 08 48 89 10 b8 98 ff ff ff 48 89 1b 48 89 5b 08 <41> 83 6d 50 01 41 83 af d0 00 00 00 01 41 f6 84 24 78 20 00 00 08
RSP: 0018:ffffb68d00d07b60 EFLAGS: 00010046
RAX: 00000000ffffff98 RBX: ffff9d29c57fbf00 RCX: 0000000000001400
RDX: ffff9d29c57fbf00 RSI: 0000000000000000 RDI: ffff9d29c57fbf00
RBP: ffffb68d00d07bb0 R08: ffff9d2ad9510a00 R09: ffff9d2ac011c000
R10: ffff9d2a12b6e760 R11: 0000000000000000 R12: ffff9d29d3fb8000
R13: 0000000000000000 R14: 0000000000000000 R15: ffff9d29d3fb88c0
FS:  0000000000000000(0000) GS:ffff9d2adba00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000050 CR3: 0000000102164005 CR4: 00000000001706f0
Call Trace:
 cdnsp_ep_dequeue+0x3c/0x90 [cdnsp_udc_pci]
 cdnsp_gadget_ep_dequeue+0x3f/0x80 [cdnsp_udc_pci]
 usb_ep_dequeue+0x21/0x70 [udc_core]
 uvcg_video_enable+0x19d/0x220 [usb_f_uvc]
 uvc_v4l2_release+0x49/0x90 [usb_f_uvc]
 v4l2_release+0xa5/0x100 [videodev]
 __fput+0x99/0x250
 ____fput+0xe/0x10
 task_work_run+0x75/0xb0
 do_exit+0x370/0xb80
 do_group_exit+0x43/0xa0
 get_signal+0x12d/0x820
 arch_do_signal_or_restart+0xb2/0x870
 ? __switch_to_asm+0x36/0x70
 ? kern_select+0xc6/0x100
 exit_to_user_mode_prepare+0xfc/0x170
 syscall_exit_to_user_mode+0x2a/0x40
 do_syscall_64+0x43/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe969cf5dd7
Code: Unable to access opcode bytes at RIP 0x7fe969cf5dad.

Problem occurs for UVC class. During disconnecting the UVC class disable
endpoints and then start dequeuing all requests. This leads to situation
where requests are removed twice. The first one in
cdnsp_gadget_ep_disable and the second in cdnsp_gadget_ep_dequeue
function.
Patch adds condition in cdnsp_gadget_ep_dequeue function which allows
dequeue requests only from enabled endpoint.

Fixes: 3d82904 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
Signed-off-by: Peter Chen <peter.chen@kernel.org>
d3adme4t pushed a commit that referenced this pull request May 10, 2021
…/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.12, take #3

- Fix GICv3 MMIO compatibility probing
- Prevent guests from using the ARMv8.4 self-hosted tracing extension
d3adme4t pushed a commit that referenced this pull request May 10, 2021
The following deadlock is detected:

  truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).

  PID: 14827  TASK: ffff881686a9af80  CPU: 20  COMMAND: "ora_p005_hrltd9"
   #0  __schedule at ffffffff818667cc
   #1  schedule at ffffffff81866de6
   #2  inode_dio_wait at ffffffff812a2d04
   #3  ocfs2_setattr at ffffffffc05f322e [ocfs2]
   #4  notify_change at ffffffff812a5a09
   #5  do_truncate at ffffffff812808f5
   #6  do_sys_ftruncate.constprop.18 at ffffffff81280cf2
   #7  sys_ftruncate at ffffffff81280d8e
   #8  do_syscall_64 at ffffffff81003949
   #9  entry_SYSCALL_64_after_hwframe at ffffffff81a001ad

dio completion path is going to complete one direct IO (decrement
inode->i_dio_count), but before that it hung at locking inode->i_rwsem:

   #0  __schedule+700 at ffffffff818667cc
   #1  schedule+54 at ffffffff81866de6
   #2  rwsem_down_write_failed+536 at ffffffff8186aa28
   #3  call_rwsem_down_write_failed+23 at ffffffff8185a1b7
   #4  down_write+45 at ffffffff81869c9d
   #5  ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
   #6  ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
   #7  dio_complete+140 at ffffffff812c873c
   #8  dio_aio_complete_work+25 at ffffffff812c89f9
   #9  process_one_work+361 at ffffffff810b1889
  #10  worker_thread+77 at ffffffff810b233d
  #11  kthread+261 at ffffffff810b7fd5
  #12  ret_from_fork+62 at ffffffff81a0035e

Thus above forms ABBA deadlock.  The same deadlock was mentioned in
upstream commit 28f5a8a ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()").  It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.

End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.

This is to fix the deadlock itself.  It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.

[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested]
  Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com

Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
d3adme4t pushed a commit that referenced this pull request Sep 3, 2021
commit 0af7782 upstream.

The execution of fb_delete_videomode() is not based on the result of the
previous fbcon_mode_deleted(). As a result, the mode is directly deleted,
regardless of whether it is still in use, which may cause UAF.

==================================================================
BUG: KASAN: use-after-free in fb_mode_is_equal+0x36e/0x5e0 \
drivers/video/fbdev/core/modedb.c:924
Read of size 4 at addr ffff88807e0ddb1c by task syz-executor.0/18962

CPU: 2 PID: 18962 Comm: syz-executor.0 Not tainted 5.10.45-rc1+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ...
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x137/0x1be lib/dump_stack.c:118
 print_address_description+0x6c/0x640 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report+0x13d/0x1e0 mm/kasan/report.c:562
 fb_mode_is_equal+0x36e/0x5e0 drivers/video/fbdev/core/modedb.c:924
 fbcon_mode_deleted+0x16a/0x220 drivers/video/fbdev/core/fbcon.c:2746
 fb_set_var+0x1e1/0xdb0 drivers/video/fbdev/core/fbmem.c:975
 do_fb_ioctl+0x4d9/0x6e0 drivers/video/fbdev/core/fbmem.c:1108
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 18960:
 kasan_save_stack mm/kasan/common.c:48 [inline]
 kasan_set_track+0x3d/0x70 mm/kasan/common.c:56
 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355
 __kasan_slab_free+0x108/0x140 mm/kasan/common.c:422
 slab_free_hook mm/slub.c:1541 [inline]
 slab_free_freelist_hook+0xd6/0x1a0 mm/slub.c:1574
 slab_free mm/slub.c:3139 [inline]
 kfree+0xca/0x3d0 mm/slub.c:4121
 fb_delete_videomode+0x56a/0x820 drivers/video/fbdev/core/modedb.c:1104
 fb_set_var+0x1f3/0xdb0 drivers/video/fbdev/core/fbmem.c:978
 do_fb_ioctl+0x4d9/0x6e0 drivers/video/fbdev/core/fbmem.c:1108
 vfs_ioctl fs/ioctl.c:48 [inline]
 __do_sys_ioctl fs/ioctl.c:753 [inline]
 __se_sys_ioctl+0xfb/0x170 fs/ioctl.c:739
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 13ff178 ("fbcon: Call fbcon_mode_deleted/new_modelist directly")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210712085544.2828-1-thunder.leizhen@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Sep 3, 2021
commit 4385539 upstream.

The ordering of MSI-X enable in hardware is dysfunctional:

 1) MSI-X is disabled in the control register
 2) Various setup functions
 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing
    the MSI-X table entries
 4) MSI-X is enabled and masked in the control register with the
    comment that enabling is required for some hardware to access
    the MSI-X table

Step #4 obviously contradicts #3. The history of this is an issue with the
NIU hardware. When #4 was introduced the table access actually happened in
msix_program_entries() which was invoked after enabling and masking MSI-X.

This was changed in commit d71d643 ("PCI/MSI: Kill redundant call of
irq_set_msi_desc() for MSI-X interrupts") which removed the table write
from msix_program_entries().

Interestingly enough nobody noticed and either NIU still works or it did
not get any testing with a kernel 3.19 or later.

Nevertheless this is inconsistent and there is no reason why MSI-X can't be
enabled and masked in the control register early on, i.e. move step #4
above to step #1. This preserves the NIU workaround and has no side effects
on other hardware.

Fixes: d71d643 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
commit 67069a1 upstream.

ASan reported a memory leak caused by info_linear not being deallocated.

The info_linear was allocated during in perf_event__synthesize_one_bpf_prog().

This patch adds the corresponding free() when bpf_prog_info_node
is freed in perf_env__purge_bpf().

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 7688 byte(s) in 19 object(s) allocated from:
      #0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f)
      #1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16
      #2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16
      #3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9
      #4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8
      #5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8
      #6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8
      #7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
      #8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
      #9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2
      #10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3
      #11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
commit 41d5854 upstream.

I got several memory leak reports from Asan with a simple command.  It
was because VDSO is not released due to the refcount.  Like in
__dsos_addnew_id(), it should put the refcount after adding to the list.

  $ perf record true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]

  =================================================================
  ==692599==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 439 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256
    #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132
    #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347
    #4 0x559bce50826c in map__new util/map.c:175
    #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #6 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #7 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #8 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #10 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #11 0x559bce519bea in perf_session__process_events util/session.c:2297
    #12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    frank-w#19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    frank-w#20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169
    #2 0x559bce50821b in map__new util/map.c:168
    #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
    #4 0x559bce512f6b in machines__deliver_event util/session.c:1481
    #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551
    #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244
    #7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
    #8 0x559bce519bea in __perf_session__process_events util/session.c:2268
    #9 0x559bce519bea in perf_session__process_events util/session.c:2297
    #10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
    #11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
    #12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
    #13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
    #14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
[ Upstream commit 21e3980 ]

vctrl_enable() and vctrl_disable() call regulator_enable() and
regulator_disable(), respectively. However, vctrl_* are regulator ops
and should not be calling the locked regulator APIs. Doing so results in
a lockdep warning.

Instead of exporting more internal regulator ops, model the ctrl supply
as an actual supply to vctrl-regulator. At probe time this driver still
needs to use the consumer API to fetch its constraints, but otherwise
lets the regulator core handle the upstream supply for it.

The enable/disable/is_enabled ops are not removed, but now only track
state internally. This preserves the original behavior with the ops
being available, but one could argue that the original behavior was
already incorrect: the internal state would not match the upstream
supply if that supply had another consumer that enabled the supply,
while vctrl-regulator was not enabled.

The lockdep warning is as follows:

	WARNING: possible circular locking dependency detected
	5.14.0-rc6 #2 Not tainted
	------------------------------------------------------
	swapper/0/1 is trying to acquire lock:
	ffffffc011306d00 (regulator_list_mutex){+.+.}-{3:3}, at:
		regulator_lock_dependent (arch/arm64/include/asm/current.h:19
					  include/linux/ww_mutex.h:111
					  drivers/regulator/core.c:329)

	but task is already holding lock:
	ffffff8004a77160 (regulator_ww_class_mutex){+.+.}-{3:3}, at:
		regulator_lock_recursive (drivers/regulator/core.c:156
					  drivers/regulator/core.c:263)

	which lock already depends on the new lock.

	the existing dependency chain (in reverse order) is:

	-> #2 (regulator_ww_class_mutex){+.+.}-{3:3}:
	__mutex_lock_common (include/asm-generic/atomic-instrumented.h:606
			     include/asm-generic/atomic-long.h:29
			     kernel/locking/mutex.c:103
			     kernel/locking/mutex.c:144
			     kernel/locking/mutex.c:963)
	ww_mutex_lock (kernel/locking/mutex.c:1199)
	regulator_lock_recursive (drivers/regulator/core.c:156
				  drivers/regulator/core.c:263)
	regulator_lock_dependent (drivers/regulator/core.c:343)
	regulator_enable (drivers/regulator/core.c:2808)
	set_machine_constraints (drivers/regulator/core.c:1536)
	regulator_register (drivers/regulator/core.c:5486)
	devm_regulator_register (drivers/regulator/devres.c:196)
	reg_fixed_voltage_probe (drivers/regulator/fixed.c:289)
	platform_probe (drivers/base/platform.c:1427)
	[...]

	-> #1 (regulator_ww_class_acquire){+.+.}-{0:0}:
	regulator_lock_dependent (include/linux/ww_mutex.h:129
				  drivers/regulator/core.c:329)
	regulator_enable (drivers/regulator/core.c:2808)
	set_machine_constraints (drivers/regulator/core.c:1536)
	regulator_register (drivers/regulator/core.c:5486)
	devm_regulator_register (drivers/regulator/devres.c:196)
	reg_fixed_voltage_probe (drivers/regulator/fixed.c:289)
	[...]

	-> #0 (regulator_list_mutex){+.+.}-{3:3}:
	__lock_acquire (kernel/locking/lockdep.c:3052 (discriminator 4)
			kernel/locking/lockdep.c:3174 (discriminator 4)
			kernel/locking/lockdep.c:3789 (discriminator 4)
			kernel/locking/lockdep.c:5015 (discriminator 4))
	lock_acquire (arch/arm64/include/asm/percpu.h:39
		      kernel/locking/lockdep.c:438
		      kernel/locking/lockdep.c:5627)
	__mutex_lock_common (include/asm-generic/atomic-instrumented.h:606
			     include/asm-generic/atomic-long.h:29
			     kernel/locking/mutex.c:103
			     kernel/locking/mutex.c:144
			     kernel/locking/mutex.c:963)
	mutex_lock_nested (kernel/locking/mutex.c:1125)
	regulator_lock_dependent (arch/arm64/include/asm/current.h:19
				  include/linux/ww_mutex.h:111
				  drivers/regulator/core.c:329)
	regulator_enable (drivers/regulator/core.c:2808)
	vctrl_enable (drivers/regulator/vctrl-regulator.c:400)
	_regulator_do_enable (drivers/regulator/core.c:2617)
	_regulator_enable (drivers/regulator/core.c:2764)
	regulator_enable (drivers/regulator/core.c:308
			  drivers/regulator/core.c:2809)
	_set_opp (drivers/opp/core.c:819 drivers/opp/core.c:1072)
	dev_pm_opp_set_rate (drivers/opp/core.c:1164)
	set_target (drivers/cpufreq/cpufreq-dt.c:62)
	__cpufreq_driver_target (drivers/cpufreq/cpufreq.c:2216
				 drivers/cpufreq/cpufreq.c:2271)
	cpufreq_online (drivers/cpufreq/cpufreq.c:1488 (discriminator 2))
	cpufreq_add_dev (drivers/cpufreq/cpufreq.c:1563)
	subsys_interface_register (drivers/base/bus.c:?)
	cpufreq_register_driver (drivers/cpufreq/cpufreq.c:2819)
	dt_cpufreq_probe (drivers/cpufreq/cpufreq-dt.c:344)
	[...]

	other info that might help us debug this:

	Chain exists of:
	  regulator_list_mutex --> regulator_ww_class_acquire --> regulator_ww_class_mutex

	 Possible unsafe locking scenario:

	       CPU0                    CPU1
	       ----                    ----
	  lock(regulator_ww_class_mutex);
				       lock(regulator_ww_class_acquire);
				       lock(regulator_ww_class_mutex);
	  lock(regulator_list_mutex);

	 *** DEADLOCK ***

	6 locks held by swapper/0/1:
	#0: ffffff8002d32188 (&dev->mutex){....}-{3:3}, at:
		__device_driver_lock (drivers/base/dd.c:1030)
	#1: ffffffc0111a0520 (cpu_hotplug_lock){++++}-{0:0}, at:
		cpufreq_register_driver (drivers/cpufreq/cpufreq.c:2792 (discriminator 2))
	#2: ffffff8002a8d918 (subsys mutex#9){+.+.}-{3:3}, at:
		subsys_interface_register (drivers/base/bus.c:1033)
	#3: ffffff800341bb90 (&policy->rwsem){+.+.}-{3:3}, at:
		cpufreq_online (include/linux/bitmap.h:285
				include/linux/cpumask.h:405
				drivers/cpufreq/cpufreq.c:1399)
	#4: ffffffc011f0b7b8 (regulator_ww_class_acquire){+.+.}-{0:0}, at:
		regulator_enable (drivers/regulator/core.c:2808)
	#5: ffffff8004a77160 (regulator_ww_class_mutex){+.+.}-{3:3}, at:
		regulator_lock_recursive (drivers/regulator/core.c:156
		drivers/regulator/core.c:263)

	stack backtrace:
	CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc6 #2 7c8f8996d021ed0f65271e6aeebf7999de74a9fa
	Hardware name: Google Scarlet (DT)
	Call trace:
	dump_backtrace (arch/arm64/kernel/stacktrace.c:161)
	show_stack (arch/arm64/kernel/stacktrace.c:218)
	dump_stack_lvl (lib/dump_stack.c:106 (discriminator 2))
	dump_stack (lib/dump_stack.c:113)
	print_circular_bug (kernel/locking/lockdep.c:?)
	check_noncircular (kernel/locking/lockdep.c:?)
	__lock_acquire (kernel/locking/lockdep.c:3052 (discriminator 4)
			kernel/locking/lockdep.c:3174 (discriminator 4)
			kernel/locking/lockdep.c:3789 (discriminator 4)
			kernel/locking/lockdep.c:5015 (discriminator 4))
	lock_acquire (arch/arm64/include/asm/percpu.h:39
		      kernel/locking/lockdep.c:438
		      kernel/locking/lockdep.c:5627)
	__mutex_lock_common (include/asm-generic/atomic-instrumented.h:606
			     include/asm-generic/atomic-long.h:29
			     kernel/locking/mutex.c:103
			     kernel/locking/mutex.c:144
			     kernel/locking/mutex.c:963)
	mutex_lock_nested (kernel/locking/mutex.c:1125)
	regulator_lock_dependent (arch/arm64/include/asm/current.h:19
				  include/linux/ww_mutex.h:111
				  drivers/regulator/core.c:329)
	regulator_enable (drivers/regulator/core.c:2808)
	vctrl_enable (drivers/regulator/vctrl-regulator.c:400)
	_regulator_do_enable (drivers/regulator/core.c:2617)
	_regulator_enable (drivers/regulator/core.c:2764)
	regulator_enable (drivers/regulator/core.c:308
			  drivers/regulator/core.c:2809)
	_set_opp (drivers/opp/core.c:819 drivers/opp/core.c:1072)
	dev_pm_opp_set_rate (drivers/opp/core.c:1164)
	set_target (drivers/cpufreq/cpufreq-dt.c:62)
	__cpufreq_driver_target (drivers/cpufreq/cpufreq.c:2216
				 drivers/cpufreq/cpufreq.c:2271)
	cpufreq_online (drivers/cpufreq/cpufreq.c:1488 (discriminator 2))
	cpufreq_add_dev (drivers/cpufreq/cpufreq.c:1563)
	subsys_interface_register (drivers/base/bus.c:?)
	cpufreq_register_driver (drivers/cpufreq/cpufreq.c:2819)
	dt_cpufreq_probe (drivers/cpufreq/cpufreq-dt.c:344)
	[...]

Reported-by: Brian Norris <briannorris@chromium.org>
Fixes: f8702f9 ("regulator: core: Use ww_mutex for regulators locking")
Fixes: e915331 ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Link: https://lore.kernel.org/r/20210825033704.3307263-3-wenst@chromium.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
[ Upstream commit 6fa54bc ]

If em28xx_ir_init fails, it would decrease the refcount of dev. However,
in the em28xx_ir_fini, when ir is NULL, it goes to ref_put and decrease
the refcount of dev. This will lead to a refcount bug.

Fix this bug by removing the kref_put in the error handling code
of em28xx_ir_init.

refcount_t: underflow; use-after-free.
WARNING: CPU: 0 PID: 7 at lib/refcount.c:28 refcount_warn_saturate+0x18e/0x1a0 lib/refcount.c:28
Modules linked in:
CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.13.0 #3
Workqueue: usb_hub_wq hub_event
RIP: 0010:refcount_warn_saturate+0x18e/0x1a0 lib/refcount.c:28
Call Trace:
  kref_put.constprop.0+0x60/0x85 include/linux/kref.h:69
  em28xx_usb_disconnect.cold+0xd7/0xdc drivers/media/usb/em28xx/em28xx-cards.c:4150
  usb_unbind_interface+0xbf/0x3a0 drivers/usb/core/driver.c:458
  __device_release_driver drivers/base/dd.c:1201 [inline]
  device_release_driver_internal+0x22a/0x230 drivers/base/dd.c:1232
  bus_remove_device+0x108/0x160 drivers/base/bus.c:529
  device_del+0x1fe/0x510 drivers/base/core.c:3540
  usb_disable_device+0xd1/0x1d0 drivers/usb/core/message.c:1419
  usb_disconnect+0x109/0x330 drivers/usb/core/hub.c:2221
  hub_port_connect drivers/usb/core/hub.c:5151 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5440 [inline]
  port_event drivers/usb/core/hub.c:5586 [inline]
  hub_event+0xf81/0x1d40 drivers/usb/core/hub.c:5668
  process_one_work+0x2c9/0x610 kernel/workqueue.c:2276
  process_scheduled_works kernel/workqueue.c:2338 [inline]
  worker_thread+0x333/0x5b0 kernel/workqueue.c:2424
  kthread+0x188/0x1d0 kernel/kthread.c:319
  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Fixes: ac56886 ("media: em28xx: Fix possible memory leak of em28xx struct")
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
commit 57f0ff0 upstream.

It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:

  # perf top --sort comm -g --ignore-callees=do_idle

terminates with:

  #0  0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
  #1  0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
  #2  0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
  #3  hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
  #4  0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
      sample_self=sample_self@entry=true) at util/hist.c:657
  #5  0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
      sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
  #6  0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
      at util/hist.c:1056
  #7  iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
  #8  0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
      at util/hist.c:1231
  #9  0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
      at builtin-top.c:842
  #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
  #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
  #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
  #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
  #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
  #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
  #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
  #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
  #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6

If you look at the frame #2, the code is:

488	 if (he->srcline) {
489          he->srcline = strdup(he->srcline);
490          if (he->srcline == NULL)
491              goto err_rawdata;
492	 }

If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.

Also, if you look at the commit 1fb7d06, it adds the srcline property
into the struct, but not initializing it everywhere needed.

Committer notes:

Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():

2181         if (al.sym != NULL) {
2182                 if (perf_hpp_list.parent && !*parent &&
2183                     symbol__match_regex(al.sym, &parent_regex))
2184                         *parent = al.sym;
2185                 else if (have_ignore_callees && root_al &&
2186                   symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187                         /* Treat this symbol as the root,
2188                            forgetting its callees. */
2189                         *root_al = al;
2190                         callchain_cursor_reset(cursor);
2191                 }
2192         }

And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:

1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212                          int max_stack_depth, void *arg)
1213 {
1214         int err, err2;
1215         struct map *alm = NULL;
1216
1217         if (al)
1218                 alm = map__get(al->map);
1219
1220         err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221                                         iter->evsel, al, max_stack_depth);
1222         if (err) {
1223                 map__put(alm);
1224                 return err;
1225         }
1226
1227         err = iter->ops->prepare_entry(iter, al);
1228         if (err)
1229                 goto out;
1230
1231         err = iter->ops->add_single_entry(iter, al);
1232         if (err)
1233                 goto out;
1234

That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:

        iter->ops->add_single_entry(iter, al);

will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
commit 98e2e40 upstream.

When the refcount is decreased to 0, the resource reclamation branch is
entered.  Before CPU0 reaches the race point (1), CPU1 may obtain the
spinlock and traverse the rbtree to find 'root', see
nilfs_lookup_root().

Although CPU1 will call refcount_inc() to increase the refcount, it is
obviously too late.  CPU0 will release 'root' directly, CPU1 then
accesses 'root' and triggers UAF.

Use refcount_dec_and_lock() to ensure that both the operations of
decrease refcount to 0 and link deletion are lock protected eliminates
this risk.

	     CPU0                      CPU1
	nilfs_put_root():
		    <-------- (1)
				spin_lock(&nilfs->ns_cptree_lock);
				rb_erase(&root->rb_node, &nilfs->ns_cptree);
				spin_unlock(&nilfs->ns_cptree_lock);

	kfree(root);
		    <-------- use-after-free

  refcount_t: underflow; use-after-free.
  WARNING: CPU: 2 PID: 9476 at lib/refcount.c:28 \
  refcount_warn_saturate+0x1cf/0x210 lib/refcount.c:28
  Modules linked in:
  CPU: 2 PID: 9476 Comm: syz-executor.0 Not tainted 5.10.45-rc1+ #3
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
  RIP: 0010:refcount_warn_saturate+0x1cf/0x210 lib/refcount.c:28
  ... ...
  Call Trace:
     __refcount_sub_and_test include/linux/refcount.h:283 [inline]
     __refcount_dec_and_test include/linux/refcount.h:315 [inline]
     refcount_dec_and_test include/linux/refcount.h:333 [inline]
     nilfs_put_root+0xc1/0xd0 fs/nilfs2/the_nilfs.c:795
     nilfs_segctor_destroy fs/nilfs2/segment.c:2749 [inline]
     nilfs_detach_log_writer+0x3fa/0x570 fs/nilfs2/segment.c:2812
     nilfs_put_super+0x2f/0xf0 fs/nilfs2/super.c:467
     generic_shutdown_super+0xcd/0x1f0 fs/super.c:464
     kill_block_super+0x4a/0x90 fs/super.c:1446
     deactivate_locked_super+0x6a/0xb0 fs/super.c:335
     deactivate_super+0x85/0x90 fs/super.c:366
     cleanup_mnt+0x277/0x2e0 fs/namespace.c:1118
     __cleanup_mnt+0x15/0x20 fs/namespace.c:1125
     task_work_run+0x8e/0x110 kernel/task_work.c:151
     tracehook_notify_resume include/linux/tracehook.h:188 [inline]
     exit_to_user_mode_loop kernel/entry/common.c:164 [inline]
     exit_to_user_mode_prepare+0x13c/0x170 kernel/entry/common.c:191
     syscall_exit_to_user_mode+0x16/0x30 kernel/entry/common.c:266
     do_syscall_64+0x45/0x80 arch/x86/entry/common.c:56
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

There is no reproduction program, and the above is only theoretical
analysis.

Link: https://lkml.kernel.org/r/1629859428-5906-1-git-send-email-konishi.ryusuke@gmail.com
Fixes: ba65ae4 ("nilfs2: add checkpoint tree to nilfs object")
Link: https://lkml.kernel.org/r/20210723012317.4146-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
[ Upstream commit 8f96a5b ]

We update the ctime/mtime of a block device when we remove it so that
blkid knows the device changed.  However we do this by re-opening the
block device and calling filp_update_time.  This is more correct because
it'll call the inode->i_op->update_time if it exists, but the block dev
inodes do not do this.  Instead call generic_update_time() on the
bd_inode in order to avoid the blkdev_open path and get rid of the
following lockdep splat:

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #406 Not tainted
------------------------------------------------------
losetup/11596 is trying to acquire lock:
ffff939640d2f538 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0

but task is already holding lock:
ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&lo->lo_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7d/0x750
       lo_open+0x28/0x60 [loop]
       blkdev_get_whole+0x25/0xf0
       blkdev_get_by_dev.part.0+0x168/0x3c0
       blkdev_open+0xd2/0xe0
       do_dentry_open+0x161/0x390
       path_openat+0x3cc/0xa20
       do_filp_open+0x96/0x120
       do_sys_openat2+0x7b/0x130
       __x64_sys_openat+0x46/0x70
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #3 (&disk->open_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7d/0x750
       blkdev_get_by_dev.part.0+0x56/0x3c0
       blkdev_open+0xd2/0xe0
       do_dentry_open+0x161/0x390
       path_openat+0x3cc/0xa20
       do_filp_open+0x96/0x120
       file_open_name+0xc7/0x170
       filp_open+0x2c/0x50
       btrfs_scratch_superblocks.part.0+0x10f/0x170
       btrfs_rm_device.cold+0xe8/0xed
       btrfs_ioctl+0x2a31/0x2e70
       __x64_sys_ioctl+0x80/0xb0
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #2 (sb_writers#12){.+.+}-{0:0}:
       lo_write_bvec+0xc2/0x240 [loop]
       loop_process_work+0x238/0xd00 [loop]
       process_one_work+0x26b/0x560
       worker_thread+0x55/0x3c0
       kthread+0x140/0x160
       ret_from_fork+0x1f/0x30

-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}:
       process_one_work+0x245/0x560
       worker_thread+0x55/0x3c0
       kthread+0x140/0x160
       ret_from_fork+0x1f/0x30

-> #0 ((wq_completion)loop0){+.+.}-{0:0}:
       __lock_acquire+0x10ea/0x1d90
       lock_acquire+0xb5/0x2b0
       flush_workqueue+0x91/0x5e0
       drain_workqueue+0xa0/0x110
       destroy_workqueue+0x36/0x250
       __loop_clr_fd+0x9a/0x660 [loop]
       block_ioctl+0x3f/0x50
       __x64_sys_ioctl+0x80/0xb0
       do_syscall_64+0x38/0x90
       entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

Chain exists of:
  (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&lo->lo_mutex);
                               lock(&disk->open_mutex);
                               lock(&lo->lo_mutex);
  lock((wq_completion)loop0);

 *** DEADLOCK ***

1 lock held by losetup/11596:
 #0: ffff939655510c68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]

stack backtrace:
CPU: 1 PID: 11596 Comm: losetup Not tainted 5.14.0-rc2+ #406
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
 dump_stack_lvl+0x57/0x72
 check_noncircular+0xcf/0xf0
 ? stack_trace_save+0x3b/0x50
 __lock_acquire+0x10ea/0x1d90
 lock_acquire+0xb5/0x2b0
 ? flush_workqueue+0x67/0x5e0
 ? lockdep_init_map_type+0x47/0x220
 flush_workqueue+0x91/0x5e0
 ? flush_workqueue+0x67/0x5e0
 ? verify_cpu+0xf0/0x100
 drain_workqueue+0xa0/0x110
 destroy_workqueue+0x36/0x250
 __loop_clr_fd+0x9a/0x660 [loop]
 ? blkdev_ioctl+0x8d/0x2a0
 block_ioctl+0x3f/0x50
 __x64_sys_ioctl+0x80/0xb0
 do_syscall_64+0x38/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
commit 8253a34 upstream.

When passing 'phys' in the devicetree to describe the USB PHY phandle
(which is the recommended way according to
Documentation/devicetree/bindings/usb/ci-hdrc-usb2.txt) the
following NULL pointer dereference is observed on i.MX7 and i.MX8MM:

[    1.489344] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
[    1.498170] Mem abort info:
[    1.500966]   ESR = 0x96000044
[    1.504030]   EC = 0x25: DABT (current EL), IL = 32 bits
[    1.509356]   SET = 0, FnV = 0
[    1.512416]   EA = 0, S1PTW = 0
[    1.515569]   FSC = 0x04: level 0 translation fault
[    1.520458] Data abort info:
[    1.523349]   ISV = 0, ISS = 0x00000044
[    1.527196]   CM = 0, WnR = 1
[    1.530176] [0000000000000098] user address but active_mm is swapper
[    1.536544] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[    1.542125] Modules linked in:
[    1.545190] CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.14.0-dirty #3
[    1.551901] Hardware name: Kontron i.MX8MM N801X S (DT)
[    1.557133] Workqueue: events_unbound deferred_probe_work_func
[    1.562984] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[    1.568998] pc : imx7d_charger_detection+0x3f0/0x510
[    1.573973] lr : imx7d_charger_detection+0x22c/0x510

This happens because the charger functions check for the phy presence
inside the imx_usbmisc_data structure (data->usb_phy), but the chipidea
core populates the usb_phy passed via 'phys' inside 'struct ci_hdrc'
(ci->usb_phy) instead.

This causes the NULL pointer dereference inside imx7d_charger_detection().

Fix it by also searching for 'phys' in case 'fsl,usbphy' is not found.

Tested on a imx7s-warp board.

Fixes: 746f316 ("usb: chipidea: introduce imx7d USB charger detection")
Cc: stable@vger.kernel.org
Reported-by: Heiko Thiery <heiko.thiery@gmail.com>
Tested-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de>
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20210921113754.767631-1-festevam@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d3adme4t pushed a commit that referenced this pull request Nov 11, 2021
…rbage value

[ Upstream commit 902c0b1 ]

Currently, when the rule related to IDLETIMER is added, idletimer_tg timer
structure is initialized by kmalloc on executing idletimer_tg_create
function. However, in this process timer->timer_type is not defined to
a specific value. Thus, timer->timer_type has garbage value and it occurs
kernel panic. So, this commit fixes the panic by initializing
timer->timer_type using kzalloc instead of kmalloc.

Test commands:
    # iptables -A OUTPUT -j IDLETIMER --timeout 1 --label test
    $ cat /sys/class/xt_idletimer/timers/test
      Killed

Splat looks like:
    BUG: KASAN: user-memory-access in alarm_expires_remaining+0x49/0x70
    Read of size 8 at addr 0000002e8c7bc4c8 by task cat/917
    CPU: 12 PID: 917 Comm: cat Not tainted 5.14.0+ #3 79940a339f71eb14fc81aee1757a20d5bf13eb0e
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    Call Trace:
     dump_stack_lvl+0x6e/0x9c
     kasan_report.cold+0x112/0x117
     ? alarm_expires_remaining+0x49/0x70
     __asan_load8+0x86/0xb0
     alarm_expires_remaining+0x49/0x70
     idletimer_tg_show+0xe5/0x19b [xt_IDLETIMER 11219304af9316a21bee5ba9d58f76a6b9bccc6d]
     dev_attr_show+0x3c/0x60
     sysfs_kf_seq_show+0x11d/0x1f0
     ? device_remove_bin_file+0x20/0x20
     kernfs_seq_show+0xa4/0xb0
     seq_read_iter+0x29c/0x750
     kernfs_fop_read_iter+0x25a/0x2c0
     ? __fsnotify_parent+0x3d1/0x570
     ? iov_iter_init+0x70/0x90
     new_sync_read+0x2a7/0x3d0
     ? __x64_sys_llseek+0x230/0x230
     ? rw_verify_area+0x81/0x150
     vfs_read+0x17b/0x240
     ksys_read+0xd9/0x180
     ? vfs_write+0x460/0x460
     ? do_syscall_64+0x16/0xc0
     ? lockdep_hardirqs_on+0x79/0x120
     __x64_sys_read+0x43/0x50
     do_syscall_64+0x3b/0xc0
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f0cdc819142
    Code: c0 e9 c2 fe ff ff 50 48 8d 3d 3a ca 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
    RSP: 002b:00007fff28eee5b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
    RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f0cdc819142
    RDX: 0000000000020000 RSI: 00007f0cdc032000 RDI: 0000000000000003
    RBP: 00007f0cdc032000 R08: 00007f0cdc031010 R09: 0000000000000000
    R10: 0000000000000022 R11: 0000000000000246 R12: 00005607e9ee31f0
    R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000

Fixes: 68983a3 ("netfilter: xtables: Add snapshot of hardidletimer target")
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
d3adme4t pushed a commit that referenced this pull request Dec 14, 2022
…kernel/git/at91/linux into arm/fixes

AT91 fixes for 6.1 #3

It contains:
- build fix for SAMA5D3 devices which don't have an L2 cache and due to this
  accesssing outer_cache.write_sec in sama5_secure_cache_init() could throw
  undefined reference to `outer_cache' if CONFIG_OUTER_CACHE is disabled
  from common sama5_defconfig.

* tag 'at91-fixes-6.1-3' of https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
  ARM: at91: fix build for SAMA5D3 w/o L2 cache

Link: https://lore.kernel.org/r/20221125093521.382105-1-claudiu.beznea@microchip.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
d3adme4t pushed a commit that referenced this pull request Dec 14, 2022
When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
might be called, and it's pretty much like in tipc_rcv().

Currently the local 'node rw lock' is held during calling tipc_lxc_xmit()
to protect the peer_net not being freed by another thread. However, when
receiving these packets, tipc_node_add_conn() might be called where the
peer 'node rw lock' is acquired. Then a dead lock warning is triggered by
lockdep detector, although it is not a real dead lock:

    WARNING: possible recursive locking detected
    --------------------------------------------
    conn_server/1086 is trying to acquire lock:
    ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]

    but task is already holding lock:
    ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                     at: tipc_node_xmit+0x285/0xb30 [tipc]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&n->lock#2);
      lock(&n->lock#2);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    4 locks held by conn_server/1086:
     #0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
                          at: tipc_accept+0x9c0/0x10b0 [tipc]
     #1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
                          at: tipc_accept+0x363/0x10b0 [tipc]
     #2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
                          at: tipc_node_xmit+0x285/0xb30 [tipc]
     #3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
                          at: tipc_sk_rcv+0x2da/0x1b40 [tipc]

    Call Trace:
     <TASK>
     dump_stack_lvl+0x44/0x5b
     __lock_acquire.cold.77+0x1f2/0x3d7
     lock_acquire+0x1d2/0x610
     _raw_write_lock_bh+0x38/0x80
     tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
     tipc_sk_finish_conn+0x21e/0x640 [tipc]
     tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
     tipc_sk_rcv+0xbb4/0x1b40 [tipc]
     tipc_lxc_xmit+0x225/0x26b [tipc]
     tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
     __tipc_sendstream+0x879/0xff0 [tipc]
     tipc_accept+0x966/0x10b0 [tipc]
     do_accept+0x37d/0x590

This patch avoids this warning by not holding the 'node rw lock' before
calling tipc_lxc_xmit(). As to protect the 'peer_net', rcu_read_lock()
should be enough, as in cleanup_net() when freeing the netns, it calls
synchronize_rcu() before the free is continued.

Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
sense to call it under rcu_read_lock(). Note that the right lock order
must be:

   rcu_read_lock();
   tipc_node_read_lock(n);
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

instead of:

   tipc_node_read_lock(n);
   rcu_read_lock();
   tipc_node_read_unlock(n);
   tipc_lxc_xmit();
   rcu_read_unlock();

and we have to call tipc_node_read_lock/unlock() twice in
tipc_node_xmit().

Fixes: f73b128 ("tipc: improve throughput between nodes in netns")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment