Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] master from torvalds:master #338

Merged
merged 27 commits into from
Dec 4, 2020
Merged

[pull] master from torvalds:master #338

merged 27 commits into from
Dec 4, 2020

Conversation

pull[bot]
Copy link

@pull pull bot commented Dec 4, 2020

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

jonhunter and others added 27 commits November 9, 2020 14:32
Deferred probe is an expected return value for tegra_output_probe().
Given that the driver deals with it properly, there's no need to output
a warning that may potentially confuse users.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The Tegra SOR driver uses the devm infrastructure to request regulators,
but enables them without registering them with the infrastructure.

This results in the following splat if probing fails for any odd resaon
(such as dependencies not being available):

[    8.974187] tegra-sor 15580000.sor: cannot get HDMI supply: -517
[    9.414403] tegra-sor 15580000.sor: failed to probe HDMI: -517
[    9.421240] ------------[ cut here ]------------
[    9.425879] WARNING: CPU: 1 PID: 164 at drivers/regulator/core.c:2089 _regulator_put.part.0+0x16c/0x174
[    9.435259] Modules linked in: tegra_drm(E+) cec(E) ahci_tegra(E) drm_kms_helper(E) drm(E) libahci_platform(E) libahci(E) max77620_regulator(E) xhci_tegra(E+) sdhci_tegra(E) xhci_hcd(E) libata(E) sdhci_pltfm(E) cqhci(E) fixed(E) usbcore(E) scsi_mod(E) sdhci(E) host1x(E)
[    9.459211] CPU: 1 PID: 164 Comm: systemd-udevd Tainted: G S    D W   E     5.9.0-rc7-00298-gf6337624c4fe #1980
[    9.469285] Hardware name: NVIDIA Jetson TX2 Developer Kit (DT)
[    9.475202] pstate: 80000005 (Nzcv daif -PAN -UAO BTYPE=--)
[    9.480784] pc : _regulator_put.part.0+0x16c/0x174
[    9.485581] lr : regulator_put+0x44/0x60
[    9.489501] sp : ffffffc011d837b0
[    9.492814] x29: ffffffc011d837b0 x28: ffffff81dd085900
[    9.498141] x27: ffffff81de1c8ec0 x26: ffffff81de1c8c10
[    9.503464] x25: ffffff81dd085800 x24: ffffffc008f2c6b0
[    9.508790] x23: ffffffc0117373f0 x22: 0000000000000005
[    9.514101] x21: ffffff81dd085900 x20: ffffffc01172b098
[    9.515822] ata1: SATA link down (SStatus 0 SControl 300)
[    9.519426] x19: ffffff81dd085100 x18: 0000000000000030
[    9.530122] x17: 0000000000000000 x16: 0000000000000000
[    9.535453] x15: 0000000000000000 x14: 000000000000038f
[    9.540777] x13: 0000000000000003 x12: 0000000000000040
[    9.546105] x11: ffffff81eb800000 x10: 0000000000000ae0
[    9.551417] x9 : ffffffc0106fea24 x8 : ffffff81de83e6c0
[    9.556728] x7 : 0000000000000018 x6 : 00000000000003c3
[    9.562064] x5 : 0000000000005660 x4 : 0000000000000000
[    9.567392] x3 : ffffffc01172b388 x2 : ffffff81de83db80
[    9.572702] x1 : 0000000000000000 x0 : 0000000000000001
[    9.578034] Call trace:
[    9.580494]  _regulator_put.part.0+0x16c/0x174
[    9.584940]  regulator_put+0x44/0x60
[    9.588522]  devm_regulator_release+0x20/0x2c
[    9.592885]  release_nodes+0x1c8/0x2c0
[    9.596636]  devres_release_all+0x44/0x6c
[    9.600649]  really_probe+0x1ec/0x504
[    9.604316]  driver_probe_device+0x100/0x170
[    9.608589]  device_driver_attach+0xcc/0xd4
[    9.612774]  __driver_attach+0xb0/0x17c
[    9.616614]  bus_for_each_dev+0x7c/0xd4
[    9.620450]  driver_attach+0x30/0x3c
[    9.624027]  bus_add_driver+0x154/0x250
[    9.627867]  driver_register+0x84/0x140
[    9.631719]  __platform_register_drivers+0xa0/0x180
[    9.636660]  host1x_drm_init+0x60/0x1000 [tegra_drm]
[    9.641629]  do_one_initcall+0x54/0x2d0
[    9.645490]  do_init_module+0x68/0x29c
[    9.649244]  load_module+0x2178/0x26c0
[    9.652997]  __do_sys_finit_module+0xb0/0x120
[    9.657356]  __arm64_sys_finit_module+0x2c/0x40
[    9.661902]  el0_svc_common.constprop.0+0x80/0x240
[    9.666701]  do_el0_svc+0x30/0xa0
[    9.670022]  el0_svc+0x18/0x50
[    9.673081]  el0_sync_handler+0x90/0x318
[    9.677006]  el0_sync+0x158/0x180
[    9.680324] ---[ end trace 90f6c89d62d85ff6 ]---

Instead, let's register a callback that will disable the regulators
on teardown. This allows for the removal of the .remove callbacks,
which are not needed anymore.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
idr_init() uses base 0 which is an invalid identifier for this driver.
The new function idr_init_base allows IDR to set the ID lookup from
base 1. This avoids all lookups that otherwise starts from 0 since
0 is always unused.

References: commit 6ce711f ("idr: Make 1-based IDRs more efficient")

Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
The original patch for commit 3d2e7ae ("drm/tegra: output: Don't
leak OF node on error") contained this hunk, but it was accidentally
dropped during conflict resolution. This causes use-after-free errors
on devices that use an I2C controller for HDMI DDC/CI on Tegra210 and
later.

Fixes: 3d2e7ae ("drm/tegra: output: Don't leak OF node on error")
Signed-off-by: Thierry Reding <treding@nvidia.com>
The conversion away from the simple display pipeline helper missed
to convert the prepare_fb plane callback, so no fences are attached to
the atomic state, breaking synchronization with other devices. Fix
this by plugging in the drm_gem_fb_prepare_fb helper function.

Fixes: ae1ed00 ("drm: mxsfb: Stop using DRM simple display pipeline helper")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201120211306.325841-1-l.stach@pengutronix.de
This will make sure applications which use the IN_FORMATS blob
to figure out which modifiers they can use will pick up the
linear modifier which is needed by mxsfb. Such applications
will not work otherwise if an incompatible implicit modifier
ends up being selected.

Before commit ae1ed00 ("drm: mxsfb: Stop using DRM simple
display pipeline helper"), the DRM simple display pipeline
helper took care of this.

Signed-off-by: Daniel Abrecht <public@danielabrecht.ch>
Fixes: ae1ed00 ("drm: mxsfb: Stop using DRM simple display pipeline helper")
Reviewed-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/2a99ffffc2378209307e0992a6e97e70@nodmarc.danielabrecht.ch
This wasn't initialized for pre NV50 hardware.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-and-Tested-by: Mark Hounschell <markh@compro.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Link: https://patchwork.freedesktop.org/series/84298/
Fix the missing clk_disable_unprepare() before return from
tegra_sor_init() in the error handling case.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
In the Rockchip DRM LVDS component driver, the endpoint id provided to
drm_of_find_panel_or_bridge is grabbed from the endpoint's reg property.

However, the property may be missing in the case of a single endpoint.
Initialize the endpoint_id variable to 0 to avoid using an
uninitialized variable in that case.

Fixes: 34cc0aa ("drm/rockchip: Add support for Rockchip Soc LVDS")
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20201110200430.1713467-1-paul.kocialkowski@bootlin.com
The probe routine acquires the reset GPIO using GPIOD_OUT_LOW. Directly
afterwards it calls acx565akm_detect(), which sets the GPIO value to
HIGH. If the bootloader initialized the GPIO to HIGH before the probe
routine was called, there is only a very short time period of a few
instructions where the reset signal is LOW. Exact time depends on
compiler optimizations, kernel configuration and alignment of the stars,
but I expect it to be always way less than 10us. There are no public
datasheets for the panel, but acx565akm_power_on() has a comment with
timings and reset period should be at least 10us. So this potentially
brings the panel into a half-reset state.

The result is, that panel may not work after boot and can get into a
working state by re-enabling it (e.g. by blanking + unblanking), since
that does a clean reset cycle. This bug has recently been hit by Ivaylo
Dimitrov, but there are some older reports which are probably the same
bug. At least Tony Lindgren, Peter Ujfalusi and Jarkko Nikula have
experienced it in 2017 describing the blank/unblank procedure as
possible workaround.

Note, that the bug really goes back in time. It has originally been
introduced in the predecessor of the omapfb driver in commit 3c45d05
("OMAPDSS: acx565akm panel: handle gpios in panel driver") in 2012.
That driver eventually got replaced by a newer one, which had the bug
from the beginning in commit 8419274 ("OMAPDSS: Add Sony ACX565AKM
panel driver") and still exists in fbdev world. That driver has later
been copied to omapdrm and then was used as a basis for this driver.
Last but not least the omapdrm specific driver has been removed in
commit 45f16c8 ("drm/omap: displays: Remove unused panel drivers").

Reported-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reported-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Fixes: 1c8fc3f ("drm/panel: Add driver for the Sony ACX565AKM panel")
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Tested-by: Jarkko Nikula <jarkko.nikula@bitmer.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127200429.129868-1-sebastian.reichel@collabora.com
When the SDI output was converted to DRM bridge, the atomic versions of
enable and disable funcs were used. This was not intended, as that would
require implementing other atomic funcs too. This leads to:

WARNING: CPU: 0 PID: 18 at drivers/gpu/drm/drm_bridge.c:708 drm_atomic_helper_commit_modeset_enables+0x134/0x268

and display not working.

Fix this by using the legacy enable/disable funcs.

Fixes: 8bef8a6 ("drm/omap: sdi: Register a drm_bridge")
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: stable@vger.kernel.org # v5.7+
Link: https://patchwork.freedesktop.org/patch/msgid/20201127085241.848461-1-tomi.valkeinen@ti.com
Ville noticed that the last mocs entry is used unconditionally by the HW
when it performs cache evictions, and noted that while the value is not
meant to be writable by the driver, we should program it to a reasonable
value nevertheless.

As it turns out, we can change the value of mocs:63 and the value we
were programming into it would cause hard hangs in conjunction with
atomic operations.

v2: Add details from bspec about how it is used by HW

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2707
Fixes: 3bbaba0 ("drm/i915: Added Programming of the MOCS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140841.1982-1-chris@chris-wilson.co.uk
(cherry picked from commit 977933b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Allow a brief period for continued access to a dead intel_context by
deferring the release of the struct until after an RCU grace period.
As we are using a dedicated slab cache for the contexts, we can defer
the release of the slab pages via RCU, with the caveat that individual
structs may be reused from the freelist within an RCU grace period. To
handle that, we have to avoid clearing members of the zombie struct.

This is required for a later patch to handle locking around virtual
requests in the signaler, as those requests may want to move between
engines and be destroyed while we are holding b->irq_lock on a physical
engine.

v2: Drop mutex_reinit(), if we never mark the mutex as destroyed we
don't need to reset the debug code, at the loss of having the mutex
debug code spot us attempting to destroy a locked mutex.
v3: As the intended use will remain strongly referenced counted, with
very little inflight access across reuse, drop the ctor.
v4: Drop the unrequired change to remove the temporary reference around
dropping the active context, and add back some more missing ctor
operations.
v5: The ctor is back. Tvrtko spotted that ce->signal_lock [introduced
later] maybe accessed under RCU and so needs special care not to be
reinitialised.
v6: Don't mix SLAB_TYPESAFE_BY_RCU and RCU list iteration.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-3-chris@chris-wilson.co.uk
(cherry picked from commit 14d1eaf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As we funnel more and more contexts into the breadcrumbs on an engine,
the hold time of b->irq_lock grows. As we may then contend with the
b->irq_lock during request submission, this increases the burden upon
the engine->active.lock and so directly impacts both our execution
latency and client latency. If we split the b->irq_lock by introducing a
per-context spinlock to manage the signalers within a context, we then
only need the b->irq_lock for enabling/disabling the interrupt and can
avoid taking the lock for walking the list of contexts within the signal
worker. Even with the current setup, this greatly reduces the number of
times we have to take and fight for b->irq_lock.

Furthermore, this closes the race between enabling the signaling context
while it is in the process of being signaled and removed:

<4>[  416.208555] list_add corruption. prev->next should be next (ffff8881951d5910), but was dead000000000100. (prev=ffff8882781bb870).
<4>[  416.208573] WARNING: CPU: 7 PID: 0 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
<4>[  416.208575] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio mei_hdcp x86_pkg_temp_thermal coretemp ax88179_178a usbnet mii crct10dif_pclmul snd_intel_dspcfg crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core e1000e snd_pcm ptp pps_core mei_me mei prime_numbers intel_lpss_pci [last unloaded: i915]
<4>[  416.208611] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G     U            5.8.0-CI-CI_DRM_8852+ #1
<4>[  416.208614] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3212.A00.1905212112 05/21/2019
<4>[  416.208627] RIP: 0010:__list_add_valid+0x4d/0x70
<4>[  416.208631] Code: c3 48 89 d1 48 c7 c7 60 18 33 82 48 89 c2 e8 ea e0 b6 ff 0f 0b 31 c0 c3 48 89 c1 4c 89 c6 48 c7 c7 b0 18 33 82 e8 d3 e0 b6 ff <0f> 0b 31 c0 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 00 19 33 82 e8
<4>[  416.208633] RSP: 0018:ffffc90000280e18 EFLAGS: 00010086
<4>[  416.208636] RAX: 0000000000000000 RBX: ffff888250a44880 RCX: 0000000000000105
<4>[  416.208639] RDX: 0000000000000105 RSI: ffffffff82320c5b RDI: 00000000ffffffff
<4>[  416.208641] RBP: ffff8882781bb870 R08: 0000000000000000 R09: 0000000000000001
<4>[  416.208643] R10: 00000000054d2957 R11: 000000006abbd991 R12: ffff8881951d58c8
<4>[  416.208646] R13: ffff888286073880 R14: ffff888286073848 R15: ffff8881951d5910
<4>[  416.208669] FS:  0000000000000000(0000) GS:ffff88829c180000(0000) knlGS:0000000000000000
<4>[  416.208671] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  416.208673] CR2: 0000556231326c48 CR3: 0000000005610001 CR4: 0000000000760ee0
<4>[  416.208675] PKRU: 55555554
<4>[  416.208677] Call Trace:
<4>[  416.208679]  <IRQ>
<4>[  416.208751]  i915_request_enable_breadcrumb+0x278/0x400 [i915]
<4>[  416.208839]  __i915_request_submit+0xca/0x2a0 [i915]
<4>[  416.208892]  __execlists_submission_tasklet+0x480/0x1830 [i915]
<4>[  416.208942]  execlists_submission_tasklet+0xc4/0x130 [i915]
<4>[  416.208947]  tasklet_action_common.isra.17+0x6c/0x1c0
<4>[  416.208954]  __do_softirq+0xdf/0x498
<4>[  416.208960]  ? handle_fasteoi_irq+0x150/0x150
<4>[  416.208964]  asm_call_on_stack+0xf/0x20
<4>[  416.208966]  </IRQ>
<4>[  416.208969]  do_softirq_own_stack+0xa1/0xc0
<4>[  416.208972]  irq_exit_rcu+0xb5/0xc0
<4>[  416.208976]  common_interrupt+0xf7/0x260
<4>[  416.208980]  asm_common_interrupt+0x1e/0x40
<4>[  416.208985] RIP: 0010:cpuidle_enter_state+0xb6/0x410
<4>[  416.208987] Code: 00 31 ff e8 9c 3e 89 ff 80 7c 24 0b 00 74 12 9c 58 f6 c4 02 0f 85 31 03 00 00 31 ff e8 e3 6c 90 ff e8 fe a4 94 ff fb 45 85 ed <0f> 88 c7 02 00 00 49 63 c5 4c 2b 24 24 48 8d 14 40 48 8d 14 90 48
<4>[  416.208989] RSP: 0018:ffffc90000143e70 EFLAGS: 00000206
<4>[  416.208991] RAX: 0000000000000007 RBX: ffffe8ffffda8070 RCX: 0000000000000000
<4>[  416.208993] RDX: 0000000000000000 RSI: ffffffff8238b4ee RDI: ffffffff8233184f
<4>[  416.208995] RBP: ffffffff826b4e00 R08: 0000000000000000 R09: 0000000000000000
<4>[  416.208997] R10: 0000000000000001 R11: 0000000000000000 R12: 00000060e7f24a8f
<4>[  416.208998] R13: 0000000000000003 R14: 0000000000000003 R15: 0000000000000003
<4>[  416.209012]  cpuidle_enter+0x24/0x40
<4>[  416.209016]  do_idle+0x22f/0x2d0
<4>[  416.209022]  cpu_startup_entry+0x14/0x20
<4>[  416.209025]  start_secondary+0x158/0x1a0
<4>[  416.209030]  secondary_startup_64+0xa4/0xb0
<4>[  416.209039] irq event stamp: 10186977
<4>[  416.209042] hardirqs last  enabled at (10186976): [<ffffffff810b9363>] tasklet_action_common.isra.17+0xe3/0x1c0
<4>[  416.209044] hardirqs last disabled at (10186977): [<ffffffff81a5e5ed>] _raw_spin_lock_irqsave+0xd/0x50
<4>[  416.209047] softirqs last  enabled at (10186968): [<ffffffff810b9a1a>] irq_enter_rcu+0x6a/0x70
<4>[  416.209049] softirqs last disabled at (10186969): [<ffffffff81c00f4f>] asm_call_on_stack+0xf/0x20

<4>[  416.209317] list_del corruption, ffff8882781bb870->next is LIST_POISON1 (dead000000000100)
<4>[  416.209317] WARNING: CPU: 7 PID: 46 at lib/list_debug.c:47 __list_del_entry_valid+0x4e/0x90
<4>[  416.209317] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio mei_hdcp x86_pkg_temp_thermal coretemp ax88179_178a usbnet mii crct10dif_pclmul snd_intel_dspcfg crc32_pclmul snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core e1000e snd_pcm ptp pps_core mei_me mei prime_numbers intel_lpss_pci [last unloaded: i915]
<4>[  416.209317] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G     U  W         5.8.0-CI-CI_DRM_8852+ #1
<4>[  416.209317] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake Y LPDDR4x T4 RVP TLC, BIOS ICLSFWR1.R00.3212.A00.1905212112 05/21/2019
<4>[  416.209317] RIP: 0010:__list_del_entry_valid+0x4e/0x90
<4>[  416.209317] Code: 2e 48 8b 32 48 39 fe 75 3a 48 8b 50 08 48 39 f2 75 48 b8 01 00 00 00 c3 48 89 fe 48 89 c2 48 c7 c7 38 19 33 82 e8 62 e0 b6 ff <0f> 0b 31 c0 c3 48 89 fe 48 c7 c7 70 19 33 82 e8 4e e0 b6 ff 0f 0b
<4>[  416.209317] RSP: 0018:ffffc90000280de8 EFLAGS: 00010086
<4>[  416.209317] RAX: 0000000000000000 RBX: ffff8882781bb848 RCX: 0000000000010104
<4>[  416.209317] RDX: 0000000000010104 RSI: ffffffff8238b4ee RDI: 00000000ffffffff
<4>[  416.209317] RBP: ffff8882781bb880 R08: 0000000000000000 R09: 0000000000000001
<4>[  416.209317] R10: 000000009fb6666e R11: 00000000feca9427 R12: ffffc90000280e18
<4>[  416.209317] R13: ffff8881951d5930 R14: dead0000000000d8 R15: ffff8882781bb880
<4>[  416.209317] FS:  0000000000000000(0000) GS:ffff88829c180000(0000) knlGS:0000000000000000
<4>[  416.209317] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  416.209317] CR2: 0000556231326c48 CR3: 0000000005610001 CR4: 0000000000760ee0
<4>[  416.209317] PKRU: 55555554
<4>[  416.209317] Call Trace:
<4>[  416.209317]  <IRQ>
<4>[  416.209317]  remove_signaling_context.isra.13+0xd/0x70 [i915]
<4>[  416.209513]  signal_irq_work+0x1f7/0x4b0 [i915]

This is caused by virtual engines where although we take the breadcrumb
lock on each of the active engines, they may be different engines on
different requests, It turns out that the b->irq_lock was not a
sufficient proxy for the engine->active.lock in the case of more than
one request, so introduce an explicit lock around ce->signals.

v2: ce->signal_lock is acquired with only RCU protection and so must be
treated carefully and not cleared during reallocation. We also then need
to confirm that the ce we lock is the same as we found in the breadcrumb
list.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2276
Fixes: c18636f ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs")
Fixes: 2854d86 ("drm/i915/gt: Replace intel_engine_transfer_stale_breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201126140407.31952-4-chris@chris-wilson.co.uk
(cherry picked from commit c744d50)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
As we use a shmemfs file to hold the context state, when not in use it
may be swapped out, such as across suspend. Since we wrote into the
shmemfs without marking the pages as dirty, the contents may be dropped
instead of being written back to swap. On re-using the shmemfs file,
such as creating a new context after resume, the contents of that file
were likely garbage and so the new context could then hang the GPU.

Simply mark the page as being written when copying into the shmemfs
file, and it the new contents will be retained across swapout.

Fixes: be1cb55 ("drm/i915/gt: Keep a no-frills swappable copy of the default context state")
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.8+
Link: https://patchwork.freedesktop.org/patch/msgid/20201127120718.454037-161-matthew.auld@intel.com
(cherry picked from commit a9d71f7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We treat idling the GT (intel_rps_park) as a downclock event, and reduce
the frequency we intend to restart the GT with. Since the two workloads
are likely related (e.g. a compositor rendering every 16ms), we want to
carry the frequency and load information from across the idling.
However, we do also need to update the frequencies so that workloads
that run for less than 1ms are autotuned by RPS (otherwise we leave
compositors running at max clocks, draining excess power). Conversely,
if we try to run too slowly, the next workload has to run longer. Since
there is a hysteresis in the power graph, below a certain frequency
running a short workload for longer consumes more energy than running it
slightly higher for less time. The exact balance point is unknown
beforehand, but measurements with 30fps media playback indicate that RPe
is a better choice.

Reported-by: Edward Baker <edward.baker@intel.com>
Tested-by: Edward Baker <edward.baker@intel.com>
Fixes: 043cd2d ("drm/i915/gt: Leave rps->cur_freq on unpark")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201124183521.28623-1-chris@chris-wilson.co.uk
(cherry picked from commit f7ed83c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
…splay

!HAS_DISPLAY() implies !HAS_OVERLAY(), skipping overlay setup anyway, so
return earlier from intel_modeset_init() for clarity.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-4-lucas.demarchi@intel.com
(cherry picked from commit 71c8415)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Prior to sanitizing the GGTT, the only operations allowed in
intel_display_init_nogem() are those to reserve the preallocated (and
active) regions in the GGTT leftover from the BIOS. Trying to allocate a
GGTT vma (such as intel_pin_and_fence_fb_obj during the initial modeset)
may then conflict with other preallocated regions that have not yet been
protected.

Move the initial modesetting from the end of init_nogem to the beginning
of init so that any vma pinning (either framebuffers or DSB, for example),
is after the GGTT is ready to handle it.

This will prevent the DSB object from being destroyed too early:

[   53.449241] BUG: KASAN: use-after-free in i915_init_ggtt+0x324/0x9e0 [i915]
[   53.449309] Read of size 8 at addr ffff88811b1e8070 by task systemd-udevd/345

[   53.449399] CPU: 1 PID: 345 Comm: systemd-udevd Tainted: G        W         5.10.0-rc5+ #12
[   53.449409] Call Trace:
[   53.449418]  dump_stack+0x9a/0xcc
[   53.449558]  ? i915_init_ggtt+0x324/0x9e0 [i915]
[   53.449565]  print_address_description.constprop.0+0x3e/0x60
[   53.449577]  ? _raw_spin_lock_irqsave+0x4e/0x50
[   53.449718]  ? i915_init_ggtt+0x324/0x9e0 [i915]
[   53.449849]  ? i915_init_ggtt+0x324/0x9e0 [i915]
[   53.449857]  kasan_report.cold+0x1f/0x37
[   53.449993]  ? i915_init_ggtt+0x324/0x9e0 [i915]
[   53.450130]  i915_init_ggtt+0x324/0x9e0 [i915]
[   53.450273]  ? i915_ggtt_suspend+0x1f0/0x1f0 [i915]
[   53.450281]  ? static_obj+0x69/0x80
[   53.450289]  ? lockdep_init_map_waits+0xa9/0x310
[   53.450431]  ? intel_wopcm_init+0x96/0x3d0 [i915]
[   53.450581]  ? i915_gem_init+0x75/0x2d0 [i915]
[   53.450720]  i915_gem_init+0x75/0x2d0 [i915]
[   53.450852]  i915_driver_probe+0x8c2/0x1210 [i915]
[   53.450993]  ? i915_pm_prepare+0x630/0x630 [i915]
[   53.451006]  ? check_chain_key+0x1e7/0x2e0
[   53.451025]  ? __pm_runtime_resume+0x58/0xb0
[   53.451157]  i915_pci_probe+0xa6/0x2b0 [i915]
[   53.451285]  ? i915_pci_remove+0x40/0x40 [i915]
[   53.451295]  ? lockdep_hardirqs_on_prepare+0x124/0x230
[   53.451302]  ? _raw_spin_unlock_irqrestore+0x42/0x50
[   53.451309]  ? lockdep_hardirqs_on+0xbf/0x130
[   53.451315]  ? preempt_count_sub+0xf/0xb0
[   53.451321]  ? _raw_spin_unlock_irqrestore+0x2f/0x50
[   53.451335]  pci_device_probe+0xf9/0x190
[   53.451350]  really_probe+0x17f/0x5b0
[   53.451365]  driver_probe_device+0x13a/0x1c0
[   53.451376]  device_driver_attach+0x82/0x90
[   53.451386]  ? device_driver_attach+0x90/0x90
[   53.451391]  __driver_attach+0xab/0x190
[   53.451401]  ? device_driver_attach+0x90/0x90
[   53.451407]  bus_for_each_dev+0xe4/0x140
[   53.451414]  ? subsys_dev_iter_exit+0x10/0x10
[   53.451423]  ? __list_add_valid+0x2b/0xa0
[   53.451440]  bus_add_driver+0x227/0x2e0
[   53.451454]  driver_register+0xd3/0x150
[   53.451585]  i915_init+0x92/0xac [i915]
[   53.451592]  ? 0xffffffffa0a20000
[   53.451598]  do_one_initcall+0xb6/0x3b0
[   53.451606]  ? trace_event_raw_event_initcall_finish+0x150/0x150
[   53.451614]  ? __kasan_kmalloc.constprop.0+0xc2/0xd0
[   53.451627]  ? kmem_cache_alloc_trace+0x4a4/0x8e0
[   53.451634]  ? kasan_unpoison_shadow+0x33/0x40
[   53.451649]  do_init_module+0xf8/0x350
[   53.451662]  load_module+0x43de/0x47f0
[   53.451716]  ? module_frob_arch_sections+0x20/0x20
[   53.451731]  ? rw_verify_area+0x5f/0x130
[   53.451780]  ? __do_sys_finit_module+0x10d/0x1a0
[   53.451785]  __do_sys_finit_module+0x10d/0x1a0
[   53.451792]  ? __ia32_sys_init_module+0x40/0x40
[   53.451800]  ? seccomp_do_user_notification.isra.0+0x5c0/0x5c0
[   53.451829]  ? rcu_read_lock_bh_held+0xb0/0xb0
[   53.451835]  ? mark_held_locks+0x24/0x90
[   53.451856]  do_syscall_64+0x33/0x80
[   53.451863]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   53.451868] RIP: 0033:0x7fde09b4470d
[   53.451875] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 53 f7 0c 00 f7 d8 64 89 01 48
[   53.451880] RSP: 002b:00007ffd6abc1718 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[   53.451890] RAX: ffffffffffffffda RBX: 000056444e528150 RCX: 00007fde09b4470d
[   53.451895] RDX: 0000000000000000 RSI: 00007fde09a21ded RDI: 000000000000000f
[   53.451899] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000
[   53.451904] R10: 000000000000000f R11: 0000000000000246 R12: 00007fde09a21ded
[   53.451909] R13: 0000000000000000 R14: 000056444e329200 R15: 000056444e528150

[   53.451957] Allocated by task 345:
[   53.451995]  kasan_save_stack+0x1b/0x40
[   53.452001]  __kasan_kmalloc.constprop.0+0xc2/0xd0
[   53.452006]  kmem_cache_alloc+0x1cd/0x8d0
[   53.452146]  i915_vma_instance+0x126/0xb70 [i915]
[   53.452304]  i915_gem_object_ggtt_pin_ww+0x222/0x3f0 [i915]
[   53.452446]  intel_dsb_prepare+0x14f/0x230 [i915]
[   53.452588]  intel_atomic_commit+0x183/0x690 [i915]
[   53.452730]  intel_initial_commit+0x2bc/0x2f0 [i915]
[   53.452871]  intel_modeset_init_nogem+0xa02/0x2af0 [i915]
[   53.452995]  i915_driver_probe+0x8af/0x1210 [i915]
[   53.453120]  i915_pci_probe+0xa6/0x2b0 [i915]
[   53.453125]  pci_device_probe+0xf9/0x190
[   53.453131]  really_probe+0x17f/0x5b0
[   53.453136]  driver_probe_device+0x13a/0x1c0
[   53.453142]  device_driver_attach+0x82/0x90
[   53.453148]  __driver_attach+0xab/0x190
[   53.453153]  bus_for_each_dev+0xe4/0x140
[   53.453158]  bus_add_driver+0x227/0x2e0
[   53.453164]  driver_register+0xd3/0x150
[   53.453286]  i915_init+0x92/0xac [i915]
[   53.453292]  do_one_initcall+0xb6/0x3b0
[   53.453297]  do_init_module+0xf8/0x350
[   53.453302]  load_module+0x43de/0x47f0
[   53.453307]  __do_sys_finit_module+0x10d/0x1a0
[   53.453312]  do_syscall_64+0x33/0x80
[   53.453318]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   53.453345] Freed by task 82:
[   53.453379]  kasan_save_stack+0x1b/0x40
[   53.453384]  kasan_set_track+0x1c/0x30
[   53.453389]  kasan_set_free_info+0x1b/0x30
[   53.453394]  __kasan_slab_free+0x112/0x160
[   53.453399]  kmem_cache_free+0xb2/0x3f0
[   53.453536]  i915_gem_flush_free_objects+0x31a/0x3b0 [i915]
[   53.453542]  process_one_work+0x519/0x9f0
[   53.453547]  worker_thread+0x75/0x5c0
[   53.453552]  kthread+0x1da/0x230
[   53.453557]  ret_from_fork+0x22/0x30

[   53.453584] The buggy address belongs to the object at ffff88811b1e8040
                which belongs to the cache i915_vma of size 968
[   53.453692] The buggy address is located 48 bytes inside of
                968-byte region [ffff88811b1e8040, ffff88811b1e8408)
[   53.453792] The buggy address belongs to the page:
[   53.453842] page:00000000b35f7048 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88811b1ef940 pfn:0x11b1e8
[   53.453847] head:00000000b35f7048 order:3 compound_mapcount:0 compound_pincount:0
[   53.453853] flags: 0x8000000000010200(slab|head)
[   53.453860] raw: 8000000000010200 ffff888115596248 ffff888115596248 ffff8881155b6340
[   53.453866] raw: ffff88811b1ef940 0000000000170001 00000001ffffffff 0000000000000000
[   53.453870] page dumped because: kasan: bad access detected

[   53.453895] Memory state around the buggy address:
[   53.453944]  ffff88811b1e7f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   53.454011]  ffff88811b1e7f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   53.454079] >ffff88811b1e8000: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[   53.454146]                                                              ^
[   53.454211]  ffff88811b1e8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   53.454279]  ffff88811b1e8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   53.454347] ==================================================================
[   53.454414] Disabling lock debugging due to kernel taint
[   53.454434] general protection fault, probably for non-canonical address 0xdead0000000000d0: 0000 [#1] PREEMPT SMP KASAN PTI
[   53.454446] CPU: 1 PID: 345 Comm: systemd-udevd Tainted: G    B   W         5.10.0-rc5+ #12
[   53.454592] RIP: 0010:i915_init_ggtt+0x26f/0x9e0 [i915]
[   53.454602] Code: 89 8d 48 ff ff ff 4c 8d 60 d0 49 39 c7 0f 84 37 02 00 00 4c 89 b5 40 ff ff ff 4d 8d bc 24 90 00 00 00 4c 89 ff e8 c1 97 f8 e0 <49> 83 bc 24 90 00 00 00 00 0f 84 0f 02 00 00 49 8d 7c 24 08 e8 a8
[   53.454618] RSP: 0018:ffff88812247f430 EFLAGS: 00010286
[   53.454625] RAX: 0000000000000000 RBX: ffff888136440000 RCX: ffffffffa03fb78f
[   53.454633] RDX: 0000000000000000 RSI: 0000000000000008 RDI: dead000000000160
[   53.454641] RBP: ffff88812247f500 R08: ffffffff8113589f R09: 0000000000000000
[   53.454648] R10: ffffffff83063843 R11: fffffbfff060c708 R12: dead0000000000d0
[   53.454656] R13: ffff888136449ba0 R14: 0000000000002000 R15: dead000000000160
[   53.454664] FS:  00007fde095c4880(0000) GS:ffff88840c880000(0000) knlGS:0000000000000000
[   53.454672] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   53.454679] CR2: 00007fef132b4f28 CR3: 000000012245c002 CR4: 00000000003706e0
[   53.454686] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   53.454693] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   53.454700] Call Trace:
[   53.454833]  ? i915_ggtt_suspend+0x1f0/0x1f0 [i915]

Reported-by: Matthew Auld <matthew.auld@intel.com>
Fixes: afeda4f ("drm/i915/dsb: Pre allocate and late cleanup of cmd buffer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201125193032.29282-1-chris@chris-wilson.co.uk
(cherry picked from commit b3bf99d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
…egra/linux into drm-fixes

drm/tegra: Fixes for v5.10-rc7

This is a set of small fixes for various issues found during the last
couple of weeks.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127145324.125776-1-thierry.reding@gmail.com
Fix fan set speed calculation.

Suggested-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[Why]
While booting into OS, driver updates DPP/DISP CLKs.
But init clock value is zero which is invalid.

[How]
Get current clocks value to update init clocks.
To avoid underflow.

Signed-off-by: Brandon Syu <Brandon.Syu@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Port from VCN2.5
Add vcn dpg harware synchronization to fix race condition
issue between vcn driver and hardware.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.9.x
Port from VCN2.5
SCRATCH2 is used to keep decode wptr as a workaround
which fix a hardware DPG decode wptr update bug for
vcn2.5 beforehand.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.9.x
…rg/drm/drm-intel into drm-fixes

Fixes for GPU hang, null dereference, suspend-resume, power consumption, and use-after-free.

- Program mocs:63 for cache eviction on gen9 (Chris)
- Protect context lifetime with RCU (Chris)
- Split the breadcrumb spinlock between global and contexts (Chris)
- Retain default context state across shrinking (Venkata)
- Limit frequency drop to RPe on parking (Chris)
- Return earlier from intel_modeset_init() without display (Jani)
- Defer initial modeset until after GGTT is initialized (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203134705.GA1575873@intel.com
….org/~agd5f/linux into drm-fixes

amd-drm-fixes-5.10-2020-12-02:

amdgpu:
- SMU11 manual fan fix
- Renoir display clock fix
- VCN3 dynamic powergating fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203044815.41257-1-alexander.deucher@amd.com
…g/drm/drm-misc into drm-fixes

One bridge fix for OMAP, one for a race condition in a panel, two for
uninitialized variables in rockchip and nouveau, and two fixes for mxsfb
to fix a regression with modifiers and a fix for a fence synchronization
issue.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20201203125943.h2ft2xoywunt5orl@gilmour
…/drm

Pull drm fixes from Dave Airlie:
 "This week's regular fixes.

  i915 has fixes for a few races, use-after-free, and gpu hangs. Tegra
  just has some minor fixes that I didn't see much point in hanging on
  to. The nouveau fix is for all pre-nv50 cards and was reported a few
  times. Otherwise it's just some amdgpu, and a few misc fixes.

  Summary:

  amdgpu:
   - SMU11 manual fan fix
   - Renoir display clock fix
   - VCN3 dynamic powergating fix

  i915:
   - Program mocs:63 for cache eviction on gen9 (Chris)
   - Protect context lifetime with RCU (Chris)
   - Split the breadcrumb spinlock between global and contexts (Chris)
   - Retain default context state across shrinking (Venkata)
   - Limit frequency drop to RPe on parking (Chris)
   - Return earlier from intel_modeset_init() without display (Jani)
   - Defer initial modeset until after GGTT is initialized (Chris)

  nouveau:
   - pre-nv50 regression fix

  rockchip:
   - uninitialised LVDS property fix

  omap:
   - bridge fix

  panel:
   - race fix

  mxsfb:
   - fence sync fix
   - modifiers fix

  tegra:
   - idr init fix
   - sor fixes
   - output/of cleanup fix"

* tag 'drm-fixes-2020-12-04' of git://anongit.freedesktop.org/drm/drm: (22 commits)
  drm/amdgpu/vcn3.0: remove old DPG workaround
  drm/amdgpu/vcn3.0: stall DPG when WPTR/RPTR reset
  drm/amd/display: Init clock value by current vbios CLKs
  drm/amdgpu/pm/smu11: Fix fan set speed bug
  drm/i915/display: Defer initial modeset until after GGTT is initialised
  drm/i915/display: return earlier from intel_modeset_init() without display
  drm/i915/gt: Limit frequency drop to RPe on parking
  drm/i915/gt: Retain default context state across shrinking
  drm/i915/gt: Split the breadcrumb spinlock between global and contexts
  drm/i915/gt: Protect context lifetime with RCU
  drm/i915/gt: Program mocs:63 for cache eviction on gen9
  drm/omap: sdi: fix bridge enable/disable
  drm/panel: sony-acx565akm: Fix race condition in probe
  drm/rockchip: Avoid uninitialized use of endpoint id in LVDS
  drm/tegra: sor: Disable clocks on error in tegra_sor_init()
  drm/nouveau: make sure ret is initialized in nouveau_ttm_io_mem_reserve
  drm: mxsfb: Implement .format_mod_supported
  drm: mxsfb: fix fence synchronization
  drm/tegra: output: Do not put OF node twice
  drm/tegra: replace idr_init() by idr_init_base()
  ...
@pull pull bot added the ⤵️ pull label Dec 4, 2020
@pull pull bot merged commit e87297f into bergwolf:master Dec 4, 2020
pull bot pushed a commit that referenced this pull request Feb 10, 2023
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

Happens that we faced a driver probe failure in the Steam Deck
recently, and the function drm_sched_fini() was called even without
its counter-part had been previously called, causing the following oops:

amdgpu: probe of 0000:04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0000000000000090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 <TASK>
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.

Notice ideally we'd use sched.ready for that; such field is set as the latest
thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such
field - in the above oops for example, it was a GFX ring causing the crash, and
the sched.ready field was set to true in the ring init routine, regardless of
the state of the DRM scheduler. Hence, we ended-up using sched.ops as per
Christian's suggestion [0], and also removed the no_scheduler check [1].

[0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/
[1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/

Fixes: 067f44c ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
Suggested-by: Christian König <christian.koenig@amd.com>
Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
pull bot pushed a commit that referenced this pull request Nov 21, 2023
Enable the cpu v4 tests for LoongArch. Currently, we don't have BPF
trampoline in LoongArch JIT, so the fentry test `test_ptr_struct_arg`
still failed, will followup.

Test result attached below:

  # ./test_progs -t verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap
  #316/1   verifier_bswap/BSWAP, 16:OK
  #316/2   verifier_bswap/BSWAP, 16 @unpriv:OK
  #316/3   verifier_bswap/BSWAP, 32:OK
  #316/4   verifier_bswap/BSWAP, 32 @unpriv:OK
  #316/5   verifier_bswap/BSWAP, 64:OK
  #316/6   verifier_bswap/BSWAP, 64 @unpriv:OK
  #316     verifier_bswap:OK
  #330/1   verifier_gotol/gotol, small_imm:OK
  #330/2   verifier_gotol/gotol, small_imm @unpriv:OK
  #330     verifier_gotol:OK
  #338/1   verifier_ldsx/LDSX, S8:OK
  #338/2   verifier_ldsx/LDSX, S8 @unpriv:OK
  #338/3   verifier_ldsx/LDSX, S16:OK
  #338/4   verifier_ldsx/LDSX, S16 @unpriv:OK
  #338/5   verifier_ldsx/LDSX, S32:OK
  #338/6   verifier_ldsx/LDSX, S32 @unpriv:OK
  #338/7   verifier_ldsx/LDSX, S8 range checking, privileged:OK
  #338/8   verifier_ldsx/LDSX, S16 range checking:OK
  #338/9   verifier_ldsx/LDSX, S16 range checking @unpriv:OK
  #338/10  verifier_ldsx/LDSX, S32 range checking:OK
  #338/11  verifier_ldsx/LDSX, S32 range checking @unpriv:OK
  #338     verifier_ldsx:OK
  #349/1   verifier_movsx/MOV32SX, S8:OK
  #349/2   verifier_movsx/MOV32SX, S8 @unpriv:OK
  #349/3   verifier_movsx/MOV32SX, S16:OK
  #349/4   verifier_movsx/MOV32SX, S16 @unpriv:OK
  #349/5   verifier_movsx/MOV64SX, S8:OK
  #349/6   verifier_movsx/MOV64SX, S8 @unpriv:OK
  #349/7   verifier_movsx/MOV64SX, S16:OK
  #349/8   verifier_movsx/MOV64SX, S16 @unpriv:OK
  #349/9   verifier_movsx/MOV64SX, S32:OK
  #349/10  verifier_movsx/MOV64SX, S32 @unpriv:OK
  #349/11  verifier_movsx/MOV32SX, S8, range_check:OK
  #349/12  verifier_movsx/MOV32SX, S8, range_check @unpriv:OK
  #349/13  verifier_movsx/MOV32SX, S16, range_check:OK
  #349/14  verifier_movsx/MOV32SX, S16, range_check @unpriv:OK
  #349/15  verifier_movsx/MOV32SX, S16, range_check 2:OK
  #349/16  verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK
  #349/17  verifier_movsx/MOV64SX, S8, range_check:OK
  #349/18  verifier_movsx/MOV64SX, S8, range_check @unpriv:OK
  #349/19  verifier_movsx/MOV64SX, S16, range_check:OK
  #349/20  verifier_movsx/MOV64SX, S16, range_check @unpriv:OK
  #349/21  verifier_movsx/MOV64SX, S32, range_check:OK
  #349/22  verifier_movsx/MOV64SX, S32, range_check @unpriv:OK
  #349/23  verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK
  #349/24  verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK
  #349     verifier_movsx:OK
  #361/1   verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK
  #361/2   verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK
  #361/3   verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK
  #361/4   verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK
  #361/5   verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK
  #361/6   verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK
  #361/7   verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK
  #361/8   verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK
  #361/9   verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK
  #361/10  verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK
  #361/11  verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK
  #361/12  verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK
  #361/13  verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK
  #361/14  verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK
  #361/15  verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK
  #361/16  verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK
  #361/17  verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK
  #361/18  verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK
  #361/19  verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK
  #361/20  verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK
  #361/21  verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK
  #361/22  verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK
  #361/23  verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK
  #361/24  verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK
  #361/25  verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK
  #361/26  verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK
  #361/27  verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK
  #361/28  verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK
  #361/29  verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK
  #361/30  verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK
  #361/31  verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK
  #361/32  verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK
  #361/33  verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK
  #361/34  verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK
  #361/35  verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK
  #361/36  verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK
  #361/37  verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK
  #361/38  verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK
  #361/39  verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK
  #361/40  verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK
  #361/41  verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK
  #361/42  verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK
  #361/43  verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK
  #361/44  verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK
  #361/45  verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK
  #361/46  verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK
  #361/47  verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK
  #361/48  verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK
  #361/49  verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK
  #361/50  verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK
  #361/51  verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK
  #361/52  verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK
  #361/53  verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK
  #361/54  verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK
  #361/55  verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK
  #361/56  verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK
  #361/57  verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK
  #361/58  verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK
  #361/59  verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK
  #361/60  verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK
  #361/61  verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK
  #361/62  verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK
  #361/63  verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK
  #361/64  verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK
  #361/65  verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK
  #361/66  verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK
  #361/67  verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK
  #361/68  verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK
  #361/69  verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK
  #361/70  verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK
  #361/71  verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK
  #361/72  verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK
  #361/73  verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK
  #361/74  verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK
  #361/75  verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK
  #361/76  verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK
  #361/77  verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK
  #361/78  verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK
  #361/79  verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK
  #361/80  verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK
  #361/81  verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK
  #361/82  verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK
  #361/83  verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK
  #361/84  verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK
  #361/85  verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK
  #361/86  verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK
  #361/87  verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK
  #361/88  verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK
  #361/89  verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK
  #361/90  verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK
  #361/91  verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK
  #361/92  verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK
  #361/93  verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK
  #361/94  verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK
  #361/95  verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK
  #361/96  verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK
  #361/97  verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK
  #361/98  verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK
  #361/99  verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK
  #361/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK
  #361/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK
  #361/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK
  #361/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK
  #361/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK
  #361/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK
  #361/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK
  #361/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK
  #361/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK
  #361/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK
  #361/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK
  #361/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK
  #361/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK
  #361/113 verifier_sdiv/SDIV32, zero divisor:OK
  #361/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK
  #361/115 verifier_sdiv/SDIV64, zero divisor:OK
  #361/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK
  #361/117 verifier_sdiv/SMOD32, zero divisor:OK
  #361/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK
  #361/119 verifier_sdiv/SMOD64, zero divisor:OK
  #361/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK
  #361     verifier_sdiv:OK
  Summary: 5/163 PASSED, 0 SKIPPED, 0 FAILED

  # ./test_progs -t ldsx_insn
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116/2   ldsx_insn/ctx_member_sign_ext:OK
  #116/3   ldsx_insn/ctx_member_narrow_sign_ext:OK
  #116     ldsx_insn:FAIL

  All error logs:
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__open 0 nsec
  test_map_val_and_probed_memory:PASS:test_ldsx_insn__load 0 nsec
  libbpf: prog 'test_ptr_struct_arg': failed to attach: ERROR: strerror_r(-524)=22
  libbpf: prog 'test_ptr_struct_arg': failed to auto-attach: -524
  test_map_val_and_probed_memory:FAIL:test_ldsx_insn__attach unexpected error: -524 (errno 524)
  #116/1   ldsx_insn/map_val and probed_memory:FAIL
  #116     ldsx_insn:FAIL
  Summary: 0/2 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
pull bot pushed a commit that referenced this pull request Jul 13, 2024
Add a test case which replaces an active ingress qdisc while keeping the
miniq in-tact during the transition period to the new clsact qdisc.

  # ./vmtest.sh -- ./test_progs -t tc_link
  [...]
  ./test_progs -t tc_link
  [    3.412871] bpf_testmod: loading out-of-tree module taints kernel.
  [    3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
  #332     tc_links_after:OK
  #333     tc_links_append:OK
  #334     tc_links_basic:OK
  #335     tc_links_before:OK
  #336     tc_links_chain_classic:OK
  #337     tc_links_chain_mixed:OK
  #338     tc_links_dev_chain0:OK
  #339     tc_links_dev_cleanup:OK
  #340     tc_links_dev_mixed:OK
  #341     tc_links_ingress:OK
  #342     tc_links_invalid:OK
  #343     tc_links_prepend:OK
  #344     tc_links_replace:OK
  #345     tc_links_revision:OK
  Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240708133130.11609-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.