forked from tiwai/sound
-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve quirks detection for es8336 #27
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
No functionality change, only add indirection for bus management helpers. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111013135.38289-5-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
No functionality change, only add indirection for link power management helpers. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111013135.38289-6-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
No functionality change, only add indirection for in-band wake management helpers. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111013135.38289-7-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
The auxdevice layer is completely generic, it should be split from intel.c which is only geared to the 'cnl' hw_ops now. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Rander Wang <rander.wang@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111013135.38289-8-yung-chuan.liao@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
…pt thread When the code reaches the SoundWire interrupt thread handling, the interrupt was enabled already, and there is no code that disables it -> this is a no-op sequence. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Acked-By: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20221111042653.45520-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
Different generations of Intel hardware rely on different programming sequences to enable SoundWire IP. In existing hardware, the SoundWire interrupt is enabled with a register field in the DSP register space. With HDaudio multi-link extensions registers, the SoundWire interrupt will be enabled with a generic interrupt enable field in LCTL, without any dependency on the DSP being enabled. Add a per-chip callback following the example of the check_sdw_irq() model already upstream. Note that the callback is not populated yet for MeteorLake (MTL) since the interrupts are already enabled in the init. A follow-up patch will move the functionality to this callback after a couple of cleanups. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111042653.45520-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
…tions The offsets and sequences are identical for interrupt enabling and disabling, we can refactor the code with a single routine and a boolean. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111042653.45520-4-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
There's no real rationale for enabling the SoundWire interrupt in the init, this can be done from the enable_sdw_irq() callback. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111042653.45520-5-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
…tion The number of links is stored in different registers depending on the IP version, add sdw_check_lcount() callback. This callback only checks that the number of links supported in hardware is compatible with the number of links exposed in ACPI _DSD properties. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111042653.45520-6-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
The functionality is implemented with per-chip callbacks, there are no users of this symbol, remove the code. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Acked-By: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20221111042653.45520-7-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
The number of links is checked with a chip-dependent helper in the caller, remove the check in drivers/soundwire/intel_init.c This change makes intel_init.c hardware-agnostic - which is quite fitting for a layer that only creates auxiliary devices. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Acked-By: Vinod Koul <vkoul@kernel.org> Link: https://lore.kernel.org/r/20221111042653.45520-8-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
There's no reason to delay the multi-link parsing, this can be done earlier before checking the SoundWire capabilities. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://lore.kernel.org/r/20221111042653.45520-9-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
ALSA callers need to know whether there was a change to the value so that they can report a control write change correctly. Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://lore.kernel.org/r/20221123165811.3014472-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
Functions that update cs_dsp controls need to handle return codes that indicate whether the control value changed. A return code of 1 indicates a change, 0 indicates no-change and a negative value is an error condition. Acked controls implicitly change value when written so a successful write shall always report that the value changed. Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com> Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://lore.kernel.org/r/20221123165811.3014472-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
With incorrect topology file we can end up with a case when swidget->spipe is NULL and it going to be caught in sof_widget_list_setup(). Check spipe against NULL before dereferencing for the pipe_widget. Reported-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
These fixes were queued for v5.18 but due to me changing my scripting they never actually got merged - pulling them up now.
As the devm_kcalloc may return NULL, the return value needs to be checked to avoid NULL poineter dereference. Fixes: 24caf8d ("ASoC: qcom: lpass-sc7180: Add platform driver for lpass audio") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221124140510.63468-1-yuancan@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
SND_SOC_QCOM_COMMON depends on SOUNDWIRE for some symbols but this is not explicitly specified using Kconfig depends. On the other hand SND_SOC_QCOM_COMMON is also directly selected by the sound card Kconfigs, this could result in various combinations and some symbols ending up in modules and soundcard that uses those symbols as in-build driver. Fix these issues by explicitly specifying the dependencies of SND_SOC_QCOM_COMMON and also use imply a to select SND_SOC_QCOM_COMMON so that the symbol is selected based on its dependencies. Also remove dummy stubs in common.c around CONFIG_SOUNDWIRE Fixes: 3bd975f ("ASoC: qcom: sm8250: move some code to common") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20221124140351.407506-1-srinivas.kandagatla@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
Add new acpi id and compatible id for nau8315. Signed-off-by: David Lin <CTLIN0@nuvoton.com> Link: https://lore.kernel.org/r/20221124055658.53828-1-CTLIN0@nuvoton.com Signed-off-by: Mark Brown <broonie@kernel.org>
The audio amplifier NAU8318 is almost functionally identical to NAU8315. Adds compatible string "nuvoton,nau8318" for driver reuse. Signed-off-by: David Lin <CTLIN0@nuvoton.com> Link: https://lore.kernel.org/r/20221124055658.53828-2-CTLIN0@nuvoton.com Signed-off-by: Mark Brown <broonie@kernel.org>
In mt8186 platform, I2S2 should be the main I2S port that provide the clock, on the contrary I2S3 should be the second I2S port that use this clock. Fixes: 9986bda ("ASoC: mediatek: mt8186: Configure shared clocks") Signed-off-by: Jiaxin Yu <jiaxin.yu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20221124023050.4470-1-jiaxin.yu@mediatek.com Signed-off-by: Mark Brown <broonie@kernel.org>
If the DSP crashes before the system suspends, the setting of target state will be skipped because the firmware state will no longer be SOF_FW_BOOT_COMPLETE. This leads to the incorrect assumption that the DSP should suspend to D0I3 instead of suspending to D3. To fix this, set the target_state before we skip to DSP suspend even when the DSP has crashed. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
When the DSP is suspended while the firmware is in the crashed state, we skip tearing down the pipelines. This means that the widget reference counts will not get to reset to 0 before suspend. This will lead to errors with resuming audio after system resume. To fix this, invoke the tear_down_all_pipelines op before skipping to DSP suspend. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
…ernel.org/pub/scm/linux/kernel/git/wsa/linux into HEAD so we can apply I2C cleanups.
The probe function doesn't make use of the i2c_device_id * parameter so it can be trivially converted. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20221118224540.619276-602-uwe@kleine-koenig.org Signed-off-by: Mark Brown <broonie@kernel.org>
The probe function doesn't make use of the i2c_device_id * parameter so it can be trivially converted. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20221118224540.619276-605-uwe@kleine-koenig.org Signed-off-by: Mark Brown <broonie@kernel.org>
.probe_new() doesn't get the i2c_device_id * parameter, so determine that explicitly in the probe function. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20221118224540.619276-603-uwe@kleine-koenig.org Signed-off-by: Mark Brown <broonie@kernel.org>
The probe function doesn't make use of the i2c_device_id * parameter so it can be trivially converted. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20221118224540.619276-604-uwe@kleine-koenig.org Reviewed-by: Matt Flax <flatmax@flatmax.com> Signed-off-by: Mark Brown <broonie@kernel.org>
Qualify the KConfig symbol for cs_dsp by adding a FW_ prefix so that it is more explicit what is being referred to. This is preparation for using the symbol to namespace the exports. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://lore.kernel.org/r/20221124134556.3343784-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
Move all the exports into a namespace. This also adds the MODULE_IMPORT_NS to the 3 drivers that use the exported functions. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://lore.kernel.org/r/20221124134556.3343784-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
The amplifier may provide hardware support for I/V feedback, or alternatively the firmware may generate an echo reference attached to the SSP and dailink used for the amplifier. To avoid any issues with invalid/NULL substreams in the latter case, always unconditionally set dpcm_capture. Link: thesofproject#4083 Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
…stored If the pipeline setup fails at the first widget/pipeline then we will have no spipe stored under the pipeline_list->pipelines, the pipeline_list->count is 0. If this is the case we would have a NULL pointer dereference if the execution is allowed to proceed. Check for this condition along with the pipeline_list->pipelines check Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
The create pipeline message can carry the target code_id which is set to 0 at the moment. Add macros to set the core_id in the message extension. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Token SOF_TKN_SCHED_CORE in topology file can specify the target core for the pipeline, if it is missing it is going to be 0 (as it is right now). Firmware will double-check all information retrieved by topology and report errors if required. This will allow policy and changes in topologies without a need for a synchronized kernel change. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
… error The sof_widget_free() on the error path will decrement the use count and if we jump to widget_free: then the use_count will be decremented by two, which is not correct as we only incremented once with 1. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
…race The use_count of the swidget is protect by ALSA core PCM locking with the exception when an associated kcontrol is changed. It has been observed that a rightly timed kcontrol access during stream stop can result of an attempt to send a control update to a widget which has been freed up between the check of the use_count and the message sending. We need to protect the entire sof_widget_setup() and sof_widget_free() execution to make it safe to rely on the use_count. Move the code under an _unlocked() function and use a mutex to protect the execution of the functions for concurrency. On the control path we need to use the lock only for the kcontrol access, the widget_kcontrol_setup() op is called with the lock already held. Reported-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
When the DSP is disabled in the BIOS, the DSP_BAR and PP_BAR cannot be accessed. One possible objection noted in initial reviews is that this patch adds a number of branches. However the number of branches is actually limited in probe/suspend/resume routines mostly, so there isn't really a degradation in terms of readability and maintainability. Adding yet another level of abstraction/ops/callbacks would increase complexity and not really help in terms of code reuse or readability and maintainability. A split between controller and DSP driver would be even more invasive. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Add support of SOF on MediaTek MT8188 SoC. MT8188 ADSP integrates with a single core Cadence HiFi-5 DSP. The IPC communication between AP and DSP is based on shared DRAM and mailbox interrupt. The change in the mt8186.h is compatible on both mt8186 and mt8188. The register controls booting the DSP core with the default address or the user specified address. Both mt8186 and mt8188 should boot with the user specified boot in the driver. The usage of the register is the same on both SoC, but the control bit is different on mt8186 and mt8188, which is bit 1 on mt8186 and bit 0 on mt8188. Configure the redundant bit has no side effect on both SoCs. Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
Set the generic iomem callback for debugfs_add_region_item to support sof-logger. Signed-off-by: Tinghan Shen <tinghan.shen@mediatek.com>
Just a couple of checkpatch fixes. Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Most of the ES83xx codec configuration is exposed in the DSDT table and accessible via a _DSM method. Start adding basic definitions and helpers to dump the information. Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Co-developed-by: David Yang <yangxiaohua@everest-semi.com> Signed-off-by: David Yang <yangxiaohua@everest-semi.com> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Read jack detection information from _DSM and pre-set value. The value may be overridden with a property set by the machine driver. Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Read jack detection information from _DSM and pre-set value. The value may be overridden with a property set by the machine driver. Reviewed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
The ACPI entry for es83xx should contain its configuration at _DSM. It is helpful to debug problems by looking on its contents. So, add a debug function that outputs its contents as read by the Kernel. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
The enable GPIOs can either be low or high level activated. Add a logic to detect it from _DSM. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
The _DSM method has an entry to detect the microphone type, and where they're attached. Add support for checking it. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Now that we can read GPIO enable level from ACPI _DSM, use it to properly set the right GPIO levels. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
This is already detected from _DSM, so no need to have a quirk inside the driver anymore. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Use the _DSM data to fill the analog microphone ports, removing another quirk. Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Hmm... basing it on the top of topic/sof-dev didn't work... it still shows 10k+ commits. I'll just cancel it and open PR against https://github.com/thesofproject/linux/. |
plbossart
pushed a commit
that referenced
this pull request
Jun 16, 2023
The cited commit adds a compeletion to remove dependency on rtnl lock. But it causes a deadlock for multiple encapsulations: crash> bt ffff8aece8a64000 PID: 1514557 TASK: ffff8aece8a64000 CPU: 3 COMMAND: "tc" #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45 #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418 #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898 #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8 #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core] #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core] #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core] #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core] #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core] #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core] #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core] #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower] #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower] #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c crash> bt 0xffff8aeb07544000 PID: 1110766 TASK: ffff8aeb07544000 CPU: 0 COMMAND: "kworker/u20:9" #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45 #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418 #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88 #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core] #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core] #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core] #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012 #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f After the first encap is attached, flow will be added to encap entry's flows list. If neigh update is running at this time, the following encaps of the flow can't hold the encap_tbl_lock and sleep. If neigh update thread is waiting for that flow's init_done, deadlock happens. Fix it by holding lock outside of the for loop. If neigh update is running, prevent encap flows from offloading. Since the lock is held outside of the for loop, concurrent creation of encap entries is not allowed. So remove unnecessary wait_for_completion call for res_ready. Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
plbossart
pushed a commit
that referenced
this pull request
Mar 11, 2024
It appears the client object tree has no locking unless I've missed something else. Fix races around adding/removing client objects, mostly vram bar mappings. 4562.099306] general protection fault, probably for non-canonical address 0x6677ed422bceb80c: 0000 [#1] PREEMPT SMP PTI [ 4562.099314] CPU: 2 PID: 23171 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 [ 4562.099324] Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 [ 4562.099330] RIP: 0010:nvkm_object_search+0x1d/0x70 [nouveau] [ 4562.099503] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 66 0f 1f 00 0f 1f 44 00 00 48 89 f8 48 85 f6 74 39 48 8b 87 a0 00 00 00 48 85 c0 74 12 <48> 8b 48 f8 48 39 ce 73 15 48 8b 40 10 48 85 c0 75 ee 48 c7 c0 fe [ 4562.099506] RSP: 0000:ffffa94cc420bbf8 EFLAGS: 00010206 [ 4562.099512] RAX: 6677ed422bceb814 RBX: ffff98108791f400 RCX: ffff9810f26b8f58 [ 4562.099517] RDX: 0000000000000000 RSI: ffff9810f26b9158 RDI: ffff98108791f400 [ 4562.099519] RBP: ffff9810f26b9158 R08: 0000000000000000 R09: 0000000000000000 [ 4562.099521] R10: ffffa94cc420bc48 R11: 0000000000000001 R12: ffff9810f02a7cc0 [ 4562.099526] R13: 0000000000000000 R14: 00000000000000ff R15: 0000000000000007 [ 4562.099528] FS: 00007f629c5017c0(0000) GS:ffff98142c700000(0000) knlGS:0000000000000000 [ 4562.099534] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4562.099536] CR2: 00007f629a882000 CR3: 000000017019e004 CR4: 00000000003706f0 [ 4562.099541] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4562.099542] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4562.099544] Call Trace: [ 4562.099555] <TASK> [ 4562.099573] ? die_addr+0x36/0x90 [ 4562.099583] ? exc_general_protection+0x246/0x4a0 [ 4562.099593] ? asm_exc_general_protection+0x26/0x30 [ 4562.099600] ? nvkm_object_search+0x1d/0x70 [nouveau] [ 4562.099730] nvkm_ioctl+0xa1/0x250 [nouveau] [ 4562.099861] nvif_object_map_handle+0xc8/0x180 [nouveau] [ 4562.099986] nouveau_ttm_io_mem_reserve+0x122/0x270 [nouveau] [ 4562.100156] ? dma_resv_test_signaled+0x26/0xb0 [ 4562.100163] ttm_bo_vm_fault_reserved+0x97/0x3c0 [ttm] [ 4562.100182] ? __mutex_unlock_slowpath+0x2a/0x270 [ 4562.100189] nouveau_ttm_fault+0x69/0xb0 [nouveau] [ 4562.100356] __do_fault+0x32/0x150 [ 4562.100362] do_fault+0x7c/0x560 [ 4562.100369] __handle_mm_fault+0x800/0xc10 [ 4562.100382] handle_mm_fault+0x17c/0x3e0 [ 4562.100388] do_user_addr_fault+0x208/0x860 [ 4562.100395] exc_page_fault+0x7f/0x200 [ 4562.100402] asm_exc_page_fault+0x26/0x30 [ 4562.100412] RIP: 0033:0x9b9870 [ 4562.100419] Code: 85 a8 f7 ff ff 8b 8d 80 f7 ff ff 89 08 e9 18 f2 ff ff 0f 1f 84 00 00 00 00 00 44 89 32 e9 90 fa ff ff 0f 1f 84 00 00 00 00 00 <44> 89 32 e9 f8 f1 ff ff 0f 1f 84 00 00 00 00 00 66 44 89 32 e9 e7 [ 4562.100422] RSP: 002b:00007fff9ba2dc70 EFLAGS: 00010246 [ 4562.100426] RAX: 0000000000000004 RBX: 000000000dd65e10 RCX: 000000fff0000000 [ 4562.100428] RDX: 00007f629a882000 RSI: 00007f629a882000 RDI: 0000000000000066 [ 4562.100432] RBP: 00007fff9ba2e570 R08: 0000000000000000 R09: 0000000123ddf000 [ 4562.100434] R10: 0000000000000001 R11: 0000000000000246 R12: 000000007fffffff [ 4562.100436] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 4562.100446] </TASK> [ 4562.100448] Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink cmac bnep sunrpc iwlmvm intel_rapl_msr intel_rapl_common snd_sof_pci_intel_cnl x86_pkg_temp_thermal intel_powerclamp snd_sof_intel_hda_common mac80211 coretemp snd_soc_acpi_intel_match kvm_intel snd_soc_acpi snd_soc_hdac_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_mlink snd_sof_intel_hda snd_sof kvm snd_sof_utils snd_soc_core snd_hda_codec_realtek libarc4 snd_hda_codec_generic snd_compress snd_hda_ext_core vfat fat snd_hda_intel snd_intel_dspcfg irqbypass iwlwifi snd_hda_codec snd_hwdep snd_hda_core btusb btrtl mei_hdcp iTCO_wdt rapl mei_pxp btintel snd_seq iTCO_vendor_support btbcm snd_seq_device intel_cstate bluetooth snd_pcm cfg80211 intel_wmi_thunderbolt wmi_bmof intel_uncore snd_timer mei_me snd ecdh_generic i2c_i801 [ 4562.100541] ecc mei i2c_smbus soundcore rfkill intel_pch_thermal acpi_pad zram nouveau drm_ttm_helper ttm gpu_sched i2c_algo_bit drm_gpuvm drm_exec mxm_wmi drm_display_helper drm_kms_helper drm crct10dif_pclmul crc32_pclmul nvme e1000e crc32c_intel nvme_core ghash_clmulni_intel video wmi pinctrl_cannonlake ip6_tables ip_tables fuse [ 4562.100616] ---[ end trace 0000000000000000 ]--- Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org
plbossart
pushed a commit
that referenced
this pull request
May 3, 2024
Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <stable@vger.kernel.org> Fixes: be55287 ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Use _DSM table to detect GPIO level and AMIC. This change has @plbossart patches that apply, and replaces one that doesn't actually work.
Tested on Huawei Matebook D15.