Skip to content

Commit

Permalink
drm/amdgpu: bypass RAS error reset in some conditions
Browse files Browse the repository at this point in the history
PMFW is responsible for RAS error reset in some conditions, driver can
skip the operation.

v2: add check for ras->in_recovery, it's set earlier than
amdgpu_in_reset.

v3: fix error in gpu reset check.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  • Loading branch information
Tao Zhou authored and alexdeucher committed Oct 26, 2023
1 parent f3a3bbf commit 73582be
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
Original file line number Diff line number Diff line change
Expand Up @@ -1220,14 +1220,22 @@ int amdgpu_ras_reset_error_count(struct amdgpu_device *adev,
enum amdgpu_ras_block block)
{
struct amdgpu_ras_block_object *block_obj = amdgpu_ras_get_ras_block(adev, block, 0);
struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
const struct amdgpu_mca_smu_funcs *mca_funcs = adev->mca.mca_funcs;

if (!block_obj || !block_obj->hw_ops) {
dev_dbg_once(adev->dev, "%s doesn't config RAS function\n",
ras_block_str(block));
return -EOPNOTSUPP;
}

if (!amdgpu_ras_is_supported(adev, block))
/* skip ras error reset in gpu reset */
if ((amdgpu_in_reset(adev) || atomic_read(&ras->in_recovery)) &&
mca_funcs && mca_funcs->mca_set_debug_mode)
return -EOPNOTSUPP;

if (!amdgpu_ras_is_supported(adev, block) ||
!amdgpu_ras_get_mca_debug_mode(adev))
return -EOPNOTSUPP;

if (block_obj->hw_ops->reset_ras_error_count)
Expand Down

0 comments on commit 73582be

Please sign in to comment.