Perform a ras_suspend to disable ras on all IPs to workaround
some ROCm stability issue.
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
/* block all schedulers and reset given job's ring */
list_for_each_entry(tmp_adev, device_list_handle, gmc.xgmi.head) {
+ /* disable ras on ALL IPs */
+ if (amdgpu_device_ip_need_full_reset(tmp_adev))
+ amdgpu_ras_suspend(tmp_adev);
+
for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
struct amdgpu_ring *ring = tmp_adev->rings[i];