drm/i915/lrc: Scrub the GPU state of the guilty hanging request
author Chris Wilson <chris@chris-wilson.co.uk>
Sat, 28 Apr 2018 11:15:32 +0000 (12:15 +0100)
committer Chris Wilson <chris@chris-wilson.co.uk>
Mon, 30 Apr 2018 10:52:41 +0000 (11:52 +0100)
commit 5692251c254a3d561316c4e8e10c77e470b60658
tree e95f4a56262274a50cb182ad1b987849e34520b1
parent 78b60ce7b96cf1869b51cee916a40041e400d6ce

Previously, we just reset the ring register in the context image such
that we could skip over the broken batch and emit the closing
breadcrumb. However, on resume the context image and GPU state would be
reloaded, and the reset may have left them in an inconsistent state.
The presumption was that at worst this would just cause another reset
and skip again until it recovered; however, it seems just as likely to
cause an unrecoverable hang. Instead of risking loading an incomplete
context image, restore it to the default state.

v2: Fix up off-by-one from including the ppHWSP in with the register
state.
v3: Use a ring local to compact a few lines.
v4: Beware setting the ring local before checking for a NULL request.
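
The description and revision notes above can be illustrated with a
small userspace model. This is only a sketch under stated assumptions,
not the code in intel_lrc.c: every toy_* name, the page counts, and the
layout (one per-process hardware status page followed by the register
state) are hypothetical stand-ins chosen for illustration. It shows the
register state being overwritten from a default image while the status
page is skipped, and the request being checked for NULL before any
local is derived from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: one status page, then the register state. */
#define TOY_PAGE_SIZE    4096u
#define TOY_PPHWSP_PAGES 1u
#define TOY_STATE_PAGES  2u
#define TOY_IMAGE_SIZE   ((TOY_PPHWSP_PAGES + TOY_STATE_PAGES) * TOY_PAGE_SIZE)

struct toy_ring {
	uint32_t head;
	uint32_t tail;
};

struct toy_context {
	uint8_t image[TOY_IMAGE_SIZE];	/* status page + register state */
	struct toy_ring ring;
};

struct toy_request {
	struct toy_context *ctx;
	uint32_t postfix;	/* ring offset of the closing breadcrumb */
};

/*
 * Scrub the context of a guilty request: restore the register state
 * from a default image (same layout as the context image) and move
 * the ring head to the breadcrumb so the broken batch is skipped.
 */
static void toy_scrub_guilty(struct toy_request *rq,
			     const uint8_t *default_image)
{
	const size_t skip = TOY_PPHWSP_PAGES * TOY_PAGE_SIZE;
	struct toy_context *ctx;

	/* Check for a NULL request before touching anything behind it. */
	if (!rq)
		return;

	assert(default_image);
	ctx = rq->ctx;

	/* Copy only the register state; leave the status page alone. */
	memcpy(ctx->image + skip, default_image + skip,
	       TOY_IMAGE_SIZE - skip);

	/* Resume execution at the closing breadcrumb. */
	ctx->ring.head = rq->postfix;
}
```

In this model an off-by-one in the skip offset would either clobber the
status page or leave the last page of register state unscrubbed, which
is why the copy starts at the first page after the status page and
covers exactly the remainder of the image.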

References: https://bugs.freedesktop.org/show_bug.cgi?id=105304
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20180428111532.15819-1-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/intel_lrc.c