drm/i915/gt: Unlock engine-pm after queuing the kernel context switch
authorChris Wilson <chris@chris-wilson.co.uk>
Wed, 20 Nov 2019 16:55:14 +0000 (16:55 +0000)
committerChris Wilson <chris@chris-wilson.co.uk>
Wed, 20 Nov 2019 16:58:26 +0000 (16:58 +0000)
commit5cba288466e9b229feb68295675246e7522fb5eb
treeb4d5b5d6331847140533e39fdcb5100724c13067
parenta6edbca74b305adc165e67065d7ee766006e6a48
drm/i915/gt: Unlock engine-pm after queuing the kernel context switch

In commit a79ca656b648 ("drm/i915: Push the wakeref->count deferral to
the backend"), I erroneously concluded that we last modify the engine
inside __i915_request_commit() meaning that we could enable concurrent
submission for userspace as we enqueued this request. However, this
falls into a trap with other users of the engine->kernel_context waking
up and submitting their request before the idle-switch is queued, with
the result that the kernel_context is executed out-of-sequence most
likely upsetting the GPU and certainly ourselves when we try to retire
the out-of-sequence requests.

As such we need to hold onto the effective engine->kernel_context mutex
lock (via the engine pm mutex proxy) until we have finish queuing the
request to the engine.

v2: Serialise against concurrent intel_gt_retire_requests()
v3: Describe the hairy locking scheme with intel_gt_retire_requests()
for future reference.
v4: Combine timeline->lock and engine pm release; it's hairy.

Fixes: a79ca656b648 ("drm/i915: Push the wakeref->count deferral to the backend")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120165514.3955081-2-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/gt/intel_engine_pm.c