drm/i915/execlists: Cancel breadcrumb on preempting the virtual engine
authorChris Wilson <chris@chris-wilson.co.uk>
Tue, 16 Jul 2019 12:49:30 +0000 (13:49 +0100)
committerChris Wilson <chris@chris-wilson.co.uk>
Fri, 19 Jul 2019 11:53:29 +0000 (12:53 +0100)
As we unwind the requests for a preemption event, we return a virtual
request back to its original virtual engine (so that it is available for
execution on any of its siblings). In the process, this means that its
breadcrumb should no longer be associated with the original physical
engine, and so we are forced to decouple it. Previously, as the request
could not complete without our awareness, we would move it to the next
real engine without any danger. However, preempt-to-busy allowed for
requests to continue on the HW and complete in the background as we
unwound, which meant that we could end up retiring the request before
fixing up the breadcrumb link.

[51679.517943] INFO: trying to register non-static key.
[51679.517956] the code is fine but needs lockdep annotation.
[51679.517960] turning off the locking correctness validator.
[51679.517966] CPU: 0 PID: 3270 Comm: kworker/u8:0 Tainted: G     U            5.2.0+ #717
[51679.517971] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[51679.518012] Workqueue: i915 retire_work_handler [i915]
[51679.518017] Call Trace:
[51679.518026]  dump_stack+0x67/0x90
[51679.518031]  register_lock_class+0x52c/0x540
[51679.518038]  ? find_held_lock+0x2d/0x90
[51679.518042]  __lock_acquire+0x68/0x1800
[51679.518047]  ? find_held_lock+0x2d/0x90
[51679.518073]  ? __i915_sw_fence_complete+0xff/0x1c0 [i915]
[51679.518079]  lock_acquire+0x90/0x170
[51679.518105]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
[51679.518112]  _raw_spin_lock+0x27/0x40
[51679.518138]  ? i915_request_cancel_breadcrumb+0x29/0x160 [i915]
[51679.518165]  i915_request_cancel_breadcrumb+0x29/0x160 [i915]
[51679.518199]  i915_request_retire+0x43f/0x530 [i915]
[51679.518232]  retire_requests+0x4d/0x60 [i915]
[51679.518263]  i915_retire_requests+0xdf/0x1f0 [i915]
[51679.518294]  retire_work_handler+0x4c/0x60 [i915]
[51679.518301]  process_one_work+0x22c/0x5c0
[51679.518307]  worker_thread+0x37/0x390
[51679.518311]  ? process_one_work+0x5c0/0x5c0
[51679.518316]  kthread+0x116/0x130
[51679.518320]  ? kthread_create_on_node+0x40/0x40
[51679.518325]  ret_from_fork+0x24/0x30
[51679.520177] ------------[ cut here ]------------
[51679.520189] list_del corruption, ffff88883675e2f0->next is LIST_POISON1 (dead000000000100)

Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190716124931.5870-4-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/gt/intel_lrc.c

index d076d9148b6dc0c4fa2e5839ca7ac335cd262b9a..dce36482a252eac4d57405e4a94ac03234ae9bac 100644 (file)
@@ -495,6 +495,19 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
                        list_move(&rq->sched.link, pl);
                        active = rq;
                } else {
+                       /*
+                        * Decouple the virtual breadcrumb before moving it
+                        * back to the virtual engine -- we don't want the
+                        * request to complete in the background and try
+                        * and cancel the breadcrumb on the virtual engine
+                        * (instead of the old engine where it is linked)!
+                        */
+                       if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
+                                    &rq->fence.flags)) {
+                               spin_lock(&rq->lock);
+                               i915_request_cancel_breadcrumb(rq);
+                               spin_unlock(&rq->lock);
+                       }
                        rq->engine = owner;
                        owner->submit_request(rq);
                        active = NULL;