drm/i915: Mark up obj->mm.lock for shrinker
authorChris Wilson <chris@chris-wilson.co.uk>
Mon, 31 Oct 2016 12:40:48 +0000 (12:40 +0000)
committerChris Wilson <chris@chris-wilson.co.uk>
Mon, 31 Oct 2016 13:28:46 +0000 (13:28 +0000)
As we may allocate from within the obj->mm.lock we may enter the
shrinker for direct reclaim. Operating on the current object is
prevented by checking for obj->mm.pages (which is only set as the last
operation in the allocation path). However, we need to identify the
single recursion of accessing another object's obj->mm.lock as the two
locks have identical class and so appear to be the same to lockdep,
convincing it that a deadlock is possible. Use mutex_lock_nested() to
remove the false positive.

[ 2165.945734] =================================
[ 2165.945749] [ INFO: inconsistent lock state ]
[ 2165.945765] 4.9.0-rc2+ #2 Tainted: G        W
[ 2165.945781] ---------------------------------
[ 2165.945796] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 2165.945816] kswapd0/62 [HC0[0]:SC0[0]:HE1:SE1] takes: (&obj->mm.lock){+.+.?.}, at: [<ffffffffc0289a1f>] i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.945904] {RECLAIM_FS-ON-W} state was registered at:
[ 2165.945931] [<ffffffffb10bd50f>] mark_held_locks+0x6f/0xa0
[ 2165.945956] [<ffffffffb10bf889>] lockdep_trace_alloc+0x69/0xc0
[ 2165.945982] [<ffffffffb11eea53>] kmem_cache_alloc_trace+0x33/0x2a0
[ 2165.946019] [<ffffffffc028a28a>] i915_gem_object_get_pages_stolen+0x6a/0xd0 [i915]
[ 2165.946060] [<ffffffffc027e1d0>] ____i915_gem_object_get_pages+0x20/0x60 [i915]
[ 2165.946098] [<ffffffffc027e268>] __i915_gem_object_get_pages+0x58/0x70 [i915]
[ 2165.946138] [<ffffffffc028a3dc>] _i915_gem_object_create_stolen+0xec/0x120 [i915]
[ 2165.946177] [<ffffffffc028af73>] i915_gem_object_create_stolen_for_preallocated+0xf3/0x3f0 [i915]
[ 2165.946222] [<ffffffffc02bae43>] intel_alloc_initial_plane_obj.isra.125+0xd3/0x200 [i915]
[ 2165.946266] [<ffffffffc02cb1c1>] intel_modeset_init+0x931/0x1530 [i915]
[ 2165.946301] [<ffffffffc023d584>] i915_driver_load+0xa14/0x14a0 [i915]
[ 2165.946335] [<ffffffffc0248aff>] i915_pci_probe+0x4f/0x70 [i915]
[ 2165.946362] [<ffffffffb13cc452>] local_pci_probe+0x42/0xa0
[ 2165.946386] [<ffffffffb13cd903>] pci_device_probe+0x103/0x150
[ 2165.946411] [<ffffffffb14adeb3>] driver_probe_device+0x223/0x430
[ 2165.946436] [<ffffffffb14ae1a3>] __driver_attach+0xe3/0xf0
[ 2165.946461] [<ffffffffb14ab943>] bus_for_each_dev+0x73/0xc0
[ 2165.946485] [<ffffffffb14ad5ee>] driver_attach+0x1e/0x20
[ 2165.946508] [<ffffffffb14ad003>] bus_add_driver+0x173/0x270
[ 2165.946533] [<ffffffffb14aee70>] driver_register+0x60/0xe0
[ 2165.946557] [<ffffffffb13cbd6d>] __pci_register_driver+0x5d/0x60
[ 2165.946606] [<ffffffffc0378057>] soundcore_open+0x17/0x230 [soundcore]
[ 2165.946636] [<ffffffffb1000450>] do_one_initcall+0x50/0x180
[ 2165.946661] [<ffffffffb117fd2d>] do_init_module+0x5f/0x1f1
[ 2165.946685] [<ffffffffb1108964>] load_module+0x2174/0x2a80
[ 2165.946709] [<ffffffffb11094df>] SYSC_finit_module+0xdf/0x110
[ 2165.946734] [<ffffffffb110952e>] SyS_finit_module+0xe/0x10
[ 2165.946758] [<ffffffffb1742aea>] entry_SYSCALL_64_fastpath+0x18/0xad
[ 2165.946776] irq event stamp: 90871
[ 2165.946788] hardirqs last  enabled at (90871):
[ 2165.946805] [<ffffffffb173e9da>] __mutex_unlock_slowpath+0x11a/0x1c0
[ 2165.946823] hardirqs last disabled at (90870):
[ 2165.946839] [<ffffffffb173e91b>] __mutex_unlock_slowpath+0x5b/0x1c0
[ 2165.946856] softirqs last  enabled at (90858):
[ 2165.946872] [<ffffffffb174581a>] __do_softirq+0x39a/0x4c6
[ 2165.946887] softirqs last disabled at (90671):
[ 2165.946902] [<ffffffffb1066cea>] irq_exit+0xea/0xf0
[ 2165.946916] other info that might help us debug this:
[ 2165.946936]  Possible unsafe locking scenario:
[ 2165.946955]        CPU0
[ 2165.946965]        ----
[ 2165.946975]   lock(&obj->mm.lock);
[ 2165.947000]   <Interrupt>
[ 2165.947010]     lock(&obj->mm.lock);
[ 2165.947035] *** DEADLOCK ***
[ 2165.947054] 2 locks held by kswapd0/62:
[ 2165.947067]  #0: (shrinker_rwsem){++++..}, at: [<ffffffffb119a20e>] shrink_slab.part.40+0x5e/0x5d0
[ 2165.947120]  #1: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffc028954b>] i915_gem_shrinker_lock+0x1b/0x60 [i915]
[ 2165.948909] stack backtrace:
[ 2165.950650] CPU: 2 PID: 62 Comm: kswapd0 Tainted: G        W 4.9.0-rc2+ #2
[ 2165.951587] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
[ 2165.952484]  ffffc90000b5f8c8 ffffffffb137f645 ffff88016c5a2700 ffffffffb25f20a0
[ 2165.953395]  ffffc90000b5f918 ffffffffb10bcecd 0000000000000000 ffff880100000001
[ 2165.954305]  0000000000000001 000000000000000a ffff88016c5a2fd0 ffff88016c5a2700
[ 2165.955240] Call Trace:
[ 2165.956170]  [<ffffffffb137f645>] dump_stack+0x68/0x93
[ 2165.957071]  [<ffffffffb10bcecd>] print_usage_bug+0x1dd/0x1f0
[ 2165.957979]  [<ffffffffb10bd439>] mark_lock+0x559/0x5c0
[ 2165.958875]  [<ffffffffb10bc3f0>] ?  print_shortest_lock_dependencies+0x1b0/0x1b0
[ 2165.959829]  [<ffffffffb10be04d>] __lock_acquire+0x66d/0x12a0
[ 2165.960729]  [<ffffffffb11ef541>] ? __slab_free+0xa1/0x340
[ 2165.961625]  [<ffffffffb10dba5d>] ?  debug_lockdep_rcu_enabled+0x1d/0x20
[ 2165.962530]  [<ffffffffb10bd50f>] ? mark_held_locks+0x6f/0xa0
[ 2165.963457]  [<ffffffffb10bf0b0>] lock_acquire+0xf0/0x1f0
[ 2165.964368]  [<ffffffffc0289a1f>] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.965269]  [<ffffffffc0289a1f>] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.966150]  [<ffffffffb173d837>] mutex_lock_nested+0x77/0x420
[ 2165.967030]  [<ffffffffc0289a1f>] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.967952]  [<ffffffffc027c7a1>] ?  __i915_gem_object_put_pages.part.58+0x161/0x1b0 [i915]
[ 2165.968835]  [<ffffffffc0289a1f>] i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.969712]  [<ffffffffc0289e40>] i915_gem_shrinker_scan+0x70/0xb0 [i915]
[ 2165.970591]  [<ffffffffb119a3ae>] shrink_slab.part.40+0x1fe/0x5d0
[ 2165.971504]  [<ffffffffb119f19c>] shrink_node+0x22c/0x320
[ 2165.972371]  [<ffffffffb11a05fb>] kswapd+0x38b/0x9b0
[ 2165.973238]  [<ffffffffb11a0270>] ?  mem_cgroup_shrink_node+0x330/0x330
[ 2165.974068]  [<ffffffffb108630f>] kthread+0xff/0x120
[ 2165.974929]  [<ffffffffb1086210>] ? kthread_park+0x60/0x60
[ 2165.975847]  [<ffffffffb1742d57>] ret_from_fork+0x27/0x40

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation...")
Testcase: igt/gem_ctx_create/maximum-swap
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161031124048.30355-1-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/i915_gem_shrinker.c

index 0241658af16bf46613615f4be47821f0e8956625..0daa09cabbccddc4ccfad4e17cd6d075731f3a21 100644 (file)
@@ -223,7 +223,9 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
                                continue;
 
                        if (unsafe_drop_pages(obj)) {
-                               mutex_lock(&obj->mm.lock);
+                               /* May arrive from get_pages on another bo */
+                               mutex_lock_nested(&obj->mm.lock,
+                                                 SINGLE_DEPTH_NESTING);
                                if (!obj->mm.pages) {
                                        __i915_gem_object_invalidate(obj);
                                        count += obj->base.size >> PAGE_SHIFT;