Ville Syrjälä [Mon, 31 Oct 2016 20:37:04 +0000 (22:37 +0200)]
drm/i915: Use struct intel_crtc in legacy platform wm code
Unify our approach to things by using intel_crtc instead of drm_crtc.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Mon, 31 Oct 2016 20:37:03 +0000 (22:37 +0200)]
drm/i915: Pass intel_crtc to update_wm functions
Unify our approach to things by passing around intel_crtc instead of
drm_crtc.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-5-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Mon, 31 Oct 2016 20:37:02 +0000 (22:37 +0200)]
drm/i915: Pass intel_crtc to intel_crtc_active()
Unify our approach to things by passing around intel_crtc instead of
drm_crtc.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Mon, 31 Oct 2016 20:37:01 +0000 (22:37 +0200)]
drm/i915: Pass dev_priv to skl_init_scalers()
Unify our approach to things by passing around dev_priv instead of dev.
While at it let's do some house cleaning: s/intel_foo/foo/ and move
things into tighter scope.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Ville Syrjälä [Mon, 31 Oct 2016 20:37:00 +0000 (22:37 +0200)]
drm/i915: Pass dev_priv to plane constructors
Unify our approach to things by passing around dev_priv instead of dev.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Chris Wilson [Sun, 30 Oct 2016 13:28:20 +0000 (13:28 +0000)]
drm/i915: Export a function to flush the context upon pinning
For legacy contexts we employ an optimisation to only flush the context
when binding into the global GTT. This avoids stalling on the GPU when
reloading an active context. Wrap this detail up into a helper and
export it for a potential third user. (Longer term, context pinning
needs to be reworked as the current handling of switch context pins too
late and so risks eviction and corrupting the request. Plans, plans,
plans.)
v2: Expand the comment explaining the optimisation for avoiding the
stall on active contexts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161030132820.32163-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Maarten Lankhorst [Wed, 26 Oct 2016 13:41:36 +0000 (15:41 +0200)]
drm/i915/gen9+: Use the watermarks from crtc_state for everything, v2.
There's no need to keep a duplicate skl_pipe_wm around any more,
everything can be discovered from crtc_state, which we pass around
correctly now even in case of plane disable.
The copy in intel_crtc->wm.skl.active is equal to
crtc_state->wm.skl.optimal after the atomic commit completes.
It's useful for two-step watermark programming, but not required for
gen9+ which does it in a single step. We can pull the old allocation
from old_crtc_state.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477489299-25777-9-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Maarten Lankhorst [Wed, 26 Oct 2016 13:41:34 +0000 (15:41 +0200)]
drm/i915/skl+: Clean up minimum allocations, v2.
Move calculating minimum allocations to a helper, which cleans up the
code some more. The cursor is still allocated in advance because it
doesn't count towards data rate and should always be reserved.
changes since v1:
- Change comment to have an extra opening line. (Matt)
- Rebase to remove unused plane->pipe == pipe, handled by the iterator
now. (Paulo)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477489299-25777-7-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Maarten Lankhorst [Wed, 26 Oct 2016 13:41:33 +0000 (15:41 +0200)]
drm/i915/skl+: Remove minimum block allocation from crtc state.
This is not required any more now that we get fresh state from
drm_atomic_crtc_state_for_each_plane_state. Zero all state
in advance.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477489299-25777-6-git-send-email-maarten.lankhorst@linux.intel.com
Maarten Lankhorst [Wed, 26 Oct 2016 13:41:32 +0000 (15:41 +0200)]
drm/i915/skl+: Remove data_rate from watermark struct, v2.
It's only used in one function, and can be calculated without caching it
in the global struct by using drm_atomic_crtc_state_for_each_plane_state.
There are loops over all planes, including planes that don't exist.
This is harmless, because data_rate will always be 0 for them and we
never program them when updating watermarks.
Changes since v1:
- Rename rate back to data_rate, and change array name to
plane_data_rate. (Matt)
- Remove whitespace. (Paulo)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477489299-25777-5-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Maarten Lankhorst [Tue, 1 Nov 2016 11:04:10 +0000 (12:04 +0100)]
drm/i915/gen9+: Use for_each_intel_plane_on_crtc in skl_print_wm_changes, v2.
Using for_each_intel_plane_on_crtc will allow us to find all allocations
that may have changed, not just the one added by the atomic state.
This will print changes to plane allocations for crtc's when some
planes are not added to the atomic state.
Changes since v1:
- Rephrase commit message. (Ville)
- Use plane->base.id and plane->name to kill off cursor special
case. (Ville)
- Add intel_crtc to prevent a line wrap. (Paulo)
- Line wrap debug messages.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/c9f7dc1a-d23a-7c16-b2b7-1c23dd07ed35@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Maarten Lankhorst [Wed, 26 Oct 2016 13:41:30 +0000 (15:41 +0200)]
drm/i915/gen9+: Use cstate plane mask instead of crtc->state.
I'm planning on getting rid of all obj->state dereferences,
and replacing them with accessor functions.
Remove this one early. The two masks are equivalent because removed
planes are already part of the state, else they could not
have been removed.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477489299-25777-3-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Maarten Lankhorst [Wed, 26 Oct 2016 13:41:29 +0000 (15:41 +0200)]
drm/i915/skl+: Prepare for removing data rate from skl watermark state, v2.
Caching is not required, drm_atomic_crtc_state_for_each_plane_state can
be used to inspect the states of all planes assigned to the CRTC even
if they are not part of _state, so we can just recalculate every time.
Changes since v1:
- Remove plane->pipe checks, they're implied by the macros.
- Split unrelated changes to a separate commit.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477489299-25777-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Chris Wilson [Tue, 1 Nov 2016 12:11:34 +0000 (12:11 +0000)]
drm/i915: Improve lockdep tracking for obj->mm.lock
The shrinker may appear to recurse into obj->mm.lock as the shrinker may
be called from a direct reclaim path whilst handling get_pages. We
filter out recursing on the same obj->mm.lock by inspecting
obj->mm.pages, but we do want to take the lock on a second object in
order to reap their pages. lockdep spots the recursion on the same
lockclass and needs annotation to avoid a false positive. To keep the
two paths distinct, create an enum to indicate which subclass of
obj->mm.lock we are using. This removes the false positive and avoids
masking real bugs.
Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101121134.27504-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Tue, 1 Nov 2016 11:54:00 +0000 (11:54 +0000)]
drm/i915: Store the vma in an rbtree under the object
With full-ppgtt one of the main bottlenecks is the lookup of the VMA
underneath the object. For execbuf there is merit in having a very fast
direct lookup of ctx:handle to the vma using a hashtree, but that still
leaves a large number of other lookups. One way to speed up the lookup
would be to use a rhashtable, but that requires extra allocations and
may exhibit poor worst case behaviour. An alternative is to use an
embedded rbtree, i.e. no extra allocations and deterministic behaviour,
but at the slight cost of O(lgN) lookups (instead of O(1) for
rhashtable). The majority of such trees will be very shallow and so not much
slower, and still scales much, much better than the current unsorted
list.
v2: Bump vma_compare() to return a long, as we return the result of
comparing two pointers.
References: https://bugs.freedesktop.org/show_bug.cgi?id=87726
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk
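The embedded-tree lookup described in the commit above can be sketched in
user-space C. This is a minimal illustration and not the i915 code: struct
vma_sketch and the vma_sketch_* helpers are hypothetical names, the tree is a
plain unbalanced binary search tree (the kernel rbtree also rebalances on
insert, which is omitted here), and the key is assumed to be just the
address-space pointer. The comparison returns a long, in the spirit of the v2
note, and is written as a three-way comparison so that it never subtracts
unrelated pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vma; not the i915 structure. */
struct vma_sketch {
	const void *vm;                  /* address space the vma belongs to (lookup key) */
	struct vma_sketch *left, *right; /* tree links embedded in the node: no extra allocation */
};

/* Three-way comparison of two pointers, returned as a long. */
static long vma_sketch_compare(const void *a, const void *b)
{
	uintptr_t x = (uintptr_t)a, y = (uintptr_t)b;

	return (long)(x > y) - (long)(x < y);
}

/* Link a node into the tree rooted at *root (no rebalancing in this sketch). */
static void vma_sketch_insert(struct vma_sketch **root, struct vma_sketch *node)
{
	struct vma_sketch **link = root;

	node->left = node->right = NULL;
	while (*link) {
		long cmp = vma_sketch_compare(node->vm, (*link)->vm);

		assert(cmp != 0); /* assumption: at most one vma per address space */
		link = cmp < 0 ? &(*link)->left : &(*link)->right;
	}
	*link = node;
}

/* Walk down from the root; the cost is proportional to the tree depth. */
static struct vma_sketch *vma_sketch_lookup(struct vma_sketch *root, const void *vm)
{
	while (root) {
		long cmp = vma_sketch_compare(vm, root->vm);

		if (cmp == 0)
			return root;
		root = cmp < 0 ? root->left : root->right;
	}
	return NULL;
}
```

Because the links live inside the node itself, inserting a vma needs no memory
beyond the vma, which is the property the commit message contrasts with a
rhashtable.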
Chris Wilson [Tue, 1 Nov 2016 10:03:17 +0000 (10:03 +0000)]
drm/i915: Track pages pinned due to swizzling quirk
If we have a tiled object and an unknown CPU swizzle pattern, we pin the
pages to prevent the object from being swapped out (and us corrupting
the contents as we do not know the access pattern and so cannot convert
it to linear and back to tiled on reuse). This requires us to remember
to drop the extra pinning when freeing the object, or else we trigger
warnings about the pin leak. In commit fbbd37b36fa5 ("drm/i915: Move
object release to a freelist + worker"), the object free path was
deferred to a worker, but the unpinning of the quirk, along with marking
the object as reclaimable, was left on the immediate path (so that if
required we could reclaim the pages under memory pressure as early as
possible). However, this split introduced a bug where the pages were no
longer being unpinned if they were marked as unneeded.
[ 231.800401] WARNING: CPU: 1 PID: 90 at drivers/gpu/drm/i915/i915_gem.c:4275 __i915_gem_free_objects+0x326/0x3c0 [i915]
[ 231.800403] WARN_ON(i915_gem_object_has_pinned_pages(obj))
[ 231.800405] Modules linked in:
[ 231.800406] snd_hda_intel i915 snd_hda_codec_generic mei_me snd_hda_codec coretemp snd_hwdep mei lpc_ich snd_hda_core snd_pcm e1000e ptp pps_core [last unloaded: i915]
[ 231.800426] CPU: 1 PID: 90 Comm: kworker/1:4 Tainted: G U 4.9.0-rc2-CI-CI_DRM_1780+ #1
[ 231.800428] Hardware name: LENOVO 7465CTO/7465CTO, BIOS 6DET44WW (2.08 ) 04/22/2009
[ 231.800456] Workqueue: events __i915_gem_free_work [i915]
[ 231.800459] ffffc9000034fc80 ffffffff8142dd65 ffffc9000034fcd0 0000000000000000
[ 231.800465] ffffc9000034fcc0 ffffffff8107e4e6 000010b300000001 0000000000001000
[ 231.800469] ffff88011d3db740 ffff880130ef0000 0000000000000000 ffff880130ef5ea0
[ 231.800474] Call Trace:
[ 231.800479] [<ffffffff8142dd65>] dump_stack+0x67/0x92
[ 231.800484] [<ffffffff8107e4e6>] __warn+0xc6/0xe0
[ 231.800487] [<ffffffff8107e54a>] warn_slowpath_fmt+0x4a/0x50
[ 231.800491] [<ffffffff811d12ac>] ? kmem_cache_free+0x2dc/0x340
[ 231.800520] [<ffffffffa009ef36>] __i915_gem_free_objects+0x326/0x3c0 [i915]
[ 231.800548] [<ffffffffa009effe>] __i915_gem_free_work+0x2e/0x50 [i915]
[ 231.800552] [<ffffffff8109c27c>] process_one_work+0x1ec/0x6b0
[ 231.800555] [<ffffffff8109c1f6>] ? process_one_work+0x166/0x6b0
[ 231.800558] [<ffffffff8109c789>] worker_thread+0x49/0x490
[ 231.800561] [<ffffffff8109c740>] ? process_one_work+0x6b0/0x6b0
[ 231.800563] [<ffffffff8109c740>] ? process_one_work+0x6b0/0x6b0
[ 231.800566] [<ffffffff810a2aab>] kthread+0xeb/0x110
[ 231.800569] [<ffffffff810a29c0>] ? kthread_park+0x60/0x60
[ 231.800573] [<ffffffff818164a7>] ret_from_fork+0x27/0x40
Moving to a separate flag for tracking the quirked pin is overkill for
the bug (since we only have to interchange the two tests in
i915_gem_free_object) but it does reduce a complicated test on all
objects and provide a sanity check for uncommon code paths.
Fixes: fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-2-chris@chris-wilson.co.uk
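The idea of tracking the quirk pin with its own flag, as described in the
commit above, can be modelled in a few lines of user-space C. This is only a
sketch under stated assumptions: struct obj_sketch, its fields and the
obj_sketch_* helpers are hypothetical and do not mirror the i915 object
layout. The point it illustrates is that the free path decides whether to drop
the extra pin from the flag alone, independently of whether the pages were
already marked as unneeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an object; not the i915 structure. */
struct obj_sketch {
	int pages_pin_count; /* outstanding pins on the backing pages */
	bool quirked;        /* true while the swizzle-quirk pin is held */
	bool dontneed;       /* pages have been marked as reclaimable */
};

/* Take the extra pin that keeps a quirked tiled object from being swapped out. */
static void obj_sketch_pin_quirk(struct obj_sketch *obj)
{
	assert(!obj->quirked); /* sanity check: the quirk pin is taken at most once */
	obj->pages_pin_count++;
	obj->quirked = true;
}

/* Drop the extra pin again. */
static void obj_sketch_unpin_quirk(struct obj_sketch *obj)
{
	assert(obj->quirked && obj->pages_pin_count > 0);
	obj->pages_pin_count--;
	obj->quirked = false;
}

/*
 * Immediate part of the free path: drop the quirk pin first, based only on
 * the flag, then mark the pages as reclaimable.
 */
static void obj_sketch_free(struct obj_sketch *obj)
{
	if (obj->quirked)
		obj_sketch_unpin_quirk(obj);
	obj->dontneed = true;
}

/* Deferred part of the free path: reports whether any pin leaked. */
static bool obj_sketch_release_is_clean(const struct obj_sketch *obj)
{
	return obj->pages_pin_count == 0;
}
```

In this model an object that was already marked as unneeded before being freed
still has its quirk pin dropped, which is the case the warning above was
reporting.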
Chris Wilson [Tue, 1 Nov 2016 10:03:16 +0000 (10:03 +0000)]
drm/i915: Avoid accessing request->timeline outside of its lifetime
Whilst waiting on a request, we may do so without holding any locks or
any guards beyond a reference to the request. In order to avoid taking
locks within request deallocation, we drop references to its timeline
(via the context and ppgtt) upon retirement. We should avoid chasing
such pointers outside of their control, in particular we inspect the
request->timeline to see if we may restore the RPS waitboost for a
client. If we instead look at the engine->timeline, we will have similar
behaviour on both full-ppgtt and !full-ppgtt systems and reduce the
amount of reward we give towards stalling clients (i.e. only if the
client stalls and the GPU is uncontended does it reclaim its boost).
This restores behaviour back to pre-timelines, whilst fixing:
[ 645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0
[ 645.078577] Read of size 4 by task gem_exec_schedu/28408
[ 645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64
[ 645.078724] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[ 645.078816] ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200
[ 645.078998] ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200
[ 645.079172] ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796
[ 645.079345] Call Trace:
[ 645.079404] [<ffffffff8143d059>] dump_stack+0x68/0x9f
[ 645.079467] [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70
[ 645.079534] [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0
[ 645.079601] [<ffffffff8122a244>] kasan_report+0x34/0x40
[ 645.079676] [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0
[ 645.079741] [<ffffffff81229951>] __asan_load4+0x61/0x80
[ 645.079807] [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0
[ 645.079876] [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590
[ 645.079944] [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500
[ 645.080016] [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0
[ 645.080084] [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
[ 645.080157] [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0
[ 645.080226] [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690
[ 645.080296] [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690
[ 645.080366] [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0
[ 645.080433] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
[ 645.080508] [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0
[ 645.080603] [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120
[ 645.080670] [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
[ 645.080738] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
[ 645.080804] [<ffffffff8120268c>] ? do_mmap+0x47c/0x580
[ 645.080871] [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
[ 645.080938] [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150
[ 645.081011] [<ffffffff81108c53>] ? up_write+0x23/0x50
[ 645.081078] [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
[ 645.081145] [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90
[ 645.081214] [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0
[ 645.082030] [<ffffffff81288408>] ? __fget+0x168/0x250
[ 645.082106] [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1
[ 645.082176] [<ffffffff81288592>] ? __fget_light+0xa2/0xc0
[ 645.082242] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
[ 645.082309] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192
[ 645.082431] Allocated:
[ 645.082480] PID = 28408
[ 645.082535] [ 645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
[ 645.082623] [ 645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0
[ 645.082716] [ 645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0
[ 645.082817] [ 645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220
[ 645.082908] [ 645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560
[ 645.083027] [ 645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0
[ 645.083152] [ 645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
[ 645.083243] [ 645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
[ 645.083334] [ 645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
[ 645.083432] [ 645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 645.083551] Freed:
[ 645.083599] PID = 27629
[ 645.083648] [ 645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
[ 645.083738] [ 645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0
[ 645.083830] [ 645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0
[ 645.083922] [ 645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170
[ 645.084021] [ 645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180
[ 645.084139] [ 645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230
[ 645.084257] [ 645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230
[ 645.084380] [ 645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630
[ 645.084500] [ 645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280
[ 645.085313] [ 645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0
[ 645.085440] [ 645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920
[ 645.085532] [ 645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840
[ 645.085622] [ 645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0
[ 645.085718] [ 645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40
[ 645.085811] Memory state around the buggy address:
[ 645.085869] ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 645.085956] ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 645.086138] ^
[ 645.086193] ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 645.086283] ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
v2: Add a comment to document the hint-like nature of
intel_engine_last_submit()
Fixes: 73cb97010d4f ("drm/i915: Combine seqno + tracking into a global timeline struct")
Fixes: 80b204bce8f2 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
Chris Wilson [Tue, 1 Nov 2016 08:48:43 +0000 (08:48 +0000)]
drm/i915: Move the recently scanned objects to the tail after shrinking
During shrinking, we walk over the list of objects searching for
victims. Any that are not removed are put back into the global list.
Currently, they are put back in order (at the front) which means they
will be first to be scanned again. If we instead move them to the rear
of the list, we will scan new potential victims on the next pass and
waste less time rescanning unshrinkable objects. Normally the lists are
kept in rough order for shrinking (with the least frequently used objects
at the start); by moving just-scanned objects to the rear we are
acknowledging that they are still in use.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101084843.3961-3-chris@chris-wilson.co.uk
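The scan order described in the commit above can be sketched with a small
circular doubly linked list in user-space C. This is an illustration only:
struct scan_node and the scan_list_* helpers are hypothetical names, and the
"shrinkable" flag stands in for whatever test the real shrinker applies.
Objects that survive a scan are collected on a temporary list and then
appended to the tail of the main list, so the next pass starts with objects
that have not been looked at yet.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical list node; the list head is a node whose id is unused. */
struct scan_node {
	struct scan_node *prev, *next;
	int id;
	bool shrinkable;
};

static void scan_list_init(struct scan_node *head)
{
	head->prev = head->next = head;
}

static void scan_list_del(struct scan_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static void scan_list_add_tail(struct scan_node *n, struct scan_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Scan up to nr_to_scan objects from the front of the list. Shrinkable
 * objects are dropped from the list; the others are put back at the tail.
 * Returns the number of objects dropped.
 */
static int scan_list_shrink(struct scan_node *head, int nr_to_scan)
{
	struct scan_node still_in_use;
	int freed = 0;

	assert(nr_to_scan >= 0);
	scan_list_init(&still_in_use);

	while (nr_to_scan-- > 0 && head->next != head) {
		struct scan_node *n = head->next;

		scan_list_del(n);
		if (n->shrinkable)
			freed++;
		else
			scan_list_add_tail(n, &still_in_use);
	}

	/* Return the survivors to the rear, keeping their relative order. */
	while (still_in_use.next != &still_in_use) {
		struct scan_node *n = still_in_use.next;

		scan_list_del(n);
		scan_list_add_tail(n, head);
	}
	return freed;
}
```

With a list 1, 2, 3, 4 where only object 2 is shrinkable, scanning two objects
leaves the list as 3, 4, 1, so the following pass begins with the unscanned
objects 3 and 4.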
Chris Wilson [Tue, 1 Nov 2016 08:48:42 +0000 (08:48 +0000)]
drm/i915: Discard objects from mm global_list after being shrunk
In the shrinker, we can safely remove an empty object (obj->mm.pages ==
NULL) after having discarded the pages because we are holding the
struct_mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101084843.3961-2-chris@chris-wilson.co.uk
Chris Wilson [Tue, 1 Nov 2016 08:48:41 +0000 (08:48 +0000)]
drm/i915: Use the full hammer when shutting down the rcu tasks
To flush all call_rcu() tasks (here from i915_gem_free_object()) we need
to call rcu_barrier() (not synchronize_rcu()). If we don't, then we may
still have objects being freed as we continue to tear down the driver.
In particular, the recently released rings may race with the memory
manager shutdown, resulting in sporadic:
[ 142.217186] WARNING: CPU: 7 PID: 6185 at drivers/gpu/drm/drm_mm.c:932 drm_mm_takedown+0x2e/0x40
[ 142.217187] Memory manager not clean during takedown.
[ 142.217187] Modules linked in: i915(-) x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich snd_hda_codec_realtek snd_hda_codec_generic mei_me mei snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core snd_pcm e1000e ptp pps_core [last unloaded: snd_hda_intel]
[ 142.217199] CPU: 7 PID: 6185 Comm: rmmod Not tainted 4.9.0-rc2-CI-Trybot_242+ #1
[ 142.217199] Hardware name: LENOVO 10AGS00601/SHARKBAY, BIOS FBKT34AUS 04/24/2013
[ 142.217200] ffffc90002ecfce0 ffffffff8142dd65 ffffc90002ecfd30 0000000000000000
[ 142.217202] ffffc90002ecfd20 ffffffff8107e4e6 000003a40778c2a8 ffff880401355c48
[ 142.217204] ffff88040778c2a8 ffffffffa040f3c0 ffffffffa040f4a0 00005621fbf8b1f0
[ 142.217206] Call Trace:
[ 142.217209] [<ffffffff8142dd65>] dump_stack+0x67/0x92
[ 142.217211] [<ffffffff8107e4e6>] __warn+0xc6/0xe0
[ 142.217213] [<ffffffff8107e54a>] warn_slowpath_fmt+0x4a/0x50
[ 142.217214] [<ffffffff81559e3e>] drm_mm_takedown+0x2e/0x40
[ 142.217236] [<ffffffffa035c02a>] i915_gem_cleanup_stolen+0x1a/0x20 [i915]
[ 142.217246] [<ffffffffa034c581>] i915_ggtt_cleanup_hw+0x31/0xb0 [i915]
[ 142.217253] [<ffffffffa0310311>] i915_driver_cleanup_hw+0x31/0x40 [i915]
[ 142.217260] [<ffffffffa0312001>] i915_driver_unload+0x141/0x1a0 [i915]
[ 142.217268] [<ffffffffa031c2c4>] i915_pci_remove+0x14/0x20 [i915]
[ 142.217269] [<ffffffff8147d214>] pci_device_remove+0x34/0xb0
[ 142.217271] [<ffffffff8157b14c>] __device_release_driver+0x9c/0x150
[ 142.217272] [<ffffffff8157bcc6>] driver_detach+0xb6/0xc0
[ 142.217273] [<ffffffff8157abe3>] bus_remove_driver+0x53/0xd0
[ 142.217274] [<ffffffff8157c787>] driver_unregister+0x27/0x50
[ 142.217276] [<ffffffff8147c265>] pci_unregister_driver+0x25/0x70
[ 142.217287] [<ffffffffa03d764c>] i915_exit+0x1a/0x71 [i915]
[ 142.217289] [<ffffffff811136b3>] SyS_delete_module+0x193/0x1e0
[ 142.217291] [<ffffffff818174ae>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 142.217292] ---[ end trace 6fd164859c154772 ]---
[ 142.217505] [drm:show_leaks] *ERROR* node [6b6b6b6b6b6b6b6b + 6b6b6b6b6b6b6b6b]: inserted at
[<ffffffff81559ff3>] save_stack.isra.1+0x53/0xa0
[<ffffffff8155a98d>] drm_mm_insert_node_in_range_generic+0x2ad/0x360
[<ffffffffa035bf23>] i915_gem_stolen_insert_node_in_range+0x93/0xe0 [i915]
[<ffffffffa035c855>] i915_gem_object_create_stolen+0x75/0xb0 [i915]
[<ffffffffa036a51a>] intel_engine_create_ring+0x9a/0x140 [i915]
[<ffffffffa036a921>] intel_init_ring_buffer+0xf1/0x440 [i915]
[<ffffffffa036be1b>] intel_init_render_ring_buffer+0xab/0x1b0 [i915]
[<ffffffffa0363d08>] intel_engines_init+0xc8/0x210 [i915]
[<ffffffffa0355d7c>] i915_gem_init+0xac/0xf0 [i915]
[<ffffffffa0311454>] i915_driver_load+0x9c4/0x1430 [i915]
[<ffffffffa031c2f8>] i915_pci_probe+0x28/0x40 [i915]
[<ffffffff8147d315>] pci_device_probe+0x85/0xf0
[<ffffffff8157b7ff>] driver_probe_device+0x21f/0x430
[<ffffffff8157baee>] __driver_attach+0xde/0xe0
In particular note that the node was being poisoned as we inspected the
list, a clear indication that the object is being freed as we make the
assertion.
v2: Don't loop, just assert that we do all the work required as that
will be better at detecting further errors.
Fixes: fbbd37b36fa5 ("drm/i915: Move object release to a freelist + worker")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101084843.3961-1-chris@chris-wilson.co.uk
Ville Syrjälä [Tue, 25 Oct 2016 15:58:03 +0000 (18:58 +0300)]
drm/i915: Reorganize sprite init
Kill the switch statement from the sprite init code and replace with a
more straightforward if ladder. Now each significant evolution of the
sprite hardware is in its own neat box.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477411083-19255-5-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ville Syrjälä [Tue, 25 Oct 2016 15:58:02 +0000 (18:58 +0300)]
drm/i915: Bail if plane/crtc init fails
Due to the plane->index not getting readjusted in drm_plane_cleanup(),
we can't continue initialization if some plane/crtc init fails.
Well, we sort of could I suppose if we left all initialized planes on
the list, but that would expose those planes to userspace as well.
But for crtcs the situation is even worse since we assume that
pipe==crtc index occasionally, so we can't really deal with a partially
initialized set of crtcs.
So seems safest to just abort the entire thing if anything goes wrong.
All the failure paths here are kmalloc()s anyway, so it seems unlikely
we'd get very far if these start failing.
v2: Add (enum plane) cast to silence gcc
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477411083-19255-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
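The "abort the entire thing" policy from the commit above can be sketched in
user-space C. This is a minimal model under stated assumptions: struct
display_sketch, struct crtc_sketch and the display_sketch_* helpers are
hypothetical, malloc() stands in for the kernel allocations, and the fail_at
parameter is an invented fault-injection knob used only to exercise the error
path. On the first failure everything created so far is torn down and an error
is returned, so the array index of every surviving crtc always equals its pipe.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_MAX_PIPES 3

/* Hypothetical stand-ins; not the i915 structures. */
struct crtc_sketch {
	int pipe;
};

struct display_sketch {
	struct crtc_sketch *crtcs[SKETCH_MAX_PIPES];
	int num_crtcs;
};

/* Tear down every crtc created so far, newest first. */
static void display_sketch_cleanup(struct display_sketch *d)
{
	while (d->num_crtcs > 0) {
		d->num_crtcs--;
		free(d->crtcs[d->num_crtcs]);
		d->crtcs[d->num_crtcs] = NULL;
	}
}

/*
 * Create one crtc per pipe. fail_at names a pipe whose allocation is made to
 * fail (use -1 for no injected failure). Returns 0 or -ENOMEM; on error no
 * crtc is left behind.
 */
static int display_sketch_init(struct display_sketch *d, int num_pipes, int fail_at)
{
	int pipe;

	assert(num_pipes >= 0 && num_pipes <= SKETCH_MAX_PIPES);
	d->num_crtcs = 0;

	for (pipe = 0; pipe < num_pipes; pipe++) {
		struct crtc_sketch *crtc =
			pipe == fail_at ? NULL : malloc(sizeof(*crtc));

		if (!crtc) {
			/* Bail out completely rather than skip this pipe. */
			display_sketch_cleanup(d);
			return -ENOMEM;
		}
		crtc->pipe = pipe;
		d->crtcs[d->num_crtcs++] = crtc; /* index == pipe, as nothing was skipped */
	}
	return 0;
}
```

Skipping the failed pipe and continuing would shift later crtcs down by one
slot in this model, which is the pipe==crtc index mismatch the commit message
is worried about.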
Ville Syrjälä [Tue, 25 Oct 2016 15:58:01 +0000 (18:58 +0300)]
drm/i915: Initialize planes in a reasonable order
The zpos magic sorting uses the object ID to solve conflicting zpos
values. Let's initialize our planes in an order that makes the object
IDs agree with the normal primary->sprites->cursor z order.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477411083-19255-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
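The effect of the tie-break mentioned in the commit above can be sketched in
user-space C with qsort(). This is an illustration, not the DRM
implementation: struct plane_sketch, its fields and the plane_sketch_*
helpers are hypothetical, and the object IDs are made-up values. Planes are
ordered by zpos first and by object ID second, so if the planes are created in
primary, sprite, cursor order, equal zpos values still resolve to the usual
z order.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical plane descriptor; not the DRM structure. */
struct plane_sketch {
	unsigned int id;   /* object ID, assumed to grow in creation order */
	unsigned int zpos; /* requested z position, may conflict */
	const char *name;
};

/* Order by zpos, then by object ID to resolve conflicting zpos values. */
static int plane_sketch_cmp(const void *a, const void *b)
{
	const struct plane_sketch *pa = a;
	const struct plane_sketch *pb = b;

	if (pa->zpos != pb->zpos)
		return pa->zpos < pb->zpos ? -1 : 1;
	return (pa->id > pb->id) - (pa->id < pb->id);
}

/* Sort an array of planes into stacking order, bottom plane first. */
static void plane_sketch_sort(struct plane_sketch *planes, size_t count)
{
	assert(planes != NULL || count == 0);
	qsort(planes, count, sizeof(*planes), plane_sketch_cmp);
}
```

Because no two planes share an object ID, the comparison never reports a tie,
so the result does not depend on the stability of the sort.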
Ville Syrjälä [Tue, 25 Oct 2016 15:58:00 +0000 (18:58 +0300)]
drm/i915: Don't try to initialize sprite planes on pre-ilk
We don't currently implement support for sprite planes on pre-ilk
platforms, so let's leave num_sprites at 0 so that we don't get
spurious errors during driver init.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477411083-19255-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Chris Wilson [Mon, 31 Oct 2016 12:40:48 +0000 (12:40 +0000)]
drm/i915: Mark up obj->mm.lock for shrinker
As we may allocate from within the obj->mm.lock we may enter the
shrinker for direct reclaim. Operating on the current object is
prevented by checking for obj->mm.pages (which is only set as the last
operation in the allocation path). However, we need to identify the
single recursion of accessing another object's obj->mm.lock as the two
locks have identical class and so appear to be the same to lockdep,
convincing it that a deadlock is possible. Use mutex_lock_nested() to
remove the false positive.
[ 2165.945734] =================================
[ 2165.945749] [ INFO: inconsistent lock state ]
[ 2165.945765] 4.9.0-rc2+ #2 Tainted: G W
[ 2165.945781] ---------------------------------
[ 2165.945796] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
[ 2165.945816] kswapd0/62 [HC0[0]:SC0[0]:HE1:SE1] takes: (&obj->mm.lock){+.+.?.}, at: [<
ffffffffc0289a1f>] i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.945904] {RECLAIM_FS-ON-W} state was registered at:
[ 2165.945931] [<
ffffffffb10bd50f>] mark_held_locks+0x6f/0xa0
[ 2165.945956] [<
ffffffffb10bf889>] lockdep_trace_alloc+0x69/0xc0
[ 2165.945982] [<
ffffffffb11eea53>] kmem_cache_alloc_trace+0x33/0x2a0
[ 2165.946019] [<ffffffffc028a28a>] i915_gem_object_get_pages_stolen+0x6a/0xd0 [i915]
[ 2165.946060] [<ffffffffc027e1d0>] ____i915_gem_object_get_pages+0x20/0x60 [i915]
[ 2165.946098] [<ffffffffc027e268>] __i915_gem_object_get_pages+0x58/0x70 [i915]
[ 2165.946138] [<ffffffffc028a3dc>] _i915_gem_object_create_stolen+0xec/0x120 [i915]
[ 2165.946177] [<ffffffffc028af73>] i915_gem_object_create_stolen_for_preallocated+0xf3/0x3f0 [i915]
[ 2165.946222] [<ffffffffc02bae43>] intel_alloc_initial_plane_obj.isra.125+0xd3/0x200 [i915]
[ 2165.946266] [<ffffffffc02cb1c1>] intel_modeset_init+0x931/0x1530 [i915]
[ 2165.946301] [<ffffffffc023d584>] i915_driver_load+0xa14/0x14a0 [i915]
[ 2165.946335] [<ffffffffc0248aff>] i915_pci_probe+0x4f/0x70 [i915]
[ 2165.946362] [<ffffffffb13cc452>] local_pci_probe+0x42/0xa0
[ 2165.946386] [<ffffffffb13cd903>] pci_device_probe+0x103/0x150
[ 2165.946411] [<ffffffffb14adeb3>] driver_probe_device+0x223/0x430
[ 2165.946436] [<ffffffffb14ae1a3>] __driver_attach+0xe3/0xf0
[ 2165.946461] [<ffffffffb14ab943>] bus_for_each_dev+0x73/0xc0
[ 2165.946485] [<ffffffffb14ad5ee>] driver_attach+0x1e/0x20
[ 2165.946508] [<ffffffffb14ad003>] bus_add_driver+0x173/0x270
[ 2165.946533] [<ffffffffb14aee70>] driver_register+0x60/0xe0
[ 2165.946557] [<ffffffffb13cbd6d>] __pci_register_driver+0x5d/0x60
[ 2165.946606] [<ffffffffc0378057>] soundcore_open+0x17/0x230 [soundcore]
[ 2165.946636] [<ffffffffb1000450>] do_one_initcall+0x50/0x180
[ 2165.946661] [<ffffffffb117fd2d>] do_init_module+0x5f/0x1f1
[ 2165.946685] [<ffffffffb1108964>] load_module+0x2174/0x2a80
[ 2165.946709] [<ffffffffb11094df>] SYSC_finit_module+0xdf/0x110
[ 2165.946734] [<ffffffffb110952e>] SyS_finit_module+0xe/0x10
[ 2165.946758] [<ffffffffb1742aea>] entry_SYSCALL_64_fastpath+0x18/0xad
[ 2165.946776] irq event stamp: 90871
[ 2165.946788] hardirqs last enabled at (90871):
[ 2165.946805] [<ffffffffb173e9da>] __mutex_unlock_slowpath+0x11a/0x1c0
[ 2165.946823] hardirqs last disabled at (90870):
[ 2165.946839] [<ffffffffb173e91b>] __mutex_unlock_slowpath+0x5b/0x1c0
[ 2165.946856] softirqs last enabled at (90858):
[ 2165.946872] [<ffffffffb174581a>] __do_softirq+0x39a/0x4c6
[ 2165.946887] softirqs last disabled at (90671):
[ 2165.946902] [<ffffffffb1066cea>] irq_exit+0xea/0xf0
[ 2165.946916] other info that might help us debug this:
[ 2165.946936] Possible unsafe locking scenario:
[ 2165.946955] CPU0
[ 2165.946965] ----
[ 2165.946975] lock(&obj->mm.lock);
[ 2165.947000] <Interrupt>
[ 2165.947010] lock(&obj->mm.lock);
[ 2165.947035] *** DEADLOCK ***
[ 2165.947054] 2 locks held by kswapd0/62:
[ 2165.947067] #0: (shrinker_rwsem){++++..}, at: [<ffffffffb119a20e>] shrink_slab.part.40+0x5e/0x5d0
[ 2165.947120] #1: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffc028954b>] i915_gem_shrinker_lock+0x1b/0x60 [i915]
[ 2165.948909] stack backtrace:
[ 2165.950650] CPU: 2 PID: 62 Comm: kswapd0 Tainted: G W 4.9.0-rc2+ #2
[ 2165.951587] Hardware name: LENOVO 80MX/Lenovo E31-80, BIOS DCCN34WW(V2.03) 12/01/2015
[ 2165.952484] ffffc90000b5f8c8 ffffffffb137f645 ffff88016c5a2700 ffffffffb25f20a0
[ 2165.953395] ffffc90000b5f918 ffffffffb10bcecd 0000000000000000 ffff880100000001
[ 2165.954305] 0000000000000001 000000000000000a ffff88016c5a2fd0 ffff88016c5a2700
[ 2165.955240] Call Trace:
[ 2165.956170] [<ffffffffb137f645>] dump_stack+0x68/0x93
[ 2165.957071] [<ffffffffb10bcecd>] print_usage_bug+0x1dd/0x1f0
[ 2165.957979] [<ffffffffb10bd439>] mark_lock+0x559/0x5c0
[ 2165.958875] [<ffffffffb10bc3f0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0
[ 2165.959829] [<ffffffffb10be04d>] __lock_acquire+0x66d/0x12a0
[ 2165.960729] [<ffffffffb11ef541>] ? __slab_free+0xa1/0x340
[ 2165.961625] [<ffffffffb10dba5d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 2165.962530] [<ffffffffb10bd50f>] ? mark_held_locks+0x6f/0xa0
[ 2165.963457] [<ffffffffb10bf0b0>] lock_acquire+0xf0/0x1f0
[ 2165.964368] [<ffffffffc0289a1f>] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.965269] [<ffffffffc0289a1f>] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.966150] [<ffffffffb173d837>] mutex_lock_nested+0x77/0x420
[ 2165.967030] [<ffffffffc0289a1f>] ? i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.967952] [<ffffffffc027c7a1>] ? __i915_gem_object_put_pages.part.58+0x161/0x1b0 [i915]
[ 2165.968835] [<ffffffffc0289a1f>] i915_gem_shrink+0x29f/0x500 [i915]
[ 2165.969712] [<ffffffffc0289e40>] i915_gem_shrinker_scan+0x70/0xb0 [i915]
[ 2165.970591] [<ffffffffb119a3ae>] shrink_slab.part.40+0x1fe/0x5d0
[ 2165.971504] [<ffffffffb119f19c>] shrink_node+0x22c/0x320
[ 2165.972371] [<ffffffffb11a05fb>] kswapd+0x38b/0x9b0
[ 2165.973238] [<ffffffffb11a0270>] ? mem_cgroup_shrink_node+0x330/0x330
[ 2165.974068] [<ffffffffb108630f>] kthread+0xff/0x120
[ 2165.974929] [<ffffffffb1086210>] ? kthread_park+0x60/0x60
[ 2165.975847] [<ffffffffb1742d57>] ret_from_fork+0x27/0x40
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1233e2db199d ("drm/i915: Move object backing storage manipulation...")
Testcase: igt/gem_ctx_create/maximum-swap
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161031124048.30355-1-chris@chris-wilson.co.uk
Jani Nikula [Wed, 26 Oct 2016 16:11:32 +0000 (19:11 +0300)]
MAINTAINERS: drop dri-devel list for i915
In practice, none of the i915 developers Cc dri-devel for strictly i915
specific patches. Make MAINTAINERS reflect reality, and reduce random
i915 specific noise on dri-devel.
Also, we have a fairly large crowd reading and responding on intel-gfx,
and we're pretty good at involving dri-devel when that is appropriate.
Cc: dri-devel@lists.freedesktop.org
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477498292-9808-1-git-send-email-jani.nikula@intel.com
Lyude [Wed, 26 Oct 2016 16:36:09 +0000 (12:36 -0400)]
drm/i915/vlv: Prevent enabling hpd polling in late suspend
One of the CI machines began to run into issues with the hpd poller
suddenly waking up in the midst of the late suspend phase. It looks like
this is caused by the fact that we now deinitialize power wells in
late suspend, which means that intel_hpd_poll_init() gets called in late
suspend, causing polling to be re-enabled. So, when deinitializing power
wells on valleyview, we now refrain from enabling polling in the midst of
suspend.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98040
Fixes: 19625e85c6ec ("drm/i915: Enable polling when we don't have hpd")
Signed-off-by: Lyude <lyude@redhat.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477499769-1966-1-git-send-email-lyude@redhat.com
Chris Wilson [Fri, 28 Oct 2016 12:58:58 +0000 (13:58 +0100)]
drm/i915: Enable multiple timelines
With the infrastructure converted over to tracking multiple timelines in
the GEM API whilst preserving the efficiency of using a single execution
timeline internally, we can now assign a separate timeline to every
context with full-ppgtt.
v2: Add a comment to indicate the xfer between timelines upon submission.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:57 +0000 (13:58 +0100)]
drm/i915: Defer setting of global seqno on request to submission
Defer the assignment of the global seqno on a request to its submission.
In the next patch, we will only allocate the global seqno at that time;
here we are just enabling the wait-for-submission before wait-for-seqno
paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-34-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:56 +0000 (13:58 +0100)]
drm/i915: Reserve space in the global seqno during request allocation
A restriction on our global seqnos is that they cannot wrap, and that we
cannot use the value 0. This allows us to detect when a request has not
yet been submitted (its global seqno is still 0), and ensures that
hardware semaphores are monotonic as required by older hardware. To
meet these restrictions when we defer the assignment of the global
seqno, we must check that we have an available slot in the global seqno
space during request construction. If that test fails, we wait for all
requests to be completed and reset the hardware back to 0.
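The reservation check described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the driver's code: struct seqno_space, reserve_seqno(), assign_seqno() and reset_seqno() are hypothetical names, and the "wait for all requests to be completed" step is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the global seqno space; not the i915 structures. */
struct seqno_space {
	uint32_t next;     /* last seqno handed out; 0 means "none yet" */
	uint32_t reserved; /* slots promised to constructed, unsubmitted requests */
};

/*
 * Request construction: promise one slot without assigning a value.
 * Returns false when handing out every promised slot would run past
 * UINT32_MAX; the caller must then idle the GPU and reset the space.
 */
static bool reserve_seqno(struct seqno_space *s)
{
	if (s->reserved >= UINT32_MAX - s->next)
		return false;
	s->reserved++;
	return true;
}

/* Submission: consume a reserved slot. The result is never 0. */
static uint32_t assign_seqno(struct seqno_space *s)
{
	assert(s->reserved > 0);
	s->reserved--;
	return ++s->next;
}

/* Once every request has completed: restart numbering from 0. */
static void reset_seqno(struct seqno_space *s)
{
	assert(s->reserved == 0);
	s->next = 0;
}
```

Because the slot is counted at construction, assign_seqno() cannot fail at submission time, which is the point of reserving early.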
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-33-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:55 +0000 (13:58 +0100)]
drm/i915: Convert breadcrumbs spinlock to be irqsafe
The breadcrumbs are about to be used from within IRQ context sections
(e.g. nouveau signals a fence from an interrupt handler causing us to
submit a new request) and/or from bottom-half tasklets (i.e.
intel_lrc_irq_handler), therefore we need to employ the irqsafe spinlock
variants.
For example, deferring the request submission to the
intel_lrc_irq_handler generates this trace:
[ 66.388639] =================================
[ 66.388650] [ INFO: inconsistent lock state ]
[ 66.388663] 4.9.0-rc2+ #56 Not tainted
[ 66.388672] ---------------------------------
[ 66.388682] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 66.388695] swapper/1/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
[ 66.388706] (&(&b->lock)->rlock){+.?...} , at: [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
[ 66.388761] {SOFTIRQ-ON-W} state was registered at:
[ 66.388772] [ 66.388783] [<ffffffff810bd842>] __lock_acquire+0x682/0x1870
[ 66.388795] [ 66.388803] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
[ 66.388814] [ 66.388824] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
[ 66.388835] [ 66.388845] [<ffffffff81401e41>] intel_engine_reset_breadcrumbs+0x21/0xb0
[ 66.388857] [ 66.388866] [<ffffffff81403ae7>] gen8_init_common_ring+0x67/0x100
[ 66.388878] [ 66.388887] [<ffffffff81403b92>] gen8_init_render_ring+0x12/0x60
[ 66.388903] [ 66.388912] [<ffffffff813f8707>] i915_gem_init_hw+0xf7/0x2a0
[ 66.388927] [ 66.388936] [<ffffffff813f899b>] i915_gem_init+0xbb/0xf0
[ 66.388950] [ 66.388959] [<ffffffff813b4980>] i915_driver_load+0x7e0/0x1330
[ 66.388978] [ 66.388988] [<ffffffff813c09d8>] i915_pci_probe+0x28/0x40
[ 66.389003] [ 66.389013] [<ffffffff812fa0db>] pci_device_probe+0x8b/0xf0
[ 66.389028] [ 66.389037] [<ffffffff8147737e>] driver_probe_device+0x21e/0x430
[ 66.389056] [ 66.389065] [<ffffffff8147766e>] __driver_attach+0xde/0xe0
[ 66.389080] [ 66.389090] [<ffffffff814751ad>] bus_for_each_dev+0x5d/0x90
[ 66.389105] [ 66.389113] [<ffffffff81477799>] driver_attach+0x19/0x20
[ 66.389134] [ 66.389144] [<ffffffff81475ced>] bus_add_driver+0x15d/0x260
[ 66.389159] [ 66.389168] [<ffffffff81477e3b>] driver_register+0x5b/0xd0
[ 66.389183] [ 66.389281] [<ffffffff812fa19b>] __pci_register_driver+0x5b/0x60
[ 66.389301] [ 66.389312] [<ffffffff81aed333>] i915_init+0x3e/0x45
[ 66.389326] [ 66.389336] [<ffffffff81ac2ffa>] do_one_initcall+0x8b/0x118
[ 66.389350] [ 66.389359] [<ffffffff81ac323a>] kernel_init_freeable+0x1b3/0x23b
[ 66.389378] [ 66.389387] [<ffffffff8160fc39>] kernel_init+0x9/0x100
[ 66.389402] [ 66.389411] [<ffffffff816180e7>] ret_from_fork+0x27/0x40
[ 66.389426] irq event stamp: 315865
[ 66.389438] hardirqs last enabled at (315864): [<ffffffff816178f1>] _raw_spin_unlock_irqrestore+0x31/0x50
[ 66.389469] hardirqs last disabled at (315865): [<ffffffff816176b3>] _raw_spin_lock_irqsave+0x13/0x50
[ 66.389499] softirqs last enabled at (315818): [<ffffffff8107a04c>] _local_bh_enable+0x1c/0x50
[ 66.389530] softirqs last disabled at (315819): [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
[ 66.389559]
[ 66.389559] other info that might help us debug this:
[ 66.389580] Possible unsafe locking scenario:
[ 66.389580]
[ 66.389598] CPU0
[ 66.389609] ----
[ 66.389620] lock(&(&b->lock)->rlock);
[ 66.389650] <Interrupt>
[ 66.389661] lock(&(&b->lock)->rlock);
[ 66.389690]
[ 66.389690] *** DEADLOCK ***
[ 66.389690]
[ 66.389715] 2 locks held by swapper/1/0:
[ 66.389728] #0: (&(&tl->lock)->rlock){..-...}, at: [<ffffffff81403e01>] intel_lrc_irq_handler+0x201/0x3c0
[ 66.389785] #1: (&(&req->lock)->rlock/1){..-...}, at: [<ffffffff813fc0af>] __i915_gem_request_submit+0x8f/0x170
[ 66.389854]
[ 66.389854] stack backtrace:
[ 66.389959] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.9.0-rc2+ #56
[ 66.389976] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[ 66.389999] ffff88027fd03c58 ffffffff812beae5 ffff88027696e680 ffffffff822afe20
[ 66.390036] ffff88027fd03ca8 ffffffff810bb420 0000000000000001 0000000000000000
[ 66.390070] 0000000000000000 0000000000000006 0000000000000004 ffff88027696ee10
[ 66.390104] Call Trace:
[ 66.390117] <IRQ>
[ 66.390128] [<ffffffff812beae5>] dump_stack+0x68/0x93
[ 66.390147] [<ffffffff810bb420>] print_usage_bug+0x1d0/0x1e0
[ 66.390164] [<ffffffff810bb8a0>] mark_lock+0x470/0x4f0
[ 66.390181] [<ffffffff810ba9d0>] ? print_shortest_lock_dependencies+0x1b0/0x1b0
[ 66.390203] [<ffffffff810bd75d>] __lock_acquire+0x59d/0x1870
[ 66.390221] [<ffffffff810bedbc>] lock_acquire+0x6c/0xb0
[ 66.390237] [<ffffffff810bedbc>] ? lock_acquire+0x6c/0xb0
[ 66.390255] [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
[ 66.390273] [<ffffffff8161753a>] _raw_spin_lock+0x2a/0x40
[ 66.390291] [<ffffffff81401c88>] ? intel_engine_enable_signaling+0x78/0x150
[ 66.390309] [<ffffffff81401c88>] intel_engine_enable_signaling+0x78/0x150
[ 66.390327] [<ffffffff813fc170>] __i915_gem_request_submit+0x150/0x170
[ 66.390345] [<ffffffff81403e8b>] intel_lrc_irq_handler+0x28b/0x3c0
[ 66.390363] [<ffffffff81079d97>] tasklet_action+0x57/0xc0
[ 66.390380] [<ffffffff8107a249>] __do_softirq+0x119/0x240
[ 66.390396] [<ffffffff8107a50e>] irq_exit+0xbe/0xd0
[ 66.390414] [<ffffffff8101afd5>] do_IRQ+0x65/0x110
[ 66.390431] [<ffffffff81618806>] common_interrupt+0x86/0x86
[ 66.390446] <EOI>
[ 66.390457] [<ffffffff814ec6d1>] ? cpuidle_enter_state+0x151/0x200
[ 66.390480] [<ffffffff814ec7a2>] cpuidle_enter+0x12/0x20
[ 66.390498] [<ffffffff810b639e>] call_cpuidle+0x1e/0x40
[ 66.390516] [<ffffffff810b65ae>] cpu_startup_entry+0x10e/0x1f0
[ 66.390534] [<ffffffff81036133>] start_secondary+0x103/0x130
(This is split out of the defer global seqno allocation patch due to the
realisation that we need a more complete conversion if we want to defer
request submission even further.)
v2: lockdep was warning about mixed SOFTIRQ contexts not HARDIRQ
contexts so we only need to use spin_lock_bh and not disable interrupts.
v3: We need full irq protection as we may be called from a third party
interrupt handler (via fences).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-32-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:54 +0000 (13:58 +0100)]
drm/i915: Create a unique name for the context
This will be used for communicating issues with this context to
userspace, so we want to identify the parent process and the individual
context. Note that the name isn't quite unique: it presumes that there
is only a single device fd per process.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-31-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:53 +0000 (13:58 +0100)]
drm/i915: Move the global sync optimisation to the timeline
Currently we try to reduce the number of synchronisations (now the
number of requests we need to wait upon) by noting that if we have
earlier waited upon a request, all subsequent requests in the timeline
will be after the wait. This only applies to requests in this timeline,
as other timelines will not be ordered by that waiter.
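A rough illustration of the per-timeline bookkeeping this implies, as a userspace sketch rather than the driver's implementation: struct timeline, last_sync[] and timeline_await() are hypothetical names, and NUM_TIMELINES is an arbitrary size. Each timeline remembers the latest seqno it has already waited upon from each other timeline, and skips any wait that the earlier one already covers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_TIMELINES 4 /* arbitrary size for the sketch */

/* Per-timeline record of the latest seqno waited upon from each signaler. */
struct timeline {
	uint32_t last_sync[NUM_TIMELINES];
};

/* True if seqno a is at or after seqno b, tolerating 32-bit wraparound. */
static bool seqno_passed(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

/*
 * Returns true if the waiter must add a wait on (signaler, seqno), and
 * false if an earlier wait on that signaler already orders us after it.
 * The record lives in the waiter, so it says nothing about other waiters.
 */
static bool timeline_await(struct timeline *waiter, unsigned signaler,
			   uint32_t seqno)
{
	assert(signaler < NUM_TIMELINES);
	if (seqno_passed(waiter->last_sync[signaler], seqno))
		return false;
	waiter->last_sync[signaler] = seqno;
	return true;
}
```

Seqnos are assumed to start at 1, so a zeroed last_sync[] entry means "never waited".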
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-30-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:52 +0000 (13:58 +0100)]
drm/i915: Defer breadcrumb emission
Move the actual emission of the breadcrumb for closing the request from
i915_add_request() to the submit callback. (It can be moved later when
required.) This allows us to defer the allocation of the global_seqno
from request construction to actual submission, allowing us to emit the
requests out of order (with respect to the order of their construction; they
will still only be executed once all of their dependencies are resolved,
including that all earlier requests on their timeline have been
submitted). We have to specialise how we then emit the request in order
to write into the preallocated space, rather than at the tail of the
ringbuffer (which will have been advanced by the addition of new
requests).
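The idea of writing into preallocated space rather than at the tail can be sketched as follows. This is a simplified userspace model under stated assumptions: struct ring, struct request, request_add(), request_submit(), the breadcrumb length and the opcode value are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_DWORDS 64      /* arbitrary ring size for the sketch */
#define BREADCRUMB_DWORDS 2 /* hypothetical breadcrumb length */

struct ring {
	uint32_t cmds[RING_DWORDS];
	size_t tail; /* next free dword */
};

struct request {
	size_t postfix;        /* offset of the space reserved for the breadcrumb */
	uint32_t global_seqno; /* 0 until submitted */
};

/* Construction: emit the payload, then reserve (but do not fill) the breadcrumb. */
static void request_add(struct ring *ring, struct request *rq, uint32_t payload)
{
	assert(ring->tail + 1 + BREADCRUMB_DWORDS <= RING_DWORDS);
	ring->cmds[ring->tail++] = payload;
	rq->postfix = ring->tail;
	ring->tail += BREADCRUMB_DWORDS;
	rq->global_seqno = 0;
}

/* Submission: the seqno is only known now, so fill in the reserved slot. */
static void request_submit(struct ring *ring, struct request *rq, uint32_t seqno)
{
	rq->global_seqno = seqno;
	ring->cmds[rq->postfix] = 0xb0000000u; /* made-up "store seqno" opcode */
	ring->cmds[rq->postfix + 1] = seqno;
}
```

By the time request_submit() runs, ring->tail may already have moved past later requests, which is why the request carries its own postfix offset.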
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-29-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:51 +0000 (13:58 +0100)]
drm/i915: Record space required for breadcrumb emission
In the next patch, we will use deferred breadcrumb emission. That requires
reserving sufficient space in the ringbuffer to emit the breadcrumb, which
first requires us to know how large the breadcrumb is.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-28-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:50 +0000 (13:58 +0100)]
drm/i915: Rename ->emit_request to ->emit_breadcrumb
Now that the emission of the request tail and its submission to hardware
are two separate steps, engine->emit_request() is confusing.
engine->emit_request() is called to emit the breadcrumb commands for the
request into the ring, so name it as such (engine->emit_breadcrumb).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-27-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:49 +0000 (13:58 +0100)]
drm/i915: Introduce a global_seqno for each request
Though we will have multiple timelines, we still have a single timeline
of execution. This we can use to provide an execution and retirement order
of requests. This keeps tracking execution of requests simple, and is vital
for preserving a single waiter (i.e. so that we can order the waiters so
that only the earliest to wakeup need be woken). To accomplish this we
distinguish the seqno used to order requests per-context (external) and
that used internally for execution.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-26-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:48 +0000 (13:58 +0100)]
drm/i915: Wait first for submission, before waiting for request completion
In future patches, we will no longer be able to wait on a static global
seqno and instead have to break our wait up into phases. First we wait
for the global seqno assignment (upon submission to hardware), and once
submitted we wait for the hardware to complete.
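The two phases can be pictured as a small state check. This is a sketch only, with hypothetical names (enum wait_phase, request_wait_phase()), and it assumes that a global seqno of 0 marks an unsubmitted request and that global seqnos do not wrap, so a plain comparison suffices.

```c
#include <assert.h>
#include <stdint.h>

/* Which phase a waiter on this request is currently in. */
enum wait_phase {
	WAIT_SUBMIT, /* no global seqno yet: wait for submission */
	WAIT_HW,     /* submitted: wait for the hardware to reach the seqno */
	WAIT_DONE    /* hardware has passed the request's seqno */
};

struct request {
	uint32_t global_seqno; /* 0 until submitted to hardware */
};

/* hw_seqno is the last seqno the (modelled) hardware reports as complete. */
static enum wait_phase request_wait_phase(const struct request *rq,
					  uint32_t hw_seqno)
{
	assert(rq);
	if (rq->global_seqno == 0)
		return WAIT_SUBMIT;
	if (hw_seqno < rq->global_seqno)
		return WAIT_HW;
	return WAIT_DONE;
}
```

A real waiter would sleep between checks; the sketch only shows how the phases are told apart.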
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-25-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:47 +0000 (13:58 +0100)]
drm/i915: Queue the idling context switch after all other timelines
Before suspend, we wait for the switch to the kernel context. In order
for all the other context images to be complete upon suspend, that
switch must be the last operation by the GPU (i.e. this idling request
must not overtake any pending requests). To make this request execute last,
we make it depend on every other inflight request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-24-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:46 +0000 (13:58 +0100)]
drm/i915: Combine seqno + tracking into a global timeline struct
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to a timeline per address
space but we use a fence context per engine within.
Our first step to introducing independent timelines per context (i.e. to
allow each context to have a queue of requests to execute that have a
defined set of dependencies on other requests) is to provide a timeline
abstraction for the global execution queue.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:45 +0000 (13:58 +0100)]
drm/i915: Restore nonblocking awaits for modesetting
After combining the dma-buf reservation object and the GEM reservation
object, we lost the ability to do a nonblocking wait on the i915 request
(as we blocked upon the reservation object during prepare_fb). We can
instead convert the reservation object into a fence upon which we can
asynchronously wait (including a forced timeout in case the DMA fence is
never signaled).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-22-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:44 +0000 (13:58 +0100)]
drm/i915: Move GEM activity tracking into a common struct reservation_object
In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)
v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.
Caveats:
* busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.
* non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"
* dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).
* loss of object-level retirement callbacks, emulated by VMA retirement
tracking.
* minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:43 +0000 (13:58 +0100)]
drm/i915: Use lockless object free
Having moved the locked phase of freeing an object to a separate worker,
we can now declare to the core that we only need the unlocked variant of
driver->gem_free_object, and can use the simple unreference internally.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-20-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:42 +0000 (13:58 +0100)]
drm/i915: Move object release to a freelist + worker
We want to hide the latency of releasing objects and their backing
storage from the submission, so we move the actual free to a worker.
This allows us to switch to struct_mutex-less freeing of the object in the
next patch.
Furthermore, if we know that the object we are dereferencing remains valid
for the duration of our access, we can forgo the usual synchronisation
barriers and atomic reference counting. To ensure this we defer freeing
an object until after an RCU grace period, such that any lookup of the
object within an RCU read critical section will remain valid until
after we exit that critical section. We also employ this delay for
rate-limiting the serialisation on reallocation - we have to slow down
object creation in order to prevent resource starvation (in particular,
files).
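The freelist-plus-worker split can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the driver's code: struct object, object_queue_free() and free_worker() are hypothetical names, the worker is called directly instead of being scheduled, and the RCU grace period described above is omitted entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct object {
	struct object *free_next; /* link used only once the object is dead */
	int id;
};

/* Head of the lock-free freelist: many producers push, one worker drains. */
static _Atomic(struct object *) free_list;

/* Fast path on the final unreference: queue the object, do no real work. */
static void object_queue_free(struct object *obj)
{
	struct object *head = atomic_load(&free_list);

	assert(obj);
	do {
		obj->free_next = head;
	} while (!atomic_compare_exchange_weak(&free_list, &head, obj));
}

/* Worker: detach the whole list in one exchange, then free at leisure. */
static size_t free_worker(void)
{
	struct object *obj = atomic_exchange(&free_list, NULL);
	size_t freed = 0;

	while (obj) {
		struct object *next = obj->free_next;

		free(obj); /* the slow release of backing storage would go here */
		obj = next;
		freed++;
	}
	return freed;
}
```

The submitting thread pays only for one compare-and-exchange; everything expensive happens in the worker.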
v2: Return early in i915_gem_tiling() ioctl to skip over superfluous
work on error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-19-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:41 +0000 (13:58 +0100)]
drm/i915: Acquire the backing storage outside of struct_mutex in set-domain
As we can locklessly (well struct_mutex-lessly) acquire the backing
storage, do so in set-domain-ioctl to reduce the contention on the
struct_mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-18-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:40 +0000 (13:58 +0100)]
drm/i915: Implement pwrite without struct-mutex
We only need struct_mutex within pwrite for a brief window where we need
to serialise with rendering and control our cache domains. Elsewhere we
can rely on the backing storage being pinned, and forgive userspace any
races against us.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-17-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:39 +0000 (13:58 +0100)]
drm/i915: Implement pread without struct-mutex
We only need struct_mutex within pread for a brief window where we need
to serialise with rendering and control our cache domains. Elsewhere we
can rely on the backing storage being pinned, and forgive userspace any
races against us.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-16-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:38 +0000 (13:58 +0100)]
drm/i915/dmabuf: Acquire the backing storage outside of struct_mutex
Use the per-object mm.lock to allocate the backing storage (and hold a
reference to it across the dmabuf access) without resorting to
struct_mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-15-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:37 +0000 (13:58 +0100)]
drm/i915: Move object backing storage manipulation to its own locking
Break the allocation of the backing storage away from struct_mutex into
a per-object lock. This allows parallel page allocation, provided we can
do so outside of struct_mutex (i.e. set-domain-ioctl, pwrite, GTT
fault), i.e. before execbuf! The increased cost of the atomic counters
is hidden behind i915_vma_pin() for the typical case of execbuf, i.e.
as the object is typically bound between execbufs, the page_pin_count is
static. The cost will be felt around set-domain and pwrite, but offset
by the improvement from reduced struct_mutex contention.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-14-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:36 +0000 (13:58 +0100)]
drm/i915: Pass around sg_table to get_pages/put_pages backend
The plan is to move obj->pages out from under the struct_mutex into its
own per-object lock. We need to prune any assumption of the struct_mutex
from the get_pages/put_pages backends, and to make it easier we pass
around the sg_table to operate on rather than indirectly via the obj.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:35 +0000 (13:58 +0100)]
drm/i915: Refactor object page API
The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:34 +0000 (13:58 +0100)]
drm/i915: Use radixtree to jump start intel_partial_pages()
We can use the radixtree index of the obj->pages to find the start
position of the desired partial range.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-11-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:33 +0000 (13:58 +0100)]
drm/i915: Use a radixtree for random access to the object's backing storage
A while ago we switched from a contiguous array of pages into an sglist,
for that was both more convenient for mapping to hardware and avoided
the requirement for a vmalloc array of pages on every object. However,
certain GEM API calls (like pwrite, pread as well as performing
relocations) do desire access to individual struct pages. A quick hack
was to introduce a cache of the last access such that finding the
following page was quick - this works so long as the caller desired
sequential access. Walking backwards, or multiple callers, still hits a
slow linear search for each page. One solution is to store each
successful lookup in a radix tree.
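To make the lookup cost concrete, here is a userspace sketch of caching successful lookups over a list of page runs. It is an assumption-laden stand-in: a flat array replaces the radix tree for brevity, and struct run, struct lookup, struct object and object_get_pfn() are hypothetical names with made-up page frame numbers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES 64 /* arbitrary object size for the sketch */

/* One scatterlist-like entry: a run of physically contiguous pages. */
struct run {
	long first_pfn; /* made-up page frame number of the first page */
	unsigned count; /* number of pages in this run */
};

/* Cached result of one successful lookup. */
struct lookup {
	const struct run *run; /* NULL means "not cached yet" */
	unsigned offset;       /* page offset inside that run */
};

struct object {
	const struct run *runs;
	size_t nruns;
	struct lookup cache[MAX_PAGES]; /* flat stand-in for the radix tree */
	unsigned walks;                 /* slow-path walks, for illustration */
};

/* Return the pfn backing page n, or -1 if n lies beyond the object. */
static long object_get_pfn(struct object *obj, unsigned n)
{
	unsigned base = 0;

	assert(n < MAX_PAGES);
	if (obj->cache[n].run) /* fast path: any access order, any caller */
		return obj->cache[n].run->first_pfn + obj->cache[n].offset;

	obj->walks++; /* slow path: linear walk over the runs */
	for (size_t i = 0; i < obj->nruns; i++) {
		const struct run *r = &obj->runs[i];

		if (n < base + r->count) {
			obj->cache[n].run = r;
			obj->cache[n].offset = n - base;
			return r->first_pfn + (n - base);
		}
		base += r->count;
	}
	return -1;
}
```

Unlike a single "last access" cache, repeated or backwards lookups of a page that was found once never walk the list again.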
v2: Rewrite building the radixtree for clarity, hopefully.
v3: Rearrange execbuf to avoid calling i915_gem_object_get_sg() from
within an atomic section and so relax the allocation context to a simple
GFP_KERNEL and mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-10-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:32 +0000 (13:58 +0100)]
drm/i915: Markup GEM API with lockdep asserts
Add lockdep_assert_held(struct_mutex) to the API preamble of the
internal GEM interfaces.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:31 +0000 (13:58 +0100)]
drm/i915: Reuse the active golden render state batch
The golden render state is constant, but we recreate the batch setting
it up for every new context. If we keep that batch in a volatile cache
we can safely reuse it whenever we need to initialise a new context. We
mark the pages as purgeable and use the shrinker to recover pages from
the batch whenever we face memory pressure, recreating that batch afresh
on the next new context.
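The volatile-cache pattern reads roughly like this in userspace C. It is a sketch with hypothetical names (struct render_state_cache, render_state_get(), render_state_shrink()) and placeholder contents; unlike the driver, it does not model pinning, so it assumes the shrinker never runs while the batch is in use.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define STATE_SIZE 16 /* arbitrary batch size for the sketch */

struct render_state_cache {
	unsigned char *batch; /* NULL when never built or already reclaimed */
	unsigned builds;      /* how often the batch had to be (re)created */
};

/* New context: reuse the cached batch, rebuilding it only if it is gone. */
static const unsigned char *render_state_get(struct render_state_cache *c)
{
	assert(c);
	if (!c->batch) {
		c->batch = malloc(STATE_SIZE);
		if (!c->batch)
			return NULL;
		memset(c->batch, 0xa5, STATE_SIZE); /* placeholder contents */
		c->builds++;
	}
	return c->batch;
}

/* Memory pressure: the contents are reproducible, so simply drop them. */
static size_t render_state_shrink(struct render_state_cache *c)
{
	assert(c);
	if (!c->batch)
		return 0;
	free(c->batch);
	c->batch = NULL;
	return STATE_SIZE; /* bytes recovered */
}
```

Dropping the batch is safe only because its contents are constant and can be regenerated on demand.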
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-8-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:30 +0000 (13:58 +0100)]
drm/i915: Introduce an internal allocator for disposable private objects
Quite a few of our objects used for internal hardware programming do not
benefit from being swappable or from being zero initialised. As such
they do not benefit from using a shmemfs backing storage and since they
are internal and never directly exposed to the user, we do not need to
worry about providing a filp. For these we can use a
drm_i915_gem_object wrapper around a sg_table of plain struct page. They
are not swap backed and not automatically pinned. If they are reaped
by the shrinker, the pages are released and the contents discarded. For
the internal use case, this is fine since, for example, ringbuffers are
pinned from when a request writes them until the hardware reads them. Once
they are idle, they can be discarded entirely. As such they are a good
match for execlist ringbuffers and a small variety of other internal
objects.
In the first iteration, this is limited to the scratch batch buffers we
use (for command parsing and state initialisation).
v2: Allocate physically contiguous pages, where possible.
v3: Reduce maximum order on subsequent requests following an allocation
failure.
v4: Fix up mismatch between swiotlb segment size and page count (it
counts in 2k units, not 4k pages)
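The v3 and v4 notes can be illustrated with a small Python model. It is
only a sketch under stated assumptions: a 4 KiB page size, the 2 KiB
swiotlb unit mentioned in v4, and a hypothetical can_alloc(order) callback
standing in for the page allocator. None of these names come from the
driver.

```python
SWIOTLB_UNIT = 2048  # bytes per swiotlb unit, per the v4 note
PAGE_SIZE = 4096     # assumed page size


def swiotlb_units_to_pages(units):
    """Convert a size counted in 2 KiB swiotlb units into whole 4 KiB pages."""
    return (units * SWIOTLB_UNIT) // PAGE_SIZE


def plan_chunks(npages, max_order, can_alloc):
    """Cover npages with power-of-two chunks, preferring contiguous ones.

    can_alloc(order) is a hypothetical allocator probe. After a failure the
    order is lowered and stays lowered for all subsequent requests.
    """
    chunks = []
    order = max_order
    remaining = npages
    while remaining > 0:
        # Never request a chunk larger than what is still needed.
        while order > 0 and (1 << order) > remaining:
            order -= 1
        if can_alloc(order):
            chunks.append(order)
            remaining -= 1 << order
        elif order == 0:
            raise MemoryError("cannot allocate even a single page")
        else:
            order -= 1  # reduce the maximum order after a failure
    return chunks
```

With an allocator that always succeeds, plan_chunks(10, 3, ...) uses one
order-3 chunk and one order-1 chunk; if only orders 0 and 1 succeed, the
same request is covered by five order-1 chunks.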
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-7-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:29 +0000 (13:58 +0100)]
drm/i915: Defer active reference until required
We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:28 +0000 (13:58 +0100)]
drm/i915: Remove unused i915_gem_active_wait() in favour of _unlocked()
Since we only use the more generic unlocked variant, just rename it as
the normal i915_gem_active_wait(). The temporary cost is that we need to
always acquire the reference in a RCU safe manner, but the benefit is
that we will combine the common paths.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-5-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:27 +0000 (13:58 +0100)]
drm/i915: Rearrange i915_wait_request() accounting with callers
Our low-level wait routine has evolved from our generic wait interface
that handled unlocked, RPS boosting, waits with time tracking. If we
push our GEM fence tracking to use reservation_objects (required for
handling multiple timelines), we lose the ability to pass the required
information down to i915_wait_request(). However, if we push the extra
functionality from i915_wait_request() to the individual callsites
(i915_gem_object_wait_rendering and i915_gem_wait_ioctl) that make use
of those extras, we can both simplify our low level wait and prepare for
extending the GEM interface for use of reservation_objects.
v2: Rewrite i915_wait_request() kerneldocs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-4-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:26 +0000 (13:58 +0100)]
drm/i915: Remove superfluous wait_for_error() from throttle-ioctl
The throttle-ioctl never touches the struct_mutex. It does, however, as
part of its ABI report whether the hardware is terminally wedged. For
that purpose, it only has to report the current state and not incur the
cost of checking/waiting every invocation, as we do not have to wait for
a reset before waiting on a request to ensure completion (that is baked
into the wait request implementation).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-3-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:25 +0000 (13:58 +0100)]
drm/i915: Allow i915_sw_fence_await_sw_fence() to allocate
In forthcoming patches, we want to be able to dynamically allocate the
wait_queue_t used whilst awaiting. This is more convenient if we extend
the i915_sw_fence_await_sw_fence() to perform the allocation for us if
we pass in a gfp mask as an alternative to a preallocated struct.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-2-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 12:58:24 +0000 (13:58 +0100)]
drm/i915: Support asynchronous waits on struct fence from i915_gem_request
We will need to wait on DMA completion (as signaled via struct fence)
before executing our i915_gem_request. Therefore we want to expose a
method for adding the await on the fence itself to the request.
v2: Add a comment detailing a failure to handle a signal-on-any
fence-array.
v3: Pretend that magic numbers don't exist.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-1-chris@chris-wilson.co.uk
Chris Wilson [Fri, 28 Oct 2016 14:27:56 +0000 (15:27 +0100)]
drm/i915: Remove insert-page shortcut from execbuf relocate_iomap()
We are not allowed to touch the GTT entries underneath an atomic section,
as they take a rpm wakelock (which is illegal from atomic context) and
in the near future acquiring the DMA address for a page within an object
may sleep for an allocation. This makes the current shortcut in
relocate_iomap() for performing a second relocation on an adjacent page
illegal: we need to release the atomic iomapping, look up the DMA address,
and insert it into the GTT before reentering the atomic iomap section.
As it happens, this is precisely what we do if we are using an
iomapping over the full object and not just a single page, so by
removing the shortcut, we do the right thing.
Fixes: 9c870d03674f ("drm/i915: Use RPM as the barrier for controlling...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028142756.3850-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Matt Roper [Wed, 26 Oct 2016 22:51:29 +0000 (15:51 -0700)]
drm/i915: Use macro in place of open-coded for_each_universal_plane loop
This was the only use of (misleadingly-named) intel_num_planes()
function, so we can remove it as well.
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477522291-10874-3-git-send-email-matthew.d.roper@intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Matt Roper [Wed, 26 Oct 2016 22:51:28 +0000 (15:51 -0700)]
drm/i915: Rename for_each_plane -> for_each_universal_plane
This macro's name is a bit misleading; it doesn't actually iterate over
all planes since it omits the cursor plane. Its only uses are in gen9
code which is using it to iterate over the universal planes (which we
treat as primary+sprites); in these cases the legacy cursor registers
are programmed independently if necessary. The macro's iterator value
(0 for primary plane, spritenum+1 for each secondary plane) also isn't
meaningful outside the gen9 context where the hardware considers them to
all be "universal" planes that follow this numbering.
This is just a renaming/clarification patch with no functional change.
However it will make the subsequent patches more clear.
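As a rough illustration of the numbering described above, here is a Python
model (not the C macro itself). The function names and the num_sprites
parameter are hypothetical; only the 0 / spritenum+1 convention and the
omission of the cursor come from the description.

```python
def universal_plane_id(sprite_num=None):
    """Return 0 for the primary plane, sprite_num + 1 for a sprite plane."""
    return 0 if sprite_num is None else sprite_num + 1


def for_each_universal_plane(num_sprites):
    """Yield the universal plane ids of one pipe; the cursor is not included."""
    yield from range(num_sprites + 1)
```

For a pipe with two sprite planes the iteration yields 0, 1 and 2; the
cursor plane never appears, which is why the old name was misleading.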
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477522291-10874-2-git-send-email-matthew.d.roper@intel.com
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Navare, Manasi D [Wed, 26 Oct 2016 23:25:55 +0000 (16:25 -0700)]
drm/i915: Change the placement of some static functions in intel_dp.c
These static helper functions are required to be used during
fallback link rate implementation so they need to be placed at the top
of the file.
v3:
* Add cleanup to other patch (Mika Kahola)
v2:
* Don't move around functions declared in intel_drv.h (Rodrigo Vivi)
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477524358-16563-4-git-send-email-manasi.d.navare@intel.com
Ander Conselvan de Oliveira [Wed, 19 Oct 2016 07:59:00 +0000 (10:59 +0300)]
drm/i915: Address broxton phy registers based on phy and channel number
The port registers related to the phys in broxton map to different
channels and specific phys. Make that mapping explicit.
v2: Pass enum dpio_phy to macros instead of mmio base. (Imre)
v3: Fix typo in macros. (Imre)
v4: Also change variables from u32 to enum dpio_phy. (Imre)
Remove leftovers from previous version. (Imre)
v5: Actually git add the changes.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476863940-6019-1-git-send-email-ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:21 +0000 (19:22 +0300)]
drm/i915: Add location of the Rcomp resistor to bxt_ddi_phy_info
Use struct bxt_ddi_phy_info to hold information of where the Rcomp
resistor is located, instead of hard coding it in the init sequence.
Note that this moves the enabling of the phy with the Rcomp resistor out
of the power well enable code. That should be safe since
bxt_ddi_phy_init() is called while the power domains lock is held, and
that is the only way that function gets called, so there is no
possibility of a concurrent phy enable caused by a power domain get
call.
v2: Replace comment about lock with lockdep_assert_held() (Imre)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/62d209950ad48484564f3e793cf247cf62572a39.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:20 +0000 (19:22 +0300)]
drm/i915: Create a struct to hold information about the broxton phys
Information about which phy is dual channel is hardcoded in the phy init
sequence. Split that to a separate struct so the init sequence is more
generic.
v2: Restore mangled part that ended up in following patch. (Imre)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/9102f4c984044126057e4fdd1b91a615ff25fae6.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:19 +0000 (19:22 +0300)]
drm/i915: Move broxton vswing sequence to intel_dpio_phy.c
The vswing sequence is related to the DPIO phy, so move it closer to the
rest of DPIO phy related code.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/59aa5c85a115c5cbed81e793f20cd7b9f8de694b.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:18 +0000 (19:22 +0300)]
drm/i915: Move DPIO phy documentation section to intel_dpio_phy.c
Move the DPIO phy documentation section to intel_dpio_phy.c, since that
is a more suitable place now that there is a source file dedicated for
those phys.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/55a2d38c15c06a8c5bce498b28decc03948f0224.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:17 +0000 (19:22 +0300)]
drm/i915: Move broxton phy code to intel_dpio_phy.c
The phy in broxton is also a dpio phy, similar to cherryview but with
programming through MMIO. So move the code together with the other
similar phys.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/d611de6d256593cf904172db7ff27f164480c228.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:16 +0000 (19:22 +0300)]
drm/i915: Pass lane count to bxt_ddi_phy_calc_lane_optmin_mask()
Pass lane count to bxt_ddi_phy_calc_lane_optmin_mask() instead of having
it extract that number from a pipe_config to decouple the phy code from
intel_crtc_state.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/a4977e0207e594953c4f9d1b5f2ef972a8679e74.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:15 +0000 (19:22 +0300)]
drm/i915: Explicitly map broxton DPIO power wells to phys
The mapping from the BXT_DPIO_CMN_* power wells to their respective phys
required a detour implemented in the bxt_power_well_to_phy() function.
Instead, embed that information directly into the power_well struct, by
resurrecting the data field.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/7fe97582fa08c7340ce6a3b6b0ea3e72a73182d7.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Ander Conselvan de Oliveira [Thu, 6 Oct 2016 16:22:14 +0000 (19:22 +0300)]
drm/i915: Rename struct i915_power_well field data to id
Calling it data seems to imply arbitrary data can be associated with the
power well. However, that field is used for look ups and expected to be
unique, so rename it.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/f3916c3c5bfa793b0fc870fd44007a3ff425194d.1475770848.git-series.ander.conselvan.de.oliveira@intel.com
Daniel Vetter [Fri, 28 Oct 2016 07:14:08 +0000 (09:14 +0200)]
Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge latest drm-next to pull in the s/fence/dma_fence/ rework,
needed before we merge more i915 fencing patches.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Dave Airlie [Fri, 28 Oct 2016 04:24:56 +0000 (14:24 +1000)]
Merge branch 'linux-4.9' of git://github.com/skeggsb/linux into drm-next
Karol's work which greatly improves volt/clock changes on a
heap of boards, nothing too exciting beyond a random collection of fixes.
* 'linux-4.9' of git://github.com/skeggsb/linux: (33 commits)
drm/nouveau/fb/nv50: defer DMA mapping of scratch page to oneinit() hook
drm/nouveau/fb/gf100: defer DMA mapping of scratch page to oneinit() hook
drm/nouveau/pci: set streaming DMA mask early
drm/nouveau/kms: add Maxwell to backlight initialization
drm/nouveau/bar/nv50: fix bar2 vm size
drm/nouveau/disp: remove unused function in sorg94.c
drm/nouveau/volt: use kernel's 64-bit signed division function
drm/nouveau/core: add missing header dependencies
drm/nouveau/gr/nv3x: add 0x0597 kelvin 3d class support
drm/nouveau/drm/nouveau: add a LED driver for the NVIDIA logo
drm/nouveau/fb/ram: Use Kepler implementation on Maxwell
drm/nouveau/volt: Make use of cvb coefficients
drm/nouveau/volt/gf100-: Add speedo
drm/nouveau/volt: Add implementation for gf100
drm/nouveau/bios/vmap: unk0 field is the mode
drm/nouveau/volt: Don't require perfect fit
drm/nouveau/clk: Allow boosting only when NvBoost is set
drm/nouveau/bios: Add parsing of VPSTATE table
drm/nouveau/clk: Respect voltage limits in nvkm_cstate_prog
drm/nouveau/clk: Fixup cstate selection
...
Dave Airlie [Fri, 28 Oct 2016 01:33:52 +0000 (11:33 +1000)]
Merge tag 'topic/drm-misc-2016-10-27' of git://anongit.freedesktop.org/git/drm-intel into drm-next
Pull request already again to get the s/fence/dma_fence/ stuff in and
allow everyone to resync. Otherwise really just misc stuff all over, and a
new bridge driver.
* tag 'topic/drm-misc-2016-10-27' of git://anongit.freedesktop.org/git/drm-intel:
drm/bridge: fix platform_no_drv_owner.cocci warnings
drm/bridge: fix semicolon.cocci warnings
drm: Print some debug/error info during DP dual mode detect
drm: mark drm_of_component_match_add dummy inline
drm/bridge: add Silicon Image SiI8620 driver
dt-bindings: add Silicon Image SiI8620 bridge bindings
video: add header file for Mobile High-Definition Link (MHL) interface
drm: convert DT component matching to component_match_add_release()
dma-buf: Rename struct fence to dma_fence
dma-buf/fence: add an lockdep_assert_held()
drm/dp: Factor out helper to distinguish between branch and sink devices
drm/edid: Only print the bad edid when aborting
drm/msm: add missing header dependencies
drm/msm/adreno: move function declarations to header file
drm/i2c/tda998x: mark symbol static where possible
doc: add missing docbook parameter for fence-array
drm: RIP mode_config->rotation_property
drm/msm/mdp5: Advertize 180 degree rotation
drm/msm/mdp5: Use per-plane rotation property
Dave Airlie [Fri, 28 Oct 2016 00:35:59 +0000 (10:35 +1000)]
Merge branch 'drm-next-4.10' of git://people.freedesktop.org/~agd5f/linux into drm-next
First new feature pull for 4.10. Highlights:
- Support for multiple virtual displays in the virtual dce component
- New VM mgr to support non-contiguous vram buffers
- Support for UVD powergating on additional asics
- Power management improvements
- lots of code cleanup and bug fixes
* 'drm-next-4.10' of git://people.freedesktop.org/~agd5f/linux: (107 commits)
drm/amdgpu: turn on/off uvd clock when dpm enable/disable on CI
drm/amdgpu: disable dpm before turn off clock when vce idle.
drm/amdgpu: enable uvd bypass mode for CI/VI.
drm/amdgpu: just not load smc firmware if smu is already running
drm/amdgpu: when suspend, set boot state instand of disable dpm.
drm/amdgpu: use failed label to handle context init failure
drm/amdgpu: add amdgpu_ttm_bo_eviction_valuable callback
drm/ttm: make eviction decision a driver callback v2
drm/ttm: fix coding style in ttm_bo_driver.h
drm/radeon/pm: autoswitch power state when in balanced mode
drm/amd/powerplay: fix spelling mistake and add KERN_WARNING to printks
drm/amdgpu:new ids flag for preempt
drm/amdgpu: mark symbols static where possible
drm/amdgpu: change function declarations and add missing header dependencies
drm/amdgpu: s/amdgpuCrtc/amdgpu_crtc/ in pageflip code
drm/amdgpu/atom: remove a bunch of unused functions
drm/amdgpu: consolidate atom scratch reg handling for hangs
drm/amdgpu: use amdgpu_bo_[create|free]_kernel for wb
drm/amdgpu: add VCE VM session tracking
drm/amdgpu: improve parse_cs handling a bit
...
Rex Zhu [Wed, 26 Oct 2016 10:05:00 +0000 (18:05 +0800)]
drm/amdgpu: turn on/off uvd clock when dpm enable/disable on CI
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Wed, 26 Oct 2016 09:05:30 +0000 (17:05 +0800)]
drm/amdgpu: disable dpm before turn off clock when vce idle.
v2: move return value check as well
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Wed, 26 Oct 2016 09:04:33 +0000 (17:04 +0800)]
drm/amdgpu: enable uvd bypass mode for CI/VI.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Wed, 26 Oct 2016 05:44:12 +0000 (13:44 +0800)]
drm/amdgpu: just not load smc firmware if smu is already running
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rex Zhu [Mon, 3 Oct 2016 12:46:36 +0000 (20:46 +0800)]
drm/amdgpu: when suspend, set boot state instand of disable dpm.
Fix a pm-hibernate bug: on suspend/resume, dpm start failed.
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Huang Rui [Wed, 26 Oct 2016 09:07:03 +0000 (17:07 +0800)]
drm/amdgpu: use failed label to handle context init failure
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Tvrtko Ursulin [Thu, 27 Oct 2016 12:48:32 +0000 (13:48 +0100)]
drm/i915: Correct pipe fault reporting string
Newline somehow ended up in the middle of the line.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1477572512-4030-1-git-send-email-tvrtko.ursulin@linux.intel.com
Daniel Vetter [Thu, 27 Oct 2016 08:33:17 +0000 (10:33 +0200)]
Merge tag 'gvt-next-2016-10-27' of https://github.com/01org/gvt-linux into drm-intel-next-queued
gvt-next-2016-10-27
- Resolve the remaining build issue with ACPI=n and a 32-bit kernel
- TLB workaround from Arkadiusz
- vGPU reset fix from Ping
- workload scheduler nesting sleep fix from Changbin
- more misc fixes for sparse warnings and cleanups
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
kbuild test robot [Wed, 26 Oct 2016 16:58:36 +0000 (00:58 +0800)]
drm/bridge: fix platform_no_drv_owner.cocci warnings
drivers/gpu/drm/bridge/sil-sii8620.c:1556:3-8: No need to set .owner here. The core will do it.
Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
CC: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161026165836.GA98766@lkp-sb04.lkp.intel.com
kbuild test robot [Wed, 26 Oct 2016 16:58:36 +0000 (00:58 +0800)]
drm/bridge: fix semicolon.cocci warnings
drivers/gpu/drm/bridge/sil-sii8620.c:988:2-3: Unneeded semicolon
Remove unneeded semicolon.
Generated by: scripts/coccinelle/misc/semicolon.cocci
CC: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161026165836.GA98907@lkp-sb04.lkp.intel.com
Du, Changbin [Thu, 27 Oct 2016 03:10:31 +0000 (11:10 +0800)]
drm/i915/gvt: fix nested sleeping issue
We cannot use the blocking mutex_lock() inside a wait loop.
Here we invoke pick_next_workload(), which needs to acquire a
mutex, in our "condition" expression. Then we enter another
going-to-sleep sequence, changing the task state.
This is dangerous. Let's rewrite the wait sequence to avoid
nested sleeping.
v2: fix do...while loop exit condition (zhenyu)
v3: rebase to gvt-staging branch
Signed-off-by: Du, Changbin <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Bing Niu [Mon, 31 Oct 2016 09:35:12 +0000 (17:35 +0800)]
drm/i915/gvt: throw error basing on execlist submit result
Throw an error message in the elsp emulation handler based on the execlist
submit result. The guest will trigger its tdr process to recover; gvt
just follows the guest's desire.
v2: propagate the error to the top of the mmio emulation logic (comments
from zhenyu)
Signed-off-by: Bing Niu <bing.niu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Ping Gao [Wed, 26 Oct 2016 01:38:52 +0000 (09:38 +0800)]
drm/i915/gvt: add full vGPU reset support
Full vGPU reset needs to release all the shadow PPGTT pages to avoid
unnecessary write-protection, and should also re-initialize pvinfo after
resetting vregs to keep pvinfo correct.
Signed-off-by: Ping Gao <ping.a.gao@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Anusha Srivatsa [Tue, 25 Oct 2016 00:28:21 +0000 (17:28 -0700)]
drm/i915/DMC/KBL: Load DMC on KBL using the no_stepping_info array
Currently, for display there is only one DMC image for KBL.
Remove the stepping_info table for KBL and use the no_stepping_info
array for loading the firmware.
v2: Removed the block of code as pointed out by Rodrigo to make the
loads as generic as possible.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477355301-7035-1-git-send-email-anusha.srivatsa@intel.com
Imre Deak [Wed, 26 Oct 2016 16:29:19 +0000 (19:29 +0300)]
drm: Print some debug/error info during DP dual mode detect
There's at least one LSPCON device that occasionally returns an unexpected
adaptor ID which leads to a failed detect. Print some debug info to help
debugging this and future cases. Also print an error for an unexpected
adaptor ID, so users can report it.
v2:
- s/adapter/adaptor/ and add code comment about incorrect type 1 adaptor
IDs. (Ville)
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1477499359-12001-1-git-send-email-imre.deak@intel.com
Arnd Bergmann [Wed, 26 Oct 2016 08:57:47 +0000 (10:57 +0200)]
drm: mark drm_of_component_match_add dummy inline
The newly added drm_of_component_match_add helper is defined as
'static' in a header when CONFIG_OF is disabled, causing a warning
each time the header is included:
In file included from /git/arm-soc/drivers/gpu/drm/bridge/dw-hdmi.c:23:0:
include/drm/drm_of.h:33:13: error: 'drm_of_component_match_add' defined but not used [-Werror=unused-function]
This marks it 'inline' like the other such helpers in this file.
Fixes: 97ac0e47aed5 ("drm: convert DT component matching to component_match_add_release()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20161026085759.3875472-1-arnd@arndb.de
Ville Syrjälä [Mon, 24 Oct 2016 16:13:04 +0000 (19:13 +0300)]
drm/i915: Fix SKL+ 90/270 degree rotated plane coordinate computation
Pass the framebuffer size in .16 fixed point coordinates to
drm_rect_rotate() since that's what the source coordinates are as well
at this stage. We used to do this part of the computation in integer
coordinates, but that got changed when moving the computation to
happen in the check phase of the operation. Unfortunately I forgot
to shift up the fb width and height appropriately.
With the bogus size we ended up with some negative fb offset, which when
added to the vma offset caused our scanout to start at an offset earlier
than we intended. E.g. when testing on my SKL I saw a row of incorrect
tiles at the top of my screen.
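The effect of the missing shift can be reproduced with a simplified Python
model. This is a sketch only: rotate_90() below is a hypothetical
stand-in, not a transcription of drm_rect_rotate(), and the 1920x1080
framebuffer is an assumed example. Rectangles are (x1, y1, x2, y2) tuples
in 16.16 fixed point.

```python
def to_fixed16(pixels):
    """Convert an integer pixel count to 16.16 fixed point."""
    return pixels << 16


def rotate_90(rect, width):
    """Rotate a rect by 90 degrees inside a surface of the given width.

    rect and width must use the same units, otherwise the subtraction
    below mixes pixels with 16.16 values.
    """
    x1, y1, x2, y2 = rect
    return (y1, width - x2, y2, width - x1)


fb_w, fb_h = 1920, 1080                           # assumed framebuffer size
src = (0, 0, to_fixed16(fb_w), to_fixed16(fb_h))  # full-fb source, 16.16

good = rotate_90(src, to_fixed16(fb_w))  # width shifted up: consistent units
bad = rotate_90(src, fb_w)               # width left in pixels: the bug
```

Here good stays within the framebuffer, while bad has a large negative y1,
which is the kind of bogus value that turns into a negative fb offset.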
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: b63a16f6cd89 ("drm/i915: Compute display surface offset in the plane check hook for SKL+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477325584-23679-1-git-send-email-ville.syrjala@linux.intel.com
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Arkadiusz Hiler [Tue, 25 Oct 2016 12:48:02 +0000 (14:48 +0200)]
drm/i915: fix comment on I915_{READ, WRITE}_FW
Comment mentioned use of intel_uncore_forcewake_irq{unlock, lock}
functions which are nonexistent (and never were).
The description was also incomplete and could cause confusion. Updated
comment is more elaborate on usage and caveats.
v2: mention __locked variant of intel_uncore_forcewake_{get,put} instead
of plain ones
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[Mika: removed two superfluous lines on comment noted by Chris]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477399682-3133-1-git-send-email-arkadiusz.hiler@intel.com
Imre Deak [Mon, 24 Oct 2016 16:33:31 +0000 (19:33 +0300)]
drm/i915/lspcon: Add workaround for resuming in PCON mode
On my APL the LSPCON firmware resumes in PCON mode as opposed to the
expected LS mode. It also appears to be in a state where AUX DPCD reads
will succeed but return garbage, recovering only after a few hundred
milliseconds. After the recovery time DPCD reads will result in the
correct values and things will continue to work. If I2C over AUX is
attempted during this recovery time (implying an AUX write transaction)
the firmware won't recover and will stay in this broken state.
As a workaround check if the firmware is in PCON state after resume and
if so wait until the correct DPCD values are returned. For this we
compare the branch descriptor with the one we cached during init time.
If the firmware was in the LS state, we skip the w/a and continue as
before.
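The shape of that wait can be sketched in Python as a bounded polling
loop. This is an illustration under assumptions, not the driver code:
read_desc is a hypothetical callback standing in for the DPCD descriptor
read, and the timeout and poll interval are made-up values chosen only to
be in the "few hundred milliseconds" range mentioned above.

```python
import time


def wait_for_descriptor(read_desc, cached, timeout_s=1.0, poll_s=0.01,
                        sleep=time.sleep, now=time.monotonic):
    """Poll read_desc() until it returns the cached descriptor or time runs out.

    Only reads are issued while waiting; nothing is written to the device.
    Returns True on a match and False on timeout.
    """
    deadline = now() + timeout_s
    while True:
        if read_desc() == cached:
            return True
        if now() >= deadline:
            return False
        sleep(poll_s)
```

Comparing against the value cached at init time is what distinguishes a
successful-but-garbage read from a genuinely recovered device.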
v2:
- Use the DP descriptor value cached in intel_dp. (Jani)
- Get to intel_dp using container_of(), instead of a cached ptr.
(Shashank)
- Use usleep_range() instead of msleep().
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98353
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-9-git-send-email-imre.deak@intel.com
Imre Deak [Mon, 24 Oct 2016 16:33:30 +0000 (19:33 +0300)]
drm/i915/lspcon: Get DDC adapter via container_of() instead of cached ptr
We can use the container_of() magic to get to the DDC adapter, so no
need for caching a pointer to it. We'll also need to get at the intel_dp
ptr in the following patch, so add a helper that can be used for both
purposes.
Cc: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-8-git-send-email-imre.deak@intel.com
Imre Deak [Mon, 24 Oct 2016 16:33:29 +0000 (19:33 +0300)]
drm/i915/dp: Read DP descriptor for eDP and LSPCON too
As for external DP sink and branch devices, read and print the DP
descriptor for eDP and LSPCON devices as well to aid debugging.
v2:
- Split out this change to a separate patch. (Jani)
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477326811-30431-7-git-send-email-imre.deak@intel.com