git.openwrt.org Git - openwrt/staging/blogic.git/log

drm/i915: Wrap engine->schedule in RCU locks for set-wedge protection

Similar to the staging around handling of engine->submit_request, we
need to stop adding to the execlists->queue prior to calling
engine->cancel_requests. cancel_requests will move requests from the
queue onto the timeline, so if we add a request onto the queue after that
point, it will be lost.

Fixes: af7a8ffad9c5 ("drm/i915: Use rcu instead of stop_machine in set_wedged")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-5-chris@chris-wilson.co.uk

drm/i915: Include ring->emit in debugging

Include ring->emit and ring->space alongside ring->(head,tail) when
printing debug information.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-4-chris@chris-wilson.co.uk

drm/i915: Update ring position from request on retiring

When wedged, we do not update the ring->tail as we submit the requests
causing us to leak the ring->space upon cleaning up the wedged driver.
We can just use the value stored in rq->tail, and keep the submission
backend details away from set-wedge.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-3-chris@chris-wilson.co.uk

drm/i915: Finish the wait-for-wedge by retiring all the inflight requests

Before we reset the GPU after marking the device as wedged, we wait for
all the remaining requests to be completed (and marked as EIO).
Afterwards, we should flush the request lists so the next batch start
with the driver in an idle state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-1-chris@chris-wilson.co.uk

drm/i915/icl: do not save DDI A/E sharing bit for ICL

We don't want to preserve the DDI A 4 lane bit on ICL.

Fixes: 3d2011cfa41f ("drm/i915/icl: remove port A/E lane sharing limitation.")
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306104155.3526-1-jani.nikula@intel.com

drm/i915: Push irq_shift from gen8_cs_irq_handler() to caller

Originally we were inlining gen8_cs_irq_handler() and so expected the
compiler to constant-fold away the irq_shift (so we had hardcoded it as
opposed to use engine->irq_shift). However, we dropped the inline given
the proliferation of gen8_cs_irq_handler()s. If we pull the shifting
of the iir into the caller, we can shrink the code still further:

add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-34 (-34)
Function                                     old     new   delta
gen8_cs_irq_handler                          123     118      -5
gen8_gt_irq_handler                          261     248     -13
gen11_irq_handler                            722     706     -16

v2: Drop gen11_cs_irq_handler now that it is a simple
stub around gen8_cs_irq_handler (Daniele)

References: 5d3d69d5c119 ("drm/i915: Stop inlining the execlists IRQ handler")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180309010808.11921-1-chris@chris-wilson.co.uk

drm/i915: Index the ring frequency table by HW frequency range

When reporting the frequency table stored in the punit, report the full
range and not just the user restricted frequency range. In the process
keep the code to set the frequency table and read it the same.

v3: As we haven't separated the sb_lock from the pcu_lock yet, there's a
cycle between the pcu_lock and intel_runtime_pm_get.

References: f936ec34dea8 ("drm/i915/skl: Updated the i915_ring_freq_table debugfs function")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20180308142648.4016-2-chris@chris-wilson.co.uk

drm/i915: Kick the rps worker when changing the boost frequency

The boost frequency is only applied from the RPS worker while someone is
waiting on a request and requested a boost. As such, when the user
wishes to change the frequency, we have to kick the worker in order to
re-evaluate whether to apply the boost frequency.

v2: Check num_waiters to decide if we should kick the worker to handle
boosting.

Fixes: 29ecd78d3b79 ("drm/i915: Define a separate variable and control for RPS waitboost frequency")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308142648.4016-1-chris@chris-wilson.co.uk

drm/i915: Handle pipe CRC around enabling/disabling pipe.

This will get rid of the following error:
[   74.730271] WARNING: CPU: 4 PID: 0 at drivers/gpu/drm/drm_vblank.c:614 drm_calc_vbltimestamp_from_scanoutpos+0x13e/0x2f0
[   74.730311] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_intel crct10dif_pclmul snd_hda_codec crc32_pclmul snd_hwdep broadcom ghash_clmulni_intel snd_hda_core bcm_phy_lib snd_pcm tg3 lpc_ich mei_me mei prime_numbers
[   74.730353] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G     U           4.16.0-rc2-CI-CI_DRM_3822+ #1
[   74.730355] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
[   74.730359] RIP: 0010:drm_calc_vbltimestamp_from_scanoutpos+0x13e/0x2f0
[   74.730361] RSP: 0018:ffff88022fb03d10 EFLAGS: 00010086
[   74.730365] RAX: ffffffffa0291d20 RBX: ffff88021a180000 RCX: 0000000000000001
[   74.730367] RDX: ffffffff820e7db8 RSI: 0000000000000001 RDI: ffffffff82068cea
[   74.730369] RBP: ffff88022fb03d70 R08: 0000000000000000 R09: ffffffff815d26d0
[   74.730371] R10: 0000000000000000 R11: ffffffffa0161ca0 R12: 0000000000000001
[   74.730373] R13: ffff880212448008 R14: ffff880212448330 R15: 0000000000000000
[   74.730376] FS:  0000000000000000(0000) GS:ffff88022fb00000(0000) knlGS:0000000000000000
[   74.730378] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   74.730380] CR2: 000055edcbec9000 CR3: 0000000002210001 CR4: 00000000000606e0
[   74.730382] Call Trace:
[   74.730385]  <IRQ>
[   74.730397]  drm_get_last_vbltimestamp+0x36/0x50
[   74.730401]  drm_update_vblank_count+0x64/0x240
[   74.730409]  drm_crtc_accurate_vblank_count+0x41/0x90
[   74.730453]  display_pipe_crc_irq_handler+0x176/0x220 [i915]
[   74.730497]  i9xx_pipe_crc_irq_handler+0xfe/0x150 [i915]
[   74.730537]  ironlake_irq_handler+0x618/0xa30 [i915]
[   74.730548]  __handle_irq_event_percpu+0x3c/0x340
[   74.730556]  handle_irq_event_percpu+0x1b/0x50
[   74.730561]  handle_irq_event+0x2f/0x50
[   74.730566]  handle_edge_irq+0xe4/0x1b0
[   74.730572]  handle_irq+0x11/0x20
[   74.730576]  do_IRQ+0x5e/0x120
[   74.730584]  common_interrupt+0x84/0x84
[   74.730586]  </IRQ>
[   74.730591] RIP: 0010:cpuidle_enter_state+0xaa/0x350
[   74.730593] RSP: 0018:ffffc9000008beb8 EFLAGS: 00000212 ORIG_RAX: ffffffffffffffde
[   74.730597] RAX: ffff880226b80040 RBX: 000000000031fc3e RCX: 0000000000000001
[   74.730599] RDX: 0000000000000000 RSI: ffffffff8210fb59 RDI: ffffffff820c02e7
[   74.730601] RBP: 0000000000000004 R08: 00000000000040af R09: 0000000000000018
[   74.730603] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004
[   74.730606] R13: ffffe8ffffd00430 R14: 0000001166120bf4 R15: ffffffff82294460
[   74.730621]  ? cpuidle_enter_state+0xa6/0x350
[   74.730629]  do_idle+0x188/0x1d0
[   74.730636]  cpu_startup_entry+0x14/0x20
[   74.730641]  start_secondary+0x129/0x160
[   74.730646]  secondary_startup_64+0xa5/0xb0
[   74.730660] Code: e1 48 c7 c2 b8 7d 0e 82 be 01 00 00 00 48 c7 c7 ea 8c 06 82 e8 64 ec ff ff 48 8b 83 c8 07 00 00 48 83 78 28 00 0f 84 e2 fe ff ff <0f> 0b 45 31 ed e9 db fe ff ff 41 b8 d3 4d 62 10 89 c8 6a 03 41
[   74.730754] ---[ end trace 14b1345705b68565 ]---

Changes since v1:
- Don't try to apply CRC workaround when enabling pipe, it should already be enabled.
Changes since v2:
- Make crc functions for !DEBUGFS case inline.
- Pass intel_crtc to crc functions.
- Add comments to callsites.
Changes since v3:
- Cache selected source to pipe_crc->source.
- Set pipe_crc->skipped to MIN_INT during disable to close a race condition.
Changes since v4:
- Handle fallout from setting pipe_crc->source in irq handler.

Cc: Marta Löfstedt <marta.lofstedt@intel.com>
Reported-by: Marta Löfstedt <marta.lofstedt@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105185
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308120202.52446-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: Only prune fences after wait-for-all

Currently, we only allow ourselves to prune the fences so long as
all the waits completed (i.e. all the fences we checked were signaled),
and that the reservation snapshot did not change across the wait.
However, if we only waited for a subset of the reservation object, i.e.
just waiting for the last writer to complete as opposed to all readers
as well, then we would erroneously conclude we could prune the fences as
indeed although all of our waits were successful, they did not represent
the totality of the reservation object.

v2: We only need to check the shared fences due to construction (i.e.
all of the shared fences will be later than the exclusive fence, if
any).

Fixes: e54ca9774777 ("drm/i915: Remove completed fences after a wait")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307171303.29466-1-chris@chris-wilson.co.uk

drm/i915: Update DRIVER_DATE to 20180308

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Merge tag 'gvt-next-2018-03-08' of https://github.com/intel/gvt-linux into drm-intel-next-queued

gvt-next-2018-03-08

- big refactor for shadow ppgtt (Changbin)
- KBL context save/restore via LRI cmd (Weinan)
- misc smatch fixes (Zhenyu)
- Properly unmap dma for guest page (Changbin)
- other misc fixes (Xiong, etc.)

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308023152.oi4ialn5uxetbruf@zhen-hp.sh.intel.com

drm/i915: add schedule out notification of preempted but completed request

There is one corner case missing schedule out notification of the preempted
request. The preempted request is just completed when preemption happen,
then it will be canceled and won't be resubmitted later, GVT-g will lost
the schedule out notification.

Here add schedule out notification if found the preempted request has been
completed.

v2:
- refine description, add completed check and notification in
execlists_cancel_port_requests. (Chris)

v3:
- use ternary confitional, remove local variable. (Tvrtko)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1520302557-25079-1-git-send-email-weinan.z.li@intel.com

drm/i915: expose rcs topology through query uAPI

With the introduction of asymmetric slices in CNL, we cannot rely on
the previous SUBSLICE_MASK getparam to tell userspace what subslices
are available. Here we introduce a more detailed way of querying the
Gen's GPU topology that doesn't aggregate numbers.

This is essential for monitoring parts of the GPU with the OA unit,
because counters need to be normalized to the number of
EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do
not gives us sufficient information.

The Mesa series making use of this API is :

    https://patchwork.freedesktop.org/series/38795/

As a bonus we can draw representations of the GPU :

    https://imgur.com/a/vuqpa

v2: Rename uapi struct s/_mask/_info/ (Tvrtko)
    Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko)
    Add uapi macros to read data from *_info structs (Tvrtko)

v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko)

v4: factorize query item writting (Tvrtko)
    tweak uapi struct/define names (Tvrtko)

v5: Replace ALIGN() macro (Chris)

v6: Updated uapi comments (Tvrtko)
    Moved flags != 0 checks into vfuncs (Tvrtko)

v7: Use access_ok() before copying anything, to avoid overflows (Chris)
    Switch BUG_ON() to GEM_WARN_ON() (Tvrtko)

v8: Tweak uapi comments style to match the coding style (Lionel)

v9: Fix error in comment about computation of enabled subslice (Tvrtko)

v10: Fix/update comments in uAPI (Sagar)

v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single
     drm_i915_query_topology_info (Joonas)

v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas)

v13: Fix comment in uAPI (Joonas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-7-lionel.g.landwerlin@intel.com

drm/i915: add query uAPI

There are a number of information that are readable from hardware
registers and that we would like to make accessible to userspace. One
particular example is the topology of the execution units (how are
execution units grouped in subslices and slices and also which ones
have been fused off for die recovery).

At the moment the GET_PARAM ioctl covers some basic needs, but
generally is only able to return a single value for each defined
parameter. This is a bit problematic with topology descriptions which
are array/maps of available units.

This change introduces a new ioctl that can deal with requests to fill
structures of potentially variable lengths. The user is expected fill
a query with length fields set at 0 on the first call, the kernel then
sets the length fields to the their expected values. A second call to
the kernel with length fields at their expected values will trigger a
copy of the data to the pointed memory locations.

The scope of this uAPI is only to provide information to userspace,
not to allow configuration of the device.

v2: Simplify dispatcher code iteration (Tvrtko)
    Tweak uapi drm_i915_query_item structure (Tvrtko)

v3: Rename pad fields into flags (Chris)
    Return error on flags field != 0 (Chris)
    Only copy length back to userspace in drm_i915_query_item (Chris)

v4: Use array of functions instead of switch (Chris)

v5: More comments in uapi (Tvrtko)
    Return query item errors in length field (All)

v6: Tweak uapi comments style to match the coding style (Lionel)

v7: Add i915_query.h (Joonas)

v8: (Lionel) Change the behavior of the item iterator to report
    invalid queries into the query item rather than stopping the
    iteration. This enables userspace applications to query newer
    items on older kernels and only have failure on the items that are
    not supported.

v9: Edit copyright headers (Joonas)

v10: Typos & comments in uapi (Joonas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-6-lionel.g.landwerlin@intel.com

drm/i915: add rcs topology to error state

This might be useful information for developers looking at an error
state.

v2: Place topology towards the end of the error state (Chris)

v3: Reuse common printing code (Michal)

v4: Make this a one-liner (Chris)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-5-lionel.g.landwerlin@intel.com

drm/i915/debugfs: add rcs topology entry

While the end goal is to make this information available to userspace
through a new ioctl, there is no reason we can't display it in a human
readable fashion through debugfs.

slice0: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice1: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)
slice2: 3 subslice(s) (0x7):
subslice0: 8 EUs (0xff)
subslice1: 8 EUs (0xff)
subslice2: 8 EUs (0xff)
subslice3: 0 EUs (0x0)

v2: Reformat debugfs printing (Tvrtko)
Use the new EU mask helper (Tvrtko)

v3: Move printing code to intel_device_info.c to be shared with error
state (Michal)

v4: Bump u8 to u16 when using sseu_get_eus() (Lionel)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-4-lionel.g.landwerlin@intel.com

drm/i915/debugfs: reuse max slice/subslices already stored in sseu

Now that we have that information in topology fields, let's just reuse it.

v2: Style tweaks (Tvrtko)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-3-lionel.g.landwerlin@intel.com

drm/i915: store all subslice masks

Up to now, subslice mask was assumed to be uniform across slices. But
starting with Cannonlake, slices can be asymmetric (for example slice0
has different number of subslices as slice1+). This change stores all
subslices masks for all slices rather than having a single mask that
applies to all slices.

v2: Rework how we store total numbers in sseu_dev_info (Tvrtko)
    Fix CHV eu masks, was reading disabled as enabled (Tvrtko)
    Readability changes (Tvrtko)
    Add EU index helper (Tvrtko)

v3: Turn ALIGN(v, 8) / 8 into DIV_ROUND_UP(v, BITS_PER_BYTE) (Tvrtko)
    Reuse sseu_eu_idx() for setting eu_mask on CHV (Tvrtko)
    Reformat debug prints for subslices (Tvrtko)

v4: Change eu_mask helper into sseu_set_eus() (Tvrtko)

v5: With Haswell reporting masks & counts, bump sseu_*_eus() functions
    to use u16 (Lionel)

v6: Fix sseu_get_eus() for > 8 EUs per subslice (Lionel)

v7: Change debugfs enabels for number of subslices per slice, will
    need a small igt/pm_sseu change (Lionel)
    Drop subslice_total field from sseu_dev_info, rely on
    sseu_subslice_total() to recompute the value instead (Lionel)

v8: Remove unused function compute_subslice_total() (Lionel)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306122857.27317-2-lionel.g.landwerlin@intel.com

drm/i915/guc: work around gcc-4.4.4 union initializer issue

gcc-4.4.4 has problems with initalizers of anon unions.

drivers/gpu/drm/i915/intel_guc_log.c: In function 'guc_log_control':
drivers/gpu/drm/i915/intel_guc_log.c:64: error: unknown field 'logging_enabled' specified in initializer

Work around this.

Fixes: 35fe703c3161 ("drm/i915/guc: Change values for i915_guc_log_control")
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180308001333.rI2vrNRTY%akpm@linux-foundation.org

drm/i915/cnl: Add Wa_2201832410

"Clock gating bug in GWL may not clear barrier state when an EOT
is received, causing a hang the next time that barrier is used."

HSDES: 2201832410

Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180307220912.3681-1-rodrigo.vivi@intel.com

drm/i915/icl: Gen11 forcewake support

The main difference with previous GENs is that starting from Gen11
each VCS and VECS engine has its own power well, which only exist
if the related engine exists in the HW.
The fallback forcewake request workaround is only needed on gen9
according to the HSDES WA entry (1604254524), so we can go back to using
the simpler fw_domains_get/put functions.

BSpec: 18331

v2: fix fwtable, use array to test shadow tables, create new
    accessors to avoid check on every access (Tvrtko)
v3 (from Paulo): Rebase.
v4:
  - Range 09400-097FF should be FORCEWAKE_ALL (Daniele)
  - Use the BIT macro for forcewake domains (Daniele)
  - Add a comment about the range ordering (Oscar)
  - Updated commit message (Oscar)
v5: Rebased
v6: Use I915_MAX_VCS/VECS (Michal)
v7: translate FORCEWAKE_ALL to available domains
v8: rebase, add clarification on fallback ack in commit message.
v9: fix rebase issue, change check in fw_domains_init from IS_GEN11
    to GEN >= 11
v10: Generate is_genX_shadowed with a macro (Daniele)
     Include gen11_fw_ranges in the selftest (Michel)
v11: Simplify FORCEWAKE_ALL, new line between NEEDS_FORCEWAKEs (Tvrtko)

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-6-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/icl: Add Indirect Context Offset for Gen11

v2: rebased to intel_lr_indirect_ctx_offset
v3: rebase, move define to intel_lrc_reg.h

BSpec: 11740
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-5-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/icl: Enhanced execution list support

Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue
(the ExecLists Submission Queue - ELSQ), which is then loaded on the
HW. When writing to the ELSP register, the lrcs are written cyclically
in the queue from position 0 to position 7. Alternatively, it is
possible to write directly in the individual positions of the queue
using the ELSQC registers. To be able to re-use all the existing code
we're using the latter method and we're currently limiting ourself to
only using 2 elements.

v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
preemption (Daniele), reuse engine->execlists.elsp (Chris)
v6: use has_logical_ring_elsq to differentiate the new paths
v7: add preemption support, rename els to submit_reg (Chris)
v8: save the ctrl register inside the execlists struct, drop CSB
handling updates (superseded by preempt_complete_status) (Chris)
v9: s/drm_i915_gem_request/i915_request (Mika)
v10: resolved conflict in inject_preempt_context (Mika)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-4-mika.kuoppala@linux.intel.com

drm/i915/icl: new context descriptor support

Starting from Gen11 the context descriptor format has been updated in
the HW. The hw_id field has been considerably reduced in size and engine
class and instance fields have been added.

There is a slight name clashing issue because the field that we call
hw_id is actually called SW Context ID in the specs for Gen11+.

With the current size of the hw_id field we can have a maximum of 2k
contexts at any time, but we could use the sw_counter field (which is sw
defined) to increase that because the HW requirement is that
engine_id + sw id + sw_counter is a unique number.
GuC uses a similar method to support more contexts but does its tracking
at lrc level. To avoid doing an implementation that will need to be
reworked once GuC support lands, defer it for now and mark it as TODO.

v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
v3: rebased, bring back lost code from i915_gem_context.c
v4: make TODO comment more generic
v5: be consistent with bit ordering, add extra checks (Chris)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/icl: Correctly initialize the Gen11 engines

Gen11 has up to 4 VCS and up to 2 VECS engines, this patch adds mmio
base definitions for all of them.

Bspec: 20944
Bspec: 7021

v2: Set the correct mmio_base in intel_engines_init_mmio; updating the
base mmio values any later would cause incorrect reads in
i915_gem_sanitize (Michel).

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ceraolo Spurio, Daniele <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-2-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915: Assert that the request is indeed complete when signaled from irq

After we call dma_fence_signal(), confirm that the request was indeed
complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305104105.8296-1-chris@chris-wilson.co.uk

drm/i915: Handle changing enable_fbc parameter at runtime better.

If i915.enable_fbc is cleared at runtime, but FBC was previously enabled
then we don't disable FBC until the next time the crtc is disabled.

Make sure that if the module param is changed, we disable FBC in
intel_fbc_post_update so we never have to worry about disabling.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305123608.20665-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Track whether the DP link is trained or not

LSPCON likes to throw short HPDs during the enable seqeunce prior to the
link being trained. These obviously result in the channel CR/EQ check
failing and thus we schedule a pointless hotplug work to retrain the
link. Avoid that by ignoring the bad CR/EQ status until we've actually
initially trained the link.

I've not actually investigated to see what LSPCON is trying to signal
with the short pulse. But as long as it signals anything I think we're
supposed to check the link status anyway, so I don't really see other
good ways to solve this. I've not seen these short pulses being
generated by normal DP sinks.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-5-ville.syrjala@linux.intel.com

drm/i915: Nuke intel_dp->channel_eq_status

intel_dp->channel_eq_status is used in exactly one function, and we
don't need it to persist between calls. So just go back to using a
local variable instead.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-4-ville.syrjala@linux.intel.com

drm/i915: Move SST DP link retraining into the ->post_hotplug() hook

Doing link retraining from the short pulse handler is problematic since
that might introduce deadlocks with MST sideband processing. Currently
we don't retrain MST links from this code, but we want to change that.
So better to move the entire thing to the hotplug work. We can utilize
the new encoder->hotplug() hook for this.

The only thing we leave in the short pulse handler is the link status
check. That one still depends on the link parameters stored under
intel_dp, so no locking around that but races should be mostly harmless
as the actual retraining code will recheck the link state if we
end up there by mistake.

v2: Rebase due to ->post_hotplug() now being just ->hotplug()
Check the connector type to figure out if we should do
the HDMI thing or the DP think for DDI

[pushed with whitespace changes for sparse]
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-3-ville.syrjala@linux.intel.com

drm/i915: Reinitialize sink scrambling/TMDS clock ratio on HPD

The LG 4k TV I have doesn't deassert HPD when I turn the TV off, but
when I turn it back on it will pulse the HPD line. By that time it has
forgotten everything we told it about scrambling and the clock ratio.
Hence if we want to get a picture out if it again we have to tell it
whether we're currently sending scrambled data or not. Implement
that via the encoder->hotplug() hook.

v2: Force a full modeset to not follow the HDMI 2.0 spec more
closely (Shashank)

[pushed with whitespace fixes to make sparse happy]
Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com

drm/i915: Convert intel_hpd_irq_event() into an encoder hotplug hook

Allow encoders to customize their hotplug processing by moving the
intel_hpd_irq_event() code into an encoder hotplug vfunc. Currently
only SDVO needs this to re-enable hotplug signalling in the SDVO
chip. We'll use this same hook for DP/HDMI link management later.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180117192149.17760-1-ville.syrjala@linux.intel.com

drm/i915/cnp: Document WaSouthDisplayDisablePWMCGEGating

No functional change since WA is already applied.
But since it has different names on different databases,
let's document it here to avoid future confusion.

Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306012812.19779-1-rodrigo.vivi@intel.com

drm/i915/cnl: document WaVFUnitClockGatingDisable

No functional change. WA is already properly applied.
but in different databases it has different names.
Let's document all of them to avoid future confusion.

Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306012000.18928-1-rodrigo.vivi@intel.com

drm/i915/psr: Update PSR2 resolution check for Cannonlake

In fact, apply the Cannonlake resolution check for all >= Gen-10 platforms
to be safe.

v3: Update GLK too. (Ville)
Longer variable names.
if-else in place of ternary operator.
v2: Use local variables for resolution limits and print them (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Elio Martinez Monroy <elio.martinez.monroy@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306203355.29292-1-dhinakaran.pandiyan@intel.com

drm/i915: Flush waiters on seqno wraparound

Previously, we would spin waiting for all waiters to wake up and notice
their request had completed before we would reset the seqno upon
wraparound. However, we can mark their waits as complete and wake them
up directly using the existing machinery for handling the flushing of
missed wakeups when idling.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-2-chris@chris-wilson.co.uk

drm/i915: Stop kicking the signaling thread on seqno wraparound

Since commit fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted
signalers"), we cancel the signaler when retiring the request and so
upon wraparound, where we wait for all requests to be retired, we no
longer need to spin waiting for the signaling thread to release its
references to the in-flight requests, and so we can assert that the
signaler is idle.

References: fd10e2ce9905 ("drm/i915/breadcrumbs: Ignore unsubmitted signalers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180306130143.13312-1-chris@chris-wilson.co.uk

drm/i915/breadcrumbs: Assert all missed breadcrumbs were signaled

When parking the engines and their breadcrumbs, if we have waiters left
then they missed their wakeup. Verify that each waiter's seqno did
complete.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-2-chris@chris-wilson.co.uk

drm/i915/breadcrumbs: Reduce signaler rbtree to a sorted list

The goal here is to try and reduce the latency of signaling additional
requests following the wakeup from interrupt by reducing the list of
to-be-signaled requests from an rbtree to a sorted linked list. The
original choice of using an rbtree was to facilitate random insertions
of request into the signaler while maintaining a sorted list. However,
if we assume that most new requests are added when they are submitted,
we see those new requests in execution order making a insertion sort
fast, and the reduction in overhead of each signaler iteration
significant.

Since commit 56299fb7d904 ("drm/i915: Signal first fence from irq handler
if complete"), we signal most fences directly from notify_ring() in the
interrupt handler greatly reducing the amount of work that actually
needs to be done by the signaler kthread. All the thread is then
required to do is operate as the bottom-half, cleaning up after the
interrupt handler and preparing the next waiter. This includes signaling
all later completed fences in a saturated system, but on a mostly idle
system we only have to rebuild the wait rbtree in time for the next
interrupt. With this de-emphasis of the signaler's role, we want to
rejig it's datastructures to reduce the amount of work we require to
both setup the signal tree and maintain it on every interrupt.

References: 56299fb7d904 ("drm/i915: Signal first fence from irq handler if complete")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222092545.17216-1-chris@chris-wilson.co.uk

drm/i915/error: capture uc_state after gen_state

error->device_info.has_guc, which we check in capture_uc_state, is set
in capture_gen_state, so the latter needs to be performed first.

v2: rebased

Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Fixes: 7d41ef3479a6 (drm/i915: Add Guc/HuC firmware details to error state)
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-3-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/error: standardize function style in error capture

some of the static functions used from capture() have the "i915_"
prefix while other don't; most of them take i915 as a parameter, but one
of them derives it internally from error->i915. Let's be consistent by
avoiding prefix for static functions and by getting i915 from
error->i915. While at it, s/dev_priv/i915 in functions that don't
perform register reads.

v2: take i915 from error->i915 (Michal), s/dev_priv/i915,
update commit message

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-2-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/error: remove unused gen8_engine_sync_index

Leftover from Gen8 ringbuffer support removal

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305222122.3547-1-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gvt: Return error at the failure of finding page_track

In XenGT, ioreq copy is used to trap mmio write and ppgtt write. Both
of them are memory write, ioreq handler couldn't distinguish them. So
ioreq handler probe the ppgtt write handler, if it is succuess, this
ioreq is ppgtt write, otherwise it is mmio write.

So ppgtt write handler should return an error at the failure of finding
page track, it is fatal to implement ioreq handler in XenGT.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Release gvt->lock at the failure of finding page track

page_track_handler take lock at the beginning, the lock should be released
at the failure of finding page track. Otherwise deadlock will happen.

Fixes: e502a2af4c35 ("drm/i915/gvt: Provide generic page_track infrastructure for write-protected page")
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/kvmgt: Add kvmgt debugfs entry nr_cache_entries under vgpu

Add a new debugfs entry kvmgt_nr_cache_entries under vgpu which shows
the number of entry in dma cache.

$ cat /sys/kernel/debug/gvt/vgpu1/kvmgt_nr_cache_entries
10101

v3: fix compiling error for some configuration. (Xiong Zhang <xiong.y.zhang@intel.com>)
v2: keep debugfs layout flat.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix guest vGPU hang caused by very high dma setup overhead

The implementation of current kvmgt implicitly setup dma mapping at MPT
API gfn_to_mfn. First this design against the API's original purpose.
Second, there is no unmap hit in this design. The result is that the
dma mapping keep growing larger and larger. For mutl-vm case, they will
consume IOMMU IOVA low 4GB address space quickly and so tons of rbtree
entries crated in the IOMMU IOVA allocator. Finally, single IOVA
allocation can take as long as ~70ms. Such latency is intolerable.

To address both above issues, this patch introduced two new MPT API:
o dma_map_guest_page - setup dma map for guest page
o dma_unmap_guest_page - cancel dma map for guest page

The kvmgt implements these 2 API. And to reduce dma setup overhead for
duplicated pages (eg. scratch pages), two caches are used: one is for
mapping gfn to struct gvt_dma, another is for mapping dma addr to
struct gvt_dma.

With these 2 new API, the gtt now is able to cancel dma mapping when page
table is invalidated. The dma mapping is not in a gradual increase now.

v2: follow the old logic for VFIO_IOMMU_NOTIFY_DMA_UNMAP at this point.

Cc: Hang Yuan <hang.yuan@intel.com>
Cc: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix check error on hws_pga_write() fail message

Fix below check error by using proper failure message output.

drivers/gpu/drm/i915//gvt/handlers.c:1392 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR()
drivers/gpu/drm/i915//gvt/handlers.c:1402 hws_pga_write() error: 'vgpu' dereferencing possible ERR_PTR()

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix one indent error

Fix below warning:

drivers/gpu/drm/i915//gvt/handlers.c:323 gdrst_mmio_write() warn: inconsistent indenting

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix check error on fence mmio handler

Fix below error with minor code refactor.

CHECK drivers/gpu/drm/i915//gvt/handlers.c
drivers/gpu/drm/i915//gvt/handlers.c:203 sanitize_fence_mmio_access() error: 'vgpu' dereferencing possible ERR_PTR()

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix check error of vgpu create failure message

Fix check error at

CHECK drivers/gpu/drm/i915//gvt/kvmgt.c
drivers/gpu/drm/i915//gvt/kvmgt.c:455 intel_vgpu_create() error: we previously assumed 'vgpu' could be null (see line 454)

For failed vgpu create, just show error return in failure message.

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix vGPU sched timeslice calculation warning

Fix below warning by using proper ktime helper to calculate timeslice.

CHECK drivers/gpu/drm/i915//gvt/sched_policy.c
drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1
drivers/gpu/drm/i915//gvt/sched_policy.c:108 gvt_balance_timeslice() debug: sval_binop_signed: invalid divide LLONG_MIN/-1

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: remove gvt max port definition

Remove GVT-g private max port definition but use i915 one.

Fix error caused by:
drivers/gpu/drm/i915//gvt/handlers.c:871 dp_aux_ch_ctl_mmio_write() error: buffer overflow 'display->ports' 5 <= 5

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix one gvt_vgpu_error() use in dmabuf.c

Fix below warning with proper usage.

CHECK drivers/gpu/drm/i915//gvt/dmabuf.c
drivers/gpu/drm/i915//gvt/dmabuf.c:462 intel_vgpu_get_dmabuf() error: 'vgpu' dereferencing possible ERR_PTR()

Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: init mmio by lri command in vgpu inhibit context

There is one issue relates to Coarse Power Gating(CPG) on KBL NUC in GVT-g,
vgpu can't get the correct default context by updating the registers before
inhibit context submission. It always get back the hardware default value
unless the inhibit context submission happened before the 1st time
forcewake put. With this wrong default context, vgpu will run with
incorrect state and meet unknown issues.

The solution is initialize these mmios by adding lri command in ring buffer
of the inhibit context, then gpu hardware has no chance to go down RC6 when
lri commands are right being executed, and then vgpu can get correct
default context for further use.

v3:
- fix code fault, use 'for' to loop through mmio render list(Zhenyu)

v4:
- save the count of engine mmio need to be restored for inhibit context and
refine some comments. (Kevin)

v5:
- code rebase

Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: add interface to check if context is inhibit

No functional change, just for easy to use.

v4:
- refine comment (Kevin)

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: add define GEN9_MOCS_SIZE

No functional change. This defination will also be used in future patchesi.

v4:
- refine patch description (Kevin)

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Define PTE addr mask with GENMASK_ULL

Define the masks better.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Manage shadow pages with radix tree

We don't know how many page tables will be shadowed. It varies
considerably corresponding to guest load. Radix tree is a better
choice for us. Since Page Frame Number is used as key so most of
the bits are common.

Here is some performance data (duration in us) of looking up a
element:
Before: (aka. ppgtt_find_shadow_page)
0.308 0.292 0.246 0.432 0.143 ... 0.311 0.225 0.382 0.199 0.325
After: (aka. intel_vgpu_find_spt_by_mfn)
0.106 0.106 0.107 0.106 0.105 0.107 ... 0.107 0.109 0.105 0.108

This time I didn't get the early data of hash table. The data is
measured when desktop is shown.

As last change, the overall benchmark almost is not changed, but
we get better scalability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Provide generic page_track infrastructure for write-protected page

This patch provide generic page_track infrastructure for write-protected
guest page. The old page_track logic gets rewrote and now stays in a new
standalone page_track.c. This page track infrastructure can be both used
by vGUC and GTT shadowing.

The important change is that it uses radix tree instead of hash table.
We don't have a predictable number of pages that will be tracked.

Here is some performance data (duration in us) of looking up a element:
Before: (aka. intel_vgpu_find_tracked_page)
0.091 0.089 0.090 ... 0.093 0.091 0.087 ... 0.292 0.285 0.292 0.291
After: (aka. intel_vgpu_find_page_track)
0.104 0.105 0.100 0.102 0.102 0.100 ... 0.101 0.101 0.105 0.105

The hash table has good performance at beginning, but turns bad with
more pages being tracked even no 3D applications are running. As
expected, radix tree has stable duration and very quick.

The overall benchmark (tested with Heaven Benchmark) marginally improved
since this is not the bottleneck. What we benefit more from this change
is scalability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Don't extend page_track to mpt layer

Don't extend page_track to mpt layer. Keep MPT simple and clean.
Meanwhile remove gtt.n_tracked_guest_page which doesn't make much
sense.

v2: clean up gtt.n_tracked_guest_page.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rename mpt api {set, unset}_wp_page to {enable, disable}_page_track

The kvmgt's implementation of mpt api {set,unset}_wp_page is not real
write-protection - the data get written before invoke this two api.
As discussed, change the mpt api to match the real behavior.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rename shadow_page to short name spt

The target structure of some functions is struct intel_vgpu_ppgtt_spt and
their names are xxx_shadow_page. It should be xxx_shadow_page_table. Let's
use short name 'spt' instead to reduce the length. As well as the hash
table name.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rework shadow page management code

This is a another big one and the GVT shadow page management code is
heavily refined.

The new code only use struct intel_vgpu_ppgtt_spt to represent a vgpu
shadow page table - w/ or wo/ a guest page associated with. A pure shadow
page (no guest page associated) will be used to shadow splited 2M huge
gtt. In this case, the spt.guest_page.gfn should be a zero.

To search a existed shadow page table, we have two new interfaces:
- intel_vgpu_find_spt_by_gfn(), find a spt by guest gfn. It must not
be a pure spt.
- intel_vgpu_find_spt_by_mfn, Find the spt using shadow page mfn in
shadowed PTE.

The oos_page management is remained as what is was.

v2: Split some changes into small standalone patches.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine pte shadowing process

Make the shadow PTE population code clear. Later we will add huge gtt
support based on this.

v2:
- rebase to latest code.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Use standard pte bit definition

GTT entry has similar format with the CPU PTE. We'd prefer named macro
instead of hardcode.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Factor out intel_vgpu_{get, put}_ppgtt_mm interface

Factor out these two interfaces so we can kill some duplicated code in
scheduler.c.

v2:
- rename to intel_vgpu_{get,put}_ppgtt_mm
- refine handle_g2v_notification

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rename ggtt related functions to be more specific

Accurate names help to avoid confusing so improve readability.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add verbose gtt shadow logs

This add a new macro gvt_vdbg_mm() to print more verbose logs for
gtt shadowing. The added verbose logs are very useful for debugging.
gvt_vdbg_mm() only comes into effect if VERBOSE_DEBUG is defined by
the developer.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine ggtt_set_shadow_entry

Less code and use existed helper ggtt_set_host_entry.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine ggtt and ppgtt root entry ops

Separate ggtt and ppgtt since they are different. A little more code but
straightforward.

And move these helpers to gtt.c since that is the only client.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Refine the intel_vgpu_mm reference management

If we manage an object with a reference count, then its life cycle
must flow the reference count operations. Meanwhile, change the
operation functions to generic name *put* and *get*.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Rework shadow graphic memory management code

This is a big one and the GVT shadow graphic memory management code is
heavily refined. The new code is more straightforward with less code.

The struct intel_vgpu_mm is restructured to be clearly defined, use
accurate names and some of the original fields are removed which are
really redundant.

Now we only manage ppgtt mm object with mm->ppgtt_mm.lru_list. No need
to mix ppgtt and ggtt together, since one vGPU only has one ggtt object.

v4: Don't invoke ppgtt_free_all_shadow_page before intel_vgpu_destroy_all_ppgtt_mm.
v3: Add GVT_RING_CTX_NR_PDPS to avoid confusing about the PDPs.
v2: Split some changes into small standalone patches.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/icl: Ringbuffer interrupt handling

On Gen11 interrupt masks need to be clear to allow C6 entry.
We keep them all enabled knowing that we generate extra
interrupts.

v2: Rebase.
v3: Remove gen 11 extra check in logical_render_ring_init.
v4: Rebase fixes.
v5: Rebase/refactor.
v6: Rebase.
v7: Rebase.
v8: Update comment and commit message (Daniele)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-1-mika.kuoppala@linux.intel.com

drm/i915: Unwind vma pinning for intel_pin_and_fence_fb_obj error path

If we fail to acquire a fence when we must, we must unwind before
reporting the error. Otherwise, we lose tracking of the vma pinning and
eventually hit a bug like

<3>[   46.163202] i915_vma_unpin:333 GEM_BUG_ON(!i915_vma_is_pinned(vma))
<4>[   46.163424] ------------[ cut here ]------------
<2>[   46.163429] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:333!
<4>[   46.163444] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
<0>[   46.163451] Dumping ftrace buffer:
<0>[   46.163457] ---------------------------------
<0>[   46.163630]    <...>-84      1.... 46260767us : i915_gem_object_unpin_from_display_plane: i915_vma_unpin:333 GEM_BUG_ON(!i915_vma_is_pinned(vma))
<0>[   46.163635] ---------------------------------
<4>[   46.163638] Modules linked in: vgem i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich mei_me e1000e mei prime_numbers
<4>[   46.163667] CPU: 1 PID: 84 Comm: kworker/u16:1 Tainted: G     U           4.16.0-rc3-gc07ef2c77d14-kasan_18+ #1
<4>[   46.163671] Hardware name: Dell Inc. OptiPlex 755                 /0PU052, BIOS A08 02/19/2008
<4>[   46.163743] Workqueue: events_unbound intel_atomic_commit_work [i915]
<4>[   46.163809] RIP: 0010:i915_gem_object_unpin_from_display_plane+0x253/0x2f0 [i915]
<4>[   46.163813] RSP: 0018:ffff8800624cfb48 EFLAGS: 00010286
<4>[   46.163818] RAX: 000000000000000c RBX: ffff880064446c40 RCX: ffff8800653135b8
<4>[   46.163822] RDX: dffffc0000000000 RSI: 0000000000000054 RDI: ffff8800651e30d0
<4>[   46.163825] RBP: 00000000000003d0 R08: 0000000000000001 R09: ffff8800651e3158
<4>[   46.163829] R10: 0000000000000000 R11: ffff8800651e30f0 R12: 0000000000000001
<4>[   46.163832] R13: ffff880054c58620 R14: 0000000000000000 R15: dffffc0000000000
<4>[   46.163836] FS:  0000000000000000(0000) GS:ffff880066040000(0000) knlGS:0000000000000000
<4>[   46.163840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   46.163843] CR2: 00007f1fc6fb0000 CR3: 00000000526fe000 CR4: 00000000000006e0
<4>[   46.163846] Call Trace:
<4>[   46.163918]  intel_unpin_fb_vma+0xbd/0x300 [i915]
<4>[   46.163990]  intel_cleanup_plane_fb+0x99/0xc0 [i915]
<4>[   46.163998]  drm_atomic_helper_cleanup_planes+0x166/0x280
<4>[   46.164071]  intel_atomic_commit_tail+0x1594/0x33a0 [i915]
<4>[   46.164081]  ? process_one_work+0x66e/0x1460
<4>[   46.164151]  ? skl_update_crtcs+0x9c0/0x9c0 [i915]
<4>[   46.164157]  ? lock_acquire+0x13d/0x390
<4>[   46.164161]  ? lock_acquire+0x13d/0x390
<4>[   46.164169]  process_one_work+0x71a/0x1460
<4>[   46.164175]  ? __schedule+0x838/0x1e50
<4>[   46.164182]  ? pwq_dec_nr_in_flight+0x2b0/0x2b0
<4>[   46.164188]  ? _raw_spin_lock_irq+0xa/0x40
<4>[   46.164194]  worker_thread+0xdf/0xf60
<4>[   46.164204]  ? process_one_work+0x1460/0x1460
<4>[   46.164209]  kthread+0x2cf/0x3c0
<4>[   46.164213]  ? _kthread_create_on_node+0xa0/0xa0
<4>[   46.164218]  ret_from_fork+0x3a/0x50
<4>[   46.164227] Code: e8 78 d9 cd e8 48 8b 35 cc 9e 47 00 49 c7 c0 c0 31 84 c0 b9 4d 01 00 00 48 c7 c2 e0 80 84 c0 48 c7 c7 0e bb 57 c0 e8 5d 4b df e8 <0f> 0b 48 c7 c1 c0 30 84 c0 ba 4e 01 00 00 48 c7 c6 e0 80 84 c0
<1>[   46.164368] RIP: i915_gem_object_unpin_from_display_plane+0x253/0x2f0 [i915] RSP: ffff8800624cfb48

Fixes: 85798ac9b35f ("drm/i915: Fail if we can't get a fence for gen2/3 tiled scanout")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180305103312.29492-1-chris@chris-wilson.co.uk

drm/i915/icl: remove port A/E lane sharing limitation.

Platforms before Gen11 were sharing lanes between port-A & port-E.
This limitation is no more there.

Changes since V1:
- optimize the code (Shashank/Jani)
- create helper function to get max lanes (ville)
Changes since V2:
- Include BIOS fail fix-up in same helper function (ville)
Changes since V3:
- remove confusing if/else (jani)
- group intel_encoder initialization

Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206060855.30026-1-mahesh1.kumar@intel.com

drm/i915: Update DRIVER_DATE to 20180305

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915/huc: Mark firmware as failed on auth failure

If we fail to authenticate HuC firmware, we should change
its load status to FAIL. While around, print HUC_STATUS
on firmware verification failure.

v2: keep the variables sorted by length (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302133718.1260-1-michal.wajdeczko@intel.com

drm/i915/uc: Introduce intel_uc_suspend|resume

We want to use higher level 'uc' functions as the main entry points to
the GuC/HuC code to hide some details and keep code layered.

While here, move call to disable_guc_interrupts after sending suspend
action to the GuC to allow it work also with CTB as comm mechanism.

v2: update commit msg (Sagar)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302111550.21328-1-michal.wajdeczko@intel.com

drm/i915/execlists: Split spinlock from its irq disabling side-effect

During reset/wedging, we have to clean up the requests on the timeline
and flush the pending interrupt state. Currently, we are abusing the irq
disabling of the timeline spinlock to protect the irq state in
conjunction to the engine's timeline requests, but this is accidental
and conflates the spinlock with the irq state. A baffling state of
affairs for the reader.

Instead, explicitly disable irqs over the critical section, and separate
modifying the irq state from the timeline's requests.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302143246.2579-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

drm/i915/execlists: Move irq state manipulation inside irq disabled region

Although this state (execlists->active and engine->irq_posted) itself is
not protected by the engine->timeline spinlock, it does conveniently
ensure that irqs are disabled. We can use this to protect our
manipulation of the state and so ensure that the next IRQ to arrive sees
consistent state and (hopefully) ignores the reset engine.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302131246.22036-1-chris@chris-wilson.co.uk

drm/i915: Suspend submission tasklets around wedging

After staring hard at sequences like

[   28.199013]  systemd-1       2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?]
[   28.199095]  systemd-1       2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1
[   28.199177]  systemd-1       2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024
[   28.199258]  systemd-1       2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0
[   28.199340]  gem_eio-829     1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.1, seqno=1, prio=0
[   28.199421]   <idle>-0       2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?]
[   28.199503]   <idle>-0       2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
[   28.199585]  gem_eio-829     1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]:  ctx=3.1, seqno=2, prio=0
[   28.199667]  gem_eio-829     1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.2, seqno=1, prio=0
[   28.199749]   <idle>-0       2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?]
[   28.199830]   <idle>-0       2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1
[   28.199912]   <idle>-0       2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0
[   28.199994]  gem_eio-829     2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?]
[   28.200096]  gem_eio-829     2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5
[   28.200178]  gem_eio-829     2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0
[   28.200260]  gem_eio-829     2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id)

the conclusion is that the only place where the ports are reset to zero,
is from engine->cancel_requests called during i915_gem_set_wedged().

The race is horrible as it results from calling set-wedged on active HW
(the GPU reset failed) and as such we need to be careful as the HW state
changes beneath us. Fortunately, it's the same scary conditions as
affect normal reset, so we can reuse the same machinery to disable state
tracking as we clobber it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Fixes: af7a8ffad9c5 ("drm/i915: Use rcu instead of stop_machine in set_wedged")
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302113324.23189-2-chris@chris-wilson.co.uk

drm/i915: Deduplicate the code to fill the aux message header

We have two instances of the code to fill out the header for the aux
message. Pull it into a small helper.

v2: Rebase due to txbuf[] changes

Cc: Sean Paul <seanpaul@chromium.org>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222212802.4826-1-ville.syrjala@linux.intel.com
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>

drm/i915: Keep the AKSV details in intel_dp_hdcp_write_an_aksv()

Let's try to keep the details on the AKSV stuff concentrated
in one place. So move the control bit and +5 data size handling
there.

v2: Increase txbuf[] to include the payload which intel_dp_aux_xfer()
will still load into the registers even though the hardware
will ignore it

Cc: Sean Paul <seanpaul@chromium.org>
Cc: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222212732.4665-1-ville.syrjala@linux.intel.com
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>

drm/i915: s/intel_dp_aux_ch/intel_dp_aux_xfer/

Rename intel_dp_aux_ch() to intel_dp_aux_xfer() to better convey
what it actually does.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222181036.15251-6-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #irc

drm/i915/gen9, gen10: Disable FBC on planes with a misaligned Y-offset

Enabling FBC on a plane having a Y-offset that isn't divisible by 4 may
cause pipe FIFO underruns and flickers, so disable FBC on such a config.

I tried the followings to work around the issue:
- enable each HW work around in ILK_DPFC_CHICKEN
- disable each compression algorithm in ILK_DPFC_CONTROL
- disable low-power watermarks
None of the above got rid of the problem. I haven't found this issue in
the Bspec/WA database either.

Besides the igt testcase below (yet to be merged) an easy way to
reproduce the issue is to enable a plane with FBC and a plane Y-offset
not aligned to 4 and then just enable/disable FBC in a loop, keeping the
plane enabled.

I could trigger the problem on BXT/GLK/SKL/CNL, so assume for now that it's
only present on GEN9 and GEN10.

v2: (Ville)
- Run the test/apply the WA on CNL as well.
- Use IS_GEN() instead of INTEL_GEN().
- Fix spelling.

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Testcase: igt/kms_plane/plane-clipping-pipe-A-planes
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180301134457.13974-1-imre.deak@intel.com

drm/i915: Wedged engine mask makes more sense in hex

In decimal its just a weird big number, while in hex can actually log
which engines were requested to be wedged.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180228171844.20006-1-tvrtko.ursulin@linux.intel.com

drm/i915/uc: Make GuC/HuC fw fetch and loading functions/file structure symmetric

GuC load function is named intel_guc_fw_upload() and HuC load function is
named intel_huc_init_hw(). Make them consistent intel_*_fw_upload. Also
move HuC fw loading functions and declarations to separate files
intel_huc_fw.c|h like GuC.

While at this, do below changes
1. Update kernel-doc comment for intel_*_fw_upload() functions
2. s/huc_ucode_xfer/huc_fw_xfer
3. Introduce intel_huc_fw_init_early()

v2: Changed patch to update HuC functions instead of changing
guc_fw_upload and update file structure. (Michal Wajdeczko)

v3: Added SPDX License identifier to huc_fw.c|h. (Michal Wajdeczko)

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1519922745-25441-1-git-send-email-sagar.a.kamble@intel.com

drm/i915: Check for I915_MODE_FLAG_INHERITED before drm_atomic_helper_check_modeset

Moving the check upwards will mean we we no longer have to add planes
and connectors manually, because everything is handled correctly by
drm_atomic_helper_check_modeset() as intended.

[applied with whitespace changes to make sparse happy]
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221092808.30060-1-maarten.lankhorst@linux.intel.com

drm/i915: Replace open-coded wait-for loop

Now that we can pass arbitrary commands into the base __wait_for()
macro, we can reimplement the open-coded wait-for inside
i915_gem_idle_work_handler() using the new macro. This means that instead
of using ktime, we now use jiffies, and benefit from the exponential sleep
backoff that allows a fast response if the HW settles quickly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180301103338.5380-1-chris@chris-wilson.co.uk

drm/i915/perf: fix perf stream opening lock

We're seeing on CI that some contexts don't have the programmed OA
period timer that directs the OA unit on how often to write reports.

The issue is that we're not holding the drm lock from when we edit the
context images down to when we set the exclusive_stream variable. This
leaves a window for the deferred context allocation to call
i915_oa_init_reg_state() that will not program the expected OA timer
value, because we haven't set the exclusive_stream yet.

v2: Drop need_lock from gen8_configure_all_contexts() (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 701f8231a2f ("drm/i915/perf: prune OA configs")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102254
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103715
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103755
Link: https://patchwork.freedesktop.org/patch/msgid/20180301110613.1737-1-lionel.g.landwerlin@intel.com
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.14+

drm/i915/icl: Interrupt handling

v2: Rebase.

v3:
  * Remove DPF, it has been removed from SKL+.
  * Fix -internal rebase wrt. execlists interrupt handling.

v4: Rebase.

v5:
  * Updated for POR changes. (Daniele Ceraolo Spurio)
  * Merged with irq handling fixes by Daniele Ceraolo Spurio:
      * Simplify the code by using gen8_cs_irq_handler.
      * Fix interrupt handling for the upstream kernel.

v6:
  * Remove early bringup debug messages (Tvrtko)
  * Add NB about arbitrary spin wait timeout (Tvrtko)

v7 (from Paulo):
  * Don't try to write RO bits to registers.
  * Don't check for PCH types that don't exist. PCH interrupts are not
    here yet.

v9:
  * squashed in selector and shared register handling (Daniele)
  * skip writing of irq if data is not valid (Daniele)
  * use time_after32 (Chris)
  * use I915_MAX_VCS and I915_MAX_VECS (Daniele)
  * remove fake pm interrupt handling for later patch (Mika)

v10:
  * Direct processing of banks. clear banks early (Chris)
  * remove poll on valid bit, only clear valid bit (Mika)
  * use raw accessors, better naming (Chris)

v11:
  * adapt to raw_reg_[read|write]
  * bring back polling the valid bit (Daniele)

v12:
  * continue if unset intr_dw (Daniele)
  * comment the usage of gen8_de_irq_handler bits (Daniele)

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180228101153.7224-2-mika.kuoppala@linux.intel.com

drm/i915/icl: Prepare for more rings

Gen11 will add more VCS and VECS rings so prepare the
infrastructure to support that.

Bspec: 7021

v2: Rebase.
v3: Rebase.
v4: Rebase.
v5: Rebase.
v6:
  - Update for POR changes. (Daniele Ceraolo Spurio)
  - Add provisional guc engine ids - to be checked and confirmed.
v7:
  - Rebased.
  - Added the new ring masks.
  - Added the new HW ids.
v8:
  - Introduce I915_MAX_VCS/VECS to avoid magic numbers (Michal)

v9: increase MAX_ENGINE_INSTANCE to 3

Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180228101153.7224-1-mika.kuoppala@linux.intel.com

Merge drm-next into drm-intel-next-queued (this time for real)

To pull in the HDCP changes, especially wait_for changes to drm/i915
that Chris wants to build on top of.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

drm/i915/dp: Add HBR3 rate (8.1 Gbps) to dp_rates array

dp_rates[] array is a superset of all the link rates supported
by sink devices. DP 1.3 specification adds HBR3 (8.1Gbps) link rate
to the set of link rates supported by sink. This patch adds this rate
to dp_rates[] array that gets used to populate the sink_rates[]
array limited by max rate obtained from DP_MAX_LINK_RATE DPCD register.

v2:
* Rebased on top of Jani's localized rates patch

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1519857110-26916-1-git-send-email-manasi.d.navare@intel.com

Merge tag 'drm-intel-next-2018-02-21' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Driver Changes:

- Lift alpha_support protection from Cannonlake (Rodrigo)
* Meaning the driver should mostly work for the hardware we had
  at our disposal when testing
* Used to be preliminary_hw_support
- Add missing Cannonlake PCI device ID of 0x5A4C (Rodrigo)
- Cannonlake port register fix (Mahesh)

- Fix Dell Venue 8 Pro black screen after modeset (Hans)
- Fix for always returning zero out-fence from execbuf (Daniele)
- Fix HDMI audio when no no relevant video output is active (Jani)
- Fix memleak of VBT data on driver_unload (Hans)

- Fix for KASAN found locking issue (Maarten)
- RCU barrier consolidation to improve igt/gem_sync/idle (Chris)
- Optimizations to IRQ handlers (Chris)
- vblank tracking improvements (64-bit resolution, PM) (Dhinakaran)
- Pipe select bit corrections (Ville)
- Reduce runtime computed device_info fields (Chris)
- Tune down some WARN_ONs to GEM_BUG_ON now that CI has good coverage (Chris)
- A bunch of kerneldoc warning fixes (Chris)

* tag 'drm-intel-next-2018-02-21' of git://anongit.freedesktop.org/drm/drm-intel: (113 commits)
  drm/i915: Update DRIVER_DATE to 20180221
  drm/i915/fbc: Use PLANE_HAS_FENCE to determine if the plane is fenced
  drm/i915/fbdev: Use the PLANE_HAS_FENCE flags from the time of pinning
  drm/i915: Move the policy for placement of the GGTT vma into the caller
  drm/i915: Also check view->type for a normal GGTT view
  drm/i915: Drop WaDoubleCursorLP3Latency:ivb
  drm/i915: Set the primary plane pipe select bits on gen4
  drm/i915: Don't set cursor pipe select bits on g4x+
  drm/i915: Assert that we don't overflow frontbuffer tracking bits
  drm/i915: Track number of pending freed objects
  drm/i915/: Initialise trans_min for skl_compute_transition_wm()
  drm/i915: Clear the in-use marker on execbuf failure
  drm/i915: Prune gen8_gt_irq_handler
  drm/i915: Track GT interrupt handling using the master iir
  drm/i915: Remove WARN_ONCE for failing to pm_runtime_if_in_use
  drm: intel_dpio_phy: fix kernel-doc comments at nested struct
  drm/i915: Release connector iterator on a digital port conflict.
  drm/i915/execlists: Remove too early assert
  drm/i915: Assert that we always complete a submission to guc/execlists
  drm: move read_domains and write_domain into i915
  ...

Merge tag 'tilcdc-4.17' of https://github.com/jsarha/linux into drm-next

drm/tilcdc changes to v4.17

* tag 'tilcdc-4.17' of https://github.com/jsarha/linux:
  drm/tilcdc: tilcdc_panel: Rename device from "panel" to "tilcdc-panel"
  drm/tilcdc: Add support for drm panels
  drm/tilcdc: panel: Use common error handling code in of_get_panel_info()
  drm/tilcdc: Delete an error message for a failed memory allocation in seven functions

drm/i915/dp: move link rate arrays where they're used

Localize link rate arrays by moving them to the functions where they're
used. Further clarify the distinction between source and sink
capabilities. Split pre and post Haswell arrays, and get rid of the
array size arithmetics. Use a direct rate value in the paranoia case of
no common rates find.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180227105911.4485-1-jani.nikula@intel.com

drm/i915: Consult aux_ch instead of port in ->get_aux_clock_divider()

While it seems totally unlikely that any system would mix a cpu/north
aux channel with a pch/south port (or vice versa) we should still
consult intel_dp->aux_ch rather than encoder->port when figuring out
which clock is actually used by the aux ch.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222181036.15251-5-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #irc

drm/i915: Don't deref request->ctx inside unlocked print_request()

Although we protect the request itself, we don't lock inside
intel_engine_dump() and so the request maybe retired as we peek into it.
One consequence is that the request->ctx may be freed before we
dereference it, leading to a use-after-free. Replace the hw_id we are
peeking from inside request->ctx with the request->fence.context, with
which we can still track from which context the request originated
(although to tie to HW reports requires a little more legwork, but is
good enough to follow the GEM traces).

[52640.729670] general protection fault: 0000 [#2] SMP
[52640.729694] Dumping ftrace buffer:
[52640.729701]    (ftrace buffer empty)
[52640.729705] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_\
temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep gha\
sh_clmulni_intel snd_hda_core snd_pcm mei_me mei i915 r8169 mii prime_numbers i2c_hid
[52640.729748] CPU: 2 PID: 4335 Comm: gem_exec_schedu Tainted: G     UD W        4.16.0-rc3+ #7
[52640.729759] Hardware name: Acer Aspire E5-575G/Ironman_SK  , BIOS V1.12 08/02/2016
[52640.729803] RIP: 0010:print_request+0x2b/0xb0 [i915]
[52640.729811] RSP: 0018:ffffc90001453c18 EFLAGS: 00010206
[52640.729820] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8801e0292d40 RCX: 0000000000000006
[52640.729829] RDX: ffffc90001453c60 RSI: ffff8801e0292d40 RDI: 0000000000000003
[52640.729838] RBP: ffffc90001453d80 R08: 0000000000000000 R09: 0000000000000001
[52640.729847] R10: ffffc90001453bd0 R11: ffffc90001453c73 R12: ffffc90001453c60
[52640.729856] R13: ffffc90001453d80 R14: ffff8801d5a683c8 R15: ffff8801e0292d40
[52640.729866] FS:  00007f1ee50548c0(0000) GS:ffff8801e8200000(0000) knlGS:0000000000000000
[52640.729876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[52640.729884] CR2: 00007f1ee5077000 CR3: 00000001d9411004 CR4: 00000000003606e0
[52640.729893] Call Trace:
[52640.729922]  intel_engine_print_registers+0x623/0x890 [i915]
[52640.729948]  intel_engine_dump+0x4a3/0x590 [i915]
[52640.729957]  ? seq_printf+0x3a/0x50
[52640.729977]  i915_engine_info+0xb8/0xe0 [i915]
[52640.729984]  ? drm_mode_gamma_get_ioctl+0xf0/0xf0
[52640.729990]  seq_read+0xd5/0x410
[52640.729997]  full_proxy_read+0x4b/0x70
[52640.730004]  __vfs_read+0x1e/0x120
[52640.730009]  ? do_sys_open+0x134/0x220
[52640.730015]  ? kmem_cache_free+0x174/0x2b0
[52640.730021]  vfs_read+0xa1/0x150
[52640.730026]  SyS_read+0x40/0xa0
[52640.730032]  do_syscall_64+0x65/0x1a0
[52640.730038]  entry_SYSCALL_64_after_hwframe+0x42/0xb7

Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180228094732.28462-1-chris@chris-wilson.co.uk