git.openwrt.org Git - openwrt/staging/blogic.git/log

drm/i915/dsc: Add missing _MMIO() from PPS registers

This patch fixes the commit -
<2efbb2f099fb> ("i915/dp/dsc: Add DSC PPS register definitions"),
which did not have _MMIO() for DSCA and DSCC.

v2: Fix typos. (manasi)

v3: Change the commit message (Rodrigo)

Cc: Rodrigi Vivi <rodrigo.vivi@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1532122962-9068-1-git-send-email-anusha.srivatsa@intel.com

drm/i915: Fix psr sink status report.

First of all don't try to read dpcd if PSR is not even supported.

But also, if read failed return -EIO instead of reporting via a
backchannel.

v2: fix dev_priv: At this level m->private is the connector. (CI/DK)
don't convert dpcd read errors to EIO. (DK)

Fixes: 5b7b30864d1d ("drm/i915/psr: Split sink status into a separate debugfs node")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180720003155.16290-1-rodrigo.vivi@intel.com

drm/i915: Remove unused "ret" variable.

Just a small clean-up with no functional change, only
removing a variable that is never actually used.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: <nathan.d.ciobanu@linux.intel.com>
Reviewed-by: Nathan Ciobanu <nathan.d.ciobanu@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719234217.7855-1-rodrigo.vivi@intel.com

drm/i915: Only force GGTT coherency w/a on required chipsets

Not all chipsets have an internal buffer delaying the visibility of
writes via the GGTT being visible by other physical paths, but we use a
very heavy workaround for all. We only need to apply that workarounds to
the chipsets we know suffer from the delay and the resulting coherency
issue.

Similarly, the same inconsistent coherency fouls up our ABI promise that
a write into a mmap_gtt is immediately visible to others. Since the HW
has made that a lie, let userspace know when that contract is broken.
(Not that userspace would want to use mmap_gtt on those chipsets for
other performance reasons...)

Testcase: igt/drv_selftest/live_coherency
Testcase: igt/gem_mmap_gtt/coherency
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100587
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180720101910.11153-1-chris@chris-wilson.co.uk

drm/i915: Suppress assertion for i915_ggtt_disable_guc

Another step in the drv_module_reload fault-injection saga, is that we
try to disable the guc twice. Probably. It's a little unclear exactly
what is going on in the unload sequence that catches us out, so for the
time being suppress the assertion to get the test re-enabled.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180720095144.5885-1-chris@chris-wilson.co.uk

drm/i915/icl: compute the TBT PLL registers

Use the hardcoded tables provided by our spec.

v2:
- SSC stays disabled.
- Use intel_port_is_tc().

Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711215909.23945-2-paulo.r.zanoni@intel.com

drm/i915: Fix assert_plane() warning on bootup with external display

On KBL, WHL RVPs, booting up with an external display connected, triggers
below warning, when the BiOS brings up the external display too.
This warning is not seen during hotplug.

[    3.615226] ------------[ cut here ]------------
[    3.619829] plane 1A assertion failure (expected on, current off)
[    3.632039] WARNING: CPU: 2 PID: 354 at drivers/gpu/drm/i915/intel_display.c:1294 assert_plane+0x71/0xbb
[    3.633920] iwlwifi 0000:00:14.3: loaded firmware version 38.c0e03d94.0 op_mode iwlmvm
[    3.647157] Modules linked in: iwlwifi cfg80211 btusb btrtl btbcm btintel bluetooth ecdh_generic
[    3.647163] CPU: 2 PID: 354 Comm: frecon Not tainted 4.17.0-rc7-50176-g655af12d39c2 #3
[    3.647165] Hardware name: Intel Corporation CoffeeLake Client Platform/WhiskeyLake U DDR4 ERB, BIOS CNLSFWR1.R00.X140.B00.1804040304 04/04/2018
[    3.684509] RIP: 0010:assert_plane+0x71/0xbb
[    3.764451] Call Trace:
[    3.766888]  intel_atomic_commit_tail+0xa97/0xb77
[    3.771569]  intel_atomic_commit+0x26a/0x279
[    3.771572]  drm_atomic_helper_set_config+0x5c/0x76
[    3.780670]  __drm_mode_set_config_internal+0x66/0x109
[    3.780672]  drm_mode_setcrtc+0x4c9/0x5cc
[    3.780674]  ? drm_mode_getcrtc+0x162/0x162
[    3.789774]  ? drm_mode_getcrtc+0x162/0x162
[    3.798108]  drm_ioctl_kernel+0x8d/0xe4
[    3.801926]  drm_ioctl+0x27d/0x368
[    3.805311]  ? drm_mode_getcrtc+0x162/0x162
[    3.805314]  ? selinux_file_ioctl+0x14e/0x199
[    3.805317]  vfs_ioctl+0x21/0x2f
[    3.813812]  do_vfs_ioctl+0x491/0x4b4
[    3.813813]  ? security_file_ioctl+0x37/0x4b
[    3.813816]  ksys_ioctl+0x55/0x75
[    3.820672]  __x64_sys_ioctl+0x1a/0x1e
[    3.820674]  do_syscall_64+0x51/0x5f
[    3.820678]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    3.828221] RIP: 0033:0x7b5e04953967
[    3.835504] RSP: 002b:00007fff2eafb6f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[    3.835505] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007b5e04953967
[    3.835505] RDX: 00007fff2eafb730 RSI: 00000000c06864a2 RDI: 000000000000000f
[    3.835506] RBP: 00007fff2eafb720 R08: 0000000000000000 R09: 0000000000000000
[    3.835507] R10: 0000000000000070 R11: 0000000000000246 R12: 000000000000000f
[    3.879988] R13: 000056bc9dd7d210 R14: 00007fff2eafb730 R15: 00000000c06864a2
[    3.887081] Code: 48 c7 c7 06 71 a5 be 84 c0 48 c7 c2 06 fd a3 be 48 89 f9 48 0f 44 ca 84 db 48 0f 45 d7 48 c7 c7 df d3 a4 be 31 c0 e8 af a0 c0 ff <0f> 0b eb 2b 48 c7 c7 06 fd a3 be 84 c0 48 c7 c2 06 71 a5 be 48
[    3.905845] WARNING: CPU: 2 PID: 354 at drivers/gpu/drm/i915/intel_display.c:1294 assert_plane+0x71/0xbb
[    3.920964] ---[ end trace dac692f4ac46391a ]---

The warning is seen when mode_setcrtc() is called for pipeB
during bootup and before we get a mode_setcrtc() for pipeA,
while doing update_crtcs() in intel_atomic_commit_tail().
Now since, plane1A is still active after commit, update_crtcs()
is done for pipeA and eventually update_plane() for plane1A.

intel_plane_state->ctl for plane1A is not updated since set_modecrtc() is
called for pipeB. So intel_plane_state->ctl for plane 1A will be 0x0.
So doing an update_plane() for plane1A, will result in clearing
PLANE_CTL_ENABLE bit, and hence the warning.

To fix this warning, force all active planes to recompute their states
in probe.

Changes in v8:
- Actually add Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>

Changes in v7:
- Move call to intel_initial_commit() after sanitize_watermarks()
  Otherwise the plane update will still consult potentially bogus
  watermarks we read out from the hardware. (Ville)
- Carry Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
  from v6

Changes in v6:
- Handle EDEADLK for drm_atomic_get_crtc_state() and
  drm_atomic_add_affected_planes()
- Remove optimization of calling intel_initial_commit()
  only when there is more than one active pipe in probe.
- Avoid using intel_ types.

Changes in v5:
- Drop drm_modeset_lock_all_ctx() since locks will be taken later.

Changes in v4:
- Handle locking in intel_initial_commit()
- Move the for loop inside intel_initial_commit() so that
  drm_atomic_commit() is called only once
- Call intel_initial_commit() only for more than one active crtc on boot.
- Save the return value of intel_initial_commit() and print a message in
  case of an error

Changes in v3:
- Add comments

Changes in v2:
- Force all planes to recompute their states.(Ville Syrjälä)
- Update the commit message

Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1530902250-44583-1-git-send-email-azhar.shaikh@intel.com

drm/i915/gtt: Full ppgtt everywhere, no excuses

We believe we have all the kinks worked out, even for the early
Valleyview devices, for whom we currently disable all ppgtt.

References: 62942ed7279d ("drm/i915/vlv: disable PPGTT on early revs v3")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717095751.1034-2-chris@chris-wilson.co.uk

drm/i915/gtt: Enable full-ppgtt by default everywhere

We should we have all the kinks worked out and full-ppgtt now works
reliably on gen7 (Ivybridge, Valleyview/Baytrail and Haswell). If we can
let userspace have full control over their own ppgtt, it makes softpinning
far more effective, in turn making GPU dispatch far more efficient by
virtue of better mm segregation. On the other hand, switching over to a
different GTT for every client does incur noticeable overhead, but only
for very lightweight tasks.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717095751.1034-1-chris@chris-wilson.co.uk

drm/i915: Update DRIVER_DATE to 20180719

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Remove intel_panel_detect()

With neither LVDS or eDP no longer using intel_panel_detect() we can
kill it, and the accompanying modparam.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717174216.22252-3-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Assume eDP is always connected

We never registered any kind of lid notifier for eDP, so looking at the
lid status is pretty much bonkers. Let's just consider eDP always
connected instead.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717174216.22252-2-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Nuke the LVDS lid notifier

We broke the LVDS notifier resume thing in (presumably) commit
e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.") as
we no longer duplicate the current state in the LVDS notifier and
thus we never resume it properly either.

Instead of trying to fix it again let's just kill off the lid
notifier entirely. None of the machines tested thus far have
apparently needed it. Originally the lid notifier was added to
work around cases where the VBIOS was clobbering some of the
hardware state behind the driver's back, mostly on Thinkpads.
We now have a few report of Thinkpads working just fine without
the notifier. So maybe it was misdiagnosed originally, or
something else has changed (ACPI video stuff perhaps?).

If we do end up finding a machine where the VBIOS is still causing
problems I would suggest that we first try setting various bits in
the VBIOS scratch registers. There are several to choose from that
may instruct the VBIOS to steer clear.

With the notifier gone we'll also stop looking at the panel status
in ->detect().

v2: Nuke enum modeset_restore (Rodrigo)

Cc: stable@vger.kernel.org
Cc: Wolfgang Draxinger <wdraxinger.maillist@draxit.de>
Cc: Vito Caputo <vcaputo@pengaru.com>
Cc: kitsunyan <kitsunyan@airmail.cc>
Cc: Joonas Saarinen <jza@saunalahti.fi>
Tested-by: Vito Caputo <vcaputo@pengaru.com> # Thinkapd X61s
Tested-by: kitsunyan <kitsunyan@airmail.cc> # ThinkPad X200
Tested-by: Joonas Saarinen <jza@saunalahti.fi> # Fujitsu Siemens U9210
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105902
References: https://lists.freedesktop.org/archives/intel-gfx/2018-June/169315.html
References: https://bugs.freedesktop.org/show_bug.cgi?id=21230
Fixes: e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717174216.22252-1-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/execlists: Move the assertion we have the rpm wakeref down

There's a race between idling the engine and finishing off the last
tasklet (as we may kick the tasklets after declaring an individual
engine idle). However, since we do not need to access the device until
we try to submit to the ELSP register (processing the CSB just requires
normal CPU access to the HWSP, and when idle we should not need to
submit!) we can defer the assertion unto that point. The assertion is
still useful as it does verify that we do hold the longterm GT wakeref
taken from request allocation until request completion.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107274
Fixes: 9512f985c32d ("drm/i915/execlists: Direct submission of new requests (avoid tasklet/ksoftirqd)")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719075029.28643-1-chris@chris-wilson.co.uk

drm/i915: Handle recursive shrinker for vma->last_active allocation

If we call into the shrinker for direct relcaim inside kmalloc, it will
retire the requests. If we retire the vma->last_active while processing a
new i915_vma_move_to_active() we can upset the delicate bookkeeping
required for the cache. After the possible invocation of the shrinker, we
need to double check the vma->last_active is still valid.

Fixes: 8b293eb53a7d ("drm/i915: Track the last-active inside the i915_vma")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105600#c39
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180719072206.16015-1-chris@chris-wilson.co.uk

drm/i915/guc: Keep guc submission permanently engaged

We make a decision at module load whether to use the GuC backend or not,
but lose that setup across set-wedge. Currently, the guc doesn't
override the engine->set_default_submission hook letting execlists sneak
back in temporarily on unwedging leading to an unbalanced park/unpark.

v2: Remove comment about switching back temporarily to execlists on
guc_submission_disable(). We currently only call disable on shutdown,
and plan to also call disable before suspend and reset, in which case we
will either restore guc submission or mark the driver as wedged, making
the reset back to execlists pointless.
v3: Move reset.prepare across

Fixes: 63572937cebf ("drm/i915/execlists: Flush pending preemption events during reset")
Testcase: igt/drv_module_reload/basic-reload-inject
Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717202932.1423-1-chris@chris-wilson.co.uk

i915/dp/dsc: Add Rate Control Range Parameter Registers

RC model has these parameters that correspond with each of
15 ranges of RC buffer threshold value in the RC model.
The three elements are range_min_qp, range_max_qp and
range_bpg_offset.

Add the Rate Control range values for eDP/MIPI and DP case.
The actual values are calculated usung a helper function.
This patch adds the shifts to registers where the value will
be written during atomic commit.

v2:
- Use _MMIO_PIPE() instead of _MMIO(_PICK()) (Manasi)
- Combine shifts (Manasi)

Cc: Jose Souza <jose.souza@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1531861861-10950-4-git-send-email-anusha.srivatsa@intel.com

i915/dp/dsc: Add Rate Control Buffer Threshold Registers

Add register defines and shifts that control the RC buffer threshold
between encoder and decoder for eDP/MIPI and DP cases.

The actual values are calculated usung a helper function.
This patch adds the shifts to registers where the value will
be written during atomic commit.

v2:
- Use _MMIO_PIPE() instead of _MMIO_(_PICK()) (Manasi)
- Combine shifts (Manasi)

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1531861861-10950-3-git-send-email-anusha.srivatsa@intel.com

i915/dp/dsc: Add DSC PPS register definitions

Display Stream Compression(DSC) has a set of Picture
Parameter Set(PPS) components that the encoder must
communicate to the decoder.

This patch adds register definitions to
the PPS parameters for eDP/MIPI case and Display Port.

v2:
- Use _MMIO_PIPE instead of _MMIO(_PICK()). (Manasi)
- Use DSC constants as arguments. (Manasi)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1531861861-10950-2-git-send-email-anusha.srivatsa@intel.com

drm/i915/icl: Add VIDEO_DIP registers

The Picture Parameter Set metadata for DSC has to be sent
to the panel through secondary data packets. Add the error
correction registers, data registers and control registers
for the same.

The control registers for transcoders A and B are already
defined and will be reused for Icelake purpose. This patch adds
Control register for EDP and transcoder C apart from adding the
PPS data and error registers.

v2: reuse MMIO_TRANS2 for _PPS_DATA and _PPS_ECC.
The _MMIO_TRANS2(pipe, reg) macro definition takes care of the eDp case

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1531861861-10950-1-git-send-email-anusha.srivatsa@intel.com

drm/i915: Kill sink_crc for good

It was originally introduced following the VESA spec in order to validate PSR.

However we found so many issues around sink_crc that instead of helping PSR
development it only brought another layer of trouble to the table.

So, sink_crc has been a black whole for us in question of time, effort and hope.

First of the problems is that HW statement is clear: "Do not attempt to use
aux communication with PSR enabled". So the main reason behind sink_crc is
already compromised.

For a while we had hope on the aux-mutex could workaround this problem on SKL+
platforms, but that mutex was not reliable, not tested,
and we shouldn't use according to HW engineers.

Also, nor source, nor sink designed and implemented the sink_crc to be used like
we are trying to use here.

Well, the sink side of things is also apparently not prepared for this
case. Each panel that we tried seemed to have a different behavior with same
code and same source.

So, for all the time we lost on trying to ducktape all these different issues
I believe it is now time to move PSR to a more reliable validation.
Maybe not a perfect one as we dreamed for this sink_crc, but at least more
reliable.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705192528.30515-1-rodrigo.vivi@intel.com

drm/i915: Always retire residual requests before suspend

If the driver is wedged, we skip idling the GPU. However, we may still
have a few requests still not retired following the wedging (since they
will be waiting for a background worker trying to acquire struct_mutex).
As we hold the struct_mutex, always do a quick request retirement in
order to flush the wedged path.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107257
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717084121.28185-1-chris@chris-wilson.co.uk

drm/i915: Flush chipset caches after GGTT writes

Our I915g (early gen3, the oldest machine we have in the farm) is still
reporting occasional incoherency performing the following operations:

  1) write through GGTT (indirect write into memory)
  2) write through either CPU or WC (direct write into memory)
  3) read from GGTT (indirect read)

Instead of reporting the value from (2), the read from GGTT reports the
earlier value written via the GGTT. We have made sure that the writes are
flushed from the CPU (commit 3a32497f0dbe ("drm/i915/selftests: Provide
full mb() around clflush") and commit add00e6d896f ("drm/i915: Flush the
WCB following a WC write")), but still see the error, just less
frequently. The only remaining cache that might be affected here is a
chipset cache, so flush that as well.

Testcase: igt/drv_selftest/live_coherency #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717092655.28417-1-chris@chris-wilson.co.uk

drm/i915/selftests: Free the backing store between iterations

In the huge pages tests, we may have lots of objects being trapped on
the freelist as we hold the struct_mutex allowing the free worker no
opportunity to recover the backing store. We also have stricter
requirements and the desire for large contiguous pages, further
increasing the allocation pressure. To reduce the chance of running out
of memory, we could either drop the mutex and flush the free worker, or
we could release the backing store directly. We do the latter in this
patch for simplicity.

References: https://bugs.freedesktop.org/show_bug.cgi?id=107254
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180717082334.18774-1-chris@chris-wilson.co.uk

drm/i915/selftests: Exercise reset to break stuck GTT eviction

We must be able to reset the GPU while we are waiting on it to perform
an eviction (unbinding an active vma). So attach a spinning request to a
target vma and try and it evict it from a thread to see if that blocks
indefinitely.

v2: Add a wait for the thread to start just in case that takes more than
10ms...
v3: complete() not completion_done() to signal the completion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716134009.13143-1-chris@chris-wilson.co.uk

drm/i915/selftests: Force a preemption hang

Inject a failure into preemption completion to pretend as if the HW
didn't successfully handle preemption and we are forced to do a reset in
the middle.

v2: Wait for preemption, to force testing with the missed preemption.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716132154.12539-1-chris@chris-wilson.co.uk

drm/i915/execlists: Always clear preempt status on cancelling all

On reset/wedging, we cancel all pending replies from the HW and we also
want to cancel an outstanding preemption event. Since we use the same
function to cancel the pending replies for reset and for a preemption
event, we can simply clear the active tracking for all.

v2: Keep execlists_user_end() markup for wedging
v3: Move assignment to inline to hide the bare assignment.

Fixes: 60a943245413 ("drm/i915/execlists: Drop clear_gtiir() on GPU reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716125424.5715-1-chris@chris-wilson.co.uk

drm/i915/execlists: Disable submission tasklet upon wedging

If we declare the driver wedged before the GPU truly is, then we may see
the GPU complete some CS events following our cancellation. This leaves
us quite confused as we deleted all the bookkeeping and thus complain
about the inconsistent state.

We can just ignore the remaining events and let the GPU idle by not
feeding it, and so avoid trying to racily overwrite shared state. We
rely on there being a full GPU reset before unwedging, giving us the
opportunity to reset the shared state.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716080332.32283-4-chris@chris-wilson.co.uk

drm/i915: Remove pci private pointer after destroying the device private

On an aborted module load, we unwind and free our device private - but
we left a dangling pointer to our privates inside the pci_device. After
the attempted aborted unload, we may still get a call to i915_pci_remove()
when the module is removed, potentially chasing stale data.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716080332.32283-5-chris@chris-wilson.co.uk

drm/i915/selftests: Downgrade igt_timeout message

Give in, since CI continues to incorrectly insist that KERN_NOTICE is a
warning and flags the timeout message as unwanted spam. At first, the
intention was to use the message to indicate which tests might warrant
an extended run, but virtually all tests require a timeout so it is
simply not as interesting as first thought.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103667
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180716080332.32283-6-chris@chris-wilson.co.uk

drm/i915/guc: Disable rpm wakeref asserts in GuC irq handler

We're seeing "RPM wakelock ref not held during HW access" warning
otherwise. Since IRQs are synced for runtime suspend we can just disable
the wakeref asserts.

Reported-by: Marta Löfstedt <marta.lofstedt@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105710
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180714173703.7894-1-chris@chris-wilson.co.uk
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Drop clear_gtiir() on GPU reset

With the new CSB processing code, we are not vulnerable to delayed
delivery of a pre-reset interrupt as we use the CSB status pointers in
the HWSP to decide if we need to parse any CSB events and no longer need
to wait for the first post-reset interrupt to be assured that the CSB
mmio registers are valid.

The new icl code to clear registers has a nasty lock inversion:
[   57.409776] ======================================================
[   57.409779] WARNING: possible circular locking dependency detected
[   57.409783] 4.18.0-rc4-CI-CI_DII_1137+ #1 Tainted: G     U  W
[   57.409785] ------------------------------------------------------
[   57.409788] swapper/6/0 is trying to acquire lock:
[   57.409790] 000000004f304ee5 (&engine->timeline.lock/1){-.-.}, at: execlists_submit_request+0x2b/0x1a0 [i915]
[   57.409841]
               but task is already holding lock:
[   57.409844] 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915]
[   57.409869]
               which lock already depends on the new lock.

[   57.409872]
               the existing dependency chain (in reverse order) is:
[   57.409876]
               -> #2 (&(&rq->lock)->rlock#2){-.-.}:
[   57.409900]        notify_ring+0x2b2/0x480 [i915]
[   57.409922]        gen8_cs_irq_handler+0x39/0xa0 [i915]
[   57.409943]        gen11_irq_handler+0x2f0/0x420 [i915]
[   57.409949]        __handle_irq_event_percpu+0x42/0x370
[   57.409952]        handle_irq_event_percpu+0x2b/0x70
[   57.409956]        handle_irq_event+0x2f/0x50
[   57.409959]        handle_edge_irq+0xe7/0x190
[   57.409964]        handle_irq+0x67/0x160
[   57.409967]        do_IRQ+0x5e/0x120
[   57.409971]        ret_from_intr+0x0/0x1d
[   57.409974]        _raw_spin_unlock_irqrestore+0x4e/0x60
[   57.409979]        tasklet_action_common.isra.5+0x47/0xb0
[   57.409982]        __do_softirq+0xd9/0x505
[   57.409985]        irq_exit+0xa9/0xc0
[   57.409988]        do_IRQ+0x9a/0x120
[   57.409991]        ret_from_intr+0x0/0x1d
[   57.409995]        cpuidle_enter_state+0xac/0x360
[   57.409999]        do_idle+0x1f3/0x250
[   57.410004]        cpu_startup_entry+0x6a/0x70
[   57.410010]        start_secondary+0x19d/0x1f0
[   57.410015]        secondary_startup_64+0xa5/0xb0
[   57.410018]
               -> #1 (&(&dev_priv->irq_lock)->rlock){-.-.}:
[   57.410081]        clear_gtiir+0x30/0x200 [i915]
[   57.410116]        execlists_reset+0x6e/0x2b0 [i915]
[   57.410140]        i915_reset_engine+0x111/0x190 [i915]
[   57.410165]        i915_handle_error+0x11a/0x4a0 [i915]
[   57.410198]        i915_hangcheck_elapsed+0x378/0x530 [i915]
[   57.410204]        process_one_work+0x248/0x6c0
[   57.410207]        worker_thread+0x37/0x380
[   57.410211]        kthread+0x119/0x130
[   57.410215]        ret_from_fork+0x3a/0x50
[   57.410217]
               -> #0 (&engine->timeline.lock/1){-.-.}:
[   57.410224]        _raw_spin_lock_irqsave+0x33/0x50
[   57.410256]        execlists_submit_request+0x2b/0x1a0 [i915]
[   57.410289]        submit_notify+0x8d/0x124 [i915]
[   57.410314]        __i915_sw_fence_complete+0x81/0x250 [i915]
[   57.410339]        dma_i915_sw_fence_wake+0xd/0x20 [i915]
[   57.410344]        dma_fence_signal_locked+0x79/0x200
[   57.410368]        notify_ring+0x2ba/0x480 [i915]
[   57.410392]        gen8_cs_irq_handler+0x39/0xa0 [i915]
[   57.410416]        gen11_irq_handler+0x2f0/0x420 [i915]
[   57.410421]        __handle_irq_event_percpu+0x42/0x370
[   57.410425]        handle_irq_event_percpu+0x2b/0x70
[   57.410428]        handle_irq_event+0x2f/0x50
[   57.410432]        handle_edge_irq+0xe7/0x190
[   57.410436]        handle_irq+0x67/0x160
[   57.410439]        do_IRQ+0x5e/0x120
[   57.410445]        ret_from_intr+0x0/0x1d
[   57.410449]        cpuidle_enter_state+0xac/0x360
[   57.410453]        do_idle+0x1f3/0x250
[   57.410456]        cpu_startup_entry+0x6a/0x70
[   57.410460]        start_secondary+0x19d/0x1f0
[   57.410464]        secondary_startup_64+0xa5/0xb0
[   57.410466]
               other info that might help us debug this:

[   57.410471] Chain exists of:
                 &engine->timeline.lock/1 --> &(&dev_priv->irq_lock)->rlock --> &(&rq->lock)->rlock#2

[   57.410481]  Possible unsafe locking scenario:

[   57.410485]        CPU0                    CPU1
[   57.410487]        ----                    ----
[   57.410490]   lock(&(&rq->lock)->rlock#2);
[   57.410494]                                lock(&(&dev_priv->irq_lock)->rlock);
[   57.410498]                                lock(&(&rq->lock)->rlock#2);
[   57.410503]   lock(&engine->timeline.lock/1);
[   57.410506]
                *** DEADLOCK ***

[   57.410511] 4 locks held by swapper/6/0:
[   57.410514]  #0: 0000000074575789 (&(&dev_priv->irq_lock)->rlock){-.-.}, at: gen11_irq_handler+0x8a/0x420 [i915]
[   57.410542]  #1: 000000009b29b30e (rcu_read_lock){....}, at: notify_ring+0x1a/0x480 [i915]
[   57.410573]  #2: 00000000aad89594 (&(&rq->lock)->rlock#2){-.-.}, at: notify_ring+0x2b2/0x480 [i915]
[   57.410601]  #3: 000000009b29b30e (rcu_read_lock){....}, at: submit_notify+0x35/0x124 [i915]
[   57.410635]
               stack backtrace:
[   57.410640] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G     U  W         4.18.0-rc4-CI-CI_DII_1137+ #1
[   57.410644] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2222.A01.1805300339 05/30/2018
[   57.410650] Call Trace:
[   57.410652]  <IRQ>
[   57.410657]  dump_stack+0x67/0x9b
[   57.410662]  print_circular_bug.isra.16+0x1c8/0x2b0
[   57.410666]  __lock_acquire+0x1897/0x1b50
[   57.410671]  ? lock_acquire+0xa6/0x210
[   57.410674]  lock_acquire+0xa6/0x210
[   57.410706]  ? execlists_submit_request+0x2b/0x1a0 [i915]
[   57.410711]  _raw_spin_lock_irqsave+0x33/0x50
[   57.410741]  ? execlists_submit_request+0x2b/0x1a0 [i915]
[   57.410769]  execlists_submit_request+0x2b/0x1a0 [i915]
[   57.410774]  ? _raw_spin_unlock_irqrestore+0x39/0x60
[   57.410804]  submit_notify+0x8d/0x124 [i915]
[   57.410828]  __i915_sw_fence_complete+0x81/0x250 [i915]
[   57.410854]  dma_i915_sw_fence_wake+0xd/0x20 [i915]
[   57.410858]  dma_fence_signal_locked+0x79/0x200
[   57.410882]  notify_ring+0x2ba/0x480 [i915]
[   57.410907]  gen8_cs_irq_handler+0x39/0xa0 [i915]
[   57.410933]  gen11_irq_handler+0x2f0/0x420 [i915]
[   57.410938]  __handle_irq_event_percpu+0x42/0x370
[   57.410943]  handle_irq_event_percpu+0x2b/0x70
[   57.410947]  handle_irq_event+0x2f/0x50
[   57.410951]  handle_edge_irq+0xe7/0x190
[   57.410955]  handle_irq+0x67/0x160
[   57.410958]  do_IRQ+0x5e/0x120
[   57.410962]  common_interrupt+0xf/0xf
[   57.410965]  </IRQ>
[   57.410969] RIP: 0010:cpuidle_enter_state+0xac/0x360
[   57.410972] Code: 44 00 00 31 ff e8 84 93 91 ff 45 84 f6 74 12 9c 58 f6 c4 02 0f 85 31 02 00 00 31 ff e8 7d 30 98 ff e8 e8 0e 94 ff fb 4c 29 fb <48> ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7 ea b8 ff
[   57.411015] RSP: 0018:ffffc90000133e90 EFLAGS: 00000216 ORIG_RAX: ffffffffffffffdd
[   57.411023] RAX: ffff8804ae748040 RBX: 000000000002a97d RCX: 0000000000000000
[   57.411029] RDX: 0000000000000046 RSI: ffffffff82141263 RDI: ffffffff820f05a7
[   57.411035] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[   57.411041] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8229f078
[   57.411045] R13: ffff8804ab2adfa8 R14: 0000000000000000 R15: 0000000d5de092e3
[   57.411052]  do_idle+0x1f3/0x250
[   57.411055]  cpu_startup_entry+0x6a/0x70
[   57.411059]  start_secondary+0x19d/0x1f0
[   57.411064]  secondary_startup_64+0xa5/0xb0

The easiest remedy is to remove the defunct code.

Fixes: ff047a87cfac ("drm/i915/icl: Correctly clear lost ctx-switch interrupts across reset for Gen11")
References: fd8526e50902 ("drm/i915/execlists: Trust the CSB")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-3-chris@chris-wilson.co.uk

drm/i915: Do not short-circuit tasklets during reset

Inside intel_engine_is_idle(), we flush the tasklet to ensure that is
being run in a timely fashion (ksoftirqd has taught us to expect the
worst). However, if we are in the middle of reset, the HW may not yet be
ready to execute the submission tasklet and so we must respect the
disable flag.

Fixes: dd0cf235d81f ("drm/i915: Speed up idle detection by kicking the tasklets")
Testcase: igt/drv_selftest/live_hangcheck
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-2-chris@chris-wilson.co.uk

drm/i915/selftests: Include the start of each subtest in the GEM trace

Knowing the boundary of each subtest can be instrumental in digesting
the voluminous trace output and finding the critical piece of
information.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180713203529.1973-1-chris@chris-wilson.co.uk

drm/i915/guc: Protect against no desc-pool on premature shutdown

Hopefully the final hack to get guc fault-injection happy before we can
clean it up again, starting from a known good baseline...

[  383.017530] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
[  383.017556] Oops: 0000 [#1] PREEMPT SMP PTI
[  383.017566] CPU: 7 PID: 4725 Comm: drv_module_relo Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4485+ #1
[  383.017581] Hardware name: Micro-Star International Co., Ltd. MS-7B54/Z370M MORTAR (MS-7B54), BIOS 1.10 12/28/2017
[  383.017664] RIP: 0010:guc_stage_desc_pool_destroy+0x17/0xe0 [i915]
[  383.017674] Code: 59 a0 c6 05 02 59 18 00 01 e8 5e 01 c3 e0 eb b1 0f 1f 00 53 48 89 fb 48 81 c7 90 02 00 00 e8 60 64 45 e1 48 8b 83 80 02 00 00 <48> 8b 80 a0 00 00 00 48 8b 90 68 02 00 00 48 83 ea 01 48 81 fa ff
[  383.017771] RSP: 0018:ffffc900004bbdd0 EFLAGS: 00010282
[  383.017782] RAX: 0000000000000000 RBX: ffff88012ff41300 RCX: 0000000000000000
[  383.017794] RDX: 0000000000000000 RSI: ffffc900004bbd80 RDI: 0000000000000000
[  383.017805] RBP: ffff88012ff40000 R08: 00000000d876ee11 R09: 0000000000000000
[  383.017817] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012ff47770
[  383.017828] R13: ffff88012ff40068 R14: ffff880264392ef8 R15: ffffffffa0639950
[  383.017840] FS:  00007fb9c18c8980(0000) GS:ffff8802663c0000(0000) knlGS:0000000000000000
[  383.017853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  383.017864] CR2: 00000000000000a0 CR3: 00000001df6cc003 CR4: 00000000003606e0
[  383.017875] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  383.017887] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  383.017898] Call Trace:
[  383.017962]  intel_uc_fini+0x34/0xd0 [i915]
[  383.018020]  i915_gem_fini+0x5c/0x100 [i915]
[  383.018093]  i915_driver_unload+0xd2/0x110 [i915]
[  383.018150]  i915_pci_remove+0x10/0x20 [i915]
[  383.018165]  pci_device_remove+0x36/0xb0
[  383.018179]  device_release_driver_internal+0x185/0x250
[  383.018193]  driver_detach+0x35/0x70
[  383.018205]  bus_remove_driver+0x53/0xd0
[  383.018217]  pci_unregister_driver+0x25/0xa0
[  383.018232]  __se_sys_delete_module+0x162/0x210
[  383.018245]  ? do_syscall_64+0xd/0x190
[  383.018257]  do_syscall_64+0x55/0x190
[  383.018270]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  383.018282] RIP: 0033:0x7fb9c0f7c1b7
[  383.018290] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48
[  383.018408] RSP: 002b:00007fffa01c2aa8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  383.018425] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb9c0f7c1b7
[  383.018440] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000560b96856d48
[  383.018454] RBP: 0000560b96856ce0 R08: 0000560b96856d4c R09: 00007fffa01c2ae8
[  383.018468] R10: 00007fffa01c1aa4 R11: 0000000000000206 R12: 0000560b954f7470

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180713172658.14070-1-chris@chris-wilson.co.uk

drm/i915: Print the long_mask alongside the pin_mask

We're printing out which pins got a hotplug, so why not also print
out which pins detected the long pulse as opposed to a short pulse.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-9-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Pass hpd_pin to long_pulse_detect()

We're doing a pointless translation from hpd_pin to port simply for
passing the thing to long_pulse_detect(). Let's pass the hpd_pin
directly instead.

This removes the assumption that the hpd_pin and port always
match. The only other place where we make that assumption anymore
is intel_hpd_pin_default() and that's fine as it's what determines
the relationship between the two. If we ever get hardware where
the hpd pins are wired in more interesting ways it should be
trivial to handle from now on.

This should also fix the IS_CNL_WITH_PORT_F() case as that mapped
pin E back to port F and passed that to
spt_port_hotplug2_long_detect() which would always return false
for port F. Now that we pass in pin E directly it'll actually
do the right thing.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: cf53902f48c3 ("drm/i915/cnl: Add HPD support for Port F.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-7-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: s/int i/enum hpd_pin pin/

Use the enum hpd_pin type when talking about HPD pins, and rename the
variable from a very nondescript 'i' to 'pin', a name we already
use in other parts of the code.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-6-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Nuke dev_priv->irq_port[]

Instead of looping over ports and hpd_pins, let's loop over
the encoders when doing hotplug processing. And instead of
depending on dev_priv->irq_port[] to tell us whether the
encoder has the ->hpd_pulse() hook or not, we can just
check for that directly. So we can just nuke irq_port[] entirely.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-5-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Rewrite mst suspend/resume in terms of encoders

Rather than looping over all the ports and picking the encoder based on
the port, let's just loop over all the encoders instead. Gets rid of
some irq_port[] usage, which is a bit of an eye sore.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-4-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Introduce intel_encoder_is_dig_port()

Add intel_encoder_is_dig_port() to match intel_encoder_is_dp().

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-3-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915: Introduce for_each_intel_dp()

Add a convenience macro for iterating DP encoders.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705164357.28512-2-ville.syrjala@linux.intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/userptr: Enable read-only support on gen8+

On gen8 and onwards, we can mark GPU accesses through the ppGTT as being
read-only, that is cause any GPU write onto that page to be discarded
(not triggering a fault). This is all that we need to finally support
the read-only flag for userptr!

v2: Check default address space for read only support as a proxy for the
user context/ppgtt.

Testcase: igt/gem_userptr_blits/readonly*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712191430.9269-1-chris@chris-wilson.co.uk

drm/i915: Reject attempted pwrites into a read-only object

If the user created a read-only object, they should not be allowed to
circumvent the write protection using the pwrite ioctl.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-5-chris@chris-wilson.co.uk

drm/i915: Prevent writing into a read-only object via a GGTT mmap

If the user has created a read-only object, they should not be allowed
to circumvent the write protection by using a GGTT mmapping. Deny it.

Also most machines do not support read-only GGTT PTEs, so again we have
to reject attempted writes. Fortunately, this is known a priori, so we
can at least reject in the call to create the mmap (with a sanity check
in the fault handler).

v2: Check the vma->vm_flags during mmap() to allow readonly access.
v3: Remove VM_MAYWRITE to curtail mprotect()

Testcase: igt/gem_userptr_blits/readonly_mmap*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> #v1
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-4-chris@chris-wilson.co.uk

drm/i915/gtt: Disable read-only support under GVT

GVT is not propagating the PTE bits, and is always setting the
read-write bit, thus breaking read-only support.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-3-chris@chris-wilson.co.uk

drm/i915/gtt: Read-only pages for insert_entries on bdw+

Hook up the flags to allow read-only ppGTT mappings for gen8+

v2: Include a selftest to check that writes to a readonly PTE are
dropped
v3: Don't duplicate cpu_check() as we can just reuse it, and even worse
don't wholesale copy the theory-of-operation comment from igt_ctx_exec
without changing it to explain the intention behind the new test!
v4: Joonas really likes magic mystery values

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-2-chris@chris-wilson.co.uk

drm/i915/gtt: Add read only pages to gen8_pte_encode

We can set a bit inside the ppGTT PTE to indicate a page is read-only;
writes from the GPU will be discarded. We can use this to protect pages
and in particular support read-only userptr mappings (necessary for
importing PROT_READ vma).

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-1-chris@chris-wilson.co.uk

drm/i915/glk: Add Quirk for GLK NUC HDMI port issues.

On GLK NUC platforms the HDMI retiming buffer needs additional disabled
time to correctly sync to a faster incoming signal.

When measured on a scope the highspeed lines of the HDMI clock turn off
for ~400uS during a normal resolution change. The HDMI retimer on the
GLK NUC appears to require at least a full frame of quiet time before a
new faster clock can be correctly sync'd. Wait 100ms due to msleep
inaccuracies while waiting for a completed frame. Add a quirk to the
driver for GLK boards that use ITE66317 HDMI retimers.

V2: Add more devices to the quirk list
V3: Delay increased to 100ms, check to confirm crtc type is HDMI.
V4: crtc type check extended to include _DDI and whitespace fixes
v5: Fix white spaces, remove the macro for delay. Revert the crtc type
check introduced in v4.

Cc: Imre Deak <imre.deak@intel.com>
Cc: <stable@vger.kernel.org> # v4.14+
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105887
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Tested-by: Daniel Scheller <d.scheller.oss@gmail.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710200205.1478-1-radhakrishna.sripada@intel.com

drm/i915/guc: Protect against NULL client dereference in error path

After aborting a module load, we may try and disable guc before we have
finished setting it. Long term plan is to ensure perfect onion unwind,
but in the short term we want to fix the oops to re-enable
drv_module_reload.

[  317.401239] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[  317.401279] Oops: 0000 [#1] PREEMPT SMP PTI
[  317.401294] CPU: 5 PID: 4275 Comm: drv_module_relo Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4476+ #1
[  317.401317] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 3610 03/29/2018
[  317.401440] RIP: 0010:unreserve_doorbell+0x0/0x80 [i915]
[  317.401454] Code: bb e0 48 8b 35 21 4d 18 00 49 c7 c0 a8 e5 62 a0 b9 cc 00 00 00 48 c7 c2 d8 41 5f a0 48 c7 c7 c9 f6 53 a0 e8 a2 3d c2 e0 0f 0b <0f> b7 47 30 66 3d 00 01 74 20 48 8b 57 18 48 0f a3 82 40 05 00 00
[  317.401602] RSP: 0018:ffffc900003d3da0 EFLAGS: 00010246
[  317.401619] RAX: ffffffff8223b300 RBX: 0000000000000000 RCX: 0000000000000000
[  317.401636] RDX: 0000001fffffffc0 RSI: ffff880219f115f0 RDI: 0000000000000000
[  317.401654] RBP: ffff880219f11838 R08: 0000000000000000 R09: 0000000000000000
[  317.401671] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880219f11300
[  317.401689] R13: ffff880219f17770 R14: ffff88022c1daef8 R15: ffffffffa06ae950
[  317.401707] FS:  00007febf77a9980(0000) GS:ffff880236d40000(0000) knlGS:0000000000000000
[  317.401727] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  317.401743] CR2: 0000000000000030 CR3: 0000000222072003 CR4: 00000000003606e0
[  317.401761] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  317.401779] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  317.401796] Call Trace:
[  317.401894]  guc_client_free+0x9/0x130 [i915]
[  317.401993]  intel_guc_submission_fini+0x50/0x90 [i915]
[  317.402092]  intel_uc_fini+0x34/0xd0 [i915]
[  317.402179]  i915_gem_fini+0x5c/0x100 [i915]
[  317.402249]  i915_driver_unload+0xd2/0x110 [i915]
[  317.402321]  i915_pci_remove+0x10/0x20 [i915]
[  317.402341]  pci_device_remove+0x36/0xb0
[  317.402357]  device_release_driver_internal+0x185/0x250
[  317.402374]  driver_detach+0x35/0x70
[  317.402390]  bus_remove_driver+0x53/0xd0
[  317.402404]  pci_unregister_driver+0x25/0xa0
[  317.402423]  __se_sys_delete_module+0x162/0x210
[  317.402439]  ? do_syscall_64+0xd/0x190
[  317.402454]  do_syscall_64+0x55/0x190
[  317.402470]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  317.402485] RIP: 0033:0x7febf6e5d1b7
[  317.402496] Code: 73 01 c3 48 8b 0d d1 8c 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a1 8c 2c 00 f7 d8 64 89 01 48
[  317.402646] RSP: 002b:00007fffb5e72798 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  317.402667] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007febf6e5d1b7
[  317.402686] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000562da1addd98
[  317.402703] RBP: 0000562da1addd30 R08: 0000562da1addd9c R09: 00007fffb5e727d8
[  317.402721] R10: 00007fffb5e71794 R11: 0000000000000206 R12: 0000562da0ff6470

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712202027.19801-1-chris@chris-wilson.co.uk

drm/i915: Update DRIVER_DATE to 20180712

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/psr: Remove few mod parameters option.

Reduce the module parameter to enable or disable.

The link stand by vs full link off was used only once.

And it was actually masking another bug fixed by commit
'84bb2916a683 ("drm/i915/psr: Check for SET_POWER_CAPABLE
bit at PSR init time.")'

So, let's remove these options for now. End goal is to
fully remove the mod param, moving it to a debugfs
interface in upcoming patches.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Tarun Vyas <tarun.vyas@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712052715.8177-1-rodrigo.vivi@intel.com

drm/i915/psr: Remove useless function calls.

PSR is no longer supported on VLV/CHV so this is just dead code.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711220050.21809-1-rodrigo.vivi@intel.com

drm/i915/psr: Split sink status into a separate debugfs node

This allows to read i915_edp_psr_status from tests without triggering
any AUX communication. Take this opportunity to move this under the
eDP-1 connector directory as the status we print is of the sink.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180705003121.2478-1-dhinakaran.pandiyan@intel.com

drm/i915: Use crtc_state->has_psr instead of CAN_PSR for pipe update

In commit "drm/i915: Wait for PSR exit before checking for vblank
evasion", the idea was to limit the PSR IDLE checks when PSR is
actually supported. While CAN_PSR does do that check, it doesn't
applies on a per-crtc basis. crtc_state->has_psr is a more granular
check that only applies to pipe(s) that have PSR enabled.

Without the has_psr check, we end up waiting on the eDP transcoder's
PSR_STATUS register irrespective of whether the pipe being updated is
driving it or not.

v2: Remove unnecessary parantheses, make checkpatch happy.

v3: Move the has_psr check to intel_psr_wait_for_idle and commit
message changes (DK).

v4: Derive dev_priv from intel_crtc_state (DK)

v5: Commit message changes to reflect the HW behavior (DK)

Fixes: a608987970b9 ("drm/i915: Wait for PSR exit before checking for vblank evasion")
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Tarun Vyas <tarun.vyas@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712053323.26266-1-tarun.vyas@intel.com

drm/i915/gmbus: Enable burst read

Support for Burst read in HW is added for HDCP2.2 compliance
requirement.

This patch enables the burst read for all the gmbus read of more than
511Bytes, on capable platforms.

v2:
  Extra line is removed.
v3:
  Macro is added for detecting the BURST_READ Support [Jani]
  Runtime detection of the need for burst_read [Jani]
  Calculation enhancement.
v4:
  GMBUS0 reg val is passed from caller [ville]
  Removed a extra var [ville]
  Extra brackets are removed [ville]
  Implemented the handling of 512Bytes Burst Read.
v5:
  Burst read max length is fixed at 767Bytes [Ville]
v6:
  Collecting the received reviewed-by.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1530192889-5789-3-git-send-email-ramalingam.c@intel.com

drm/i915/gmbus: Increase the Bytes per Rd/Wr Op

GMBUS HW supports 511Bytes as Max Bytes per single RD/WR op. Instead of
enabling the 511Bytes per RD/WR cycle on legacy platforms for no
absolute ROIs, this change allows the max bytes per op upto 511Bytes
from Gen9 onwards.

v2:
  No Change.
v3:
  Inline function for max_xfer_size and renaming of the macro.[Jani]
v4:
  Extra brackets removed [ville]
  Commit msg is modified.
v5:
  Collecting the Reviewed-By received.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1530192889-5789-2-git-send-email-ramalingam.c@intel.com

drm/i915/selftests: Fixup GuC FW negative test

Since:
0d4b78b3d2c0 ("drm/i915/guc: Assert we have the doorbell before setting it up")

We have asserts in GuC doorbell related functions, which is a good thing.
Unfortunately, we were using those to check whether GuC FW is refusing
to allocate invalid doorbell - which makes the test fail.
Well, it would make the test WARN, except we fumbled cleanup ordering
and eat the BUG_ON instead.
Let's keep the asserts and use the internal implementation in the test.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107186
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712112013.3253-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Tidy error handling in i915_gem_init_hw

Let's reorder things so that we can do onion teardown rather than double
goto.

References: b96f6ebfd024 ("drm/i915: Correctly handle error path in i915_gem_init_hw")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712124810.25241-1-michal.winiarski@intel.com

drm/i915/guc: Skip cleaning up the doorbells on error-before-allocate

If we fail the module load, we may try and cleanup before we even
allocate the GuC clients. KISS in order to try and re-enable
drv_module_reload for BAT.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712105830.20390-1-chris@chris-wilson.co.uk

drm/i915: Silence warning for no vlv powercontext

Along a module load error path, we may try to cleanup the powercontext
even before we have allocated it.  Reorganising GT powermanagement is an
on going process, so for simplicity handle it.

[  522.733832] WARN_ON(!dev_priv->vlv_pctx)
[  522.733986] WARNING: CPU: 1 PID: 3856 at drivers/gpu/drm/i915/intel_pm.c:7350 intel_cleanup_gt_powersave+0x5f/0x70 [i915]
[  522.733991] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic btusb btrtl btbcm btintel intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul bluetooth snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core ecdh_generic lpc_ich r8169 snd_pcm mii i2c_hid prime_numbers [last unloaded: i915]
[  522.734105] CPU: 1 PID: 3856 Comm: drv_module_relo Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4474+ #1
[  522.734110] Hardware name: \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff \xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff/DN2820FYK, BIOS FYBYT10H.86A.0059.2017.0607.2130 06/07/2017
[  522.734193] RIP: 0010:intel_cleanup_gt_powersave+0x5f/0x70 [i915]
[  522.734197] Code: 00 74 0d 48 c7 83 68 a6 00 00 00 00 00 00 eb c8 e8 36 6f 37 e1 eb ec 48 c7 c6 c5 7a 3d a0 48 c7 c7 b5 78 3d a0 e8 71 04 e0 e0 <0f> 0b eb aa 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 c3 0f 1f 40
[  522.734445] RSP: 0018:ffffc900004f3af0 EFLAGS: 00010282
[  522.734453] RAX: 0000000000000000 RBX: ffff880106360000 RCX: 0000000000000001
[  522.734458] RDX: 0000000080000001 RSI: ffffffff820c65c4 RDI: 00000000ffffffff
[  522.734463] RBP: ffff880106360000 R08: 000000009f79baee R09: 0000000000000000
[  522.734467] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013b3133f8
[  522.734472] R13: 00000000ffffffed R14: ffff880106360d58 R15: ffff88013b3133f8
[  522.734477] FS:  00007f43f70af980(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[  522.734481] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  522.734486] CR2: 000055a13a787580 CR3: 00000001325e6000 CR4: 00000000001006e0
[  522.734490] Call Trace:
[  522.734595]  intel_modeset_cleanup+0xcf/0x140 [i915]
[  522.734682]  i915_driver_load+0xc85/0x10a0 [i915]
[  522.734694]  ? _raw_spin_unlock_irqrestore+0x4c/0x60
[  522.734703]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[  522.734790]  i915_pci_probe+0x29/0x90 [i915]
[  522.734801]  pci_device_probe+0xa1/0x130
[  522.734813]  driver_probe_device+0x306/0x480
[  522.734824]  __driver_attach+0xdb/0x100
[  522.734830]  ? driver_probe_device+0x480/0x480
[  522.734836]  ? driver_probe_device+0x480/0x480
[  522.734844]  bus_for_each_dev+0x74/0xc0
[  522.734855]  bus_add_driver+0x15f/0x250
[  522.734863]  ? 0xffffffffa0793000
[  522.734870]  driver_register+0x56/0xe0
[  522.734877]  ? 0xffffffffa0793000
[  522.734883]  do_one_initcall+0x58/0x370
[  522.734893]  ? do_init_module+0x1d/0x1ea
[  522.734900]  ? rcu_read_lock_sched_held+0x6f/0x80
[  522.734906]  ? kmem_cache_alloc_trace+0x282/0x2e0
[  522.734918]  do_init_module+0x56/0x1ea
[  522.734927]  load_module+0x2435/0x2b20
[  522.734965]  ? __se_sys_finit_module+0xd3/0xf0
[  522.734972]  __se_sys_finit_module+0xd3/0xf0
[  522.734995]  do_syscall_64+0x55/0x190
[  522.735003]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  522.735009] RIP: 0033:0x7f43f675d839
[  522.735014] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[  522.735260] RSP: 002b:00007ffe69384238 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  522.735269] RAX: ffffffffffffffda RBX: 000056100e387090 RCX: 00007f43f675d839
[  522.735273] RDX: 0000000000000000 RSI: 000056100e37bff0 RDI: 0000000000000003
[  522.735278] RBP: 000056100e37bff0 R08: 0000000000000000 R09: 0000000000000000
[  522.735282] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000
[  522.735286] R13: 000056100e37c890 R14: 0000000000000020 R15: 0000000000000027
[  522.735309] irq event stamp: 1389594
[  522.735316] hardirqs last  enabled at (1389593): [<ffffffff810f896c>] console_unlock+0x3fc/0x600
[  522.735323] hardirqs last disabled at (1389594): [<ffffffff81a0111c>] error_entry+0x7c/0x100
[  522.735329] softirqs last  enabled at (1389356): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  522.735336] softirqs last disabled at (1389335): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0
[  522.735432] WARNING: CPU: 1 PID: 3856 at drivers/gpu/drm/i915/intel_pm.c:7350 intel_cleanup_gt_powersave+0x5f/0x70 [i915]

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712105454.16091-1-chris@chris-wilson.co.uk

drm/i915/tv: fix strncpy truncation warning

Change it to use strlcpy instead

Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712074103.21571-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Merge tag 'gvt-next-2018-07-11' of https://github.com/intel/gvt-linux into drm-intel-next-queued

gvt-next-2018-07-11

- vGPU huge page support (Changbin)
- BXT display irq warning fix (Colin)
- Handle GVT dependency well (Henry)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711023353.GU1267@zhen-hp.sh.intel.com

drm/i915/execlists: Switch to rb_root_cached

The kernel recently gained an augmented rbtree with the purpose of
cacheing the leftmost element of the rbtree, a frequent optimisation to
avoid calls to rb_first() which is also employed by the
execlists->queue. Switch from our open-coded cache to the library.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180629075348.27358-9-chris@chris-wilson.co.uk

drm/i915/selftests: Add a safety net to live_workarounds

Since live_workarounds poke around the w/a registers and checks to see
if they survive across a reset, we are prone to fouling the machine and
leaving it in a non-recoverable state. Wrap the probe inside a timeout
to abort the test if the reset fails.

v2: Include GEM_TRACE on declaring wedged.
v3: Add a few includes to make the header look standalone.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107188
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711122952.18448-1-chris@chris-wilson.co.uk

drm/i915: Introduce i915_address_space.mutex

Add a mutex into struct i915_address_space to be used while operating on
the vma and their lists for a particular vm. As this may be called from
the shrinker, we taint the mutex with fs_reclaim so that from the start
lockdep warns us if we are caught holding the mutex across an
allocation. (With such small steps we will eventually rid ourselves of
struct_mutex recursion!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20180711073608.20286-2-chris@chris-wilson.co.uk

drm/i915: use the ICL stolen memory

Now that our stolen memory is already reserved by the x86 subsystem
(since commit "x86/gpu: reserve ICL's graphics stolen memory"), make
use of it.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: x86@kernel.org
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180504203252.28048-2-paulo.r.zanoni@intel.com

x86/gpu: reserve ICL's graphics stolen memory

ICL changes the registers and addresses to 64 bits.

I also briefly looked at implementing an u64 version of the PCI config
read functions, but I concluded this wouldn't be trivial, so it's not
worth doing it for a single user that can't have any racing problems
while reading the register in two separate operations.

v2:
- Scrub the development (non-public) changelog (Joonas).
- Remove the i915.ko bits so this can be easily backported in order
to properly avoid stolen memory even on machines without i915.ko
(Joonas).
- CC stable for the reasons above.

Issue: VIZ-9250
CC: stable@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Fixes: 412310019a20 ("drm/i915/icl: Add initial Icelake definitions.")
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180504203252.28048-1-paulo.r.zanoni@intel.com

drm/i915: Unwind HW init after GVT setup failure

Following intel_gvt_init() failure, we missed unwinding our setup
leaving pointers dangling past the module unload. For our example, the
pm_qos:

[  441.057615] top: 000000006b3baf1c, n: 0000000054d8ef33, p: 0000000097cdf1a2
               prev: 0000000054d8ef33, n: 0000000097cdf1a2, p: 000000006b3baf1c
               next: 0000000097cdf1a2, n: 000000006de8fc8b, p: 0000000081087253
[  441.057627] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40
[  441.057628] Modules linked in: i915(+) vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei prime_numbers [last unloaded: i915]
[  441.057652] CPU: 4 PID: 9277 Comm: drv_selftest Tainted: G     U            4.18.0-rc4-CI-CI_DRM_4464+ #1
[  441.057653] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
[  441.057656] RIP: 0010:plist_check_prev_next+0x2d/0x40
[  441.057657] Code: 08 48 39 f0 74 2b 49 89 f0 48 8b 4f 08 50 ff 32 52 48 89 fe 41 ff 70 08 48 8b 17 48 c7 c7 d8 ae 14 82 4d 8b 08 e8 63 0e 76 ff <0f> 0b 48 83 c4 20 c3 48 39 10 75 d0 f3 c3 0f 1f 44 00 00 41 54 55
[  441.057717] RSP: 0018:ffffc900003a3a68 EFLAGS: 00010082
[  441.057720] RAX: 0000000000000000 RBX: ffff8802193978c0 RCX: 0000000000000002
[  441.057721] RDX: 0000000080000002 RSI: ffffffff820c65a4 RDI: 00000000ffffffff
[  441.057722] RBP: ffff8802193978c0 R08: 0000000000000000 R09: 0000000000000001
[  441.057724] R10: ffffc900003a3a70 R11: 0000000000000000 R12: ffffffff82243de0
[  441.057725] R13: ffffffff82243de0 R14: ffff88021a6c78c0 R15: 0000000077359400
[  441.057726] FS:  00007fc23a4a9980(0000) GS:ffff880236d00000(0000) knlGS:0000000000000000
[  441.057728] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  441.057729] CR2: 0000563e4503d038 CR3: 0000000138f86005 CR4: 00000000003606e0
[  441.057730] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  441.057731] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  441.057732] Call Trace:
[  441.057736]  plist_check_list+0x2e/0x40
[  441.057738]  plist_add+0x23/0x130
[  441.057743]  pm_qos_update_target+0x1bd/0x2f0
[  441.057771]  i915_driver_load+0xec4/0x1060 [i915]
[  441.057775]  ? trace_hardirqs_on_caller+0xe0/0x1b0
[  441.057800]  i915_pci_probe+0x29/0x90 [i915]
[  441.057804]  pci_device_probe+0xa1/0x130
[  441.057807]  driver_probe_device+0x306/0x480
[  441.057810]  __driver_attach+0xdb/0x100
[  441.057812]  ? driver_probe_device+0x480/0x480
[  441.057813]  ? driver_probe_device+0x480/0x480
[  441.057816]  bus_for_each_dev+0x74/0xc0
[  441.057819]  bus_add_driver+0x15f/0x250
[  441.057821]  ? 0xffffffffa0696000
[  441.057823]  driver_register+0x56/0xe0
[  441.057825]  ? 0xffffffffa0696000
[  441.057827]  do_one_initcall+0x58/0x370
[  441.057830]  ? do_init_module+0x1d/0x1ea
[  441.057832]  ? rcu_read_lock_sched_held+0x6f/0x80
[  441.057834]  ? kmem_cache_alloc_trace+0x282/0x2e0
[  441.057838]  do_init_module+0x56/0x1ea
[  441.057841]  load_module+0x2435/0x2b20
[  441.057852]  ? __se_sys_finit_module+0xd3/0xf0
[  441.057854]  __se_sys_finit_module+0xd3/0xf0
[  441.057861]  do_syscall_64+0x55/0x190
[  441.057863]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  441.057865] RIP: 0033:0x7fc239d75839
[  441.057866] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
[  441.057927] RSP: 002b:00007fffb7825d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  441.057930] RAX: ffffffffffffffda RBX: 0000563e45035dd0 RCX: 00007fc239d75839
[  441.057931] RDX: 0000000000000000 RSI: 0000563e4502f8a0 RDI: 0000000000000004
[  441.057932] RBP: 0000563e4502f8a0 R08: 0000000000000004 R09: 0000000000000000
[  441.057933] R10: 00007fffb7825ea0 R11: 0000000000000246 R12: 0000000000000000
[  441.057934] R13: 0000563e4502f690 R14: 0000000000000000 R15: 000000000000003f
[  441.057940] irq event stamp: 231338
[  441.057943] hardirqs last  enabled at (231337): [<ffffffff8193e3fc>] _raw_spin_unlock_irqrestore+0x4c/0x60
[  441.057944] hardirqs last disabled at (231338): [<ffffffff8193e26d>] _raw_spin_lock_irqsave+0xd/0x50
[  441.057947] softirqs last  enabled at (231024): [<ffffffff81c0034f>] __do_softirq+0x34f/0x505
[  441.057949] softirqs last disabled at (231005): [<ffffffff8108c7b9>] irq_exit+0xa9/0xc0
[  441.057951] WARNING: CPU: 4 PID: 9277 at lib/plist.c:42 plist_check_prev_next+0x2d/0x40

v2: Add a load failure point to intel_gvt_init() so that we always
exercise this path in future.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107129
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710143821.1889-1-chris@chris-wilson.co.uk

drm/i915: Cleanup modesetting on load-error path

After handling a critical failure initialising GEM we need to unwind the
modesetting setup.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710094421.16223-2-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

drm/i915: Flush the residual parking on emergency shutdown

On unwinding following a critical failure inside GEM init, we also need
to be sure to flush the workers before unloading the module.

Testcase: igt/drv_module_reload/basic-reload-inject
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710094421.16223-1-chris@chris-wilson.co.uk

drm/i915: Tidy i915_gem_suspend()

In the next patch, we will make a fairly minor change to flush
outstanding resets before suspend. In order to keep churn to a minimum
in that functional patch, we fix up the comments and coding style now.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709130208.11730-7-chris@chris-wilson.co.uk

drm/i915: Only reset hangcheck at the start of an activity cycle

Across a reset, the seqno (and thus hangcheck) should restart and the
hangcheck naturally progress, for when it does not, we want to declare an
emergency. Currently, we only detect if reset and reinit fails, but we
do not detect if the call to reinit succeeds but the HW is fried - as we
are resetting hangcheck on initialisation the engine. Remove that and
rely on the natural progress to reset the hangcheck timer.

References: e21b141376f9 ("drm/i915: Mark the hangcheck as idle when unparking the engines")
References: 1fd00c0faeec ("drm/i915: Declare the driver wedged if hangcheck makes no progress")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709130208.11730-2-chris@chris-wilson.co.uk

drm/i915/selftests: Filter out both physical address swizzles

In our swizzling selftests, we cannot predict the physical address of
the target page (at least not simply!) and so skip bit17 swizzles.
However, there are two bit17 swizzle modes and we only skipped one, with
the second being observed on the lab gdg causing the test to fail,
as soon as we hit a page with bit17 set in its address.

Testcase: igt/drv_selftest/live_objects #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709194915.5789-1-chris@chris-wilson.co.uk

drm/i915/selftests: Constrain mock_gtt tests to fit within RAM

Be pessimistic and presume that we actually allocate every page we
exercise via the mock_gtt (e.g. for gvt). In which case we have to keep
our working set under the available physical memory to prevent oom.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180710080424.7821-1-chris@chris-wilson.co.uk

drm/i915: Remove function details from device error messages

Error messages are intended to be addressed to the user; be clear,
succinct, instructive and unambiguous. Adding the function name to
that message does not add any information the user requires and in
the process makes the message less clear.

E.g.

[ 245.539711] i915 0000:00:02.0: [drm:i915_gem_init [i915]] Failed to initialize GPU, declaring it wedged!

becomes

[ 245.539711] i915 0000:00:02.0: Failed to initialize GPU, declaring it wedged!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709134858.12446-1-chris@chris-wilson.co.uk

drm/i915/gvt: declare gvt as i915's soft dependency

This helps initramfs builder and other tools to know the full dependencies
of i915 and have gvt module loaded with i915.

v2: add condition and change to pre-dependency (Chris)
v3: move declaration to gvt.c. (Chris)
v4: remove xengt (Zhenyu)

Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Update DRIVER_DATE to 20180709

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/selftests: Prevent background reaping of active objects

igt_mmap_offset_exhaustion() wants to test what happens when the mmap
space is filled with zombie objects, objects discarded by userspace but
still active on the GPU. As they are only protected by the active
reference, we have to be certain that active reference is kept while we
peek into our dangling pointer. That active reference should not be
freed until we retire, but we do that retirement from a background
thread. This leaves us with a subtle timing problem, exacerbated and
highlighted by KASAN:

<3>[  132.380399] BUG: KASAN: use-after-free in drm_gem_create_mmap_offset+0x8c/0xd0
<3>[  132.380430] Read of size 8 at addr ffff8801e13245f8 by task drv_selftest/5822

<4>[  132.380470] CPU: 0 PID: 5822 Comm: drv_selftest Tainted: G     U            4.18.0-rc3-g7ae7763aa2be-kasan_48+ #1
<4>[  132.380473] Hardware name: Dell Inc. XPS 8300  /0Y2MRG, BIOS A06 10/17/2011
<4>[  132.380475] Call Trace:
<4>[  132.380481]  dump_stack+0x7c/0xbb
<4>[  132.380487]  print_address_description+0x65/0x270
<4>[  132.380493]  kasan_report+0x25b/0x380
<4>[  132.380497]  ? drm_gem_create_mmap_offset+0x8c/0xd0
<4>[  132.380503]  drm_gem_create_mmap_offset+0x8c/0xd0
<4>[  132.380584]  i915_gem_object_create_mmap_offset+0x6d/0x100 [i915]
<4>[  132.380650]  igt_mmap_offset_exhaustion+0x462/0x940 [i915]
<4>[  132.380714]  ? i915_gem_close_object+0x740/0x740 [i915]
<4>[  132.380784]  ? igt_gem_huge+0x269/0x3d0 [i915]
<4>[  132.380865]  __i915_subtests+0x5a/0x160 [i915]
<4>[  132.380936]  __run_selftests+0x1a2/0x2f0 [i915]
<4>[  132.381008]  i915_live_selftests+0x4e/0x80 [i915]
<4>[  132.381071]  i915_pci_probe+0xd8/0x1b0 [i915]
<4>[  132.381077]  pci_device_probe+0x1c5/0x3a0
<4>[  132.381087]  driver_probe_device+0x6b6/0xcb0
<4>[  132.381094]  __driver_attach+0x22d/0x2c0
<4>[  132.381100]  ? driver_probe_device+0xcb0/0xcb0
<4>[  132.381103]  bus_for_each_dev+0x113/0x1a0
<4>[  132.381108]  ? check_flags.part.24+0x450/0x450
<4>[  132.381112]  ? subsys_dev_iter_exit+0x10/0x10
<4>[  132.381123]  bus_add_driver+0x38b/0x6e0
<4>[  132.381131]  driver_register+0x189/0x400
<4>[  132.381136]  ? 0xffffffffc12d8000
<4>[  132.381140]  do_one_initcall+0xa0/0x4c0
<4>[  132.381145]  ? initcall_blacklisted+0x180/0x180
<4>[  132.381152]  ? do_init_module+0x4a/0x54c
<4>[  132.381156]  ? rcu_lockdep_current_cpu_online+0xdc/0x130
<4>[  132.381161]  ? kasan_unpoison_shadow+0x30/0x40
<4>[  132.381169]  do_init_module+0x1b5/0x54c
<4>[  132.381177]  load_module+0x619e/0x9b70
<4>[  132.381202]  ? module_frob_arch_sections+0x20/0x20
<4>[  132.381211]  ? vfs_read+0x257/0x2f0
<4>[  132.381214]  ? vfs_read+0x257/0x2f0
<4>[  132.381221]  ? kernel_read+0x8b/0x130
<4>[  132.381231]  ? copy_strings_kernel+0x120/0x120
<4>[  132.381244]  ? __se_sys_finit_module+0x17c/0x1a0
<4>[  132.381248]  __se_sys_finit_module+0x17c/0x1a0
<4>[  132.381252]  ? __ia32_sys_init_module+0xa0/0xa0
<4>[  132.381261]  ? __se_sys_newstat+0x77/0xd0
<4>[  132.381265]  ? cp_new_stat+0x590/0x590
<4>[  132.381269]  ? kmem_cache_free+0x2f0/0x340
<4>[  132.381285]  do_syscall_64+0x97/0x400
<4>[  132.381292]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  132.381295] RIP: 0033:0x7eff4af46839
<4>[  132.381297] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
<4>[  132.381426] RSP: 002b:00007ffcd84f4cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
<4>[  132.381432] RAX: ffffffffffffffda RBX: 000055dfdeb429a0 RCX: 00007eff4af46839
<4>[  132.381435] RDX: 0000000000000000 RSI: 000055dfdeb43670 RDI: 0000000000000004
<4>[  132.381437] RBP: 000055dfdeb43670 R08: 0000000000000004 R09: 0000000000000000
<4>[  132.381440] R10: 00007ffcd84f4e60 R11: 0000000000000246 R12: 0000000000000000
<4>[  132.381442] R13: 000055dfdeb3bec0 R14: 0000000000000000 R15: 000000000000003b

<3>[  132.381466] Allocated by task 5822:
<4>[  132.381485]  kmem_cache_alloc+0xdf/0x2e0
<4>[  132.381546]  i915_gem_object_create_internal+0x24/0x1e0 [i915]
<4>[  132.381609]  igt_mmap_offset_exhaustion+0x257/0x940 [i915]
<4>[  132.381677]  __i915_subtests+0x5a/0x160 [i915]
<4>[  132.381742]  __run_selftests+0x1a2/0x2f0 [i915]
<4>[  132.381806]  i915_live_selftests+0x4e/0x80 [i915]
<4>[  132.381865]  i915_pci_probe+0xd8/0x1b0 [i915]
<4>[  132.381868]  pci_device_probe+0x1c5/0x3a0
<4>[  132.381871]  driver_probe_device+0x6b6/0xcb0
<4>[  132.381874]  __driver_attach+0x22d/0x2c0
<4>[  132.381877]  bus_for_each_dev+0x113/0x1a0
<4>[  132.381880]  bus_add_driver+0x38b/0x6e0
<4>[  132.381884]  driver_register+0x189/0x400
<4>[  132.381886]  do_one_initcall+0xa0/0x4c0
<4>[  132.381889]  do_init_module+0x1b5/0x54c
<4>[  132.381892]  load_module+0x619e/0x9b70
<4>[  132.381895]  __se_sys_finit_module+0x17c/0x1a0
<4>[  132.381898]  do_syscall_64+0x97/0x400
<4>[  132.381901]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

<3>[  132.381914] Freed by task 150:
<4>[  132.381931]  kmem_cache_free+0xb7/0x340
<4>[  132.381995]  __i915_gem_free_objects+0x875/0xf50 [i915]
<4>[  132.382054]  __i915_gem_free_work+0x69/0xb0 [i915]
<4>[  132.382058]  process_one_work+0x78b/0x1740
<4>[  132.382061]  worker_thread+0x82/0xb80
<4>[  132.382064]  kthread+0x30c/0x3d0
<4>[  132.382067]  ret_from_fork+0x3a/0x50

<3>[  132.382081] The buggy address belongs to the object at ffff8801e1324500
                   which belongs to the cache drm_i915_gem_object of size 1168
<3>[  132.382133] The buggy address is located 248 bytes inside of
                   1168-byte region [ffff8801e1324500, ffff8801e1324990)
<3>[  132.382179] The buggy address belongs to the page:
<0>[  132.382202] page:ffffea000784c800 count:1 mapcount:0 mapping:ffff8801dedf6500 index:0xffff8801e1323ec0 compound_mapcount: 0
<0>[  132.382251] flags: 0x8000000000008100(slab|head)
<1>[  132.382274] raw: 8000000000008100 ffff8801d6317440 ffff8801d6317440 ffff8801dedf6500
<1>[  132.382307] raw: ffff8801e1323ec0 0000000000140013 00000001ffffffff 0000000000000000
<1>[  132.382339] page dumped because: kasan: bad access detected

<3>[  132.382373] Memory state around the buggy address:
<3>[  132.382395]  ffff8801e1324480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
<3>[  132.382426]  ffff8801e1324500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[  132.382457] >ffff8801e1324580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[  132.382488]                                                                 ^
<3>[  132.382517]  ffff8801e1324600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
<3>[  132.382548]  ffff8801e1324680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

This patch tricks the system into running without the background retire
thread, until after we finish the test. The only reaping should then be
performed by the mmap offset routine to reclaim the space as required.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709130208.11730-1-chris@chris-wilson.co.uk

drm/i915/selftests: Replace wait-on-timeout with explicit timeout

In igt_flush_test() we install a background timer in order to ensure
that the wait completes within a certain time. We can now tell the wait
that it has to complete within a timeout, and so no longer need the
background timer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-3-chris@chris-wilson.co.uk

drm/i915: Provide a timeout to i915_gem_wait_for_idle() on setup

With a broken GPU we expect it to fail during the initial
GPU setup where do a couple of context switches to record the defaults.
This is a task that takes a few milliseconds even on the slowest of
devices, but we may have to wait 60s for hangcheck to give in and
declare the machine inoperable. In this a case where any gpu hang is
unacceptable, both from a timeliness and practical standpoint.

We can therefore set a timeout on our wait-for-idle that is shorter than
the hangcheck (which may be up to 60s for a declaring a wedged driver)
and so detect the broken GPU much more quickly during driver load (and
so prevent stalling userspace for ages).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-2-chris@chris-wilson.co.uk

drm/i915: Provide a timeout to i915_gem_wait_for_idle()

Usually we have no idea about the upper bound we need to wait to catch
up with userspace when idling the device, but in a few situations we
know the system was idle beforehand and can provide a short timeout in
order to very quickly catch a failure, long before hangcheck kicks in.

In the following patches, we will use the timeout to curtain two overly
long waits, where we know we can expect the GPU to complete within a
reasonable time or declare it broken.

In particular, with a broken GPU we expect it to fail during the initial
GPU setup where do a couple of context switches to record the defaults.
This is a task that takes a few milliseconds even on the slowest of
devices, but we may have to wait 60s for hangcheck to give in and
declare the machine inoperable. In this a case where any gpu hang is
unacceptable, both from a timeliness and practical standpoint.

The other improvement is that in selftests, we do not need to arm an
independent timer to inject a wedge, as we can just limit the timeout on
the wait directly.

v2: Include the timeout parameter in the trace.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180709122044.7028-1-chris@chris-wilson.co.uk

drm/i915/selftests: Magic numbers for old Y-tiling

i915g has a slightly different tiling layout, and so requires a
different reference swizzle pattern.

Testcase: igt/drv_selftests/live_objects #gdg
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180707100405.817-1-chris@chris-wilson.co.uk

drm/i915/gvt: Handle EDP_PSR_IMR and EDP_PSR_IIR for BXT.

BXT supports EDP. However since GVT-g only simulate DP monitor
to guest and handles EDP_PSR_IMR and EDP_PSR_IIR as default MMIO
r/w. If guest r/w these IMR/IIR, GVT-g won't simulate the real
HW behavior and below warning is printed:
--------
Interrupt register 0x64838 is not zero: 0xffffffff
WARNING: CPU: 1 PID: 1 at drivers/gpu/drm/i915/i915_irq.c:161
gen3_assert_iir_is_zero+0x34/0xa0

Call Trace:
gen8_de_irq_postinstall+0xad/0x330
gen8_irq_postinstall+0x23/0x80
drm_irq_install+0xb5/0x130
i915_driver_load+0xafd/0xf70
--------
Since GVT-g won't simulate EDP to guest, always set EDP_PSR_IMR
and EDP_PSR_IIR IMR/IIR to 0.

Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Enable platform support for vGPU huge gtt pages

Now GVTg supports shadowing both 2M/64K huge gtt pages. So let's turn on
the cap info bit VGT_CAPS_HUGE_GTT.

v2: Split changes in i915 side into a separated patch.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Fix error handling in ppgtt_populate_spt_by_guest_entry

Don't forget to free allocated spt if shadowing failed.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Handle special sequence on PDE IPS bit

If the guest update the 64K gtt entry before changing IPS bit of PDE, we
need to re-shadow the whole page table. Because we have ignored all
updates to unused entries.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add 2M huge gtt support

This add 2M huge gtt support for GVTg. Unlike 64K gtt entry, we can
shadow 2M guest entry with real huge gtt. But before that, we have to
check memory physical continuous, alignment and if it is supported on
the host. We can get all supported page sizes from
intel_device_info.page_sizes.

Finally we must split the 2M page into smaller pages if we cannot
satisfy guest Huge Page.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/kvmgt: Support setting dma map for huge pages

To support huge gtt, we need to support huge pages in kvmgt first.
This patch adds a 'size' param to the intel_gvt_mpt::dma_map_guest_page
API and implements it in kvmgt.

v2: rebase.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add 64K huge gtt support

Finally, this add the first huge gtt support for GVTg - 64K pages. Since
64K page and 4K page cannot be mixed on the same page table, so we always
split a 64K entry into small 4K page. And when unshadow guest 64K entry,
we need ensure all the shadowed entries in shadow page table also get
cleared.

For page table which has 64K gtt entry, only PTE#0, PTE#16, PTE#32, ...
PTE#496 are used. Unused PTEs update should be ignored.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Make PTE iterator 64K entry aware

64K PTE is special, only PTE#0, PTE#16, PTE#32, ... PTE#496 are used in
the page table.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Split ppgtt_alloc_spt into two parts

We need a interface to allocate a pure shadow page which doesn't have
a guest page associated with. Such shadow page is used to shadow 2M
huge gtt entry.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add GTT clear_pse operation

Add clear_pse operation in case we need to split huge gtt into small pages.

v2: correct description.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add software PTE flag to mark special 64K splited entry

This add a software PTE flag on the Ignored bit of PTE. It will be used
to identify splited 64K shadow entries.

v2: fix mask definition.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Detect 64K gtt entry by IPS bit of PDE

This change help us detect the real entry type per PSE and IPS setting.
For 64K entry, we also need to check reg GEN8_GAMW_ECO_DEV_RW_IA.

v2: Extend IPS mmio control to Gen10. (Matthew Auld)

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Handle MMIO GEN8_GAMW_ECO_DEV_RW_IA for 64K GTT

The register RENDER_HWS_PGA_GEN7 is renamed to GEN8_GAMW_ECO_DEV_RW_IA
from GEN8 which can control IPS enabling.

v3: MMIO control for IPS is not removed from gen9 but gen10 (Matthew Auld)
v2: IPS of all engines must be enabled together for gen9.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add PTE IPS bit operations

Add three IPS operation functions to test/set/clear IPS in PDE.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915/gvt: Add new 64K entry type

Add a new entry type GTT_TYPE_PPGTT_PTE_64K_ENTRY. 64K entry is very
different from 2M/1G entry. 64K entry is controlled by IPS bit in upper
PDE. To leverage the current logic, I take IPS bit as 'PSE' for PTE
level. Which means, 64K entries can also processed by get_pse_type().

v2: Make it bisectable.

Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

drm/i915: Replace nested subclassing with explicit subclasses

In the next patch, we will want a third distinct class of timeline that
may overlap with the current pair of client and engine timeline classes.
Rather than use the ad hoc markup of SINGLE_DEPTH_NESTING, initialise
the different timeline classes with an explicit subclass.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706210710.16251-1-chris@chris-wilson.co.uk

drm/i915/selftests: Avoid warning if runtime pm is disabled

Inside the mock GEM device, we try to grab the runtime pm for the fake
device to prevent it from ever suspending. However, if CONFIG_PM is not
set, trying to obtain the wakref returns an error which we WARN about.
Suppress the expected warning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180706205947.11209-1-chris@chris-wilson.co.uk