Ramalingam C [Sat, 16 Feb 2019 17:36:58 +0000 (23:06 +0530)]
drm/i915: Handle HDCP2.2 downstream topology change
When a repeater notifies a downstream topology change, this patch
reauthenticates the repeater alone without disabling the HDCP
encryption. If that fails, complete reauthentication is executed.
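The fallback order described above can be sketched as follows. This is a minimal illustration, not the driver's code: struct fake_repeater and all function names are hypothetical, and the two booleans only simulate whether each attempt would succeed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a repeater; not the driver's real state. */
struct fake_repeater {
	bool repeater_reauth_ok;	/* would repeater-only reauth succeed? */
	bool full_reauth_ok;		/* would complete reauth succeed? */
	bool encryption_on;		/* is link encryption currently enabled? */
	int full_reauth_calls;		/* how often the fallback was taken */
};

static int reauth_repeater_only(struct fake_repeater *r)
{
	/* Encryption is left untouched while the topology is re-verified. */
	return r->repeater_reauth_ok ? 0 : -1;
}

static int full_reauth(struct fake_repeater *r)
{
	r->full_reauth_calls++;
	r->encryption_on = false;	/* encryption is dropped first */
	if (!r->full_reauth_ok)
		return -1;
	r->encryption_on = true;	/* and restored only on success */
	return 0;
}

/* Try the cheap path first; fall back to complete reauthentication. */
int handle_topology_change(struct fake_repeater *r)
{
	if (reauth_repeater_only(r) == 0)
		return 0;
	return full_reauth(r);
}
```

The point of the ordering is that the common case never interrupts encryption; the expensive path is only reached after the repeater-only attempt has failed.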
v2:
Rebased.
v3:
Typo in commit msg is fixed [Uma]
v4:
Rebased as part of patch reordering.
Minor style fixes.
v5:
Rebased.
v6:
Rebased.
v7:
Errors due to sinks are reported as DEBUG logs.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-12-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:57 +0000 (23:06 +0530)]
drm/i915: Implement HDCP2.2 link integrity check
Implements the link integrity check once every 500 ms.
Once encryption is enabled, an ongoing Link Integrity Check is
performed by the HDCP Receiver to check that cipher synchronization
is maintained between the HDCP Transmitter and the HDCP Receiver.
On detecting loss of synchronization, the HDCP Receiver must assert
the corresponding bits of the RxStatus register. The Transmitter polls
the RxStatus register and may initiate re-authentication.
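As a rough sketch of the polling side, the transmitter reads RxStatus once per period and maps the asserted bits to an action. The bit positions, macro names and enum below are illustrative assumptions only; they are not taken from the HDCP2.2 specification or from drm_hdcp.h.

```c
#include <stdint.h>

/* Poll period from the commit message; the macro name is hypothetical. */
#define LINK_CHECK_PERIOD_MS	500u

/* Illustrative bit positions only; not the real RxStatus layout. */
#define RXSTATUS_TOPOLOGY_READY	(1u << 0)
#define RXSTATUS_REAUTH_REQ	(1u << 1)
#define RXSTATUS_LINK_FAILED	(1u << 2)

enum link_state {
	LINK_PROTECTED,
	LINK_TOPOLOGY_CHANGE,
	LINK_REAUTH_REQUEST,
	LINK_INTEGRITY_FAILURE,
};

/* Map one RxStatus sample to a link state, most severe condition first. */
enum link_state classify_rxstatus(uint8_t rxstatus)
{
	if (rxstatus & RXSTATUS_LINK_FAILED)
		return LINK_INTEGRITY_FAILURE;
	if (rxstatus & RXSTATUS_REAUTH_REQ)
		return LINK_REAUTH_REQUEST;
	if (rxstatus & RXSTATUS_TOPOLOGY_READY)
		return LINK_TOPOLOGY_CHANGE;
	return LINK_PROTECTED;
}

/* Time of the next poll, given the time of the current one. */
uint64_t next_check_ms(uint64_t now_ms)
{
	return now_ms + LINK_CHECK_PERIOD_MS;
}
```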
v2:
Rebased.
v3:
enum check_link_response is used to check the link status [Uma]
v4:
Rebased as part of patch reordering.
v5:
Required members of intel_hdcp are defined [Sean Paul]
v6:
hdcp2_check_link is cancelled at required places.
v7:
Rebased for the component i/f changes.
Errors due to the sinks are reported as DEBUG logs.
v8:
hdcp_check_work is used for both hdcp1 and hdcp2 check_link [Daniel]
hdcp2.2 encryption status check is put under WARN_ON [Daniel]
drm_hdcp.h changes are moved into separate patch [Daniel]
v9:
enum check_link_status is defined at intel_drv.h [Daniel]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-11-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:56 +0000 (23:06 +0530)]
drm: HDCP2.2 link check period
Time period for HDCP2.2 link check.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-10-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:55 +0000 (23:06 +0530)]
drm/i915: Implement HDCP2.2 repeater authentication
Implements the HDCP2.2 repeater authentication steps such as verifying
the downstream topology and sending stream management information.
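A minimal sketch of the topology verification step, assuming the repeater reports two "limit exceeded" flags and a sequence number that must never run backwards. struct fake_topology and verify_topology() are hypothetical names; only the -EINVAL return for topology errors and rollover comes from the changelog below.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, already-decoded view of what a repeater reports. */
struct fake_topology {
	bool max_devs_exceeded;		/* too many downstream devices */
	bool max_cascade_exceeded;	/* too many repeater levels */
	uint32_t seq_num_v;		/* sequence number of this report */
};

/*
 * Reject a topology error, and treat a sequence number lower than the
 * last accepted one as a rollover. Both cases return -EINVAL so that
 * the caller can restart authentication.
 */
int verify_topology(const struct fake_topology *t, uint32_t prev_seq_num_v)
{
	if (t->max_devs_exceeded || t->max_cascade_exceeded)
		return -EINVAL;
	if (t->seq_num_v < prev_seq_num_v)
		return -EINVAL;
	return 0;
}
```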
v2: Rebased.
v3:
-EINVAL is returned for topology error and rollover scenario.
Endianness conversion func from drm_hdcp.h is used [Uma]
v4:
Rebased as part of patches reordering.
Defined the mei service functions [Daniel]
v5:
Redefined the mei service functions as per comp redesign.
v6:
%s/uintxx_t/uxx
Check for comp_master is removed.
v7:
Adjust to the new mei interface.
style issue fixed.
v8:
drm_hdcp.h change is moved into separate patch [Daniel]
v9:
%s/__swab16/cpu_to_be16. [Tomas]
Reviewed-by Uma.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-9-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:54 +0000 (23:06 +0530)]
drm/i915: Implement HDCP2.2 receiver authentication
Implements HDCP2.2 authentication for HDCP2.2 receivers, with
the following steps:
Authentication and Key exchange (AKE).
Locality Check (LC).
Session Key Exchange(SKE).
DP Errata for stream type configuration for receivers.
At AKE, the HDCP Receiver’s public key certificate is verified by the
HDCP Transmitter. A Master Key km is exchanged.
At LC, the HDCP Transmitter enforces locality on the content by
requiring that the Round Trip Time (RTT) between a pair of messages
is not more than 20 ms.
At SKE, the HDCP Transmitter exchanges the Session Key ks with
the HDCP Receiver.
In DP, the HDCP2.2 encryption and decryption logic uses the stream type as
one of its parameters. So before enabling encryption, the DP HDCP2.2
receiver needs to be informed of the stream type. This was added to
the spec as an erratum.
This generic implementation is complete only with the hdcp2 specific
functions defined at hdcp_shim.
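The ordering of the steps above can be sketched as a simple sequence that stops at the first failing stage. Everything here is a simulation under stated assumptions: struct fake_receiver, enum auth_step and authenticate_receiver() are hypothetical, and no real key material or messages are involved. Only the 20 ms locality bound is taken from the text.

```c
#include <stdbool.h>

#define LC_MAX_RTT_MS	20u	/* locality bound from the commit message */

/* Which stage failed; STEP_DONE means the whole sequence passed. */
enum auth_step { STEP_DONE = 0, STEP_AKE, STEP_LC, STEP_SKE };

/* Hypothetical simulated receiver. */
struct fake_receiver {
	bool cert_valid;	/* AKE: certificate verifies */
	unsigned int rtt_ms;	/* LC: measured round trip time */
	bool ske_ok;		/* SKE: session key accepted */
	bool is_dp;		/* DP needs the stream type erratum */
	bool stream_type_sent;	/* set once the stream type is communicated */
};

enum auth_step authenticate_receiver(struct fake_receiver *rx)
{
	if (!rx->cert_valid)
		return STEP_AKE;
	if (rx->rtt_ms > LC_MAX_RTT_MS)
		return STEP_LC;
	if (!rx->ske_ok)
		return STEP_SKE;
	/* On DP the stream type is sent before encryption is enabled. */
	if (rx->is_dp)
		rx->stream_type_sent = true;
	return STEP_DONE;
}
```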
v2: Rebased.
v3:
%s/PARING/PAIRING
Coding style fixing [Uma]
v4:
Rebased as part of patch reordering.
Defined the functions for mei services. [Daniel]
v5:
Redefined the mei service functions as per comp redesign.
Required intel_hdcp members are defined [Sean Paul]
v6:
Typo of cipher is Fixed [Uma]
%s/uintxx_t/uxx
Check for comp_master is removed.
v7:
Adjust to the new interface.
Avoid using bool structure members. [Tomas]
v8: Rebased.
v9:
bool is used in struct intel_hdcp [Daniel]
config_stream_type is redesigned [Daniel]
Reviewed-by Uma.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-8-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:53 +0000 (23:06 +0530)]
drm/i915: Enable and Disable of HDCP2.2
Considering that HDCP2.2 is more secure than HDCP1.4, when a setup
supports both HDCP2.2 and HDCP1.4, HDCP2.2 will be enabled.
When HDCP2.2 enabling fails and HDCP1.4 is supported, HDCP1.4 is
enabled.
This change implements a sequence of enabling and disabling of
HDCP2.2 authentication and HDCP2.2 port encryption.
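The version preference described above amounts to a short decision sequence. The sketch below is an assumption-laden illustration: struct fake_setup, enum hdcp_version and pick_hdcp_version() are hypothetical names, and the booleans merely simulate capability checks and enable results.

```c
#include <stdbool.h>

enum hdcp_version { HDCP_NONE, HDCP_1_4, HDCP_2_2 };

/* Hypothetical simulated platform plus sink. */
struct fake_setup {
	bool hdcp2_capable;
	bool hdcp2_enable_ok;	/* would enabling HDCP2.2 succeed? */
	bool hdcp14_capable;
	bool hdcp14_enable_ok;	/* would enabling HDCP1.4 succeed? */
};

/* Prefer HDCP2.2; fall back to HDCP1.4 only if 2.2 is absent or fails. */
enum hdcp_version pick_hdcp_version(const struct fake_setup *s)
{
	if (s->hdcp2_capable && s->hdcp2_enable_ok)
		return HDCP_2_2;
	if (s->hdcp14_capable && s->hdcp14_enable_ok)
		return HDCP_1_4;
	return HDCP_NONE;
}
```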
v2:
Included few optimization suggestions [Chris Wilson]
Commit message is updated as per the rebased version.
intel_wait_for_register is used instead of wait_for. [Chris Wilson]
v3:
Extra comment added and Style issue fixed [Uma]
v4:
Rebased as part of patch reordering.
HDCP2 encryption status is tracked.
HW state check is moved into WARN_ON [Daniel]
v5:
Redefined the mei service functions as per comp redesign.
Merged patches related to hdcp2.2 enabling and disabling [Sean Paul].
Required shim functionality is defined [Sean Paul]
v6:
Return values are handled [Uma]
Realigned the code.
Check for comp_master is removed.
v7:
HDCP2.2 is attempted only if mei interface is up.
Adjust to the new interface
Avoid bool usage in struct [Tomas]
v8:
mei_binded status check is removed.
%s/hdcp2_in_use/hdcp2_encrypted
v9:
bool is used in struct intel_hdcp. [Daniel]
v10:
panel is replaced with sink [Uma]
Mei interface decided the hdcp2_capability.
WARN_ON if hdcp_enable is called when hdcp state is ENABLED.
Reviewed-by Uma.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-7-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:52 +0000 (23:06 +0530)]
drm/i915: hdcp1.4 CP_IRQ handling and SW encryption tracking
"hdcp_encrypted" flag is defined to denote the HDCP1.4 encryption status.
This SW tracking is used to determine the need for real hdcp1.4 disable
and hdcp_check_link upon CP_IRQ.
On CP_IRQ, we filter the CP_IRQs related to states such as link failure
and reauthentication request, and handle them in hdcp_check_link.
CP_IRQs corresponding to authentication message availability are ignored.
WARN_ON is added for the abrupt stop of HDCP encryption of a port.
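The filtering rule can be sketched as a predicate that decides whether a CP_IRQ should schedule the link check. The cause bits and the function name are hypothetical and chosen only for illustration; the real status layout is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative cause bits; not the real DP register layout. */
#define CP_IRQ_MSG_AVAILABLE	(1u << 0)	/* authentication message ready */
#define CP_IRQ_LINK_FAILURE	(1u << 1)
#define CP_IRQ_REAUTH_REQ	(1u << 2)

/*
 * Only an encrypted port needs a link check, and only for causes that
 * indicate a link problem. Message-availability interrupts are ignored.
 */
bool cp_irq_needs_check_link(bool hdcp_encrypted, uint8_t cause)
{
	if (!hdcp_encrypted)
		return false;
	return (cause & (CP_IRQ_LINK_FAILURE | CP_IRQ_REAUTH_REQ)) != 0;
}
```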
v2:
bool is used in struct for the cleaner coding. [Daniel]
check_link work_fn is scheduled for cp_irq handling [Daniel]
v3:
rebased.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-6-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:51 +0000 (23:06 +0530)]
drm/i915: MEI interface implementation
Defining the mei-i915 interface functions and initialization of
the interface.
v2:
Adjust to the new interface changes. [Tomas]
Added further debug logs for the failures at MEI i/f.
port in hdcp_port data is equipped to handle -ve values.
v3:
mei comp is matched for global i915 comp master. [Daniel]
In hdcp_shim hdcp_protocol() is replaced with const variable. [Daniel]
mei wrappers are adjusted as per the i/f change [Daniel]
v4:
port initialization is done only at hdcp2_init [Danvet]
v5:
I915 registers a subcomponent to be matched with mei_hdcp [Daniel]
v6:
HDCP_disable for all connectors in case of comp_unbind.
Tear down HDCP comp interface at i915_unload [Daniel]
v7:
Component init and fini are moved out of connector ops [Daniel]
hdcp_disable is not called from unbind. [Daniel]
v8:
subcomponent name is dropped as it is already merged.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> [v11]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-5-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:50 +0000 (23:06 +0530)]
drm/i915: Initialize HDCP2.2
Add the HDCP2.2 initialization to the existing HDCP1.4 stack.
v2:
mei interface handle is protected with mutex. [Chris Wilson]
v3:
Notifiers are used for the mei interface state.
v4:
Poll for mei client device state
Error msg for out of mem [Uma]
Inline req for init function removed [Uma]
v5:
Rebase as Part of reordering.
Component is used for the I915 and MEI_HDCP interface [Daniel]
v6:
HDCP2.2 uses the I915 component master to communicate with mei_hdcp
- [Daniel]
Required HDCP2.2 variables defined [Sean Paul]
v7:
intel_hdcp2.2_init returns void [Uma]
Realigning the codes.
v8:
Avoid using bool structure members.
MEI interface related changes are moved into separate patch.
Commit msg is updated accordingly.
intel_hdcp_exit is defined and used from i915_unload
v9:
Movement of the hdcp_check_link is moved to new patch [Daniel]
intel_hdcp2_exit is removed as mei_comp will be unbind in i915_unload.
v10:
bool is used in struct to make coding simpler. [Daniel]
hdmi hdcp init is placed correctly after encoder attachment.
v11:
hdcp2_capability check is moved into hdcp.c [Tomas]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-4-git-send-email-ramalingam.c@intel.com
Ramalingam C [Sat, 16 Feb 2019 17:36:48 +0000 (23:06 +0530)]
drm/i915: Gathering the HDCP1.4 routines together
All HDCP1.4 routines are gathered together, followed by the generic
functions that can be extended for HDCP2.2 too.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-2-git-send-email-ramalingam.c@intel.com
Chris Wilson [Tue, 19 Feb 2019 12:21:54 +0000 (12:21 +0000)]
drm/i915: Avoid reset lock in writing fence registers
The idea of taking the reset lock around writing the fence register was
to serialise the mmio write we also perform during the reset where those
registers get clobbered. However, the lock is overkill as write tearing
between reset and fence_update() is harmless; the final value of the
fence register is the same. A race between revoke_fences() and
fence_update() is also harmless at this point as on the fault path where
this is necessary, we acquire the reset lock to coordinate ourselves in
the upper layer.
The danger of acquiring the reset lock again in fence_update() is that
we may recurse from the shrinker along the i915_gem_fault() path.
<4> [125.739646] ============================================
<4> [125.739652] WARNING: possible recursive locking detected
<4> [125.739659] 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1 Tainted: G U
<4> [125.739666] --------------------------------------------
<4> [125.739672] gem_mmap_gtt/1017 is trying to acquire lock:
<4> [125.739679] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x0/0x310 [i915]
<4> [125.739848]
but task is already holding lock:
<4> [125.739854] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.739918]
other info that might help us debug this:
<4> [125.739925] Possible unsafe locking scenario:
<4> [125.739930] CPU0
<4> [125.739934] ----
<4> [125.739937] lock(&dev_priv->gpu_error.reset_backoff_srcu);
<4> [125.739944] lock(&dev_priv->gpu_error.reset_backoff_srcu);
<4> [125.739950]
*** DEADLOCK ***
<4> [125.739958] May be due to missing lock nesting notation
<4> [125.739966] 5 locks held by gem_mmap_gtt/1017:
<4> [125.739972] #0: 00000000471f682c (&mm->mmap_sem){++++}, at: __do_page_fault+0x133/0x500
<4> [125.739987] #1: 0000000026542685 (&dev->struct_mutex){+.+.}, at: i915_gem_fault+0x1f6/0x860 [i915]
<4> [125.740061] #2: 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.740126] #3: 00000000c828eb4f (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.25+0x0/0x30
<4> [125.740140] #4: 000000002d360d65 (shrinker_rwsem){++++}, at: shrink_slab+0x1cb/0x2c0
<4> [125.740151]
stack backtrace:
<4> [125.740159] CPU: 1 PID: 1017 Comm: gem_mmap_gtt Tainted: G U 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1
<4> [125.740170] Hardware name: Dell Inc. OptiPlex 745 /0GW726, BIOS 2.3.1 05/21/2007
<4> [125.740180] Call Trace:
<4> [125.740189] dump_stack+0x67/0x9b
<4> [125.740199] __lock_acquire+0xc75/0x1b00
<4> [125.740209] ? arch_tlb_finish_mmu+0x2a/0xa0
<4> [125.740216] ? tlb_finish_mmu+0x1a/0x30
<4> [125.740222] ? zap_page_range_single+0xe2/0x130
<4> [125.740230] ? lock_acquire+0xa6/0x1c0
<4> [125.740237] lock_acquire+0xa6/0x1c0
<4> [125.740296] ? i915_clear_error_registers+0x280/0x280 [i915]
<4> [125.740357] i915_reset_trylock+0x44/0x310 [i915]
<4> [125.740417] ? i915_clear_error_registers+0x280/0x280 [i915]
<4> [125.740426] ? lockdep_hardirqs_on+0xe0/0x1b0
<4> [125.740434] ? _raw_spin_unlock_irqrestore+0x39/0x60
<4> [125.740499] fence_update+0x218/0x470 [i915]
<4> [125.740571] i915_vma_unbind+0xa6/0x550 [i915]
<4> [125.740640] i915_gem_object_unbind+0xfa/0x190 [i915]
<4> [125.740711] i915_gem_shrink+0x2dc/0x590 [i915]
<4> [125.740722] ? ___preempt_schedule+0x16/0x18
<4> [125.740792] ? i915_gem_shrinker_scan+0xc9/0x130 [i915]
<4> [125.740861] i915_gem_shrinker_scan+0xc9/0x130 [i915]
<4> [125.740870] do_shrink_slab+0x143/0x3f0
<4> [125.740878] shrink_slab+0x228/0x2c0
<4> [125.740886] shrink_node+0x167/0x450
<4> [125.740894] do_try_to_free_pages+0xc4/0x340
<4> [125.740902] try_to_free_pages+0xdc/0x2e0
<4> [125.740911] __alloc_pages_nodemask+0x662/0x1110
<4> [125.740921] ? reacquire_held_locks+0xb5/0x1b0
<4> [125.740928] ? reacquire_held_locks+0xb5/0x1b0
<4> [125.740986] ? i915_reset_trylock+0x192/0x310 [i915]
<4> [125.741045] ? i915_memcpy_init_early+0x30/0x30 [i915]
<4> [125.741054] pte_alloc_one+0x12/0x70
<4> [125.741060] __pte_alloc+0x11/0xf0
<4> [125.741067] apply_to_page_range+0x37e/0x440
<4> [125.741127] remap_io_mapping+0x6c/0x100 [i915]
<4> [125.741196] i915_gem_fault+0x5a9/0x860 [i915]
<4> [125.741204] ? ptlock_alloc+0x15/0x30
<4> [125.741212] __do_fault+0x2c/0xb0
<4> [125.741218] __handle_mm_fault+0x8ee/0xfa0
<4> [125.741227] handle_mm_fault+0x196/0x3a0
<4> [125.741235] __do_page_fault+0x246/0x500
<4> [125.741243] ? page_fault+0x8/0x30
<4> [125.741250] page_fault+0x1e/0x30
<4> [125.741256] RIP: 0033:0x55d0cc456e12
<4> [125.741264] Code: b0 df ff ff 89 c2 8b 85 70 df ff ff 01 c2 8b 85 70 df ff ff 48 98 48 8d 0c 85 00 00 00 00 48 8b 85 e0 df ff ff 48 01 c8 f7 d2 <89> 10 83 85 70 df ff ff 01 81 bd 70 df ff ff ff 03 00 00 7e be 48
<4> [125.741280] RSP: 002b:00007ffc1bab7ab0 EFLAGS: 00010206
<4> [125.741287] RAX: 00007fc787cb6000 RBX: 0000000000000000 RCX: 0000000000000000
<4> [125.741295] RDX: 00000000ffffffff RSI: 0000000000005401 RDI: 0000000000000002
<4> [125.741303] RBP: 00007ffc1bab9b70 R08: 00007ffc1bab7920 R09: 000000000000001b
<4> [125.741310] R10: 7165722074736554 R11: 0000000000000246 R12: 000055d0cc454a80
<4> [125.741318] R13: 00007ffc1bab9f60 R14: 0000000000000000 R15: 0000000000000000
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109665
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-4-chris@chris-wilson.co.uk
Chris Wilson [Wed, 20 Feb 2019 14:56:37 +0000 (14:56 +0000)]
drm/i915: Beware temporary wedging when determining -EIO
At a few points in our uABI, we check to see if the driver is wedged and
report -EIO back to the user in that case. However, as we perform the
check and reset asynchronously (where once before they were both
serialised by the struct_mutex), we may instead see the temporary wedging
used to cancel inflight rendering to avoid a deadlock during reset
(caused by either us timing out in our reset handler,
i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
for a stuck modeset). If we suspect this is the case, that is we see a
wedged driver *and* reset in progress, then wait until the reset is
resolved before reporting upon the wedged status.
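The decision described above can be sketched as follows, under loud assumptions: struct fake_gpu and both functions are hypothetical, and the "wait" is simulated by simply applying the reset's eventual outcome instead of sleeping.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical simulated device state. */
struct fake_gpu {
	bool wedged;			/* wedged bit as currently observed */
	bool reset_in_progress;		/* a reset is running concurrently */
	bool wedged_after_reset;	/* outcome once the reset resolves */
};

/* Stand-in for sleeping until the reset completes. */
static void wait_for_reset(struct fake_gpu *gpu)
{
	gpu->reset_in_progress = false;
	gpu->wedged = gpu->wedged_after_reset;
}

/*
 * Report -EIO only for a terminal wedge. A wedge seen while a reset is
 * still running may be temporary, so wait for the reset to resolve and
 * look again before deciding.
 */
int report_wedged(struct fake_gpu *gpu)
{
	if (!gpu->wedged)
		return 0;
	if (gpu->reset_in_progress)
		wait_for_reset(gpu);
	return gpu->wedged ? -EIO : 0;
}
```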
v2: might_sleep() (Mika)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109580
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190220145637.23503-1-chris@chris-wilson.co.uk
Joonas Lahtinen [Wed, 20 Feb 2019 10:05:46 +0000 (12:05 +0200)]
drm/i915: Update DRIVER_DATE to 20190220
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Joonas Lahtinen [Wed, 20 Feb 2019 09:27:15 +0000 (11:27 +0200)]
Merge tag 'topic/mei-hdcp-2019-02-19' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next-queued
Prep patches + headers for the mei-hdcp/i915 component interfaces
Also contains the prep work in the component helpers plus adjustments
for the snd-hda/i915 component interface.
Plus one small static inline in the drm_hdcp.h header that both i915
and mei_hdcp will need.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190219071619.GA11016@phenom.ffwll.local
Joonas Lahtinen [Wed, 20 Feb 2019 09:04:08 +0000 (11:04 +0200)]
Merge drm/drm-next into drm-intel-next-queued
Doing a backmerge to be able to merge topic/mei-hdcp-2019-02-19 PR.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Dave Airlie [Wed, 20 Feb 2019 02:16:30 +0000 (12:16 +1000)]
Merge https://gitlab.freedesktop.org/drm/msm into drm-next
On the display side, cleanups and fixes to enable modifiers
(QCOM_COMPRESSED). And otherwise mostly misc fixes all around.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGuZ5uBKpf=fHvKpTiD10nychuEY8rnE+HeRz0QMvtY5_A@mail.gmail.com
Dave Airlie [Wed, 20 Feb 2019 00:08:35 +0000 (10:08 +1000)]
Merge branch 'linux-5.1' of git://github.com/skeggsb/linux into drm-next
Various fixes/cleanups, along with initial support for SVM features
utilising HMM address-space mirroring and device memory migration.
There's a lot more work to do in these areas, both in terms of
features and efficiency, but these can slowly trickle in later down
the track.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv5bsB4rRY1Gqa_Bp_KAd-v_q1rGZ4nYmOAQhceL0Nr-Xg@mail.gmail.com
Ben Skeggs [Fri, 15 Feb 2019 05:50:16 +0000 (15:50 +1000)]
drm/nouveau/dmem: use dma addresses during migration copies
Removes the need for temporary VMM mappings.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 15 Feb 2019 04:45:57 +0000 (14:45 +1000)]
drm/nouveau/dmem: use physical vram addresses during migration copies
Removes the need for temporary VMM mappings.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 15 Feb 2019 00:35:05 +0000 (10:35 +1000)]
drm/nouveau/dmem: extend copy function to allow direct use of physical addresses
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Jérôme Glisse [Tue, 7 Aug 2018 20:13:16 +0000 (16:13 -0400)]
drm/nouveau/svm: new ioctl to migrate process memory to GPU memory
This adds an ioctl to migrate a range of process address space to
device memory. On platforms without a cache-coherent bus (x86, ARM, ...)
this means that the CPU cannot access that range directly; instead the CPU
will fault, which migrates the memory back to system memory.
This is behind a staging flag so that we can evolve the API.
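The migrate-then-fault-back behaviour can be modelled with a toy page table. This is only a simulation of the semantics described above: struct fake_range and both functions are hypothetical and do not reflect the ioctl's actual arguments or the HMM API.

```c
#include <stddef.h>

enum page_home { HOME_SYSTEM, HOME_DEVICE };

/* Hypothetical range of process pages and where each one lives. */
struct fake_range {
	enum page_home *pages;
	size_t npages;
	unsigned int cpu_faults;	/* CPU faults taken on device pages */
};

/* Model of the ioctl: move [first, first + count) to device memory. */
void migrate_to_device(struct fake_range *r, size_t first, size_t count)
{
	size_t end = first + count;

	if (end > r->npages || end < first)	/* clamp, and guard overflow */
		end = r->npages;
	for (size_t i = first; i < end; i++)
		r->pages[i] = HOME_DEVICE;
}

/* Model of a CPU access: a device page faults and migrates back. */
void cpu_touch(struct fake_range *r, size_t idx)
{
	if (idx >= r->npages)
		return;
	if (r->pages[idx] == HOME_DEVICE) {
		r->cpu_faults++;
		r->pages[idx] = HOME_SYSTEM;
	}
}
```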
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Jérôme Glisse [Thu, 26 Jul 2018 21:59:13 +0000 (17:59 -0400)]
drm/nouveau/dmem: device memory helpers for SVM
Device memory can be used in SVM, in which case we do not have any of
the existing buffer objects. This commit adds infrastructure to allow
use of device memory without nouveau_bo. Again, this is a temporary
solution until a rework of GPU memory management.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Ben Skeggs [Thu, 5 Jul 2018 02:57:12 +0000 (12:57 +1000)]
drm/nouveau/svm: initial support for shared virtual memory
This uses HMM to mirror a process' CPU page tables into a channel's page
tables, and keep them synchronised so that both the CPU and GPU are able
to access the same memory at the same virtual address.
While this code also supports Volta/Turing, it's only enabled for Pascal
GPUs currently due to channel recovery being unreliable right now on the
later GPUs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 19 Feb 2019 07:21:48 +0000 (17:21 +1000)]
drm/nouveau: prepare for enabling svm with existing userspace interfaces
For a channel to make use of SVM features, it requires a different GPU MMU
configuration than we would normally use, which is not desirable to switch
to unless a client is actively going to use SVM.
In order to support SVM without more extensive changes to the userspace
interfaces, the SVM_INIT ioctl needs to replace the previous configuration
safely.
The only way we can currently do this safely, accounting for some unlikely
failure conditions, is to allocate the new VMM without destroying the last
one, and prioritising the SVM-enabled configuration in the code that cares.
This will get cleaned up again further down the track.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 8 May 2018 10:39:48 +0000 (20:39 +1000)]
drm/nouveau/fault/gv100-: expose VoltaFaultBufferA
This nvclass exposes the replayable fault buffer, which will be used
by SVM to manage GPU page faults.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 8 May 2018 10:39:48 +0000 (20:39 +1000)]
drm/nouveau/fault/gp100: expose MaxwellFaultBufferA
This nvclass exposes the replayable fault buffer, which will be used
by SVM to manage GPU page faults.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 8 May 2018 10:39:48 +0000 (20:39 +1000)]
drm/nouveau/mmu/gp100-: support vmms with gcc/tex replayable faults enabled
Some GPU units are capable of supporting "replayable" page faults, where
the execution unit will wait for SW to fixup GPU page tables rather than
triggering a channel-fatal fault.
This feature isn't useful (it's harmful, even) unless something like HMM
is being used to manage events appearing in the replayable fault buffer,
so, it's disabled by default.
This commit allows a client to request it be enabled.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 9 Jul 2018 06:07:40 +0000 (16:07 +1000)]
drm/nouveau/mmu/gp100-: add privileged methods for fault replay/cancel
Host methods exist to do at least some of what we need, but we are not
currently pushing replay/cancels through a channel like UVM does as it's
not clear whether it's necessary in our case (UVM also updates PTEs with
the GPU).
UVM also pushes a software method for fault cancels on Pascal, seemingly
because the host methods don't appear to be sufficient. If/when we want
to push the replay/cancel on the GPU, we can re-purpose the cancellation
code here to implement that swmthd.
Keep it simple for now, until we figure out exactly what we need here.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 7 Jul 2018 02:35:48 +0000 (12:35 +1000)]
drm/nouveau/mmu: add a privileged method to directly manage PTEs
This provides a somewhat more direct method of manipulating the GPU page
tables, which will be required to support SVM.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Sat, 7 Jul 2018 08:29:20 +0000 (18:29 +1000)]
drm/nouveau/mmu: store mapped flag separately from memory pointer
This will be used to support a privileged client providing PTEs directly,
without a memory object to use as a reference.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 13 Jun 2018 06:25:53 +0000 (16:25 +1000)]
drm/nouveau/mmu: support initialisation of client-managed address-spaces
NVKM is currently responsible for managing the allocation of a client's
GPU address-space, but there's various use-cases (ie. HMM address-space
mirroring) where giving a client more direct control is desirable.
This commit allows for a VMM to be created where the area allocated for
NVKM is limited to a client-specified window, the remainder of address-
space is controlled directly by the client.
Leaving a window is necessary to support various internal requirements,
but also to support existing allocation interfaces as not all of the HW
is capable of working with a HMM allocation.
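The split of the address space can be illustrated with a small ownership check. The structure and names below are hypothetical and only model the idea of an NVKM-managed window inside an otherwise client-controlled address space.

```c
#include <stdint.h>

/* Hypothetical layout of a client-managed VMM. */
struct fake_vmm {
	uint64_t limit;		/* one past the last valid address */
	uint64_t managed_start;	/* start of the NVKM-managed window */
	uint64_t managed_size;	/* size of that window in bytes */
};

enum addr_owner { OWNER_INVALID, OWNER_KERNEL, OWNER_CLIENT };

/* Addresses inside the window belong to NVKM, all others to the client. */
enum addr_owner classify_addr(const struct fake_vmm *vmm, uint64_t addr)
{
	if (addr >= vmm->limit)
		return OWNER_INVALID;
	/* Subtract first so that start + size can never overflow. */
	if (addr >= vmm->managed_start &&
	    addr - vmm->managed_start < vmm->managed_size)
		return OWNER_KERNEL;
	return OWNER_CLIENT;
}
```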
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 5 Feb 2019 04:54:53 +0000 (14:54 +1000)]
drm/nouveau/gr/gf100-: expose method to determine current context
MMU will need access to this info.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Fri, 1 Feb 2019 03:52:50 +0000 (13:52 +1000)]
drm/nouveau/gr/gf100-: expose fecs methods for pausing ctxsw
MMU will need access to these.
v2. Apply fix from Rhys Kidd to send correct FECS method for STOP_CTXSW.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Tue, 12 Feb 2019 13:51:18 +0000 (13:51 +0000)]
drm/nouveau/falcon: fix a few indentation issues
There are a few statements that are indented incorrectly. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/mmu/gf100-: virtualise setting pdb base address for invalidation
It appears that Pascal and newer need something different.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/mmu/gf100-: make mmu invalidate function more general
Will want to reuse this for fault replay/cancellation swmthds.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: store fecs/gpccs falcon pointers in substructures
Future changes will want to add some additional things here, keep them
grouped together.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: move fecs bind_pointer into a function
Makes the code somewhat less magic.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: remove some unnecessary reg writes
This is already done during golden context creation.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: move fecs elpg setup into functions
Makes the code somewhat less magic.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: move fecs discover_pm_image_size into a function
Makes the code somewhat less magic.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: move fecs discover_zcull_image_size into a function
Makes the code somewhat less magic.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: move fecs discover_image_size into a function
Makes the code somewhat less magic.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gr/gf100-: move fecs set_watchdog_timeout method into a function
Makes the code somewhat less magic.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau: allow accelerated buffer moves even when gr isn't present
There's no need to avoid using copy engines if gr init fails for some
reason (usually missing FW, or incomplete bring-up).
It's not terribly useful for an end-user, but it'll slightly speed up
suspend/resume when saving fb contents, and allow for host/ce code to
be validated.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/kms/nv04-nv4x: move resume code to dispnv04 init hook
It has no relevance to the atomic path used by newer GPUs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/kms/nv04-nv4x: move suspend code to dispnv04 fini hook
It has no relevance to the atomic path used by newer GPUs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/kms/nv04-nv4x: move a bunch of pre-nv50 page flip code to dispnv04
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/kms: display destroy/init/fini hooks can be static
Swapped order of functions in dispnv04 to allow this, but no code changes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau: allocate kernel channel(s) before initialising display
Some of the pre-NV50 code depends on SW methods to implement synchronisation
for page flips, and we want to move this setup out of common code, thus
we require the channel to have been allocated before display init.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/disp/gf119-: decode exception reason to human-readable string
We also change the error strings to match NVIDIA's naming.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/bios/init: handle INIT_GENERIC_CONDITION_ID_NO_PANEL_SEQ_DELAYS
As I currently understand it, this is related to features we have no
support for as of yet.
In theory, this change should be a noop, just without the warning.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/bios/init: label existing INIT_GENERIC_CONDITION types
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/secboot: fix missing newline in error messages
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/sec2/tu102-: instantiate SEC2 falcon
Required for ACR.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/sec2: utilise engine PRI address from TOP
Turing has its SEC2 instance in an alternate location, and this avoids
needing to duplicate the code here for it.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/nvdec/tu102-: instantiate NVDEC0 falcon
Required to run VPR scrubber binary as part of secboot.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/nvdec/gp102-: utilise engine PRI address from TOP
Turing has its NVDEC instances in an alternate location.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/gsp/gv100-: instantiate GSP falcon
We need this for Turing ACR, but it's present from Volta onwards.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/top/gv100-: translate entry for the GSP
So we're able to connect fault/interrupt handling to the GSP subdev.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/top: add function to lookup PRI address for devices
Will be using this in upcoming changes to avoid the need for entirely
new subdevs to deal with Turing register moves.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau/core: define GSP subdev
Exact meaning of the acronym is unknown, but we need this for Turing ACR.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Mon, 8 Oct 2018 20:47:36 +0000 (21:47 +0100)]
drm/nouveau: fix missing break in switch statement
The NOUVEAU_GETPARAM_PCI_DEVICE case is missing a break statement and falls
through to the following NOUVEAU_GETPARAM_BUS_TYPE case and may end up
re-assigning the getparam->value to an undesired value. Fix this by adding
in the missing break.
Detected by CoverityScan, CID#1460507 ("Missing break in switch")
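The bug class described above can be sketched as follows. This is a minimal illustration, not the driver code: the enum, the function names and the returned values are all hypothetical stand-ins for the NOUVEAU_GETPARAM_* handling.

```c
#include <stdint.h>

/* Hypothetical parameter IDs standing in for the NOUVEAU_GETPARAM_* cases. */
enum demo_param { DEMO_PARAM_PCI_DEVICE, DEMO_PARAM_BUS_TYPE };

/* Buggy shape: the first case falls through and its value is overwritten. */
static uint64_t demo_getparam_buggy(enum demo_param p)
{
	uint64_t value = 0;

	switch (p) {
	case DEMO_PARAM_PCI_DEVICE:
		value = 0x1234;	/* hypothetical device ID */
		/* no break here, so execution continues into the next case */
	case DEMO_PARAM_BUS_TYPE:
		value = 2;	/* hypothetical bus type code */
		break;
	}
	return value;
}

/* Fixed shape: each case ends with its own break. */
static uint64_t demo_getparam_fixed(enum demo_param p)
{
	uint64_t value = 0;

	switch (p) {
	case DEMO_PARAM_PCI_DEVICE:
		value = 0x1234;
		break;
	case DEMO_PARAM_BUS_TYPE:
		value = 2;
		break;
	}
	return value;
}
```

With the buggy variant, asking for the PCI device yields the bus type value instead; the fixed variant returns the value assigned in the matching case.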
Fixes: 359088d5b8ec ("drm/nouveau: remove trivial cases of nvxx_device() usage")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Gustavo A. R. Silva [Tue, 29 Jan 2019 20:30:46 +0000 (14:30 -0600)]
drm/nouveau: mark expected switch fall-through
In preparation for enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
This patch fixes the following warning:
drivers/gpu/drm/nouveau/nouveau_bo.c:1434:53: warning: this statement may fall through [-Wimplicit-fallthrough=]
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing effort to enable
-Wimplicit-fallthrough.
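For illustration, an intentional fall-through can be marked with a comment that GCC recognises at warning level 3. The function below is a hypothetical example and is not taken from nouveau_bo.c.

```c
/*
 * Hypothetical unwind helper: a later stage must also run the cleanup
 * of every earlier stage, so the cases fall through on purpose.
 */
static int demo_unwind_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 2:
		steps++;
		/* fall through */
	case 1:
		steps++;
		/* fall through */
	case 0:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

The comment documents the intent for readers and silences the warning without changing behaviour.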
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Lyude Paul [Mon, 28 Jan 2019 21:03:50 +0000 (16:03 -0500)]
drm/nouveau: Don't WARN_ON VCPI allocation failures
This is much louder than we want. VCPI allocation failures are quite
normal, since they will happen if any part of the modesetting process is
interrupted by removing the DP MST topology in question. So just print a
debugging message on VCPI failures instead.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: f479c0ba4a17 ("drm/nouveau/kms/nv50: initial support for DP 1.2 multi-stream")
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Wed, 19 Dec 2018 15:29:49 +0000 (15:29 +0000)]
drm/nouveau/pmu: don't print reply values if exec is false
Currently the uninitialized values in the array reply are printed out
when exec is false and nvkm_pmu_send has not updated the array. Avoid
confusion by only dumping out these values if they have been actually
updated.
Detected by CoverityScan, CID#1271291 ("Uninitialized scalar variable")
Fixes: ebb58dc2ef8c ("drm/nouveau/pmu: rename from pwr (no binary change)")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Sun, 25 Nov 2018 17:09:18 +0000 (17:09 +0000)]
drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RON
Currently, the expression for calculating RON is always going to result
in zero no matter the value of ram->mr[1] because the ! operator has
higher precedence than the shift >> operator. I believe adding the missing
parentheses around the expression before applying the ! operator will
give the desired result.
[ Note, not tested ]
Detected by CoverityScan, CID#1324005 ("Operands don't affect result")
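The precedence problem can be reproduced in a few lines. This is a sketch only: the 0x300 mask and the shift by 8 are assumptions chosen for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Buggy: parses as (!(mr1 & 0x300)) >> 8. The ! yields 0 or 1, and
 * shifting either of those right by 8 always gives 0.
 */
static unsigned int demo_ron_buggy(uint32_t mr1)
{
	return !(mr1 & 0x300) >> 8;
}

/* Fixed: shift the masked field down first, then negate the result. */
static unsigned int demo_ron_fixed(uint32_t mr1)
{
	return !((mr1 & 0x300) >> 8);
}
```

The buggy form returns 0 for every input, which is what the "Operands don't affect result" report points at; the parenthesised form returns 1 only when the masked field is zero.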
Fixes: c25bf7b6155c ("drm/nouveau/bios/ramcfg: Separate out RON pull value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Tue, 4 Sep 2018 15:23:33 +0000 (16:23 +0100)]
drm/nouveau/bios/dp: make array vsoff static, shrinks object size
Don't populate the array vsoff on the stack but instead make it
static. Makes the object code smaller by 67 bytes:
Before:
   text    data     bss     dec     hex filename
   5753     112       0    5865    16e9 .../nouveau/nvkm/subdev/bios/dp.o
After:
   text    data     bss     dec     hex filename
   5622     176       0    5798    16a6 .../nouveau/nvkm/subdev/bios/dp.o
(gcc version 8.2.0 x86_64)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 02:13:00 +0000 (12:13 +1000)]
drm/nouveau/ce/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 02:11:47 +0000 (12:11 +1000)]
drm/nouveau/fifo/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 02:10:06 +0000 (12:10 +1000)]
drm/nouveau/disp/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 02:07:23 +0000 (12:07 +1000)]
drm/nouveau/fault/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 02:06:01 +0000 (12:06 +1000)]
drm/nouveau/bar/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 02:04:02 +0000 (12:04 +1000)]
drm/nouveau/mmu/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 01:46:39 +0000 (11:46 +1000)]
drm/nouveau/mc/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Thu, 17 Jan 2019 01:41:53 +0000 (11:41 +1000)]
drm/nouveau/devinit/tu102: rename implementation from tu104
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ilia Mirkin [Sun, 13 Jan 2019 22:50:10 +0000 (17:50 -0500)]
drm/nouveau/volt/gf117: fix speedo readout register
GF117 appears to use the same register as GK104 (but still with the
general Fermi readout mechanism).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108980
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Jordan Crouse [Tue, 19 Feb 2019 18:40:19 +0000 (11:40 -0700)]
drm/msm: Truncate the buffer object name if the copy from user failed
(Resend since there was a compile error that I forgot to commit before sending)
If there is an error while doing a copy_from_user() for MSM_INFO_SET_NAME,
make sure to truncate the object name so that there isn't a chance that
we'll have random data in the string.
This is on top of [1] reported and fixed by Dan Carpenter.
[1] https://patchwork.freedesktop.org/series/56656/
Fixes: f05c83e77460 ("drm/msm: add uapi to get/set debug name")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Dan Carpenter [Thu, 14 Feb 2019 07:19:27 +0000 (10:19 +0300)]
drm/msm: fix an error code in the ioctl
The copy_to/from_user() functions return the number of bytes remaining
to be copied but we should return -EFAULT to the user.
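The return-value convention can be sketched in userspace as follows. demo_copy() is a hypothetical stand-in that mimics only the "bytes not copied" return value of copy_from_user(); the fail_after parameter exists purely to simulate a fault and has no kernel counterpart.

```c
#include <errno.h>
#include <string.h>

/*
 * Userspace stand-in for copy_from_user(): returns the number of bytes
 * that could NOT be copied, 0 on success. Copies at most fail_after bytes.
 */
static unsigned long demo_copy(void *dst, const void *src,
			       unsigned long len, unsigned long fail_after)
{
	unsigned long ok = len < fail_after ? len : fail_after;

	memcpy(dst, src, ok);
	return len - ok;
}

/* Wrong: hands the "bytes remaining" count back as if it were a status. */
static long demo_set_name_buggy(char *name, const char *user,
				unsigned long len, unsigned long fail_after)
{
	return (long)demo_copy(name, user, len, fail_after);
}

/* Right: any non-zero remainder is reported as -EFAULT. */
static long demo_set_name_fixed(char *name, const char *user,
				unsigned long len, unsigned long fail_after)
{
	if (demo_copy(name, user, len, fail_after))
		return -EFAULT;
	return 0;
}
```

A caller of the buggy variant sees a positive number on a partial copy, which looks like success to code that only checks for negative error codes.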
Fixes: f05c83e77460 ("drm/msm: add uapi to get/set debug name")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Chris Wilson [Tue, 19 Feb 2019 12:21:52 +0000 (12:21 +0000)]
drm/i915: Use time based guilty context banning
Currently, we accumulate a score each time a context hangs the GPU, offset
against the number of requests it submits, and if that score exceeds a
certain threshold, we ban that context from submitting any more requests
(cancelling any work in flight). In contrast, we use a simple timer on
the file: if we see more than 9 hangs less than 60s apart in
total across all of its contexts, we ban the client from creating
any more contexts. This leads to a confusing situation where the file
may be banned before the context, so let's use a simple timer scheme for
each.
If the context submits 3 hanging requests within a 120s period, declare
it forbidden to ever send more requests.
This has the advantage of not being easy to repair by simply sending
empty requests, but has the disadvantage that if the context is idle
then it is forgiven. However, if the context is idle, it is not
disrupting the system, but a hog can evade the request counting and
cause much more severe disruption to the system.
Updating ban_score from request retirement is dubious as the retirement
is purposely not in sync with request submission (i.e. we try and batch
retirement to reduce overhead and avoid latency on submission), which
leads to surprising situations where we can forgive a hang immediately
due to a backlog of requests from before the hang being retired
afterwards.
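The "3 hangs within 120s" rule can be sketched as a small sliding window of timestamps. This is an illustrative model, not the i915 implementation: the struct, the names, the use of whole seconds, the strict "less than 120s" comparison and the convention that a timestamp of 0 means "unused slot" are all assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_BAN_PERIOD_S 120	/* window taken from the commit message */
#define DEMO_PRIOR_HANGS  2	/* a 3rd hang inside the window bans */

struct demo_ctx {
	uint64_t hang_ts[DEMO_PRIOR_HANGS];	/* oldest first; 0 = unused */
	bool banned;
};

/* Record a hang at time now_s (seconds, > 0); returns true once banned. */
static bool demo_ctx_mark_hang(struct demo_ctx *ctx, uint64_t now_s)
{
	uint64_t oldest = ctx->hang_ts[0];
	int i;

	/* Ban if the two previous hangs both fall inside the window. */
	if (oldest != 0 && now_s - oldest < DEMO_BAN_PERIOD_S)
		ctx->banned = true;

	/* Slide the window: drop the oldest timestamp, append the new one. */
	for (i = 0; i < DEMO_PRIOR_HANGS - 1; i++)
		ctx->hang_ts[i] = ctx->hang_ts[i + 1];
	ctx->hang_ts[DEMO_PRIOR_HANGS - 1] = now_s;

	return ctx->banned;
}
```

Because only timestamps are consulted, submitting empty requests between hangs does not change the outcome, while a context that stays idle long enough ages its old hangs out of the window.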
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-2-chris@chris-wilson.co.uk
Chris Wilson [Tue, 19 Feb 2019 12:21:57 +0000 (12:21 +0000)]
drm/i915: Trim delays for wedging
CI still reports the occasional multi-second delay for resets, in
particular along the wedge+recovery paths. As the likely, and unbounded,
delay here is from sync_rcu, use the expedited variant instead.
Testcase: igt/gem_eio/unwedge-stress
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-7-chris@chris-wilson.co.uk
Chris Wilson [Tue, 19 Feb 2019 12:21:51 +0000 (12:21 +0000)]
drm/i915: Move verify_wm_state() to heap
The stack usage exceeded 1024 bytes prompting warnings on conservative
setups, so move the temporary allocation for HW readback onto the heap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190219122215.8941-1-chris@chris-wilson.co.uk
Chris Wilson [Mon, 18 Feb 2019 09:46:28 +0000 (09:46 +0000)]
drm/i915: Include reminders about leaving no holes in uAPI enums
We don't want to pre-reserve any holes in our uAPI for that is a sign of
nefarious and hidden activity. Add a reminder about our uAPI
expectations to encourage good practice when adding new defines/enums.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190218094628.13522-1-chris@chris-wilson.co.uk
Ramalingam C [Sat, 16 Feb 2019 05:04:59 +0000 (10:34 +0530)]
drm/audio: declaration of struct device
The header references struct device without a definition
or declaration, resulting in compilation warnings such as
"'struct device' declared inside parameter list..."
This change adds a declaration of struct device to the header to avoid
any such warnings.
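A forward declaration of this kind looks roughly like the sketch below. Only the "struct device;" line reflects the change described; the ops table and helpers are hypothetical and exist to show that pointer-only use compiles without the full definition.

```c
/* Forward declaration: enough for code that only passes pointers around. */
struct device;

/* Hypothetical callback table; the real header declares its own members. */
struct demo_audio_ops {
	int (*get_power)(struct device *dev);
};

/* Hypothetical callback that never looks inside struct device. */
static int demo_always_on(struct device *dev)
{
	(void)dev;
	return 1;
}

/* Calls through the table; returns -1 if no callback is installed. */
static int demo_audio_get_power(const struct demo_audio_ops *ops,
				struct device *dev)
{
	return ops->get_power ? ops->get_power(dev) : -1;
}
```

Without the forward declaration, the first prototype that mentions struct device would declare a new type scoped to that parameter list, which is what the quoted warning complains about.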
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
cc: Takashi Iwai <tiwai@suse.de>
cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550293499-5560-1-git-send-email-ramalingam.c@intel.com
Chris Wilson [Mon, 18 Feb 2019 15:31:06 +0000 (15:31 +0000)]
drm/i915: Restore interrupt enabling after a reset
At least on i965g and i965gm, performing a device reset clobbers the IER
resulting in loss of interrupts thereafter. So, run the irq_postinstall
hook to restore them.
v2: Ville pointed out that he already attempted to solve this problem by
reinstalling the interrupts in intel_reset_finish() (part of the display
handling around reset). However, reinstalling the irq clobbers the
i915->irq_mask which we need for handling MI_USER_INTERRUPTS, and does
so too late to handle any interrupts generated from resuming the rings.
The simple solution to both is to pull the interrupt reenabling from
afterwards to around the device reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190218153106.16768-1-chris@chris-wilson.co.uk
Chris Wilson [Mon, 18 Feb 2019 14:50:50 +0000 (14:50 +0000)]
drm/i915/selftests: Make unbannable contexts for reset handling
igt_ctx_sseu was caught using bannable contexts, and in the course of
resetting rapidly to run its test, was banned. Don't let ourselves ban
the test!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190218145051.18981-1-chris@chris-wilson.co.uk
Chris Wilson [Mon, 18 Feb 2019 10:58:21 +0000 (10:58 +0000)]
drm/i915: Optionally disable automatic recovery after a GPU reset
Some clients, such as mesa, may only emit minimal incremental batches
that rely on the logical context state from previous batches. They know
that recovery is impossible after a hang as their required GPU state is
lost, and that each in flight and subsequent batch will hang (resetting
the context image back to default perpetuating the problem).
To avoid getting into the state in the first place, we can allow clients
to opt out of automatic recovery and elect to ban any guilty context
following a hang. This prevents the continual stream of hangs and allows
the client to recreate their context and rebuild the state from scratch.
v2: Prefer calling it recoverable rather than unrecoverable.
References: https://lists.freedesktop.org/archives/mesa-dev/2019-February/215431.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> # for mesa
Link: https://patchwork.freedesktop.org/patch/msgid/20190218105821.17293-1-chris@chris-wilson.co.uk
Dave Airlie [Mon, 18 Feb 2019 03:27:15 +0000 (13:27 +1000)]
Merge v5.0-rc7 into drm-next
Backmerging for nouveau and imx that needed some fixes for next pulls.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Mon, 18 Feb 2019 02:46:40 +0000 (18:46 -0800)]
Linux 5.0-rc7
Chris Wilson [Sun, 17 Feb 2019 20:25:18 +0000 (20:25 +0000)]
drm/i915/selftests: Move local mock_ggtt allocations to the heap
This struct appears quite large and pushes our stack frame over
1024 bytes -- too high for conservative setups. So move the mock_ggtt
struct to the heap.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190217202518.24730-1-chris@chris-wilson.co.uk
Linus Torvalds [Sun, 17 Feb 2019 17:22:01 +0000 (09:22 -0800)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"This tree reverts a GICv3 commit (which was broken) and fixes it in
another way, by adding a memblock build-time entries quirk for ARM64"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/arm: Revert "Defer persistent reservations until after paging_init()"
arm64, mm, efi: Account for GICv3 LPI tables in static memblock reserve table
Linus Torvalds [Sun, 17 Feb 2019 16:44:38 +0000 (08:44 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Three changes:
- An UV fix/quirk to pull UV BIOS calls into the efi_runtime_lock
locking regime. (This is done by aliasing __efi_uv_runtime_lock to
efi_runtime_lock, which should make the quirk nature obvious and
maintain the general policy that the EFI lock (name...) isn't
exposed to drivers.)
- Our version of MAGA: Make a.out Great Again.
- Add a new Intel model name enumerator to an upstream header to help
reduce dependencies going forward"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls
x86/CPU: Add Icelake model number
x86/a.out: Clear the dump structure initially
Linus Torvalds [Sun, 17 Feb 2019 16:38:13 +0000 (08:38 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two fixes on the kernel side: fix an over-eager condition that failed
larger perf ring-buffer sizes, plus fix crashes in the Intel BTS code
for a corner case, found by fuzzing"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix impossible ring-buffer sizes warning
perf/x86: Add check_period PMU callback
Linus Torvalds [Sun, 17 Feb 2019 16:36:21 +0000 (08:36 -0800)]
Merge tag 'powerpc-5.0-5' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Just one fix, for pgd/pud_present() which were broken on big endian
since v4.20, leading to possible data corruption.
Thanks to: Aneesh Kumar K.V., Erhard F., Jan Kara"
* tag 'powerpc-5.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
Linus Torvalds [Sun, 17 Feb 2019 16:34:10 +0000 (08:34 -0800)]
Merge tag 'csky-for-linus-5.0-rc6' of git://github.com/c-sky/csky-linux
Pull arch/csky fixes from Guo Ren:
"Here are some fixup patches for 5.0-rc6"
* tag 'csky-for-linus-5.0-rc6' of git://github.com/c-sky/csky-linux:
csky: Fixup dead loop in show_stack
csky: Fixup io-range page attribute for mmap("/dev/mem")
csky: coding convention: Use task_stack_page
csky: Fixup wrong pt_regs size
csky: Fixup _PAGE_GLOBAL bit for 610 tlb entry
Linus Torvalds [Sun, 17 Feb 2019 16:32:25 +0000 (08:32 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Two more driver bugfixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: bcm2835: Clear current buffer pointers and counts after a transfer
i2c: cadence: Fix the hold bit setting
Linus Torvalds [Sun, 17 Feb 2019 16:30:35 +0000 (08:30 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- tweaks to Elan drivers (both PS/2 and I2C) to support new devices.
Also revert of one of IDs as that device should really be driven by
i2c-hid + hid-multitouch
- a few drivers have been switched to set_brightness_blocking() call
because they either were sleeping in their set_brightness()
implementation or used workqueue but were not canceling it on unbind.
- ps2-gpio and matrix_keypad needed to [properly] flush their works to
avoid potential use-after-free on unbind.
- other miscellaneous fixes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK
Input: st-keyscan - fix potential zalloc NULL dereference
Input: apanel - switch to using brightness_set_blocking()
Revert "Input: elan_i2c - add ACPI ID for touchpad in ASUS Aspire F5-573G"
Input: qt2160 - switch to using brightness_set_blocking()
Input: matrix_keypad - use flush_delayed_work()
Input: ps2-gpio - flush TX work when closing port
Input: cap11xx - switch to using set_brightness_blocking()
Input: elantech - enable 3rd button support on Fujitsu CELSIUS H780
Input: bma150 - register input device after setting private data
Input: pwm-vibra - stop regulator after disabling pwm, not before
Input: pwm-vibra - prevent unbalanced regulator
Input: snvs_pwrkey - allow selecting driver for i.MX 7D
Linus Torvalds [Sun, 17 Feb 2019 16:28:49 +0000 (08:28 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"A somewhat bigger ARM update, and the usual smattering of x86 bug
fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: vmx: Fix entry number check for add_atomic_switch_msr()
KVM: x86: Recompute PID.ON when clearing PID.SN
KVM: nVMX: Restore a preemption timer consistency check
x86/kvm/nVMX: read from MSR_IA32_VMX_PROCBASED_CTLS2 only when it is available
KVM: arm64: Forbid kprobing of the VHE world-switch code
KVM: arm64: Relax the restriction on using stage2 PUD huge mapping
arm: KVM: Add missing kvm_stage2_has_pmd() helper
KVM: arm/arm64: vgic: Always initialize the group of private IRQs
arm/arm64: KVM: Don't panic on failure to properly reset system registers
arm/arm64: KVM: Allow a VCPU to fully reset itself
KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded
arm64: KVM: Don't generate UNDEF when LORegion feature is present
KVM: arm/arm64: vgic: Make vgic_cpu->ap_list_lock a raw_spinlock
KVM: arm/arm64: vgic: Make vgic_dist->lpi_list_lock a raw_spinlock
KVM: arm/arm64: vgic: Make vgic_irq->irq_lock a raw_spinlock
Mauro Ciancio [Mon, 14 Jan 2019 13:24:53 +0000 (10:24 -0300)]
Input: elan_i2c - add ACPI ID for touchpad in Lenovo V330-15ISK
This adds ELAN0617 to the ACPI table to support Elan touchpad found in
Lenovo V330-15ISK.
Signed-off-by: Mauro Ciancio <mauro@acadeu.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Gabriel Fernandez [Sun, 17 Feb 2019 05:10:16 +0000 (21:10 -0800)]
Input: st-keyscan - fix potential zalloc NULL dereference
This patch fixes the following static checker warning:
drivers/input/keyboard/st-keyscan.c:156 keyscan_probe()
error: potential zalloc NULL dereference: 'keypad_data->input_dev'
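The pattern the checker asks for is to test a zeroing allocation before the first dereference. The sketch below is a userspace illustration, not the st-keyscan code: the structs, the names and the injectable allocator (used only to simulate failure) are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

struct demo_input_dev { const char *name; };
struct demo_keypad { struct demo_input_dev *input_dev; };

/* Allocator that always fails, to exercise the error path. */
static void *demo_alloc_fail(size_t n, size_t size)
{
	(void)n;
	(void)size;
	return NULL;
}

/* Probe-like helper: checks the allocation before touching the result. */
static int demo_keypad_probe(struct demo_keypad *kp,
			     void *(*alloc)(size_t, size_t))
{
	kp->input_dev = alloc(1, sizeof(*kp->input_dev));
	if (!kp->input_dev)
		return -ENOMEM;	/* bail out before the first dereference */

	kp->input_dev->name = "demo-keyscan";
	return 0;
}
```

Passing calloc as the allocator exercises the success path, and demo_alloc_fail exercises the early return.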
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gabriel Fernandez <gabriel.fernandez@st.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>