Chuck Lever [Wed, 19 Dec 2018 15:58:56 +0000 (10:58 -0500)]
xprtrdma: Remove support for FMR memory registration
FMR is not supported on most recent RDMA devices. It is also less
secure than FRWR because an FMR memory registration can expose
adjacent bytes to remote reading or writing. As discussed during the
RDMA BoF at LPC 2018, it is time to remove support for FMR in the
NFS/RDMA client stack.
Note that NFS/RDMA server-side uses either local memory registration
or FRWR. FMR is not used.
There are a few Infiniband/RoCE devices in the kernel tree that do
not appear to support MEM_MGT_EXTENSIONS (FRWR), and therefore will
not support client-side NFS/RDMA after this patch. These are:
- mthca
- qib
- hns (RoCE)
Users of these devices can use NFS/TCP on IPoIB instead.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:51 +0000 (10:58 -0500)]
xprtrdma: Reduce max_frwr_depth
Some devices advertise a large max_fast_reg_page_list_len
capability, but perform optimally when MRs are significantly smaller
than that depth -- probably when the MR itself is no larger than a
page.
By default, the RDMA R/W core API uses max_sge_rd as the maximum
page depth for MRs. For some devices, the value of max_sge_rd is
1, which is also not optimal. Thus, when max_sge_rd is larger than
1, use that value. Otherwise use the value of the
max_fast_reg_page_list_len attribute.
I've tested this with CX-3 Pro, FastLinq, and CX-5 devices. It
reproducibly improves the throughput of large I/Os by several
percent.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:45 +0000 (10:58 -0500)]
xprtrdma: Fix ri_max_segs and the result of ro_maxpages
With certain combinations of krb5i/p, MR size, and r/wsize, I/O can
fail with EMSGSIZE. This is because the calculated value of
ri_max_segs (the max number of MRs per RPC) exceeded
RPCRDMA_MAX_HDR_SEGS, which caused Read or Write list encoding to
walk off the end of the transport header.
Once that was addressed, the ro_maxpages result has to be corrected
to account for the number of MRs needed for Reply chunks, which is
2 MRs smaller than a normal Read or Write chunk.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:40 +0000 (10:58 -0500)]
xprtrdma: Don't wake pending tasks until disconnect is done
Transport disconnect processing does a "wake pending tasks" at
various points.
Suppose an RPC Reply is being processed. The RPC task that Reply
goes with is waiting on the pending queue. If a disconnect wake-up
happens before reply processing is done, that reply, even if it is
good, is thrown away, and the RPC has to be sent again.
This window apparently does not exist for socket transports because
there is a lock held while a reply is being received which prevents
the wake-up call until after reply processing is done.
To resolve this, all RPC replies being processed on an RPC-over-RDMA
transport have to complete before pending tasks are awoken due to a
transport disconnect.
Callers that already hold the transport write lock may invoke
->ops->close directly. Others use a generic helper that schedules
a close when the write lock can be taken safely.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:35 +0000 (10:58 -0500)]
xprtrdma: No qp_event disconnect
After thinking about this more, and auditing other kernel ULP imple-
mentations, I believe that a DISCONNECT cm_event will occur after a
fatal QP event. If that's the case, there's no need for an explicit
disconnect in the QP event handler.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:29 +0000 (10:58 -0500)]
xprtrdma: Replace rpcrdma_receive_wq with a per-xprt workqueue
To address a connection-close ordering problem, we need the ability
to drain the RPC completions running on rpcrdma_receive_wq for just
one transport. Give each transport its own RPC completion workqueue,
and drain that workqueue when disconnecting the transport.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:24 +0000 (10:58 -0500)]
xprtrdma: Refactor Receive accounting
Clean up: Divide the work cleanly:
- rpcrdma_wc_receive is responsible only for RDMA Receives
- rpcrdma_reply_handler is responsible only for RPC Replies
- the posted send and receive counts both belong in rpcrdma_ep
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:19 +0000 (10:58 -0500)]
xprtrdma: Ensure MRs are DMA-unmapped when posting LOCAL_INV fails
The recovery case in frwr_op_unmap_sync needs to DMA unmap each MR.
frwr_release_mr does not DMA-unmap, but the recycle worker does.
Fixes: 61da886bf74e ("xprtrdma: Explicitly resetting MRs is ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chuck Lever [Wed, 19 Dec 2018 15:58:13 +0000 (10:58 -0500)]
xprtrdma: Yet another double DMA-unmap
While chasing yet another set of DMAR fault reports, I noticed that
the frwr recycler conflates whether or not an MR has been DMA
unmapped with frwr->fr_state. Actually the two have only an indirect
relationship. It's in fact impossible to guess reliably whether the
MR has been DMA unmapped based on its fr_state field, especially as
the surrounding code and its assumptions have changed over time.
A better approach is to track the DMA mapping status explicitly so
that the recycler is less brittle to unexpected situations, and
attempts to DMA-unmap a second time are prevented.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v4.20
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Chris Perl [Mon, 17 Dec 2018 15:56:38 +0000 (10:56 -0500)]
NFS: nfs_compare_mount_options always compare auth flavors.
This patch removes the check from nfs_compare_mount_options to see if a
`sec' option was passed for the current mount before comparing auth
flavors and instead just always compares auth flavors.
Consider the following scenario:
You have a server with the address 192.168.1.1 and two exports /export/a
and /export/b. The first export supports `sys' and `krb5' security, the
second just `sys'.
Assume you start with no mounts from the server.
The following results in EIOs being returned as the kernel nfs client
incorrectly thinks it can share the underlying `struct nfs_server's:
$ mkdir /tmp/{a,b}
$ sudo mount -t nfs -o vers=3,sec=krb5 192.168.1.1:/export/a /tmp/a
$ sudo mount -t nfs -o vers=3 192.168.1.1:/export/b /tmp/b
$ df >/dev/null
df: ‘/tmp/b’: Input/output error
Signed-off-by: Chris Perl <cperl@janestreet.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:31 +0000 (11:30 +1100)]
SUNRPC discard cr_uid from struct rpc_cred.
Just use ->cr_cred->fsuid directly.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:31 +0000 (11:30 +1100)]
SUNRPC: simplify auth_unix.
1/ discard 'struct unx_cred'. We don't need any data that
is not already in 'struct rpc_cred'.
2/ Don't keep these creds in a hash table. When a credential
is needed, simply allocate it. When not needed, discard it.
This can easily be faster than performing a lookup on
a shared hash table.
As the lookup can happen during write-out, use a mempool
to ensure forward progress.
This means that we cannot compare two credentials for
equality by comparing the pointers, but we never do that anyway.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:31 +0000 (11:30 +1100)]
SUNRPC: remove crbind rpc_cred operation
This now always just does get_rpccred(), so we
don't need an operation pointer to know to do that.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:31 +0000 (11:30 +1100)]
SUNRPC: remove generic cred code.
This is no longer used.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:31 +0000 (11:30 +1100)]
NFS/NFSD/SUNRPC: replace generic creds with 'struct cred'.
SUNRPC has two sorts of credentials, both of which appear as
"struct rpc_cred".
There are "generic credentials" which are supplied by clients
such as NFS and passed in 'struct rpc_message' to indicate
which user should be used to authorize the request, and there
are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS
which describe the credential to be sent over the wires.
This patch replaces all the generic credentials by 'struct cred'
pointers - the credential structure used throughout Linux.
For machine credentials, there is a special 'struct cred *' pointer
which is statically allocated and recognized where needed as
having a special meaning. A look-up of a low-level cred will
map this to a machine credential.
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
NFS: struct nfs_open_dir_context: convert rpc_cred pointer to cred.
Use the common 'struct cred' to pass credentials for readdir.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
NFS: change access cache to use 'struct cred'.
Rather than keying the access cache with 'struct rpc_cred',
use 'struct cred'. Then use cred_fscmp() to compare
credentials rather than comparing the raw pointer.
A benefit of this approach is that in the common case we avoid the
rpc_lookup_cred_nonblock() call which can be slow when the cred cache is large.
This also keeps many fewer items pinned in the rpc cred cache, so the
cred cache is less likely to get large.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: remove RPCAUTH_AUTH_NO_CRKEY_TIMEOUT
This is no longer used.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
NFS: move credential expiry tracking out of SUNRPC into NFS.
NFS needs to know when a credential is about to expire so that
it can modify write-back behaviour to finish the write inside the
expiry time.
It currently uses functions in SUNRPC code which make use of a
fairly complex callback scheme and flags in the generic credientials.
As I am working to discard the generic credentials, this has to change.
This patch moves the logic into NFS, in part by finding and caching
the low-level credential in the open_context. We then make direct
cred-api calls on that.
This makes the code much simpler and removes a dependency on generic
rpc credentials.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: add side channel to use non-generic cred for rpc call.
The credential passed in rpc_message.rpc_cred is always a
generic credential except in one instance.
When gss_destroying_context() calls rpc_call_null(), it passes
a specific credential that it needs to destroy.
In this case the RPC acts *on* the credential rather than
being authorized by it.
This special case deserves explicit support and providing that will
mean that rpc_message.rpc_cred is *always* generic, allowing
some optimizations.
So add "tk_op_cred" to rpc_task and "rpc_op_cred" to the setup data.
Use this to pass the cred down from rpc_call_null(), and have
rpcauth_bindcred() notice it and bind it in place.
Credit to kernel test robot <fengguang.wu@intel.com> for finding
a bug in earlier version of this patch.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: introduce RPC_TASK_NULLCREDS to request auth_none
In almost all cases the credential stored in rpc_message.rpc_cred
is a "generic" credential. One of the two expections is when an
AUTH_NULL credential is used such as for RPC ping requests.
To improve consistency, don't pass an explicit credential in
these cases, but instead pass NULL and set a task flag,
similar to RPC_TASK_ROOTCREDS, which requests that NULL credentials
be used by default.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
NFS/SUNRPC: don't lookup machine credential until rpcauth_bindcred().
When NFS creates a machine credential, it is a "generic" credential,
not tied to any auth protocol, and is really just a container for
the princpal name.
This doesn't get linked to a genuine credential until rpcauth_bindcred()
is called.
The lookup always succeeds, so various places that test if the machine
credential is NULL, are pointless.
As a step towards getting rid of generic credentials, this patch gets
rid of generic machine credentials. The nfs_client and rpc_client
just hold a pointer to a constant principal name.
When a machine credential is wanted, a special static 'struct rpc_cred'
pointer is used. rpcauth_bindcred() recognizes this, finds the
principal from the client, and binds the correct credential.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: discard RPC_DO_ROOTOVERRIDE()
it is never used.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
NFSv4: don't require lock for get_renew_cred or get_machine_cred
This lock is no longer necessary.
If nfs4_get_renew_cred() needs to hunt through the open-state
creds for a user cred, it still takes the lock to stablize
the rbtree, but otherwise there are no races.
Note that this completely removes the lock from nfs4_renew_state().
It appears that the original need for the locking here was removed
long ago, and there is no longer anything to protect.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
NFSv4: add cl_root_cred for use when machine cred is not available.
NFSv4 state management tries a root credential when no machine
credential is available, as can happen with kerberos.
It does this by replacing the cl_machine_cred with a root credential.
This means that any user of the machine credential needs to take
a lock while getting a reference to the machine credential, which is
a little cumbersome.
So introduce an explicit cl_root_cred, and never free either
credential until client shutdown. This means that no locking
is needed to reference these credentials. Future patches
will make use of this.
This is only a temporary addition. both cl_machine_cred and
cl_root_cred will disappear later in the series.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: remove machine_cred field from struct auth_cred
The cred is a machine_cred iff ->principal is set, so there is no
need for the extra flag.
There is one case which deserves some
explanation. nfs4_root_machine_cred() calls rpc_lookup_machine_cred()
with a NULL principal name which results in not getting a machine
credential, but getting a root credential instead.
This appears to be what is expected of the caller, and is
clearly the result provided by both auth_unix and auth_gss
which already ignore the flag.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: remove uid and gid from struct auth_cred
Use cred->fsuid and cred->fsgid instead.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: remove groupinfo from struct auth_cred.
We can use cred->groupinfo (from the 'struct cred') instead.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
SUNRPC: add 'struct cred *' to auth_cred and rpc_cred
The SUNRPC credential framework was put together before
Linux has 'struct cred'. Now that we have it, it makes sense to
use it.
This first step just includes a suitable 'struct cred *' pointer
in every 'struct auth_cred' and almost every 'struct rpc_cred'.
The rpc_cred used for auth_null has a NULL 'struct cred *' as nothing
else really makes sense.
For rpc_cred, the pointer is reference counted.
For auth_cred it isn't. struct auth_cred are either allocated on
the stack, in which case the thread owns a reference to the auth,
or are part of 'struct generic_cred' in which case gc_base owns the
reference, and "acred" shares it.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
cred: allow get_cred() and put_cred() to be given NULL.
It is common practice for helpers like this to silently,
accept a NULL pointer.
get_rpccred() and put_rpccred() used by NFS act this way
and using the same interface will ease the conversion
for NFS, and simplify the resulting code.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
cred: export get_task_cred().
There is no reason that modules should not be able
to use this, and NFS will need it when converted to
use 'struct cred'.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
cred: add get_cred_rcu()
Sometimes we want to opportunistically get a
ref to a cred in an rcu_read_lock protected section.
get_task_cred() does this, and NFS does as similar thing
with its own credential structures.
To prepare for NFS converting to use 'struct cred' more
uniformly, define get_cred_rcu(), and use it in
get_task_cred().
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
NeilBrown [Mon, 3 Dec 2018 00:30:30 +0000 (11:30 +1100)]
cred: add cred_fscmp() for comparing creds.
NFS needs to compare to credentials, to see if they can
be treated the same w.r.t. filesystem access. Sometimes
an ordering is needed when credentials are used as a key
to an rbtree.
NFS currently has its own private credential management from
before 'struct cred' existed. To move it over to more consistent
use of 'struct cred' we need a comparison function.
This patch adds that function.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Ben Dooks [Wed, 28 Nov 2018 15:05:58 +0000 (15:05 +0000)]
SUNRPC: allow /proc entries without CONFIG_SUNRPC_DEBUG
If we want /proc/sys/sunrpc the current kernel also drags in other debug
features which we don't really want. Instead, we should always show the
following entries:
/proc/sys/sunrpc/udp_slot_table_entries
/proc/sys/sunrpc/tcp_slot_table_entries
/proc/sys/sunrpc/tcp_max_slot_table_entries
/proc/sys/sunrpc/min_resvport
/proc/sys/sunrpc/max_resvport
/proc/sys/sunrpc/tcp_fin_timeout
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Thomas Preston <thomas.preston@codethink.co.uk>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Pavel Tikhomirov [Wed, 14 Nov 2018 08:12:05 +0000 (11:12 +0300)]
nfs: fix comment to nfs_generic_pg_test which does the opposite
Please see comment to filelayout_pg_test for reference.
To: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna.schumaker@netapp.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Olga Kornievskaia [Fri, 26 Oct 2018 20:16:58 +0000 (16:16 -0400)]
NFSv4: cleanup remove unused nfs4_xdev_fs_type
commit
e8f25e6d6d19 "NFS: Remove the NFS v4 xdev mount function"
removed the last use of this.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Trond Myklebust [Mon, 17 Dec 2018 22:33:33 +0000 (17:33 -0500)]
SUNRPC: Remove xprt_connect_status()
Over the years, xprt_connect_status() has been superseded by
call_connect_status(), which now handles all the errors that
xprt_connect_status() does and more. Since the latter converts
all errors that it doesn't recognise to EIO, then it is time
for it to be retired.
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Trond Myklebust [Mon, 17 Dec 2018 22:38:51 +0000 (17:38 -0500)]
SUNRPC: Fix a race with XPRT_CONNECTING
Ensure that we clear XPRT_CONNECTING before releasing the XPRT_LOCK so that
we don't have races between the (asynchronous) socket setup code and
tasks in xprt_connect().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Trond Myklebust [Mon, 17 Dec 2018 18:34:59 +0000 (13:34 -0500)]
SUNRPC: Fix disconnection races
When the socket is closed, we need to call xprt_disconnect_done() in order
to clean up the XPRT_WRITE_SPACE flag, and wake up the sleeping tasks.
However, we also want to ensure that we don't wake them up before the socket
is closed, since that would cause thundering herd issues with everyone
piling up to retransmit before the TCP shutdown dance has completed.
Only the task that holds XPRT_LOCKED needs to wake up early in order to
allow the close to complete.
Reported-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Scott Mayhew <smayhew@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Linus Torvalds [Sun, 16 Dec 2018 23:46:55 +0000 (15:46 -0800)]
Linux 4.20-rc7
Linus Torvalds [Fri, 14 Dec 2018 23:35:30 +0000 (15:35 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/spdxcheck.py: always open files in binary mode
checkstack.pl: fix for aarch64
userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
fs/iomap.c: get/put the page in iomap_page_create/release()
hugetlbfs: call VM_BUG_ON_PAGE earlier in free_huge_page()
memblock: annotate memblock_is_reserved() with __init_memblock
psi: fix reference to kernel commandline enable
arch/sh/include/asm/io.h: provide prototypes for PCI I/O mapping in asm/io.h
mm/sparse: add common helper to mark all memblocks present
mm: introduce common STRUCT_PAGE_MAX_SHIFT define
alpha: fix hang caused by the bootmem removal
Thierry Reding [Fri, 14 Dec 2018 22:17:24 +0000 (14:17 -0800)]
scripts/spdxcheck.py: always open files in binary mode
The spdxcheck script currently falls over when confronted with a binary
file (such as Documentation/logo.gif). To avoid that, always open files
in binary mode and decode line-by-line, ignoring encoding errors.
One tricky case is when piping data into the script and reading it from
standard input. By default, standard input will be opened in text mode,
so we need to reopen it in binary mode.
The breakage only happens with python3 and results in a
UnicodeDecodeError (according to Uwe).
Link: http://lkml.kernel.org/r/20181212131210.28024-1-thierry.reding@gmail.com
Fixes: 6f4d29df66ac ("scripts/spdxcheck.py: make python3 compliant")
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Jeremy Cline <jcline@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Joe Perches <joe@perches.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Fri, 14 Dec 2018 22:17:20 +0000 (14:17 -0800)]
checkstack.pl: fix for aarch64
There is actually a space after "sp," like this,
ffff2000080813c8:
a9bb7bfd stp x29, x30, [sp, #-80]!
Right now, checkstack.pl isn't able to print anything on aarch64,
because it won't be able to match the stating objdump line of a function
due to this missing space. Hence, it displays every stack as zero-size.
After this patch, checkpatch.pl is able to match the start of a
function's objdump, and is then able to calculate each function's stack
correctly.
Link: http://lkml.kernel.org/r/20181207195843.38528-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Fri, 14 Dec 2018 22:17:17 +0000 (14:17 -0800)]
userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger an harmless false positive WARN_ON. Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.
Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: <stable@vger.kernel.org>
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Piotr Jaroszynski [Fri, 14 Dec 2018 22:17:14 +0000 (14:17 -0800)]
fs/iomap.c: get/put the page in iomap_page_create/release()
migrate_page_move_mapping() expects pages with private data set to have
a page_count elevated by 1. This is what used to happen for xfs through
the buffer_heads code before the switch to iomap in commit
82cb14175e7d
("xfs: add support for sub-pagesize writeback without buffer_heads").
Not having the count elevated causes move_pages() to fail on memory
mapped files coming from xfs.
Make iomap compatible with the migrate_page_move_mapping() assumption by
elevating the page count as part of iomap_page_create() and lowering it
in iomap_page_release().
It causes the move_pages() syscall to misbehave on memory mapped files
from xfs. It does not not move any pages, which I suppose is "just" a
perf issue, but it also ends up returning a positive number which is out
of spec for the syscall. Talking to Michal Hocko, it sounds like
returning positive numbers might be a necessary update to move_pages()
anyway though
(https://lkml.kernel.org/r/
20181116114955.GJ14706@dhcp22.suse.cz).
I only hit this in tests that verify that move_pages() actually moved
the pages. The test also got confused by the positive return from
move_pages() (it got treated as a success as positive numbers were not
expected and not handled) making it a bit harder to track down what's
going on.
Link: http://lkml.kernel.org/r/20181115184140.1388751-1-pjaroszynski@nvidia.com
Fixes: 82cb14175e7d ("xfs: add support for sub-pagesize writeback without buffer_heads")
Signed-off-by: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yongkai Wu [Fri, 14 Dec 2018 22:17:10 +0000 (14:17 -0800)]
hugetlbfs: call VM_BUG_ON_PAGE earlier in free_huge_page()
A stack trace was triggered by VM_BUG_ON_PAGE(page_mapcount(page), page)
in free_huge_page(). Unfortunately, the page->mapping field was set to
NULL before this test. This made it more difficult to determine the
root cause of the problem.
Move the VM_BUG_ON_PAGE tests earlier in the function so that if they do
trigger more information is present in the page struct.
Link: http://lkml.kernel.org/r/1543491843-23438-1-git-send-email-nic_w@163.com
Signed-off-by: Yongkai Wu <nic_w@163.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yueyi Li [Fri, 14 Dec 2018 22:17:06 +0000 (14:17 -0800)]
memblock: annotate memblock_is_reserved() with __init_memblock
Found warning:
WARNING: EXPORT symbol "gsi_write_channel_scratch" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: vmlinux.o(.text+0x1e0a0): Section mismatch in reference from the function valid_phys_addr_range() to the function .init.text:memblock_is_reserved()
The function valid_phys_addr_range() references
the function __init memblock_is_reserved().
This is often because valid_phys_addr_range lacks a __init
annotation or the annotation of memblock_is_reserved is wrong.
Use __init_memblock instead of __init.
Link: http://lkml.kernel.org/r/BLUPR13MB02893411BF12EACB61888E80DFAE0@BLUPR13MB0289.namprd13.prod.outlook.com
Signed-off-by: Yueyi Li <liyueyi@live.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Baruch Siach [Fri, 14 Dec 2018 22:17:03 +0000 (14:17 -0800)]
psi: fix reference to kernel commandline enable
The kernel commandline parameter named in CONFIG_PSI_DEFAULT_DISABLED
help text contradicts the documentation in kernel-parameters.txt, and
the code. Fix that.
Link: http://lkml.kernel.org/r/20181203213416.GA12627@cmpxchg.org
Fixes: e0c274472d ("psi: make disabling/enabling easier for vendor kernels")
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark Brown [Fri, 14 Dec 2018 22:17:00 +0000 (14:17 -0800)]
arch/sh/include/asm/io.h: provide prototypes for PCI I/O mapping in asm/io.h
Most architectures provide prototypes for the PCI I/O mapping operations
when asm/io.h is included but SH doesn't currently do that, leading to
for example warnings in sound/pci/hda/patch_ca0132.c when pci_iomap() is
used on current -next. Make SH more consistent with other architectures
by including asm-generic/pci_iomap.h in asm/io.h.
Link: http://lkml.kernel.org/r/20181106175142.27988-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 14 Dec 2018 22:16:57 +0000 (14:16 -0800)]
mm/sparse: add common helper to mark all memblocks present
Presently the arches arm64, arm and sh have a function which loops
through each memblock and calls memory present. riscv will require a
similar function.
Introduce a common memblocks_present() function that can be used by all
the arches. Subsequent patches will cleanup the arches that make use of
this.
Link: http://lkml.kernel.org/r/20181107205433.3875-3-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Logan Gunthorpe [Fri, 14 Dec 2018 22:16:53 +0000 (14:16 -0800)]
mm: introduce common STRUCT_PAGE_MAX_SHIFT define
This define is used by arm64 to calculate the size of the vmemmap
region. It is defined as the log2 of the upper bound on the size of a
struct page.
We move it into mm_types.h so it can be defined properly instead of set
and checked with a build bug. This also allows us to use the same
define for riscv.
Link: http://lkml.kernel.org/r/20181107205433.3875-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Rapoport [Fri, 14 Dec 2018 22:16:50 +0000 (14:16 -0800)]
alpha: fix hang caused by the bootmem removal
The conversion of alpha to memblock as the early memory manager caused
boot to hang as described at [1].
The issue is caused because for CONFIG_DISCTONTIGMEM=y case,
memblock_add() is called using memory start PFN that had been rounded
down to the nearest 8Mb and it caused memblock to see more memory that
is actually present in the system.
Besides, memblock allocates memory from high addresses while bootmem was
using low memory, which broke the assumption that early allocations are
always accessible by the hardware.
This patch ensures that memblock_add() is using the correct PFN for the
memory start and forces memblock to use bottom-up allocations.
[1] https://lkml.org/lkml/2018/11/22/1032
Link: http://lkml.kernel.org/r/1543233216-25833-1-git-send-email-rppt@linux.ibm.com
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 14 Dec 2018 20:18:30 +0000 (12:18 -0800)]
Merge tag 'for-linus-
20181214' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Three small fixes for this week. contains:
- spectre indexing fix for aio (Jeff)
- fix for the previous zeroing bio fix, we don't need it for user
mapped pages, and in fact it breaks some applications if we do
(Keith)
- allocation failure fix for null_blk with zoned (Shin'ichiro)"
* tag 'for-linus-
20181214' of git://git.kernel.dk/linux-block:
block: Fix null_blk_zoned creation failure with small number of zones
aio: fix spectre gadget in lookup_ioctx
block/bio: Do not zero user pages
Linus Torvalds [Fri, 14 Dec 2018 20:14:41 +0000 (12:14 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One fix for the qcom QCS404 clk driver that was merged for this
release.
It specified the wrong parent for a PLL so a part of the clk tree
wasn't rooted correctly. This fixes it by using the right name"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: qcom: qcs404: Fix gpll0_out_main parent
Linus Torvalds [Fri, 14 Dec 2018 17:36:41 +0000 (09:36 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Invalidate the caches before clearing the DMA buffer via the
non-cacheable alias in the FORCE_CONTIGUOUS case"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: dma-mapping: Fix FORCE_CONTIGUOUS buffer clearing
Linus Torvalds [Fri, 14 Dec 2018 17:33:34 +0000 (09:33 -0800)]
Merge tag 'powerpc-4.20-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"One notable fix for our change to split pt_regs between user/kernel,
we forgot to update BPF to use the user-visible type which was an ABI
break for BPF programs.
A slightly ugly but minimal fix to do_syscall_trace_enter() so that we
use tracehook_report_syscall_entry() properly. We'll rework the code
in next to avoid the empty if body.
Seven commits fixing bugs in the new papr_scm (Storage Class Memory)
driver. The driver was finally able to be tested on the other
hypervisor which exposed several bugs. The fixes are all fairly
minimal at least.
Fix a crash in our MSI code if an MSI-capable device is plugged into a
non-MSI capable PHB, only seen on older hardware (MPC8378).
Fix our legacy serial code to look for "stdout-path" since the device
trees were updated to use that instead of "linux,stdout-path".
A change to the COFF zImage code to fix booting old powermacs.
A couple of minor build fixes.
Thanks to: Benjamin Herrenschmidt, Daniel Axtens, Dmitry V. Levin,
Elvira Khabirova, Oliver O'Halloran, Paul Mackerras, Radu Rendec, Rob
Herring, Sandipan Das"
* tag 'powerpc-4.20-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call
powerpc/mm: Fallback to RAM if the altmap is unusable
powerpc/papr_scm: Use ibm,unit-guid as the iset cookie
powerpc/papr_scm: Fix DIMM device registration race
powerpc/papr_scm: Remove endian conversions
powerpc/papr_scm: Update DT properties
powerpc/papr_scm: Fix resource end address
powerpc/papr_scm: Use depend instead of select
powerpc/bpf: Fix broken uapi for BPF_PROG_TYPE_PERF_EVENT
powerpc/boot: Fix build failures with -j 1
powerpc: Look for "stdout-path" when setting up legacy consoles
powerpc/msi: Fix NULL pointer access in teardown code
powerpc/mm: Fix linux page tables build with some configs
powerpc: Fix COFF zImage booting on old powermacs
Linus Torvalds [Fri, 14 Dec 2018 17:22:14 +0000 (09:22 -0800)]
Merge tag 'ceph-for-4.20-rc7' of https://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"Luis discovered a problem with the new copyfrom offload on the server
side. Disable it for now"
* tag 'ceph-for-4.20-rc7' of https://github.com/ceph/ceph-client:
ceph: make 'nocopyfrom' a default mount option
Linus Torvalds [Fri, 14 Dec 2018 17:17:17 +0000 (09:17 -0800)]
Merge tag 'pinctrl-v4.20-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Three pin control fixes for the v4.20 series. Just odd drivers, so
nothing particularly interesting:
- Set the tile property on Qualcomm SDM60.
- Fix up enable register calculation for the Meson
- Fix an IRQ offset on the Sunxi (Allwinner)"
* tag 'pinctrl-v4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: sunxi: a83t: Fix IRQ offset typo for PH11
pinctrl: meson: fix pull enable register calculation
pinctrl: sdm660: Set tile property for pingroups
Linus Torvalds [Fri, 14 Dec 2018 17:12:02 +0000 (09:12 -0800)]
Merge tag 'drm-fixes-2018-12-14' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"While I hoped things would calm down, the world hasn't joined with me,
but it's a few things scattered over a wide area. The i915 workarounds
regression fix is probably the largest, the rest are more usual sized.
We also get some new AMD PCI IDs.
There is also a patch in here to MAINTAINERS to added Daniel as an
official DRM toplevel co-maintainer, he's decided he wants to step up
and share the glory, and he'll likely process next weeks fixes while
I'm away on holidays.
Summary:
amdgpu:
- some new PCI IDs
- fixed firmware image updates
- power management fixes
- locking warning fix
nouveau:
- framebuffer flushing fix
- memory leak fix
- tegra device init regression fix
vmwgfx:
- OOM kernel memory fix
- excess return in function fix
i915:
- the biggest fix is a regression fix where workarounds weren't
getting reapplied after a gpu hang causing further crashing, this
fixes the workaround application to make it happen again
- GPU hang fixes for Braswell and some GEN3 GPUs
- GVT fix for broadwell tiling
rockchip:
- revert to fix a regression causing a WARN on shutdown
mediatek:
- avoid crash attaching to non-existant bridges"
* tag 'drm-fixes-2018-12-14' of git://anongit.freedesktop.org/drm/drm: (23 commits)
drm/vmwgfx: Protect from excessive execbuf kernel memory allocations v3
MAINTAINERS: Daniel for drm co-maintainer
drm/amdgpu: drop fclk/gfxclk ratio setting
drm/vmwgfx: remove redundant return ret statement
drm/i915: Flush GPU relocs harder for gen3
drm/i915: Allocate a common scratch page
drm/i915/execlists: Apply a full mb before execution for Braswell
drm/nouveau/kms: Fix memory leak in nv50_mstm_del()
drm/nouveau/kms/nv50-: also flush fb writes when rewinding push buffer
drm/amdgpu: Fix DEBUG_LOCKS_WARN_ON(depth <= 0) in amdgpu_ctx.lock
Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec"
drm/nouveau/drm/nouveau: tegra: Call nouveau_drm_device_init()
drm/amdgpu/powerplay: Apply avfs cks-off voltages on VI
drm/amdgpu: update SMC firmware image for polaris10 variants
drm/amdkfd: add new vega20 pci id
drm/amdkfd: add new vega10 pci ids
drm/amdgpu: add some additional vega20 pci ids
drm/amdgpu: add some additional vega10 pci ids
drm/amdgpu: update smu firmware images for VI variants (v2)
drm/i915: Introduce per-engine workarounds
...
Linus Torvalds [Fri, 14 Dec 2018 00:35:58 +0000 (16:35 -0800)]
Merge tag 'xarray-4.20-rc7' of git://git.infradead.org/users/willy/linux-dax
Pull XArray fixes from Matthew Wilcox:
"Two bugfixes, each with test-suite updates, two improvements to the
test-suite without associated bugs, and one patch adding a missing
API"
* tag 'xarray-4.20-rc7' of git://git.infradead.org/users/willy/linux-dax:
XArray: Fix xa_alloc when id exceeds max
XArray tests: Check iterating over multiorder entries
XArray tests: Handle larger indices more elegantly
XArray: Add xa_cmpxchg_irq and xa_cmpxchg_bh
radix tree: Don't return retry entries from lookup
Linus Torvalds [Thu, 13 Dec 2018 20:57:21 +0000 (12:57 -0800)]
Merge tag 'linux-kselftest-4.20-rc7' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
"A single fix for a seccomp test from Kees Cook."
* tag 'linux-kselftest-4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/seccomp: Remove SIGSTOP si_pid check
Linus Torvalds [Thu, 13 Dec 2018 20:02:41 +0000 (12:02 -0800)]
Merge branch 'fixes' of git://git./linux/kernel/git/evalenti/linux-soc-thermal
Pull thermal fixes from Eduardo Valentin:
"Fixes for STM and HISI thermal drivers"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: stm32: Fix stm_thermal_read_factory_settings
thermal: stm32: read factory settings inside stm_thermal_prepare
thermal/drivers/hisi: Fix number of sensors on hi3660
thermal/drivers/hisi: Fix wrong platform_get_irq_byname()
Dave Airlie [Thu, 13 Dec 2018 19:19:25 +0000 (05:19 +1000)]
Merge branch 'vmwgfx-fixes-4.20' of git://people.freedesktop.org/~thomash/linux into drm-fixes
One regression fix for avoiding kernel OOM, one cleanup return fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181213122815.10581-1-thellstrom@vmware.com
Matthew Wilcox [Thu, 13 Dec 2018 18:57:42 +0000 (13:57 -0500)]
XArray: Fix xa_alloc when id exceeds max
Specifying a starting ID greater than the maximum ID isn't something
attempted very often, but it should fail. It was succeeding due to
xas_find_marked() returning the wrong error state, so add tests for
both xa_alloc() and xas_find_marked().
Fixes: b803b42823d0 ("xarray: Add XArray iterators")
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Linus Torvalds [Thu, 13 Dec 2018 19:02:23 +0000 (11:02 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Doug Ledford:
"We have 5 small fixes for this pull request. One is a performance
regression, so not necessarily strictly a fix, but it was small and
reasonable and claimed to avoid thrashing in the scheduler, so I took
it. The remaining are all legitimate fixes that match the "we take
fixes any time" criteria.
Summary:
- One performance regression for hfi1
- One kasan fix for hfi1
- A couple mlx5 fixes
- A core oops fix"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/core: Fix oops in netdev_next_upper_dev_rcu()
IB/mlx5: Block DEVX umem from the non applicable cases
IB/mlx5: Fix implicit ODP interrupted page fault
IB/hfi1: Fix an out-of-bounds access in get_hw_stats
IB/hfi1: Fix a latency issue for small messages
Linus Torvalds [Thu, 13 Dec 2018 18:59:02 +0000 (10:59 -0800)]
Merge tag 'mmc-v4.20-rc5' of git://git./linux/kernel/git/ulfh/mmc
Pull mmc fixes from Ulf Hansson:
"MMC core:
- Fixup RPMB requests to use mrq->sbc when sending CMD23
MMC host:
- omap: Fix broken MMC/SD on OMAP15XX/OMAP5910/OMAP310
- sdhci-omap: Fix DCRC error handling during tuning
- sdhci: Fixup the timeout check window for clock and reset"
* tag 'mmc-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci: fix the timeout check window for clock and reset
mmc: sdhci-omap: Fix DCRC error handling during tuning
MMC: OMAP: fix broken MMC on OMAP15XX/OMAP5910/OMAP310
mmc: core: use mrq->sbc when sending CMD23 for RPMB
Linus Torvalds [Thu, 13 Dec 2018 18:54:13 +0000 (10:54 -0800)]
Merge tag 'sound-4.20-rc7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Only usual suspects here: a few more fixups for Realtek HD-audio on
various PCs, including a regression fix in the previous fix for Lenovo
X1 Carbon, as well as a typo fix in the recent Fireface patch"
* tag 'sound-4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Enable audio jacks of ASUS UX433FN/UX333FA with ALC294
ALSA: hda/realtek: Enable audio jacks of ASUS UX533FD with ALC294
ALSA: hda/realtek: ALC294 mic and headset-mode fixups for ASUS X542UN
ALSA: fireface: fix reference to wrong register for clock configuration
ALSA: hda/realtek - Fix the mute LED regresion on Lenovo X1 Carbon
ALSA: hda/realtek - Fixed headphone issue for ALC700
Thomas Hellstrom [Wed, 12 Dec 2018 10:52:08 +0000 (11:52 +0100)]
drm/vmwgfx: Protect from excessive execbuf kernel memory allocations v3
With the new validation code, a malicious user-space app could
potentially submit command streams with enough buffer-object and resource
references in them to have the resulting allocated validion nodes and
relocations make the kernel run out of GFP_KERNEL memory.
Protect from this by having the validation code reserve TTM graphics
memory when allocating.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
---
v2: Removed leftover debug printouts
Linus Torvalds [Thu, 13 Dec 2018 02:29:24 +0000 (18:29 -0800)]
Merge tag 'for-4.20/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM cache metadata to verify that a cache has block before trying
to continue with operation that requires them.
- Fix bio-based DM core's dm_make_request() to properly impose device
limits on individual bios by making use of blk_queue_split().
- Fix long-standing race with how DM thinp notified userspace of
thin-pool mode state changes before they were actually made.
- Fix the zoned target's bio completion handling; this is a fairly
invassive fix at this stage but it is localized to the zoned target.
Any zoned target users will benefit from this fix.
* tag 'for-4.20/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm thin: bump target version
dm thin: send event about thin-pool state change _after_ making it
dm zoned: Fix target BIO completion handling
dm: call blk_queue_split() to impose device limits on bios
dm cache metadata: verify cache has blocks in blocks_are_clean_separate_dirty()
Linus Torvalds [Thu, 13 Dec 2018 02:24:32 +0000 (18:24 -0800)]
Merge tag 'media/v4.20-5' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- one regression at vsp1 driver
- some last time changes for the upcoming request API logic and for
stateless codec support. As the stateless codec "cedrus" driver is at
staging, don't apply the MPEG controls as part of the main V4L2 API,
as those may not be ready for production yet.
* tag 'media/v4.20-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: Add a Kconfig option for the Request API
media: extended-controls.rst: add note to the MPEG2 state controls
media: mpeg2-ctrls.h: move MPEG2 state controls to non-public header
media: vicodec: set state resolution from raw format
media: vivid: drop v4l2_ctrl_request_complete() from start_streaming
media: vb2: don't unbind/put the object when going to state QUEUED
media: vb2: keep a reference to the request until dqbuf
media: vb2: skip request checks for VIDIOC_PREPARE_BUF
media: vb2: don't call __vb2_queue_cancel if vb2_start_streaming failed
media: cedrus: Fix a NULL vs IS_ERR() check
media: vsp1: Fix LIF buffer thresholds
Linus Torvalds [Thu, 13 Dec 2018 02:19:44 +0000 (18:19 -0800)]
Merge tag 'ovl-fixes-4.20-rc7' of git://git./linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
"Needed to revert a patch, because it possibly introduces a security
hole. Since the patch is basically a conceptual cleanup, not a bug
fix, it's safe to revert. I'm not giving up on this, and discussions
seemed to have reached an agreement over how to move forward, but that
can wait 'till the next release.
The other two patches are fixes for bugs introduced in recent
releases"
* tag 'ovl-fixes-4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
Revert "ovl: relax permission checking on underlying layers"
ovl: fix decode of dir file handle with multi lower layers
ovl: fix missing override creds in link of a metacopy upper
Linus Torvalds [Thu, 13 Dec 2018 02:17:35 +0000 (18:17 -0800)]
Merge tag 'fuse-fixes-4.20-rc7' of git://git./linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
"There's one patch fixing a minor but long lived bug, the others are
fixing regressions introduced in this cycle"
* tag 'fuse-fixes-4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS
fuse: Fix memory leak in fuse_dev_free()
fuse: fix revalidation of attributes for permission check
fuse: fix fsync on directory
fuse: Add bad inode check in fuse_destroy_inode()
Linus Torvalds [Thu, 13 Dec 2018 02:15:29 +0000 (18:15 -0800)]
Merge tag 'trace-v4.20-rc6' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"While running various ftrace tests on new development code, the
kmemleak detector found some allocations that were not freed
correctly.
This fixes a couple of leaks in the event trigger code as well as in
adding function trace filters in trace instances"
* tag 'trace-v4.20-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix memory leak of instance function hash filters
tracing: Fix memory leak in set_trigger_filter()
tracing: Fix memory leak in create_filter()
Dave Airlie [Wed, 12 Dec 2018 23:55:05 +0000 (09:55 +1000)]
Merge branch 'mediatek-drm-fixes-4.20' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes
Single bridge attachment fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1544407975.18825.3.camel@mtksdaap41
Daniel Vetter [Mon, 10 Dec 2018 10:30:01 +0000 (11:30 +0100)]
MAINTAINERS: Daniel for drm co-maintainer
lkml and Linus gained a CoC, and it's serious this time. Which means
my no 1 reason for declining to officially step up as drm maintainer
is gone, and I didn't find any new good excuse.
I chatted with a few people in private already, and the biggest
concern is that I mislay my community hat and start running around
with my intel hat only. Or some other convenient abuse of trust.
That's why this patch doesn't just need a lot of acks that mean "yeah
seems fine to me", but a lot of acks that mean "yeah we'll tell you
when you're over the line and usurp you from that comfy chair if you
don't get it". Which I think we've been done a fairly good job here at
dri-devel in general, but better to be clear.
Rough idea is that I'll do this for maybe 2-3 years, helping Dave
figure out a group model for drm overall. And getting the tooling and
infrastructure for that off the ground. Then step down again because
some other shiny thing that needs chasing. Of course as plans tend to
do, this one will probably pan out a bit different in reality.
Cc: David Airlie <airlied@linux.ie>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Sean Paul <sean@poorly.run>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181210103001.30549-1-daniel.vetter@ffwll.ch
Dave Airlie [Wed, 12 Dec 2018 23:32:41 +0000 (09:32 +1000)]
Merge branch 'linux-4.20' of git://github.com/skeggsb/linux into drm-fixes
Three fixes:
tegra regression fix
display flushing fix
mst cleanup fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv7WCPzjQZonk+eS1FgEUKirz-4LOrVpMUVMM=D-GjbVpg@mail.gmail.com
Dave Airlie [Wed, 12 Dec 2018 23:24:18 +0000 (09:24 +1000)]
Merge branch 'drm-fixes-4.20' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Fixes for 4.20:
- Stability fixes for new polaris variants (e.g., RX590)
- New vega pci ids
- Vega20 smu fix
- Ctx locking fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212203022.3054-1-alexander.deucher@amd.com
Dave Airlie [Wed, 12 Dec 2018 21:25:00 +0000 (07:25 +1000)]
Merge tag 'drm-misc-fixes-2018-12-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- rockchip: Revert change causing WARN on shutdown (Brian)
Cc: Brian Norris <briannorris@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212204309.GA150523@art_vandelay
Dave Airlie [Wed, 12 Dec 2018 21:22:47 +0000 (07:22 +1000)]
Merge tag 'drm-intel-fixes-2018-12-12-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Two fixes to avoid GPU hangs (on Braswell and Gen3)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212134010.GA18900@jlahtine-desk.ger.corp.intel.com
Dave Airlie [Wed, 12 Dec 2018 21:21:29 +0000 (07:21 +1000)]
Merge tag 'drm-intel-fixes-2018-12-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix for system crash after GPU hang (Bugzilla #107945)
- GVT fix for guest graphics corruption (https://github.com/intel/gvt-linux/issues/61)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207104352.GA18214@jlahtine-desk.ger.corp.intel.com
Evan Quan [Wed, 12 Dec 2018 06:56:14 +0000 (14:56 +0800)]
drm/amdgpu: drop fclk/gfxclk ratio setting
Since this is not needed any more on the latest SMC firmware.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Mark Zhang [Wed, 5 Dec 2018 13:50:49 +0000 (15:50 +0200)]
IB/core: Fix oops in netdev_next_upper_dev_rcu()
When support for bonding of RoCE devices was added, there was
necessarily a link between the RoCE device and the paired netdevice that
was part of the bond. If you remove the mlx4_en module, that paired
association is broken (the RoCE device is still present but the paired
netdevice has been released). We need to account for this in
is_upper_ndev_bond_master_filter() and filter out those links with a
broken pairing or else we later oops in netdev_next_upper_dev_rcu().
Fixes: 408f1242d940 ("IB/core: Delete lower netdevice default GID entries in bonding scenario")
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Mike Snitzer [Wed, 12 Dec 2018 14:39:54 +0000 (09:39 -0500)]
dm thin: bump target version
Decoupled version bump from commit
f6c367585d0 ("dm thin: send event
about thin-pool state change _after_ making it") because version bumps
just create conflicts when backporting to the stable trees.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Colin Ian King [Thu, 4 Oct 2018 17:49:53 +0000 (18:49 +0100)]
drm/vmwgfx: remove redundant return ret statement
The return statement is redundant as there is a return statement
immediately before it so we have dead code that can be removed.
Also remove the unused declaration of ret.
Detected by CoverityScan, CID#
1473793 ("Structurally dead code")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Chris Wilson [Fri, 7 Dec 2018 13:40:37 +0000 (13:40 +0000)]
drm/i915: Flush GPU relocs harder for gen3
Adding an extra MI_STORE_DWORD_IMM to the gpu relocation path for gen3
was good, but still not good enough. To survive 24+ hours under test we
needed to perform not one, not two but three extra store-dw. Doing so
for each GPU relocation was a little unsightly and since we need to
worry about userspace hitting the same issues, we should apply the dummy
store-dw into the EMIT_FLUSH.
Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing")
References:
7fa28e146994 ("drm/i915: Write GPU relocs harder with gen3")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181207134037.11848-1-chris@chris-wilson.co.uk
(cherry picked from commit
a889580c087a9cf91fddb3832ece284174214183)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Tue, 4 Dec 2018 14:15:16 +0000 (14:15 +0000)]
drm/i915: Allocate a common scratch page
Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.
For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.
v2: Check that we allocated a large enough scratch for I830 w/a
Fixes: 06e562e7f515 ("drm/i915/ringbuffer: Delay after EMIT_INVALIDATE for gen4/gen5") # v4.18.20
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108850
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181204141522.13640-1-chris@chris-wilson.co.uk
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.18.20+
(cherry picked from commit
5179749925933575a67f9d8f16d0cc204f98a29f)
[Joonas: Use new function in gen9_init_indirectctx_bb too]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Thu, 6 Dec 2018 08:44:31 +0000 (08:44 +0000)]
drm/i915/execlists: Apply a full mb before execution for Braswell
Braswell is really picky about having our writes posted to memory before
we execute or else the GPU may see stale values. A wmb() is insufficient
as it only ensures the writes are visible to other cores, we need a full
mb() to ensure the writes are in memory and visible to the GPU.
The most frequent failure in flushing before execution is that we see
stale PTE values and execute the wrong pages.
References:
987abd5c62f9 ("drm/i915/execlists: Force write serialisation into context image vs execution")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181206084431.9805-3-chris@chris-wilson.co.uk
(cherry picked from commit
490b8c65b9db45896769e1095e78725775f47b3e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Lyude Paul [Tue, 11 Dec 2018 23:56:20 +0000 (18:56 -0500)]
drm/nouveau/kms: Fix memory leak in nv50_mstm_del()
Noticed this while working on redoing the reference counting scheme in
the DP MST helpers. Nouveau doesn't attempt to call
drm_dp_mst_topology_mgr_destroy() at all, which leaves it leaking all of
the resources for drm_dp_mst_topology_mgr and it's children mstbs+ports.
Fixes: f479c0ba4a17 ("drm/nouveau/kms/nv50: initial support for DP 1.2 multi-stream")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Wed, 12 Dec 2018 06:51:17 +0000 (16:51 +1000)]
drm/nouveau/kms/nv50-: also flush fb writes when rewinding push buffer
Should hopefully fix a regression some people have been seeing since EVO
push buffers were moved to VRAM by default on Pascal GPUs.
Fixes: d00ddd9da ("drm/nouveau/kms/nv50-: allocate push buffers in vidmem on pascal")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: <stable@vger.kernel.org> # 4.19+
Kees Cook [Thu, 6 Dec 2018 23:50:38 +0000 (15:50 -0800)]
selftests/seccomp: Remove SIGSTOP si_pid check
Commit
f149b3155744 ("signal: Never allocate siginfo for SIGKILL or SIGSTOP")
means that the seccomp selftest cannot check si_pid under SIGSTOP anymore.
Since it's believed[1] there are no other userspace things depending on the
old behavior, this removes the behavioral check in the selftest, since it's
more a "extra" sanity check (which turns out, maybe, not to have been
useful to test).
[1] https://lkml.kernel.org/r/CAGXu5jJaZAOzP1qFz66tYrtbuywqb+UN2SOA1VLHpCCOiYvYeg@mail.gmail.com
Reported-by: Tycho Andersen <tycho@tycho.ws>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
Shin'ichiro Kawasaki [Tue, 11 Dec 2018 12:08:26 +0000 (21:08 +0900)]
block: Fix null_blk_zoned creation failure with small number of zones
null_blk_zoned creation fails if the number of zones specified is equal to or is
smaller than 64 due to a memory allocation failure in blk_alloc_zones(). With
such a small number of zones, the required memory size for all zones descriptors
fits in a single page, and the page order for alloc_pages_node() is zero. Allow
this value in blk_alloc_zones() for the allocation to succeed.
Fixes: bf5054569653 "block: Introduce blk_revalidate_disk_zones()"
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Chad Austin [Mon, 10 Dec 2018 18:54:52 +0000 (10:54 -0800)]
fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS
When FUSE_OPEN returns ENOSYS, the no_open bit is set on the connection.
Because the FUSE_RELEASE and FUSE_RELEASEDIR paths share code, this
incorrectly caused the FUSE_RELEASEDIR request to be dropped and never sent
to userspace.
Pass an isdir bool to distinguish between FUSE_RELEASE and FUSE_RELEASEDIR
inside of fuse_file_put.
Fixes: 7678ac50615d ("fuse: support clients that don't implement 'open'")
Cc: <stable@vger.kernel.org> # v3.14
Signed-off-by: Chad Austin <chadaustin@fb.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Mike Snitzer [Tue, 11 Dec 2018 18:31:40 +0000 (13:31 -0500)]
dm thin: send event about thin-pool state change _after_ making it
Sending a DM event before a thin-pool state change is about to happen is
a bug. It wasn't realized until it became clear that userspace response
to the event raced with the actual state change that the event was
meant to notify about.
Fix this by first updating internal thin-pool state to reflect what the
DM event is being issued about. This fixes a long-standing racey/buggy
userspace device-mapper-test-suite 'resize_io' test that would get an
event but not find the state it was looking for -- so it would just go
on to hang because no other events caused the test to reevaluate the
thin-pool's state.
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Steven Rostedt (VMware) [Tue, 11 Dec 2018 04:58:01 +0000 (23:58 -0500)]
tracing: Fix memory leak of instance function hash filters
The following commands will cause a memory leak:
# cd /sys/kernel/tracing
# mkdir instances/foo
# echo schedule > instance/foo/set_ftrace_filter
# rmdir instances/foo
The reason is that the hashes that hold the filters to set_ftrace_filter and
set_ftrace_notrace are not freed if they contain any data on the instance
and the instance is removed.
Found by kmemleak detector.
Cc: stable@vger.kernel.org
Fixes: 591dffdade9f ("ftrace: Allow for function tracing instance to filter functions")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Mon, 10 Dec 2018 02:17:30 +0000 (21:17 -0500)]
tracing: Fix memory leak in set_trigger_filter()
When create_event_filter() fails in set_trigger_filter(), the filter may
still be allocated and needs to be freed. The caller expects the
data->filter to be updated with the new filter, even if the new filter
failed (we could add an error message by setting set_str parameter of
create_event_filter(), but that's another update).
But because the error would just exit, filter was left hanging and
nothing could free it.
Found by kmemleak detector.
Cc: stable@vger.kernel.org
Fixes: bac5fb97a173a ("tracing: Add and use generic set_trigger_filter() implementation")
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Steven Rostedt (VMware) [Sun, 9 Dec 2018 02:10:04 +0000 (21:10 -0500)]
tracing: Fix memory leak in create_filter()
The create_filter() calls create_filter_start() which allocates a
"parse_error" descriptor, but fails to call create_filter_finish() that
frees it.
The op_stack and inverts in predicate_parse() were also not freed.
Found by kmemleak detector.
Cc: stable@vger.kernel.org
Fixes: 80765597bc587 ("tracing: Rewrite filter logic to be simpler and faster")
Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Jeff Moyer [Tue, 11 Dec 2018 17:37:49 +0000 (12:37 -0500)]
aio: fix spectre gadget in lookup_ioctx
Matthew pointed out that the ioctx_table is susceptible to spectre v1,
because the index can be controlled by an attacker. The below patch
should mitigate the attack for all of the aio system calls.
Cc: stable@vger.kernel.org
Reported-by: Matthew Wilcox <willy@infradead.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Luis Henriques [Mon, 10 Dec 2018 10:23:12 +0000 (10:23 +0000)]
ceph: make 'nocopyfrom' a default mount option
Since we found a problem with the 'copy-from' operation after objects have
been truncated, offloading object copies to OSDs should be discouraged
until the issue is fixed.
Thus, this patch adds the 'nocopyfrom' mount option to the default mount
options which effectily means that remote copies won't be done in
copy_file_range unless they are explicitly enabled at mount time.
[ Adjust ceph_show_options() accordingly. ]
Link: https://tracker.ceph.com/issues/37378
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Andrey Grodzovsky [Thu, 6 Dec 2018 20:51:37 +0000 (15:51 -0500)]
drm/amdgpu: Fix DEBUG_LOCKS_WARN_ON(depth <= 0) in amdgpu_ctx.lock
If CS is submitted using guilty ctx, we terminate amdgpu_cs_parser_init
before locking ctx->lock, latter in amdgpu_cs_parser_fini we still are
trying to release the lock just becase parser->ctx != NULL.
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brian Norris [Wed, 5 Dec 2018 18:16:57 +0000 (10:16 -0800)]
Revert "drm/rockchip: Allow driver to be shutdown on reboot/kexec"
This reverts commit
7f3ef5dedb146e3d5063b6845781ad1bb59b92b5.
It causes new warnings [1] on shutdown when running the Google Kevin or
Scarlet (RK3399) boards under Chrome OS. Presumably our usage of DRM is
different than what Marc and Heiko test.
We're looking at a different approach (e.g., [2]) to replace this, but
IMO the revert should be taken first, as it already propagated to
-stable.
[1] Report here:
http://lkml.kernel.org/lkml/
20181205030127.GA200921@google.com
WARNING: CPU: 4 PID: 2035 at drivers/gpu/drm/drm_mode_config.c:477 drm_mode_config_cleanup+0x1c4/0x294
...
Call trace:
drm_mode_config_cleanup+0x1c4/0x294
rockchip_drm_unbind+0x4c/0x8c
component_master_del+0x88/0xb8
rockchip_drm_platform_remove+0x2c/0x44
rockchip_drm_platform_shutdown+0x20/0x2c
platform_drv_shutdown+0x2c/0x38
device_shutdown+0x164/0x1b8
kernel_restart_prepare+0x40/0x48
kernel_restart+0x20/0x68
...
Memory manager not clean during takedown.
WARNING: CPU: 4 PID: 2035 at drivers/gpu/drm/drm_mm.c:950 drm_mm_takedown+0x34/0x44
...
drm_mm_takedown+0x34/0x44
rockchip_drm_unbind+0x64/0x8c
component_master_del+0x88/0xb8
rockchip_drm_platform_remove+0x2c/0x44
rockchip_drm_platform_shutdown+0x20/0x2c
platform_drv_shutdown+0x2c/0x38
device_shutdown+0x164/0x1b8
kernel_restart_prepare+0x40/0x48
kernel_restart+0x20/0x68
...
[2] https://patchwork.kernel.org/patch/
10556151/
https://www.spinics.net/lists/linux-rockchip/msg21342.html
[PATCH] drm/rockchip: shutdown drm subsystem on shutdown
Fixes: 7f3ef5dedb14 ("drm/rockchip: Allow driver to be shutdown on reboot/kexec")
Cc: Jeffy Chen <jeffy.chen@rock-chips.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Vicente Bergas <vicencb@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Heiko Stuebner <heiko@sntech.de>
Cc: stable@vger.kernel.org
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20181205181657.177703-1-briannorris@chromium.org