openwrt/staging/blogic.git
6 years agoblk-wbt: don't maintain inflight counts if disabled
Jens Axboe [Thu, 23 Aug 2018 15:34:46 +0000 (09:34 -0600)]
blk-wbt: don't maintain inflight counts if disabled

A previous commit removed the ability to have per-rq flags. We used
those flags to maintain inflight counts. Since we don't have those
anymore, we have to always maintain inflight counts, even if wbt is
disabled. This is clearly suboptimal.

Add a queue quiesce around changing the wbt latency settings from sysfs
to work around this. With that, we can reliably put the enabled check in
our bio_to_wbt_flags(), since we know the WBT_TRACKED flag will be
consistent for the lifetime of the request.

Fixes: c1c80384c8f ("block: remove external dependency on wbt_flags")
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblk-wbt: fix has-sleeper queueing check
Jens Axboe [Mon, 20 Aug 2018 19:22:27 +0000 (13:22 -0600)]
blk-wbt: fix has-sleeper queueing check

We need to do this inside the loop as well, or we can allow new
IO to supersede previous IO.

Tested-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblk-wbt: use wq_has_sleeper() for wq active check
Jens Axboe [Mon, 20 Aug 2018 19:20:50 +0000 (13:20 -0600)]
blk-wbt: use wq_has_sleeper() for wq active check

We need the memory barrier before checking the list head,
use the appropriate helper for this. The matching queue
side memory barrier is provided by set_current_state().

Tested-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoblk-wbt: move disable check into get_limit()
Jens Axboe [Mon, 20 Aug 2018 19:24:25 +0000 (13:24 -0600)]
blk-wbt: move disable check into get_limit()

Check it in one place, instead of in multiple places.

Tested-by: Anchal Agarwal <anchalag@amazon.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agobcache: release dc->writeback_lock properly in bch_writeback_thread()
Shan Hai [Wed, 22 Aug 2018 18:02:56 +0000 (02:02 +0800)]
bcache: release dc->writeback_lock properly in bch_writeback_thread()

The writeback thread would exit with a lock held when the cache device
is detached via sysfs interface, fix it by releasing the held lock
before exiting the while-loop.

Fixes: fadd94e05c02 (bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set)
Signed-off-by: Shan Hai <shan.hai@oracle.com>
Signed-off-by: Coly Li <colyli@suse.de>
Tested-by: Shenghui Wang <shhuiw@foxmail.com>
Cc: stable@vger.kernel.org #4.17+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoMerge tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-block
Linus Torvalds [Wed, 22 Aug 2018 20:38:05 +0000 (13:38 -0700)]
Merge tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-block

Pull more block updates from Jens Axboe:

 - Set of bcache fixes and changes (Coly)

 - The flush warn fix (me)

 - Small series of BFQ fixes (Paolo)

 - wbt hang fix (Ming)

 - blktrace fix (Steven)

 - blk-mq hardware queue count update fix (Jianchao)

 - Various little fixes

* tag 'for-4.19/post-20180822' of git://git.kernel.dk/linux-block: (31 commits)
  block/DAC960.c: make some arrays static const, shrinks object size
  blk-mq: sync the update nr_hw_queues with blk_mq_queue_tag_busy_iter
  blk-mq: init hctx sched after update ctx and hctx mapping
  block: remove duplicate initialization
  tracing/blktrace: Fix to allow setting same value
  pktcdvd: fix setting of 'ret' error return for a few cases
  block: change return type to bool
  block, bfq: return nbytes and not zero from struct cftype .write() method
  block, bfq: improve code of bfq_bfqq_charge_time
  block, bfq: reduce write overcharge
  block, bfq: always update the budget of an entity when needed
  block, bfq: readd missing reset of parent-entity service
  blk-wbt: fix IO hang in wbt_wait()
  block: don't warn for flush on read-only device
  bcache: add the missing comments for smp_mb()/smp_wmb()
  bcache: remove unnecessary space before ioctl function pointer arguments
  bcache: add missing SPDX header
  bcache: move open brace at end of function definitions to next line
  bcache: add static const prefix to char * array declarations
  bcache: fix code comments style
  ...

6 years agoMerge tag 'f2fs-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk...
Linus Torvalds [Wed, 22 Aug 2018 20:29:39 +0000 (13:29 -0700)]
Merge tag 'f2fs-for-4.19' of git://git./linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've tuned f2fs to improve general performance by
  serializing block allocation and enhancing discard flows like fstrim
  which avoids user IO contention. And we've added fsync_mode=nobarrier
  which gives an option to user where it skips issuing cache_flush
  commands to underlying flash storage. And there are many bug fixes
  related to fuzzed images, revoked atomic writes, quota ops, and minor
  direct IO.

  Enhancements:
   - add fsync_mode=nobarrier which bypasses cache_flush command
   - enhance the discarding flow which avoids user IOs and issues in
     LBA order
   - readahead some encrypted blocks during GC
   - enable in-memory inode checksum to verify the blocks if
     F2FS_CHECK_FS is set
   - enhance nat_bits behavior
   - set -o discard by default
   - set REQ_RAHEAD to bio in ->readpages

  Bug fixes:
   - fix a corner case to corrupt atomic_writes revoking flow
   - revisit i_gc_rwsem to fix race conditions
   - fix some dio behaviors captured by xfstests
   - correct handling errors given by quota-related failures
   - add many sanity check flows to avoid fuzz test failures
   - add more error number propagation to their callers
   - fix several corner cases to continue fault injection w/ shutdown
     loop"

* tag 'f2fs-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (89 commits)
  f2fs: readahead encrypted block during GC
  f2fs: avoid fi->i_gc_rwsem[WRITE] lock in f2fs_gc
  f2fs: fix performance issue observed with multi-thread sequential read
  f2fs: fix to skip verifying block address for non-regular inode
  f2fs: rework fault injection handling to avoid a warning
  f2fs: support fault_type mount option
  f2fs: fix to return success when trimming meta area
  f2fs: fix use-after-free of dicard command entry
  f2fs: support discard submission error injection
  f2fs: split discard command in prior to block layer
  f2fs: wake up gc thread immediately when gc_urgent is set
  f2fs: fix incorrect range->len in f2fs_trim_fs()
  f2fs: refresh recent accessed nat entry in lru list
  f2fs: fix avoid race between truncate and background GC
  f2fs: avoid race between zero_range and background GC
  f2fs: fix to do sanity check with block address in main area v2
  f2fs: fix to do sanity check with inline flags
  f2fs: fix to reset i_gc_failures correctly
  f2fs: fix invalid memory access
  f2fs: fix to avoid broken of dnode block list
  ...

6 years agoovl: set I_CREATING on inode being created
Miklos Szeredi [Wed, 22 Aug 2018 08:55:22 +0000 (10:55 +0200)]
ovl: set I_CREATING on inode being created

...otherwise there will be list corruption due to inode_sb_list_add() being
called for inode already on the sb list.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: e950564b97fd ("vfs: don't evict uninitialized inode")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Wed, 22 Aug 2018 19:34:08 +0000 (12:34 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge more updates from Andrew Morton:

 - the rest of MM

 - procfs updates

 - various misc things

 - more y2038 fixes

 - get_maintainer updates

 - lib/ updates

 - checkpatch updates

 - various epoll updates

 - autofs updates

 - hfsplus

 - some reiserfs work

 - fatfs updates

 - signal.c cleanups

 - ipc/ updates

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (166 commits)
  ipc/util.c: update return value of ipc_getref from int to bool
  ipc/util.c: further variable name cleanups
  ipc: simplify ipc initialization
  ipc: get rid of ids->tables_initialized hack
  lib/rhashtable: guarantee initial hashtable allocation
  lib/rhashtable: simplify bucket_table_alloc()
  ipc: drop ipc_lock()
  ipc/util.c: correct comment in ipc_obtain_object_check
  ipc: rename ipcctl_pre_down_nolock()
  ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
  ipc: reorganize initialization of kern_ipc_perm.seq
  ipc: compute kern_ipc_perm.id under the ipc lock
  init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
  fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
  adfs: use timespec64 for time conversion
  kernel/sysctl.c: fix typos in comments
  drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md
  fork: don't copy inconsistent signal handler state to child
  signal: make get_signal() return bool
  signal: make sigkill_pending() return bool
  ...

6 years agoipc/util.c: update return value of ipc_getref from int to bool
Manfred Spraul [Wed, 22 Aug 2018 05:02:04 +0000 (22:02 -0700)]
ipc/util.c: update return value of ipc_getref from int to bool

ipc_getref has still a return value of type "int", matching the atomic_t
interface of atomic_inc_not_zero()/atomic_add_unless().

ipc_getref now uses refcount_inc_not_zero, which has a return value of
type "bool".

Therefore, update the return code to avoid implicit conversions.

Link: http://lkml.kernel.org/r/20180712185241.4017-13-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc/util.c: further variable name cleanups
Manfred Spraul [Wed, 22 Aug 2018 05:02:00 +0000 (22:02 -0700)]
ipc/util.c: further variable name cleanups

The varable names got a mess, thus standardize them again:

id: user space id. Called semid, shmid, msgid if the type is known.
    Most functions use "id" already.
idx: "index" for the idr lookup
    Right now, some functions use lid, ipc_addid() already uses idx as
    the variable name.
seq: sequence number, to avoid quick collisions of the user space id
key: user space key, used for the rhash tree

Link: http://lkml.kernel.org/r/20180712185241.4017-12-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc: simplify ipc initialization
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:56 +0000 (22:01 -0700)]
ipc: simplify ipc initialization

Now that we know that rhashtable_init() will not fail, we can get rid of a
lot of the unnecessary cleanup paths when the call errored out.

[manfred@colorfullife.com: variable name added to util.h to resolve checkpatch warning]
Link: http://lkml.kernel.org/r/20180712185241.4017-11-manfred@colorfullife.com
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc: get rid of ids->tables_initialized hack
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:52 +0000 (22:01 -0700)]
ipc: get rid of ids->tables_initialized hack

In sysvipc we have an ids->tables_initialized regarding the rhashtable,
introduced in 0cfb6aee70bd ("ipc: optimize semget/shmget/msgget for lots
of keys")

It's there, specifically, to prevent nil pointer dereferences, from using
an uninitialized api.  Considering how rhashtable_init() can fail
(probably due to ENOMEM, if anything), this made the overall ipc
initialization capable of failure as well.  That alone is ugly, but fine,
however I've spotted a few issues regarding the semantics of
tables_initialized (however unlikely they may be):

- There is inconsistency in what we return to userspace: ipc_addid()
  returns ENOSPC which is certainly _wrong_, while ipc_obtain_object_idr()
  returns EINVAL.

- After we started using rhashtables, ipc_findkey() can return nil upon
  !tables_initialized, but the caller expects nil for when the ipc
  structure isn't found, and can therefore call into ipcget() callbacks.

Now that rhashtable initialization cannot fail, we can properly get rid of
the hack altogether.

[manfred@colorfullife.com: commit id extended to 12 digits]
Link: http://lkml.kernel.org/r/20180712185241.4017-10-manfred@colorfullife.com
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/rhashtable: guarantee initial hashtable allocation
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:48 +0000 (22:01 -0700)]
lib/rhashtable: guarantee initial hashtable allocation

rhashtable_init() may fail due to -ENOMEM, thus making the entire api
unusable.  This patch removes this scenario, however unlikely.  In order
to guarantee memory allocation, this patch always ends up doing
GFP_KERNEL|__GFP_NOFAIL for both the tbl as well as
alloc_bucket_spinlocks().

Upon the first table allocation failure, we shrink the size to the
smallest value that makes sense and retry with __GFP_NOFAIL semantics.
With the defaults, this means that from 64 buckets, we retry with only 4.
Any later issues regarding performance due to collisions or larger table
resizing (when more memory becomes available) is the least of our
problems.

Link: http://lkml.kernel.org/r/20180712185241.4017-9-manfred@colorfullife.com
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/rhashtable: simplify bucket_table_alloc()
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:45 +0000 (22:01 -0700)]
lib/rhashtable: simplify bucket_table_alloc()

As of ce91f6ee5b3b ("mm: kvmalloc does not fallback to vmalloc for
incompatible gfp flags") we can simplify the caller and trust kvzalloc()
to just do the right thing.  For the case of the GFP_ATOMIC context, we
can drop the __GFP_NORETRY flag for obvious reasons, and for the
__GFP_NOWARN case, however, it is changed such that the caller passes the
flag instead of making bucket_table_alloc() handle it.

This slightly changes the gfp flags passed on to nested_table_alloc() as
it will now also use GFP_ATOMIC | __GFP_NOWARN.  However, I consider this
a positive consequence as for the same reasons we want nowarn semantics in
bucket_table_alloc().

[manfred@colorfullife.com: commit id extended to 12 digits, line wraps updated]
Link: http://lkml.kernel.org/r/20180712185241.4017-8-manfred@colorfullife.com
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc: drop ipc_lock()
Davidlohr Bueso [Wed, 22 Aug 2018 05:01:41 +0000 (22:01 -0700)]
ipc: drop ipc_lock()

ipc/util.c contains multiple functions to get the ipc object pointer given
an id number.

There are two sets of function: One set verifies the sequence counter part
of the id number, other functions do not check the sequence counter.

The standard for function names in ipc/util.c is
- ..._check() functions verify the sequence counter
- ..._idr() functions do not verify the sequence counter

ipc_lock() is an exception: It does not verify the sequence counter value,
but this is not obvious from the function name.

Furthermore, shm.c is the only user of this helper.  Thus, we can simply
move the logic into shm_lock() and get rid of the function altogether.

[manfred@colorfullife.com: most of changelog]
Link: http://lkml.kernel.org/r/20180712185241.4017-7-manfred@colorfullife.com
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc/util.c: correct comment in ipc_obtain_object_check
Manfred Spraul [Wed, 22 Aug 2018 05:01:37 +0000 (22:01 -0700)]
ipc/util.c: correct comment in ipc_obtain_object_check

The comment that explains ipc_obtain_object_check is wrong: The function
checks the sequence number, not the reference counter.

Note that checking the reference counter would be meaningless: The
reference counter is decreased without holding any locks, thus an object
with kern_ipc_perm.deleted=true may disappear at the end of the next rcu
grace period.

Link: http://lkml.kernel.org/r/20180712185241.4017-6-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc: rename ipcctl_pre_down_nolock()
Manfred Spraul [Wed, 22 Aug 2018 05:01:34 +0000 (22:01 -0700)]
ipc: rename ipcctl_pre_down_nolock()

Both the comment and the name of ipcctl_pre_down_nolock() are misleading:
The function must be called while holdling the rw semaphore.

Therefore the patch renames the function to ipcctl_obtain_check(): This
name matches the other names used in util.c:

- "obtain" function look up a pointer in the idr, without
  acquiring the object lock.
- The caller is responsible for locking.
- _check means that the sequence number is checked.

Link: http://lkml.kernel.org/r/20180712185241.4017-5-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()
Manfred Spraul [Wed, 22 Aug 2018 05:01:29 +0000 (22:01 -0700)]
ipc/util.c: use ipc_rcu_putref() for failues in ipc_addid()

ipc_addid() is impossible to use:
- for certain failures, the caller must not use ipc_rcu_putref(),
  because the reference counter is not yet initialized.
- for other failures, the caller must use ipc_rcu_putref(),
  because parallel operations could be ongoing already.

The patch cleans that up, by initializing the refcount early, and by
modifying all callers.

The issues is related to the finding of
syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com: syzbot found an
issue with reading kern_ipc_perm.seq, here both read and write to already
released memory could happen.

Link: http://lkml.kernel.org/r/20180712185241.4017-4-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc: reorganize initialization of kern_ipc_perm.seq
Manfred Spraul [Wed, 22 Aug 2018 05:01:25 +0000 (22:01 -0700)]
ipc: reorganize initialization of kern_ipc_perm.seq

ipc_addid() initializes kern_ipc_perm.seq after having called idr_alloc()
(within ipc_idr_alloc()).

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
may see an uninitialized value.

The patch moves the initialization of kern_ipc_perm.seq before the calls
of idr_alloc().

Notes:
1) This patch has a user space visible side effect:
If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
if semget()/msgget()/shmget() fails in the final step of adding the id
to the rhash tree, then .._next_id is cleared. Before the patch, is
remained unmodified.

There is no change of the behavior after a successful ..get() call: It
always clears .._next_id, there is no impact to non checkpoint/restore
code as that code does not use .._next_id.

2) The patch correctly documents that after a call to ipc_idr_alloc(),
the full tear-down sequence must be used. The callers of ipc_addid()
do not fullfill that, i.e. more bugfixes are required.

The patch is a squash of a patch from Dmitry and my own changes.

Link: http://lkml.kernel.org/r/20180712185241.4017-3-manfred@colorfullife.com
Reported-by: syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoipc: compute kern_ipc_perm.id under the ipc lock
Manfred Spraul [Wed, 22 Aug 2018 05:01:21 +0000 (22:01 -0700)]
ipc: compute kern_ipc_perm.id under the ipc lock

ipc_addid() initializes kern_ipc_perm.id after having called
ipc_idr_alloc().

Thus a parallel semctl() or msgctl() that uses e.g.  MSG_STAT may use this
unitialized value as the return code.

The patch moves all accesses to kern_ipc_perm.id under the spin_lock().

The issues is related to the finding of
syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com: syzbot found an
issue with kern_ipc_perm.seq

Link: http://lkml.kernel.org/r/20180712185241.4017-2-manfred@colorfullife.com
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoinit/Kconfig: remove EXPERT from CHECKPOINT_RESTORE
Adrian Reber [Wed, 22 Aug 2018 05:01:17 +0000 (22:01 -0700)]
init/Kconfig: remove EXPERT from CHECKPOINT_RESTORE

The CHECKPOINT_RESTORE configuration option was introduced in 2012 and
combined with EXPERT.  CHECKPOINT_RESTORE is already enabled in many
distribution kernels and also part of the defconfigs of various
architectures.

To make it easier for distributions to enable CHECKPOINT_RESTORE this
removes EXPERT and moves the configuration option out of the EXPERT block.

Link: http://lkml.kernel.org/r/20180712130733.11510-1-adrian@lisas.de
Signed-off-by: Adrian Reber <adrian@lisas.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp
Arnd Bergmann [Wed, 22 Aug 2018 05:01:13 +0000 (22:01 -0700)]
fs/sysv/inode.c: use ktime_get_real_seconds() for superblock stamp

get_seconds() is deprecated in favor of ktime_get_real_seconds(), which
returns a 64-bit timestamp.

In the SYSV file system, the superblock timestamp is only 32 bits wide,
and it is used to check whether a file system is clean, so the best
solution seems to be to force a wraparound and explicitly convert it to an
unsigned 32-bit value.

This is independent of the inode timestamps that are also 32-bit wide on
disk and that come from current_time().

Link: http://lkml.kernel.org/r/20180713145236.3152513-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoadfs: use timespec64 for time conversion
Arnd Bergmann [Wed, 22 Aug 2018 05:01:09 +0000 (22:01 -0700)]
adfs: use timespec64 for time conversion

We just truncate the seconds to 32-bit in one place now, so this can
trivially be converted over to using timespec64 consistently.

Link: http://lkml.kernel.org/r/20180620100133.4035614-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agokernel/sysctl.c: fix typos in comments
Randy Dunlap [Wed, 22 Aug 2018 05:01:06 +0000 (22:01 -0700)]
kernel/sysctl.c: fix typos in comments

Fix a few typos/spellos in kernel/sysctl.c.

Link: http://lkml.kernel.org/r/bb09a8b9-f984-6dd4-b07b-3ecaf200862e@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agodrivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md
Colin Ian King [Wed, 22 Aug 2018 05:01:01 +0000 (22:01 -0700)]
drivers/rapidio/devices/rio_mport_cdev.c: remove redundant pointer md

Pointer md is being assigned but is never used hence it is redundant and
can be removed.

Cleans up clang warning:
warning: variable 'md' set but not used [-Wunused-but-set-variable]

Link: http://lkml.kernel.org/r/20180711082346.5223-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Alexandre Bounine <alex.bou9@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofork: don't copy inconsistent signal handler state to child
Jann Horn [Wed, 22 Aug 2018 05:00:58 +0000 (22:00 -0700)]
fork: don't copy inconsistent signal handler state to child

Before this change, if a multithreaded process forks while one of its
threads is changing a signal handler using sigaction(), the memcpy() in
copy_sighand() can race with the struct assignment in do_sigaction().  It
isn't clear whether this can cause corruption of the userspace signal
handler pointer, but it definitely can cause inconsistency between
different fields of struct sigaction.

Take the appropriate spinlock to avoid this.

I have tested that this patch prevents inconsistency between sa_sigaction
and sa_flags, which is possible before this patch.

Link: http://lkml.kernel.org/r/20180702145108.73189-1-jannh@google.com
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make get_signal() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:54 +0000 (22:00 -0700)]
signal: make get_signal() return bool

make get_signal() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-18-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make sigkill_pending() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:50 +0000 (22:00 -0700)]
signal: make sigkill_pending() return bool

sigkill_pending() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-17-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make legacy_queue() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:46 +0000 (22:00 -0700)]
signal: make legacy_queue() return bool

legacy_queue() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-16-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make wants_signal() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:42 +0000 (22:00 -0700)]
signal: make wants_signal() return bool

wants_signal() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-15-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make flush_sigqueue_mask() void
Christian Brauner [Wed, 22 Aug 2018 05:00:38 +0000 (22:00 -0700)]
signal: make flush_sigqueue_mask() void

The return value of flush_sigqueue_mask() is never checked anywhere.

Link: http://lkml.kernel.org/r/20180602103653.18181-14-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make unhandled_signal() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:34 +0000 (22:00 -0700)]
signal: make unhandled_signal() return bool

unhandled_signal() already behaves like a boolean function.  Let's
actually declare it as such too.  All callers treat it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-13-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make recalc_sigpending_tsk() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:30 +0000 (22:00 -0700)]
signal: make recalc_sigpending_tsk() return bool

recalc_sigpending_tsk() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-12-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make has_pending_signals() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:27 +0000 (22:00 -0700)]
signal: make has_pending_signals() return bool

has_pending_signals() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-11-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make sig_ignored() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:23 +0000 (22:00 -0700)]
signal: make sig_ignored() return bool

sig_ignored() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-10-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make sig_task_ignored() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:19 +0000 (22:00 -0700)]
signal: make sig_task_ignored() return bool

sig_task_ignored() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-9-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make sig_handler_ignored() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:15 +0000 (22:00 -0700)]
signal: make sig_handler_ignored() return bool

sig_handler_ignored() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-8-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make kill_ok_by_cred() return bool
Christian Brauner [Wed, 22 Aug 2018 05:00:11 +0000 (22:00 -0700)]
signal: make kill_ok_by_cred() return bool

kill_ok_by_cred() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-7-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: simplify rt_sigaction()
Christian Brauner [Wed, 22 Aug 2018 05:00:07 +0000 (22:00 -0700)]
signal: simplify rt_sigaction()

The goto is not needed and does not add any clarity.  Simply return
-EINVAL on unexpected sigset_t struct size directly.

Link: http://lkml.kernel.org/r/20180602103653.18181-6-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make do_sigpending() void
Christian Brauner [Wed, 22 Aug 2018 05:00:02 +0000 (22:00 -0700)]
signal: make do_sigpending() void

do_sigpending() returned 0 unconditionally so it doesn't make sense to
have it return at all.  This allows us to simplify a bunch of syscall
callers.

Link: http://lkml.kernel.org/r/20180602103653.18181-5-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make may_ptrace_stop() return bool
Christian Brauner [Wed, 22 Aug 2018 04:59:59 +0000 (21:59 -0700)]
signal: make may_ptrace_stop() return bool

may_ptrace_stop() already behaves like a boolean function.  Let's actually
declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-4-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make kill_as_cred_perm() return bool
Christian Brauner [Wed, 22 Aug 2018 04:59:55 +0000 (21:59 -0700)]
signal: make kill_as_cred_perm() return bool

kill_as_cred_perm() already behaves like a boolean function.  Let's
actually declare it as such too.

Link: http://lkml.kernel.org/r/20180602103653.18181-3-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosignal: make force_sigsegv() void
Christian Brauner [Wed, 22 Aug 2018 04:59:51 +0000 (21:59 -0700)]
signal: make force_sigsegv() void

Patch series "signal: refactor some functions", v3.

This series refactors a bunch of functions in signal.c to simplify parts
of the code.

The greatest single change is declaring the static do_sigpending() helper
as void which makes it possible to remove a bunch of unnecessary checks in
the syscalls later on.

This patch (of 17):

force_sigsegv() returned 0 unconditionally so it doesn't make sense to have
it return at all. In addition, there are no callers that check
force_sigsegv()'s return value.

Link: http://lkml.kernel.org/r/20180602103653.18181-2-christian@brauner.io
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Morris <james.morris@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofat: propagate 64-bit inode timestamps
Arnd Bergmann [Wed, 22 Aug 2018 04:59:48 +0000 (21:59 -0700)]
fat: propagate 64-bit inode timestamps

Now that we pass down 64-bit timestamps from VFS, we just need to convert
that correctly into on-disk timestamps.  To make that work correctly, this
changes the last use of time_to_tm() in the kernel to time64_to_tm(),
which also lets use remove that deprecated interfaces.

Similarly, the time_t use in fat_time_fat2unix() truncates the timestamp
on the way in, which can be avoided by using types that are wide enough to
hold the intermediate values during the conversion.

[hirofumi@mail.parknet.co.jp: remove useless temporary variable, needless long long]
Link: http://lkml.kernel.org/r/20180619153646.3637529-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofat: validate ->i_start before using
OGAWA Hirofumi [Wed, 22 Aug 2018 04:59:44 +0000 (21:59 -0700)]
fat: validate ->i_start before using

On corrupted FATfs may have invalid ->i_start.  To handle it, this checks
->i_start before using, and return proper error code.

Link: http://lkml.kernel.org/r/87o9f8y1t5.fsf_-_@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofat: add FITRIM ioctl for FAT file system
Wentao Wang [Wed, 22 Aug 2018 04:59:41 +0000 (21:59 -0700)]
fat: add FITRIM ioctl for FAT file system

Add FITRIM ioctl for FAT file system

[witallwang@gmail.com: use u64s]
Link: http://lkml.kernel.org/r/87h8l37hub.fsf@mail.parknet.co.jp
[hirofumi@mail.parknet.co.jp: bug fixes, coding style fixes, add signal check]
Link: http://lkml.kernel.org/r/87fu10anhj.fsf@mail.parknet.co.jp
Signed-off-by: Wentao Wang <witallwang@gmail.com>
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoreiserfs: fix broken xattr handling (heap corruption, bad retval)
Jann Horn [Wed, 22 Aug 2018 04:59:37 +0000 (21:59 -0700)]
reiserfs: fix broken xattr handling (heap corruption, bad retval)

This fixes the following issues:

- When a buffer size is supplied to reiserfs_listxattr() such that each
  individual name fits, but the concatenation of all names doesn't fit,
  reiserfs_listxattr() overflows the supplied buffer.  This leads to a
  kernel heap overflow (verified using KASAN) followed by an out-of-bounds
  usercopy and is therefore a security bug.

- When a buffer size is supplied to reiserfs_listxattr() such that a
  name doesn't fit, -ERANGE should be returned.  But reiserfs instead just
  truncates the list of names; I have verified that if the only xattr on a
  file has a longer name than the supplied buffer length, listxattr()
  incorrectly returns zero.

With my patch applied, -ERANGE is returned in both cases and the memory
corruption doesn't happen anymore.

Credit for making me clean this code up a bit goes to Al Viro, who pointed
out that the ->actor calling convention is suboptimal and should be
changed.

Link: http://lkml.kernel.org/r/20180802151539.5373-1-jannh@google.com
Fixes: 48b32a3553a5 ("reiserfs: use generic xattr handlers")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoreiserfs: change j_timestamp type to time64_t
Arnd Bergmann [Wed, 22 Aug 2018 04:59:34 +0000 (21:59 -0700)]
reiserfs: change j_timestamp type to time64_t

This uses the deprecated time_t type but is write-only, and could be
removed, but as Jeff explains, having a timestamp can be usefule for
post-mortem analysis in crash dumps.

In order to remove one of the last instances of time_t, this changes the
type to time64_t, same as j_trans_start_time.

Link: http://lkml.kernel.org/r/20180622133315.221210-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoreiserfs: remove obsolete print_time function
Arnd Bergmann [Wed, 22 Aug 2018 04:59:30 +0000 (21:59 -0700)]
reiserfs: remove obsolete print_time function

Before linux-2.4.6, print_time() was used to pretty-print an inode time
when running reiserfs in user space, after that it has become obsolete and
is still a bit incorrect: It behaves differently on 32-bit and 64-bit
machines, and uses a static buffer to hold a string, which could lead to
undefined behavior if we ever called this from multiple places
simultaneously.

Since we always want to treat the timestamps as 'unsigned' anyway, simply
printing them as an integer is both simpler and safer while avoiding the
deprecated time_t type.

Link: http://lkml.kernel.org/r/20180620142522.27639-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoreiserfs: use monotonic time for j_trans_start_time
Arnd Bergmann [Wed, 22 Aug 2018 04:59:26 +0000 (21:59 -0700)]
reiserfs: use monotonic time for j_trans_start_time

Using CLOCK_REALTIME time_t timestamps breaks on 32-bit systems in 2038,
and gives surprising results with a concurrent settimeofday().

This changes the reiserfs journal timestamps to use ktime_get_seconds()
instead, which makes it use a 64-bit CLOCK_MONOTONIC stamp.

In the procfs output, the monotonic timestamp needs to be converted back
to CLOCK_REALTIME to keep the existing ABI.

Link: http://lkml.kernel.org/r/20180620142522.27639-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agohfsplus: drop ACL support
Ernesto A. Fernández [Wed, 22 Aug 2018 04:59:23 +0000 (21:59 -0700)]
hfsplus: drop ACL support

The HFS+ Access Control Lists have not worked at all for the past five
years, and nobody seems to have noticed.  Besides, POSIX draft ACLs are
not compatible with MacOS.  Drop the feature entirely.

Link: http://lkml.kernel.org/r/20180714190608.wtnmmtjqeyladkut@eaf
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agohfsplus: fix decomposition of Hangul characters
Ernesto A. Fernández [Wed, 22 Aug 2018 04:59:19 +0000 (21:59 -0700)]
hfsplus: fix decomposition of Hangul characters

Files created under macOS cannot be opened under linux if their names
contain Korean characters, and vice versa.

The Korean alphabet is special because its normalization is done without a
table.  The module deals with it correctly when composing, but forgets
about it for the decomposition.

Fix this using the Hangul decomposition function provided in the Unicode
Standard.  The code fits a bit awkwardly because it requires a buffer,
while all the other normalizations are returned as pointers to the
decomposition table.  This is actually also a bug because reordering may
still be needed, but for now leave it as it is.

The patch will cause trouble for Hangul filenames already created by the
module in the past.  This shouldn't really be concern because its main
purpose was always sharing with macOS.  If a user actually needs to access
such a file the nodecompose mount option should be enough.

Link: http://lkml.kernel.org/r/20180717220951.p6qqrgautc4pxvzu@eaf
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reported-by: Ting-Chang Hou <tchou@synology.com>
Tested-by: Ting-Chang Hou <tchou@synology.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agohfsplus: avoid deadlock on file truncation
Ernesto A. Fernández [Wed, 22 Aug 2018 04:59:16 +0000 (21:59 -0700)]
hfsplus: avoid deadlock on file truncation

After an extent is removed from the extent tree, the corresponding bits
are also cleared from the block allocation file.  This is currently done
without releasing the tree lock.

The problem is that the allocation file has extents of its own; if it is
fragmented enough, some of them may be in the extent tree as well, and
hfsplus_get_block() will try to take the lock again.

To avoid deadlock, only hold the extent tree lock during the actual tree
operations.

Link: http://lkml.kernel.org/r/20180709202549.auxwkb6memlegb4a@eaf
Signed-off-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agohfsplus: don't return 0 when fill_super() failed
Tetsuo Handa [Wed, 22 Aug 2018 04:59:12 +0000 (21:59 -0700)]
hfsplus: don't return 0 when fill_super() failed

syzbot is reporting NULL pointer dereference at mount_fs() [1].  This is
because hfsplus_fill_super() is by error returning 0 when
hfsplus_fill_super() detected invalid filesystem image, and mount_bdev()
is returning NULL because dget(s->s_root) == NULL if s->s_root == NULL,
and mount_fs() is accessing root->d_sb because IS_ERR(root) == false if
root == NULL.  Fix this by returning -EINVAL when hfsplus_fill_super()
detected invalid filesystem image.

[1] https://syzkaller.appspot.com/bug?id=21acb6850cecbc960c927229e597158cf35f33d0

Link: http://lkml.kernel.org/r/d83ce31a-874c-dd5b-f790-41405983a5be@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+01ffaf5d9568dd1609f7@syzkaller.appspotmail.com>
Reviewed-by: Ernesto A. Fernández <ernesto.mnd.fernandez@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/nilfs2/file.c: use new return type vm_fault_t
Souptick Joarder [Wed, 22 Aug 2018 04:59:08 +0000 (21:59 -0700)]
fs/nilfs2/file.c: use new return type vm_fault_t

Use new return type vm_fault_t for page_mkwrite handler.

Link: http://lkml.kernel.org/r/1529555928-2411-1-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agonilfs2: use 64-bit superblock timstamps
Arnd Bergmann [Wed, 22 Aug 2018 04:59:05 +0000 (21:59 -0700)]
nilfs2: use 64-bit superblock timstamps

The mount time field in the superblock uses a 64-bit timestamp, but
calling get_seconds() may truncate the current time to 32 bits.

This changes it to ktime_get_real_seconds() to avoid the potential
overflow.

Link: http://lkml.kernel.org/r/20180620075041.4154396-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: add AUTOFS_EXP_FORCED flag
Ian Kent [Wed, 22 Aug 2018 04:59:01 +0000 (21:59 -0700)]
autofs: add AUTOFS_EXP_FORCED flag

The userspace automount(8) daemon is meant to perform a forced expire when
sent a SIGUSR2.

But since the expiration is routed through the kernel and the kernel
doesn't send an expire request if the mount is busy this hasn't worked at
least since autofs version 5.

Add an AUTOFS_EXP_FORCED flag to allow implemention of the feature and
bump the protocol version so user space can check if it's implemented if
needed.

Link: http://lkml.kernel.org/r/152937734715.21213.6594007182776598970.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: make expire flags usage consistent with v5 params
Ian Kent [Wed, 22 Aug 2018 04:58:58 +0000 (21:58 -0700)]
autofs: make expire flags usage consistent with v5 params

Make the usage of the expire flags consistent by naming the expire flags
the same as it is named in the version 5 miscelaneous ioctl parameters and
only check the bit flags when needed.

Link: http://lkml.kernel.org/r/152937734046.21213.9454131988766280028.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: make autofs_expire_indirect() static
Ian Kent [Wed, 22 Aug 2018 04:58:54 +0000 (21:58 -0700)]
autofs: make autofs_expire_indirect() static

autofs_expire_indirect() isn't used outside of fs/autofs/expire.c so make
it static.

Link: http://lkml.kernel.org/r/152937733512.21213.10509996499623738446.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: make autofs_expire_direct() static
Ian Kent [Wed, 22 Aug 2018 04:58:51 +0000 (21:58 -0700)]
autofs: make autofs_expire_direct() static

autofs_expire_direct() isn't used outside of fs/autofs/expire.c so make it
static.

Link: http://lkml.kernel.org/r/152937732944.21213.11821977712410930973.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: fix clearing AUTOFS_EXP_LEAVES in autofs_expire_indirect()
Ian Kent [Wed, 22 Aug 2018 04:58:48 +0000 (21:58 -0700)]
autofs: fix clearing AUTOFS_EXP_LEAVES in autofs_expire_indirect()

The expire flag AUTOFS_EXP_LEAVES is cleared before the second call to
should_expire() in autofs_expire_indirect() but the parameter passed in
the second call is incorrect.

Fortunately AUTOFS_EXP_LEAVES expire flag has not been used for a long
time but might be needed in the future so fix it rather than remove the
expire leaves functionality.

Link: http://lkml.kernel.org/r/152937732410.21213.7447294898147765076.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: fix inconsistent use of now variable
Ian Kent [Wed, 22 Aug 2018 04:58:44 +0000 (21:58 -0700)]
autofs: fix inconsistent use of now variable

The global variable "now" in fs/autofs/expire.c is used in an inconsistent
way, sometimes using jiffies directly, and sometimes using the "now"
variable, and setting it isn't done consistently either.

But the autofs dentry info last_used field is only updated during path
walks or during expire so jiffies can be used directly and the global
variable "now" removed.

Link: http://lkml.kernel.org/r/152937731702.21213.7371321165189170865.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoautofs: fix directory and symlink access
Ian Kent [Wed, 22 Aug 2018 04:58:41 +0000 (21:58 -0700)]
autofs: fix directory and symlink access

Depending on how it is configured the autofs user space daemon can leave
in use mounts mounted at exit and re-connect to them at start up.  But for
this to work best the state of the autofs file system needs to be left
intact over the restart.

Also, at system shutdown, mounts in an autofs file system might be
umounted exposing a mount point trigger for which subsequent access can
lead to a hang.  So recent versions of automount(8) now does its best to
set autofs file system mounts catatonic at shutdown.

When autofs file system mounts are catatonic it's currently possible to
create and remove directories and symlinks which can be a problem at
restart, as described above.

So return EACCES in the directory, symlink and unlink methods if the
autofs file system is catatonic.

Link: http://lkml.kernel.org/r/152902119090.4144.9561910674530214291.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoinit/main.c: log init process file name
Paul Menzel [Wed, 22 Aug 2018 04:58:37 +0000 (21:58 -0700)]
init/main.c: log init process file name

Add a log message to `run_init_process()`.

This log message serves two purposes.

1.  If the init process is not specified on the Linux Kernel command
    line, the user sees, what file was chosen.

2.  The time stamps shows exactly, when the Linux kernel handed over
    control to the init process.

Link: http://lkml.kernel.org/r/b1fc97fa-4aa9-1904-ddb5-859e78995c41@molgen.mpg.de
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoinit/Kconfig: fix its typos
Randy Dunlap [Wed, 22 Aug 2018 04:58:34 +0000 (21:58 -0700)]
init/Kconfig: fix its typos

Correct typos of "it's" to "its.

Link: http://lkml.kernel.org/r/0ac627b6-5527-55f4-0489-1631aa34fc11@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoinit/: remove ineffective sparse disabling
Luc Van Oostenryck [Wed, 22 Aug 2018 04:58:30 +0000 (21:58 -0700)]
init/: remove ineffective sparse disabling

Sparse checking used to be disabled on init/do_mounts.c and a few related
files because "Many of the syscalls used in this file expect some of the
arguments to be __user pointers not __kernel pointers".

However since 28128c61e ("kconfig.h: Include compiler types to avoid
missed struct attributes") the checks are, in fact, not disabled anymore
because of the more early include of "linux/compiler_types.h"

So remove the now ineffective #undefery that was done to disable these
warnings, as well as the associated comment.

Link: http://lkml.kernel.org/r/20180617115355.53799-1-luc.vanoostenryck@gmail.com
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/eventpoll.c: simplify ep_is_linked() callers
Davidlohr Bueso [Wed, 22 Aug 2018 04:58:26 +0000 (21:58 -0700)]
fs/eventpoll.c: simplify ep_is_linked() callers

Instead of having each caller pass the rdllink explicitly, just have
ep_is_linked() pass it while the callers just need the epi pointer.  This
helper is all about the rdllink, and this change, furthermore, improves
the function's self documentation.

Link: http://lkml.kernel.org/r/20180727053432.16679-3-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/eventpoll.c: loosen irq safety in ep_poll()
Davidlohr Bueso [Wed, 22 Aug 2018 04:58:23 +0000 (21:58 -0700)]
fs/eventpoll.c: loosen irq safety in ep_poll()

Similar to other calls, ep_poll() is not called with interrupts disabled,
and we can therefore avoid the irq save/restore dance and just disable
local irqs.  In fact, the call should never be called in irq context at
all, considering that the only path is

epoll_wait(2) -> do_epoll_wait() -> ep_poll().

When running on a 2 socket 40-core (ht) IvyBridge a common pipe based
epoll_wait(2) microbenchmark, the following performance improvements are
seen:

    # threads       vanilla         dirty
 1          1805587     2106412
 2          1854064     2090762
 4          1805484     2017436
 8          1751222     1974475
 16         1725299     1962104
 32         1378463     1571233
 64          787368      900784

Which is a pretty constantly near 15%.

Also add a lockdep check such that we detect any mischief before
deadlocking.

Link: http://lkml.kernel.org/r/20180727053432.16679-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/eventpoll.c: simply CONFIG_NET_RX_BUSY_POLL ifdefery
Davidlohr Bueso [Wed, 22 Aug 2018 04:58:19 +0000 (21:58 -0700)]
fs/eventpoll.c: simply CONFIG_NET_RX_BUSY_POLL ifdefery

... 'tis easier on the eye.

[akpm@linux-foundation.org: use inlines rather than macros]
Link: http://lkml.kernel.org/r/20180725185620.11020-1-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: DT bindings should be a separate patch
Rob Herring [Wed, 22 Aug 2018 04:58:16 +0000 (21:58 -0700)]
checkpatch: DT bindings should be a separate patch

Devicetree bindings should be their own patch as documented in
Documentation/devicetree/bindings/submitting-patches.txt section I.1.
This is because bindings are logically independent from a driver
implementation, they have a different maintainer (even though they often
are applied via the same tree), and it makes for a cleaner history in the
DT only tree created with git-filter-branch.

[robh@kernel.org: add doc pointer to warning, simplify logic]
Link: http://lkml.kernel.org/r/20180810170513.26284-1-robh@kernel.org
[robh@kernel.org: v3]
Link: http://lkml.kernel.org/r/20180810225049.20452-1-robh@kernel.org
Link: http://lkml.kernel.org/r/20180809205032.22205-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: warn on unnecessary int declarations
Joe Perches [Wed, 22 Aug 2018 04:58:12 +0000 (21:58 -0700)]
checkpatch: warn on unnecessary int declarations

On Sun, 2018-08-05 at 08:52 -0700, Linus Torvalds wrote:
> "long unsigned int" isn't _technically_ wrong. But we normally
> call that type "unsigned long".

So add a checkpatch test for it.

Link: http://lkml.kernel.org/r/7bbd97dc0a1e5896a0251fada7bb68bb33643f77.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: check for space after "else" keyword
Michal Zylowski [Wed, 22 Aug 2018 04:58:08 +0000 (21:58 -0700)]
checkpatch: check for space after "else" keyword

Current checkpatch implementation permits notation like

} else{

in kernel code.  It looks like oversight and inconsistency in checkpatch
rules (e.g.  instruction like 'do' is tested).

Add regex for checking space after 'else' keyword and trigger error if
space is not present.

Link: http://lkml.kernel.org/r/1533545753-8870-1-git-send-email-michal.zylowski@intel.com
Signed-off-by: Michal Zylowski <michal.zylowski@intel.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: fix SPDX license check with --root=<path>
Joe Perches [Wed, 22 Aug 2018 04:58:04 +0000 (21:58 -0700)]
checkpatch: fix SPDX license check with --root=<path>

checkpatch uses the in-kernel script spdxcheck.py to validate the specific
license in a file or script.

This check can currently fail for a couple reasons:

o spdxcheck.py assumes the existence of git tree that may not
  exist for a bare source tree from something like a tarball
o the spdxcheck.py must be run from the top level root directory

So add a git existence test and set the subprocess subdirectory.

Link: http://lkml.kernel.org/r/2b32864324ae9c92948b002ec4c0c22409ed98f1.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Charlemagne Lasse <charlemagnelasse@gmail.com>
Tested-by: Charlemagne Lasse <charlemagnelasse@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: warn when a patch doesn't have a description
Joe Perches [Wed, 22 Aug 2018 04:58:01 +0000 (21:58 -0700)]
checkpatch: warn when a patch doesn't have a description

Potential patches should have a commit description.  Emit a warning when
there isn't one.

[akpm@linux-foundation.org: s/else if/elsif/]
Link: http://lkml.kernel.org/r/1b099f4d8373aa583a17011992676bf0f3f09eee.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Suggested-by: Prakruthi Deepak Heragu <pheragu@codeaurora.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: check for #if 0/#if 1
Prakruthi Deepak Heragu [Wed, 22 Aug 2018 04:57:57 +0000 (21:57 -0700)]
checkpatch: check for #if 0/#if 1

The #if 0 or #if 1 is used to toggle features. Warn if #if 0 or #if 1
is present and suggest that they can be removed.

[akpm@linux-foundation.org: fix spacing around periods, per Joe\
Link: http://lkml.kernel.org/r/1532625218-24321-1-git-send-email-pheragu@codeaurora.org
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Prakruthi Deepak Heragu <pheragu@codeaurora.org>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: fix krealloc reuse test
Joe Perches [Wed, 22 Aug 2018 04:57:50 +0000 (21:57 -0700)]
checkpatch: fix krealloc reuse test

The current krealloc test does not function correctly when the temporary
pointer return name contains the original pointer name.

Fix that by maximally matching the return pointer name and the original
pointer name and doing a separate comparison of the both names.

Link: http://lkml.kernel.org/r/e617ecb8c019a9c4c56540a1bec16c8aed43a4e4.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Manish Narani <manish.narani@xilinx.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: validate SPDX license with spdxcheck.py
Joe Perches [Wed, 22 Aug 2018 04:57:47 +0000 (21:57 -0700)]
checkpatch: validate SPDX license with spdxcheck.py

Use the existing scripts/spdxcheck.py to validate any
SPDX-License-Identifier found in line 1 or 2 of patches or files.

Miscellanea:

o Properly indent the existing SPDX-License-Identifier block.

Link: http://lkml.kernel.org/r/05b832407b24e0a27e419906187cd863bc1617c7.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: fix macro argument reuse test
Joe Perches [Wed, 22 Aug 2018 04:57:43 +0000 (21:57 -0700)]
checkpatch: fix macro argument reuse test

Multiple line macro definitions where the arguments are separated by line
continuations can cause checkpatch to emit invalid syntax regex tests.

This can occur when a single argument is modified in a part of a patch.

For example: (to not add a diff in the commit message)

$ ./scripts/checkpatch.pl --git db023296f0115d2fe01fdabad54678f2b806da23
Unterminated \g... pattern in regex; <very long regex omitted>

And, the test does not work correctly when these arguments are all new as
the initial patch line addition "+" is used in the argument name.

Fix this by stripping the line continuations and any "+" from the list of
arguments.

Link: http://lkml.kernel.org/r/86cdb43a4db70670c102020093f7fb4eb3003e01.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: warn if missing author Signed-off-by
Geert Uytterhoeven [Wed, 22 Aug 2018 04:57:40 +0000 (21:57 -0700)]
checkpatch: warn if missing author Signed-off-by

Print a warning if none of the Signed-off-by lines cover the patch author.

Non-ASCII quoted printable encoding in From: headers and (lack of) double
quotes are handled.  Split From: headers are not fully handled: only the
first part is compared.

[geert+renesas@glider.be: only encode UTF-8 quoted printable mail headers]
Link: http://lkml.kernel.org/r/20180718145254.4770-1-geert+renesas@glider.be
Link: http://lkml.kernel.org/r/20180712100323.26684-1-geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Joe Perches <joe@perches.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: update section keywords
Geert Uytterhoeven [Wed, 22 Aug 2018 04:57:36 +0000 (21:57 -0700)]
checkpatch: update section keywords

As of commit bd721ea73e1f ("treewide: replace obsolete _refok by
__ref"), __init_refok no longer exists, so it can be removed.  While at
it, add the modern variants that were still missing.

Link: http://lkml.kernel.org/r/20180706084205.26367-1-geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: improve runtime execution speed a little
Joe Perches [Wed, 22 Aug 2018 04:57:33 +0000 (21:57 -0700)]
checkpatch: improve runtime execution speed a little

checkpatch repeatedly uses a runtime minimum version check that validates
the minimum perl version required for a regex match by using a "$^V ge
5.10.0" runtime string match.

Only perform that minimum version test once and store the result to reduce
string matching time.

This reduces runtime execution time for patches or files with high line
counts.

An example runtime improvement:

new: $ time ./scripts/checkpatch.pl -f drivers/net/ethernet/intel/i40e/i40e_main.c > /dev/null

real 0m11.856s
user 0m11.831s
sys 0m0.025s

old: $ time ./scripts/checkpatch.pl -f drivers/net/ethernet/intel/i40e/i40e_main.c > /dev/null

real 0m13.330s
user 0m13.282s
sys 0m0.049s

Link: http://lkml.kernel.org/r/db21aa9703833bad65ab70cc4e8a78da5b399138.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: add --fix for CONCATENATED_STRING and STRING_FRAGMENTS
Joe Perches [Wed, 22 Aug 2018 04:57:29 +0000 (21:57 -0700)]
checkpatch: add --fix for CONCATENATED_STRING and STRING_FRAGMENTS

Add the ability to --fix these string issues.

e.g.:
printk(KERN_INFO"bar" "baz"QUX);
converts to
printk(KERN_INFO "barbaz" QUX);

Link: http://lkml.kernel.org/r/a9fb505ccfedffc5869d08832a7ff05a21d85621.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agocheckpatch: add a --strict test for structs with bool member definitions
Joe Perches [Wed, 22 Aug 2018 04:57:26 +0000 (21:57 -0700)]
checkpatch: add a --strict test for structs with bool member definitions

A struct with a bool member can have different sizes on various
architectures because neither bool size nor alignment is standardized.

So emit a message on the use of bool in structs only in .h files and not
.c files.

There is the real possibility that this test could have a false positive
when a bool is declared as an automatic, so limit the test to .h files
where the only false positive is for declarations in static inline
functions.

Link: http://lkml.kernel.org/r/95477c93db187bab6da8a8ba7c57836868446179.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/test_hexdump.c: fix failure on big endian cpu
Christophe Leroy [Wed, 22 Aug 2018 04:57:22 +0000 (21:57 -0700)]
lib/test_hexdump.c: fix failure on big endian cpu

On a big endian cpu, test_hexdump fails as follows.  The logs show that
bytes are expected in reversed order.

  [...]
  test_hexdump: Len: 24 buflen: 130 strlen: 97
  test_hexdump: Result: 97 'be32db7b 0a1893b2 70bac424 7d83349b a69c31ad 9c0face9                    .2.{....p..$}.4...1.....'
  test_hexdump: Expect: 97 '7bdb32be b293180a 24c4ba70 9b34837d ad319ca6 e9ac0f9c                    .2.{....p..$}.4...1.....'
  test_hexdump: Len: 8 buflen: 130 strlen: 77
  test_hexdump: Result: 77 'be32db7b0a1893b2                                                     .2.{....'
  test_hexdump: Expect: 77 'b293180a7bdb32be                                                     .2.{....'
  test_hexdump: Len: 6 buflen: 131 strlen: 87
  test_hexdump: Result: 87 'be32 db7b 0a18                                                                   .2.{..'
  test_hexdump: Expect: 87 '32be 7bdb 180a                                                                   .2.{..'
  test_hexdump: Len: 24 buflen: 131 strlen: 97
  test_hexdump: Result: 97 'be32db7b 0a1893b2 70bac424 7d83349b a69c31ad 9c0face9                    .2.{....p..$}.4...1.....'
  test_hexdump: Expect: 97 '7bdb32be b293180a 24c4ba70 9b34837d ad319ca6 e9ac0f9c                    .2.{....p..$}.4...1.....'
  test_hexdump: Len: 32 buflen: 131 strlen: 101
  test_hexdump: Result: 101 'be32db7b0a1893b2 70bac4247d83349b a69c31ad9c0face9 4cd1199943b1af0c  .2.{....p..$}.4...1.....L...C...'
  test_hexdump: Expect: 101 'b293180a7bdb32be 9b34837d24c4ba70 e9ac0f9cad319ca6 0cafb1439919d14c  .2.{....p..$}.4...1.....L...C...'
  test_hexdump: failed 801 out of 1184 tests

This patch fixes it.

Link: http://lkml.kernel.org/r/f3112437f62c2f48300535510918e8be1dceacfb.1533610877.git.christophe.leroy@c-s.fr
Fixes: 64d1d77a44697 ("hexdump: introduce test suite")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: rashmica <rashmicy@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/Kconfig: remove 'default n' for tests
Andy Shevchenko [Wed, 22 Aug 2018 04:57:18 +0000 (21:57 -0700)]
lib/Kconfig: remove 'default n' for tests

It seems contributors follow the style of Kconfig entries where explicit
'default n' is present.  The default 'default' is 'n' already, thus, drop
these lines from Kconfig to make it more clear.

Link: http://lkml.kernel.org/r/20180719085131.79541-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agobcache: use routines from lib/crc64.c for CRC64 calculation
Coly Li [Wed, 22 Aug 2018 04:57:15 +0000 (21:57 -0700)]
bcache: use routines from lib/crc64.c for CRC64 calculation

Now we have crc64 calculation in lib/crc64.c, it is unnecessary for
bcache to use its own version.  This patch changes bcache code to use
crc64 routines in lib/crc64.c.

Link: http://lkml.kernel.org/r/20180718165545.1622-3-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Noah Massey <noah.massey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib: add crc64 calculation routines
Coly Li [Wed, 22 Aug 2018 04:57:11 +0000 (21:57 -0700)]
lib: add crc64 calculation routines

Patch series "add crc64 calculation as kernel library", v5.

This patchset adds basic implementation of crc64 calculation as a Linux
kernel library.  Since bcache already does crc64 by itself, this patchset
also modifies bcache code to use the new crc64 library routine.

Currently bcache is the only user of crc64 calculation, another potential
user is bcachefs which is on the way to be in mainline kernel.  Therefore
it makes sense to make crc64 calculation to be a public library.

bcache uses crc64 as storage checksum, if a change of crc lib routines
results an inconsistent result, the unmatched checksum may make bcache
'think' the on-disk is corrupted, such a change should be avoided or
detected as early as possible.  Therefore a patch is being prepared which
adds a crc test framework, to check consistency of different calculations.

This patch (of 2):

Add the re-write crc64 calculation routines for Linux kernel.  The CRC64
polynomical arithmetic follows ECMA-182 specification, inspired by CRC
paper of Dr.  Ross N.  Williams (see
http://www.ross.net/crc/download/crc_v3.txt) and other public domain
implementations.

All the changes work in this way,
- When Linux kernel is built, host program lib/gen_crc64table.c will be
  compiled to lib/gen_crc64table and executed.
- The output of gen_crc64table execution is an array called as lookup
  table (a.k.a POLY 0x42f0e1eba9ea369) which contain 256 64-bit long
  numbers, this table is dumped into header file lib/crc64table.h.
- Then the header file is included by lib/crc64.c for normal 64bit crc
  calculation.
- Function declaration of the crc64 calculation routines is placed in
  include/linux/crc64.h

Currently bcache is the only user of crc64_be(), another potential user is
bcachefs which is on the way to be in mainline kernel.  Therefore it makes
sense to move crc64 calculation into lib/crc64.c as public code.

[colyli@suse.de: fix review comments from v4]
Link: http://lkml.kernel.org/r/20180726053352.2781-2-colyli@suse.de
Link: http://lkml.kernel.org/r/20180718165545.1622-2-colyli@suse.de
Signed-off-by: Coly Li <colyli@suse.de>
Co-developed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Michael Lyle <mlyle@lyle.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Noah Massey <noah.massey@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/test_debug_virtual.c: make struct pointer foo static
Colin Ian King [Wed, 22 Aug 2018 04:57:07 +0000 (21:57 -0700)]
lib/test_debug_virtual.c: make struct pointer foo static

The pointer foo is local to the source and does not need to be
in global scope, so make it static.

Cleans up sparse warning:
symbol 'foo' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20180624112206.5722-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoinclude/linux/bitops.h: introduce BITS_PER_TYPE
Chris Wilson [Wed, 22 Aug 2018 04:57:03 +0000 (21:57 -0700)]
include/linux/bitops.h: introduce BITS_PER_TYPE

net_dim.h has a rather useful extension to BITS_PER_BYTE to compute the
number of bits in a type (BITS_PER_BYTE * sizeof(T)), so promote the macro
to bitops.h, alongside BITS_PER_BYTE, for wider usage.

Link: http://lkml.kernel.org/r/20180706094458.14116-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Andy Gospodarek <gospo@broadcom.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agolib/bitmap.c: drop unnecessary 0 check for u32 array operations
Andy Shevchenko [Wed, 22 Aug 2018 04:56:59 +0000 (21:56 -0700)]
lib/bitmap.c: drop unnecessary 0 check for u32 array operations

nbits == 0 is safe to be supplied to the function body, so remove
unnecessary checks in bitmap_to_arr32() and bitmap_from_arr32().

Link: http://lkml.kernel.org/r/20180531131914.44352-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoget_maintainer: allow option --mpath <directory> to read all files in <directory>
Joe Perches [Wed, 22 Aug 2018 04:56:55 +0000 (21:56 -0700)]
get_maintainer: allow option --mpath <directory> to read all files in <directory>

There is an external use case for multiple private MAINTAINER style files
in a separate directory.  Allow it.

--mpath has a default of "./MAINTAINERS".

The value entered can be either a file or a directory.

The behaviors are now:

--mpath <file>          Read only the specific file as <MAINTAINER_TYPE> file
--mpath <directory>     Read all files in <directory> as <MAINTAINER_TYPE> files
--mpath <directory> --find-maintainer-files
                        Recurse through <directory> and read all files named MAINTAINERS

Link: http://lkml.kernel.org/r/991b2f20112d53863cd79e61d908f1d26d3e1971.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoget_maintainer.pl: add -mpath=<path or file> for MAINTAINERS file location
Joe Perches [Wed, 22 Aug 2018 04:56:52 +0000 (21:56 -0700)]
get_maintainer.pl: add -mpath=<path or file> for MAINTAINERS file location

Add the ability to have an override for the location of the MAINTAINERS
file.

Miscellanea:

o Properly indent a few lines with leading spaces

Link: http://lkml.kernel.org/r/a86e69195076ed3c4c526fddc76b86c28e0a1e37.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Suggested-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoget_maintainer: allow usage outside of kernel tree
Antonio Nino Diaz [Wed, 22 Aug 2018 04:56:48 +0000 (21:56 -0700)]
get_maintainer: allow usage outside of kernel tree

Add option '--no-tree' to get_maintainer.pl script to allow using this
script in projects that aren't the Linux kernel if they use the same
format for their MAINTAINERS file.  This command is also available in
checkpatch.pl, for example.

Link: http://lkml.kernel.org/r/04452ac6-1575-f612-72c6-6ea88e70a9d5@arm.com
Signed-off-by: Antonio Nino Diaz <antonio.ninodiaz@arm.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agos/epoll: robustify irq safety with lockdep_assert_irqs_enabled()
Davidlohr Bueso [Wed, 22 Aug 2018 04:56:45 +0000 (21:56 -0700)]
s/epoll: robustify irq safety with lockdep_assert_irqs_enabled()

Sprinkle lockdep_assert_irqs_enabled() checks in the functions that do not
save and restore interrupts when dealing with the ep->wq.lock.  These are
ep_scan_ready_list() and those called by epoll_ctl(): ep_insert, ep_modify
and ep_remove.

[akpm@linux-foundation.org: remove too-obvious comments]
Link: http://lkml.kernel.org/r/20180721183127.3busfa335zlcjeox@linux-r8p5
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/epoll: loosen irq safety in epoll_insert() and epoll_remove()
Davidlohr Bueso [Wed, 22 Aug 2018 04:56:41 +0000 (21:56 -0700)]
fs/epoll: loosen irq safety in epoll_insert() and epoll_remove()

Both functions are similar to the context of ep_modify(), called via
epoll_ctl(2).  Just like ep_modify(), saving and restoring interrupts is
an overkill in these calls as it will never be called with irqs disabled.
While ep_remove() can be called directly from EPOLL_CTL_DEL, it can also
be called when releasing the file, but this also complies with the above.

Link: http://lkml.kernel.org/r/20180720172956.2883-3-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agofs/epoll: loosen irq safety in ep_scan_ready_list()
Davidlohr Bueso [Wed, 22 Aug 2018 04:56:38 +0000 (21:56 -0700)]
fs/epoll: loosen irq safety in ep_scan_ready_list()

Patch series "fs/epoll: loosen irq safety when possible".

Both patches replace saving+restoring interrupts when taking the ep->lock
(now the waitqueue lock), with just disabling local irqs.  This shows
immediate performance benefits in patch 1 for an epoll workload running on
Xen.  The main concern we need to have with this sort of changes in epoll
is the ep_poll_callback() which is passed to the wait queue wakeup and is
done very often under irq context, this patch does not touch this call.

Patches have been tested pretty heavily with the customer workload,
microbenchmarks, ltp testcases and two high level workloads that use epoll
under the hood: nginx and libevent benchmarks.

This patch (of 2):

Saving and restoring interrupts in ep_scan_ready_list() is an
overkill as it is never called with interrupts disabled. Loosen
this to simply disabling local irqs such that archs where managing
irqs is expensive or virtual environments. This patch yields
some throughput improvements on a workload that is epoll intensive
running on a single Xen DomU.

1 Job  7500 -->    8800 enq/s  (+17%)
2 Jobs 14000   -->   15200 enq/s  (+8%)
3 Jobs 20500 -->   22300 enq/s  (+8%)
4 Jobs 25000   -->   28000 enq/s  (+8-12)%

On bare metal:

For a 2-socket 40-core (ht) IvyBridge on a few workloads, unfortunately I
don't have a xen environment and the results for Xen I do have (which
numbers are in patch 1) I don't have the actual workload, so cannot
compare them directly.

1) Different configurations were used for a epoll_wait (pipes io)
   microbench (http://linux-scalability.org/epoll/epoll-test.c) and shows
   around a 7-10% improvement in overall total number of times the
   epoll_wait() loops when using both regular and nested epolls, so very
   raw numbers, but measurable nonetheless.

# threads vanilla dirty
     1 1677717 1805587
     2 1660510 1854064
     4 1610184 1805484
     8 1577696 1751222
     16 1568837 1725299
     32 1291532 1378463
     64  752584  787368

   Note that stddev is pretty small.

2) Another pipe test, which shows no real measurable improvement.
   (http://www.xmailserver.org/linux-patches/pipetest.c)

Link: http://lkml.kernel.org/r/20180720172956.2883-2-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosched/wait: assert the wait_queue_head lock is held in __wake_up_common
Christoph Hellwig [Wed, 22 Aug 2018 04:56:34 +0000 (21:56 -0700)]
sched/wait: assert the wait_queue_head lock is held in __wake_up_common

Better ensure we actually hold the lock using lockdep than just commenting
on it.  Due to the various exported _locked interfaces it is far too easy
to get the locking wrong.

Link: http://lkml.kernel.org/r/20171214152344.6880-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agouserfaultfd: use fault_wqh lock
Matthew Wilcox [Wed, 22 Aug 2018 04:56:30 +0000 (21:56 -0700)]
userfaultfd: use fault_wqh lock

The userfaultfd code currently uses the unlocked waitqueue helpers for
managing fault_wqh, but instead of holding the waitqueue lock for this
waitqueue around these calls, it the waitqueue lock of
fault_pending_wq, which is a different waitqueue instance.  Given that
the waitqueue is not exposed to the rest of the kernel this actually
works ok at the moment, but prevents the userfaultfd locking rules from
being enforced using lockdep.

Switch to the internally locked waitqueue helpers instead.  This means
that the lock inside fault_wqh now nests inside the fault_pending_wqh
lock, but that's not a problem since it was entirely unused before.

[hch@lst.de: slight changelog updates]
[rppt@linux.vnet.ibm.com: spotted changelog spellos]
Link: http://lkml.kernel.org/r/20171214152344.6880-3-hch@lst.de
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoepoll: use the waitqueue lock to protect ep->wq
Christoph Hellwig [Wed, 22 Aug 2018 04:56:26 +0000 (21:56 -0700)]
epoll: use the waitqueue lock to protect ep->wq

Patch series "waitqueue lockdep annotation", v3.

This series adds a strategic lockdep_assert_held to __wake_up_common to
ensure callers really do hold the wait_queue_head lock when calling the
unlocked wake_up variants.  It turns out epoll did not do this for a
fairly common path (hit all the time by systemd during bootup), so the
second patch fixed this instance as well.

This patch (of 3):

The epoll code currently uses the unlocked waitqueue helpers for managing
ep->wq, but instead of holding the waitqueue lock around these calls, it
uses its own ep->lock spinlock.  Given that the waitqueue is not exposed
to the rest of the kernel this actually works ok at the moment, but
prevents the epoll locking rules from being enforced using lockdep.
Remove ep->lock and use the waitqueue lock to not only reduce the size of
struct eventpoll but also to make sure we can assert locking invariants in
the waitqueue code.

Link: http://lkml.kernel.org/r/20171214152344.6880-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Baron <jbaron@akamai.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>