git.openwrt.org Git - openwrt/staging/blogic.git/log

vfs: Deduplicate code shared by xattr system calls operating on paths

The following pairs of system calls dealing with extended attributes only
differ in their behavior on whether the symbolic link is followed (when
the named file is a symbolic link):

- setxattr() and lsetxattr()
- getxattr() and lgetxattr()
- listxattr() and llistxattr()
- removexattr() and lremovexattr()

Despite this, the implementations all had duplicated code, so this commit
redirects each of the above pairs of system calls to a corresponding
function to which different lookup flags (LOOKUP_FOLLOW or 0) are passed.

For me this reduced the stripped size of xattr.o from 8824 to 8248 bytes.

Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

reiserfs: remove pointless forward declaration of struct nameidata

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

don't need that forward declaration of struct nameidata in dcache.h anymore

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

take dname_external() into fs/dcache.c

never used outside and it's too low-level for legitimate uses outside
of fs/dcache.c anyway

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

let path_init() failures treated the same way as subsequent link_path_walk()

As it is, path_lookupat() and path_mounpoint() might end up leaking struct file
reference in some cases.

Spotted-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fix misuses of f_count() in ppp and netlink

we used to check for "nobody else could start doing anything with
that opened file" by checking that refcount was 2 or less - one
for descriptor table and one we'd acquired in fget() on the way to
wherever we are.  That was race-prone (somebody else might have
had a reference to descriptor table and do fget() just as we'd
been checking) and it had become flat-out incorrect back when
we switched to fget_light() on those codepaths - unlike fget(),
it doesn't grab an extra reference unless the descriptor table
is shared.  The same change allowed a race-free check, though -
we are safe exactly when refcount is less than 2.

It was a long time ago; pre-2.6.12 for ioctl() (the codepath leading
to ppp one) and 2.6.17 for sendmsg() (netlink one).  OTOH,
netlink hadn't grown that check until 3.9 and ppp used to live
in drivers/net, not drivers/net/ppp until 3.1.  The bug existed
well before that, though, and the same fix used to apply in old
location of file.

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ncpfs: use list_for_each_entry() for d_subdirs walk

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: move getname() from callers to do_mount()

It would make more sense to pass char __user * instead of
char * in callers of do_mount() and do getname() inside do_mount().

Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Seunghun Lee <waydi1@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

gfs2_atomic_open(): skip lookups on hashed dentry

hashed dentry can be passed to ->atomic_open() only if
a) it has just passed revalidation and
b) it's negative

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[infiniband] remove pointless assignments

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

gadgetfs: saner API for gadgetfs_create_file()

return dentry, not inode. dev->inode is never used by anything,
don't bother with storing it.

Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

f_fs: saner API for ffs_sb_create_file()

make it return dentry instead of inode

Acked-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

jfs: don't hash direct inode

hlist_add_fake(inode->i_hash), same as for the rest of special ones...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[s390] remove pointless assignment of ->f_op in vmlogrdr ->open()

The only way we can get to that function is from misc_open(), after
the latter has set file->f_op to exactly the same value we are
(re)assigning there.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ecryptfs: ->f_op is never NULL

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

android: ->f_op is never NULL

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

nouveau: __iomem misannotations

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

missing annotation in fs/file.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: namespace: suppress 'may be used uninitialized' warnings

The gcc version 4.9.1 compiler complains Even though it isn't possible for
these variables to not get initialized before they are used.

fs/namespace.c: In function ‘SyS_mount’:
fs/namespace.c:2720:8: warning: ‘kernel_dev’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
        ^
fs/namespace.c:2699:8: note: ‘kernel_dev’ was declared here
  char *kernel_dev;
        ^
fs/namespace.c:2720:8: warning: ‘kernel_type’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
        ^
fs/namespace.c:2697:8: note: ‘kernel_type’ was declared here
  char *kernel_type;
        ^

Fix the warnings by simplifying copy_mount_string() as suggested by Al Viro.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

saner perf_atoll()

That loop in there is both anti-idiomatic *and* completely pointless.
strtoll() is there for purpose; use it and compare what's left with
acceptable suffices.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch /dev/kmsg to ->write_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch logger to ->write_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch hci_vhci to ->write_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

switch /dev/zero and /dev/full to ->read_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dma-buf: don't open-code atomic_long_read()

... not to mention that even atomic_long_read() is too low-level here -
there's file_count().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

rsxx debugfs inanity

check with the author of that horror...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

carma-fpga: switch to simple_read_from_buffer()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

carma-fpga: switch to fixed_size_llseek()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cachefiles_write_page(): switch to __kernel_write()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vme: don't open-code fixed_size_llseek()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ashmem: use vfs_llseek()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

9p: switch to %p[dD]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

cifs: switch to use of %p[dD]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: make cont_expand_zero interruptible

This patch makes it possible to kill a process looping in
cont_expand_zero. A process may spend a lot of time in this function, so
it is desirable to be able to kill it.

It happened to me that I wanted to copy a piece data from the disk to a
file. By mistake, I used the "seek" parameter to dd instead of "skip". Due
to the "seek" parameter, dd attempted to extend the file and became stuck
doing so - the only possibility was to reset the machine or wait many
hours until the filesystem runs out of space and cont_expand_zero fails.
We need this patch to be able to terminate the process.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Add copy_to_iter(), copy_from_iter() and iov_iter_zero()

For DAX, we want to be able to copy between iovecs and kernel addresses
that don't necessarily have a struct page.  This is a fairly simple
rearrangement for bvec iters to kmap the pages outside and pass them in,
but for user iovecs it gets more complicated because we might try various
different ways to kmap the memory.  Duplicating the existing logic works
out best in this case.

We need to be able to write zeroes to an iovec for reads from unwritten
ranges in a file.  This is performed by the new iov_iter_zero() function,
again patterned after the existing code that handles iovec iterators.

[AV: and export the buggers...]

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: Fix theoretical division by 0 in super_cache_scan().

total_objects could be 0 and is used as a denom.

While total_objects is a "long", total_objects == 0 unlikely happens for
3.12 and later kernels because 32-bit architectures would not be able to
hold (1 << 32) objects. However, total_objects == 0 may happen for kernels
between 3.1 and 3.11 because total_objects in prune_super() was an "int"
and (e.g.) x86_64 architecture might be able to hold (1 << 32) objects.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable <stable@kernel.org> # 3.1+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache: Fix no spaces at the start of a line in dcache.c

Fixed coding style in dcache.c

Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

[jffs2] kill wbuf_queued/wbuf_dwork_lock

schedule_delayed_work() happening when the work is already pending is
a cheap no-op. Don't bother with ->wbuf_queued logics - it's both
broken (cancelling ->wbuf_dwork leaves it set, as spotted by Jeff Harris)
and pointless. It's cheaper to let schedule_delayed_work() handle that
case.

Reported-by: Jeff Harris <jefftharris@gmail.com>
Tested-by: Jeff Harris <jefftharris@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: fix typo in s_op->alloc_inode() documentation

The function which calls s_op->alloc_inode() is not inode_alloc(), but
instead alloc_inode() which lives in fs/inode.c .

The typo was there from the beginning from 5ea626aa (VFS: update
documentation, 2005) - there was no standalone inode_alloc() for the
whole kernel history.

Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

constify file_inode()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

handle suicide on late failure exits in execve() in search_binary_handler()

... rather than doing that in the guts of ->load_binary().
[updated to fix the bug spotted by Shentino - for SIGSEGV we really need
something stronger than send_sig_info(); again, better do that in one place]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

dcache.c: call ->d_prune() regardless of d_unhashed()

the only in-tree instance checks d_unhashed() anyway,
out-of-tree code can preserve the current behaviour by
adding such check if they want it and we get an ability
to use it in cases where we *want* to be notified of
killing being inevitable before ->d_lock is dropped,
whether it's unhashed or not. In particular, autofs
would benefit from that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

d_prune_alias(): just lock the parent and call __dentry_kill()

The only reason for games with ->d_prune() was __d_drop(), which
was needed only to force dput() into killing the sucker off.

Note that lock_parent() can be called under ->i_lock and won't
drop it, so dentry is safe from somebody managing to kill it
under us - it won't happen while we are holding ->i_lock.

__dentry_kill() is called only with ->d_lockref.count being 0
(here and when picked from shrink list) or 1 (dput() and dropping
the ancestors in shrink_dentry_list()), so it will never be called
twice - the first thing it's doing is making ->d_lockref.count
negative and once that happens, nothing will increment it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

proc: Update proc_flush_task_mnt to use d_invalidate

Now that d_invalidate always succeeds and flushes mount points use
it in stead of a combination of shrink_dcache_parent and d_drop
in proc_flush_task_mnt. This removes the danger of a mount point
under /proc/<pid>/... becoming unreachable after the d_drop.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Remove d_drop calls from d_revalidate implementations

Now that d_invalidate always succeeds it is not longer necessary or
desirable to hard code d_drop calls into filesystem specific
d_revalidate implementations.

Remove the unnecessary d_drop calls and rely on d_invalidate
to drop the dentries. Using d_invalidate ensures that paths
to mount points will not be dropped.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Make d_invalidate return void

Now that d_invalidate can no longer fail, stop returning a useless
return code. For the few callers that checked the return code update
remove the handling of d_invalidate failure.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Merge check_submounts_and_drop and d_invalidate

Now that d_invalidate is the only caller of check_submounts_and_drop,
expand check_submounts_and_drop inline in d_invalidate.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Remove unnecessary calls of check_submounts_and_drop

Now that check_submounts_and_drop can not fail and is called from
d_invalidate there is no longer a need to call check_submounts_and_drom
from filesystem d_revalidate methods so remove it.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Lazily remove mounts on unlinked files and directories.

With the introduction of mount namespaces and bind mounts it became
possible to access files and directories that on some paths are mount
points but are not mount points on other paths.  It is very confusing
when rm -rf somedir returns -EBUSY simply because somedir is mounted
somewhere else.  With the addition of user namespaces allowing
unprivileged mounts this condition has gone from annoying to allowing
a DOS attack on other users in the system.

The possibility for mischief is removed by updating the vfs to support
rename, unlink and rmdir on a dentry that is a mountpoint and by
lazily unmounting mountpoints on deleted dentries.

In particular this change allows rename, unlink and rmdir system calls
on a dentry without a mountpoint in the current mount namespace to
succeed, and it allows rename, unlink, and rmdir performed on a
distributed filesystem to update the vfs cache even if when there is a
mount in some namespace on the original dentry.

There are two common patterns of maintaining mounts: Mounts on trusted
paths with the parent directory of the mount point and all ancestory
directories up to / owned by root and modifiable only by root
(i.e. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu,
cpuacct, ...}, /usr, /usr/local).  Mounts on unprivileged directories
maintained by fusermount.

In the case of mounts in trusted directories owned by root and
modifiable only by root the current parent directory permissions are
sufficient to ensure a mount point on a trusted path is not removed
or renamed by anyone other than root, even if there is a context
where the there are no mount points to prevent this.

In the case of mounts in directories owned by less privileged users
races with users modifying the path of a mount point are already a
danger.  fusermount already uses a combination of chdir,
/proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races.  The
removable of global rename, unlink, and rmdir protection really adds
nothing new to consider only a widening of the attack window, and
fusermount is already safe against unprivileged users modifying the
directory simultaneously.

In principle for perfect userspace programs returning -EBUSY for
unlink, rmdir, and rename of dentires that have mounts in the local
namespace is actually unnecessary.  Unfortunately not all userspace
programs are perfect so retaining -EBUSY for unlink, rmdir and rename
of dentries that have mounts in the current mount namespace plays an
important role of maintaining consistency with historical behavior and
making imperfect userspace applications hard to exploit.

v2: Remove spurious old_dentry.
v3: Optimized shrink_submounts_and_drop
    Removed unsued afs label
v4: Simplified the changes to check_submounts_and_drop
    Do not rename check_submounts_and_drop shrink_submounts_and_drop
    Document what why we need atomicity in check_submounts_and_drop
    Rely on the parent inode mutex to make d_revalidate and d_invalidate
    an atomic unit.
v5: Refcount the mountpoint to detach in case of simultaneous
    renames.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Add a function to lazily unmount all mounts from any dentry.

The new function detach_mounts comes in two pieces.  The first piece
is a static inline test of d_mounpoint that returns immediately
without taking any locks if d_mounpoint is not set.  In the common
case when mountpoints are absent this allows the vfs to continue
running with it's same cacheline foot print.

The second piece of detach_mounts __detach_mounts actually does the
work and it assumes that a mountpoint is present so it is slow and
takes namespace_sem for write, and then locks the mount hash (aka
mount_lock) after a struct mountpoint has been found.

With those two locks held each entry on the list of mounts on a
mountpoint is selected and lazily unmounted until all of the mount
have been lazily unmounted.

v7: Wrote a proper change description and removed the changelog
    documenting deleted wrong turns.

Signed-off-by: Eric W. Biederman <ebiederman@twitter.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: factor out lookup_mountpoint from new_mountpoint

I am shortly going to add a new user of struct mountpoint that
needs to look up existing entries but does not want to create
a struct mountpoint if one does not exist. Therefore to keep
the code simple and easy to read split out lookup_mountpoint
from new_mountpoint.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Keep a list of mounts on a mount point

To spot any possible problems call BUG if a mountpoint
is put when it's list of mounts is not empty.

AV: use hlist instead of list_head

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Eric W. Biederman <ebiederman@twitter.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Don't allow overwriting mounts in the current mount namespace

In preparation for allowing mountpoints to be renamed and unlinked
in remote filesystems and in other mount namespaces test if on a dentry
there is a mount in the local mount namespace before allowing it to
be renamed or unlinked.

The primary motivation here are old versions of fusermount unmount
which is not safe if the a path can be renamed or unlinked while it is
verifying the mount is safe to unmount. More recent versions are simpler
and safer by simply using UMOUNT_NOFOLLOW when unmounting a mount
in a directory owned by an arbitrary user.

Miklos Szeredi <miklos@szeredi.hu> reports this is approach is good
enough to remove concerns about new kernels mixed with old versions
of fusermount.

A secondary motivation for restrictions here is that it removing empty
directories that have non-empty mount points on them appears to
violate the rule that rmdir can not remove empty directories. As
Linus Torvalds pointed out this is useful for programs (like git) that
test if a directory is empty with rmdir.

Therefore this patch arranges to enforce the existing mount point
semantics for local mount namespace.

v2: Rewrote the test to be a drop in replacement for d_mountpoint
v3: Use bool instead of int as the return type of is_local_mountpoint

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: More precise tests in d_invalidate

The current comments in d_invalidate about what and why it is doing
what it is doing are wildly off-base. Which is not surprising as
the comments date back to last minute bug fix of the 2.2 kernel.

The big fat lie of a comment said: If it's a directory, we can't drop
it for fear of somebody re-populating it with children (even though
dropping it would make it unreachable from that root, we still might
repopulate it if it was a working directory or similar).

[AV] What we really need to avoid is multiple dentry aliases of the
same directory inode; on all filesystems that have ->d_revalidate()
we either declare all positive dentries always valid (and thus never
fed to d_invalidate()) or use d_materialise_unique() and/or d_splice_alias(),
which take care of alias prevention.

The current rules are:
- To prevent mount point leaks dentries that are mount points or that
have childrent that are mount points may not be be unhashed.
- All dentries may be unhashed.
- Directories may be rehashed with d_materialise_unique

check_submounts_and_drop implements this already for well maintained
remote filesystems so implement the current rules in d_invalidate
by just calling check_submounts_and_drop.

The one difference between d_invalidate and check_submounts_and_drop
is that d_invalidate must respect it when a d_revalidate method has
earlier called d_drop so preserve the d_unhashed check in
d_invalidate.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfs: Document the effect of d_revalidate on d_find_alias

d_drop or check_submounts_and_drop called from d_revalidate can result
in renamed directories with child dentries being unhashed. These
renamed and drop directory dentries can be rehashed after
d_materialise_unique uses d_find_alias to find them.

Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

delayed mntput

On final mntput() we want fs shutdown to happen before return to
userland; however, the only case where we want it happen right
there (i.e. where task_work_add won't do) is MNT_INTERNAL victim.
Those have to be fully synchronous - failure halfway through module
init might count on having vfsmount killed right there. Fortunately,
final mntput on MNT_INTERNAL vfsmounts happens on shallow stack.
So we handle those synchronously and do an analog of delayed fput
logics for everything else.

As the result, we are guaranteed that fs shutdown will always happen
on shallow stack.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

autofs - remove obsolete d_invalidate() from expire

Biederman's umount-on-rmdir series changes d_invalidate() to sumarily remove
mounts under the passed in dentry regardless of whether they are busy
or not. So calling this in fs/autofs4/expire.c:autofs4_tree_busy() is
definitely the wrong thing to do becuase it will silently umount entries
instead of just cleaning stale dentrys.

But this call shouldn't be needed and testing shows that automounting
continues to function without it.

As Al Viro correctly surmises the original intent of the call was to
perform what shrink_dcache_parent() does.

If at some time in the future I see stale dentries accumulating
following failed mounts I'll revisit the issue and possibly add a
shrink_dcache_parent() call if needed.

Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Allow sharing external names after __d_move()

* external dentry names get a small structure prepended to them
(struct external_name).
* it contains an atomic refcount, matching the number of struct dentry
instances that have ->d_name.name pointing to that external name.  The
first thing free_dentry() does is decrementing refcount of external name,
so the instances that are between the call of free_dentry() and
RCU-delayed actual freeing do not contribute.
* __d_move(x, y, false) makes the name of x equal to the name of y,
external or not.  If y has an external name, extra reference is grabbed
and put into x->d_name.name.  If x used to have an external name, the
reference to the old name is dropped and, should it reach zero, freeing
is scheduled via kfree_rcu().
* free_dentry() in dentry with external name decrements the refcount of
that name and, should it reach zero, does RCU-delayed call that will
free both the dentry and external name.  Otherwise it does what it
used to do, except that __d_free() doesn't even look at ->d_name.name;
it simply frees the dentry.

All non-RCU accesses to dentry external name are safe wrt freeing since they
all should happen before free_dentry() is called.  RCU accesses might run
into a dentry seen by free_dentry() or into an old name that got already
dropped by __d_move(); however, in both cases dentry must have been
alive and refer to that name at some point after we'd done rcu_read_lock(),
which means that any freeing must be still pending.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

missing data dependency barrier in prepend_name()

AFAICS, prepend_name() is broken on SMP alpha.  Disclaimer: I don't have
SMP alpha boxen to reproduce it on.  However, it really looks like the race
is real.

CPU1: d_path() on /mnt/ramfs/<255-character>/foo
CPU2: mv /mnt/ramfs/<255-character> /mnt/ramfs/<63-character>

CPU2 does d_alloc(), which allocates an external name, stores the name there
including terminating NUL, does smp_wmb() and stores its address in
dentry->d_name.name.  It proceeds to d_add(dentry, NULL) and d_move()
old dentry over to that.  ->d_name.name value ends up in that dentry.

In the meanwhile, CPU1 gets to prepend_name() for that dentry.  It fetches
->d_name.name and ->d_name.len; the former ends up pointing to new name
(64-byte kmalloc'ed array), the latter - 255 (length of the old name).
Nothing to force the ordering there, and normally that would be OK, since we'd
run into the terminating NUL and stop.  Except that it's alpha, and we'd need
a data dependency barrier to guarantee that we see that store of NUL
__d_alloc() has done.  In a similar situation dentry_cmp() would survive; it
does explicit smp_read_barrier_depends() after fetching ->d_name.name.
prepend_name() doesn't and it risks walking past the end of kmalloc'ed object
and possibly oops due to taking a page fault in kernel mode.

Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"Assorted fixes + unifying __d_move() and __d_materialise_dentry() +
  minimal regression fix for d_path() of victims of overwriting rename()
  ported on top of that"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: Don't exchange "short" filenames unconditionally.
  fold swapping ->d_name.hash into switch_names()
  fold unlocking the children into dentry_unlock_parents_for_move()
  kill __d_materialise_dentry()
  __d_materialise_dentry(): flip the order of arguments
  __d_move(): fold manipulations with ->d_child/->d_subdirs
  don't open-code d_rehash() in d_materialise_unique()
  pull rehashing and unlocking the target dentry into __d_materialise_dentry()
  ufs: deal with nfsd/iget races
  fuse: honour max_read and max_write in direct_io mode
  shmem: fix nlink for rename overwrite directory

Merge branch 'for-3.17-fixes' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
"This is quite late but these need to be backported anyway.

  This is the fix for a long-standing cpuset bug which existed from
  2009.  cpuset makes use of PF_SPREAD_{PAGE|SLAB} flags to modify the
  task's memory allocation behavior according to the settings of the
  cpuset it belongs to; unfortunately, when those flags have to be
  changed, cpuset did so directly even whlie the target task is running,
  which is obviously racy as task->flags may be modified by the task
  itself at any time.  This obscure bug manifested as corrupt
  PF_USED_MATH flag leading to a weird crash.

  The bug is fixed by moving the flag to task->atomic_flags.  The first
  two are prepatory ones to help defining atomic_flags accessors and the
  third one is the actual fix"

* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags
  sched: add macros to define bitops for task atomic flags
  sched: fix confusing PFA_NO_NEW_PRIVS constant

Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"Here's our last set of fixes for 3.17.  Most of these are for TI
  platforms, fixing some noisy Kconfig issues, runtime clock and power
  issues on several platforms and NAND timings on DRA7.

  There are also a couple of bug fixes for i.MX, one for QCOM and a
small fix to avoid section mismatch noise on PXA.

  Diffstat looks large, partially due to some tables being updated and
  thus touching many lines.  The qcom gsbi change also restructures
  clock management a bit and thus touches a bunch of lines.

  All in all, a bit more changes than we'd like at this point, but
  nothing stands out as risky either so it seems like the right thing to
  send it up now instead of holding it to the merge window"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  drivers/soc: qcom: do not disable the iface clock in probe
  ARM: imx: fix .is_enabled() of shared gate clock
  ARM: OMAP3: Fix I/O chain clock line assertion timed out error
  ARM: keystone: dts: fix bindings for pcie and usb clock nodes
  bus: omap_l3_noc: Fix connID for OMAP4
  ARM: DT: imx53: fix lvds channel 1 port
  ARM: dts: cm-t54: fix serial console power supply.
  ARM: dts: dra7-evm: Fix NAND GPMC timings
  ARM: pxa: fix section mismatch warning for pxa_timer_nodt_init
  ARM: OMAP: Fix Kconfig warning for omap1

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"The final round of fixes.  One corner case in the math emulator and
  another one in the mcount function for ftrace"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: mcount: Adjust stack pointer for static trace in MIPS32
  MIPS: Fix MFC1 & MFHC1 emulation for 64-bit MIPS systems

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"This has:

   - EFI revert to fix a boot regression
   - early_ioremap() fix for boot failure
   - KASLR fix for possible boot failures
   - EFI fix for corrupted string printing
   - remove a misleading EFI bootup 'failed!' error message

  Unfortunately it's all rather close to the merge window"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Truncate 64-bit values when calling 32-bit OutputString()
  x86/efi: Delete misleading efi_printk() error message
  Revert "efi/x86: efistub: Move shared dependencies to <asm/efi.h>"
  x86/kaslr: Avoid the setup_data area when picking location
  x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8

vfs: Don't exchange "short" filenames unconditionally.

Only exchange source and destination filenames
if flags contain RENAME_EXCHANGE.
In case if executable file was running and replaced by
other file /proc/PID/exe should still show correct file name,
not the old name of the file by which it was replaced.

The scenario when this bug manifests itself was like this:
* ALT Linux uses rpm and start-stop-daemon;
* during a package upgrade rpm creates a temporary file
  for an executable to rename it upon successful unpacking;
* start-stop-daemon is run subsequently and it obtains
  the (nonexistant) temporary filename via /proc/PID/exe
  thus failing to identify the running process.

Note that "long" filenames (> DNAiME_INLINE_LEN) are still
exchanged without RENAME_EXCHANGE and this behaviour exists
long enough (should be fixed too apparently).
So this patch is just an interim workaround that restores
behavior for "short" names as it was before changes
introduced by commit da1ce0670c14 ("vfs: add cross-rename").

See https://lkml.org/lkml/2014/9/7/6 for details.

AV: the comments about being more careful with ->d_name.hash
than with ->d_name.name are from back in 2.3.40s; they
became obsolete by 2.3.60s, when we started to unhash the
target instead of swapping hash chain positions followed
by d_delete() as we used to do when dcache was first
introduced.

Acked-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: da1ce0670c14 "vfs: add cross-rename"
Signed-off-by: Mikhail Efremov <sem@altlinux.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fold swapping ->d_name.hash into switch_names()

and do it along with ->d_name.len there

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fold unlocking the children into dentry_unlock_parents_for_move()

... renaming it into dentry_unlock_for_move() and making it more
symmetric with dentry_lock_for_move().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

kill __d_materialise_dentry()

it folds into __d_move() now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

__d_materialise_dentry(): flip the order of arguments

... thus making it much closer to (now unreachable, BTW) IS_ROOT(dentry)
case in __d_move(). A bit more and it'll fold in.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

__d_move(): fold manipulations with ->d_child/->d_subdirs

list_del() + list_add() is a slightly pessimised list_move()
list_del() + INIT_LIST_HEAD() is a slightly pessimised list_del_init()

Interleaving those makes the resulting code even worse. And harder to follow...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

don't open-code d_rehash() in d_materialise_unique()

... and get rid of duplicate BUG_ON() there

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

pull rehashing and unlocking the target dentry into __d_materialise_dentry()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

ufs: deal with nfsd/iget races

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fuse: honour max_read and max_write in direct_io mode

The third argument of fuse_get_user_pages() "nbytesp" refers to the number of
bytes a caller asked to pack into fuse request. This value may be lesser
than capacity of fuse request or iov_iter. So fuse_get_user_pages() must
ensure that *nbytesp won't grow.

Now, when helper iov_iter_get_pages() performs all hard work of extracting
pages from iov_iter, it can be done by passing properly calculated
"maxsize" to the helper.

The other caller of iov_iter_get_pages() (dio_refill_pages()) doesn't need
this capability, so pass LONG_MAX as the maxsize argument here.

Fixes: c9c37e2e6378 ("fuse: switch to iov_iter_get_pages()")
Reported-by: Werner Baumann <werner.baumann@onlinehome.de>
Tested-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

shmem: fix nlink for rename overwrite directory

If overwriting an empty directory with rename, then need to drop the extra
nlink.

Test prog:

#include <stdio.h>
#include <fcntl.h>
#include <err.h>
#include <sys/stat.h>

int main(void)
{
const char *test_dir1 = "test-dir1";
const char *test_dir2 = "test-dir2";
int res;
int fd;
struct stat statbuf;

res = mkdir(test_dir1, 0777);
if (res == -1)
err(1, "mkdir(\"%s\")", test_dir1);

res = mkdir(test_dir2, 0777);
if (res == -1)
err(1, "mkdir(\"%s\")", test_dir2);

fd = open(test_dir2, O_RDONLY);
if (fd == -1)
err(1, "open(\"%s\")", test_dir2);

res = rename(test_dir1, test_dir2);
if (res == -1)
err(1, "rename(\"%s\", \"%s\")", test_dir1, test_dir2);

res = fstat(fd, &statbuf);
if (res == -1)
err(1, "fstat(%i)", fd);

if (statbuf.st_nlink != 0) {
fprintf(stderr, "nlink is %lu, should be 0\n", statbuf.st_nlink);
return 1;
}

return 0;
}

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fix from Dmitry Torokhov:
"A small fixup to i8042 adding Asus X450LCP to the nomux list"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: i8042 - fix Asus X450LCP touchpad detection

Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
"A CONFIG_STACK_GROWSUP=y fix, and a hotplug llc CPU mask fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix unreleased llc_shared_mask bit during CPU hotplug
sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP

Merge branch 'akpm' (fixes from Andrew Morton)

Merge fixes from Andrew Morton:
"9 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: softdirty: keep bit when zapping file pte
  fs/cachefiles: add missing \n to kerror conversions
  genalloc: fix device node resource counter
  drivers/rtc/rtc-efi.c: add missing module alias
  mm, slab: initialize object alignment on cache creation
  mm: softdirty: addresses before VMAs in PTE holes aren't softdirty
  ocfs2/dlm: do not get resource spinlock if lockres is new
  nilfs2: fix data loss with mmap()
  ocfs2: free vol_label in ocfs2_delete_osb()

mm: softdirty: keep bit when zapping file pte

This fixes the same bug as b43790eedd31 ("mm: softdirty: don't forget to
save file map softdiry bit on unmap") and 9aed8614af5a ("mm/memory.c:
don't forget to set softdirty on file mapped fault") where the return
value of pte_*mksoft_dirty was being ignored.

To be sure that no other pte/pmd "mk" function return values were being
ignored, I annotated the functions in arch/x86/include/asm/pgtable.h
with __must_check and rebuilt.

The userspace effect of this bug is that the softdirty mark might be
lost if a file mapped pte get zapped.

Signed-off-by: Peter Feiner <pfeiner@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Jamie Liu <jamieliu@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org> [3.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/cachefiles: add missing \n to kerror conversions

Commit 0227d6abb378 ("fs/cachefiles: replace kerror by pr_err") didn't
include newline featuring in original kerror definition

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reported-by: David Howells <dhowells@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: <stable@vger.kernel.org> [3.16.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

genalloc: fix device node resource counter

Decrement the np_pool device_node refcount, which was incremented on
the preceding of_parse_phandle() call.

Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-efi.c: add missing module alias

Without proper alias kernel module is not loaded for rtc-efi driver.

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Cc: dann frazier <dannf@dannf.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm, slab: initialize object alignment on cache creation

Since commit 4590685546a3 ("mm/sl[aou]b: Common alignment code"), the
"ralign" automatic variable in __kmem_cache_create() may be used as
uninitialized.

The proper alignment defaults to BYTES_PER_WORD and can be overridden by
SLAB_RED_ZONE or the alignment specified by the caller.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=85031

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Andrei Elovikov <a.elovikov@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: softdirty: addresses before VMAs in PTE holes aren't softdirty

In PTE holes that contain VM_SOFTDIRTY VMAs, unmapped addresses before
VM_SOFTDIRTY VMAs are reported as softdirty by /proc/pid/pagemap.  This
bug was introduced in commit 68b5a6524856 ("mm: softdirty: respect
VM_SOFTDIRTY in PTE holes").  That commit made /proc/pid/pagemap look at
VM_SOFTDIRTY in PTE holes but neglected to observe the start of VMAs
returned by find_vma.

Tested:
  Wrote a selftest that creates a PMD-sized VMA then unmaps the first
  page and asserts that the page is not softdirty. I'm going to send the
  pagemap selftest in a later commit.

Signed-off-by: Peter Feiner <pfeiner@google.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Jamie Liu <jamieliu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2/dlm: do not get resource spinlock if lockres is new

There is a deadlock case which reported by Guozhonghua:
  https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html

This case is caused by &res->spinlock and &dlm->master_lock
misordering in different threads.

It was introduced by commit 8d400b81cc83 ("ocfs2/dlm: Clean up refmap
helpers").  Since lockres is new, it doesn't not require the
&res->spinlock.  So remove it.

Fixes: 8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers")
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nilfs2: fix data loss with mmap()

This bug leads to reproducible silent data loss, despite the use of
msync(), sync() and a clean unmount of the file system.  It is easily
reproducible with the following script:

  ----------------[BEGIN SCRIPT]--------------------
  mkfs.nilfs2 -f /dev/sdb
  mount /dev/sdb /mnt

  dd if=/dev/zero bs=1M count=30 of=/mnt/testfile

  umount /mnt
  mount /dev/sdb /mnt
  CHECKSUM_BEFORE="$(md5sum /mnt/testfile)"

  /root/mmaptest/mmaptest /mnt/testfile 30 10 5

  sync
  CHECKSUM_AFTER="$(md5sum /mnt/testfile)"
  umount /mnt
  mount /dev/sdb /mnt
  CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)"
  umount /mnt

  echo "BEFORE MMAP:\t$CHECKSUM_BEFORE"
  echo "AFTER MMAP:\t$CHECKSUM_AFTER"
  echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT"
  ----------------[END SCRIPT]--------------------

The mmaptest tool looks something like this (very simplified, with
error checking removed):

  ----------------[BEGIN mmaptest]--------------------
  data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
              MAP_SHARED, fd, file_offset);

  for (i = 0; i < write_count; ++i) {
        memcpy(data + i * 4096, buf, sizeof(buf));
        msync(data, file_size - file_offset, MS_SYNC))
  }
  ----------------[END mmaptest]--------------------

The output of the script looks something like this:

  BEFORE MMAP:    281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
  AFTER MMAP:     6604a1c31f10780331a6850371b3a313  /mnt/testfile
  AFTER REMOUNT:  281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile

So it is clear, that the changes done using mmap() do not survive a
remount.  This can be reproduced a 100% of the time.  The problem was
introduced in commit 136e8770cd5d ("nilfs2: fix issue of
nilfs_set_page_dirty() for page at EOF boundary").

If the page was read with mpage_readpage() or mpage_readpages() for
example, then it has no buffers attached to it.  In that case
page_has_buffers(page) in nilfs_set_page_dirty() will be false.
Therefore nilfs_set_file_dirty() is never called and the pages are never
collected and never written to disk.

This patch fixes the problem by also calling nilfs_set_file_dirty() if the
page has no buffers attached to it.

[akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/]
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: free vol_label in ocfs2_delete_osb()

osb->vol_label is malloced in ocfs2_initialize_super but not freed if
error occurs or during umount, thus causing a memory leak.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MIPS: mcount: Adjust stack pointer for static trace in MIPS32

Every mcount() call in the MIPS 32-bit kernel is done as follows:

[...]
move at, ra
jal _mcount
addiu sp, sp, -8
[...]

but upon returning from the mcount() function, the stack pointer
is not adjusted properly. This is explained in details in 58b69401c797
(MIPS: Function tracer: Fix broken function tracing).

Commit ad8c396936e3 ("MIPS: Unbreak function tracer for 64-bit kernel.)
fixed the stack manipulation for 64-bit but it didn't fix it completely
for MIPS32.

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: <stable@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7792/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Fix MFC1 & MFHC1 emulation for 64-bit MIPS systems

Commit bbd426f542cb "MIPS: Simplify FP context access" modified the
SIFROMREG & SIFROMHREG macros such that they return unsigned rather
than signed 32b integers. I had believed that to be fine, but
inadvertently missed the MFC1 & MFHC1 cases which write to a struct
pt_regs regs element. On MIPS32 this is fine, but on 64 bit those
saved regs' fields are 64 bit wide. Using unsigned values caused the
32 bit value from the FP register to be zero rather than sign extended
as the architecture specifies, causing incorrect emulation of the
MFC1 & MFHc1 instructions. Fix by reintroducing the casts to signed
integers, and therefore the sign extension.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: stable@vger.kernel.org # v3.15+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7848/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Merge tag 'pm+acpi-3.17-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI and power management fixes from Rafael Wysocki:
"These are regression fixes (ACPI hotplug, cpufreq, hibernation, ACPI
  LPSS driver), fixes for stuff that never worked correctly (ACPI GPIO
  support in some cases and a wrong sign of an error code in the ACPI
  core in one place), and one blacklist item for ACPI backlight
  handling.

  Specifics:

   - Revert of a recent hibernation core commit that introduced a NULL
     pointer dereference during resume for at least one user (Rafael J
     Wysocki).

   - Fix for the ACPI LPSS (Low-Power Subsystem) driver to disable
     asynchronous PM callback execution for LPSS devices during system
     suspend/resume (introduced in 3.16) which turns out to break
     ordering expectations on some systems.  From Fu Zhonghui.

   - cpufreq core fix related to the handling of sysfs nodes during
     system suspend/resume that has been broken for intel_pstate since
     3.15 from Lan Tianyu.

   - Restore the generation of "online" uevents for ACPI container
     devices that was removed in 3.14, but some user space utilities
     turn out to need them (Rafael J Wysocki).

   - The cpufreq core fails to release a lock in an error code path
     after changes made in 3.14.  Fix from Prarit Bhargava.

   - ACPICA and ACPI/GPIO fixes to make the handling of ACPI GPIO
     operation regions (which means AML using GPIOs) work correctly in
     all cases from Bob Moore and Srinivas Pandruvada.

   - Fix for a wrong sign of the ACPI core's create_modalias() return
     value in case of an error from Mika Westerberg.

   - ACPI backlight blacklist entry for ThinkPad X201s from Aaron Lu"

* tag 'pm+acpi-3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()"
  gpio / ACPI: Use pin index and bit length
  ACPICA: Update to GPIO region handler interface.
  ACPI / platform / LPSS: disable async suspend/resume of LPSS devices
  cpufreq: release policy->rwsem on error
  cpufreq: fix cpufreq suspend/resume for intel_pstate
  ACPI / scan: Correct error return value of create_modalias()
  ACPI / video: disable native backlight for ThinkPad X201s
  ACPI / hotplug: Generate online uevents for ACPI containers

Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"This is probably not the kind of pull request you want to see that
  late in the cycle.  Yet, the ACPI refactorization was problematic
  again and caused another two issues which need fixing.  My holidays
  with limited internet (plus travelling) and the developer's illness
  didn't help either :(

  The details:

   - ACPI code was refactored out into a seperate file and as a
     side-effect, the i2c-core module got renamed.  Jean Delvare
     rightfully complained about the rename being problematic for
     distributions.  So, Mika and I thought the least problematic way to
     deal with it is to move all the code back into the main i2c core
     source file.  This is mainly a huge code move with some #ifdeffery
     applied.  No functional code changes.  Our personal tests and the
     testbots did not find problems.  (I was thinking about reverting,
     too, yet that would also have ~800 lines changed)

   - The new ACPI code also had a NULL pointer exception, thanks to
     Peter for finding and fixing it.

   - Mikko fixed a locking problem by decoupling clock_prepare and
     clock_enable.

   - Addy learnt that the datasheet was wrong and reimplemented the
     frequency setup according to the new algorithm.

  - Fan fixed an off-by-one error when copying data

  - Janusz fixed a copy'n'paste bug which gave a wrong error message

  - Sergei made sure that "don't touch" bits are not accessed"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: acpi: Fix NULL Pointer dereference
  i2c: move acpi code back into the core
  i2c: rk3x: fix divisor calculation for SCL frequency
  i2c: mxs: fix error message in pio transfer
  i2c: ismt: use correct length when copy buffer
  i2c: rcar: fix RCAR_IRQ_ACK_{RECV|SEND}
  i2c: tegra: Move clk_prepare/clk_set_rate to probe

MAINTAINERS: new Documentation maintainer

Transfer Documentation maintainership to Jiri Kosina.
Thanks, Jiri.

I'll still be reviewing and working on documentation.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branches 'pm-cpufreq' and 'pm-sleep'

* pm-cpufreq:
  cpufreq: release policy->rwsem on error
  cpufreq: fix cpufreq suspend/resume for intel_pstate

* pm-sleep:
  Revert "PM / Hibernate: Iterate over set bits instead of PFNs in swsusp_free()"

Merge branches 'acpi-hotplug', 'acpi-scan', 'acpi-lpss', 'acpi-gpio' and 'acpi-video'

* acpi-hotplug:
  ACPI / hotplug: Generate online uevents for ACPI containers

* acpi-scan:
  ACPI / scan: Correct error return value of create_modalias()

* acpi-lpss:
  ACPI / platform / LPSS: disable async suspend/resume of LPSS devices

* acpi-gpio:
  gpio / ACPI: Use pin index and bit length
  ACPICA: Update to GPIO region handler interface.

* acpi-video:
  ACPI / video: disable native backlight for ThinkPad X201s

Merge tag 'gpio-v3.17-4' of git://git./linux/kernel/git/linusw/linux-gpio

Pull gpio fixes from Linus Walleij:
"Two GPIO fixes:

   - GPIO direction flags where handled wrong in the new descriptor-
     based API, so direction changes did not always "take".

   - Fix a handler installation race in the generic GPIO irqchip code"

* tag 'gpio-v3.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: Fix potential NULL handler data in chained irqchip handler
  gpio: Fix gpio direction flags not getting set

Merge tag 'efi-urgent' of git://git./linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

  * Revert the static library changes from the merge window since they're
    causing issues for Macbooks and Fedora + Grub2 (Matt Fleming)

  * Delete the misleading "setup_efi_pci() failed!" message which some
    people are seeing when booting EFI (Matt Fleming)

  * Fix printing strings from the 32-bit EFI boot stub by only passing
    32-bit addresses to the firmware (Matt Fleming)

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux

Pull devicetree bug fixes and documentation from Grant Likely:
"Several bug fix commits for issues found in the v3.17 rc series.

  Most of these are minor in that they aren't actively dangerous, but
  they have been seen in the wild.  The one important fix is commit
  7dbe5849fb50 ("of: make sure of_alias is initialized before accessing
  it"), without which some powerpc platforms will fail to find stdout
  for the console"

* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
  of/fdt: fix memory range check
  of: Fix memory block alignment in early_init_dt_add_memory_arch()
  of: make sure of_alias is initialized before accessing it
  of: Documentation regarding attaching OF Selftest testdata
  of: Disabling OF functions that use sysfs if CONFIG_SYSFS disabled
  of: correct of_console_check()'s return value

i2c: acpi: Fix NULL Pointer dereference

If adapter->dev.parent == NULL there is a NULL pointer dereference in
acpi_i2c_install_space_handler and acpi_i2c_remove_space_handler.

This is present since introduction of this code:
366047515c6e "i2c: rework kernel config I2C_ACPI" or even
da3c6647ee08 "I2C/ACPI: Clean up I2C ACPI code and Add CONFIG_I2C_ACPI"

The adapter->dev.parent == NULL case is valid for the i2c_stub,
so loading i2c_stub with ACPI_I2C_OPREGION enabled results in an oops.
This is also valid at least for i2c_tiny_usb and i2c_robotfuzz_osif.

Fix by checking whether it is null before calling ACPI_HANDLE.

Signed-off-by: Peter Huewe <peterhuewe@gmx.de>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

i2c: move acpi code back into the core

Commit 5d98e61d337c ("I2C/ACPI: Add i2c ACPI operation region support")
renamed the i2c-core module. This may cause regressions for
distributions, so put the ACPI code back into the core.

Reported-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>

of/fdt: fix memory range check

In cases where board has below memory DT node

memory{
device_type = "memory";
reg = <0x80000000 0x80000000>;
};

Check on the memory range in fdt.c will always fail because it is
comparing MAX_PHYS_ADDR with base + size, in fact it should compare
it with base + size - 1.

This issue was originally noticed on Qualcomm IFC6410 board.
Without this patch kernel shows up noticed unnecessary warnings

[    0.000000] Machine model: Qualcomm APQ8064/IFC6410
[    0.000000] Ignoring memory range 0xffffffff - 0x100000000
[    0.000000] cma: Reserved 64 MiB at ab800000

as a result the size get reduced to 0x7fffffff which looks wrong.

This patch fixes the check involved in generating this warning and
as a result it also fixes the wrong size calculation.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
[grant.likely: adjust new size calculation also]
Signed-off-by: Grant Likely <grant.likely@linaro.org>