git.openwrt.org Git - openwrt/staging/blogic.git/log

Jeff Layton [Wed, 25 Jul 2012 14:19:47 +0000 (10:19 -0400)]

vfs: don't let do_last pass negative dentry to audit_inode

I can reliably reproduce the following panic by simply setting an audit
rule on a recent 3.5.0+ kernel:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
IP: [<ffffffff810d1250>] audit_copy_inode+0x10/0x90
PGD 7acd9067 PUD 7b8fb067 PMD 0
Oops: 0000 [#86] SMP
Modules linked in: nfs nfs_acl auth_rpcgss fscache lockd sunrpc tpm_bios btrfs zlib_deflate libcrc32c kvm_amd kvm joydev virtio_net pcspkr i2c_piix4 floppy virtio_balloon microcode virtio_blk cirrus drm_kms_helper ttm drm i2c_core [last unloaded: scsi_wait_scan]
CPU 0
Pid: 1286, comm: abrt-dump-oops Tainted: G      D      3.5.0+ #1 Bochs Bochs
RIP: 0010:[<ffffffff810d1250>]  [<ffffffff810d1250>] audit_copy_inode+0x10/0x90
RSP: 0018:ffff88007aebfc38  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88003692d860 RCX: 00000000000038c4
RDX: 0000000000000000 RSI: ffff88006baf5d80 RDI: ffff88003692d860
RBP: ffff88007aebfc68 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffff880036d30f00 R14: ffff88006baf5d80 R15: ffff88003692d800
FS:  00007f7562634740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000040 CR3: 000000003643d000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process abrt-dump-oops (pid: 1286, threadinfo ffff88007aebe000, task ffff880079614530)
Stack:
  ffff88007aebfdf8 ffff88007aebff28 ffff88007aebfc98 ffffffff81211358
  ffff88003692d860 0000000000000000 ffff88007aebfcc8 ffffffff810d4968
  ffff88007aebfcc8 ffff8800000038c4 0000000000000000 0000000000000000
Call Trace:
  [<ffffffff81211358>] ? ext4_lookup+0xe8/0x160
  [<ffffffff810d4968>] __audit_inode+0x118/0x2d0
  [<ffffffff811955a9>] do_last+0x999/0xe80
  [<ffffffff81191fe8>] ? inode_permission+0x18/0x50
  [<ffffffff81171efa>] ? kmem_cache_alloc_trace+0x11a/0x130
  [<ffffffff81195b4a>] path_openat+0xba/0x420
  [<ffffffff81196111>] do_filp_open+0x41/0xa0
  [<ffffffff811a24bd>] ? alloc_fd+0x4d/0x120
  [<ffffffff811855cd>] do_sys_open+0xed/0x1c0
  [<ffffffff810d40cc>] ? __audit_syscall_entry+0xcc/0x300
  [<ffffffff811856c1>] sys_open+0x21/0x30
  [<ffffffff81611ca9>] system_call_fastpath+0x16/0x1b
  RSP <ffff88007aebfc38>
CR2: 0000000000000040

The problem is that do_last is passing a negative dentry to audit_inode.
The comments on lookup_open note that it can pass back a negative dentry
if O_CREAT is not set.

This patch fixes the oops, but I'm not clear on whether there's a better
approach.

Cc: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
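
A userspace sketch of the failure mode described in this message, not the actual patch. The struct names mirror the kernel's, but audit_inode_sketch() and open_last_sketch() are hypothetical stand-ins. A negative dentry has a NULL d_inode, so the caller has to check before handing it to anything that dereferences the inode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumption). */
struct inode { unsigned long i_ino; };
struct dentry { struct inode *d_inode; };  /* NULL d_inode == negative dentry */

/* Hypothetical model of audit_inode(): dereferences d_inode unconditionally. */
static unsigned long audit_inode_sketch(const struct dentry *dentry)
{
    assert(dentry->d_inode != NULL);  /* the kernel would oops here instead */
    return dentry->d_inode->i_ino;
}

/* Hypothetical caller: return -ENOENT before auditing a negative dentry. */
static long open_last_sketch(const struct dentry *dentry)
{
    if (dentry->d_inode == NULL)
        return -ENOENT;
    return (long)audit_inode_sketch(dentry);
}
```

Whether the check belongs in the caller or in the audit code is exactly the open question the message raises; the sketch only shows the caller-side variant.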

Al Viro [Sun, 22 Jul 2012 17:15:37 +0000 (21:15 +0400)]

brcm80211: pointless current->files passed to filp_close()

... only needed if it's been in descriptor table

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 22 Jul 2012 17:26:51 +0000 (21:26 +0400)]

sound_firmware: don't pass crap to filp_close()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 22 Jul 2012 17:23:33 +0000 (21:23 +0400)]

gadgetfs: clean up

sigh...
* opened files have non-NULL dentries and non-NULL inodes
* filp_close() needs current->files only if the file had been
in descriptor table.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 22 Jul 2012 17:09:14 +0000 (21:09 +0400)]

slightly reduce lossage in gdm72xx

* filp_close() needs non-NULL second argument only if it'd been in descriptor
table
* opened files have non-NULL dentries, TYVM
* ... and those dentries are positive - it's kinda hard to open a file that
doesn't exist.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 22 Jul 2012 17:02:01 +0000 (21:02 +0400)]

slightly reduce idiocy in drivers/staging/bcm/Misc.c

a) vfs_llseek() does *not* access userland pointers of any kind
b) neither does filp_close(), for that matter
c) ... nor filp_open()
d) vfs_read() does, but we do have a wrapper for that (kernel_read()),
so there's no need to reinvent it.
e) passing current->files to filp_close() on something that never
had been in descriptor table is pointless.

ISAGN: voodoo dolls to be used on voodoo programmers...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sat, 21 Jul 2012 11:33:25 +0000 (15:33 +0400)]

consolidate pipe file creation

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Fri, 20 Jul 2012 19:28:46 +0000 (23:28 +0400)]

take grabbing f->f_path to do_dentry_open()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Fri, 20 Jul 2012 19:05:59 +0000 (23:05 +0400)]

uninline file_free_rcu()

What inline? Its only use is passing its address to call_rcu(), for fuck sake!

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Fri, 20 Jul 2012 08:09:19 +0000 (12:09 +0400)]

ecryptfs_lookup_interpose(): allocate dentry_info first

less work on failure that way

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Fri, 20 Jul 2012 08:03:41 +0000 (12:03 +0400)]

sanitize ecryptfs_lookup()

* ->lookup() never gets hit with . or ..
* dentry it gets is unhashed, so unless we had gone and hashed it ourselves, there's
no need to d_drop() the sucker.
* wrong name printed in one of the printks (NULL, in fact)

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 22:37:29 +0000 (02:37 +0400)]

clean unix_bind() up a bit

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 22:25:00 +0000 (02:25 +0400)]

pull mnt_want_write()/mnt_drop_write() into kern_path_create()/done_path_create() resp.

One side effect - attempt to create a cross-device link on a read-only fs fails
with EROFS instead of EXDEV now. Makes more sense, POSIX allows, etc.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 21:17:26 +0000 (01:17 +0400)]

mknod: take sanity checks on mode into the very beginning

Note that applying umask can't affect their results. While
that affects errno in cases like
mknod("/no_such_directory/a", 030000)
yielding -EINVAL (due to impossible mode_t) instead of
-ENOENT (due to nonexistent directory), IMO that makes a lot
more sense, POSIX allows returning either and any software
that relies on getting -ENOENT instead of -EINVAL in that
case deserves everything it gets.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 21:15:31 +0000 (01:15 +0400)]

new helper: done_path_create()

releases what needs to be released after {kern,user}_path_create()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 12:23:13 +0000 (16:23 +0400)]

pull unlock+dput() out into do_spu_create()

... and cleaning spufs_create() a bit, while we are at it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 12:12:22 +0000 (16:12 +0400)]

spufs: pull unlock-and-dput() up into spufs_create()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 12:07:30 +0000 (16:07 +0400)]

spufs_create_context(): simplify failure exits

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 12:03:21 +0000 (16:03 +0400)]

move spu_forget() into spufs_rmdir()

now that __fput() is *not* done in any callchain containing mmput(),
we can do that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 07:19:07 +0000 (11:19 +0400)]

ext4: switch EXT4_IOC_RESIZE_FS to mnt_want_write_file()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 07:17:49 +0000 (11:17 +0400)]

btrfs: switch btrfs_ioctl_balance() to mnt_want_write_file()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Tue, 26 Jun 2012 17:58:53 +0000 (21:58 +0400)]

switch dentry_open() to struct path, make it grab references itself

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Mon, 25 Jun 2012 07:46:13 +0000 (11:46 +0400)]

spufs: shift dget/mntget towards dentry_open()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sat, 14 Jul 2012 09:49:40 +0000 (13:49 +0400)]

zoran: don't bother with struct file * in zoran_map

all we need it for is file->private_data, which is assign-once, already
assigned by that point and, incidentally, its value is already in use
by zoran ->mmap() anyway. So just store that pointer instead...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Mon, 25 Jun 2012 07:38:56 +0000 (11:38 +0400)]

ecryptfs: don't reinvent the wheels, please - use struct completion

... and keep the sodding requests on stack - they are small enough.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Thu, 19 Jul 2012 05:18:15 +0000 (09:18 +0400)]

don't expose I_NEW inodes via dentry->d_inode

d_instantiate(dentry, inode);
unlock_new_inode(inode);

is a bad idea; do it the other way round...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 18 Jul 2012 16:43:19 +0000 (20:43 +0400)]

tidy up namei.c a bit

locking/unlocking for rcu walk taken to a couple of inline helpers

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 18 Jul 2012 13:32:50 +0000 (17:32 +0400)]

unobfuscate follow_up() a bit

really convoluted test in there has grown up during struct mount
introduction; what it checks is that we'd reached the root of
mount tree.

Eric Sandeen [Mon, 30 Apr 2012 18:16:04 +0000 (13:16 -0500)]

ext3: pass custom EOF to generic_file_llseek_size()

Use the new custom EOF argument to generic_file_llseek_size so
that SEEK_END will go to the max hash value for htree dirs
in ext3 rather than to i_size_read()

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Eric Sandeen [Mon, 30 Apr 2012 18:14:03 +0000 (13:14 -0500)]

ext4: use core vfs llseek code for dir seeks

Use the new functionality in generic_file_llseek_size() to
accept a custom EOF position, and un-cut-and-paste all the
vfs llseek code from ext4.

Also fix up comments on ext4_llseek() to reflect reality.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Eric Sandeen [Mon, 30 Apr 2012 18:11:29 +0000 (13:11 -0500)]

vfs: allow custom EOF in generic_file_llseek code

For ext3/4 htree directories, using the vfs llseek function with
SEEK_END goes to i_size like for any other file, but in reality
we want the maximum possible hash value.  Recent changes
in ext4 have cut & pasted generic_file_llseek() back into fs/ext4/dir.c,
but replicating this core code seems like a bad idea, especially
since the copy has already diverged from the vfs.

This patch updates generic_file_llseek_size to accept
both a custom maximum offset, and a custom EOF position.  With this
in place, ext4_dir_llseek can pass in the appropriate maximum hash
position for both maxsize and eof, and get what it wants.

As far as I know, this does not fix any bugs - nfs in the kernel
doesn't use SEEK_END, and I don't know of any user who does.  But
some ext4 folks seem keen on doing the right thing here, and I can't
really argue.

(Patch also fixes up some comments slightly)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
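
A userspace sketch of a seek helper that takes both a maximum offset and an EOF position from the caller, as the message describes. llseek_size_sketch() is a hypothetical name and signature, not the kernel function; overflow checks and locking are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Sketch of an llseek helper with a caller-supplied limit and EOF.
 * Returns the new position or a negative errno.
 */
static long long llseek_size_sketch(long long cur, long long offset, int whence,
                                    long long maxsize, long long eof)
{
    long long pos;

    switch (whence) {
    case SEEK_SET: pos = offset;       break;
    case SEEK_CUR: pos = cur + offset; break;
    case SEEK_END: pos = eof + offset; break;  /* custom EOF, not the file size */
    default:       return -EINVAL;
    }
    if (pos < 0 || pos > maxsize)
        return -EINVAL;
    return pos;
}
```

A directory implementation in the style described above would pass its maximum hash position for both maxsize and eof, so SEEK_END lands on the largest valid position.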

Jan Kara [Tue, 3 Jul 2012 14:45:34 +0000 (16:45 +0200)]

vfs: Avoid unnecessary WB_SYNC_NONE writeback during sys_sync and reorder sync passes

wakeup_flusher_threads(0) will queue work doing complete writeback for each
flusher thread. Thus there is not much point in submitting another work doing
full inode WB_SYNC_NONE writeback by writeback_inodes_sb().

After this change it does not make sense to call nonblocking ->sync_fs and
block device flush before calling sync_inodes_sb() because
wakeup_flusher_threads() is completely asynchronous and thus these functions
would be called in parallel with inode writeback running which will effectively
void any work they do. So we move sync_inodes_sb() call before these two
functions.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Jan Kara [Tue, 3 Jul 2012 14:45:33 +0000 (16:45 +0200)]

vfs: Remove unnecessary flushing of block devices

It is not necessary to write block devices twice. The reason why we first did
flush and then proper sync is that
  for_each_bdev() {
    write_bdev()
    wait_for_completion()
  }
is much slower than
  for_each_bdev()
    write_bdev()
  for_each_bdev()
    wait_for_completion()
when there is a larger amount of data. But as the above shows, there's no real
need to scan pages and submit them twice. We just need to separate the submission
and waiting parts. This patch does that.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
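
A toy model of why separating submission from waiting helps, assuming writes to different devices can proceed in parallel. struct bdev_sketch and both functions are hypothetical; io_time is an abstract time unit, not a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (assumption): each device needs io_time units to finish a write. */
struct bdev_sketch { int io_time; };

/* Submit and wait per device: the waits add up. */
static int sync_serial(const struct bdev_sketch *devs, size_t n)
{
    int now = 0;

    for (size_t i = 0; i < n; i++)
        now += devs[i].io_time;              /* write_bdev(); wait_for_completion() */
    return now;
}

/* Submit everything first, then wait: the writes overlap. */
static int sync_two_pass(const struct bdev_sketch *devs, size_t n, int *done_at)
{
    int now = 0;

    for (size_t i = 0; i < n; i++)
        done_at[i] = now + devs[i].io_time;  /* write_bdev(): submit only */
    for (size_t i = 0; i < n; i++)
        if (done_at[i] > now)
            now = done_at[i];                /* wait_for_completion() */
    return now;
}
```

In this model the serial loop costs the sum of the device times, while the two-pass version costs only the slowest device.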

Jan Kara [Tue, 3 Jul 2012 14:45:32 +0000 (16:45 +0200)]

vfs: Make sys_sync writeout also block device inodes

If a block device does not have a filesystem mounted on it, sys_sync will just
ignore it and won't write out its dirty pages. This is because writeback code
avoids writing inodes from a superblock without a backing device, and
blockdev_superblock is such a superblock. Since it's unexpected that sync
doesn't write out dirty data for block devices, be nice to users and change the
behavior to do so. So now we iterate over all block devices on blockdev_super
instead of iterating over all superblocks when syncing block devices.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Jan Kara [Tue, 3 Jul 2012 14:45:31 +0000 (16:45 +0200)]

vfs: Create function for iterating over block devices

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Jan Kara [Tue, 3 Jul 2012 14:45:30 +0000 (16:45 +0200)]

vfs: Reorder operations during sys_sync

Change the order of operations during sync from

for_each_sb {
        writeback_inodes_sb();
        sync_fs(nowait);
        __sync_blockdev(nowait);
}
for_each_sb {
        sync_inodes_sb();
        sync_fs(wait);
        __sync_blockdev(wait);
}

to

for_each_sb
        writeback_inodes_sb();
for_each_sb
        sync_fs(nowait);
for_each_sb
        __sync_blockdev(nowait);
for_each_sb
        sync_inodes_sb();
for_each_sb
        sync_fs(wait);
for_each_sb
        __sync_blockdev(wait);

This is a preparation for the following patches in this series.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Jan Kara [Tue, 3 Jul 2012 14:45:29 +0000 (16:45 +0200)]

quota: Move quota syncing to ->sync_fs method

Since the moment writes to quota files are using block device page cache and
space for quota structures is reserved at the moment they are first accessed we
have no reason to sync quota before inode writeback. In fact this order is now
only harmful since quota information can easily change during inode writeback
(either because conversion of delayed-allocated extents or simply because of
allocation of new blocks for simple filesystems not using page_mkwrite).

So move syncing of quota information after writeback of inodes into ->sync_fs
method. This way we do not have to use ->quota_sync callback which is primarily
intended for use by quotactl syscall anyway and we get rid of calling
->sync_fs() twice unnecessarily. We skip quota syncing for OCFS2 since it does
proper quota journalling in all cases (unlike ext3, ext4, and reiserfs which
also support legacy non-journalled quotas) and thus there are no dirty quota
structures.

CC: "Theodore Ts'o" <tytso@mit.edu>
CC: Joel Becker <jlbec@evilplan.org>
CC: reiserfs-devel@vger.kernel.org
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Dave Kleikamp <shaggy@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Jan Kara [Tue, 3 Jul 2012 14:45:28 +0000 (16:45 +0200)]

quota: Split dquot_quota_sync() to writeback and cache flushing part

Split off part of dquot_quota_sync() which writes dquots into a quota file
to a separate function. In the next patch we will use the function from
filesystems and we do not want to abuse ->quota_sync quotactl callback more
than necessary.

Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Jan Kara [Tue, 3 Jul 2012 14:45:27 +0000 (16:45 +0200)]

vfs: Move noop_backing_dev_info check from sync into writeback

In principle, a filesystem may want to have ->sync_fs() called during sync(1)
although it does not have a bdi (i.e. s_bdi is set to noop_backing_dev_info).
Only writeback code really needs bdi set to something reasonable. So move the
checks where they are more logical.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 13:28:08 +0000 (16:28 +0300)]

fs/ufs: get rid of write_super

This patch makes UFS stop using the VFS '->write_super()' method along with
the 's_dirt' superblock flag, because they are on their way out.

The way we implement this is that we schedule a delayed job instead of relying on
's_dirt' and '->write_super()'.

The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblocks using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds, even if there are no dirty superblocks, or there are no client
file-systems which would need this (e.g., btrfs does not use
'->write_super()'). So we want to kill it completely and thus, we need to make
file-systems stop using the '->write_super()' VFS service, and then remove
it together with the kernel thread.

Tested using fsstress from the LTP project.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 13:28:07 +0000 (16:28 +0300)]

fs/ufs: re-arrange the code a bit

This patch does not do any functional changes. It only moves 3 functions
in fs/ufs/super.c a little bit up in order to prepare for further changes
where I'll need this new arrangement to avoid forward declarations.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 13:28:06 +0000 (16:28 +0300)]

fs/ufs: remove extra superblock write on unmount

UFS calls 'ufs_write_super()' from 'ufs_put_super()' in order to write the
superblocks to the media. However, it is not needed because VFS calls
'->sync_fs()' before calling '->put_super()' - so by the time we are in
'ufs_write_super()', the superblocks are already synchronized.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Tue, 3 Jul 2012 13:43:28 +0000 (16:43 +0300)]

fs/sysv: stop using write_super and s_dirt

It does not look like sysv FS needs 'write_super()' at all, because all it
does is a timestamp update. I cannot test this patch, because this
file-system is so old and probably has not been used by anyone for years,
so there are no tools to create it in Linux. But from the code I see that
marking the superblock as dirty is basically marking the superblock buffers as
dirty and then setting the s_dirt flag. And when 'write_super()' is executed to
handle the s_dirt flag, we just update the timestamp and again mark the
superblock buffer as dirty. Seems pointless.

It looks like we can update the timestamp more opportunistically - on unmount
or remount of sync, and nothing should change.

Thus, this patch removes 'sysv_write_super()' and 's_dirt'.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Tue, 3 Jul 2012 13:43:27 +0000 (16:43 +0300)]

fs/sysv: remove another useless write_super call

We do not need to call 'sysv_write_super()' from 'sysv_remount()',
because VFS has called 'sysv_sync_fs()' before calling '->remount()'.
So remove it. Remove also '(un)lock_super()' which obviously is becoming
useless in this function.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Tue, 3 Jul 2012 13:43:26 +0000 (16:43 +0300)]

fs/sysv: remove useless write_super call

We do not need to call 'sysv_write_super()' from 'sysv_put_super()',
because VFS has called 'sysv_sync_fs()' before calling '->put_super()'.
So remove it.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:28:49 +0000 (17:28 +0300)]

hfs: get rid of hfs_sync_super

This patch makes hfs stop using the VFS '->write_super()' method along with
the 's_dirt' superblock flag, because they are on their way out.

The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblocks using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds, even if there are no dirty superblocks, or there are no client
file-systems which would need this (e.g., btrfs does not use
'->write_super()'). So we want to kill it completely and thus, we need to make
file-systems stop using the '->write_super()' VFS service, and then remove
it together with the kernel thread.

Tested using fsstress from the LTP project.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:28:48 +0000 (17:28 +0300)]

hfs: introduce VFS superblock object back-reference

Add an 'sb' VFS superblock back-reference to the 'struct hfs_sb_info' data
structure - we will need to find the VFS superblock from a
'struct hfs_sb_info' object in the next patch, so this change is just a
preparation.

Remove a few useless newlines while at it.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:28:47 +0000 (17:28 +0300)]

hfs: simplify a bit checking for R/O

We have the following pattern in 2 places in HFS

if (!RDONLY)
        hfs_mdb_commit();

This patch pushes the RDONLY check down to 'hfs_mdb_commit()'. This will
make the following patches a bit simpler.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
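
A minimal sketch of pushing the read-only check down into the callee, so that call sites no longer need their own guard. struct sb_sketch and mdb_commit_sketch() are hypothetical stand-ins for the HFS superblock and hfs_mdb_commit().

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the superblock (assumption). */
struct sb_sketch {
    bool read_only;
    int  commits;     /* counts how many times the MDB was written */
};

/* The callee checks the read-only flag itself, so callers need no guard. */
static void mdb_commit_sketch(struct sb_sketch *sb)
{
    if (sb->read_only)
        return;
    sb->commits++;
}
```

Every caller can now invoke the commit unconditionally; on a read-only superblock the call is a no-op.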

Artem Bityutskiy [Thu, 12 Jul 2012 14:28:46 +0000 (17:28 +0300)]

hfs: remove extra mdb write on unmount

HFS calls 'hfs_write_super()' from 'hfs_put_super()' in order to write the MDB
to the media. However, it is not needed because VFS calls '->sync_fs()' before
calling '->put_super()' - so by the time we are in 'hfs_write_super()', the MDB
is already synchronized.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:28:45 +0000 (17:28 +0300)]

hfs: get rid of lock_super

Stop using lock_super for serializing the MDB changes - use the buffer-head's own
lock instead. Tested with fsstress.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:28:44 +0000 (17:28 +0300)]

hfs: push lock_super down

HFS uses 'lock_super()'/'unlock_super()' around 'hfs_mdb_commit()' in order
to serialize MDB (Master Directory Block) changes. Push it down to
'hfs_mdb_commit()' in order to simplify the code a bit.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:26:31 +0000 (17:26 +0300)]

hfsplus: get rid of write_super

This patch makes hfsplus stop using the VFS '->write_super()' method along with
the 's_dirt' superblock flag, because they are on their way out.

The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblocks using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds, even if there are no dirty superblocks, or there are no client
file-systems which would need this (e.g., btrfs does not use
'->write_super()'). So we want to kill it completely and thus, we need to make
file-systems stop using the '->write_super()' VFS service, and then remove
it together with the kernel thread.

Tested using fsstress from the LTP project.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:26:30 +0000 (17:26 +0300)]

hfsplus: remove useless check

This check is useless because we always have 'sb->s_fs_info' to be non-NULL.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:26:29 +0000 (17:26 +0300)]

hfsplus: amend debugging print

Print correct function name in the debugging print of the
'hfsplus_sync_fs()' function.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Artem Bityutskiy [Thu, 12 Jul 2012 14:26:28 +0000 (17:26 +0300)]

hfsplus: make hfsplus_sync_fs static

... because it is used only in fs/hfsplus/super.c.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sat, 30 Jun 2012 07:55:24 +0000 (11:55 +0400)]

hold task_lock around checks in keyctl

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 24 Jun 2012 06:03:05 +0000 (10:03 +0400)]

get rid of ->scm_work_list

recursion in __scm_destroy() will be cut by delaying final fput()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 24 Jun 2012 06:00:10 +0000 (10:00 +0400)]

aio: now fput() is OK from interrupt context; get rid of manual delayed __fput()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 24 Jun 2012 05:56:45 +0000 (09:56 +0400)]

switch fput to task_work_add

... and schedule_work() for interrupt/kernel_thread callers
(and yes, now it *is* OK to call from interrupt).

We are guaranteed that __fput() will be done before we return
to userland (or exit).  Note that for fput() from a kernel
thread we get an async behaviour; it's almost always OK, but
sometimes you might need to have __fput() completed before
you do anything else.  There are two mechanisms for that -
a general barrier (flush_delayed_fput()) and explicit
__fput_sync().  Both should be used with care (as was the
case for fput() from kernel threads all along).  See comments
in fs/file_table.c for details.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 27 Jun 2012 07:33:29 +0000 (11:33 +0400)]

deal with task_work callbacks adding more work

It doesn't matter on normal return to userland path (we'll recheck the
NOTIFY_RESUME flag anyway), but in case of exit_task_work() we'll
need that as soon as we get callbacks capable of triggering more
task_work_add().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 27 Jun 2012 07:31:24 +0000 (11:31 +0400)]

move exit_task_work() past exit_files() et al.

... and get rid of PF_EXITING check in task_work_add().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 27 Jun 2012 07:07:19 +0000 (11:07 +0400)]

merge task_work and rcu_head, get rid of separate allocation for keyring case

task_work and rcu_head are identical now; merge them (calling the result
struct callback_head, rcu_head #define'd to it), kill separate allocation
in security/keys since we can just use cred->rcu now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Wed, 27 Jun 2012 05:24:13 +0000 (09:24 +0400)]

trim task_work: get rid of hlist

layout based on Oleg's suggestion; single-linked list,
task->task_works points to the last element, forward pointer
from said last element points to head. I'd still prefer
much more regular scheme with two pointers in task_work,
but...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
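
A userspace sketch of the layout described in this message: a singly-linked list where the list pointer refers to the last element and that element's forward pointer refers to the head. All names here (work_sketch, work_add(), work_run(), numbered_work, record_id()) are hypothetical, and the locking the kernel needs is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

struct work_sketch {
    struct work_sketch *next;
    void (*func)(struct work_sketch *);
};

/* *tailp points to the last element; last->next points to the head. */
static void work_add(struct work_sketch **tailp, struct work_sketch *w)
{
    struct work_sketch *last = *tailp;

    if (last) {
        w->next = last->next;  /* new tail points to the head */
        last->next = w;
    } else {
        w->next = w;           /* single element: it is its own head */
    }
    *tailp = w;
}

/* Detach the whole list and run the callbacks in the order they were added. */
static int work_run(struct work_sketch **tailp)
{
    struct work_sketch *last = *tailp, *w, *next;
    int ran = 0;

    if (!last)
        return 0;
    *tailp = NULL;
    w = last->next;            /* head */
    last->next = NULL;         /* break the cycle so the walk terminates */
    for (; w; w = next) {
        next = w->next;        /* read before the callback may reuse w */
        w->func(w);
        ran++;
    }
    return ran;
}

/* Demo payload: records the order in which callbacks fire. */
static int order_log[8];
static int order_len;

struct numbered_work {
    struct work_sketch work;   /* first member, so the cast below is valid */
    int id;
};

static void record_id(struct work_sketch *w)
{
    order_log[order_len++] = ((struct numbered_work *)w)->id;
}
```

Keeping only a tail pointer gives constant-time append and constant-time access to the head with one pointer per task, which is the space saving over a two-pointer scheme that the message alludes to.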

Al Viro [Tue, 26 Jun 2012 18:10:04 +0000 (22:10 +0400)]

trimming task_work: kill ->data

get rid of the only user of ->data; this is _not_ the final variant - in the
end we'll have task_work and rcu_head identical and just use cred->rcu,
at which point the separate allocation will be gone completely.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 15 Jul 2012 10:10:52 +0000 (14:10 +0400)]

signal: make sure we don't get stopped with pending task_work

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Al Viro [Sun, 22 Jul 2012 19:46:21 +0000 (23:46 +0400)]

use __lookup_hash() in kern_path_parent()

No need to bother with lookup_one_len() here - it's an overkill

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

David Howells [Mon, 25 Jun 2012 11:55:46 +0000 (12:55 +0100)]

VFS: Split inode_permission()

Split inode_permission() into inode- and superblock-dependent parts.

This is aimed at unionmounts where the superblock from the upper layer has to
be checked rather than the superblock from the lower layer as the upper layer
may be writable, thus allowing an unwritable file from the lower layer to be
copied up and modified.

Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com> (Further development)
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

David Howells [Mon, 25 Jun 2012 11:55:37 +0000 (12:55 +0100)]

VFS: Pass mount flags to sget()

Pass mount flags to sget() so that it can use them in initialising a new
superblock before the set function is called. They could also be passed to the
compare function.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

David Howells [Mon, 25 Jun 2012 11:55:28 +0000 (12:55 +0100)]

VFS: Comment mount following code

Add comments describing what the directions "up" and "down" mean and ref count
handling to the VFS mount following family of functions.

Signed-off-by: Valerie Aurora <vaurora@redhat.com> (Original author)
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

David Howells [Mon, 25 Jun 2012 11:55:18 +0000 (12:55 +0100)]

VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors

copy_tree() can theoretically fail in a case other than ENOMEM, but always
returns NULL which is interpreted by callers as -ENOMEM. Change it to return
an explicit error.

Also change clone_mnt() for consistency and because union mounts will add new
error cases.

Thanks to Andreas Gruenbacher <agruen@suse.de> for a bug fix.
[AV: folded braino fix by Dan Carpenter]

Original-author: Valerie Aurora <vaurora@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Valerie Aurora <valerie.aurora@gmail.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
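
For readers unfamiliar with the convention, here is a userspace sketch of returning an explicit error through a pointer instead of a bare NULL. The ERR_PTR()/PTR_ERR()/IS_ERR() helpers are re-implemented here in the style of the kernel's <linux/err.h>; struct tree and copy_tree_sketch() are hypothetical and are not the kernel's copy_tree().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins modelled on the kernel's <linux/err.h> helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* Error codes live in the top MAX_ERRNO values of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object, for illustration only. */
struct tree {
    int nodes;
};

/*
 * Returns a new copy, or an encoded errno. The caller can now tell
 * -EINVAL from -ENOMEM, which a bare NULL return cannot express.
 */
static struct tree *copy_tree_sketch(const struct tree *src)
{
    struct tree *t;

    if (src->nodes < 0)
        return ERR_PTR(-EINVAL);
    t = malloc(sizeof(*t));
    if (!t)
        return ERR_PTR(-ENOMEM);
    *t = *src;
    return t;
}
```

A caller checks the result with IS_ERR() and propagates PTR_ERR() rather than assuming that every failure is an out-of-memory condition.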

commit | commitdiff | tree

David Howells [Mon, 25 Jun 2012 11:55:09 +0000 (12:55 +0100)]

VFS: Make chown() and lchown() call fchownat()

Make the chown() and lchown() syscalls jump to the fchownat() syscall with the
appropriate extra arguments.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sat, 23 Jun 2012 18:49:45 +0000 (22:49 +0400)]

do_dentry_open(): close the race with mark_files_ro() in failure exit

we want to take it out of mark_files_ro() reach *before* we start
checking if we ought to drop write access.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sat, 23 Jun 2012 18:41:54 +0000 (22:41 +0400)]

mark_files_ro(): don't bother with mntget/mntput

mnt_drop_write_file() is safe under any lock

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Andrew Morton [Tue, 19 Jun 2012 23:55:58 +0000 (09:55 +1000)]

notify_change(): check that i_mutex is held

Cc: Djalal Harouni <tixxdz@opendz.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Christoph Hellwig [Mon, 18 Jun 2012 14:47:04 +0000 (10:47 -0400)]

fs: add nd_jump_link

Add a helper that abstracts out the jump to an already parsed struct path
from ->follow_link operation from procfs. Not only does this clean up
the code by moving the two sides of this game into a single helper, but
it also prepares for making struct nameidata private to namei.c

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Christoph Hellwig [Mon, 18 Jun 2012 14:47:03 +0000 (10:47 -0400)]

fs: move path_put on failure out of ->follow_link

Currently the non-nd_set_link based versions of ->follow_link are expected
to do a path_put(&nd->path) on failure. This calling convention is unexpected,
undocumented and doesn't match what the nd_set_link-based instances do.

Move the path_put out of the only non-nd_set_link based ->follow_link
instance into the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 00:40:20 +0000 (20:40 -0400)]

debugfs: get rid of useless arguments to debugfs_{mkdir,symlink}

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 00:33:28 +0000 (20:33 -0400)]

debugfs: fold debugfs_create_by_name() into the only caller

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 00:28:22 +0000 (20:28 -0400)]

debugfs: make sure that debugfs_create_file() gets used only for regulars

It, debugfs_create_dir() and debugfs_create_symlink() use the common helper
now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Fri, 8 Jun 2012 19:59:33 +0000 (15:59 -0400)]

__d_unalias() should refuse to move mountpoints

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Fri, 8 Jun 2012 00:56:54 +0000 (20:56 -0400)]

sysfs: just use d_materialise_unique()

same as for nfs et al.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Fri, 8 Jun 2012 00:51:39 +0000 (20:51 -0400)]

sysfs: switch to ->s_d_op and ->d_release()

a) ->d_iput() is wrong here - what we do to inode is completely usual, it's
dentry->d_fsdata that we want to drop. Just use ->d_release().

b) switch to ->s_d_op - no need to play with d_set_d_op()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Thu, 14 Jun 2012 23:01:42 +0000 (03:01 +0400)]

get rid of kern_path_parent()

all callers want the same thing, actually - a kinda-sorta analog of
kern_path_create(). I.e. they want parent vfsmount/dentry (with
->i_mutex held, to make sure the child dentry is still their child)
+ the child dentry.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

David Howells [Thu, 14 Jun 2012 15:13:46 +0000 (16:13 +0100)]

VFS: Fix the banner comment on lookup_open()

Since commit 197e37d9, the banner comment on lookup_open() no longer matches
what the function returns. It used to return a struct file pointer or NULL and
now it returns an integer and is passed the struct file pointer it is to use
amongst its arguments. Update the comment to reflect this.

Also add a banner comment to atomic_open().

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 22:09:36 +0000 (18:09 -0400)]

don't pass nameidata * to vfs_create()

all we want is a boolean flag, same as the method gets now

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 22:05:36 +0000 (18:05 -0400)]

don't pass nameidata to ->create()

boolean "does it have to be exclusive?" flag is passed instead;
local filesystems should just ignore it - the object is guaranteed
not to be there yet.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 21:17:17 +0000 (17:17 -0400)]

fs/namei.c: don't pass nameidata to __lookup_hash() and lookup_real()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 21:13:09 +0000 (17:13 -0400)]

stop passing nameidata to ->lookup()

Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Fri, 22 Jun 2012 08:42:10 +0000 (12:42 +0400)]

fs/namei.c: don't pass nameidata to lookup_dcache()

just the flags...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 20:10:59 +0000 (16:10 -0400)]

fs/namei.c: don't pass nameidata to d_revalidate()

since the method wrapped by it doesn't need that anymore...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 20:03:43 +0000 (16:03 -0400)]

stop passing nameidata * to ->d_revalidate()

Just the lookup flags. Die, bastard, die...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 19:36:40 +0000 (15:36 -0400)]

fs/nfs/dir.c: switch to passing nd->flags instead of nd wherever possible

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 19:33:51 +0000 (15:33 -0400)]

nfs_lookup_verify_inode() - nd is *always* non-NULL here

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 19:18:15 +0000 (15:18 -0400)]

switch nfs_lookup_check_intent() away from nameidata

just pass the flags

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 18:32:45 +0000 (14:32 -0400)]

do_dentry_open(): take initialization of file->f_path to caller

... and get rid of a couple of arguments and a pointless reassignment
in finish_open() case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 18:24:38 +0000 (14:24 -0400)]

fold __dentry_open() into its sole caller

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 18:22:04 +0000 (14:22 -0400)]

switch do_dentry_open() to returning int

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Sun, 10 Jun 2012 10:48:09 +0000 (06:48 -0400)]

make finish_no_open() return int

namely, 1 ;-) That's what we want to return from ->atomic_open()
instances after finish_no_open().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
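
A rough userspace model of the return convention mentioned above: 0 means the file was opened, 1 means "not opened, the caller should finish the job", and a negative value is an errno. The *_sketch types and functions are hypothetical stand-ins; only the meaning of the return values is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct dentry and struct file. */
struct dentry_sketch {
    int has_inode;
};

struct file_sketch {
    struct dentry_sketch *dentry;
    int opened;
};

/* Record the dentry that was found and report "not opened": always 1. */
static int finish_no_open_sketch(struct file_sketch *file,
                                 struct dentry_sketch *dentry)
{
    file->dentry = dentry;
    return 1;
}

/*
 * Sketch of an ->atomic_open()-style method:
 *   < 0  error
 *     0  file opened here
 *     1  lookup done, open left to the caller
 */
static int atomic_open_sketch(struct file_sketch *file,
                              struct dentry_sketch *dentry,
                              int can_open_here)
{
    if (!dentry)
        return -EINVAL;
    if (!can_open_here)
        return finish_no_open_sketch(file, dentry);
    file->dentry = dentry;
    file->opened = 1;
    return 0;
}
```

Having finish_no_open() return the value itself lets an instance end with a plain "return finish_no_open(...)" instead of a call followed by a separate return statement.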

commit | commitdiff | tree

Al Viro [Fri, 22 Jun 2012 08:41:10 +0000 (12:41 +0400)]

fs/namei.c: make do_last() and friends return int

Same conventions as for ->atomic_open(). Trimmed the
forest of labels a bit, while we are at it...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit | commitdiff | tree

Al Viro [Fri, 22 Jun 2012 08:40:19 +0000 (12:40 +0400)]

kill struct opendata

Just pass struct file *. Methods are happier that way...
There's no need to return struct file * from finish_open() now,
so let it return int. Next: saner prototypes for parts in
namei.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

John Crispins staging tree
