Jens Axboe [Thu, 23 Apr 2009 10:14:56 +0000 (12:14 +0200)]
cfq-iosched: fix bug with aliased request and cooperation detection
cfq_prio_tree_lookup() should return the direct match, yet it always
returns zero. Fix that.
cfq_prio_tree_add() assumes that we don't get a direct match, while
it is very possible that we do. Using O_DIRECT, you can have different
cfqq with matching requests, since you don't have the page cache
to serialize things for you. Fix this bug by only adding the cfqq if
there isn't an existing match.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
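The lookup/add behaviour described above can be sketched in userspace C. This is only an illustration under stated assumptions: it uses a plain unbalanced binary tree instead of the kernel rbtree, and `struct node`, `tree_lookup()` and `tree_add()` are hypothetical names, not the cfq code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cfqq keyed by the sector of its next request. */
struct node {
    unsigned long sector;
    struct node *left, *right;
};

/*
 * Return the node whose key equals `sector`, or NULL if there is none.
 * When there is no match and `link` is non-NULL, *link is set to the
 * slot where a node with that key would be attached.
 */
static struct node *tree_lookup(struct node **root, unsigned long sector,
                                struct node ***link)
{
    struct node **p = root;

    while (*p) {
        if (sector < (*p)->sector)
            p = &(*p)->left;
        else if (sector > (*p)->sector)
            p = &(*p)->right;
        else
            return *p;      /* direct match: report it to the caller */
    }
    if (link)
        *link = p;
    return NULL;
}

/* Insert `n` only if no node with the same key exists. Returns 1 if inserted. */
static int tree_add(struct node **root, struct node *n)
{
    struct node **link = NULL;

    if (tree_lookup(root, n->sector, &link))
        return 0;           /* existing match: leave the tree untouched */
    n->left = n->right = NULL;
    *link = n;
    return 1;
}
```

With two queues whose requests alias the same sector, the second `tree_add()` returns 0 and the first node stays in the tree.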
Jens Axboe [Thu, 23 Apr 2009 10:13:27 +0000 (12:13 +0200)]
cfq-iosched: clear ->prio_trees[] on cfqd alloc
Not strictly needed, but we should make it clear that we init the
rbtree roots here.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Hannes Reinecke [Thu, 23 Apr 2009 08:32:59 +0000 (10:32 +0200)]
block: fix intermittent dm timeout based oops
Very rarely under stress testing of dm, oopses are occurring as
something tampers with an old stack frame. This has been traced back
to blk_abort_queue() leaving a timeout_list pointing to the stack.
The reason is that sometimes blk_abort_request() won't delete the
timer (if the request is marked as complete but before the timer has
been removed, a small race window). Fix this by splicing back from
the usually empty list to the q->timeout_list.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Sage Weil [Thu, 23 Apr 2009 06:37:58 +0000 (08:37 +0200)]
umem: fix request_queue lock warning
The umem driver issues two warnings on boot, due to blk_plug_device() and
blk_remove_plug() being called without q->queue_lock held. Starting with
e48ec690 (block: extend queue_flag bitops), the queue_flag_* functions
warn if q->queue_lock doesn't appear to be locked. In fact, q->queue_lock
is NULL (though that apparently isn't otherwise a problem as the driver is
using card->lock for everything).
Although blk_init_queue() takes a request_fn_proc and spinlock_t*,
there isn't a corresponding init helper that takes a make_request_fn.
Setting queue_lock to &card->lock explicitly seems to work fine for me.
The warning goes away and the device appears to behave.
[ 1.531881] v2.3 : Micro Memory(tm) PCI memory board block driver
[ 1.538136] umem 0000:02:01.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
[ 1.545018] umem 0000:02:01.0: Micro Memory(tm) controller found (PCI Mem Module (Battery Backup))
[ 1.554176] umem 0000:02:01.0: CSR 0xfc9ffc00 -> 0xffffc200013d0c00 (0x100)
[ 1.561279] umem 0000:02:01.0: Size 1048576 KB, Battery 1 Disabled (FAILURE), Battery 2 Disabled (FAILURE)
[ 1.571114] umem 0000:02:01.0: Window size 16777216 bytes, IRQ 20
[ 1.577304] umem 0000:02:01.0: memory NOT initialized. Consider over-writing whole device.
[ 1.585989] umema:<4>------------[ cut here ]------------
[ 1.591775] WARNING: at include/linux/blkdev.h:492 blk_plug_device+0x6d/0x106()
[ 1.592025] Hardware name: H8SSL
[ 1.592025] Modules linked in:
[ 1.592025] Pid: 1, comm: swapper Not tainted 2.6.29 #8
[ 1.592025] Call Trace:
[ 1.592025] [<ffffffff8023c994>] warn_slowpath+0xd3/0xf2
[ 1.592025] [<ffffffff8025a5b5>] ? save_trace+0x3f/0x9b
[ 1.592025] [<ffffffff8025a68b>] ? add_lock_to_list+0x7a/0xba
[ 1.592025] [<ffffffff8025e609>] ? validate_chain+0xb3b/0xce8
[ 1.592025] [<ffffffff80441556>] ? mm_make_request+0x27/0x59
[ 1.592025] [<ffffffff80441556>] ? mm_make_request+0x27/0x59
[ 1.592025] [<ffffffff8025ef04>] ? __lock_acquire+0x74e/0x7b9
[ 1.592025] [<ffffffff8025a70e>] ? get_lock_stats+0x34/0x5e
[ 1.592025] [<ffffffff8025a746>] ? put_lock_stats+0xe/0x27
[ 1.592025] [<ffffffff80441556>] ? mm_make_request+0x27/0x59
[ 1.592025] [<ffffffff803ad165>] blk_plug_device+0x6d/0x106
[ 1.592025] [<ffffffff80441575>] mm_make_request+0x46/0x59
[ 1.592025] [<ffffffff803ac2d9>] generic_make_request+0x335/0x3cf
[ 1.592025] [<ffffffff8027fcc7>] ? mempool_alloc_slab+0x11/0x13
[ 1.592025] [<ffffffff8027fdce>] ? mempool_alloc+0x45/0x101
[ 1.592025] [<ffffffff8025a746>] ? put_lock_stats+0xe/0x27
[ 1.592025] [<ffffffff803adda5>] submit_bio+0x10a/0x119
[ 1.592025] [<ffffffff802c8d00>] submit_bh+0xe5/0x109
[ 1.592025] [<ffffffff802cbf43>] block_read_full_page+0x2aa/0x2cb
[ 1.592025] [<ffffffff802cf4c4>] ? blkdev_get_block+0x0/0x4c
[ 1.592025] [<ffffffff805c90a8>] ? _spin_unlock_irq+0x36/0x51
[ 1.592025] [<ffffffff80286836>] ? __lru_cache_add+0x92/0xb2
[ 1.592025] [<ffffffff802cf008>] blkdev_readpage+0x13/0x15
[ 1.592025] [<ffffffff8027de06>] read_cache_page_async+0x90/0x134
[ 1.592025] [<ffffffff802ceff5>] ? blkdev_readpage+0x0/0x15
[ 1.592025] [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[ 1.592025] [<ffffffff8027deb8>] read_cache_page+0xe/0x45
[ 1.592025] [<ffffffff802f5170>] read_dev_sector+0x2e/0x93
[ 1.592025] [<ffffffff802f5f44>] adfspart_check_ICS+0x28/0x16c
[ 1.592025] [<ffffffff8025d427>] ? trace_hardirqs_on+0xd/0xf
[ 1.592025] [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[ 1.592025] [<ffffffff802f59c5>] rescan_partitions+0x168/0x2fb
[ 1.592025] [<ffffffff802ceae9>] __blkdev_get+0x259/0x336
[ 1.592025] [<ffffffff803ca1e2>] ? kobject_put+0x47/0x4b
[ 1.592025] [<ffffffff802cebd1>] blkdev_get+0xb/0xd
[ 1.592025] [<ffffffff802f5773>] register_disk+0xc4/0x12b
[ 1.592025] [<ffffffff803b2a7b>] add_disk+0xc3/0x12d
[ 1.592025] [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[ 1.592025] [<ffffffff808a1e73>] mm_init+0x129/0x1a5
[ 1.592025] [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[ 1.592025] [<ffffffff80209056>] _stext+0x56/0x130
[ 1.592025] [<ffffffff80274932>] ? register_irq_proc+0xae/0xca
[ 1.592025] [<ffffffff802f0000>] ? proc_pid_lookup+0xb4/0x18b
[ 1.592025] [<ffffffff8087f975>] kernel_init+0x132/0x18b
[ 1.592025] [<ffffffff8020d17a>] child_rip+0xa/0x20
[ 1.592025] [<ffffffff8020cb40>] ? restore_args+0x0/0x30
[ 1.592025] [<ffffffff8087f843>] ? kernel_init+0x0/0x18b
[ 1.592025] [<ffffffff8020d170>] ? child_rip+0x0/0x20
[ 1.592025] ---[ end trace 7150b3b86da74e1e ]---
[ 1.889858] ------------[ cut here ]------------[ve_plug+0x5f/0x91()
[ 1.893848] Hardware name: H8SSL
[ 1.893848] Modules linked in:
[ 1.893848] Pid: 1, comm: swapper Tainted: G W 2.6.29 #8
[ 1.893848] Call Trace:
[ 1.893848] [<ffffffff8023c994>] warn_slowpath+0xd3/0xf2
[ 1.893848] [<ffffffff805c8411>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 1.893848] [<ffffffff8020cb40>] ? restore_args+0x0/0x30
[ 1.893848] [<ffffffff80254245>] ? __atomic_notifier_call_chain+0x0/0xb2
[ 1.893848] [<ffffffff805c90a3>] ? _spin_unlock_irq+0x31/0x51
[ 1.893848] [<ffffffff805c90bf>] ? _spin_unlock_irq+0x4d/0x51
[ 1.893848] [<ffffffff8044157d>] ? mm_make_request+0x4e/0x59
[ 1.893848] [<ffffffff8025a70e>] ? get_lock_stats+0x34/0x5e
[ 1.893848] [<ffffffff8025a75d>] ? put_lock_stats+0x25/0x27
[ 1.893848] [<ffffffff80441504>] ? mm_unplug_device+0x25/0x50
[ 1.893848] [<ffffffff803acf23>] blk_remove_plug+0x5f/0x91
[ 1.893848] [<ffffffff8044150f>] mm_unplug_device+0x30/0x50
[ 1.893848] [<ffffffff803ab74a>] blk_unplug+0x78/0x7d
[ 1.893848] [<ffffffff803ab75c>] blk_backing_dev_unplug+0xd/0xf
[ 1.893848] [<ffffffff802c853c>] block_sync_page+0x4a/0x4c
[ 1.893848] [<ffffffff8027da1c>] sync_page+0x44/0x4d
[ 1.893848] [<ffffffff805c66fd>] __wait_on_bit_lock+0x42/0x8a
[ 1.893848] [<ffffffff8027d9d8>] ? sync_page+0x0/0x4d
[ 1.893848] [<ffffffff8027d9c4>] __lock_page+0x64/0x6b
[ 1.893848] [<ffffffff802508db>] ? wake_bit_function+0x0/0x2a
[ 1.893848] [<ffffffff8027de4a>] read_cache_page_async+0xd4/0x134
[ 1.893848] [<ffffffff802ceff5>] ? blkdev_readpage+0x0/0x15
[ 1.893848] [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[ 1.893848] [<ffffffff8027deb8>] read_cache_page+0xe/0x45
[ 1.893848] [<ffffffff802f5170>] read_dev_sector+0x2e/0x93
[ 1.893848] [<ffffffff802f5f44>] adfspart_check_ICS+0x28/0x16c
[ 1.893848] [<ffffffff8025d427>] ? trace_hardirqs_on+0xd/0xf
[ 1.893848] [<ffffffff802f5f1c>] ? adfspart_check_ICS+0x0/0x16c
[ 1.893848] [<ffffffff802f59c5>] rescan_partitions+0x168/0x2fb
[ 1.893848] [<ffffffff802ceae9>] __blkdev_get+0x259/0x336
[ 1.893848] [<ffffffff803ca1e2>] ? kobject_put+0x47/0x4b
[ 1.893848] [<ffffffff802cebd1>] blkdev_get+0xb/0xd
[ 1.893848] [<ffffffff802f5773>] register_disk+0xc4/0x12b
[ 1.893848] [<ffffffff803b2a7b>] add_disk+0xc3/0x12d
[ 1.893848] [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[ 1.893848] [<ffffffff808a1e73>] mm_init+0x129/0x1a5
[ 1.893848] [<ffffffff808a1d4a>] ? mm_init+0x0/0x1a5
[ 1.893848] [<ffffffff80209056>] _stext+0x56/0x130
[ 1.893848] [<ffffffff80274932>] ? register_irq_proc+0xae/0xca
[ 1.893848] [<ffffffff802f0000>] ? proc_pid_lookup+0xb4/0x18b
[ 1.893848] [<ffffffff8087f975>] kernel_init+0x132/0x18b
[ 1.893848] [<ffffffff8020d17a>] child_rip+0xa/0x20
[ 1.893848] [<ffffffff8020cb40>] ? restore_args+0x0/0x30
[ 1.893848] [<ffffffff8087f843>] ? kernel_init+0x0/0x18b
[ 1.893848] [<ffffffff8020d170>] ? child_rip+0x0/0x20
[ 1.893848] ---[ end trace 7150b3b86da74e1f ]---
Signed-off-by: Sage Weil <sage@newdream.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jerome Marchand [Wed, 22 Apr 2009 12:01:49 +0000 (14:01 +0200)]
block: simplify I/O stat accounting
This simplifies I/O stat accounting switching code and separates it
completely from I/O scheduler switch code.
Requests are accounted according to the state of their request queue
at the time of the request allocation. There is no need anymore to
flush the request queue when switching I/O accounting state.
Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Alexander Beregalov [Tue, 21 Apr 2009 07:33:14 +0000 (09:33 +0200)]
pktcdvd.h should include mempool.h
Fix this build error:
In file included from fs/compat_ioctl.c:104:
include/linux/pktcdvd.h:285: error: expected specifier-qualifier-list before 'mempool_t'
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jeff Moyer [Tue, 21 Apr 2009 05:31:56 +0000 (07:31 +0200)]
cfq-iosched: use the default seek distance when there aren't enough seek samples
If the cfq io context doesn't have enough samples yet to provide a mean
seek distance, then use the default threshold we have for seeky IO instead
of defaulting to 0.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
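A minimal sketch of the fallback described above. The sample cutoff (80) and the default threshold (8 * 1024 sectors) are assumptions chosen for illustration, and `struct seek_stats` and `seek_limit()` are hypothetical names rather than the cfq-iosched definitions.

```c
#define SEEK_THRESHOLD (8 * 1024)  /* assumed default seeky-IO threshold, in sectors */
#define MIN_SAMPLES    80          /* assumed cutoff for "enough samples" */

/* Hypothetical per-context seek statistics. */
struct seek_stats {
    unsigned int samples;        /* number of seek samples collected so far */
    unsigned long long mean;     /* mean seek distance, in sectors */
};

/*
 * Return the distance to treat as "close": the measured mean once enough
 * samples exist, otherwise the default threshold instead of 0.
 */
static unsigned long long seek_limit(const struct seek_stats *s)
{
    if (s->samples > MIN_SAMPLES)
        return s->mean;
    return SEEK_THRESHOLD;
}
```

Falling back to 0 would make almost every request look far away for a freshly started context, which is the behaviour the fallback avoids.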
Jeff Moyer [Tue, 21 Apr 2009 05:25:04 +0000 (07:25 +0200)]
cfq-iosched: make seek_mean converge more quickly
Right now, depending on the first sector to which a process issues I/O,
the seek time may start out way out of whack. So make sure we start
with 0 sectors in seek, instead of the offset of the first request
issued.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Fri, 17 Apr 2009 06:36:50 +0000 (08:36 +0200)]
block: make blk_abort_queue() ignore non-request based devices
There's nothing to do for those devices, since the timeout handling is
based on requests.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Fri, 17 Apr 2009 06:34:48 +0000 (08:34 +0200)]
block: include empty disks in /proc/diskstats
/proc/diskstats used to show stats for all disks whether they're
zero-sized or not and their non-zero partitions. Commit
074a7aca7afa6f230104e8e65eba3420263714a5 accidentally changed the
behavior such that it doesn't print out zero sized disks. This patch
implements DISK_PITER_INCL_EMPTY_PART0 flag to partition iterator and
uses it in diskstats_show() such that empty part0 is shown in
/proc/diskstats.
Reported and bisected by Daniel Collins.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Collins <solemnwarning@solemnwarning.no-ip.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Wed, 15 Apr 2009 13:10:27 +0000 (22:10 +0900)]
bio: use bio_kmalloc() in copy/map functions
Impact: remove possible deadlock condition
There is no reason to use mempool backed allocation for map functions.
Also, because kern mapping is used inside LLDs (e.g. for EH), using
mempool backed allocation can lead to deadlock under extreme
conditions (mempool already consumed by the time a request reached EH
and requests are blocked on EH).
Switch copy/map functions to bio_kmalloc().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Tejun Heo [Wed, 15 Apr 2009 17:50:51 +0000 (19:50 +0200)]
bio: fix bio_kmalloc()
Impact: fix bio_kmalloc() and its destruction path
bio_kmalloc() was broken in two ways.
* bvec_alloc_bs() first allocates bvec using kmalloc() and then
ignores it and allocates again like non-kmalloc bvecs.
* bio_kmalloc_destructor() didn't check for and free bio integrity
data.
This patch fixes the above problems. kmalloc patch is separated out
from bio_alloc_bioset() and allocates the requested number of bvecs as
inline bvecs.
* bio_alloc_bioset() no longer takes NULL @bs. None other than
bio_kmalloc() used it and outside users can't know how it was
allocated anyway.
* Define and use BIO_POOL_NONE so that pool index check in
bvec_free_bs() triggers if inline or kmalloc allocated bvec gets
there.
* Relocate destructors on top of each allocation function so that how
they're used is more clear.
Jens Axboe suggested allocating bvecs inline.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
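The idea of allocating the vectors inline with the descriptor can be sketched in userspace C with a flexible array member. This is an assumption-laden illustration: `struct xbio`, `struct vec`, `xbio_kmalloc()` and `xbio_free()` are hypothetical names, and `calloc()`/`free()` stand in for the kernel allocator.

```c
#include <stdlib.h>

/* Hypothetical I/O vector entry. */
struct vec {
    void *page;
    unsigned int len, offset;
};

/* Hypothetical descriptor with its vectors stored in the same allocation. */
struct xbio {
    unsigned int max_vecs;
    struct vec *vecs;           /* points at inline_vecs for this allocator */
    struct vec inline_vecs[];   /* flexible array member */
};

/* One allocation covers the descriptor and the requested number of vectors. */
static struct xbio *xbio_kmalloc(unsigned int nr_vecs)
{
    struct xbio *b = calloc(1, sizeof(*b) + nr_vecs * sizeof(struct vec));

    if (!b)
        return NULL;
    b->max_vecs = nr_vecs;
    b->vecs = b->inline_vecs;
    return b;
}

/* A single free releases both, so the destructor cannot leak the vectors. */
static void xbio_free(struct xbio *b)
{
    free(b);
}
```

Because there is only one allocation, the destruction path has nothing to look up in a pool, which is the property the pool index check relies on.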
Tejun Heo [Wed, 15 Apr 2009 13:10:25 +0000 (22:10 +0900)]
block: fix queue bounce limit setting
Impact: don't set GFP_DMA in q->bounce_gfp unnecessarily
All DMA address limits are expressed in terms of the last addressable
unit (byte or page) instead of one plus that. However, when
determining bounce_gfp for 64bit machines in blk_queue_bounce_limit(),
it compares the specified limit against 0x100000000UL to determine
whether it's below 4G ending up falsely setting GFP_DMA in
q->bounce_gfp.
As DMA zone is very small on x86_64, this makes larger SG_IO transfers
very eager to trigger OOM killer. Fix it. While at it, rename the
parameter to @dma_mask for clarity and convert comment to proper
winged style.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
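The off-by-one can be illustrated with a small userspace sketch. It assumes 4 KiB pages and reduces the decision to a single comparison; the real blk_queue_bounce_limit() does more than this, and both function names below are hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Old comparison: treats the limit as "one past the last addressable byte". */
static int needs_dma_zone_old(uint64_t dma_mask)
{
    return (dma_mask >> PAGE_SHIFT) < (UINT64_C(0x100000000) >> PAGE_SHIFT);
}

/* Fixed comparison: the limit is the last addressable byte, so use 0xffffffff. */
static int needs_dma_zone_new(uint64_t dma_mask)
{
    return (dma_mask >> PAGE_SHIFT) < (UINT64_C(0xffffffff) >> PAGE_SHIFT);
}
```

For a device that can address the full 32 bits (mask 0xffffffff), the old comparison evaluates to true and the fixed one to false, while a 24-bit mask such as 0x00ffffff is classified as needing the DMA zone by both.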
Tejun Heo [Wed, 15 Apr 2009 13:10:24 +0000 (22:10 +0900)]
block: fix SG_IO vector request data length handling
Impact: fix SG_IO behavior such that it matches the documentation
SG_IO howto says that if ->dxfer_len and sum of iovec disagree, the
shorter one wins. However, the current implementation returns -EINVAL
for such cases. Trim iovec if it's longer than ->dxfer_len.
This patch uses iov_*() helpers which take struct iovec * by casting
struct sg_iovec * to it. sg_iovec is always identical to iovec and
this will be further cleaned up with later patches.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
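The "shorter one wins" rule can be sketched in userspace C on top of the POSIX struct iovec. `trim_iovec()` is a hypothetical helper written for this illustration; it is not one of the kernel iov_*() helpers the patch uses.

```c
#include <stddef.h>
#include <sys/uio.h>

/*
 * Trim `iov` so that it describes at most `len` bytes in total.
 * Returns the number of entries still in use.
 */
static int trim_iovec(struct iovec *iov, int count, size_t len)
{
    int used = 0;

    while (used < count && len > 0) {
        if (iov[used].iov_len > len)
            iov[used].iov_len = len;    /* shorten the last entry kept */
        len -= iov[used].iov_len;
        used++;
    }
    return used;
}
```

With three 4096-byte entries and a transfer length of 6000, the helper keeps two entries and shortens the second to 1904 bytes instead of rejecting the request.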
Tejun Heo [Wed, 15 Apr 2009 13:10:23 +0000 (22:10 +0900)]
scatterlist: make sure sg_miter_next() doesn't return 0 sized mappings
Impact: fix not-so-critical but annoying bug
sg_miter_next() returns 0 sized mapping if there is a zero sized sg
entry in the list or at the end of each iteration. As the users
always check the ->length field, this bug shouldn't be critical other
than causing unnecessary iteration.
Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
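A minimal userspace sketch of an iterator that never hands out an empty mapping. The types and `seg_iter_next()` are hypothetical and only loosely modelled on the description above; they are not the sg_miter API.

```c
#include <stddef.h>

/* Hypothetical scatterlist entry. */
struct seg {
    const char *buf;
    size_t length;
};

/* Hypothetical iterator state. */
struct seg_iter {
    const struct seg *segs;
    size_t nr, idx;
    const char *addr;   /* start of the current mapping */
    size_t length;      /* length of the current mapping */
};

/* Advance to the next non-empty segment; return 0 when the list is exhausted. */
static int seg_iter_next(struct seg_iter *it)
{
    while (it->idx < it->nr) {
        const struct seg *s = &it->segs[it->idx++];

        if (s->length == 0)
            continue;   /* skip empty entries rather than returning them */
        it->addr = s->buf;
        it->length = s->length;
        return 1;
    }
    it->length = 0;
    return 0;
}
```

Callers that loop on the return value then see one step per non-empty segment and no wasted iterations.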
Linus Torvalds [Wed, 22 Apr 2009 03:07:00 +0000 (20:07 -0700)]
Linux 2.6.30-rc3
Arjan van de Ven [Tue, 21 Apr 2009 20:32:54 +0000 (13:32 -0700)]
driver synchronization: make scsi_wait_scan more advanced
There is currently only one way for userspace to say "wait for my storage
device to get ready for the modules I just loaded": to load the
scsi_wait_scan module. Expectations of userspace are that once this
module is loaded, all the (storage) devices for which the drivers
were loaded before the module load are present.
Now, there are some issues with the implementation, and the async
stuff got caught in the middle of this: The existing code only
waits for the scsi async probing to finish, but it did not take
into account at all that probing might not have begun yet.
(Russell ran into this problem on his computer and the fix works for him)
This patch fixes this more thoroughly than the previous "fix", which
had some bad side effects (namely, for kernel code that wanted to wait for
the scsi scan it would also do an async sync, which would deadlock if you did
it from async context already.. there's a report about that on lkml):
The patch makes the module first wait for all device driver probes, and then it
will wait for the scsi parallel scan to finish.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jonathan Corbet [Tue, 21 Apr 2009 22:30:32 +0000 (16:30 -0600)]
Trivial: fix a typo in slow-work.h
Fix a comment typo in slow-work.h
...a trivial mistake, but it will mess up kerneldoc if nothing else.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 21 Apr 2009 22:00:29 +0000 (23:00 +0100)]
PERCPU: Collect the DECLARE/DEFINE declarations together
Collect the DECLARE/DEFINE declarations together in linux/percpu-defs.h so
that they're in one place, and give them descriptive comments, particularly
the SHARED_ALIGNED variant.
It would be nice to collect these in linux/percpu.h, but that's not possible
without sorting out the severe #include recursion between the x86 arch headers
and the general headers (and possibly other arches too).
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 21 Apr 2009 22:00:24 +0000 (23:00 +0100)]
FRV: Fix the section attribute on UP DECLARE_PER_CPU()
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU(). This means that
architectures that reference a small data section relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.
On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section. The linker throws
up the following errors:
kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU(). However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 21 Apr 2009 21:12:58 +0000 (14:12 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: fix btrfs fallocate oops and deadlock
Btrfs: use the right node in reada_for_balance
Btrfs: fix oops on page->mapping->host during writepage
Btrfs: add a priority queue to the async thread helpers
Btrfs: use WRITE_SYNC for synchronous writes
Linus Torvalds [Tue, 21 Apr 2009 21:12:43 +0000 (14:12 -0700)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
go7007: Convert to the new i2c device binding model
Roel Kluin [Tue, 21 Apr 2009 19:24:58 +0000 (12:24 -0700)]
bfin_5xx: misplaced parentheses
`!' has a higher precedence than `&', parentheses are misplaced.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Acked-by: Sonic Zhang <sonic.zhang@analog.com>
Cc: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
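The precedence problem can be shown with a small C sketch. `STATUS_READY` and both function names are hypothetical; the driver's actual register and bit names are not given in the message above.

```c
#define STATUS_READY 0x20u   /* hypothetical status bit */

/*
 * Misplaced parentheses: `!status & STATUS_READY` parses as
 * `(!status) & STATUS_READY`. Since !status is 0 or 1, bit 0x20 is never set.
 */
static int not_ready_buggy(unsigned int status)
{
    return (!status) & STATUS_READY;
}

/* Intended test: mask the bit first, then negate the result. */
static int not_ready_fixed(unsigned int status)
{
    return !(status & STATUS_READY);
}
```

The buggy form evaluates to 0 for every input, so a wait loop or error branch guarded by it never runs.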
KOSAKI Motohiro [Tue, 21 Apr 2009 19:24:57 +0000 (12:24 -0700)]
vmscan,memcg: reintroduce sc->may_swap
Commit a6dc60f8975ad96d162915e07703a4439c80dcf0 ("vmscan: rename
sc.may_swap to may_unmap") removed the may_swap flag, but memcg had used
it as a flag for "we need to use swap?", as the name indicates.
And in the current implementation, memcg cannot reclaim mapped file
caches when mem+swap hits the limit.
Reintroduce the may_swap flag and handle it at get_scan_ratio(). This
patch doesn't influence any scan_control users other than memcg.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Jiang [Tue, 21 Apr 2009 19:24:56 +0000 (12:24 -0700)]
edac: ppc mpc85xx fix mc err detect
Error found by Jeff Haran.
The error detect register is 0s when no errors are detected. The check
code is incorrect, so reverse check sense.
Reported-by: Jeff Haran <jharan@Brocade.COM>
Signed-off-by: Dave Jiang <djiang@mvista.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Paris [Tue, 21 Apr 2009 19:24:54 +0000 (12:24 -0700)]
scsi: mpt: suppress debugobjects warning
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13133
ODEBUG: object is on stack, but not annotated
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:253 __debug_object_init+0x1f3/0x276()
Hardware name: VMware Virtual Platform
Modules linked in: mptspi(+) mptscsih mptbase scsi_transport_spi ext3 jbd mbcache
Pid: 540, comm: insmod Not tainted 2.6.28-mm1 #2
Call Trace:
[<c042c51c>] warn_slowpath+0x74/0x8a
[<c0469600>] ? start_critical_timing+0x96/0xb7
[<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c
[<c0446fad>] ? trace_hardirqs_off_caller+0x18/0xaf
[<c044704f>] ? trace_hardirqs_off+0xb/0xd
[<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c
[<c042cb84>] ? release_console_sem+0x1a5/0x1ad
[<c05013e6>] __debug_object_init+0x1f3/0x276
[<c0501494>] debug_object_init+0x13/0x17
[<c0433c56>] init_timer+0x10/0x1a
[<e08e5b54>] mpt_config+0x1c1/0x2b7 [mptbase]
[<e08e3b82>] ? kmalloc+0x8/0xa [mptbase]
[<e08e3b82>] ? kmalloc+0x8/0xa [mptbase]
[<e08e6fa2>] mpt_do_ioc_recovery+0x950/0x1212 [mptbase]
[<c04496c2>] ? __lock_acquire+0xa69/0xacc
[<c060c8f1>] ? _spin_unlock_irqrestore+0x36/0x3c
[<c060c3af>] ? _spin_unlock_irq+0x22/0x26
[<c04f2d8b>] ? string+0x2b/0x76
[<c04f310e>] ? vsnprintf+0x338/0x7b3
[<c04496c2>] ? __lock_acquire+0xa69/0xacc
[<c060c8ea>] ? _spin_unlock_irqrestore+0x2f/0x3c
[<c04496c2>] ? __lock_acquire+0xa69/0xacc
[<c044897d>] ? debug_check_no_locks_freed+0xeb/0x105
[<c060c8f1>] ? _spin_unlock_irqrestore+0x36/0x3c
[<c04488bc>] ? debug_check_no_locks_freed+0x2a/0x105
[<c0446b8c>] ? lock_release_holdtime+0x43/0x48
[<c043f742>] ? up_read+0x16/0x29
[<c05076f8>] ? pci_get_slot+0x66/0x72
[<e08e89ca>] mpt_attach+0x881/0x9b1 [mptbase]
[<e091c8e5>] mptspi_probe+0x11/0x354 [mptspi]
Noticing that every caller of mpt_config has its CONFIGPARMS struct
declared on the stack and thus the &pCfg->timer is always on the stack I
changed init_timer() to init_timer_on_stack() and it seems to have shut
up.....
Cc: "Moore, Eric Dean" <Eric.Moore@lsil.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "Desai, Kashyap" <Kashyap.Desai@lsi.com>
Cc: <stable@kernel.org> [2.6.29.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robin Holt [Tue, 21 Apr 2009 19:24:53 +0000 (12:24 -0700)]
sgi-xp/sgi-gru: allow modules to load on non-uv systems
For an upcoming distro release, we need to have the xp kernel module
loadable even when not on UV equipment. The xpc module will not load.
This will allow one set of modules dependent upon xp to work on either UV
or non-UV equipment.
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
WANG Cong [Tue, 21 Apr 2009 19:24:52 +0000 (12:24 -0700)]
uml: kill a kconfig warning
Got this warning from Kconfig:
boolean symbol INPUT tested for 'm'? test forced to 'n'
because INPUT is tristate, not bool.
Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Tue, 21 Apr 2009 19:24:51 +0000 (12:24 -0700)]
frv: insert PCI root bus resources for the MB93090 devel motherboard
Insert PCI root bus resources for the FRV-based MB93090 development kit
motherboard. This is required because the CPU's window onto the PCI bus
address space is considerably smaller than the CPU's full address space
and non-PCI devices lie outside of the PCI window that we might want to
access.
Without this patch, the PCI root bus uses the platform-level bus
resources, and these are then confined to the PCI window, thus making
platform_device_add() reject devices outside of this window.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Halasa [Tue, 21 Apr 2009 19:24:49 +0000 (12:24 -0700)]
rtc-cmos: fix printk output
With no IRQ available/defined, RTC-CMOS driver prints something like:
rtc0: alarms up to one no, y3k, 114 bytes nvram
^^^^
I guess the following is a bit easier to understand:
rtc0: no alarms, y3k, 114 bytes nvram
Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Brownell [Tue, 21 Apr 2009 19:24:49 +0000 (12:24 -0700)]
spi: documentation: emphasise spi_setup() semantics
This is a doc-only patch which I hope will reduce the number of
spi_master controller driver patches starting out with a common
implementation bug.
(As in: almost every spi_master driver I see starts out with its
version of this bug. Sigh.)
It just re-emphasizes that the setup() method may be called for one
device while a transfer is active on another ... which means that most
driver implementations shouldn't touch any registers.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Robert P. J. Day [Tue, 21 Apr 2009 19:24:47 +0000 (12:24 -0700)]
MAINTAINERS: add a more searchable string for the H8300 architecture.
Add a parenthesized string of "H8300" for more convenient searchability
in the MAINTAINERS file.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matt Mackall [Tue, 21 Apr 2009 19:24:47 +0000 (12:24 -0700)]
MAINTAINERS: add Matt Mackall to embedded maintainers
Impact: make more work for myself
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roel Kluin [Tue, 21 Apr 2009 19:24:46 +0000 (12:24 -0700)]
spi: pxa2xx: limit reaches -1
On line 944 the return value of flush() is considered as a boolean,
but limit reaches -1 upon timeout which evaluates to true.
On 540, 594, 720 the same occurs for wait_ssp_rx_stall()
On 536 the same occurs for wait_dma_channel_stop()
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
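Why a post-decremented counter ends at -1 can be sketched in plain C. The helper names and the function-pointer "busy" probe are hypothetical, and switching to a pre-decrement is one common way to address this pattern, shown here as an assumption rather than as the exact change made in the driver.

```c
/* Hypothetical probes standing in for a hardware status read. */
static int always_busy(void) { return 1; }
static int never_busy(void)  { return 0; }

/* Post-decrement: on timeout the loop exits with limit == -1, which is "true". */
static int wait_idle_buggy(int (*busy)(void), int limit)
{
    while (busy() && limit--)
        continue;
    return limit;
}

/* Pre-decrement: on timeout the loop exits with limit == 0, which is "false". */
static int wait_idle_fixed(int (*busy)(void), int limit)
{
    while (busy() && --limit)
        continue;
    return limit;
}
```

A caller that writes `if (!wait_idle(...))` to detect a timeout only works with the second form, since -1 is non-zero.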
Joe Perches [Tue, 21 Apr 2009 19:24:45 +0000 (12:24 -0700)]
MAINTAINERS: update KMEMTRACE pattern after file rename
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 21 Apr 2009 19:24:44 +0000 (12:24 -0700)]
MAINTAINERS: remove include/asm-*/suspend* file patterns
There are no more arches with suspend support using these directories.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Daniel Ribeiro [Tue, 21 Apr 2009 19:24:43 +0000 (12:24 -0700)]
pxa2xx_spi: restore DRCMR on resume
If DMA is enabled, any spi_sync call after suspend/resume would block
forever, because DRCMR is lost on suspend. This patch restores DRCMR to
the same values set by probe.
Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Helge Deller [Tue, 21 Apr 2009 19:24:42 +0000 (12:24 -0700)]
drivers/input/serio/hp_sdc.c: fix crash when removing hp_sdc module
On parisc machines, which don't have HIL, removing the hp_sdc module
panics the kernel. Fix this by returning early in hp_sdc_exit() if no HP
SDC controller was found.
Add functionality to probe for the hp_sdc_mlc kernel module (which takes
care of the upper layer HIL functionality on parisc) after two seconds.
This is needed to get all the other HIL drivers (keyboard, mouse, ...)
automatically loaded by udev later as well.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Frans Pop <elendil@planet.nl>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KAMEZAWA Hiroyuki [Tue, 21 Apr 2009 19:24:41 +0000 (12:24 -0700)]
memcg: use rcu_dereference to access mm->owner
mm->owner should be accessed with rcu_dereference().
Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Akinobu Mita [Tue, 21 Apr 2009 19:24:05 +0000 (12:24 -0700)]
hugetlbfs: return negative error code for bad mount option
This fixes the following BUG:
# mount -o size=MM -t hugetlbfs none /huge
hugetlbfs: Bad value 'MM' for mount option 'size=MM'
------------[ cut here ]------------
kernel BUG at fs/super.c:996!
Due to
BUG_ON(!mnt->mnt_sb);
in vfs_kern_mount().
Also, remove unused #include <linux/quotaops.h>
Cc: William Irwin <wli@holomorphy.com>
Cc: <stable@kernel.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dann frazier [Tue, 21 Apr 2009 19:24:05 +0000 (12:24 -0700)]
ipmi: add oem message handling
Enable userspace to receive messages that a BMC transmits using an OEM
medium. This is used by the HP iLO2.
Based on code originally written by Patrick Schoeller.
Signed-off-by: dann frazier <dannf@hp.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Tue, 21 Apr 2009 19:24:04 +0000 (12:24 -0700)]
ipmi: fix statistics counting issues
Bela Lubkin noticed that the statistics for send IPMB and LAN commands
in the IPMI driver could be incremented even if an error occurred. Move
the increments to the proper place to avoid this.
Also add some statistics for retransmissions that failed, and some little
helper functions to neaten up the code a little.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: Bela Lubkin <blubkin@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Tue, 21 Apr 2009 19:24:03 +0000 (12:24 -0700)]
ipmi: test for event buffer before using
The IPMI driver would attempt to use the event buffer even if that
didn't exist on the BMC. This patch modifies the IPMI driver to check
for the event buffer's existence before trying to use it.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Corey Minyard [Tue, 21 Apr 2009 19:24:02 +0000 (12:24 -0700)]
ipmi: fix platform return check
The wrong return value is being tested when allocating a platform device
in the IPMI SI code. Check the right value.
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Magnus Damm [Tue, 21 Apr 2009 19:24:02 +0000 (12:24 -0700)]
clocksource: add enable() and disable() callbacks
Add enable() and disable() callbacks for clocksources.
This allows us to put unused clocksources in power save mode. The
functions clocksource_enable() and clocksource_disable() wrap the
callbacks and are inserted in the timekeeping code to enable before use
and disable after switching to a new clocksource.
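The wrapping described above can be sketched in plain C as follows. This is an illustration under assumptions, not the timekeeping code: the struct is a cut-down stand-in, every name ending in _sketch is hypothetical, and the demo_* callbacks exist only to make the example self-contained.

```c
#include <stddef.h>

/* Cut-down stand-in for a clocksource; both callbacks are optional. */
struct clocksource_sketch {
	int (*enable)(struct clocksource_sketch *cs);
	void (*disable)(struct clocksource_sketch *cs);
	int powered;	/* stand-in for the hardware power state */
};

/* Wrapper: a clocksource without enable() is treated as always on. */
int clocksource_enable_sketch(struct clocksource_sketch *cs)
{
	return cs->enable ? cs->enable(cs) : 0;
}

/* Wrapper: calling disable() is skipped when the driver has none. */
void clocksource_disable_sketch(struct clocksource_sketch *cs)
{
	if (cs->disable)
		cs->disable(cs);
}

/* Enable the new source before use, disable the old one after the switch. */
int switch_clocksource_sketch(struct clocksource_sketch **cur,
			      struct clocksource_sketch *next)
{
	struct clocksource_sketch *old = *cur;
	int ret = clocksource_enable_sketch(next);

	if (ret)
		return ret;	/* keep using the old source */
	*cur = next;
	if (old)
		clocksource_disable_sketch(old);
	return 0;
}

/* Demo callbacks that only flip the flag. */
int demo_enable(struct clocksource_sketch *cs) { cs->powered = 1; return 0; }
void demo_disable(struct clocksource_sketch *cs) { cs->powered = 0; }
```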
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Magnus Damm [Tue, 21 Apr 2009 19:24:00 +0000 (12:24 -0700)]
clocksource: pass clocksource to read() callback
Pass clocksource pointer to the read() callback for clocksources. This
allows us to share the callback between multiple instances.
[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
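Why passing the pointer helps can be shown with a small sketch. The struct and function names are hypothetical, and the counter field merely stands in for a per-device hardware register.

```c
#include <stdint.h>

struct timer_sketch {
	/* The callback receives its own instance. */
	uint64_t (*read)(struct timer_sketch *cs);
	uint64_t counter;	/* stand-in for a per-device register */
};

/* One callback serves every instance because it finds its state via cs. */
uint64_t shared_read(struct timer_sketch *cs)
{
	return cs->counter;
}
```

Without the argument, each instance would need its own read function hard-wired to its own state.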
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Denis V. Lunev [Tue, 21 Apr 2009 19:23:59 +0000 (12:23 -0700)]
pxafb: lcsr1 is unused without CONFIG_FB_PXA_OVERLAY
Fixes the warning:
drivers/video/pxafb.c: In function 'pxafb_handle_irq':
drivers/video/pxafb.c:1442: warning: unused variable 'lcsr1'
[akpm@linux-foundation.org: save an ifdef]
Signed-off-by: Denis V. Lunev <den@openvz.org>
Cc: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlada Peric [Tue, 21 Apr 2009 19:23:59 +0000 (12:23 -0700)]
asiliantfb: add missing return statement
Commit 032220ba (asiliantfb: fix cmap memory leaks) changed the function
init_asiliant from void to int, resulting in the following compile warning:
drivers/video/asiliantfb.c: In function `init_asiliant':
drivers/video/asiliantfb.c:536: warning: control reaches end of non-void function
Fix the warning by returning 0.
Signed-off-by: Vlada Peric <vlada.peric@gmail.com>
Cc: Andres Salomon <dilinger@debian.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Tue, 21 Apr 2009 19:47:22 +0000 (21:47 +0200)]
go7007: Convert to the new i2c device binding model
Move the go7007 driver away from the legacy i2c binding model, which
is going away really soon now.
The I2C addresses of the audio and video chips in s2250-board didn't
look quite right, apparently they were left-aligned values when Linux
wants right-aligned values, so I fixed them too.
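The address conversion mentioned above amounts to dropping the read/write bit. A minimal sketch, with a hypothetical helper name and an illustrative value that is not taken from the driver:

```c
#include <stdint.h>

/* Convert a left-aligned I2C address (7 address bits followed by the R/W
 * bit in bit 0) to the right-aligned 7-bit form. */
uint8_t i2c_addr_7bit(uint8_t left_aligned)
{
	return left_aligned >> 1;
}
```

For example, a left-aligned 0x86 corresponds to the 7-bit address 0x43.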
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Chris Mason [Tue, 21 Apr 2009 15:53:38 +0000 (11:53 -0400)]
Btrfs: fix btrfs fallocate oops and deadlock
Btrfs fallocate was incorrectly starting a transaction with a lock held
on the extent_io tree for the file, which could deadlock. Strictly
speaking it was using join_transaction which would be safe, but it is better
to move the transaction outside of the lock.
When preallocated extents are overwritten, btrfs_mark_buffer_dirty was
being called on an unlocked buffer. This was triggering an assertion and
oops because the lock is supposed to be held.
The bug was calling btrfs_mark_buffer_dirty on a leaf after btrfs_del_item had
been run. btrfs_del_item takes care of dirtying things, so the solution is
to skip the btrfs_mark_buffer_dirty call in this case.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Linus Torvalds [Tue, 21 Apr 2009 15:27:30 +0000 (08:27 -0700)]
Merge git://git./linux/kernel/git/steve/gfs2-2.6-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: Fix page_mkwrite() return code
GFS2: Clear dirty bit at end of inode glock sync
Linus Torvalds [Tue, 21 Apr 2009 15:16:14 +0000 (08:16 -0700)]
Merge branch 'sh/for-2.6.30' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Fix mmap2 for handling differing PAGE_SIZEs.
sh: sh7723: Don't default enable the RTC clock.
sh: sh7722: Don't default enable the RTC clock.
rtc: rtc-sh: clock framework support.
Linus Torvalds [Tue, 21 Apr 2009 14:56:17 +0000 (07:56 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
reiserfs: fix j_last_flush_trans_id type
fs: Mark get_filesystem_list() as __init function.
kill vfs_stat_fd / vfs_lstat_fd
Separate out common fstatat code into vfs_fstatat
ecryptfs: use memdup_user()
ncpfs: use memdup_user()
xfs: use memdup_user()
sysfs: use memdup_user()
btrfs: use memdup_user()
xattr: use memdup_user()
autofs4: use memchr() in invalid_string()
Documentation/filesystems: remove out of date reference to BKL being held
Fix i_mutex vs. readdir handling in nfsd
fs/compat_ioctl: fix build when !BLOCK
Fix autofs_expire()
No need for crossing to mountpoint in audit_tag_tree()
Safer nfsd_cross_mnt()
Touch all affected namespaces on propagation of mount
Fix AUTOFS_DEV_IOCTL_REQUESTER_CMD
Thomas Bogendoerfer [Tue, 21 Apr 2009 11:44:13 +0000 (13:44 +0200)]
Fix SYSCALL_ALIAS for older MIPS assembler
Older MIPS assemblers don't support .set for defining aliases.
Using = works for old and new assemblers.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trond Myklebust [Mon, 20 Apr 2009 18:58:35 +0000 (14:58 -0400)]
NFS: Fix the XDR iovec calculation in nfs3_xdr_setaclargs
Commit ae46141ff08f1965b17c531b571953c39ce8b9e2 (NFSv3: Fix posix ACL code)
introduces a bug in the calculation of the XDR header iovec. In the case
where we are inlining the acls, we need to adjust the length of the iovec
req->rq_svec, in addition to adjusting the total buffer length.
Tested-by: Leonardo Chiquitto <leonardo.lists@gmail.com>
Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paul Mundt [Tue, 21 Apr 2009 08:12:16 +0000 (17:12 +0900)]
Merge branch 'sh/stable-updates' into sh/for-2.6.30
Al Viro [Tue, 21 Apr 2009 03:29:41 +0000 (23:29 -0400)]
reiserfs: fix j_last_flush_trans_id type
Conversion in commit 600ed41675d8c384519d8f0b3c76afed39ef2f4b had missed
that one, but converted format from %lu to %u. As a result,
/proc/..../journal got buggered on 64bit boxen.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Tetsuo Handa [Thu, 9 Apr 2009 11:17:52 +0000 (20:17 +0900)]
fs: Mark get_filesystem_list() as __init function.
"int get_filesystem_list(char * buf)" is called by only
"static void __init get_fs_names(char *page)".
We can mark get_filesystem_list() as "__init".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Christoph Hellwig [Wed, 8 Apr 2009 20:34:03 +0000 (16:34 -0400)]
kill vfs_stat_fd / vfs_lstat_fd
There's really no reason to keep vfs_stat_fd and vfs_lstat_fd with
Oleg's vfs_fstatat. Use vfs_fstatat for the few cases having the
directory fd, and switch all others to vfs_stat / vfs_lstat.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Oleg Drokin [Wed, 8 Apr 2009 16:05:42 +0000 (20:05 +0400)]
Separate out common fstatat code into vfs_fstatat
This is a version incorporating Christoph's suggestion.
Separate out common *fstatat functionality into a single function
instead of duplicating it all over the code.
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan [Wed, 8 Apr 2009 07:09:29 +0000 (15:09 +0800)]
ecryptfs: use memdup_user()
Remove open-coded memdup_user().
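For readers unfamiliar with the pattern, here is a userspace analogue of what "open-coded" means in this series: allocate, copy, and handle failure by hand at every call site. The name memdup_sketch() is hypothetical. The kernel helper differs in that it copies from user space and, as far as I know, reports failure through an error pointer rather than NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Userspace analogue: allocate len bytes and copy src into them.
 * Returns NULL on allocation failure; the caller frees the result. */
void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len ? len : 1);

	if (p == NULL)
		return NULL;
	memcpy(p, src, len);
	return p;
}
```

Folding the three steps into one call removes the chance of forgetting the free on the copy-failure path.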
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan [Wed, 8 Apr 2009 07:08:53 +0000 (15:08 +0800)]
ncpfs: use memdup_user()
Remove open-coded memdup_user()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan [Wed, 8 Apr 2009 07:08:04 +0000 (15:08 +0800)]
xfs: use memdup_user()
Remove open-coded memdup_user()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan [Wed, 8 Apr 2009 07:07:30 +0000 (15:07 +0800)]
sysfs: use memdup_user()
Remove open-coded memdup_user().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan [Wed, 8 Apr 2009 07:06:54 +0000 (15:06 +0800)]
btrfs: use memdup_user()
Remove open-coded memdup_user().
Note this changes some GFP_NOFS to GFP_KERNEL: since copy_from_user() may
cause a page fault, it's pointless to pass GFP_NOFS to kmalloc().
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Li Zefan [Wed, 8 Apr 2009 07:06:12 +0000 (15:06 +0800)]
xattr: use memdup_user()
Remove open-coded memdup_user()
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 7 Apr 2009 15:12:46 +0000 (11:12 -0400)]
autofs4: use memchr() in invalid_string()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Adrian McMenamin [Tue, 21 Apr 2009 01:38:28 +0000 (18:38 -0700)]
Documentation/filesystems: remove out of date reference to BKL being held
Documentation/filesystems/vfs.txt incorrectly states that the kernel is
locked during the call to statfs (Documentation/filesystems/Locking
correctly says it is not). This patch removes the offending sentence.
remove reference to BKL being held in statfs
Signed-off-by: Adrian McMenamin <adrian@mcmen.demon.co.uk>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Woodhouse [Mon, 20 Apr 2009 22:18:37 +0000 (23:18 +0100)]
Fix i_mutex vs. readdir handling in nfsd
Commit 14f7dd63 ("Copy XFS readdir hack into nfsd code") introduced a
bug to generic code which had been extant for a long time in the XFS
version -- it started to call through into lookup_one_len() and hence
into the file systems' ->lookup() methods without i_mutex held on the
directory.
This patch fixes it by locking the directory's i_mutex again before
calling the filldir functions. The original deadlocks which commit
14f7dd63 was designed to avoid are still avoided, because they were due
to fs-internal locking, not i_mutex.
While we're at it, fix the return type of nfsd_buffered_readdir() which
should be a __be32 not an int -- it's an NFS errno, not a Linux errno.
And return nfserrno(-ENOMEM) when allocation fails, not just -ENOMEM.
Sparse would have caught that, if it wasn't so busy bitching about
__cold__.
Commit 05f4f678 ("nfsd4: don't do lookup within readdir in recovery
code") introduced a similar problem with calling lookup_one_len()
without i_mutex, which this patch also addresses. To fix that, it was
necessary to fix the called functions so that they expect i_mutex to be
held; that part was done by J. Bruce Fields.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Umm-I-can-live-with-that-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: J. R. Okajima <hooanon05@yahoo.co.jp>
Tested-by: J. Bruce Fields <bfields@citi.umich.edu>
LKML-Reference: <8036.1237474444@jrobl>
Cc: stable@kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Alexander Beregalov [Mon, 20 Apr 2009 08:23:02 +0000 (12:23 +0400)]
fs/compat_ioctl: fix build when !BLOCK
In file included from fs/compat_ioctl.c:61:
include/linux/loop.h:59: error: field 'lo_bio_list' has incomplete type
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 18 Apr 2009 15:19:26 +0000 (11:19 -0400)]
Fix autofs_expire()
mnt should remain the same for all iterations through the list;
as it is, if we have a busy mount, mnt follows into it and isn't
restored for the next iteration.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 18 Apr 2009 07:25:41 +0000 (03:25 -0400)]
No need for crossing to mountpoint in audit_tag_tree()
is_under() will DTRT anyway. And yes, is_subdir() behaviour
is intentional.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Sat, 18 Apr 2009 06:32:31 +0000 (02:32 -0400)]
Safer nfsd_cross_mnt()
AFAICS, we have a subtle bug there: if we have crossed mountpoint
*and* it got mount --move'd away, we'll be holding only one
reference to fs containing dentry - exp->ex_path.mnt. IOW, we
ought to dput() before exp_put().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 7 Apr 2009 16:15:39 +0000 (12:15 -0400)]
Touch all affected namespaces on propagation of mount
We shouldn't just touch the namespace of current process
Caught-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 7 Apr 2009 13:03:30 +0000 (09:03 -0400)]
Fix AUTOFS_DEV_IOCTL_REQUESTER_CMD
Missing conversion from kernel to userland dev_t; this sucker
breaks as soon as we get sufficiently many autofs mounts for
new_encode_dev(s_dev) != s_dev.
Note: this is the minimal fix.
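The mismatch can be reproduced with a small sketch. It assumes the in-kernel layout of 20 minor bits below the major number and the userland encoding as I understand new_encode_dev() to work; the SK_* macros and encode_dev_sketch() are hypothetical names, and major 0 is used on the assumption that autofs mounts sit on anonymous devices.

```c
#include <stdint.h>

#define SK_MINORBITS 20
#define SK_MKDEV(ma, mi) (((uint32_t)(ma) << SK_MINORBITS) | (uint32_t)(mi))
#define SK_MAJOR(dev) ((uint32_t)(dev) >> SK_MINORBITS)
#define SK_MINOR(dev) ((uint32_t)(dev) & ((1U << SK_MINORBITS) - 1))

/* Userland encoding: low 8 minor bits, then 12 major bits, then the
 * remaining high minor bits. */
uint32_t encode_dev_sketch(uint32_t dev)
{
	uint32_t major = SK_MAJOR(dev);
	uint32_t minor = SK_MINOR(dev);

	return (minor & 0xff) | (major << 8) | ((minor & ~0xffU) << 12);
}
```

With major 0 the two forms agree for minors up to 255 and diverge from minor 256 onwards, which matches the "sufficiently many mounts" condition in the message.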
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Toshinobu Sugioka [Mon, 20 Apr 2009 22:34:53 +0000 (07:34 +0900)]
sh: Fix mmap2 for handling differing PAGE_SIZEs.
mmap2 uses a fixed page shift of 12, regardless of the PAGE_SIZE setting.
Fix up the mmap2 code to add some sanity checks on the mapping, and to
update pgoff accordingly.
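A sketch of the conversion and sanity check, assuming a 64 KiB page size purely for illustration. The function name and SK_PAGE_SHIFT are hypothetical; the real code uses the architecture's PAGE_SHIFT.

```c
#include <errno.h>

#define SK_PAGE_SHIFT 16	/* assumed: 64 KiB pages, for illustration */

/* pgoff arrives in 4 KiB units (shift of 12); convert it to units of the
 * configured page size. Returns 0 on success, or -EINVAL if the offset
 * is not aligned to a full page. */
int mmap2_pgoff_sketch(unsigned long pgoff_4k, unsigned long *pgoff)
{
	if (pgoff_4k & ((1UL << (SK_PAGE_SHIFT - 12)) - 1))
		return -EINVAL;
	*pgoff = pgoff_4k >> (SK_PAGE_SHIFT - 12);
	return 0;
}
```

With a page shift of 12 the mask is zero and the shift is a no-op, so the common 4 KiB configuration is unaffected.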
Error handling bits based on 4280e3126f641898f0ed1a931645373d3489e2a6
("frv: fix mmap2 error handling").
Signed-off-by: Toshinobu Sugioka <sugioka@itonet.co.jp>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Chris Mason [Mon, 20 Apr 2009 19:50:10 +0000 (15:50 -0400)]
Btrfs: use the right node in reada_for_balance
reada_for_balance was using the wrong index into the path node array,
so it wasn't reading the right blocks. We never directly used the
results of the read done by this function because the btree search is
started over at the end.
This fixes reada_for_balance to reada in the correct node and to
avoid searching past the last slot in the node. It also makes sure to
hold the parent lock while we are finding the nodes to read.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:09 +0000 (15:50 -0400)]
Btrfs: fix oops on page->mapping->host during writepage
The extent_io writepage call updates the writepage index in the inode
as it makes progress. But, it was doing the update after unlocking the page,
which isn't legal because page->mapping can't be trusted once the page
is unlocked.
This led to an oops, especially common with compression turned on. The
fix here is to update the writeback index before unlocking the page.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:09 +0000 (15:50 -0400)]
Btrfs: add a priority queue to the async thread helpers
Btrfs is using WRITE_SYNC_PLUG to send down synchronous IOs with a
higher priority. But, the checksumming helper threads prevent it
from being fully effective.
There are two problems. First, a big queue of pending checksumming
will delay the synchronous IO behind other lower priority writes. Second,
the checksumming uses an ordered async work queue. The ordering makes sure
that IOs are sent to the block layer in the same order they are sent
to the checksumming threads. Usually this gives us less seeky IO.
But, when we start mixing IO priorities, the lower priority IO can delay
the higher priority IO.
This patch solves both problems by adding a high priority list to the async
helper threads, and a new btrfs_set_work_high_prio(), which is used
to put a new async work item onto the higher priority list.
The ordering is still done on high priority IO, but all of the high
priority bios are ordered separately from the low priority bios. This
ordering is purely an IO optimization, it is not involved in data
or metadata integrity.
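The two-list scheme can be sketched as below. This is a simplified single-threaded model, not the Btrfs async-thread code: all names are hypothetical, the queues are fixed-size arrays, and locking is left out.

```c
#include <stddef.h>

#define SK_QUEUE_MAX 16

struct work_sketch {
	int id;
	int high_prio;	/* set through set_high_prio_sketch() */
};

struct fifo_sketch {
	struct work_sketch *items[SK_QUEUE_MAX];
	size_t head, tail;
};

struct worker_sketch {
	struct fifo_sketch prio;	/* high priority list */
	struct fifo_sketch normal;	/* everything else */
};

void set_high_prio_sketch(struct work_sketch *w)
{
	w->high_prio = 1;
}

/* Queue onto the list matching the item's priority; order is kept per list. */
int queue_work_sketch(struct worker_sketch *wk, struct work_sketch *w)
{
	struct fifo_sketch *q = w->high_prio ? &wk->prio : &wk->normal;

	if (q->tail == SK_QUEUE_MAX)
		return -1;	/* sketch only: fixed capacity, no wraparound */
	q->items[q->tail++] = w;
	return 0;
}

/* Always drain the high priority list before touching the normal one. */
struct work_sketch *next_work_sketch(struct worker_sketch *wk)
{
	if (wk->prio.head < wk->prio.tail)
		return wk->prio.items[wk->prio.head++];
	if (wk->normal.head < wk->normal.tail)
		return wk->normal.items[wk->normal.head++];
	return NULL;
}
```

A high priority item queued last is still handed out first, while items within each list keep their submission order.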
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Mon, 20 Apr 2009 19:50:09 +0000 (15:50 -0400)]
Btrfs: use WRITE_SYNC for synchronous writes
Part of reducing fsync/O_SYNC/O_DIRECT latencies is using WRITE_SYNC for
writes we plan on waiting on in the near future. This patch
mirrors recent changes in other filesystems and the generic code to
use WRITE_SYNC when WB_SYNC_ALL is passed and to use WRITE_SYNC for
other latency critical writes.
Btrfs uses async worker threads for checksumming before the write is done,
and then again to actually submit the bios. The bio submission code just
runs a per-device list of bios that need to be sent down the pipe.
This list is split into low priority and high priority lists so the
WRITE_SYNC IO happens first.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Linus Torvalds [Mon, 20 Apr 2009 19:34:36 +0000 (12:34 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] fix allmodconfig compilation breakage.
[IA64] smp_flush_tlb_mm() should only send IPI's to cpus in cpu_vm_mask
[IA64] export smp_send_reschedule
Isaku Yamahata [Sat, 18 Apr 2009 03:15:23 +0000 (12:15 +0900)]
[IA64] fix allmodconfig compilation breakage.
This patch fixes the following compilation error caused by recursive
inclusion of kernel.h which defines BUILD_BUG_ON().
In this case, the case it catches will be caught by the case
CONFIG_PARAVIRT=n, so removing it would not hurt compile time check
very much. So fix the breakage by removing it.
CC arch/ia64/kernel/asm-offsets.s
In file included from include/linux/bitops.h:17,
from include/linux/kernel.h:15,
from include/linux/sched.h:52,
from arch/ia64/kernel/asm-offsets.c:9:
arch/ia64/include/asm/bitops.h: In function 'set_bit':
arch/ia64/include/asm/bitops.h:47: error: implicit declaration of function 'BUILD_BUG_ON'
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Linus Torvalds [Mon, 20 Apr 2009 15:43:06 +0000 (08:43 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/rafael/suspend-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM/Suspend: Introduce two new platform callbacks to avoid breakage
Linus Torvalds [Mon, 20 Apr 2009 15:42:48 +0000 (08:42 -0700)]
Merge branch 'drm-linus' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
agp: zero pages before sending to userspace
drm: check for minor master before allowing drop master.
drm: set/clear is_master when master changed
drm: clean dirty memory after device release
drm: count reaches -1
Linus Torvalds [Mon, 20 Apr 2009 15:37:37 +0000 (08:37 -0700)]
Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
md: support bitmaps on RAID10 arrays larger then 2 terabytes
md: update sync_completed and reshape_position even more often.
md: improve usefulness and accuracy of sysfs file md/sync_completed.
md: allow setting newly added device to 'in_sync' via sysfs.
md: tiny md.h cleanups
David Howells [Mon, 20 Apr 2009 14:46:45 +0000 (15:46 +0100)]
FS-Cache: Add MAINTAINERS record for FS-Cache and CacheFiles
Add MAINTAINERS record for FS-Cache and CacheFiles.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Mon, 20 Apr 2009 11:46:24 +0000 (12:46 +0100)]
FRV: Don't attempt to #include <linux/blk.h> as it doesn't exist
Stop the FRV arch from attempting to #include <linux/blk.h> as it doesn't
exist.
Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kay Sievers [Sat, 18 Apr 2009 22:05:45 +0000 (15:05 -0700)]
driver: dont update dev_name via device_add path
On one system, some /proc/iomem entries were missing the name for PCI
devices. It turns out that the dev->dev.kobj name is changed after device_add.
In the PCI code, acpi_pci_root_driver.ops.add (aka acpi_pci_root_add)
==> pci_acpi_scan_root is used to scan the PCI bus/devices, and at the same
time we read the resources for pci_dev in pci_read_bases, where we have
res->name = pci_name(pci_dev); pci_name calls dev_name.
Later, acpi_pci_root_driver.ops.start (aka acpi_pci_root_start) ==>
pci_bus_add_device adds all pci_dev to the kobj tree. pci_bus_add_device
calls device_add.
In device_add,
/* first, register with generic layer. */
error = kobject_add(&dev->kobj, dev->kobj.parent, "%s", dev_name(dev));
if (error)
goto Error;
gives that kobj a new name and frees the old one, which res->name still
points to.
[Impact: fix corrupted names in /proc/iomem ]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven Whitehouse [Mon, 20 Apr 2009 08:45:54 +0000 (09:45 +0100)]
GFS2: Fix page_mkwrite() return code
This allows for the possibility of returning VM_FAULT_OOM as
well as VM_FAULT_SIGBUS. This ensures that the correct action
is taken.
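A sketch of the mapping this implies, assuming out-of-memory is the one error that should not turn into SIGBUS. The enum is a hypothetical stand-in for the kernel's VM_FAULT_* codes and its values are illustrative only.

```c
#include <errno.h>

/* Hypothetical stand-ins for VM_FAULT_* codes; values are illustrative. */
enum fault_sketch {
	SK_FAULT_NONE = 0,
	SK_FAULT_OOM = 1,
	SK_FAULT_SIGBUS = 2
};

/* Translate a negative errno from the write-fault path into a fault code. */
enum fault_sketch errno_to_fault_sketch(int err)
{
	if (err == 0)
		return SK_FAULT_NONE;
	if (err == -ENOMEM)
		return SK_FAULT_OOM;	/* let the VM handle memory pressure */
	return SK_FAULT_SIGBUS;		/* every other failure */
}
```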
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Steven Whitehouse [Mon, 20 Apr 2009 07:58:45 +0000 (08:58 +0100)]
GFS2: Clear dirty bit at end of inode glock sync
The dirty bit can get set during the inode glock sync. It's too
complicated to change that at the moment, so this is the quick
fix - to clear the bit again at the end of the function.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
NeilBrown [Mon, 20 Apr 2009 01:50:24 +0000 (11:50 +1000)]
md: support bitmaps on RAID10 arrays larger then 2 terabytes
.. and other arrays with components larger than 2 terabytes.
We use a "long" rather than a "sector_t" in part of the bitmap
size calculations, which is sad.
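The overflow is easy to reproduce in a sketch, assuming 512-byte sectors and a platform where long is 32 bits wide. The 32-bit variable is modelled with uint32_t so that the truncation is well defined in C; the function names are hypothetical.

```c
#include <stdint.h>

#define SK_SECTOR_SIZE 512ULL

/* Sector count held in a 64-bit type, as sector_t can be. */
uint64_t sectors_for_bytes(uint64_t bytes)
{
	return bytes / SK_SECTOR_SIZE;
}

/* The same count squeezed through a 32-bit variable: high bits are lost. */
uint32_t sectors_truncated_32(uint64_t bytes)
{
	return (uint32_t)sectors_for_bytes(bytes);
}
```

A 3 TiB component has 6442450944 sectors, which the 32-bit path silently turns into 2147483648, so any bitmap size derived from it is wrong.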
Reported-by: "Mario 'BitKoenig' Holbe" <Mario.Holbe@TU-Ilmenau.DE>
Signed-off-by: NeilBrown <neilb@suse.de>
Shaohua Li [Mon, 20 Apr 2009 00:08:35 +0000 (10:08 +1000)]
agp: zero pages before sending to userspace
AGP pages might eventually be mapped into userspace, so the pages should be
set to zero before userspace can use them. Otherwise there is potential
information leakage.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Sun, 19 Apr 2009 23:32:50 +0000 (09:32 +1000)]
drm: check for minor master before allowing drop
With a lot of fast user switching, we eventually get to the point
where we were checking for the wrong thing in this function.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jonas Bonn [Thu, 16 Apr 2009 07:00:02 +0000 (09:00 +0200)]
drm: set/clear is_master when master changed
The variable is_master is being used to track the drm_file that is currently
master, so its value needs to be updated accordingly when the master is
changed.
Signed-off-by: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Ma Ling [Thu, 16 Apr 2009 09:51:25 +0000 (17:51 +0800)]
drm: clean dirty memory after device release
The current code registers and unregisters the connector object with the
drm_sysfs_connector_add/remove functions.
However, in some cases we need to dynamically register and unregister the
device multiple times, so we have to go through a register -> unregister ->
register sequence.
Because the memory is dirty after device_unregister, we need to clean it
in order to re-register the device; otherwise the system
will crash. This patch cleans the device after device release.
Signed-off-by: Ma Ling <ling.ma@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Roel Kluin [Thu, 16 Apr 2009 20:57:46 +0000 (22:57 +0200)]
drm: count reaches -1
With a postfix decrement in the test, count will reach -1 rather than 0,
so subsequent tests fail.
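The off-by-one is visible in a minimal sketch. Both function names are hypothetical, the empty loop body stands in for "poll the hardware and break when ready", and the prefix form is shown only as one possible remedy, not necessarily the one this patch uses.

```c
/* Timeout loop with a postfix decrement; count is assumed to be >= 0.
 * Returns the value count holds once the loop has run out. */
int count_after_postfix_loop(int count)
{
	while (count--)
		;	/* stand-in for polling that never succeeds */
	return count;	/* -1: a later "count == 0" check misses the timeout */
}

/* Same loop with a prefix decrement, guarded against count <= 0. */
int count_after_prefix_loop(int count)
{
	while (count > 0 && --count)
		;
	return count;	/* 0: a later "count == 0" check sees the timeout */
}
```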
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rafael J. Wysocki [Sun, 19 Apr 2009 18:08:42 +0000 (20:08 +0200)]
PM/Suspend: Introduce two new platform callbacks to avoid breakage
Commit
900af0d973856d6feb6fc088c2d0d3fde57707d3 (PM: Change suspend
code ordering) changed the ordering of suspend code in such a way
that the platform .prepare() callback is now executed after the
device drivers' late suspend callbacks have run. Unfortunately, this
turns out to break ARM platforms that need to talk via I2C to power
control devices during the .prepare() callback.
For this reason introduce two new platform suspend callbacks,
.prepare_late() and .wake(), that will be called just prior to
disabling non-boot CPUs and right after bringing them back on line,
respectively, and use them instead of .prepare() and .finish() for
ACPI suspend. Make the PM core execute the .prepare() and .finish()
platform suspend callbacks where they were executed previously (that
is, right after calling the regular suspend methods provided by
device drivers and right before executing their regular resume
methods, respectively).
It is not necessary to make analogous changes to the hibernation
code and data structures at the moment, because they are only used
by ACPI platforms.
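The resulting call order, as described in the message above, can be laid out in a sketch. The struct is a simplified stand-in for the platform suspend operations, the step labels and all sk_/demo_ names are hypothetical, and the sequence is a reading of the commit text rather than a copy of the PM core.

```c
#include <stddef.h>

/* Simplified stand-in for the platform suspend callbacks; all optional. */
struct suspend_ops_sketch {
	void (*prepare)(void);
	void (*prepare_late)(void);
	void (*wake)(void);
	void (*finish)(void);
};

#define SK_TRACE_MAX 16
const char *sk_trace[SK_TRACE_MAX];
size_t sk_trace_len;

/* Record one step so the order can be inspected afterwards. */
void sk_step(const char *name)
{
	if (sk_trace_len < SK_TRACE_MAX)
		sk_trace[sk_trace_len++] = name;
}

static void sk_call(void (*cb)(void))
{
	if (cb)
		cb();
}

void suspend_sequence_sketch(const struct suspend_ops_sketch *ops)
{
	sk_step("devices: regular suspend");
	sk_call(ops->prepare);		/* devices such as I2C still usable */
	sk_step("devices: late suspend");
	sk_call(ops->prepare_late);	/* just before non-boot CPUs go down */
	sk_step("disable non-boot CPUs");
	sk_step("enter sleep state");
	sk_step("enable non-boot CPUs");
	sk_call(ops->wake);		/* right after CPUs are back on line */
	sk_step("devices: early resume");
	sk_call(ops->finish);		/* right before regular resume */
	sk_step("devices: regular resume");
}

/* Demo callbacks that only record their own name. */
void demo_prepare(void) { sk_step(".prepare()"); }
void demo_prepare_late(void) { sk_step(".prepare_late()"); }
void demo_wake(void) { sk_step(".wake()"); }
void demo_finish(void) { sk_step(".finish()"); }
```

A platform that needs I2C during preparation would fill in .prepare(), while ACPI would use .prepare_late() and .wake().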
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Sun, 19 Apr 2009 17:58:20 +0000 (10:58 -0700)]
Merge git://git./linux/kernel/git/rusty/linux-2.6-lguest-and-virtio
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest-and-virtio:
lguest: document 32-bit and PAE requirements
lguest: tell git to ignore Documentation/lguest/lguest
virtio: fix suspend when using virtio_balloon
lguest: fix guest crash on non-linear addresses in gdt pvops
lguest: fix crash on vmlinux images
Linus Torvalds [Sun, 19 Apr 2009 17:57:38 +0000 (10:57 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Set function_id only on FG nodes
ALSA: MAINTAINERS - Update SOUND
ALSA: emu10k1 - off by 1 in snd_emu10k1_wait()
ASoC: OMAP: Fix FS polarity in OSK5912 machine driver
ASoC: OMAP: Fix DSP_B format in OMAP McBSP DAI driver
ASoC: Fix include build error in s3c2412-i2s.c
ASoC: Fix s3c-i2s-v2.c snd_soc_dai changes
ASoC: s3c-i2s-v2.c fix for s3c_i2sv2_iis_calc_rate
ASoC: Fix jive_wm8750.c build problems
ASoC: pxa-ssp: allow setting of dai format 0
ALSA: hda - Add upper-limit of mixer amp for AD1884A-laptop model, too
ALSA: hda - Fix headphone-detection on some machines with STAC/IDT codecs
ALSA: Intel8x0: Add hp_only quirk for SSID 0x1028016a (Dell Inspiron 8600)
ALSA: Intel8x0: Remove conflicting quirk for SSID 0x103c0934
ALSA: hda_intel.c - Consolidate bitfields
Linus Torvalds [Sun, 19 Apr 2009 17:54:06 +0000 (10:54 -0700)]
Merge git://git./linux/kernel/git/sam/kbuild-fixes
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
kbuild: introduce subdir-ccflags-y
kbuild: support include/generated