Linus Torvalds [Thu, 12 May 2011 14:53:34 +0000 (07:53 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic
sparc32: fix sparcstation 5 boot
sparc32: fix section mismatch warnings in apc, pmc and time_32
Linus Torvalds [Thu, 12 May 2011 14:53:06 +0000 (07:53 -0700)]
Merge branch 'fixes' of /home/rmk/linux-2.6-arm
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses
ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls
ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM
ARM: zImage: the page table memory must be considered before relocation
ARM: zImage: make sure not to relocate on top of the relocation code
ARM: zImage: Fix bad SP address after relocating kernel
ARM: zImage: make sure the stack is 64-bit aligned
ARM: RiscPC: acornfb: fix section mismatches
ARM: RiscPC: etherh: fix section mismatches
Catalin Marinas [Wed, 6 Apr 2011 15:18:47 +0000 (16:18 +0100)]
ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses
Since mandatory barriers may be used (explicitly or implicitly via readl
etc.) to ensure the ordering between Device and Normal memory accesses,
a DMB is not enough. This patch converts it to a DSB.
Cc: Colin Cross <ccross@android.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Arnd Bergmann [Tue, 3 May 2011 17:32:55 +0000 (18:32 +0100)]
ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls
GDB's interrupt.exp test cases currenly fail on ARM. The problem is how do_signal
handled restarting interrupted system calls:
The entry.S assembler code determines that we come from a system call; and that
information is passed as "syscall" parameter to do_signal. That routine then
calls get_signal_to_deliver [*] and if a signal is to be delivered, calls into
handle_signal. If a system call is to be restarted either after the signal
handler returns, or if no handler is to be called in the first place, the PC
is updated after the get_signal_to_deliver call, either in handle_signal (if
we have a handler) or at the end of do_signal (otherwise).
Now the problem is that during [*], the call to get_signal_to_deliver, a ptrace
intercept may happen. During this intercept, the debugger may change registers,
including the PC. This is done by GDB if it wants to execute an "inferior call",
i.e. the execution of some code in the debugged program triggered by GDB.
To this purpose, GDB will save all registers, allocate a stack frame, set up
PC and arguments as appropriate for the call, and point the link register to
a dummy breakpoint instruction. Once the process is restarted, it will execute
the call and then trap back to the debugger, at which point GDB will restore
all registers and continue original execution.
This generally works fine. However, now consider what happens when GDB attempts
to do exactly that while the process was interrupted during execution of a to-be-
restarted system call: do_signal is called with the syscall flag set; it calls
get_signal_to_deliver, at which point the debugger takes over and changes the PC
to point to a completely different place. Now get_signal_to_deliver returns
without a signal to deliver; but now do_signal decides it should be restarting
a system call, and decrements the PC by 2 or 4 -- so it now points to 2 or 4
bytes before the function GDB wants to call -- which leads to a subsequent crash.
To fix this problem, two things need to be supported:
- do_signal must be able to recognize that get_signal_to_deliver changed the PC
to a different location, and skip the restart-syscall sequence
- once the debugger has restored all registers at the end of the inferior call
sequence, do_signal must recognize that *now* it needs to restart the pending
system call, even though it was now entered from a breakpoint instead of an
actual svc instruction
This set of issues is solved on other platforms, usually by one of two
mechanisms:
- The status information "do_signal is handling a system call that may need
restarting" is itself carried in some register that can be accessed via
ptrace. This is e.g. on Intel the "orig_eax" register; on Sparc the kernel
defines a magic extra bit in the flags register for this purpose.
This allows GDB to manage that state: reset it when doing an inferior call,
and restore it after the call is finished.
- On s390, do_signal transparently handles this problem without requiring
GDB interaction, by performing system call restarting in the following
way: first, adjust the PC as necessary for restarting the call. Then,
call get_signal_to_deliver; and finally just continue execution at the
PC. This way, if GDB does not change the PC, everything is as before.
If GDB *does* change the PC, execution will simply continue there --
and once GDB restores the PC it saved at that point, it will automatically
point to the *restarted* system call. (There is the minor twist how to
handle system calls that do *not* need restarting -- do_signal will undo
the PC change in this case, after get_signal_to_deliver has returned, and
only if ptrace did not change the PC during that call.)
Because there does not appear to be any obvious register to carry the
syscall-restart information on ARM, we'd either have to introduce a new
artificial ptrace register just for that purpose, or else handle the issue
transparently like on s390. The patch below implements the second option;
using this patch makes the interrupt.exp test cases pass on ARM, with no
regression in the GDB test suite otherwise.
Cc: patches@linaro.org
Signed-off-by: Ulrich Weigand <ulrich.weigand@linaro.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Will Deacon [Thu, 28 Apr 2011 17:44:31 +0000 (18:44 +0100)]
ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM
The SPARSEMEM code allocates memmap entries only for sections which are
present (i.e. those which contain some valid memory). The membank checks
in free_unused_memmap do not take this into account and can incorrectly
attempt to free memory which is not allocated, resulting in a BUG() in
the bootmem code.
However, if memory is configured as follows:
|<----section---->|<----hole---->|<----section---->|
+--------+--------+--------------+--------+--------+
| bank 0 | unused | | bank 1 | unused |
+--------+--------+--------------+--------+--------+
where a bank only occupies part of a section, the memmap allocated for
the remainder of the section *can* be freed.
This patch modifies the checks in free_unused_memmap so that only valid
memmap entries are considered for removal.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tkhai Kirill [Tue, 10 May 2011 02:31:41 +0000 (02:31 +0000)]
sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic
When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits,
but it's not guaranteed one of them is zero. So we can get unaligned memory access
in label ccte. Example of parameters which lead to this:
%o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3
With the parameters I had a memory corruption, when the additional 5 bytes were rewritten.
This patch corrects the error.
One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft
stores word or less.
Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 12 May 2011 02:13:34 +0000 (19:13 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: do not use i_wrbuffer_ref as refcount for Fb cap
ceph: fix list_add in ceph_put_snap_realm
ceph: print debug message before put mds session
Linus Torvalds [Thu, 12 May 2011 02:13:16 +0000 (19:13 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/nouveau: fix build regression on alpha due to Xen changes.
drm/radeon/kms: fix cayman acceleration
drm/radeon: fix cayman struct accessors.
Linus Torvalds [Thu, 12 May 2011 02:00:15 +0000 (19:00 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sameo/mfd-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6:
mfd: Fix for the TWL4030 PM sleep/wakeup sequence
mfd: Fix asic3 build error
mfd: Fixed gpio polarity of omap-usb gpio USB-phy reset
Linus Torvalds [Thu, 12 May 2011 01:59:45 +0000 (18:59 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] fix alloc_pgste check in init_new_context
[S390] oprofile: fix min/max interval query checks
[S390] replace diag10() with diag10_range() function
[S390] disassembler: handle b280/spp instruction
[S390] kernel: Initialize register 14 when starting new CPU
[S390] dasd: prevent IO error during reserve/release loop
[S390] sclp/memory hotplug: fix initial usecount of increments
Linus Torvalds [Thu, 12 May 2011 01:58:16 +0000 (18:58 -0700)]
Revert "Bluetooth: fix shutdown on SCO sockets"
This reverts commit
f21ca5fff6e548833fa5ee8867239a8378623150.
Quoth Gustavo F. Padovan:
"Commit
f21ca5fff6e548833fa5ee8867239a8378623150 can cause a NULL
dereference if we call shutdown in a bluetooth SCO socket and doesn't
wait the shutdown completion to call close(). Please revert it. I
may have a fix for it soon, but we don't have time anymore, so revert
is the way to go. ;)"
Requested-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 12 May 2011 01:57:05 +0000 (18:57 -0700)]
Merge branch 'pm-fixes' of git://git./linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM
PM / Hibernate: Make snapshot_release() restore GFP mask
PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl
Mel Gorman [Wed, 11 May 2011 22:13:39 +0000 (15:13 -0700)]
mm: tracing: add missing GFP flags to tracing
include/linux/gfp.h and include/trace/events/gfpflags.h are out of sync.
When tracing is enabled, certain flags are not recognised and the text
output is less useful as a result. Add the missing flags.
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 11 May 2011 22:13:38 +0000 (15:13 -0700)]
tmpfs: fix spurious ENOSPC when racing with unswap
Testing the shmem_swaplist replacements for igrab() revealed another bug:
writes to /dev/loop0 on a tmpfs file which fills its filesystem were
sometimes failing with "Buffer I/O error"s.
These came from ENOSPC failures of shmem_getpage(), when racing with
swapoff: the same could happen when racing with another shmem_getpage(),
pulling the page in from swap in between our find_lock_page() and our
taking the info->lock (though not in the single-threaded loop case).
This is unacceptable, and surprising that I've not noticed it before:
it dates back many years, but (presumably) was made a lot easier to
reproduce in 2.6.36, which sited a page preallocation in the race window.
Fix it by rechecking the page cache before settling on an ENOSPC error.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 11 May 2011 22:13:37 +0000 (15:13 -0700)]
tmpfs: fix race between umount and swapoff
The use of igrab() in swapoff's shmem_unuse_inode() is just as vulnerable
to umount as that in shmem_writepage().
Fix this instance by extending the protection of shmem_swaplist_mutex
right across shmem_unuse_inode(): while it's on the list, the inode cannot
be evicted (and the filesystem cannot be unmounted) without
shmem_evict_inode() taking that mutex to remove it from the list.
But since shmem_writepage() might take that mutex, we should avoid making
memory allocations or memcg charges while holding it: prepare them at the
outer level in shmem_unuse(). When mem_cgroup_cache_charge() was
originally placed, we didn't know until that point that the page from swap
was actually a shmem page; but nowadays it's noted in the swap_map, so
we're safe to charge upfront. For the radix_tree, do as is done in
shmem_getpage(): preload upfront, but don't pin to the cpu; so we make a
habit of refreshing the node pool, but might dip into GFP_NOWAIT reserves
on occasion if subsequently preempted.
With the allocation and charge moved out from shmem_unuse_inode(),
we can also hold index map and info->lock over from finding the entry.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 11 May 2011 22:13:36 +0000 (15:13 -0700)]
tmpfs: fix race between umount and writepage
Konstanin Khlebnikov reports that a dangerous race between umount and
shmem_writepage can be reproduced by this script:
for i in {1..300} ; do
mkdir $i
while true ; do
mount -t tmpfs none $i
dd if=/dev/zero of=$i/test bs=1M count=$(($RANDOM % 100))
umount $i
done &
done
on a 6xCPU node with 8Gb RAM: kernel very unstable after this accident. =)
Kernel log:
VFS: Busy inodes after unmount of tmpfs.
Self-destruct in 5 seconds. Have a nice day...
WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98()
list_del corruption. prev->next should be
ffff880222fdaac8, but was (null)
Pid: 11222, comm: mount.tmpfs Not tainted 2.6.39-rc2+ #4
Call Trace:
warn_slowpath_common+0x80/0x98
warn_slowpath_fmt+0x41/0x43
__list_del_entry+0x8d/0x98
evict+0x50/0x113
iput+0x138/0x141
...
BUG: unable to handle kernel paging request at
ffffffffffffffff
IP: shmem_free_blocks+0x18/0x4c
Pid: 10422, comm: dd Tainted: G W 2.6.39-rc2+ #4
Call Trace:
shmem_recalc_inode+0x61/0x66
shmem_writepage+0xba/0x1dc
pageout+0x13c/0x24c
shrink_page_list+0x28e/0x4be
shrink_inactive_list+0x21f/0x382
...
shmem_writepage() calls igrab() on the inode for the page which came from
page reclaim, to add it later into shmem_swaplist for swapoff operation.
This igrab() can race with super-block deactivating process:
shrink_inactive_list() deactivate_super()
pageout() tmpfs_fs_type->kill_sb()
shmem_writepage() kill_litter_super()
generic_shutdown_super()
evict_inodes()
igrab()
atomic_read(&inode->i_count)
skip-inode
iput()
if (!list_empty(&sb->s_inodes))
printk("VFS: Busy inodes after...
This igrap-iput pair was added in commit
1b1b32f2c6f6 "tmpfs: fix
shmem_swaplist races" based on incorrect assumptions: igrab() protects the
inode from concurrent eviction by deletion, but it does nothing to protect
it from concurrent unmounting, which goes ahead despite the raised
i_count.
So this use of igrab() was wrong all along, but the race made much worse
in 2.6.37 when commit
63997e98a3be "split invalidate_inodes()" replaced
two attempts at invalidate_inodes() by a single evict_inodes().
Konstantin posted a plausible patch, raising sb->s_active too: I'm unsure
whether it was correct or not; but burnt once by igrab(), I am sure that
we don't want to rely more deeply upon externals here.
Fix it by adding the inode to shmem_swaplist earlier, while the page lock
on page in page cache still secures the inode against eviction, without
artifically raising i_count. It was originally added later because
shmem_unuse_inode() is liable to remove an inode from the list while it's
unswapped; but we can guard against that by taking spinlock before
dropping mutex.
Reported-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 11 May 2011 22:13:35 +0000 (15:13 -0700)]
memcg: allocate memory cgroup structures in local nodes
Commit
dde79e005a769 ("page_cgroup: reduce allocation overhead for
page_cgroup array for CONFIG_SPARSEMEM") added a regression that the
memory cgroup data structures all end up in node 0 because the first
attempt at allocating them would not pass in a node hint. Since the
initialization runs on CPU #0 it would all end up node 0. This is a
problem on large memory systems, where node 0 would lose a lot of
memory.
Change the alloc_pages_exact() to alloc_pages_exact_nid(). This will
still fall back to other nodes if not enough memory is available.
[ RED-PEN: right now it would fall back first before trying
vmalloc_node. Probably not the best strategy ... But I left it like
that for now. ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reported-by: Doug Nelson
Cc: David Rientjes <rientjes@google.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andi Kleen [Wed, 11 May 2011 22:13:34 +0000 (15:13 -0700)]
mm: add alloc_pages_exact_nid()
Add a alloc_pages_exact_nid() that allocates on a specific node.
The naming is quite broken, but fixing that would need a larger renaming
action.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Harry Wei [Wed, 11 May 2011 22:13:33 +0000 (15:13 -0700)]
MAINTAINERS: fix sorting
Take alphabetical orders for MAINTAINERS file.
Signed-off-by: Harry Wei <harryxiyou@gmail.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yinghai Lu [Wed, 11 May 2011 22:13:32 +0000 (15:13 -0700)]
mm: use alloc_bootmem_node_nopanic() on really needed path
Stefan found nobootmem does not work on his system that has only 8M of
RAM. This causes an early panic:
BIOS-provided physical RAM map:
BIOS-88:
0000000000000000 -
000000000009f000 (usable)
BIOS-88:
0000000000100000 -
0000000000840000 (usable)
bootconsole [earlyser0] enabled
Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
DMI not present or invalid.
last_pfn = 0x840 max_arch_pfn = 0x100000
init_memory_mapping:
0000000000000000-
0000000000840000
8MB LOWMEM available.
mapped low ram: 0 -
00840000
low ram: 0 -
00840000
Zone PFN ranges:
DMA 0x00000001 -> 0x00001000
Normal empty
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
0: 0x00000001 -> 0x0000009f
0: 0x00000100 -> 0x00000840
BUG: Int 6: CR2 (null)
EDI
c034663c ESI (null) EBP
c0329f38 ESP
c0329ef4
EBX
c0346380 EDX
00000006 ECX
ffffffff EAX
fffffff4
err (null) EIP
c0353191 CS
c0320060 flg
00010082
Stack: (null)
c030c533 000007cd (null)
c030c533 00000001 (null) (null)
00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
(null)
00000100 00000840 (null)
c0329f64 00000001 00001000 (null)
Pid: 0, comm: swapper Not tainted 2.6.36 #5
Call Trace:
[<
c02e3707>] ? 0xc02e3707
[<
c035e6e5>] 0xc035e6e5
[<
c0353191>] ? 0xc0353191
[<
c03534d6>] 0xc03534d6
[<
c034f1cd>] 0xc034f1cd
[<
c034a824>] 0xc034a824
[<
c03513cb>] ? 0xc03513cb
[<
c0349432>] 0xc0349432
[<
c0349066>] 0xc0349066
It turns out that we should ignore the low limit of 16M.
Use alloc_bootmem_node_nopanic() in this case.
[akpm@linux-foundation.org: less mess]
Signed-off-by: Yinghai LU <yinghai@kernel.org>
Reported-by: Stefan Hellermann <stefan@the2masters.de>
Tested-by: Stefan Hellermann <stefan@the2masters.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org> [2.6.34+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 11 May 2011 22:13:30 +0000 (15:13 -0700)]
mm: check PageUnevictable in lru_deactivate_fn()
The lru_deactivate_fn should not move page which in on unevictable lru
into inactive list. Otherwise, we can meet BUG when we use
isolate_lru_pages as __isolate_lru_page could return -EINVAL.
Reported-by: Ying Han <yinghan@google.com>
Tested-by: Ying Han <yinghan@google.com>
Signed-off-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Wed, 11 May 2011 22:13:28 +0000 (15:13 -0700)]
drivers/rtc/rtc-s3c.c: fixup wake support for rtc
The driver is not balancing set_irq and disable_irq_wake() calls, so
ensure that it keeps track of whether the wake is enabled.
The fixes the following error on S3C6410 devices:
WARNING: at kernel/irq/manage.c:382 set_irq_wake+0x84/0xec()
Unbalanced IRQ 92 wake disable
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Tue, 10 May 2011 19:10:13 +0000 (21:10 +0200)]
PM / Hibernate: Fix ioctl SNAPSHOT_S2RAM
The SNAPSHOT_S2RAM ioctl used for implementing the feature allowing
one to suspend to RAM after creating a hibernation image is currently
broken, because it doesn't clear the "ready" flag in the struct
snapshot_data object handled by it. As a result, the
SNAPSHOT_UNFREEZE doesn't work correctly after SNAPSHOT_S2RAM has
returned and the user space hibernate task cannot thaw the other
processes as appropriate. Make SNAPSHOT_S2RAM clear data->ready
to fix this problem.
Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Rafael J. Wysocki [Tue, 10 May 2011 19:10:01 +0000 (21:10 +0200)]
PM / Hibernate: Make snapshot_release() restore GFP mask
If the process using the hibernate user space interface closes
/dev/snapshot after creating a hibernation image without thawing
tasks, snapshot_release() should call pm_restore_gfp_mask() to
restore the GFP mask used before the creation of the image. Make
that happen.
Tested-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Rafael J. Wysocki [Tue, 10 May 2011 19:09:53 +0000 (21:09 +0200)]
PM: Fix warning in pm_restrict_gfp_mask() during SNAPSHOT_S2RAM ioctl
A warning is printed by pm_restrict_gfp_mask() while the
SNAPSHOT_S2RAM ioctl is being executed after creating a hibernation
image, because pm_restrict_gfp_mask() has been called once already
before the image creation and suspend_devices_and_enter() calls it
once again. This happens after commit
452aa6999e6703ffbddd7f6ea124d3
(mm/pm: force GFP_NOIO during suspend/hibernation and resume).
To avoid this issue, move pm_restrict_gfp_mask() and
pm_restore_gfp_mask() from suspend_devices_and_enter() to its caller
in kernel/power/suspend.c.
Reported-by: Alexandre Felipe Muller de Souza <alexandrefm@mandriva.com.br>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
Henry C Chang [Wed, 11 May 2011 10:29:54 +0000 (10:29 +0000)]
ceph: do not use i_wrbuffer_ref as refcount for Fb cap
We increments i_wrbuffer_ref when taking the Fb cap. This breaks
the dirty page accounting and causes looping in
__ceph_do_pending_vmtruncate, and ceph client hangs.
This bug can be reproduced occasionally by running blogbench.
Add a new field i_wb_ref to inode and dedicate it to Fb reference
counting.
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Henry C Chang [Wed, 11 May 2011 10:29:53 +0000 (10:29 +0000)]
ceph: fix list_add in ceph_put_snap_realm
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Henry C Chang [Wed, 11 May 2011 10:29:52 +0000 (10:29 +0000)]
ceph: print debug message before put mds session
The mds session, s, could be freed during ceph_put_mds_session.
Move dout before ceph_put_mds_session.
Signed-off-by: Henry C Chang <henry.cy.chang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
Lesly A M [Thu, 14 Apr 2011 12:27:49 +0000 (17:57 +0530)]
mfd: Fix for the TWL4030 PM sleep/wakeup sequence
Only configure sleep script when the flag is TWL4030_SLEEP_SCRIPT.
Adding the missing brackets for fixing the issue.
Signed-off-by: Lesly A M <leslyam@ti.com>
Cc: Nishanth Menon <nm@ti.com>
Cc: David Derrick <dderrick@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Axel Lin [Thu, 14 Apr 2011 14:43:47 +0000 (22:43 +0800)]
mfd: Fix asic3 build error
Fix below compile error:
CC drivers/mfd/asic3.o
drivers/mfd/asic3.c: In function 'asic3_irq_demux':
drivers/mfd/asic3.c:147: error: 'irq_data' undeclared (first use in this function)
drivers/mfd/asic3.c:147: error: (Each undeclared identifier is reported only once
drivers/mfd/asic3.c:147: error: for each function it appears in.)
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Juergen Kilb [Thu, 14 Apr 2011 07:31:43 +0000 (09:31 +0200)]
mfd: Fixed gpio polarity of omap-usb gpio USB-phy reset
With commit
19403165 a main part of ehci-omap.c moved to
drivers/mfd/omap-usb-host.c created by commit
17cdd29d.
Due to this reorganisation the polarity used to reset the
external USB phy changed and USB host doesn't recognize
any devices.
Signed-off-by: Juergen Kilb <J.Kilb@phytec.de>
Acked-by: Felipe Balbi <balbi@ti.com>
Tested-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Dave Airlie [Mon, 9 May 2011 02:24:04 +0000 (02:24 +0000)]
drm/radeon/nouveau: fix build regression on alpha due to Xen changes.
The Xen changes were using DMA_ERROR_CODE which isn't defined on a few
platforms, however we reverted the Xen patch that caused use to try and
use this code path earlier in 2.6.39 cycle, so for now lets just force
the code to never take this path and allow it to build again on alpha.
The proper long term answer is probably to store if the dma_addr has
been assigned to alongside the dma_addr in the higher level code,
though I think Thomas wanted to rewrite most of this anyways properly.
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Alex Deucher [Tue, 10 May 2011 02:14:52 +0000 (02:14 +0000)]
drm/radeon/kms: fix cayman acceleration
The TCC disable setup was incorrect. This
prevents the GPU from hanging when draw commands
are issued.
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 9 May 2011 04:54:33 +0000 (14:54 +1000)]
drm/radeon: fix cayman struct accessors.
We are accessing totally the wrong struct in this case, and putting
uninitialised values into the GPU, which it doesn't like unsurprisingly.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Wed, 11 May 2011 00:39:01 +0000 (17:39 -0700)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
slcan: fix ldisc->open retval
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
xfrm: Don't allow esn with disabled anti replay detection
xfrm: Assign the inner mode output function to the dst entry
net: dev_close() should check IFF_UP
vlan: fix GVRP at dismantle time
netfilter: revert
a2361c8735e07322023aedc36e4938b35af31eb0
netfilter: IPv6: fix DSCP mangle code
netfilter: IPv6: initialize TOS field in REJECT target module
IPVS: init and cleanup restructuring
IPVS: Change of socket usage to enable name space exit.
netfilter: ebtables: only call xt_compat_add_offset once per rule
netfilter: fix ebtables compat support
netfilter: ctnetlink: fix timestamp support for new conntracks
pch_gbe: support ML7223 IOH
PCH_GbE : Fixed the issue of checksum judgment
PCH_GbE : Fixed the issue of collision detection
NET: slip, fix ldisc->open retval
be2net: Fixed bugs related to PVID.
ehea: fix wrongly reported speed and port
...
David Rientjes [Wed, 11 May 2011 00:08:54 +0000 (17:08 -0700)]
slub: Revert "[PARISC] slub: fix panic with DISCONTIGMEM"
This reverts commit
4a5fa3590f09, which did not allow SLUB to be used
on architectures that use DISCONTIGMEM without compiling NUMA support
without CONFIG_BROKEN also set.
The slub panic that it was intended to prevent is addressed by
d9b41e0b54fd ("[PARISC] set memory ranges in N_NORMAL_MEMORY when
onlined") on parisc so there is no further slub issues with such a
configuration.
The reverts allows SLUB now to be used on such architectures since
there haven't been any reports of additional errors.
Cc: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Tue, 10 May 2011 22:04:35 +0000 (15:04 -0700)]
Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6
Oliver Hartkopp [Tue, 10 May 2011 20:12:30 +0000 (13:12 -0700)]
slcan: fix ldisc->open retval
TTY layer expects 0 if the ldisc->open operation succeeded.
Reported-by: Matvejchikov Ilya <matvejchikov@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Mon, 9 May 2011 07:43:20 +0000 (07:43 +0000)]
net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
Like other mobile broadband device ethernet interfaces, mark the LG
VL600 with the 'wwan' devtype so userspace knows it needs additional
configuration via the AT port before the interface can be used.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Mon, 9 May 2011 19:43:05 +0000 (19:43 +0000)]
xfrm: Don't allow esn with disabled anti replay detection
Unlike the standard case, disabled anti replay detection needs some
nontrivial extra treatment on ESN. RFC 4303 states:
Note: If a receiver chooses to not enable anti-replay for an SA, then
the receiver SHOULD NOT negotiate ESN in an SA management protocol.
Use of ESN creates a need for the receiver to manage the anti-replay
window (in order to determine the correct value for the high-order
bits of the ESN, which are employed in the ICV computation), which is
generally contrary to the notion of disabling anti-replay for an SA.
So return an error if an ESN state with disabled anti replay detection
is inserted for now and add the extra treatment later if we need it.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert [Mon, 9 May 2011 19:36:38 +0000 (19:36 +0000)]
xfrm: Assign the inner mode output function to the dst entry
As it is, we assign the outer modes output function to the dst entry
when we create the xfrm bundle. This leads to two problems on interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is, that we might insert ipv4 packets in
netfilter6 and vice versa on interfamily scenarios.
With this patch we assign the inner mode output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We switch then to outer mode with the output_finish functions.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 10 May 2011 19:26:06 +0000 (12:26 -0700)]
net: dev_close() should check IFF_UP
Commit
443457242beb (factorize sync-rcu call in
unregister_netdevice_many) mistakenly removed one test from dev_close()
Following actions trigger a BUG :
modprobe bonding
modprobe dummy
ifconfig bond0 up
ifenslave bond0 dummy0
rmmod dummy
dev_close() must not close a non IFF_UP device.
With help from Frank Blaschka and Einar EL Lueck
Reported-by: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Reported-by: Einar EL Lueck <ELELUECK@de.ibm.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 10 May 2011 19:22:54 +0000 (12:22 -0700)]
vlan: fix GVRP at dismantle time
ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
ip link set eth2.103 up
rmmod tg3 # driver providing eth2
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
PGD
11d251067 PUD
11b9e0067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
CPU 0
Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]
Pid: 11494, comm: rmmod Tainted: G W
2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
RIP: 0010:[<
ffffffffa0030c9e>] [<
ffffffffa0030c9e>] garp_request_leave+0x3e/0xc0 [garp]
RSP: 0018:
ffff88007a19bae8 EFLAGS:
00010286
RAX:
0000000000000000 RBX:
ffff88011b5e2000 RCX:
0000000000000002
RDX:
0000000000000000 RSI:
0000000000000175 RDI:
ffffffffa0030d5b
RBP:
ffff88007a19bb18 R08:
0000000000000001 R09:
ffff88011bd64a00
R10:
ffff88011d34ec00 R11:
0000000000000000 R12:
0000000000000002
R13:
ffff88007a19bc48 R14:
ffff88007a19bb88 R15:
0000000000000001
FS:
0000000000000000(0000) GS:
ffff88011fc00000(0063) knlGS:
00000000f77d76c0
CS: 0010 DS: 002b ES: 002b CR0:
000000008005003b
CR2:
0000000000000000 CR3:
000000011a675000 CR4:
00000000000006f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Process rmmod (pid: 11494, threadinfo
ffff88007a19a000, task
ffff8800798595c0)
Stack:
ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
Call Trace:
[<
ffffffffa003a5f6>] vlan_gvrp_request_leave+0x46/0x50 [8021q]
[<
ffffffffa00397e7>] vlan_dev_stop+0xb7/0xc0 [8021q]
[<
ffffffff8137e427>] __dev_close_many+0x87/0xe0
[<
ffffffff8137e507>] dev_close_many+0x87/0x110
[<
ffffffff8137e630>] rollback_registered_many+0xa0/0x240
[<
ffffffff8137e7e9>] unregister_netdevice_many+0x19/0x60
[<
ffffffffa00389eb>] vlan_device_event+0x53b/0x550 [8021q]
[<
ffffffff8143f448>] ? ip6mr_device_event+0xa8/0xd0
[<
ffffffff81479d03>] notifier_call_chain+0x53/0x80
[<
ffffffff81062539>] __raw_notifier_call_chain+0x9/0x10
[<
ffffffff81062551>] raw_notifier_call_chain+0x11/0x20
[<
ffffffff8137df82>] call_netdevice_notifiers+0x32/0x60
[<
ffffffff8137e69f>] rollback_registered_many+0x10f/0x240
[<
ffffffff8137e85f>] rollback_registered+0x2f/0x40
[<
ffffffff8137e8c8>] unregister_netdevice_queue+0x58/0x90
[<
ffffffff8137e9eb>] unregister_netdev+0x1b/0x30
[<
ffffffffa005d73f>] tg3_remove_one+0x6f/0x10b [tg3]
We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
is called right after unregister_netdevice_queue(). In batch mode,
unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 10 May 2011 19:00:53 +0000 (12:00 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (28 commits)
MIPS: Alchemy: fix xxs1500 build error
MIPS: Invalidate old TLB mappings when updating huge page PTEs.
MIPS: Hibernation: Fixes for PAGE_SIZE >= 64kb
MIPS: JZ4740: Set one-shot feature flag for the clockevent
MIPS: JZ4740: Export symbols to the watchdog driver module
MIPS: JZ4740: Fix GCC 4.6.0 build error.
MIPS: Audit: Fix success success argument pass to audit_syscall_exit
MIPS: Fix calc_vmlinuz_load_addr build warnings.
MIPS: Alchemy: Fix GCC 4.6.0 build error.
MIPS: Document former use of timerfd(2) syscall number.
MIPS: IP27: Fix GCC 4.6.0 build error.
MIPS: IP27: Fix GCC 4.6.0 build error.
MIPS: bcm63xx: Fix header_crc comment in bcm963xx_tag.h
MIPS: Octeon: Guard the Kconfig body with CPU_CAVIUM_OCTEON
MIPS: Octeon: Cleanup Kconfig IRQ_CPU* symbols.
MIPS: Rename .data..mostly and properly handle it in linker script
MIPS: MSP: Fix build error
MIPS: MSP71xx: Fix typo in msp_per_irq_controller
MIPS: Loongson: Fix GCC 2.6.0 build error.
MIPS: Jazz: Fix GCC 4.6.0 build error
...
Linus Torvalds [Tue, 10 May 2011 18:56:35 +0000 (11:56 -0700)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: fix race condition in AIL push trigger
xfs: make AIL target updates and compares 32bit safe.
xfs: always push the AIL to the target
xfs: exit AIL push work correctly when AIL is empty
xfs: ensure reclaim cursor is reset correctly at end of AG
Manuel Lauss [Sat, 7 May 2011 11:55:19 +0000 (13:55 +0200)]
MIPS: Alchemy: fix xxs1500 build error
This fixes:
alchemy/xxs1500/init.c: In function 'prom_init':
alchemy/xxs1500/init.c:57:17: error: ignoring return value of 'kstrtoul', declared with attribute warn_unused_result
Signed-off-by: Manuel Lauss <manuel.lauss@googlemail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2340/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Wed, 27 Apr 2011 23:39:28 +0000 (16:39 -0700)]
MIPS: Invalidate old TLB mappings when updating huge page PTEs.
Without this, stale Icache or TLB entries may be used.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
https://patchwork.linux-mips.org/patch/2318/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Wu Zhangjin [Sat, 23 Apr 2011 21:56:59 +0000 (05:56 +0800)]
MIPS: Hibernation: Fixes for PAGE_SIZE >= 64kb
PAGE_SIZE >= 64kb (1 << 16) is too big to be the immediate of the
addiu/daddiu instruction, so, use addu/daddu instruction instead.
The following compiling error is fixed:
AS arch/mips/power/hibernate.o
arch/mips/power/hibernate.S: Assembler messages:
arch/mips/power/hibernate.S:38: Error: expression out of range
make[2]: *** [arch/mips/power/hibernate.o] Error 1
make[1]: *** [arch/mips/power] Error 2
Reported-by: Roman Mamedov <rm@romanrm.ru>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2313/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Lars-Peter Clausen [Thu, 31 Mar 2011 18:52:20 +0000 (20:52 +0200)]
MIPS: JZ4740: Set one-shot feature flag for the clockevent
The code for supporting one-shot mode for the clockevent is already there,
only the feature flag was not set. Setting the one-shot flag allows the
kernel to run in tickless mode.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2261/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 18 Apr 2011 10:19:32 +0000 (11:19 +0100)]
MIPS: JZ4740: Export symbols to the watchdog driver module
MODPOST 356 modules
ERROR: "jz4740_timer_disable_watchdog" [drivers/watchdog/jz4740_wdt.ko] undefine
d!
ERROR: "jz4740_timer_enable_watchdog" [drivers/watchdog/jz4740_wdt.ko] undefined
!
make[1]: *** [__modpost] Error 1
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 18 Apr 2011 10:16:42 +0000 (11:16 +0100)]
MIPS: JZ4740: Fix GCC 4.6.0 build error.
CC arch/mips/jz4740/dma.o
arch/mips/jz4740/dma.c: In function 'jz4740_dma_chan_irq':
arch/mips/jz4740/dma.c:245:11: error: variable 'status' set but not used [-Werro
r=unused-but-set-variable]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 21:51:23 +0000 (23:51 +0200)]
MIPS: Audit: Fix success success argument pass to audit_syscall_exit
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 19:49:54 +0000 (21:49 +0200)]
MIPS: Fix calc_vmlinuz_load_addr build warnings.
HOSTCC arch/mips/boot/compressed/calc_vmlinuz_load_addr
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c: In function 'main':
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:35:2: warning: format '%llx' expects type 'long long unsigned int *', but argument 3 has type 'uint64_t *'
arch/mips/boot/compressed/calc_vmlinuz_load_addr.c:54:2: warning: format '%llx' expects type 'long long unsigned int', but argument 2 has type 'uint64_t'
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 19:15:09 +0000 (21:15 +0200)]
MIPS: Alchemy: Fix GCC 4.6.0 build error.
CC arch/mips/alchemy/devboards/db1x00/board_setup.o
arch/mips/alchemy/devboards/db1x00/board_setup.c: In function 'board_setup':
arch/mips/alchemy/devboards/db1x00/board_setup.c:130:6: error: variable 'pin_func' set but not used [-Werror=unused-but-set-variable]
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 13 Apr 2011 18:50:46 +0000 (20:50 +0200)]
MIPS: Document former use of timerfd(2) syscall number.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 11 Apr 2011 09:48:31 +0000 (11:48 +0200)]
MIPS: IP27: Fix GCC 4.6.0 build error.
CC arch/mips/sgi-ip27/ip27-hubio.o
arch/mips/sgi-ip27/ip27-hubio.c: In function 'hub_pio_map':
arch/mips/sgi-ip27/ip27-hubio.c:32:20: error: variable 'junk' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Mon, 11 Apr 2011 09:37:15 +0000 (11:37 +0200)]
MIPS: IP27: Fix GCC 4.6.0 build error.
CC arch/mips/sgi-ip27/ip27-hubio.o
arch/mips/sgi-ip27/ip27-hubio.c: In function 'hub_pio_map':
arch/mips/sgi-ip27/ip27-hubio.c:32:20: error: variable 'junk' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Jonas Gorski [Fri, 8 Apr 2011 12:32:15 +0000 (14:32 +0200)]
MIPS: bcm63xx: Fix header_crc comment in bcm963xx_tag.h
The CRC32 actually includes the tag_version.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2275/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Fri, 18 Feb 2011 02:23:32 +0000 (18:23 -0800)]
MIPS: Octeon: Guard the Kconfig body with CPU_CAVIUM_OCTEON
Instead of making each Octeon specific option depend on
CPU_CAVIUM_OCTEON, gate the body of the entire file with
CPU_CAVIUM_OCTEON. With this change, CAVIUM_OCTEON_SPECIFIC_OPTIONS
becomes useless, so get rid of it as well.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2091/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Thu, 17 Feb 2011 22:04:33 +0000 (14:04 -0800)]
MIPS: Octeon: Cleanup Kconfig IRQ_CPU* symbols.
Octeon doesn't use IRQ_CPU, so don't select it.
IRQ_CPU_OCTEON is a completely unused symbol, remove it completely.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2086/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Catalin Marinas [Tue, 29 Mar 2011 10:40:06 +0000 (11:40 +0100)]
MIPS: Rename .data..mostly and properly handle it in linker script
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 14:09:25 +0000 (16:09 +0200)]
MIPS: MSP: Fix build error
Reported and original patch by Yoichi Yuasa <yuasa@linux-mips.org>.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Yoichi Yuasa [Tue, 29 Mar 2011 06:53:56 +0000 (15:53 +0900)]
MIPS: MSP71xx: Fix typo in msp_per_irq_controller
CC arch/mips/pmc-sierra/msp71xx/msp_irq_per.o
arch/mips/pmc-sierra/msp71xx/msp_irq_per.c:101:2: error: expected identifier before '.' token
make[2]: *** [arch/mips/pmc-sierra/msp71xx/msp_irq_per.o] Error 1
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2246/
Cc: linux-mips <linux-mips@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 10:32:55 +0000 (12:32 +0200)]
MIPS: Loongson: Fix GCC 2.6.0 build error.
CC arch/mips/loongson/common/env.o
arch/mips/loongson/common/env.c: In function 'prom_init_env':
arch/mips/loongson/common/env.c:50:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
arch/mips/loongson/common/env.c:51:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
arch/mips/loongson/common/env.c:52:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
arch/mips/loongson/common/env.c:53:12: error: variable 'ret' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 10:09:51 +0000 (12:09 +0200)]
MIPS: Jazz: Fix GCC 4.6.0 build error
CC arch/mips/jazz/jazzdma.o
arch/mips/jazz/jazzdma.c: In function 'vdma_remap':
arch/mips/jazz/jazzdma.c:214:20: error: variable 'npages' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:57:11 +0000 (11:57 +0200)]
MIPS: SNI: Fix GCC 4.6.0 build error
CC arch/mips/sni/time.o
arch/mips/sni/time.c: In function 'dosample':
arch/mips/sni/time.c:98:19: error: variable 'lsb' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:48:22 +0000 (11:48 +0200)]
MIPS: Malta: Fix GCC 4.6.0 build error
CC arch/mips/mti-malta/malta-int.o
arch/mips/mti-malta/malta-int.c: In function 'mips_pcibios_iack':
arch/mips/mti-malta/malta-int.c:59:6: error: variable 'dummy' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:43:19 +0000 (11:43 +0200)]
MIPS: Malta: Fix GCC 4.6.0 build error
CC arch/mips/mti-malta/malta-init.o
arch/mips/mti-malta/malta-init.c: In function 'prom_init':
arch/mips/mti-malta/malta-init.c:196:6: error: variable 'result' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:06:49 +0000 (11:06 +0200)]
MIPS: IP22: Fix GCC 4.6.0 build error
CC arch/mips/sgi-ip22/ip22-platform.o
arch/mips/sgi-ip22/ip22-platform.c: In function 'sgiseeq_devinit':
arch/mips/sgi-ip22/ip22-platform.c:135:15: error: variable 'tmp' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
While at it rename the variable to pbdma for readability; there is a
local variable tmp of different type being used in two nested blocks.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 09:00:44 +0000 (11:00 +0200)]
MIPS: IP22: Fix GCC 4.6.0 build error
CC arch/mips/sgi-ip22/ip22-time.o
arch/mips/sgi-ip22/ip22-time.c: In function 'dosample':
arch/mips/sgi-ip22/ip22-time.c:35:10: error: variable 'lsb' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 08:54:54 +0000 (10:54 +0200)]
MIPS: tlbex: Fix GCC 4.6.0 build error
CC arch/mips/mm/tlbex.o
arch/mips/mm/tlbex.c: In function 'build_r4000_tlb_refill_handler':
arch/mips/mm/tlbex.c:1155:22: error: variable 'vmalloc_mode' set but not used [-Werror=unused-but-set-variable]
arch/mips/mm/tlbex.c:1154:28: error: variable 'htlb_info' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Tue, 29 Mar 2011 08:50:38 +0000 (10:50 +0200)]
MIPS: c-r4k: Fix GCC 4.6.0 build error
CC arch/mips/mm/c-r4k.o
arch/mips/mm/c-r4k.c: In function 'probe_scache':
arch/mips/mm/c-r4k.c:1078:6: error: variable 'tmp' set but not used [-Werror=unused-but-set-variable]
cc1: all warnings being treated as errors
Older GCC versions didn't warn about the unused variable tmp because it was
getting initialized.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Tue, 28 Dec 2010 21:21:37 +0000 (13:21 -0800)]
MIPS: Mask jump target in ftrace_dyn_arch_init_insns().
The current code is abusing the uasm interface by passing jump target
addresses with high bits set. Mask the addresses to avoid annoying
messages at boot time.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/1922/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Tue, 10 May 2011 16:41:03 +0000 (09:41 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/ryusuke/nilfs2
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix infinite loop in nilfs_palloc_freev function
Linus Torvalds [Tue, 10 May 2011 16:39:11 +0000 (09:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: Handle get_user_pages_fast return properly
Martin Schwidefsky [Tue, 10 May 2011 15:13:43 +0000 (17:13 +0200)]
[S390] fix alloc_pgste check in init_new_context
Processes started with kernel_execve from a kernel thread will have
current->mm==NULL. Reading current->mm->context.alloc_pgste will
read a more or less random bit from lowcore in this case. If the
bit turns out to be set the whole process tree started this way
will allocate page table extensions although they have no need
for it.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Tue, 10 May 2011 15:13:42 +0000 (17:13 +0200)]
[S390] oprofile: fix min/max interval query checks
oprofile_min_interval and oprofile_max_interval are unsigned, checking
for negative values doesn't work. Change hwsampler_query_min_interval
and hwsampler_query_max_interval to return an unsigned long and
check for a zero value instead.
Reported-by: Nicolas Kaiser <nikai@nikai.net>
Acked-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Tue, 10 May 2011 15:13:41 +0000 (17:13 +0200)]
[S390] replace diag10() with diag10_range() function
Currently the diag10() function can only release one page. For exploiters
that have to call diag10 on a contiguous memory region this is suboptimal.
This patch replaces the diag10() function with diag10_range() that is
able to release multiple pages. In addition to that the new function now
allows to release memory with addresses higher than 2047 MiB. This was
due to a restriction of the diagnose implementation under z/VM prior to
release 5.2.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Christian Borntraeger [Tue, 10 May 2011 15:13:40 +0000 (17:13 +0200)]
[S390] disassembler: handle b280/spp instruction
arch/s390/kvm/sie64a.S uses the b280 instruction. Tell the builtin
disassembler to handle that code.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Michael Holzheu [Tue, 10 May 2011 15:13:39 +0000 (17:13 +0200)]
[S390] kernel: Initialize register 14 when starting new CPU
When starting a new CPU we currently jump to start_secondary() without
setting register 14 (the return address) correctly. Therefore on the stack
frame for start_secondary an invalid return address is stored. This leads
to wrong stack back traces in kernel dumps.
Example:
#00 [
1f33fe48] cpu_idle at 10614a
#01 [
1f33fe90] start_secondary at 54fa88
#02 [
1f33feb8] (null) at 0 <--- invalid
To fix this start_secondary() is called now with basr/brasl that sets
register 14 correctly. The output of the stack backtrace looks then
like the following:
#00 [
1f33fe48] cpu_idle at 10614a
#01 [
1f33fe90] start_secondary at 54fa88
#02 [
1f33feb8] restart_base at 54f41e <--- correct
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Haberland [Tue, 10 May 2011 15:13:38 +0000 (17:13 +0200)]
[S390] dasd: prevent IO error during reserve/release loop
The termination of running CQR caused by reserve/release operations
may lead to an IO error if reserve/release is done in a tight loop.
Prevent this by increasing the retry counter after termination.
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 10 May 2011 15:13:37 +0000 (17:13 +0200)]
[S390] sclp/memory hotplug: fix initial usecount of increments
Fix initial usecount of attached and assigned storage increments so
they can be set offline.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Ryusuke Konishi [Tue, 10 May 2011 11:59:34 +0000 (20:59 +0900)]
nilfs2: fix infinite loop in nilfs_palloc_freev function
After having applied commit
9954e7af14868b8b ("nilfs2: add free
entries count only if clear bit operation succeeded"), a free routine
of nilfs came to fall into an infinite loop, outputting the same
message endlessly:
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed
nilfs_palloc_freev: entry number 29497 already freed ...
That patch broke the routine so that a loop counter is never updated
in an abnormal state. This fixes the regression.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Pablo Neira Ayuso [Tue, 10 May 2011 10:13:36 +0000 (12:13 +0200)]
netfilter: revert
a2361c8735e07322023aedc36e4938b35af31eb0
This patch reverts
a2361c8735e07322023aedc36e4938b35af31eb0:
"[PATCH] netfilter: xt_conntrack: warn about use in raw table"
Florian Wesphal says:
"... when the packet was sent from the local machine the skb
already has ->nfct attached, and -m conntrack seems to do
the right thing."
Acked-by: Jan Engelhardt <jengelh@medozas.de>
Reported-by: Florian Wesphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fernando Luis Vazquez Cao [Tue, 10 May 2011 08:00:21 +0000 (10:00 +0200)]
netfilter: IPv6: fix DSCP mangle code
The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fernando Luis Vazquez Cao [Tue, 10 May 2011 07:55:44 +0000 (09:55 +0200)]
netfilter: IPv6: initialize TOS field in REJECT target module
The IPv6 header is not zeroed out in alloc_skb so we must initialize
it properly unless we want to see IPv6 packets with random TOS fields
floating around. The current implementation resets the flow label
but this could be changed if deemed necessary.
We stumbled upon this issue when trying to apply a mangle rule to
the RST packet generated by the REJECT target module.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Hans Schillstrom [Tue, 3 May 2011 20:09:31 +0000 (22:09 +0200)]
IPVS: init and cleanup restructuring
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.
IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.
This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.
The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
Hans Schillstrom [Tue, 3 May 2011 20:09:30 +0000 (22:09 +0200)]
IPVS: Change of socket usage to enable name space exit.
If the sync daemons run in a name space while it crashes
or get killed, there is no way to stop them except for a reboot.
When all patches are there, ip_vs_core will handle register_pernet_(),
i.e. ip_vs_sync_init() and ip_vs_sync_cleanup() will be removed.
Kernel threads should not increment the use count of a socket.
By calling sk_change_net() after creating a socket this is avoided.
sock_release cant be used intead sk_release_kernel() should be used.
Thanks Eric W Biederman for your advices.
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
Florian Westphal [Thu, 21 Apr 2011 08:58:25 +0000 (10:58 +0200)]
netfilter: ebtables: only call xt_compat_add_offset once per rule
The optimizations in commit
255d0dc34068a976
(netfilter: x_table: speedup compat operations) assume that
xt_compat_add_offset is called once per rule.
ebtables however called it for each match/target found in a rule.
The match/watcher/target parser already returns the needed delta, so it
is sufficient to move the xt_compat_add_offset call to a more reasonable
location.
While at it, also get rid of the unused COMPAT iterator macros.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Eric Dumazet [Thu, 21 Apr 2011 08:57:21 +0000 (10:57 +0200)]
netfilter: fix ebtables compat support
commit
255d0dc34068a976 (netfilter: x_table: speedup compat operations)
made ebtables not working anymore.
1) xt_compat_calc_jump() is not an exact match lookup
2) compat_table_info() has a typo in xt_compat_init_offsets() call
3) compat_do_replace() misses a xt_compat_init_offsets() call
Reported-by: dann frazier <dannf@dannf.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Pablo Neira Ayuso [Thu, 21 Apr 2011 08:55:07 +0000 (10:55 +0200)]
netfilter: ctnetlink: fix timestamp support for new conntracks
This patch fixes the missing initialization of the start time if
the timestamp support is enabled.
libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp 6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=
1303296401 use=2
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
M. Mohan Kumar [Fri, 15 Apr 2011 08:29:33 +0000 (13:59 +0530)]
net/9p: Handle get_user_pages_fast return properly
Use proper data type to handle get_user_pages_fast error condition. Also
do not treat EFAULT error as fatal.
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Linus Torvalds [Tue, 10 May 2011 02:33:54 +0000 (19:33 -0700)]
Linux 2.6.39-rc7
Hugh Dickins [Tue, 10 May 2011 00:44:42 +0000 (17:44 -0700)]
vm: fix vm_pgoff wrap in upward expansion
Commit
a626ca6a6564 ("vm: fix vm_pgoff wrap in stack expansion") fixed
the case of an expanding mapping causing vm_pgoff wrapping when you had
downward stack expansion. But there was another case where IA64 and
PA-RISC expand mappings: upward expansion.
This fixes that case too.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 9 May 2011 23:59:51 +0000 (16:59 -0700)]
Merge branch 'drm-intel-fixes' of git://git./linux/kernel/git/keithp/linux-2.6
* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6:
drm/i915/lvds: Only act on lid notify when the device is on
drm/i915: fix intel_crtc_clock_get pipe reads after "cleanup cleanup"
drm/i915: Only enable the plane after setting the fb base (pre-ILK)
drm/i915/dp: Be paranoid in case we disable a DP before it is attached
drm/i915: Release object along create user fb error path
Dave Chinner [Fri, 6 May 2011 02:54:08 +0000 (02:54 +0000)]
xfs: fix race condition in AIL push trigger
The recent conversion of the xfsaild functionality to a work queue
introduced a hard-to-hit log space grant hang. One is caused by a
race condition in determining whether there is a psh in progress or
not.
The XFS_AIL_PUSHING_BIT is used to determine whether a push is
currently in progress. When the AIL push work completes, it checked
whether the target changed and cleared the PUSHING bit to allow a
new push to be requeued. The race condition is as follows:
Thread 1 push work
smp_wmb()
smp_rmb()
check ailp->xa_target unchanged
update ailp->xa_target
test/set PUSHING bit
does not queue
clear PUSHING bit
does not requeue
Now that the push target is updated, new attempts to push the AIL
will not trigger as the push target will be the same, and hence
despite trying to push the AIL we won't ever wake it again.
The fix is to ensure that the AIL push work clears the PUSHING bit
before it checks if the target is unchanged.
As a result, both push triggers operate on the same test/set bit
criteria, so even if we race in the push work and miss the target
update, the thread requesting the push will still set the PUSHING
bit and queue the push work to occur. For safety sake, the same
queue check is done if the push work detects the target change,
though only one of the two will will queue new work due to the use
of test_and_set_bit() checks.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit
e4d3c4a43b595d5124ae824d300626e6489ae857)
Dave Chinner [Fri, 6 May 2011 02:54:07 +0000 (02:54 +0000)]
xfs: make AIL target updates and compares 32bit safe.
The recent conversion of the xfsaild functionality to a work queue
introduced a hard-to-hit log space grant hang. One of the problems
noticed was that updates of the push target are not 32 bit safe as
the target is a 64 bit value.
We cannot copy a 64 bit LSN without the possibility of corrupting
the result when racing with another updating thread. We have
function to do this update safely without needing to care about
32/64 bit issues - xfs_trans_ail_copy_lsn() - so use that when
updating the AIL push target.
Also move the reading of the target in the push work inside the AIL
lock, and use XFS_LSN_CMP() for the unlocked comparison during work
termination to close read holes as well.
Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit
fd5670f22fce247754243cf2ed41941e5762d990)
Dave Chinner [Fri, 6 May 2011 02:54:06 +0000 (02:54 +0000)]
xfs: always push the AIL to the target
The recent conversion of the xfsaild functionality to a work queue
introduced a hard-to-hit log space grant hang. One of the problems
discovered is a target mismatch between the item pushing loop and
the target itself.
The push trigger checks for the target increasing (i.e. new target >
current) while the push loop only pushes items that have a LSN <
current. As a result, we can get the situation where the push target
is X, the items at the tail of the AIL have LSN X and they don't get
pushed. The push work then completes thinking it is done, and cannot
be restarted until the push target increases to >= X + 1. If the
push target then never increases (because the tail is not moving),
then we never run the push work again and we stall.
Fix it by making sure log items with a LSN that matches the target
exactly are pushed during the loop.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit
cb64026b6e8af50db598ec7c3f59d504259b00bb)
Dave Chinner [Fri, 6 May 2011 02:54:05 +0000 (02:54 +0000)]
xfs: exit AIL push work correctly when AIL is empty
The recent conversion of the xfsaild functionality to a work queue
introduced a hard-to-hit log space grant hang. The main cause is a
regression where a work exit path fails to clear the PUSHING state
and recheck the target correctly.
Make both exit paths do the same PUSHING bit clearing and target
checking when the "no more work to be done" condition is hit.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit
ea35a20021f8497390d05b93271b4d675516c654)
Dave Chinner [Fri, 6 May 2011 02:54:04 +0000 (02:54 +0000)]
xfs: ensure reclaim cursor is reset correctly at end of AG
On a 32 bit highmem PowerPC machine, the XFS inode cache was growing
without bound and exhausting low memory causing the OOM killer to be
triggered. After some effort, the problem was reproduced on a 32 bit
x86 highmem machine.
The problem is that the per-ag inode reclaim index cursor was not
getting reset to the start of the AG if the radix tree tag lookup
found no more reclaimable inodes. Hence every further reclaim
attempt started at the same index beyond where any reclaimable
inodes lay, and no further background reclaim ever occurred from the
AG.
Without background inode reclaim the VM driven cache shrinker
simply cannot keep up with cache growth, and OOM is the result.
While the change that exposed the problem was the conversion of the
inode reclaim to use work queues for background reclaim, it was not
the cause of the bug. The bug was introduced when the cursor code
was added, just waiting for some weird configuration to strike....
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Tested-By: Christian Kujau <lists@nerdbynature.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
(cherry picked from commit
b223221956675ce8a7b436d198ced974bb388571)