Trond Myklebust [Tue, 15 Mar 2011 17:36:43 +0000 (13:36 -0400)]
VFS: Fix the nfs sillyrename regression in kernel 2.6.38
The new vfs locking scheme introduced in 2.6.38 breaks NFS sillyrename
because the latter relies on being able to determine the parent
directory of the dentry in the ->iput() callback in order to send the
appropriate unlink rpc call.
Looking at the code that cares about races with dput(), there doesn't
seem to be anything that specifically uses d_parent as a test for
whether or not there is a race:
- __d_lookup_rcu(), __d_lookup() all test for d_hashed() after d_parent
- shrink_dcache_for_umount() is safe since nothing else can rearrange
the dentries in that super block.
- have_submount(), select_parent() and d_genocide() can test for a
deletion if we set the DCACHE_DISCONNECTED flag when the dentry
is removed from the parent's d_subdirs list.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org (2.6.38, needs commit c826cb7dfce8 "dcache.c:
create helper function for duplicated functionality" )
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 15 Mar 2011 22:29:21 +0000 (15:29 -0700)]
dcache.c: create helper function for duplicated functionality
This creates a helper function for he "try to ascend into the parent
directory" case, which was written out in triplicate before. With all
the locking and subtle sequence number stuff, we really don't want to
duplicate that kind of code.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 15 Mar 2011 17:59:09 +0000 (10:59 -0700)]
Merge branch 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm
* 'for-linus' of git://xenbits.xen.org/people/sstabellini/linux-pvhvm:
xen: suspend: remove xen_hvm_suspend
xen: suspend: pull pre/post suspend hooks out into suspend_info
xen: suspend: move arch specific pre/post suspend hooks into generic hooks
xen: suspend: refactor non-arch specific pre/post suspend hooks
xen: suspend: add "arch" to pre/post suspend hooks
xen: suspend: pass extra hypercall argument via suspend_info struct
xen: suspend: refactor cancellation flag into a structure
xen: suspend: use HYPERVISOR_suspend for PVHVM case instead of open coding
xen: switch to new schedop hypercall by default.
xen: use new schedop interface for suspend
xen: do not respond to unknown xenstore control requests
xen: fix compile issue if XEN is enabled but XEN_PVHVM is disabled
xen: PV on HVM: support PV spinlocks and IPIs
xen: make the ballon driver work for hvm domains
xen-blkfront: handle Xen major numbers other than XENVBD
xen: do not use xen_info on HVM, set pv_info name to "Xen HVM"
xen: no need to delay xen_setup_shutdown_event for hvm guests anymore
Linus Torvalds [Tue, 15 Mar 2011 17:49:16 +0000 (10:49 -0700)]
Merge branches 'stable/ia64', 'stable/blkfront-cleanup' and 'stable/cleanup' of git://git./linux/kernel/git/konrad/xen
* 'stable/ia64' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: ia64 build broken due to "xen: switch to new schedop hypercall by default."
* 'stable/blkfront-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: Union the blkif_request request specific fields
* 'stable/cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: annotate functions which only call into __init at start of day
xen p2m: annotate variable which appears unused
xen: events: mark cpu_evtchn_mask_p as __refdata
Linus Torvalds [Tue, 15 Mar 2011 17:47:56 +0000 (10:47 -0700)]
Merge branch 'stable/irq.cleanup' of git://git./linux/kernel/git/konrad/xen
* 'stable/irq.cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: events: remove dom0 specific xen_create_msi_irq
xen: events: use xen_bind_pirq_msi_to_irq from xen_create_msi_irq
xen: events: push set_irq_msi down into xen_create_msi_irq
xen: events: update pirq_to_irq in xen_create_msi_irq
xen: events: refactor xen_create_msi_irq slightly
xen: events: separate MSI PIRQ allocation from PIRQ binding to IRQ
xen: events: assume PHYSDEVOP_get_free_pirq exists
xen: pci: collapse apic_register_gsi_xen_hvm and xen_hvm_register_pirq
xen: events: return irq from xen_allocate_pirq_msi
xen: events: drop XEN_ALLOC_IRQ flag to xen_allocate_pirq_msi
xen: events: do not leak IRQ from xen_allocate_pirq_msi when no pirq available.
xen: pci: only define xen_initdom_setup_msi_irqs if CONFIG_XEN_DOM0
Linus Torvalds [Tue, 15 Mar 2011 17:47:16 +0000 (10:47 -0700)]
Merge branches 'stable/irq.rework' and 'stable/pcifront-fixes' of git://git./linux/kernel/git/konrad/xen
* 'stable/irq.rework' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/irq: Cleanup up the pirq_to_irq for DomU PV PCI passthrough guests as well.
xen: Use IRQF_FORCE_RESUME
xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
xen: Fix compile error introduced by "switch to new irq_chip functions"
xen: Switch to new irq_chip functions
xen: Remove stale irq_chip.end
xen: events: do not free legacy IRQs
xen: events: allocate GSIs and dynamic IRQs from separate IRQ ranges.
xen: events: add xen_allocate_irq_{dynamic, gsi} and xen_free_irq
xen:events: move find_unbound_irq inside CONFIG_PCI_MSI
xen: handled remapped IRQs when enabling a pcifront PCI device.
genirq: Add IRQF_FORCE_RESUME
* 'stable/pcifront-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
pci/xen: When free-ing MSI-X/MSI irq->desc also use generic code.
pci/xen: Cleanup: convert int** to int[]
pci/xen: Use xen_allocate_pirq_msi instead of xen_allocate_pirq
xen-pcifront: Sanity check the MSI/MSI-X values
xen-pcifront: don't use flush_scheduled_work()
Linus Torvalds [Tue, 15 Mar 2011 17:32:15 +0000 (10:32 -0700)]
Merge branches 'stable/p2m-identity.v4.9.1' and 'stable/e820' of git://git./linux/kernel/git/konrad/xen
* 'stable/p2m-identity.v4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/m2p: Check whether the MFN has IDENTITY_FRAME bit set..
xen/m2p: No need to catch exceptions when we know that there is no RAM
xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set.
xen/debugfs: Add 'p2m' file for printing out the P2M layout.
xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
xen/mmu: WARN_ON when racing to swap middle leaf.
xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN.
xen/mmu: Add the notion of identity (1-1) mapping.
xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.
* 'stable/e820' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/e820: Don't mark balloon memory as E820_UNUSABLE when running as guest and fix overflow.
xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps.
Linus Torvalds [Tue, 15 Mar 2011 01:20:32 +0000 (18:20 -0700)]
Linux 2.6.38
Linus Torvalds [Mon, 14 Mar 2011 22:20:39 +0000 (15:20 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/dhowells/linux-2.6-mn10300
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
MN10300: atomic_read() should ensure it emits a load
MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
MN10300: Proper use of macros get_user() in the case of incremented pointers
Linus Torvalds [Mon, 14 Mar 2011 22:20:12 +0000 (15:20 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (26 commits)
MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
MIPS: MTX-1: Make au1000_eth probe all PHY addresses
MIPS: Jz4740: Add HAVE_CLK
MIPS: Move idle task creation to work queue
MIPS, Perf-events: Use unsigned delta for right shift in event update
MIPS, Perf-events: Work with the new callchain interface
MIPS, Perf-events: Fix event check in validate_event()
MIPS, Perf-events: Work with the new PMU interface
MIPS, Perf-events: Work with irq_work
MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
MIPS: Loongson: Fix potentially wrong string handling
MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
MIPS: Remove unused code from arch/mips/kernel/syscall.c
MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
MIPS: MSP: Fix MSP71xx bpci interrupt handler return value
MIPS: Select R4K timer lib for all MSP platforms
MIPS: Loongson: Remove ad-hoc cmdline default
MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
MIPS: Add an unreachable return statement to satisfy buggy GCCs.
...
Linus Torvalds [Mon, 14 Mar 2011 22:19:09 +0000 (15:19 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: ce4100: Set pci ops via callback instead of module init
x86/mm: Fix pgd_lock deadlock
x86/mm: Handle mm_fault_error() in kernel space
x86: Don't check for BIOS corruption in first 64K when there's no need to
Linus Torvalds [Mon, 14 Mar 2011 22:17:07 +0000 (15:17 -0700)]
Revert "oom: oom_kill_process: fix the child_points logic"
This reverts the parent commit. I hate doing that, but it's generating
some discussion ("half of it is right"), and since I am planning on
doing the 2.6.38 release later today we can punt it to stable if
required. Let's not rock the boat right now.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Oleg Nesterov [Mon, 14 Mar 2011 19:05:30 +0000 (20:05 +0100)]
oom: oom_kill_process: fix the child_points logic
oom_kill_process() starts with victim_points == 0. This means that
(most likely) any child has more points and can be killed erroneously.
Also, "children has a different mm" doesn't match the reality, we should
check child->mm != t->mm. This check is not exactly correct if t->mm ==
NULL but this doesn't really matter, oom_kill_task() will kill them
anyway.
Note: "Kill all processes sharing p->mm" in oom_kill_task() is wrong
too.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Florian Fainelli [Mon, 21 Feb 2011 13:28:02 +0000 (14:28 +0100)]
MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
Since commit
32fd6901 (MIPS: Alchemy: get rid of common/reset.c)
Alchemy-based boards use their own reset function. For MTX-1 and XXS1500,
the reset function pokes at the BCSR.SYSTEM_RESET register, but this does
not work. According to Bruno Randolf, this was not tested when written.
Previously, the generic au1000_restart() routine called the board specific
reset function, which for MTX-1 and XXS1500 did not work, but finally made
a jump to the reset vector, which really triggers a system restart. Fix
reboot for both targets by jumping to the reset vector.
Signed-off-by: Florian Fainelli <florian@openwrt.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2093/
Acked-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Florian Fainelli [Sun, 27 Feb 2011 18:53:53 +0000 (19:53 +0100)]
MIPS: MTX-1: Make au1000_eth probe all PHY addresses
When au1000_eth probes the MII bus for PHY address, if we do not set
au1000_eth platform data's phy_search_highest_address, the MII probing
logic will exit early and will assume a valid PHY is found at address 0.
For MTX-1, the PHY is at address 31, and without this patch, the link
detection/speed/duplex would not work correctly.
CC: stable@kernel.org
Signed-off-by: Florian Fainelli <florian@openwrt.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2111/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Maurus Cuelenaere [Mon, 28 Feb 2011 23:20:01 +0000 (00:20 +0100)]
MIPS: Jz4740: Add HAVE_CLK
Jz4740 supports the clock framework but doesn't have HAVE_CLK defined,
so define it!
Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/2112/
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Maksim Rayskiy [Sat, 12 Feb 2011 18:21:32 +0000 (10:21 -0800)]
MIPS: Move idle task creation to work queue
To avoid forking usermode thread when creating an idle task, move fork_idle
to a work queue.
If kernel starts with maxcpus= option which does not bring all available
cpus online at boot time, idle tasks for offline cpus are not created. If
later offline cpus are hotplugged through sysfs, __cpu_up is called in
the context of the user task, and fork_idle copies its non-zero mm
pointer. This causes BUG() in per_cpu_trap_init.
This also avoids issues with resource limits of the CPU writing to sysfs,
containers, maybe others.
Signed-off-by: Maksim Rayskiy <mrayskiy@broadcom.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2070/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:21 +0000 (16:19 +0800)]
MIPS, Perf-events: Use unsigned delta for right shift in event update
Leverage the commit for ARM by Will Deacon:
-
446a5a8b1eb91a6990e5c8fe29f14e7a95b69132
ARM: 6205/1: perf: ensure counter delta is treated as unsigned
Hardware performance counters on ARM are 32-bits wide but atomic64_t
variables are used to represent counter data in the hw_perf_event structure.
The armpmu_event_update function right-shifts a signed 64-bit delta variable
and adds the result to the event count. This can lead to shifting in sign-bits
if the MSB of the 32-bit counter value is set. This results in perf output
such as:
Performance counter stats for 'sleep 20':
18446744073460670464 cycles <-- 0xFFFFFFFFF12A6000
7783773 instructions # 0.000 IPC
465 context-switches
161 page-faults
1172393 branches
20.
154242147 seconds time elapsed
This patch ensures that the delta value is treated as unsigned so that the
right shift sets the upper bits to zero.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2015/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:20 +0000 (16:19 +0800)]
MIPS, Perf-events: Work with the new callchain interface
This is the MIPS part of the following commits by Frederic Weisbecker:
-
f72c1a931e311bb7780fee19e41a89ac42cab50e
perf: Factorize callchain context handling
Store the kernel and user contexts from the generic layer instead
of archs, this gathers some repetitive code.
-
56962b4449af34070bb1994621ef4f0265eed4d8
perf: Generalize some arch callchain code
- Most archs use one callchain buffer per cpu, except x86 that needs
to deal with NMIs. Provide a default perf_callchain_buffer()
implementation that x86 overrides.
- Centralize all the kernel/user regs handling and invoke new arch
handlers from there: perf_callchain_user() / perf_callchain_kernel()
That avoid all the user_mode(), current->mm checks and so...
- Invert some parameters in perf_callchain_*() helpers: entry to the
left, regs to the right, following the traditional (dst, src).
-
70791ce9ba68a5921c9905ef05d23f62a90bc10c
perf: Generalize callchain_store()
callchain_store() is the same on every archs, inline it in
perf_event.h and rename it to perf_callchain_store() to avoid
any collision.
This removes repetitive code.
-
c1a65932fd7216fdc9a0db8bbffe1d47842f862c
perf: Drop unappropriate tests on arch callchains
Drop the TASK_RUNNING test on user tasks for callchains as
this check doesn't seem to make any sense.
Also remove the tests for !current that is not supposed to
happen and current->pid as this should be handled at the
generic level, with exclude_idle attribute.
Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2014/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:19 +0000 (16:19 +0800)]
MIPS, Perf-events: Fix event check in validate_event()
Ignore events that are in off/error state or belong to a different PMU.
This patch originates from the following commit for ARM by Will Deacon:
-
65b4711ff513767341aa1915c822de6ec0de65cb
ARM: 6352/1: perf: fix event validation
The validate_event function in the ARM perf events backend has the
following problems:
1.) Events that are disabled count towards the cost.
2.) Events associated with other PMUs [for example, software events or
breakpoints] do not count towards the cost, but do fail validation,
causing the group to fail.
This patch changes validate_event so that it ignores events in the
PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2013/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:18 +0000 (16:19 +0800)]
MIPS, Perf-events: Work with the new PMU interface
This is the MIPS part of the following commits by Peter Zijlstra:
-
a4eaf7f14675cb512d69f0c928055e73d0c6d252
perf: Rework the PMU methods
Replace pmu::{enable,disable,start,stop,unthrottle} with
pmu::{add,del,start,stop}, all of which take a flags argument.
The new interface extends the capability to stop a counter while
keeping it scheduled on the PMU. We replace the throttled state with
the generic stopped state.
This also allows us to efficiently stop/start counters over certain
code paths (like IRQ handlers).
It also allows scheduling a counter without it starting, allowing for
a generic frozen state (useful for rotating stopped counters).
The stopped state is implemented in two different ways, depending on
how the architecture implemented the throttled state:
1) We disable the counter:
a) the pmu has per-counter enable bits, we flip that
b) we program a NOP event, preserving the counter state
2) We store the counter state and ignore all read/overflow events
For MIPSXX, the stopped state is implemented in the way of 1.b as above.
-
33696fc0d141bbbcb12f75b69608ea83282e3117
perf: Per PMU disable
Changes perf_disable() into perf_pmu_disable().
-
24cd7f54a0d47e1d5b3de29e2456bfbd2d8447b7
perf: Reduce perf_disable() usage
Since the current perf_disable() usage is only an optimization,
remove it for now. This eases the removal of the __weak
hw_perf_enable() interface.
-
b0a873ebbf87bf38bf70b5e39a7cadc96099fa13
perf: Register PMU implementations
Simple registration interface for struct pmu, this provides the
infrastructure for removing all the weak functions.
-
51b0fe39549a04858001922919ab355dee9bdfcf
perf: Deconstify struct pmu
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`
Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2012/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Deng-Cheng Zhu [Fri, 21 Jan 2011 08:19:17 +0000 (16:19 +0800)]
MIPS, Perf-events: Work with irq_work
This is the MIPS part of the following commit by Peter Zijlstra:
-
e360adbe29241a0194e10e20595360dd7b98a2b3
irq_work: Add generic hardirq context callbacks
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.
Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.
The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.
Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
For MIPSXX, we need to call irq_work_run() at the tail of the perf IRQ
handler as described above.
Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com,
Patchwork: http://patchwork.linux-mips.org/patch/2011/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Yoichi Yuasa [Mon, 7 Feb 2011 02:31:36 +0000 (11:31 +0900)]
MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2055/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Stefan Weil [Sun, 30 Jan 2011 20:41:44 +0000 (21:41 +0100)]
MIPS: Loongson: Fix potentially wrong string handling
This error was reported by cppcheck:
arch/mips/loongson/common/machtype.c:56: error: Dangerous usage of 'str' (strncpy doesn't always 0-terminate it)
If strncpy copied MACHTYPE_LEN bytes, the destination string str
was not terminated.
The patch adds one more byte to str and makes sure that this byte is
always 0.
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/2053/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Mon, 24 Jan 2011 22:51:37 +0000 (14:51 -0800)]
MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
Under some combinations of CONFIG_*, lastpfn in page_is_ram is 'set
but not used'. Mark it as __maybe_unused to quiet the warning/error.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2033/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Mon, 24 Jan 2011 22:51:36 +0000 (14:51 -0800)]
MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
GCC-4.6 can find more unused code than previous versions could.
In the case of arch/mips/math-emu/ieee754int.h, the COMPXSP and
COMPXDP macros are used in several places, but a couple of them leave
xs unused. The easiest thing to do is mark it as __maybe_unused to
quiet the warning.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2032/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Mon, 24 Jan 2011 22:51:35 +0000 (14:51 -0800)]
MIPS: Remove unused code from arch/mips/kernel/syscall.c
The variable arg3 in _sys_sysmips() is unused. Remove it.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2034/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Mon, 24 Jan 2011 22:51:34 +0000 (14:51 -0800)]
MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
GCC-4.6 can find more unused code than previous versions could.
In the case of protected_restore_fp_context{,32}, the variable tmp is
really used. Its use is tricky in that we really care about the side
effects of the __put_user() calls. So we must mark tmp with
__maybe_unused to quiet the warning.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2035/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Anoop P A [Thu, 18 Nov 2010 10:32:50 +0000 (16:02 +0530)]
MIPS: MSP: Fix MSP71xx bpci interrupt handler return value
Signed-off-by: Anoop P A <anoop.pa@gmail.com>
To: Ben Hutchings <ben@decadent.org.uk>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1804/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Anoop P A [Thu, 18 Nov 2010 08:12:28 +0000 (13:42 +0530)]
MIPS: Select R4K timer lib for all MSP platforms
Signed-off-by: Anoop P A <anoop.pa@gmail.com>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1803/
Tested-by: Shane McDonald <mcdonald.shane@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Robert Millan [Sun, 7 Nov 2010 12:38:29 +0000 (13:38 +0100)]
MIPS: Loongson: Remove ad-hoc cmdline default
Loongson builds have an ad-hoc cmdline default of "console=ttyS0,115200
root=/dev/hda1". These settings come from a vendor; I remember builds
from Lemote branch requiring a "console=tty" override in order to get a
working console.
At least on Yeeloong, they're particularly useless: there's no external
serial port, and the IDE drive is now recognised as /dev/sda.
Signed-off-by: Robert Millan <rmh@gnu.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1759/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Stefan Oberhumer [Mon, 17 Jan 2011 08:19:53 +0000 (09:19 +0100)]
MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
The sysmips(MIPS_FIXADE, ...) case contains an obvious copy-and-paste
error in the handling of the TIF_LOGADE flag. Fix that
Patchwork: https://patchwork.linux-mips.org/patch/1997/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David Daney [Wed, 19 Jan 2011 23:24:42 +0000 (15:24 -0800)]
MIPS: Add an unreachable return statement to satisfy buggy GCCs.
It was reported that GCC-4.3.3 (with CodeSourcery extensions) fails
without this.
Reported-by: Jonas Gorski <jonas.gorski@gmail.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2010/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Wu Zhangjin [Fri, 21 Jan 2011 18:01:53 +0000 (02:01 +0800)]
MIPS, Tracing: Fix set_graph_function of function graph tracer
trace.func should be set to the recorded ip of the mcount calling site
in the __mcount_loc section to filter the function entries configured
through the tracing/set_graph_function interface, but before, this is
set to the self_ra(the return address of mcount), which has made
set_graph_function not work as expected.
This fixes it via calculating the right recorded ip in the __mcount_loc
section and assign it to trace.func.
Reported-by: Zhiping Zhong <xzhong86@163.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2017/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>
Wu Zhangjin [Wed, 19 Jan 2011 19:28:31 +0000 (03:28 +0800)]
MIPS, Tracing: Clean up ftrace_make_nop()
This moves the comments out of ftrace_make_nop() and cleans it. At the
same time, a macro MCOUNT_OFFSET_INSNS is defined for sharing with the
next patch.
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2008/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>
Wu Zhangjin [Wed, 19 Jan 2011 19:28:30 +0000 (03:28 +0800)]
MIPS, Tracing: Clean up prepare_ftrace_return()
The old prepare_ftrace_return() for MIPS is confused and have introduced
some problem. This patch cleans up the names of the arguments, variables
and related functions.
For MIPS, the 2nd argument of prepare_ftrace_return() is not really the
'selfpc' described in ftrace-design.txt but instead it is the self
return address. This did break the compatibility of the generic
interface but really reduced one unneeded calculation for to get the
current function name, the parent return address and the self return
address are enough, no need to tranform the self return address to the
self address.
But set_graph_function of function graph tracer is an exception, it does
need the 2nd argument of prepare_ftrace_return() as 'selfpc', for it
will use 'selfpc' to match user's configuration of function graph
entries, but in reality, it doesn't need the 'selfpc' but the recorded
ip address of the mcount calling site in the __mcount_loc section. So,
the 2nd argument of prepare_ftrace_return() is not important, the real
requirement is the right recorded ip address should be calculated and
assign to trace.func, this will be fixed in the next patches.
Reported-by: Zhiping Zhong <xzhong86@163.com>
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2007/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>
Wu Zhangjin [Wed, 19 Jan 2011 19:28:29 +0000 (03:28 +0800)]
MIPS, Tracing: Substitute in_kernel_space() for in_module()
The old in_module() may not work in some situations(e.g. when module &
kernel are in the same address space when CONFIG_MAPPED_KERNEL=y), The
in_kernel_space() is more generic and it is also easy to be implemented
via cloning the existing core_kernel_text(), so, replace the in_module()
with in_kernel_space().
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2005/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>
Wu Zhangjin [Wed, 19 Jan 2011 19:28:27 +0000 (03:28 +0800)]
MIPS, Tracing: Speed up function graph tracer
This simply moves the "ip-=4" statement down to the end of the do { ...
} while (...); loop, which reduces one unneeded subtration and the
subsequent memory loading and comparison.
Signed-off-by: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2006/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>
Thomas Gleixner [Sun, 23 Jan 2011 15:17:00 +0000 (15:17 +0000)]
MIPS: Replace deprecated spinlock initialization
SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant instead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2025/
Signed-off-by: Ralf Baechle <ralf@duck.linux-mips.net>
Linus Torvalds [Mon, 14 Mar 2011 18:19:50 +0000 (11:19 -0700)]
Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: NFSROOT should default to "proto=udp"
nfs4: remove duplicated #include
NFSv4: nfs4_state_mark_reclaim_nograce() should be static
NFSv4: Fix the setlk error handler
NFSv4.1: Fix the handling of the SEQUENCE status bits
NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
NFSv4.1 reclaim complete must wait for completion
NFSv4: remove duplicate clientid in struct nfs_client
NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
sunrpc: Propagate errors from xs_bind() through xs_create_sock()
(try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
nfs: fix compilation warning
nfs: add kmalloc return value check in decode_and_add_ds
SUNRPC: Remove resource leak in svc_rdma_send_error()
nfs: close NFSv4 COMMIT vs. CLOSE race
SUNRPC: Close a race in __rpc_wait_for_completion_task()
Linus Torvalds [Mon, 14 Mar 2011 18:17:43 +0000 (11:17 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon: fix problem with changing active VRAM size. (v2)
Linus Torvalds [Mon, 14 Mar 2011 17:15:43 +0000 (10:15 -0700)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
watchdog: hpwdt: eliminate section mismatch warning
watchdog: w83697ug_wdt: Fix set bit 0 to activate GPIO2
watchdog: sch311x_wdt: fix printk condition
watchdog: sch311x_wdt: Fix LDN active check
watchdog: cpwd: Fix buffer-overflow
Timo Warns [Mon, 14 Mar 2011 13:59:33 +0000 (14:59 +0100)]
Fix corrupted OSF partition table parsing
The kernel automatically evaluates partition tables of storage devices.
The code for evaluating OSF partitions contains a bug that leaks data
from kernel heap memory to userspace for certain corrupted OSF
partitions.
In more detail:
for (i = 0 ; i < le16_to_cpu(label->d_npartitions); i++, partition++) {
iterates from 0 to d_npartitions - 1, where d_npartitions is read from
the partition table without validation and partition is a pointer to an
array of at most 8 d_partitions.
Add the proper and obvious validation.
Signed-off-by: Timo Warns <warns@pre-sense.de>
Cc: stable@kernel.org
[ Changed the patch trivially to not repeat the whole le16_to_cpu()
thing, and to use an explicit constant for the magic value '8' ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Mon, 14 Mar 2011 08:08:47 +0000 (01:08 -0700)]
thp+memcg-numa: fix BUG at include/linux/mm.h:370!
THP's collapse_huge_page() has an understandable but ugly difference
in when its huge page is allocated: inside if NUMA but outside if not.
It's hardly surprising that the memcg failure path forgot that, freeing
the page in the non-NUMA case, then hitting a VM_BUG_ON in get_page()
(or even worse, using the freed page).
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefano Stabellini [Wed, 2 Feb 2011 18:32:59 +0000 (18:32 +0000)]
xen/m2p: Check whether the MFN has IDENTITY_FRAME bit set..
If there is no proper PFN value in the M2P for the MFN
(so we get 0xFFFFF.. or 0x55555, or 0x0), we should
consult the M2P override to see if there is an entry for this.
[Note: we also consult the M2P override if the MFN
is past our machine_to_phys size].
We consult the P2M with the PFN. In case the returned
MFN is one of the special values: 0xFFF.., 0x5555
(which signify that the MFN can be either "missing" or it
belongs to DOMID_IO) or the p2m(m2p(mfn)) != mfn, we check
the M2P override. If we fail the M2P override check, we reset
the PFN value to INVALID_P2M_ENTRY.
Next we try to find the MFN in the P2M using the MFN
value (not the PFN value) and if found, we know
that this MFN is an identity value and return it as so.
Otherwise we have exhausted all the posibilities and we
return the PFN, which at this stage can either be a real
PFN value found in the machine_to_phys.. array, or
INVALID_P2M_ENTRY value.
[v1: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Fri, 14 Jan 2011 22:55:44 +0000 (17:55 -0500)]
xen/m2p: No need to catch exceptions when we know that there is no RAM
.. beyound what we think is the end of memory. However there might
be more System RAM - but assigned to a guest. Hence jump to the
M2P override check and consult.
[v1: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Thu, 23 Dec 2010 21:25:29 +0000 (16:25 -0500)]
xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set.
Only enabled if XEN_DEBUG is enabled. We print a warning
when:
pfn_to_mfn(pfn) == pfn, but no VM_IO (_PAGE_IOMAP) flag set
(and pfn is an identity mapped pfn)
pfn_to_mfn(pfn) != pfn, and VM_IO flag is set.
(ditto, pfn is an identity mapped pfn)
[v2: Make it dependent on CONFIG_XEN_DEBUG instead of ..DEBUG_FS]
[v3: Fix compiler warning]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 22 Dec 2010 13:57:30 +0000 (08:57 -0500)]
xen/debugfs: Add 'p2m' file for printing out the P2M layout.
We walk over the whole P2M tree and construct a simplified view of
which PFN regions belong to what level and what type they are.
Only enabled if CONFIG_XEN_DEBUG_FS is set.
[v2: UNKN->UNKNOWN, use uninitialized_var]
[v3: Rebased on top of mmu->p2m code split]
[v4: Fixed the else if]
Reviewed-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Tue, 1 Feb 2011 22:15:30 +0000 (17:15 -0500)]
xen/setup: Set identity mapping for non-RAM E820 and E820 gaps.
We walk the E820 region and start at 0 (for PV guests we start
at ISA_END_ADDRESS) and skip any E820 RAM regions. For all other
regions and as well the gaps we set them to be identity mappings.
The reasons we do not want to set the identity mapping from 0->
ISA_END_ADDRESS when running as PV is b/c that the kernel would
try to read DMI information and fail (no permissions to read that).
There is a lot of gnarly code to deal with that weird region so
we won't try to do a cleanup in this patch.
This code ends up calling 'set_phys_to_identity' with the start
and end PFN of the the E820 that are non-RAM or have gaps.
On 99% of machines that means one big region right underneath the
4GB mark. Usually starts at 0xc0000 (or 0x80000) and goes to
0x100000.
[v2: Fix for E820 crossing 1MB region and clamp the start]
[v3: Squshed in code that does this over ranges]
[v4: Moved the comment to the correct spot]
[v5: Use the "raw" E820 from the hypervisor]
[v6: Added Review-by tag]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 19 Jan 2011 01:17:10 +0000 (20:17 -0500)]
xen/mmu: WARN_ON when racing to swap middle leaf.
The initial bootup code uses set_phys_to_machine quite a lot, and after
bootup it would be used by the balloon driver. The balloon driver does have
mutex lock so this should not be necessary - but just in case, add
a WARN_ON if we do hit this scenario. If we do fail this, it is OK
to continue as there is a backup mechanism (VM_IO) that can bypass
the P2M and still set the _PAGE_IOMAP flags.
[v2: Change from WARN to BUG_ON]
[v3: Rebased on top of xen->p2m code split]
[v4: Change from BUG_ON to WARN]
Reviewed-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 5 Jan 2011 20:46:31 +0000 (15:46 -0500)]
xen/mmu: Set _PAGE_IOMAP if PFN is an identity PFN.
If we find that the PFN is within the P2M as an identity
PFN make sure to tack on the _PAGE_IOMAP flag.
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 19 Jan 2011 01:15:21 +0000 (20:15 -0500)]
xen/mmu: Add the notion of identity (1-1) mapping.
Our P2M tree structure is a three-level. On the leaf nodes
we set the Machine Frame Number (MFN) of the PFN. What this means
is that when one does: pfn_to_mfn(pfn), which is used when creating
PTE entries, you get the real MFN of the hardware. When Xen sets
up a guest it initially populates a array which has descending
(or ascending) MFN values, as so:
idx: 0, 1, 2
[0x290F, 0x290E, 0x290D, ..]
so pfn_to_mfn(2)==0x290D. If you start, restart many guests that list
starts looking quite random.
We graft this structure on our P2M tree structure and stick in
those MFN in the leafs. But for all other leaf entries, or for the top
root, or middle one, for which there is a void entry, we assume it is
"missing". So
pfn_to_mfn(0xc0000)=INVALID_P2M_ENTRY.
We add the possibility of setting 1-1 mappings on certain regions, so
that:
pfn_to_mfn(0xc0000)=0xc0000
The benefit of this is, that we can assume for non-RAM regions (think
PCI BARs, or ACPI spaces), we can create mappings easily b/c we
get the PFN value to match the MFN.
For this to work efficiently we introduce one new page p2m_identity and
allocate (via reserved_brk) any other pages we need to cover the sides
(1GB or 4MB boundary violations). All entries in p2m_identity are set to
INVALID_P2M_ENTRY type (Xen toolstack only recognizes that and MFNs,
no other fancy value).
On lookup we spot that the entry points to p2m_identity and return the identity
value instead of dereferencing and returning INVALID_P2M_ENTRY. If the entry
points to an allocated page, we just proceed as before and return the PFN.
If the PFN has IDENTITY_FRAME_BIT set we unmask that in appropriate functions
(pfn_to_mfn).
The reason for having the IDENTITY_FRAME_BIT instead of just returning the
PFN is that we could find ourselves where pfn_to_mfn(pfn)==pfn for a
non-identity pfn. To protect ourselves against we elect to set (and get) the
IDENTITY_FRAME_BIT on all identity mapped PFNs.
This simplistic diagram is used to explain the more subtle piece of code.
There is also a digram of the P2M at the end that can help.
Imagine your E820 looking as so:
1GB 2GB
/-------------------+---------\/----\ /----------\ /---+-----\
| System RAM | Sys RAM ||ACPI| | reserved | | Sys RAM |
\-------------------+---------/\----/ \----------/ \---+-----/
^- 1029MB ^- 2001MB
[1029MB = 263424 (0x40500), 2001MB = 512256 (0x7D100), 2048MB = 524288 (0x80000)]
And dom0_mem=max:3GB,1GB is passed in to the guest, meaning memory past 1GB
is actually not present (would have to kick the balloon driver to put it in).
When we are told to set the PFNs for identity mapping (see patch: "xen/setup:
Set identity mapping for non-RAM E820 and E820 gaps.") we pass in the start
of the PFN and the end PFN (263424 and 512256 respectively). The first step is
to reserve_brk a top leaf page if the p2m[1] is missing. The top leaf page
covers 512^2 of page estate (1GB) and in case the start or end PFN is not
aligned on 512^2*PAGE_SIZE (1GB) we loop on aligned 1GB PFNs from start pfn to
end pfn. We reserve_brk top leaf pages if they are missing (means they point
to p2m_mid_missing).
With the E820 example above, 263424 is not 1GB aligned so we allocate a
reserve_brk page which will cover the PFNs estate from 0x40000 to 0x80000.
Each entry in the allocate page is "missing" (points to p2m_missing).
Next stage is to determine if we need to do a more granular boundary check
on the 4MB (or 2MB depending on architecture) off the start and end pfn's.
We check if the start pfn and end pfn violate that boundary check, and if
so reserve_brk a middle (p2m[x][y]) leaf page. This way we have a much finer
granularity of setting which PFNs are missing and which ones are identity.
In our example 263424 and 512256 both fail the check so we reserve_brk two
pages. Populate them with INVALID_P2M_ENTRY (so they both have "missing" values)
and assign them to p2m[1][2] and p2m[1][488] respectively.
At this point we would at minimum reserve_brk one page, but could be up to
three. Each call to set_phys_range_identity has at maximum a three page
cost. If we were to query the P2M at this stage, all those entries from
start PFN through end PFN (so 1029MB -> 2001MB) would return INVALID_P2M_ENTRY
("missing").
The next step is to walk from the start pfn to the end pfn setting
the IDENTITY_FRAME_BIT on each PFN. This is done in 'set_phys_range_identity'.
If we find that the middle leaf is pointing to p2m_missing we can swap it over
to p2m_identity - this way covering 4MB (or 2MB) PFN space. At this point we
do not need to worry about boundary aligment (so no need to reserve_brk a middle
page, figure out which PFNs are "missing" and which ones are identity), as that
has been done earlier. If we find that the middle leaf is not occupied by
p2m_identity or p2m_missing, we dereference that page (which covers
512 PFNs) and set the appropriate PFN with IDENTITY_FRAME_BIT. In our example
263424 and 512256 end up there, and we set from p2m[1][2][256->511] and
p2m[1][488][0->256] with IDENTITY_FRAME_BIT set.
All other regions that are void (or not filled) either point to p2m_missing
(considered missing) or have the default value of INVALID_P2M_ENTRY (also
considered missing). In our case, p2m[1][2][0->255] and p2m[1][488][257->511]
contain the INVALID_P2M_ENTRY value and are considered "missing."
This is what the p2m ends up looking (for the E820 above) with this
fabulous drawing:
p2m /--------------\
/-----\ | &mfn_list[0],| /-----------------\
| 0 |------>| &mfn_list[1],| /---------------\ | ~0, ~0, .. |
|-----| | ..., ~0, ~0 | | ~0, ~0, [x]---+----->| IDENTITY [@256] |
| 1 |---\ \--------------/ | [p2m_identity]+\ | IDENTITY [@257] |
|-----| \ | [p2m_identity]+\\ | .... |
| 2 |--\ \-------------------->| ... | \\ \----------------/
|-----| \ \---------------/ \\
| 3 |\ \ \\ p2m_identity
|-----| \ \-------------------->/---------------\ /-----------------\
| .. +->+ | [p2m_identity]+-->| ~0, ~0, ~0, ... |
\-----/ / | [p2m_identity]+-->| ..., ~0 |
/ /---------------\ | .... | \-----------------/
/ | IDENTITY[@0] | /-+-[x], ~0, ~0.. |
/ | IDENTITY[@256]|<----/ \---------------/
/ | ~0, ~0, .... |
| \---------------/
|
p2m_missing p2m_missing
/------------------\ /------------\
| [p2m_mid_missing]+---->| ~0, ~0, ~0 |
| [p2m_mid_missing]+---->| ..., ~0 |
\------------------/ \------------/
where ~0 is INVALID_P2M_ENTRY. IDENTITY is (PFN | IDENTITY_BIT)
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
[v5: Changed code to use ranges, added ASCII art]
[v6: Rebased on top of xen->p2m code split]
[v4: Squished patches in just this one]
[v7: Added RESERVE_BRK for potentially allocated pages]
[v8: Fixed alignment problem]
[v9: Changed 1<<3X to 1<<BITS_PER_LONG-X]
[v10: Copied git commit description in the p2m code + Add Review tag]
[v11: Title had '2-1' - should be '1-1' mapping]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
David Howells [Mon, 14 Mar 2011 14:49:44 +0000 (14:49 +0000)]
MN10300: atomic_read() should ensure it emits a load
atomic_read() needs to ensure that it emits a load (which it can do by using
ACCESS_ONCE()).
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Mon, 14 Mar 2011 14:45:29 +0000 (14:45 +0000)]
MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
The invalidate-only versions of flush_icache_*range() are trying sending the
SMP_ICACHE_INV_FLUSH_RANGE IPI command in SMP kernels when they should be
sending SMP_ICACHE_INV_RANGE as the former does not exist.
Signed-off-by: David Howells <dhowells@redhat.com>
Tkhai Kirill [Mon, 14 Mar 2011 13:27:46 +0000 (13:27 +0000)]
MN10300: Proper use of macros get_user() in the case of incremented pointers
Using __get_user_check(x, ptr++, size) leads to double increment of pointer.
This macro uses the macro get_user directly, which itself is used in this way
(get_user(x, ptr++)) in some functions of the kernel. The patch fixes the
error.
Reported-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David Howells <dhowells@redhat.com>
Sebastian Andrzej Siewior [Mon, 14 Mar 2011 09:33:40 +0000 (10:33 +0100)]
x86: ce4100: Set pci ops via callback instead of module init
Setting the pci ops on subsys initcall unconditionally will break
multi platform kernels on anything except ce4100.
Use x86_init.pci.init ops to call this only on real ce4100 platforms.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: sodaville@linutronix.de
LKML-Reference: <
20110314093340.GA21026@www.tglx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Axel Lin [Wed, 2 Mar 2011 03:49:44 +0000 (11:49 +0800)]
watchdog: hpwdt: eliminate section mismatch warning
hpwdt_init_nmi_decoding() is called in hpwdt_init_one error handling,
thus remove the __devexit annotation of hpwdt_exit_nmi_decoding().
This patch fixes below warning:
WARNING: drivers/watchdog/hpwdt.o(.devinit.text+0x36f): Section mismatch in reference from the function hpwdt_init_one() to the function .devexit.text:hpwdt_exit_nmi_decoding()
The function __devinit hpwdt_init_one() references
a function __devexit hpwdt_exit_nmi_decoding().
This is often seen when error handling in the init function
uses functionality in the exit path.
The fix is often to remove the __devexit annotation of
hpwdt_exit_nmi_decoding() so it may be used outside an exit section.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Thomas Mingarelli <Thomas.Mingarelli@hp.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Wim Van Sebroeck [Mon, 21 Feb 2011 19:28:58 +0000 (19:28 +0000)]
watchdog: w83697ug_wdt: Fix set bit 0 to activate GPIO2
outb_p(c || 0x01, WDT_EFDR); -> || should be |
Reported-By: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Dan Carpenter [Wed, 23 Feb 2011 20:26:01 +0000 (23:26 +0300)]
watchdog: sch311x_wdt: fix printk condition
"==" has higher precedence than "&". Since
if (sch311x_sio_inb(sio_config_port, 0x30) & (0x01 == 0)) is always
false the message is never printed.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Wim Van Sebroeck [Mon, 21 Feb 2011 19:09:40 +0000 (19:09 +0000)]
watchdog: sch311x_wdt: Fix LDN active check
if (sch311x_sio_inb(sio_config_port, 0x30) && 0x01 == 0) -> && should be &
Reported-By: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Wim Van Sebroeck [Mon, 21 Feb 2011 10:52:43 +0000 (10:52 +0000)]
watchdog: cpwd: Fix buffer-overflow
cppcheck-1.47 reports:
[drivers/watchdog/cpwd.c:650]: (error) Buffer access out-of-bounds: p.devs
The source code is
for (i = 0; i < 4; i++) {
misc_deregister(&p->devs[i].misc);
where devs is defined as WD_NUMDEVS big and WD_NUMDEVS is equal to 3.
So the 4 should be a 3 or WD_NUMDEVS.
Reported-By: David Binderman
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Dave Airlie [Sun, 13 Mar 2011 23:47:24 +0000 (09:47 +1000)]
drm/radeon: fix problem with changing active VRAM size. (v2)
So we used to use lpfn directly to restrict VRAM when we couldn't
access the unmappable area, however this was removed in
93225b0d7bc030f4a93165347a65893685822d70 as it also restricted
the gtt placements. However it was only later noticed that this
broke on some hw.
This removes the active_vram_size, and just explicitly sets it
when it changes, TTM/drm_mm will always use the real_vram_size,
and the active vram size will change the TTM size used for lpfn
setting.
We should re-work the fpfn/lpfn to per-placement at some point
I suspect, but that is too late for this kernel.
Hopefully this addresses:
https://bugs.freedesktop.org/show_bug.cgi?id=35254
v2: fix reported useful VRAM size to userspace to be correct.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Al Viro [Sun, 13 Mar 2011 23:24:46 +0000 (23:24 +0000)]
compat breakage in preadv() and pwritev()
Fix for a dumb preadv()/pwritev() compat bug - unlike the native
variants, the compat_... ones forget to check FMODE_P{READ,WRITE}, so
e.g. on pipe the native preadv() will fail with -ESPIPE and compat one
will act as readv() and succeed.
Not critical, but it's a clear bug with trivial fix, so IMO it's OK for
-final.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 13 Mar 2011 23:01:11 +0000 (16:01 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
hwmon/f71882fg: Set platform drvdata to NULL later
hwmon/f71882fg: Fix a typo in a comment
Linus Torvalds [Sun, 13 Mar 2011 23:00:49 +0000 (16:00 -0700)]
Merge git://git./linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: break out of shrink_delalloc earlier
btrfs: fix not enough reserved space
btrfs: fix dip leak
Btrfs: make sure not to return overlapping extents to fiemap
Btrfs: deal with short returns from copy_from_user
Btrfs: fix regressions in copy_from_user handling
Linus Torvalds [Sun, 13 Mar 2011 23:00:28 +0000 (16:00 -0700)]
Merge git://git./linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] target: Fix t_transport_aborted handling in LUN_RESET + active I/O shutdown
Michal Marek [Fri, 11 Mar 2011 21:34:47 +0000 (22:34 +0100)]
kbuild: Fix computing srcversion for modules
Recent change to fixdep:
commit
b7bd182176960fdd139486cadb9962b39f8a2b50
Author: Michal Marek <mmarek@suse.cz>
Date: Thu Feb 17 15:13:54 2011 +0100
fixdep: Do not record dependency on the source file itself
changed the format of the *.cmd files without realizing that it is also
used by modpost. Put the path to the source file to the file back, in a
special variable, so that modpost sees all source files when calculating
srcversion for modules.
Reported-and-tested-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 13 Mar 2011 22:56:22 +0000 (15:56 -0700)]
Merge git://git.infradead.org/users/dwmw2/mtd-2.6.38
* git://git.infradead.org/users/dwmw2/mtd-2.6.38:
mtd: add "platform:" prefix for platform modalias
mtd: mtd_blkdevs: fix double free on error path
mtd: amd76xrom: fix oops at boot when resources are not available
mtd: fix race in cfi_cmdset_0001 driver
mtd: jedec_probe: initialise make sector erase command variable
mtd: jedec_probe: Change variable name from cfi_p to cfi
Linus Torvalds [Sun, 13 Mar 2011 22:52:48 +0000 (15:52 -0700)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon: fix page flipping hangs on r300/r400
drm/radeon: add pageflip hooks for fusion
Linus Torvalds [Sun, 13 Mar 2011 22:50:37 +0000 (15:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: fix mis-synchronisation in blkdev_issue_zeroout()
Linus Torvalds [Sun, 13 Mar 2011 22:50:01 +0000 (15:50 -0700)]
Merge branch 'fix/asoc' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: Ensure WM8958 gets all WM8994 late revision widgets
ASoC: Fix typo in late revision WM8994 DAC2R name
ASoC: Use the correct DAPM context when cleaning up final widget set
ASoC: Fix broken bitfield definitions in WM8978
ASoC: AM3517: Update codec name after multi-component update
Axel Lin [Fri, 11 Mar 2011 22:58:30 +0000 (14:58 -0800)]
gpio: add MODULE_DEVICE_TABLE
The device table is required to load modules based on modaliases.
After adding MODULE_DEVICE_TABLE, below entries will be added to
modules.pcimap:
pch_gpio 0x00008086 0x00008803 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
ml_ioh_gpio 0x000010db 0x0000802e 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: Tomoya MORINAGA <tomoya-linux@dsn.okisemi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Fri, 11 Mar 2011 22:58:29 +0000 (14:58 -0800)]
thp: fix page_referenced to modify mapcount/vm_flags only if page is found
When vmscan.c calls page_referenced(), if an anon page was created
before a process forked, rmap will search for it in both of the
processes, even though one of them might have since broken COW.
If the child process mlocks the vma where the COWed page belongs to,
page_referenced() running on the page mapped by the parent would lead to
*vm_flags getting VM_LOCKED set erroneously (leading to the references
on the parent page being ignored and evicting the parent page too
early).
*mapcount would also be decremented by page_referenced_one even if the
page wasn't found by page_check_address.
This also lets pmdp_clear_flush_young_notify() go ahead on a
pmd_trans_splitting() pmd.
We hold the page_table_lock so __split_huge_page_map() must wait the
pmdp_clear_flush_young_notify() to complete before it can modify the
pmd. The pmd is also still mapped in userland so the young bit may
materialize through a tlb miss before split_huge_page_map runs.
This will provide a more accurate page_referenced() behavior during
split_huge_page().
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel<riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hans de Goede [Sun, 13 Mar 2011 12:50:33 +0000 (13:50 +0100)]
hwmon/f71882fg: Set platform drvdata to NULL later
This avoids a possible race leading to trying to dereference NULL.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Hans de Goede [Sun, 13 Mar 2011 12:50:32 +0000 (13:50 +0100)]
hwmon/f71882fg: Fix a typo in a comment
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Dave Airlie [Fri, 11 Mar 2011 11:17:41 +0000 (21:17 +1000)]
drm/radeon: fix page flipping hangs on r300/r400
We've been getting reports of complete system lockups with rv3xx hw on
AGP and PCIE when running gnome-shell or kwin with compositing.
It appears the hw really doesn't like setting these registers while
stuff is running, this moves the setting of the registers into the modeset
since they aren't required to be changed anywhere else.
fixes: https://bugs.freedesktop.org/show_bug.cgi?id=35183
Reported-and-tested-by: Álmos <aaalmosss@gmail.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Chris Mason [Sat, 12 Mar 2011 12:08:42 +0000 (07:08 -0500)]
Btrfs: break out of shrink_delalloc earlier
Josef had changed shrink_delalloc to exit after three shrink
attempts, which wasn't quite enough because new writers could
race in and steal free space.
But it also fixed deadlocks and stalls as we tried to recover
delalloc reservations. The code was tweaked to loop 1024
times, and would reset the counter any time a small amount
of progress was made. This was too drastic, and with a
lot of writers we can end up stuck in shrink_delalloc forever.
The shrink_delalloc loop is fairly complex because the caller is looping
too, and the caller will go ahead and force a transaction commit to make
sure we reclaim space.
This reworks things to exit shrink_delalloc when we've forced some
writeback and the delalloc reservations have gone down. This means
the writeback has not just started but has also finished at
least some of the metadata changes required to reclaim delalloc
space.
If we've got this wrong, we're returning ENOSPC too early, which
is a big improvement over the current behavior of hanging the machine.
Test 224 in xfstests hammers on this nicely, and with 1000 writers
trying to fill a 1GB drive we get our first ENOSPC at 93% full. The
other writers are able to continue until we get 100%.
This is a worst case test for btrfs because the 1000 writers are doing
small IO, and the small FS size means we don't have a lot of room
for metadata chunks.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chuck Lever [Fri, 11 Mar 2011 20:31:06 +0000 (15:31 -0500)]
NFS: NFSROOT should default to "proto=udp"
There have been a number of recent reports that NFSROOT is no longer
working with default mount options, but fails only with certain NICs.
Brian Downing <bdowning@lavos.net> bisected to commit
56463e50 "NFS:
Use super.c for NFSROOT mount option parsing". Among other things,
this commit changes the default mount options for NFSROOT to use TCP
instead of UDP as the underlying transport.
TCP seems less able to deal with NICs that are slow to initialize.
The system logs that have accompanied reports of problems all show
that NFSROOT attempts to establish a TCP connection before the NIC is
fully initialized, and thus the TCP connection attempt fails.
When a TCP connection attempt fails during a mount operation, the
NFS stack needs to fail the operation. Usually user space knows how
and when to retry it. The network layer does not report a distinct
error code for this particular failure mode. Thus, there isn't a
clean way for the RPC client to see that it needs to retry in this
case, but not in others.
Because NFSROOT is used in some environments where it is not possible
to update the kernel command line to specify "udp", the proper thing
to do is change NFSROOT to use UDP by default, as it did before commit
56463e50.
To make it easier to see how to change default mount options for
NFSROOT and to distinguish default settings from mandatory settings,
I've adjusted a couple of areas to document the specifics.
root_nfs_cat() is also modified to deal with commas properly when
concatenating strings containing mount option lists. This keeps
root_nfs_cat() call sites simpler, now that we may be concatenating
multiple mount option strings.
Tested-by: Brian Downing <bdowning@lavos.net>
Tested-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@kernel.org> # 2.6.37
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Huang Weiyi [Tue, 8 Mar 2011 23:11:30 +0000 (23:11 +0000)]
nfs4: remove duplicated #include
Remove duplicated #include('s) in
fs/nfs/nfs4proc.c
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 9 Mar 2011 21:12:46 +0000 (16:12 -0500)]
NFSv4: nfs4_state_mark_reclaim_nograce() should be static
There are no more external users of nfs4_state_mark_reclaim_nograce() or
nfs4_state_mark_reclaim_reboot(), so mark them as static.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 9 Mar 2011 21:00:56 +0000 (16:00 -0500)]
NFSv4: Fix the setlk error handler
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 9 Mar 2011 21:00:55 +0000 (16:00 -0500)]
NFSv4.1: Fix the handling of the SEQUENCE status bits
We want SEQUENCE status bits to be handled by the state manager in order
to avoid threading issues.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Wed, 9 Mar 2011 21:00:53 +0000 (16:00 -0500)]
NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
nfs4_schedule_state_recovery() should only be used when we need to force
the state manager to check the lease. If we just want to start the
state manager in order to handle a state recovery situation, we should be
using nfs4_schedule_state_manager().
This patch fixes the abuses of nfs4_schedule_state_recovery() by replacing
its use with a set of helper functions that do the right thing.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Konrad Rzeszutek Wilk [Thu, 10 Mar 2011 03:03:16 +0000 (22:03 -0500)]
xen/e820: Don't mark balloon memory as E820_UNUSABLE when running as guest and fix overflow.
If we have a guest that asked for:
memory=1024
maxmem=2048
Which means we want 1GB now, and create pagetables so that we can expand
up to 2GB, we would have this E820 layout:
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen:
0000000000000000 -
00000000000a0000 (usable)
[ 0.000000] Xen:
00000000000a0000 -
0000000000100000 (reserved)
[ 0.000000] Xen:
0000000000100000 -
0000000080800000 (usable)
Due to patch: "xen/setup: Inhibit resource API from using System RAM E820 gaps as PCI mem gaps."
we would mark the memory past the 1GB mark as unusuable resulting in:
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen:
0000000000000000 -
00000000000a0000 (usable)
[ 0.000000] Xen:
00000000000a0000 -
0000000000100000 (reserved)
[ 0.000000] Xen:
0000000000100000 -
0000000040000000 (usable)
[ 0.000000] Xen:
0000000040000000 -
0000000080800000 (unusable)
which meant that we could not balloon up anymore. We could
balloon the guest down. The fix is to run the code introduced
by the above mentioned patch only for the initial domain.
We will have to revisit this once we start introducing a modified
E820 for PCI passthrough so that we can utilize the P2M identity code.
We also fix an overflow by having UL instead of ULL on 32-bit machines.
[v2: Ian pointed to the overflow issue]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Lukas Czerner [Fri, 11 Mar 2011 09:23:53 +0000 (10:23 +0100)]
block: fix mis-synchronisation in blkdev_issue_zeroout()
BZ29402
https://bugzilla.kernel.org/show_bug.cgi?id=29402
We can hit serious mis-synchronization in bio completion path of
blkdev_issue_zeroout() leading to a panic.
The problem is that when we are going to wait_for_completion() in
blkdev_issue_zeroout() we check if the bb.done equals issued (number of
submitted bios). If it does, we can skip the wait_for_completition()
and just out of the function since there is nothing to wait for.
However, there is a ordering problem because bio_batch_end_io() is
calling atomic_inc(&bb->done) before complete(), hence it might seem to
blkdev_issue_zeroout() that all bios has been completed and exit. At
this point when bio_batch_end_io() is going to call complete(bb->wait),
bb and wait does not longer exist since it was allocated on stack in
blkdev_issue_zeroout() ==> panic!
(thread 1) (thread 2)
bio_batch_end_io() blkdev_issue_zeroout()
if(bb) { ...
if (bb->end_io) ...
bb->end_io(bio, err); ...
atomic_inc(&bb->done); ...
... while (issued != atomic_read(&bb.done))
... (let issued == bb.done)
... (do the rest of the function)
... return ret;
complete(bb->wait);
^^^^^^^^
panic
We can fix this easily by simplifying bio_batch and completion counting.
Also remove bio_end_io_t *end_io since it is not used.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Eric Whitney <eric.whitney@hp.com>
Tested-by: Eric Whitney <eric.whitney@hp.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
CC: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Axel Lin [Mon, 7 Mar 2011 03:04:24 +0000 (11:04 +0800)]
mtd: add "platform:" prefix for platform modalias
Since
43cc71eed1250755986da4c0f9898f9a635cb3bf (platform: prefix MODALIAS
with "platform:"), the platform modalias is prefixed with "platform:".
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Maxim Levitsky [Sat, 8 Jan 2011 23:25:06 +0000 (01:25 +0200)]
mtd: mtd_blkdevs: fix double free on error path
This one liner patch fixes double free that will occur if add_mtd_blktrans_dev
fails. On failure it frees the input argument, but all its users also free it
on error which is natural thing to do. Thus don't free it.
All credit for finding that bug belongs to reporters of the bug in the android bugzilla
http://code.google.com/p/android/issues/detail?id=13761
Commit message tweaked by Artem.
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Stanislaw Gruszka [Sat, 8 Jan 2011 14:24:37 +0000 (15:24 +0100)]
mtd: amd76xrom: fix oops at boot when resources are not available
For some unknown reasons resources needed by amd76xrom driver can be
unavailable. And instead of returning an error, the driver keeps going
and crash the kernel. This patch fixes the problem by making the driver
return -EBUSY if the resources are not available.
Commit messages tweaked by Artem.
Reported-by: Russell Whitaker <russ@ashlandhome.net>
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Joakim Tjernlund [Mon, 7 Feb 2011 16:07:11 +0000 (17:07 +0100)]
mtd: fix race in cfi_cmdset_0001 driver
As inval_cache_and_wait_for_operation() drop and reclaim the lock
to invalidate the cache, some other thread may suspend the operation
before reaching the for(;;) loop. Therefore the loop must start with
checking the chip->state before reading status from the chip.
Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Acked-by: Michael Cashwell <mboards@prograde.net>
Acked-by: Stefan Bigler <stefan.bigler@keymile.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: stable@kernel.org
Antony Pavlov [Fri, 11 Feb 2011 10:00:37 +0000 (13:00 +0300)]
mtd: jedec_probe: initialise make sector erase command variable
In the commit
08968041bef437ec363623cd3218c2b083537ada
(mtd: cfi_cmdset_0002: make sector erase command variable)
introdused a field sector_erase_cmd. In the same commit initialisation
of cfi->sector_erase_cmd made in cfi_chip_setup()
(file drivers/mtd/chips/cfi_probe.c), so the CFI chip has no problem:
...
cfi->cfi_mode = CFI_MODE_CFI;
cfi->sector_erase_cmd = CMD(0x30);
...
But for the JEDEC chips this initialisation is not carried out,
so the JEDEC chips have sector_erase_cmd == 0.
This patch adds the missing initialisation.
Signed-off-by: Antony Pavlov <antony@niisi.msk.ru>
Acked-by: Guillaume LECERF <glecerf@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
CC: stable@kernel.org
Antony Pavlov [Fri, 11 Feb 2011 10:00:37 +0000 (13:00 +0300)]
mtd: jedec_probe: Change variable name from cfi_p to cfi
In the following commit, we'll need to use the CMD() macro in order to
fix the initialisation of the sector_erase_cmd field. That requires the
local variable to be called 'cfi', so change it first in a simple patch.
Signed-off-by: Antony Pavlov <antony@niisi.msk.ru>
Acked-by: Guillaume LECERF <glecerf@gmail.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
CC: stable@kernel.org
Dave Airlie [Fri, 11 Mar 2011 00:04:23 +0000 (10:04 +1000)]
drm/radeon: add pageflip hooks for fusion
Looks like these got passed over with both being merged at the same
time but not quite meeting in the middle.
should fix: https://bugs.freedesktop.org/show_bug.cgi?id=34137
along with Michael's phoronix article.
Reported-by: Chi-Thanh Christopher Nguyen
Article-written-by: Michael Larabel @ phoronix
Signed-off-by: Dave Airlie <airlied@redhat.com>
Linus Torvalds [Fri, 11 Mar 2011 00:30:21 +0000 (16:30 -0800)]
Merge git://git./linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
ariadne: remove redundant NULL check
ip6ip6: autoload ip6 tunnel
net: bridge builtin vs. ipv6 modular
ipv6: Don't create clones of host routes.
pktgen: fix errata in show results
ipv4: Fix erroneous uses of ifa_address.
vxge: update MAINTAINERS
r6040: bump to version 0.27 and date
23Feb2011
r6040: fix multicast operations
rds: prevent BUG_ON triggering on congestion map updates
bonding 802.3ad: Rename rx_machine_lock to state_machine_lock
bonding 802.3ad: Fix the state machine locking v2
drivers/net/macvtap: fix error check
net: fix multithreaded signal handling in unix recv routines
net: Enter net/ipv6/ even if CONFIG_IPV6=n
net/smsc911x.c: Set the VLAN1 register to fix VLAN MTU problem
bnx2x: fix MaxBW configuration
bnx2x: (NPAR) prevent HW access in D3 state
bnx2x: fix link notification
bnx2x: fix non-pmf device load flow
Doing my first --no-ff merge here, to get the explicit merge commit.
David did a back-merge in order to get commit
8909c9ad8ff0 ("net: don't
allow CAP_NET_ADMIN to load non-netdev kernel modules") so that we can
add Stephen Hemminger's fix to handle ip6 tunnels as well, which uses
the MODULE_ALIAS_NETDEV() macro created by that change.
j223yang@asset.uwaterloo.ca [Thu, 10 Mar 2011 12:36:37 +0000 (12:36 +0000)]
ariadne: remove redundant NULL check
Simply remove redundant 'dev' NULL check.
Signed-off-by: Jinqiu Yang <crindy646@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Thu, 10 Mar 2011 11:43:19 +0000 (11:43 +0000)]
ip6ip6: autoload ip6 tunnel
Add necessary alias to autoload ip6ip6 tunnel module.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 10 Mar 2011 22:00:44 +0000 (14:00 -0800)]
Merge branch 'master' of /home/davem/src/GIT/linux-2.6/
Randy Dunlap [Thu, 10 Mar 2011 21:45:57 +0000 (13:45 -0800)]
net: bridge builtin vs. ipv6 modular
When configs BRIDGE=y and IPV6=m, this build error occurs:
br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr'
BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding
depends on IPV6 || IPV6=n
to BRIDGE_IGMP_SNOOPING would be a good fix. As it is currently,
making BRIDGE depend on the IPV6 config works.
Reported-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 10 Mar 2011 21:22:10 +0000 (13:22 -0800)]
Merge branch 'media_fixes' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
[media] mantis_pci: remove asm/pgtable.h include
[media] tda829x: fix regression in probe functions
[media] mceusb: don't claim multifunction device non-IR parts
[media] nuvoton-cir: fix wake from suspend
[media] cx18: Add support for Hauppauge HVR-1600 models with s5h1411
[media] ivtv: Fix corrective action taken upon DMA ERR interrupt to avoid hang
[media] cx25840: fix probing of cx2583x chips
[media] cx23885: Remove unused 'err:' labels to quiet compiler warning
[media] cx23885: Revert "Check for slave nack on all transactions"
[media] DiB7000M: add pid filtering
[media] Fix sysfs rc protocol lookup for rc-5-sz
[media] au0828: fix VBI handling when in V4L2 streaming mode
[media] ir-raw: Properly initialize the IR event (BZ#27202)
[media] s2255drv: firmware re-loading changes
[media] Fix double free of video_device in mem2mem_testdev
[media] DM04/QQBOX memcpy to const char fix
Doe, YiCheng [Thu, 10 Mar 2011 20:00:21 +0000 (14:00 -0600)]
ipmi: Fix IPMI errors due to timing problems
This patch fixes an issue in OpenIPMI module where sometimes an ABORT command
is sent after sending an IPMI request to BMC causing the IPMI request to fail.
Signed-off-by: YiCheng Doe <yicheng.doe@hp.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Tom Mingarelli <thomas.mingarelli@hp.com>
Tested-by: Andy Cress <andy.cress@us.kontron.com>
Tested-by: Mika Lansirine <Mika.Lansirinne@stonesoft.com>
Tested-by: Brian De Wolf <bldewolf@csupomona.edu>
Cc: Jean Michel Audet <Jean-Michel.Audet@ca.Kontron.com>
Cc: Jozef Sudelsky <jozef.sudolsky@elbiahosting.sk>
Acked-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 10 Mar 2011 21:16:01 +0000 (13:16 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fs/dcache: allow d_obtain_alias() to return unhashed dentries
Check for immutable/append flag in fallocate path
sysctl: the include of rcupdate.h is only needed in the kernel
fat: fix d_revalidate oopsen on NFS exports
jfs: fix d_revalidate oopsen on NFS exports
ocfs2: fix d_revalidate oopsen on NFS exports
gfs2: fix d_revalidate oopsen on NFS exports
fuse: fix d_revalidate oopsen on NFS exports
ceph: fix d_revalidate oopsen on NFS exports
reiserfs xattr ->d_revalidate() shouldn't care about RCU
/proc/self is never going to be invalidated...