Linus Torvalds [Fri, 18 Mar 2011 17:45:21 +0000 (10:45 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Flush TLB if PGD entry is changed in i386 PAE mode
x86, dumpstack: Correct stack dump info when frame pointer is available
x86: Clean up csum-copy_64.S a bit
x86: Fix common misspellings
x86: Fix misspelling and align params
x86: Use PentiumPro-optimized partial_csum() on VIA C7
Linus Torvalds [Fri, 18 Mar 2011 17:44:05 +0000 (10:44 -0700)]
Merge branches 'irq-fixes-for-linus' and 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix incorrect unlock in __setup_irq()
cris: Use generic show_interrupts()
genirq: show_interrupts: Check desc->name before printing it blindly
cris: Use accessor functions to set IRQ_PER_CPU flag
cris: Fix irq conversion fallout
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched, kernel-doc: Fix runqueue_is_locked() description
Linus Torvalds [Fri, 18 Mar 2011 17:38:34 +0000 (10:38 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
trace, filters: Initialize the match variable in process_ops() properly
trace, documentation: Fix branch profiling location in debugfs
oprofile, s390: Cleanups
oprofile, s390: Remove hwsampler_files.c and merge it into init.c
perf: Fix tear-down of inherited group events
perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
perf: Handle stopped state with tracepoints
perf: Fix the software events state check
perf, powerpc: Handle events that raise an exception without overflowing
perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
perf, x86: Clean up SandyBridge PEBS events
perf lock: Fix sorting by wait_min
perf tools: Version incorrect with some versions of grep
perf evlist: New command to list the names of events present in a perf.data file
perf script: Add support for H/W and S/W events
perf script: Add support for dumping symbols
perf script: Support custom field selection for output
perf script: Move printing of 'common' data from print_event and rename
perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
perf script: Change process_event prototype
...
Linus Torvalds [Fri, 18 Mar 2011 17:37:40 +0000 (10:37 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
Update cpuset info & webiste for cgroups
dcdbas: force SMI to happen when expected
arch/arm/Kconfig: remove one to many l's in the word.
asm-generic/user.h: Fix spelling in comment
drm: fix printk typo 'sracth'
Remove one to many n's in a word
Documentation/filesystems/romfs.txt: fixing link to genromfs
drivers:scsi Change printk typo initate -> initiate
serial, pch uart: Remove duplicate inclusion of linux/pci.h header
fs/eventpoll.c: fix spelling
mm: Fix out-of-date comments which refers non-existent functions
drm: Fix printk typo 'failled'
coh901318.c: Change initate to initiate.
mbox-db5500.c Change initate to initiate.
edac: correct i82975x error-info reported
edac: correct i82975x mci initialisation
edac: correct commented info
fs: update comments to point correct document
target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
...
Trivial conflict in fs/eventpoll.c (spelling vs addition)
Linus Torvalds [Fri, 18 Mar 2011 17:35:30 +0000 (10:35 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (48 commits)
HID: add support for Logitech Driving Force Pro wheel
HID: hid-ortek: remove spurious reference
HID: add support for Ortek PKB-1700
HID: roccat-koneplus: vorrect mode of sysfs attr 'sensor'
HID: hid-ntrig: init settle and mode check
HID: merge hid-egalax into hid-multitouch
HID: hid-multitouch: Send events per slot if CONTACTCOUNT is missing
HID: ntrig remove if and drop an indent
HID: ACRUX - activate the device immediately after binding
HID: ntrig: apply NO_INIT_REPORTS quirk
HID: hid-magicmouse: Correct touch orientation direction
HID: ntrig don't dereference unclaimed hidinput
HID: Do not create input devices for feature reports
HID: bt hidp: send Output reports using SET_REPORT on the Control channel
HID: hid-sony.c: Fix sending Output reports to the Sixaxis
HID: add support for Keytouch IEC 60945
HID: Add HID Report Descriptor to sysfs
HID: add IRTOUCH infrared USB to hid_have_special_driver
HID: kernel oops in out_cleanup in function hidinput_connect
HID: Add teletext/color keys - gyration remote - EU version (GYAR3101CKDE)
...
Ingo Molnar [Fri, 18 Mar 2011 13:41:27 +0000 (14:41 +0100)]
trace, filters: Initialize the match variable in process_ops() properly
Make sure the 'match' variable always has a value.
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Fri, 18 Mar 2011 13:31:43 +0000 (06:31 -0700)]
Merge branch 'next' of git://git./linux/kernel/git/benh/powerpc
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (62 commits)
powerpc/85xx: Fix signedness bug in cache-sram
powerpc/fsl: 85xx: document cache sram bindings
powerpc/fsl: define binding for fsl mpic interrupt controllers
powerpc/fsl_msi: Handle msi-available-ranges better
drivers/serial/ucc_uart.c: Add of_node_put to avoid memory leak
powerpc/85xx: Fix SPE float to integer conversion failure
powerpc/85xx: Update sata controller compatible for p1022ds board
ATA: Add FSL sata v2 controller support
powerpc/mpc8xxx_gpio: simplify searching for 'fsl, qoriq-gpio' compatiable
powerpc/8xx: remove obsolete mgsuvd board
powerpc/82xx: rename and update mgcoge board support
powerpc/83xx: rename and update kmeter1
powerpc/85xx: Workaroudn e500 CPU erratum A005
powerpc/fsl_pci: Add support for FSL PCIe controllers v2.x
powerpc/85xx: Fix writing to spin table 'cpu-release-addr' on ppc64e
powerpc/pseries: Disable MSI using new interface if possible
powerpc: Enable GENERIC_HARDIRQS_NO_DEPRECATED.
powerpc: core irq_data conversion.
powerpc: sysdev/xilinx_intc irq_data conversion.
powerpc: sysdev/uic irq_data conversion.
...
Fix up conflicts in arch/powerpc/sysdev/fsl_msi.c (due to getting rid of
of_platform_driver in arch/powerpc)
Shaohua Li [Wed, 16 Mar 2011 03:37:29 +0000 (11:37 +0800)]
x86: Flush TLB if PGD entry is changed in i386 PAE mode
According to intel CPU manual, every time PGD entry is changed in i386 PAE
mode, we need do a full TLB flush. Current code follows this and there is
comment for this too in the code.
But current code misses the multi-threaded case. A changed page table
might be used by several CPUs, every such CPU should flush TLB. Usually
this isn't a problem, because we prepopulate all PGD entries at process
fork. But when the process does munmap and follows new mmap, this issue
will be triggered.
When it happens, some CPUs keep doing page faults:
http://marc.info/?l=linux-kernel&m=
129915020508238&w=2
Reported-by: Yasunori Goto<y-goto@jp.fujitsu.com>
Tested-by: Yasunori Goto<y-goto@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Shaohua Li<shaohua.li@intel.com>
Cc: Mallick Asit K <asit.k.mallick@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm <linux-mm@kvack.org>
Cc: stable <stable@kernel.org>
LKML-Reference: <
1300246649.2337.95.camel@sli10-conroe>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Namhyung Kim [Fri, 18 Mar 2011 02:40:06 +0000 (11:40 +0900)]
x86, dumpstack: Correct stack dump info when frame pointer is available
Current stack dump code scans entire stack and check each entry
contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
could mark whether the pointer is valid or not based on value of
the frame pointer. Invalid entries could be preceded by '?' sign.
However this was not going to happen because scan start point
was always higher than the frame pointer so that they could not
meet.
Commit
9c0729dc8062 ("x86: Eliminate bp argument from the stack
tracing routines") delayed bp acquisition point, so the bp was
read in lower frame, thus all of the entries were marked
invalid.
This patch fixes this by reverting above commit while retaining
stack_frame() helper as suggested by Frederic Weisbecker.
End result looks like below:
before:
[ 3.508329] Call Trace:
[ 3.508551] [<
ffffffff814f35c9>] ? panic+0x91/0x199
[ 3.508662] [<
ffffffff814f3739>] ? printk+0x68/0x6a
[ 3.508770] [<
ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
[ 3.508876] [<
ffffffff81a9821f>] ? mount_root+0x56/0x5a
[ 3.508975] [<
ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
[ 3.509216] [<
ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
[ 3.509335] [<
ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
[ 3.509442] [<
ffffffff814f6880>] ? restore_args+0x0/0x30
[ 3.509542] [<
ffffffff81a97559>] ? kernel_init+0x0/0x1e2
[ 3.509641] [<
ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
after:
[ 3.522991] Call Trace:
[ 3.523351] [<
ffffffff814f35b9>] panic+0x91/0x199
[ 3.523468] [<
ffffffff814f3729>] ? printk+0x68/0x6a
[ 3.523576] [<
ffffffff81a981b2>] mount_block_root+0x257/0x26e
[ 3.523681] [<
ffffffff81a9821f>] mount_root+0x56/0x5a
[ 3.523780] [<
ffffffff81a98393>] prepare_namespace+0x170/0x1a9
[ 3.523885] [<
ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
[ 3.523987] [<
ffffffff81003894>] kernel_thread_helper+0x4/0x10
[ 3.524228] [<
ffffffff814f6880>] ? restore_args+0x0/0x30
[ 3.524345] [<
ffffffff81a97559>] ? kernel_init+0x0/0x1e2
[ 3.524445] [<
ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
-v5:
* fix build breakage with oprofile
-v4:
* use 0 instead of regs->bp
* separate out printk changes
-v3:
* apply comment from Frederic
* add a couple of printk fixes
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Soren Sandmann <ssp@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
LKML-Reference: <
1300416006-3163-1-git-send-email-namhyung@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 18 Mar 2011 09:42:11 +0000 (10:42 +0100)]
x86: Clean up csum-copy_64.S a bit
The many stray whitespaces and other uncleanlinesses made this code
almost unreadable to me - so fix those.
No changes to the code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lucas De Marchi [Thu, 17 Mar 2011 19:24:16 +0000 (16:24 -0300)]
x86: Fix common misspellings
They were generated by 'codespell' and then manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <
1300389856-1099-3-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lucas De Marchi [Thu, 17 Mar 2011 19:24:15 +0000 (16:24 -0300)]
x86: Fix misspelling and align params
Fix 'upto' misspelling and align parameters.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: trivial@kernel.org
LKML-Reference: <
1300389856-1099-2-git-send-email-lucas.demarchi@profusion.mobi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Fri, 18 Mar 2011 09:38:53 +0000 (10:38 +0100)]
Merge branch 'linus' into x86/urgent
Merge reason: Merge upstream commits to avoid conflicts in upcoming patches.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Fri, 18 Mar 2011 02:34:12 +0000 (19:34 -0700)]
Merge git://git./linux/kernel/git/cmetcalf/linux-tile
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (27 commits)
arch/tile: support newer binutils assembler shift semantics
arch/tile: fix deadlock bugs in rwlock implementation
drivers/edac: provide support for tile architecture
tile on-chip network driver: sync up with latest fixes
arch/tile: support 4KB page size as well as 64KB
arch/tile: add some more VMSPLIT options and use consistent naming
arch/tile: fix some comments and whitespace
arch/tile: export some additional module symbols
arch/tile: enhance existing finv_buffer_remote() routine
arch/tile: fix two bugs in the backtracer code
arch/tile: use extended assembly to inline __mb_incoherent()
arch/tile: use a cleaner technique to enable interrupt for cpu_idle()
arch/tile: sync up with <arch/sim.h> and <arch/sim_def.h> changes
arch/tile: fix reversed test of strict_strtol() return value
arch/tile: avoid a simulator warning during bootup
arch/tile: export <asm/hardwall.h> to userspace
arch/tile: warn and retry if an IPI is not accepted by the target cpu
arch/tile: stop disabling INTCTRL_1 interrupts during hypervisor downcalls
arch/tile: fix __ndelay etc to work better
arch/tile: bug fix: exec'ed task thought it was still single-stepping
...
Fix up trivial conflict in arch/tile/kernel/vmlinux.lds.S (percpu
alignment vs section naming convention fix)
Linus Torvalds [Fri, 18 Mar 2011 02:28:15 +0000 (19:28 -0700)]
Merge branch 'omap-for-linus' of git://git./linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (258 commits)
omap: zoom: host should not pull up wl1271's irq line
arm: plat-omap: iommu: fix request_mem_region() error path
OMAP2+: Common CPU DIE ID reading code reads wrong registers for OMAP4430
omap4: mux: Remove duplicate mux modes
omap: iovmm: don't check 'da' to set IOVMF_DA_FIXED flag
omap: iovmm: disallow mapping NULL address when IOVMF_DA_ANON is set
omap2+: mux: Fix compile when CONFIG_OMAP_MUX is not selected
omap4: board-omap4panda: Initialise the serial pads
omap3: board-3430sdp: Initialise the serial pads
omap4: board-4430sdp: Initialise the serial pads
omap2+: mux: Add macro for configuring static with omap_hwmod_mux_init
omap2+: mux: Remove the use of IDLE flag
omap2+: Add separate list for dynamic pads to mux
perf: add OMAP support for the new power events
OMAP4: Add IVA OPP enteries.
OMAP4: Update Voltage Rail Values for MPU, IVA and CORE
OMAP4: Enable 800 MHz and 1 GHz MPU-OPP
OMAP3+: OPP: Replace voltage values with Macros
OMAP3: wdtimer: Fix CORE idle transition
Watchdog: omap_wdt: add fine grain runtime-pm
...
Fix up various conflicts in
- arch/arm/mach-omap2/board-omap3evm.c
- arch/arm/mach-omap2/clock3xxx_data.c
- arch/arm/mach-omap2/usb-musb.c
- arch/arm/plat-omap/include/plat/usb.h
- drivers/usb/musb/musb_core.h
Linus Torvalds [Fri, 18 Mar 2011 02:13:18 +0000 (19:13 -0700)]
Merge branch 'for-linus' of git://codeaurora.org/quic/kernel/davidb/linux-msm
* 'for-linus' of git://codeaurora.org/quic/kernel/davidb/linux-msm: (46 commits)
msm: scm: Check for interruption immediately
msm: scm: Fix improper register assignment
msm: scm: Mark inline asm as volatile
msm: iommu: Enable HTW L2 redirection on MSM8960
msm: iommu: Don't read from write-only registers
msm: iommu: Remove dependency on IDR
msm: iommu: Use ASID tagging instead of VMID tagging
msm: iommu: Rework clock logic and add IOMMU bus clock control
msm: iommu: Clock control for the IOMMU driver
msm: mdp: Set the correct pack pattern for XRGB/ARGB
msm_fb: Fix framebuffer console
msm: mdp: Add support for RGBX 8888 image format.
video: msmfb: Put the partial update magic value into the fix_screen struct.
msm: clock: Migrate to clkdev
msm: clock: Remove references to clk_ops_pcom
msm: headsmp.S: Fix section mismatch
msm: Use explicit GPLv2 licenses
msm: iommu: Enable IOMMU support for MSM8960
msm: iommu: Generalize platform data for multiple targets
msm: iommu: Create a Kconfig item for the IOMMU driver
...
Linus Torvalds [Fri, 18 Mar 2011 02:08:06 +0000 (19:08 -0700)]
Merge branch 'devel-stable' of /home/rmk/linux-2.6-arm
* 'devel-stable' of master.kernel.org:/home/rmk/linux-2.6-arm: (289 commits)
davinci: DM644x EVM: register MUSB device earlier
davinci: add spi devices on tnetv107x evm
davinci: add ssp config for tnetv107x evm board
davinci: add tnetv107x ssp platform device
spi: add ti-ssp spi master driver
mfd: add driver for sequencer serial port
ARM: EXYNOS4: Implement Clock gating for System MMU
ARM: EXYNOS4: Enhancement of System MMU driver
ARM: EXYNOS4: Add support for gpio interrupts
ARM: S5P: Add function to register gpio interrupt bank data
ARM: S5P: Cleanup S5P gpio interrupt code
ARM: EXYNOS4: Add missing GPYx banks
ARM: S3C64XX: Fix section mismatch from cpufreq init
ARM: EXYNOS4: Add keypad device to the SMDKV310
ARM: EXYNOS4: Update clocks for keypad
ARM: EXYNOS4: Update keypad base address
ARM: EXYNOS4: Add keypad device helpers
ARM: EXYNOS4: Add support for SATA on ARMLEX4210
plat-nomadik: make GPIO interrupts work with cpuidle ApSleep
mach-u300: define a dummy filter function for coh901318
...
Fix up various conflicts in
- arch/arm/mach-exynos4/cpufreq.c
- arch/arm/mach-mxs/gpio.c
- drivers/net/Kconfig
- drivers/tty/serial/Kconfig
- drivers/tty/serial/Makefile
- drivers/usb/gadget/fsl_mxc_udc.c
- drivers/video/Kconfig
Linus Torvalds [Fri, 18 Mar 2011 01:48:35 +0000 (18:48 -0700)]
Merge branches 'defcfg', 'drivers' and 'cyberpro-next' of /home/rmk/linux-2.6-arm
* 'defcfg' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6647/1: add Versatile Express defconfig
ARM: 6644/1: mach-ux500: update the U8500 defconfig
* 'drivers' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6764/1: pl011: factor out FIFO to TTY code
ARM: 6763/1: pl011: add optional RX DMA to PL011 v2
ARM: 6758/1: amba: support pm ops
ARM: amba: make amba_driver id_table const
ARM: amba: make internal ID table handling const
ARM: amba: make probe() functions take const id tables
ARM: 6662/1: amba: make amba_bustype non-static
ARM: mmci: add dmaengine-based DMA support
ARM: mmci: no need for separate host->data_xfered
ARM: mmci: avoid unnecessary switch to data available PIO interrupts
ARM: mmci: no need to call flush_dcache_page() with sg_miter API
ARM: mmci: avoid reporting too many completed bytes on fifo overrun
ALSA: AACI: make fifo variables more explanitory
ALSA: AACI: no need to call snd_pcm_period_elapsed() for each period
ALSA: AACI: use snd_pcm_lib_period_bytes()
ALSA: AACI: clean up AACI announcement printk
ALSA: AACI: fix channel mask selection
ALSA: AACI: fix number of channels for record
ALSA: AACI: fix multiple IRQ claiming
* 'cyberpro-next' of master.kernel.org:/home/rmk/linux-2.6-arm:
VIDEO: cyberpro: remove unused cyber2000fb_get_fb_var()
VIDEO: cyberpro: remove useless function extreg pointers
VIDEO: cyberpro: update handling of device structures
VIDEO: cyberpro: add support for video capture I2C
VIDEO: cyberpro: make 'reg_b0_lock' always present
VIDEO: cyberpro: add I2C support
VIDEO: cyberpro: select lowest multipler/divisor for PLL
Linus Torvalds [Fri, 18 Mar 2011 01:40:35 +0000 (18:40 -0700)]
Merge branch 'kvm-updates/2.6.39' of git://git./virt/kvm/kvm
* 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits)
KVM: unbreak userspace that does not sets tss address
KVM: MMU: cleanup pte write path
KVM: MMU: introduce a common function to get no-dirty-logged slot
KVM: fix rcu usage in init_rmode_* functions
KVM: fix kvmclock regression due to missing clock update
KVM: emulator: Fix permission checking in io permission bitmap
KVM: emulator: Fix io permission checking for 64bit guest
KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n
KVM: x86: Remove useless regs_page pointer from kvm_lapic
KVM: improve comment on rcu use in irqfd_deassign
KVM: MMU: remove unused macros
KVM: MMU: cleanup page alloc and free
KVM: MMU: do not record gfn in kvm_mmu_pte_write
KVM: MMU: move mmu pages calculated out of mmu lock
KVM: MMU: set spte accessed bit properly
KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits
KVM: Start lock documentation
KVM: better readability of efer_reserved_bits
KVM: Clear async page fault hash after switching to real mode
KVM: VMX: Initialize vm86 TSS only once.
...
Linus Torvalds [Fri, 18 Mar 2011 01:37:42 +0000 (18:37 -0700)]
Merge branch 'stable/xen.pm.bug-fixes' of git://git./linux/kernel/git/konrad/xen
* 'stable/xen.pm.bug-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
xen: xenbus PM events support
Linus Torvalds [Fri, 18 Mar 2011 01:27:49 +0000 (18:27 -0700)]
Merge branches 'stable/irq.fairness' and 'stable/irq.ween_of_nr_irqs' of git://git./linux/kernel/git/konrad/xen
* 'stable/irq.fairness' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: events: Remove redundant clear of l2i at end of round-robin loop
xen: events: Make round-robin scan fairer by snapshotting each l2 word once only
xen: events: Clean up round-robin evtchn scan.
xen: events: Make last processed event channel a per-cpu variable.
xen: events: Process event channels notifications in round-robin order.
* 'stable/irq.ween_of_nr_irqs' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: events: Fix compile error if CONFIG_SMP is not defined.
xen: events: correct locking in xen_irq_from_pirq
xen: events: propagate irq allocation failure instead of panicking
xen: events: do not workaround too-small nr_irqs
xen: events: remove use of nr_irqs as upper bound on number of pirqs
xen: events: dynamically allocate irq info structures
xen: events: maintain a list of Xen interrupts
xen: events: push setup of irq<->{evtchn,ipi,virq,pirq} maps into irq_info init functions
xen: events: turn irq_info constructors into initialiser functions
xen: events: use per-cpu variable for cpu_evtchn_mask
xen: events: refactor GSI pirq bindings functions
xen: events: rename restore_cpu_pirqs -> restore_pirqs
xen: events: remove unused public functions
xen: events: fix xen_map_pirq_gsi error return
xen: events: simplify comment
xen: events: separate two unrelated halves of if condition
Fix up trivial conflicts in drivers/xen/events.c
Linus Torvalds [Fri, 18 Mar 2011 01:16:36 +0000 (18:16 -0700)]
Merge branches 'stable/hvc-console', 'stable/gntalloc.v6' and 'stable/balloon' of git://git./linux/kernel/git/konrad/xen
* 'stable/hvc-console' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/hvc: Disable probe_irq_on/off from poking the hvc-console IRQ line.
* 'stable/gntalloc.v6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: gntdev: fix build warning
xen/p2m/m2p/gnttab: do not add failed grant maps to m2p override
xen-gntdev: Add cast to pointer
xen-gntdev: Fix incorrect use of zero handle
xen: change xen/[gntdev/gntalloc] to default m
xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only mappings
xen-gntdev: Avoid double-mapping memory
xen-gntdev: Avoid unmapping ranges twice
xen-gntdev: Use map->vma for checking map validity
xen-gntdev: Fix unmap notify on PV domains
xen-gntdev: Fix memory leak when mmap fails
xen/gntalloc,gntdev: Add unmap notify ioctl
xen-gntalloc: Userspace grant allocation driver
xen-gntdev: Support mapping in HVM domains
xen-gntdev: Add reference counting to maps
xen-gntdev: Use find_vma rather than iterating our vma list manually
xen-gntdev: Change page limit to be global instead of per-open
* 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (24 commits)
xen-gntdev: Use ballooned pages for grant mappings
xen-balloon: Add interface to retrieve ballooned pages
xen-balloon: Move core balloon functionality out of module
xen/balloon: Remove pr_info's and don't alter retry_count
xen/balloon: Protect against CPU exhaust by event/x process
xen/balloon: Migration from mod_timer() to schedule_delayed_work()
xen/balloon: Removal of driver_pages
Linus Torvalds [Fri, 18 Mar 2011 00:54:40 +0000 (17:54 -0700)]
Merge git://git./linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (170 commits)
[SCSI] scsi_dh_rdac: Add MD36xxf into device list
[SCSI] scsi_debug: add consecutive medium errors
[SCSI] libsas: fix ata list corruption issue
[SCSI] hpsa: export resettable host attribute
[SCSI] hpsa: move device attributes to avoid forward declarations
[SCSI] scsi_debug: Logical Block Provisioning (SBC3r26)
[SCSI] sd: Logical Block Provisioning update
[SCSI] Include protection operation in SCSI command trace
[SCSI] hpsa: fix incorrect PCI IDs and add two new ones (2nd try)
[SCSI] target: Fix volume size misreporting for volumes > 2TB
[SCSI] bnx2fc: Broadcom FCoE offload driver
[SCSI] fcoe: fix broken fcoe interface reset
[SCSI] fcoe: precedence bug in fcoe_filter_frames()
[SCSI] libfcoe: Remove stale fcoe-netdev entries
[SCSI] libfcoe: Move FCOE_MTU definition from fcoe.h to libfcoe.h
[SCSI] libfc: introduce __fc_fill_fc_hdr that accepts fc_hdr as an argument
[SCSI] fcoe, libfc: initialize EM anchors list and then update npiv EMs
[SCSI] Revert "[SCSI] libfc: fix exchange being deleted when the abort itself is timed out"
[SCSI] libfc: Fixing a memory leak when destroying an interface
[SCSI] megaraid_sas: Version and Changelog update
...
Fix up trivial conflicts due to whitespace differences in
drivers/scsi/libsas/{sas_ata.c,sas_scsi_host.c}
Linus Torvalds [Fri, 18 Mar 2011 00:42:14 +0000 (17:42 -0700)]
Merge branch 'next' of git://git./linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] pcc-cpufreq: remove duplicate statements
[CPUFREQ] Remove the pm_message_t argument from driver suspend
[CPUFREQ] Remove unneeded locks
[CPUFREQ] Remove old, deprecated per cpu ondemand/conservative sysfs files
[CPUFREQ] Remove deprecated sysfs file sampling_rate_max
[CPUFREQ] powernow-k8: The table index is not worth displaying
[CPUFREQ] calculate delay after dbs_check_cpu
[CPUFREQ] Add documentation for sampling_down_factor
[CPUFREQ] drivers/cpufreq: Remove unnecessary semicolons
Linus Torvalds [Fri, 18 Mar 2011 00:41:19 +0000 (17:41 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
ext3: Always set dx_node's fake_dirent explicitly.
ext3: Fix an overflow in ext3_trim_fs.
jbd: Remove one to many n's in a word.
ext3: skip orphan cleanup on rocompat fs
ext2: Fix link count corruption under heavy link+rename load
ext3: speed up group trim with the right free block count.
ext3: Adjust trim start with first_data_block.
quota: return -ENOMEM when memory allocation fails
Linus Torvalds [Fri, 18 Mar 2011 00:40:00 +0000 (17:40 -0700)]
Merge branch 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits)
RPC: killing RPC tasks races fixed
xprt: remove redundant check
SUNRPC: Convert struct rpc_xprt to use atomic_t counters
SUNRPC: Ensure we always run the tk_callback before tk_action
sunrpc: fix printk format warning
xprt: remove redundant null check
nfs: BKL is no longer needed, so remove the include
NFS: Fix a warning in fs/nfs/idmap.c
Cleanup: Factor out some cut-and-paste code.
cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions.
NFS: account direct-io into task io accounting
gss:krb5 only include enctype numbers in gm_upcall_enctypes
RPCRDMA: Fix FRMR registration/invalidate handling.
RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
NFSv4: Send unmapped uid/gids to the server when using auth_sys
NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
NFSv4: cleanup idmapper functions to take an nfs_server argument
NFSv4: Send unmapped uid/gids to the server if the idmapper fails
NFSv4: If the server sends us a numeric uid/gid then accept it
NFSv4.1: reject zero layout with zeroed stripe unit
...
Linus Torvalds [Fri, 18 Mar 2011 00:29:38 +0000 (17:29 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-udf-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
UDF: Fix compiler warning
udf: Convert UDF to new truncate calling sequence
Linus Torvalds [Fri, 18 Mar 2011 00:21:32 +0000 (17:21 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/bp/bp
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: (38 commits)
amd64_edac: Fix decode_syndrome types
amd64_edac: Fix DCT argument type
amd64_edac: Fix ranges signedness
amd64_edac: Drop local variable
amd64_edac: Fix PCI config addressing types
amd64_edac: Fix DRAM base macros
amd64_edac: Fix node id signedness
amd64_edac: Drop redundant declarations
amd64_edac: Enable driver on F15h
amd64_edac: Adjust ECC symbol size to F15h
amd64_edac: Simplify scrubrate setting
PCI: Rename CPU PCI id define
amd64_edac: Improve DRAM address mapping
amd64_edac: Sanitize ->read_dram_ctl_register
amd64_edac: Adjust sys_addr to chip select conversion routine to F15h
amd64_edac: Beef up early exit reporting
amd64_edac: Revamp online spare handling
amd64_edac: Fix channel interleave removal
amd64_edac: Correct node interleaving removal
amd64_edac: Add support for interleaved region swapping
...
Fix up trivial conflict in include/linux/pci_ids.h due to
AMD_15H_NB_MISC being renamed as AMD_15H_NB_F3 next to the new
AMD_15H_NB_LINK entry.
Linus Torvalds [Fri, 18 Mar 2011 00:10:19 +0000 (17:10 -0700)]
Merge branch 'for-linus/2639/i2c-1' of git://git.fluff.org/bjdooks/linux
* 'for-linus/2639/i2c-1' of git://git.fluff.org/bjdooks/linux:
i2c-mpc: Add support for 64bit system
i2c: add driver for Freescale i.MX28
i2c: tegra: Add i2c support
Linus Torvalds [Fri, 18 Mar 2011 00:09:29 +0000 (17:09 -0700)]
Merge git://git./linux/kernel/git/wim/linux-2.6-watchdog
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
watchdog: booke_wdt: clean up status messages
watchdog: cleanup spaces before tabs
watchdog: convert to DEFINE_PCI_DEVICE_TABLE
watchdog: Xen watchdog driver
watchdog: Intel SCU Watchdog Timer Driver for Moorestown and Medfield platforms.
watchdog: jz4740_wdt - fix magic character checking
watchdog: add JZ4740 watchdog driver
watchdog: it87_wdt: Add support for IT8721F watchdog
watchdog: hpwdt: build hpwdt as module by default with NMI_DECODING enabled
watchdog: hpwdt: Fix a couple of typos
Linus Torvalds [Thu, 17 Mar 2011 23:59:38 +0000 (16:59 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging: (44 commits)
hwmon: (lineage-pem): Fix in1 voltage alarm sysfs attributes
hwmon/f71882fg: Add support for
f71808e
hwmon/f71882fg: Add support for
f71869f and
f71869e
hwmon/f71882fg: Add support for
f71889ed
hwmon/f71882fg: Break out test for auto pwm's controlled by digital readings
hwmon/f71882fg: Separate temp beep sysfs attr from the other temp sysfs attr
hwmon/f71882fg: Remove bogus temp2_type for certain models
hwmon/f71882fg: Make number of temps configurable
hwmon/f71882fg: Make creation of in sysfs attributes more generic
hwmon/f71882fg: Only allow negative auto point temps if fan_neg_temp is enabled
hwmon/f71882fg: Fix temp1 sensor type reporting
hwmon: (w83627ehf) Display correct temperature sensor labels for systems with NCT6775F
hwmon: (w83627ehf) Add fan debounce support for NCT6775F and NCT6776F
hwmon: (w83627ehf) Update Kconfig for W83677HG-B, NCT6775F and NCT6776F
hwmon: (w83627ehf) Store rpm instead of raw fan speed data
hwmon: (w83627ehf) Use 16 bit fan count registers if supported
hwmon: (w83627ehf) Add support for Nuvoton NCT6775F and NCT6776F
hwmon: (w83627ehf) Permit enabling SmartFan IV mode if configured at startup
hwmon: (w83627ehf) Convert register arrays to 16 bit, and convert access to pointers
hwmon: (w83627ehf) Remove references to datasheets which no longer exist
...
Milton Miller [Tue, 15 Mar 2011 19:27:17 +0000 (13:27 -0600)]
smp_call_function_interrupt: use typedef and %pf
Use the newly added smp_call_func_t in smp_call_function_interrupt for
the func variable, and make the comment above the WARN more assertive
and explicit. Also, func is a function pointer and does not need an
offset, so use %pf not %pS.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Milton Miller [Tue, 15 Mar 2011 19:27:17 +0000 (13:27 -0600)]
smp_call_function_many: handle concurrent clearing of mask
Mike Galbraith reported finding a lockup ("perma-spin bug") where the
cpumask passed to smp_call_function_many was cleared by other cpu(s)
while a cpu was preparing its call_data block, resulting in no cpu to
clear the last ref and unlock the block.
Having cpus clear their bit asynchronously could be useful on a mask of
cpus that might have a translation context, or cpus that need a push to
complete an rcu window.
Instead of adding a BUG_ON and requiring yet another cpumask copy, just
detect the race and handle it.
Note: arch_send_call_function_ipi_mask must still handle an empty
cpumask because the data block is globally visible before the that arch
callback is made. And (obviously) there are no guarantees to which cpus
are notified if the mask is changed during the call; only cpus that were
online and had their mask bit set during the whole call are guaranteed
to be called.
Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Jan Beulich <JBeulich@novell.com>
Acked-by: Jan Beulich <jbeulich@novell.com>
Cc: stable@kernel.org
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Milton Miller [Tue, 15 Mar 2011 19:27:16 +0000 (13:27 -0600)]
call_function_many: add missing ordering
Paul McKenney's review pointed out two problems with the barriers in the
2.6.38 update to the smp call function many code.
First, a barrier that would force the func and info members of data to
be visible before their consumption in the interrupt handler was
missing. This can be solved by adding a smp_wmb between setting the
func and info members and setting setting the cpumask; this will pair
with the existing and required smp_rmb ordering the cpumask read before
the read of refs. This placement avoids the need a second smp_rmb in
the interrupt handler which would be executed on each of the N cpus
executing the call request. (I was thinking this barrier was present
but was not).
Second, the previous write to refs (establishing the zero that we the
interrupt handler was testing from all cpus) was performed by a third
party cpu. This would invoke transitivity which, as a recient or
concurrent addition to memory-barriers.txt now explicitly states, would
require a full smp_mb().
However, we know the cpumask will only be set by one cpu (the data
owner) and any preivous iteration of the mask would have cleared by the
reading cpu. By redundantly writing refs to 0 on the owning cpu before
the smp_wmb, the write to refs will follow the same path as the writes
that set the cpumask, which in turn allows us to keep the barrier in the
interrupt handler a smp_rmb instead of promoting it to a smp_mb (which
will be be executed by N cpus for each of the possible M elements on the
list).
I moved and expanded the comment about our (ab)use of the rcu list
primitives for the concurrent walk earlier into this function. I
considered moving the first two paragraphs to the queue list head and
lock, but felt it would have been too disconected from the code.
Cc: Paul McKinney <paulmck@linux.vnet.ibm.com>
Cc: stable@kernel.org (2.6.32 and later)
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Milton Miller [Tue, 15 Mar 2011 19:27:16 +0000 (13:27 -0600)]
call_function_many: fix list delete vs add race
Peter pointed out there was nothing preventing the list_del_rcu in
smp_call_function_interrupt from running before the list_add_rcu in
smp_call_function_many.
Fix this by not setting refs until we have gotten the lock for the list.
Take advantage of the wmb in list_add_rcu to save an explicit additional
one.
I tried to force this race with a udelay before the lock & list_add and
by mixing all 64 online cpus with just 3 random cpus in the mask, but
was unsuccessful. Still, inspection shows a valid race, and the fix is
a extension of the existing protection window in the current code.
Cc: stable@kernel.org (v2.6.32 and later)
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea Arcangeli [Thu, 17 Mar 2011 23:16:35 +0000 (00:16 +0100)]
mm: PageBuddy and mapcount robustness
Change the _mapcount value indicating PageBuddy from -2 to -128 for
more robusteness against page_mapcount() undeflows.
Use reset_page_mapcount instead of __ClearPageBuddy in bad_page to
ignore the previous retval of PageBuddy().
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Metcalf [Thu, 17 Mar 2011 18:32:06 +0000 (14:32 -0400)]
arch/tile: support newer binutils assembler shift semantics
This change supports building the kernel with newer binutils where
a shift of greater than the word size is no longer interpreted
silently as modulo the word size, but instead generates a warning.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Chris Metcalf [Thu, 17 Mar 2011 18:14:12 +0000 (14:14 -0400)]
Merge tag 'v2.6.38' of git://git./linux/kernel/git/torvalds/linux-2.6 into for-linus
Linus Torvalds [Thu, 17 Mar 2011 17:11:25 +0000 (10:11 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/epip/linux-2.6-unicore32
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/epip/linux-2.6-unicore32: (40 commits)
unicore32: rewrite arch-specific tlb.h to use asm-generic version
unicore32: modify io_p2v and io_v2p macros, and adjust PKUNITY_mmio_BASEs
unicore32: replace unicore32-specific iomap functions with generic lib implementation
unicore32 machine related: add frame buffer driver for pkunity-v3 soc
unicore32 machine related files: add i2c bus drivers for pkunity-v3 soc
unicore32 io: redefine __REG(x) and re-use readl/writel funcs
unicore32 i8042 upgrade and bugfix: adjust resource request region type
unicore32 upgrade to v2.6.38-rc5: add one more paramter for pte_alloc_map call
unicore32 i8042: adjust io funcs of i8042-unicore32io.h
unicore32: rename PKUNITY_IOSPACE_BASE to PKUNITY_MMIO_BASE
unicore32: modify function names and parameters for irq_chips
unicore32: remove unused lines in arch/unicore32/include/asm/irq.h
unicore32 time.c: change calculate method for clock_event_device
unicore32: ADD MAINTAINER for unicore32 architecture
unicore32 machine related files: ps2 driver
unicore32 machine related files: pci bus handling
unicore32 machine related files: hardware registers
unicore32 machine related files: core files
unicore32 additional architecture files: boot process
unicore32 additional architecture files: low-level lib: misc
...
Acked-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Thu, 17 Mar 2011 17:10:49 +0000 (10:10 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] kexec: Disable ftrace during kexec
[S390] support XZ compressed kernel
[S390] css_bus_type: make it static
[S390] css_driver: remove duplicate members
[S390] css: remove subchannel private
[S390] css: move chsc_private to drv_data
[S390] css: move io_private to drv_data
[S390] cio: move cdev pointer to io_subchannel_private
[S390] cio: move options to io_sch_private
[S390] cio: move asms to generic header
[S390] cio: move orb definitions to separate header
[S390] Write protect module text and RO data
[S390] dasd: get rid of compile warning
[S390] remove superfluous check from do_IRQ
[S390] remove redundant stack check option
Linus Torvalds [Thu, 17 Mar 2011 16:57:10 +0000 (09:57 -0700)]
Merge branch 'sh-latest' of git://git./linux/kernel/git/lethal/sh-2.6
* 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (34 commits)
sh: Convert to generic show_interrupts.
sh: Wire up new fhandle and clock_adjtime syscalls.
sh: modify platform_device for sh_eth driver
sh: add GETHER's platform_device in board-sh7757lcr
sh: update sh7757lcr_defconfig
sh: add platform_device of tmio_mmc and sh_mmcif to sh7757lcr
sh: dmaengine support for SH7757
sh: add mmc clock in clock-sh7757
sh: add spi_board_info in sh7757lcr
sh: add platform_device for SPI
sh: add USB_ARCH_HAS_EHCI and OHCI for SH7757
sh: Rename cpuidle states to fit general conventions
serial: sh-sci: fix deadlock when resuming from S3 sleep
sh: Enable CONFIG_GCOV_PROFILE_ALL for sh
sh: Fix up async PCIe probing on SMP.
serial: sh-sci: Kill off the special earlyprintk device.
serial: sh-sci: Use dev_name() for region reservations.
serial: sh-sci: Fix up earlyprintk port mapping.
serial: sh-sci: Limit early console to one device.
serial: sh-sci: Fix up break timer scheduling race.
...
Linus Torvalds [Thu, 17 Mar 2011 16:56:43 +0000 (09:56 -0700)]
Merge git://git./linux/kernel/git/lethal/fbdev-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
fbdev: sh_mobile_lcdc: Add YUV framebuffer support
viafb: split pll configs up
viafb: remove duplicated clock storage
viafb: always return the best possible clock
viafb: remove duplicated clock information
fbdev: sh_mobile_lcdcfb: add backlight support
viafb: factor lcd scaling parameters out
viafb: strip some structures
viafb: remove unused data_mode and device_type
viafb: kill lcd_panel_id
video via: make local variables static
video via: fix iomem access
video/via: drop deprecated (and unused) i2c_adapter.id
Stanislav Kinsbursky [Thu, 17 Mar 2011 15:54:23 +0000 (18:54 +0300)]
RPC: killing RPC tasks races fixed
RPC task RPC_TASK_QUEUED bit is set must be checked before trying to wake up
task rpc_killall_tasks() because task->tk_waitqueue can not be set (equal to
NULL).
Also, as Trond Myklebust mentioned, such approach (instead of checking
tk_waitqueue to NULL) allows us to "optimise away the call to
rpc_wake_up_queued_task() altogether for those
tasks that aren't queued".
Here is an example of dereferencing of tk_waitqueue equal to NULL:
CPU 0 CPU 1 CPU 2
-------------------- --------------------- --------------------------
nfs4_run_open_task
rpc_run_task
rpc_execute
rpc_set_active
rpc_make_runnable
(waiting)
rpc_async_schedule
nfs4_open_prepare
nfs_wait_on_sequence
nfs_umount_begin
rpc_killall_tasks
rpc_wake_up_task
rpc_wake_up_queued_task
spin_lock(tk_waitqueue == NULL)
BUG()
rpc_sleep_on
spin_lock(&q->lock)
__rpc_sleep_on
task->tk_waitqueue = q
Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
j223yang@asset.uwaterloo.ca [Wed, 16 Mar 2011 15:16:22 +0000 (11:16 -0400)]
xprt: remove redundant check
remove redundant check.
Signed-off-by: Jinqiu Yang <crindy646@gmail.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 15 Mar 2011 23:56:30 +0000 (19:56 -0400)]
SUNRPC: Convert struct rpc_xprt to use atomic_t counters
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Trond Myklebust [Tue, 15 Mar 2011 23:56:30 +0000 (19:56 -0400)]
SUNRPC: Ensure we always run the tk_callback before tk_action
This fixes a race in which the task->tk_callback() puts the rpc_task
to sleep, setting a new callback. Under certain circumstances, the current
code may end up executing the task->tk_action before it gets round to the
callback.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Linus Torvalds [Thu, 17 Mar 2011 16:11:39 +0000 (09:11 -0700)]
Merge branch 'drm-core-next' of git://git./linux/kernel/git/airlied/drm-2.6
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (177 commits)
drm/radeon: fixup refcounts in radeon dumb create ioctl.
drm: radeon: *_cs_packet_parse_vline() cleanup
radeon: merge list_del()/list_add_tail() to list_move_tail()
drm: Retry i2c transfer of EDID block after failure
drm/radeon/kms: fix typo in atom overscan setup
drm: Hold the mode mutex whilst probing for sysfs status
drm/nouveau: fix __nouveau_fence_wait performance
drm/nv40: attempt to reserve just enough vram for all 32 channels
drm/nv50: check for vm traps on every gr irq
drm/nv50: decode vm faults some more
drm/nouveau: add nouveau_enum_find() util function
drm/nouveau: properly handle pushbuffer check failures
drm/nvc0: remove vm hack forcing large/small pages to not share a PDE
drm/i915: disable opregion lid detection for now.
drm/i915: Only wait on a pending flip if we intend to write to the buffer
drm/i915/dp: Sanity check eDP existence
drm: add cap bit to denote if dumb ioctl is available or not.
drm/core: add ioctl to query device/driver capabilities
drm/radeon/kms: allow max clock of 340 Mhz on hdmi 1.3+
drm/radeon/kms: add cayman pci ids
...
Gleb Natapov [Sun, 13 Mar 2011 10:34:27 +0000 (12:34 +0200)]
KVM: unbreak userspace that does not sets tss address
Commit
6440e5967bc broke old userspaces that do not set tss address
before entering vcpu. Unbreak it by setting tss address to a safe
value on the first vcpu entry. New userspaces should set tss address,
so print warning in case it doesn't.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Wed, 9 Mar 2011 07:43:51 +0000 (15:43 +0800)]
KVM: MMU: cleanup pte write path
This patch does:
- call vcpu->arch.mmu.update_pte directly
- use gfn_to_pfn_atomic in update_pte path
The suggestion is from Avi.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Wed, 9 Mar 2011 07:43:00 +0000 (15:43 +0800)]
KVM: MMU: introduce a common function to get no-dirty-logged slot
Cleanup the code of pte_prefetch_gfn_to_memslot and mapping_level_dirty_bitmap
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Wed, 9 Mar 2011 07:41:04 +0000 (15:41 +0800)]
KVM: fix rcu usage in init_rmode_* functions
fix:
[ 3494.671786] stack backtrace:
[ 3494.671789] Pid: 10527, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23
[ 3494.671790] Call Trace:
[ 3494.671796] [] ? lockdep_rcu_dereference+0x9d/0xa5
[ 3494.671826] [] ? kvm_memslots+0x6b/0x73 [kvm]
[ 3494.671834] [] ? gfn_to_memslot+0x16/0x4f [kvm]
[ 3494.671843] [] ? gfn_to_hva+0x16/0x27 [kvm]
[ 3494.671851] [] ? kvm_write_guest_page+0x31/0x83 [kvm]
[ 3494.671861] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm]
[ 3494.671867] [] ? vmx_set_tss_addr+0x83/0x122 [kvm_intel]
and:
[ 8328.789599] stack backtrace:
[ 8328.789601] Pid: 18736, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23
[ 8328.789603] Call Trace:
[ 8328.789609] [] ? lockdep_rcu_dereference+0x9d/0xa5
[ 8328.789621] [] ? kvm_memslots+0x6b/0x73 [kvm]
[ 8328.789628] [] ? gfn_to_memslot+0x16/0x4f [kvm]
[ 8328.789635] [] ? gfn_to_hva+0x16/0x27 [kvm]
[ 8328.789643] [] ? kvm_write_guest_page+0x31/0x83 [kvm]
[ 8328.789699] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm]
[ 8328.789713] [] ? vmx_create_vcpu+0x316/0x3c8 [kvm_intel]
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Nikola Ciprich [Wed, 9 Mar 2011 22:36:51 +0000 (23:36 +0100)]
KVM: fix kvmclock regression due to missing clock update
commit
387b9f97750444728962b236987fbe8ee8cc4f8c moved kvm_request_guest_time_update(vcpu),
breaking 32bit SMP guests using kvm-clock. Fix this by moving (new) clock update function
to proper place.
Signed-off-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Acked-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 7 Mar 2011 12:55:07 +0000 (14:55 +0200)]
KVM: emulator: Fix permission checking in io permission bitmap
Currently if io port + len crosses 8bit boundary in io permission bitmap the
check may allow IO that otherwise should not be allowed. The patch fixes that.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 7 Mar 2011 12:55:06 +0000 (14:55 +0200)]
KVM: emulator: Fix io permission checking for 64bit guest
Current implementation truncates upper 32bit of TR base address during IO
permission bitmap check. The patch fixes this.
Reported-and-tested-by: Francis Moreau <francis.moro@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 8 Mar 2011 14:09:51 +0000 (16:09 +0200)]
KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n
With CONFIG_CC_STACKPROTECTOR, we need a valid %gs at all times, so disable
lazy reload and do an eager reload immediately after the vmexit.
Reported-by: IVAN ANGELOV <ivangotoy@gmail.com>
Acked-By: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Takuya Yoshikawa [Sat, 5 Mar 2011 03:40:20 +0000 (12:40 +0900)]
KVM: x86: Remove useless regs_page pointer from kvm_lapic
Access to this page is mostly done through the regs member which holds
the address to this page. The exceptions are in vmx_vcpu_reset() and
kvm_free_lapic() and these both can easily be converted to using regs.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Avi Kivity <avi@redhat.com>
Michael S. Tsirkin [Sun, 6 Mar 2011 11:03:26 +0000 (13:03 +0200)]
KVM: improve comment on rcu use in irqfd_deassign
The RCU use in kvm_irqfd_deassign is tricky: we have rcu_assign_pointer
but no synchronize_rcu: synchronize_rcu is done by kvm_irq_routing_update
which we share a spinlock with.
Fix up a comment in an attempt to make this clearer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 11:01:39 +0000 (19:01 +0800)]
KVM: MMU: remove unused macros
These macros are not used, so removed
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 11:01:10 +0000 (19:01 +0800)]
KVM: MMU: cleanup page alloc and free
Using __get_free_page instead of alloc_page and page_address,
using free_page instead of __free_page and virt_to_page
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 11:00:00 +0000 (19:00 +0800)]
KVM: MMU: do not record gfn in kvm_mmu_pte_write
No need to record the gfn to verifier the pte has the same mode as
current vcpu, it's because we only speculatively update the pte only
if the pte and vcpu have the same mode
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 10:59:21 +0000 (18:59 +0800)]
KVM: MMU: move mmu pages calculated out of mmu lock
kvm_mmu_calculate_mmu_pages need to walk all memslots and it's protected by
kvm->slots_lock, so move it out of mmu spinlock
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 10:58:02 +0000 (18:58 +0800)]
KVM: MMU: set spte accessed bit properly
Set spte accessed bit only if guest_initiated == 1 that means the really
accessed
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 10:56:41 +0000 (18:56 +0800)]
KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits
Only remove write access in the last sptes.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Wed, 9 Feb 2011 14:11:28 +0000 (15:11 +0100)]
KVM: Start lock documentation
The goal of this document shall be
- overview of all locks used in KVM core
- provide details on the scope of each lock
- explain the lock type, specifically of a raw spin locks
- provide a lock ordering guide
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Lai Jiangshan [Mon, 21 Feb 2011 03:51:35 +0000 (11:51 +0800)]
KVM: better readability of efer_reserved_bits
use EFER_SCE, EFER_LME and EFER_LMA instead of magic numbers.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Lai Jiangshan [Mon, 21 Feb 2011 03:21:30 +0000 (11:21 +0800)]
KVM: Clear async page fault hash after switching to real mode
The hash array of async gfns may still contain some left gfns after
kvm_clear_async_pf_completion_queue() called, need to clear them.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 21 Feb 2011 10:07:59 +0000 (12:07 +0200)]
KVM: VMX: Initialize vm86 TSS only once.
Currently vm86 task is initialized on each real mode entry and vcpu
reset. Initialization is done by zeroing TSS and updating relevant
fields. But since all vcpus are using the same TSS there is a race where
one vcpu may use TSS while other vcpu is initializing it, so the vcpu
that uses TSS will see wrong TSS content and will behave incorrectly.
Fix that by initializing TSS only once.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 21 Feb 2011 10:07:58 +0000 (12:07 +0200)]
KVM: VMX: update live TR selector if it changes in real mode
When rmode.vm86 is active TR descriptor is updated with vm86 task values,
but selector is left intact. vmx_set_segment() makes sure that if TR
register is written into while vm86 is active the new values are saved
for use after vm86 is deactivated, but since selector is not updated on
vm86 activation/deactivation new value is lost. Fix this by writing new
selector into vmcs immediately.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Lai Jiangshan [Fri, 11 Feb 2011 06:29:40 +0000 (14:29 +0800)]
KVM: VMX: add the __noclone attribute to vmx_vcpu_run
The changelog of
104f226 said "adds the __noclone attribute",
but it was missing in its patch. I think it is still needed.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Fri, 4 Feb 2011 09:49:11 +0000 (10:49 +0100)]
KVM: x86: Convert tsc_write_lock to raw_spinlock
Code under this lock requires non-preemptibility. Ensure this also over
-rt by converting it to raw spinlock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Wed, 9 Feb 2011 10:09:46 +0000 (12:09 +0200)]
KVM: remove isr_ack logic from PIC
isr_ack logic was added by
e48258009d to avoid unnecessary IPIs. Back
then it made sense, but now the code checks that vcpu is ready to accept
interrupt before sending IPI, so this logic is no longer needed. The
patch removes it.
Fixes a regression with Debian/Hurd.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joseph Cihula [Tue, 8 Feb 2011 19:45:56 +0000 (11:45 -0800)]
KVM: VMX: fix detection of BIOS disabling VMX
This patch fixes the logic used to detect whether BIOS has disabled VMX, for
the case where VMX is enabled only under SMX, but tboot is not active.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Tue, 8 Feb 2011 11:55:33 +0000 (12:55 +0100)]
KVM: Convert kvm_lock to raw_spinlock
Code under this lock requires non-preemptibility. Ensure this also over
-rt by converting it to raw spinlock.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 3 Feb 2011 13:29:52 +0000 (15:29 +0200)]
KVM: SVM: check for progress after IRET interception
When we enable an NMI window, we ask for an IRET intercept, since
the IRET re-enables NMIs. However, the IRET intercept happens before
the instruction executes, while the NMI window architecturally opens
afterwards.
To compensate for this mismatch, we only open the NMI window in the
following exit, assuming that the IRET has by then executed; however,
this assumption is not always correct; we may exit due to a host interrupt
or page fault, without having executed the instruction.
Fix by checking for forward progress by recording and comparing the IRET's
rip. This is somewhat of a hack, since an unchaging rip does not mean that
no forward progress has been made, but is the simplest fix for now.
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 3 Feb 2011 13:07:07 +0000 (15:07 +0200)]
KVM: Fix race between nmi injection and enabling nmi window
The interrupt injection logic looks something like
if an nmi is pending, and nmi injection allowed
inject nmi
if an nmi is pending
request exit on nmi window
the problem is that "nmi is pending" can be set asynchronously by
the PIT; if it happens to fire between the two if statements, we
will request an nmi window even though nmi injection is allowed. On
SVM, this has disasterous results, since it causes eflags.TF to be
set in random guest code.
The fix is simple; make nmi_pending synchronous using the standard
vcpu->requests mechanism; this ensures the code above is completely
synchronous wrt nmi_pending.
Signed-off-by: Avi Kivity <avi@redhat.com>
Rik van Riel [Tue, 1 Feb 2011 14:53:28 +0000 (09:53 -0500)]
KVM: use yield_to instead of sleep in kvm_vcpu_on_spin
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic
slowdowns of certain workloads, we instead use yield_to to get
another VCPU in the same KVM guest to run sooner.
This seems to give a 10-15% speedup in certain workloads.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Rik van Riel [Tue, 1 Feb 2011 14:52:41 +0000 (09:52 -0500)]
KVM: keep track of which task is running a KVM vcpu
Keep track of which task is running a KVM vcpu. This helps us
figure out later what task to wake up if we want to boost a
vcpu that got preempted.
Unfortunately there are no guarantees that the same task
always keeps the same vcpu, so we can only track the task
across a single "run" of the vcpu.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Rik van Riel [Tue, 1 Feb 2011 14:51:46 +0000 (09:51 -0500)]
export pid symbols needed for kvm_vcpu_on_spin
Export the symbols required for a race-free kvm_vcpu_on_spin.
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 1 Feb 2011 14:32:04 +0000 (16:32 +0200)]
KVM: Drop ad-hoc vendor specific instruction restriction
Use the new support in the emulator, and drop the ad-hoc code in x86.c.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Tue, 1 Feb 2011 14:32:03 +0000 (16:32 +0200)]
KVM: x86 emulator: vendor specific instructions
Mark some instructions as vendor specific, and allow the caller to request
emulation only of vendor specific instructions. This is useful in some
circumstances (responding to a #UD fault).
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Tue, 1 Feb 2011 14:32:02 +0000 (16:32 +0200)]
KVM: Drop bogus x86_decode_insn() error check
x86_decode_insn() doesn't return X86EMUL_* values, so the check
for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper
check later on, so there is no need for a replacement for this
code.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Tue, 1 Feb 2011 12:27:15 +0000 (13:27 +0100)]
KVM: x86: Drop obsolete warning about INIT on runnable VCPU
This warning was once used for debugging QEMU user space. Though
uncommon, it is actually possible to send an INIT request to a running
VCPU. So better drop this warning before someone misuses it to flood
kernel logs this way.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Glauber Costa [Tue, 1 Feb 2011 19:16:40 +0000 (14:16 -0500)]
KVM: x86: release kvmclock page on reset
When a vcpu is reset, kvmclock page keeps being written to this days.
This is wrong and inconsistent: a cpu reset should take it to its
initial state.
Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Sun, 30 Jan 2011 03:15:49 +0000 (11:15 +0800)]
mm: remove is_hwpoison_address
Unused.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Sun, 30 Jan 2011 03:15:49 +0000 (11:15 +0800)]
KVM: Replace is_hwpoison_address with __get_user_pages
is_hwpoison_address only checks whether the page table entry is
hwpoisoned, regardless the memory page mapped. While __get_user_pages
will check both.
QEMU will clear the poisoned page table entry (via unmap/map) to make
it possible to allocate a new memory page for the virtual address
across guest rebooting. But it is also possible that the underlying
memory page is kept poisoned even after the corresponding page table
entry is cleared, that is, a new memory page can not be allocated.
__get_user_pages can catch these situations.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Sun, 30 Jan 2011 03:15:48 +0000 (11:15 +0800)]
mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally
Make __get_user_pages return -EHWPOISON for HWPOISON page only if
FOLL_HWPOISON is specified. With this patch, the interested callers
can distinguish HWPOISON pages from general FAULT pages, while other
callers will still get -EFAULT for all these pages, so the user space
interface need not to be changed.
This feature is needed by KVM, where UCR MCE should be relayed to
guest for HWPOISON page, while instruction emulation and MMIO will be
tried for general FAULT page.
The idea comes from Andrew Morton.
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Huang Ying [Sun, 30 Jan 2011 03:15:47 +0000 (11:15 +0800)]
mm: export __get_user_pages
In most cases, get_user_pages and get_user_pages_fast should be used
to pin user pages in memory. But sometimes, some special flags except
FOLL_GET, FOLL_WRITE and FOLL_FORCE are needed, for example in
following patch, KVM needs FOLL_HWPOISON. To support these users,
__get_user_pages is exported directly.
There are some symbol name conflicts in infiniband driver, fixed them too.
Signed-off-by: Huang Ying <ying.huang@intel.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Michel Lespinasse <walken@google.com>
CC: Roland Dreier <roland@kernel.org>
CC: Ralph Campbell <infinipath@qlogic.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
john cooper [Fri, 21 Jan 2011 05:21:00 +0000 (00:21 -0500)]
KVM: x86: handle guest access to BBL_CR_CTL3 MSR
A correction to Intel cpu model CPUID data (patch queued)
caused winxp to BSOD when booted with a Penryn model.
This was traced to the CPUID "model" field correction from
6 -> 23 (as is proper for a Penryn class of cpu). Only in
this case does the problem surface.
The cause for this failure is winxp accessing the BBL_CR_CTL3
MSR which is unsupported by current kvm, appears to be a
legacy MSR not fully characterized yet existing in current
silicon, and is apparently carried forward in MSR space to
accommodate vintage code as here. It is not yet conclusive
whether this MSR implements any of its legacy functionality
or is just an ornamental dud for compatibility. While I
found no silicon version specific documentation link to
this MSR, a general description exists in Intel's developer's
reference which agrees with the functional behavior of
other bootloader/kernel code I've examined accessing
BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19
called out as "reserved" in the above document.
So to minimally accommodate this MSR, kvm msr get will provide
the equivalent mock data and kvm msr write will simply toss the
guest passed data without interpretation. While this treatment
of BBL_CR_CTL3 addresses the immediate problem, the approach may
be modified pending clarification from Intel.
Signed-off-by: john cooper <john.cooper@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Wed, 12 Jan 2011 07:41:22 +0000 (15:41 +0800)]
KVM: make make_all_cpus_request() lockless
Now, we have 'vcpu->mode' to judge whether need to send ipi to other
cpus, this way is very exact, so checking request bit is needless,
then we can drop the spinlock let it's collateral
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Wed, 12 Jan 2011 07:40:31 +0000 (15:40 +0800)]
KVM: Add "exiting guest mode" state
Currently we keep track of only two states: guest mode and host
mode. This patch adds an "exiting guest mode" state that tells
us that an IPI will happen soon, so unless we need to wait for the
IPI, we can avoid it completely.
Also
1: No need atomically to read/write ->mode in vcpu's thread
2: reorganize struct kvm_vcpu to make ->mode and ->requests
in the same cache line explicitly
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Heiko Carstens [Mon, 17 Jan 2011 20:21:08 +0000 (21:21 +0100)]
KVM: fix build warning within __kvm_set_memory_region() on s390
Get rid of this warning:
CC arch/s390/kvm/../../../virt/kvm/kvm_main.o
arch/s390/kvm/../../../virt/kvm/kvm_main.c:596:12: warning: 'kvm_create_dirty_bitmap' defined but not used
The only caller of the function is within a !CONFIG_S390 section, so add the
same ifdef around kvm_create_dirty_bitmap() as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Sat, 15 Jan 2011 09:00:53 +0000 (10:00 +0100)]
KVM: x86: Remove user space triggerable MCE error message
This case is a pure user space error we do not need to record. Moreover,
it can be misused to flood the kernel log. Remove it.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Wed, 12 Jan 2011 07:39:18 +0000 (15:39 +0800)]
KVM: fix rcu usage warning in kvm_arch_vcpu_ioctl_set_sregs()
Fix:
[ 1001.499596] ===================================================
[ 1001.499599] [ INFO: suspicious rcu_dereference_check() usage. ]
[ 1001.499601] ---------------------------------------------------
[ 1001.499604] include/linux/kvm_host.h:301 invoked rcu_dereference_check() without protection!
......
[ 1001.499636] Pid: 6035, comm: qemu-system-x86 Not tainted 2.6.37-rc6+ #62
[ 1001.499638] Call Trace:
[ 1001.499644] [] lockdep_rcu_dereference+0x9d/0xa5
[ 1001.499653] [] gfn_to_memslot+0x8d/0xc8 [kvm]
[ 1001.499661] [] gfn_to_hva+0x16/0x3f [kvm]
[ 1001.499669] [] kvm_read_guest_page+0x1e/0x5e [kvm]
[ 1001.499681] [] kvm_read_guest_page_mmu+0x53/0x5e [kvm]
[ 1001.499699] [] load_pdptrs+0x3f/0x9c [kvm]
[ 1001.499705] [] ? vmx_set_cr0+0x507/0x517 [kvm_intel]
[ 1001.499717] [] kvm_arch_vcpu_ioctl_set_sregs+0x1f3/0x3c0 [kvm]
[ 1001.499727] [] kvm_vcpu_ioctl+0x6a5/0xbc5 [kvm]
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Thu, 6 Jan 2011 16:09:12 +0000 (18:09 +0200)]
KVM: VMX: Avoid atomic operation in vmx_vcpu_run
Instead of exchanging the guest and host rcx, have separate storage
for each. This allows us to avoid using the xchg instruction, which
is is a little slower than normal operations.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Thu, 6 Jan 2011 16:09:11 +0000 (18:09 +0200)]
KVM: VMX: Simplify saving guest rcx in vmx_vcpu_run
Change
push top-of-stack
pop guest-rcx
pop dummy
to
pop guest-rcx
which is the same thing, only simpler.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Rik van Riel [Tue, 4 Jan 2011 14:51:33 +0000 (09:51 -0500)]
KVM: VMX: increase ple_gap default to 128
On some CPUs, a ple_gap of 41 is simply insufficient to ever trigger
PLE exits, even with the minimalistic PLE test from kvm-unit-tests.
http://git.kernel.org/?p=virt/kvm/kvm-unit-tests.git;a=commitdiff;h=
eda71b28fa122203e316483b35f37aaacd42f545
For example, the Xeon X5670 CPU needs a ple_gap of at least 48 in
order to get pause loop exits:
# modprobe kvm_intel ple_gap=47
# taskset 1 /usr/local/bin/qemu-system-x86_64 \
-device testdev,chardev=log -chardev stdio,id=log \
-kernel x86/vmexit.flat -append ple-round-robin -smp 2
VNC server running on `::1:5900'
enabling apic
enabling apic
ple-round-robin
58298446
# rmmod kvm_intel
# modprobe kvm_intel ple_gap=48
# taskset 1 /usr/local/bin/qemu-system-x86_64 \
-device testdev,chardev=log -chardev stdio,id=log \
-kernel x86/vmexit.flat -append ple-round-robin -smp 2
VNC server running on `::1:5900'
enabling apic
enabling apic
ple-round-robin 36616
Increase the ple_gap to 128 to be on the safe side.
Signed-off-by: Rik van Riel <riel@redhat.com>
Acked-by: Zhai, Edwin <edwin.zhai@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Joerg Roedel [Fri, 14 Jan 2011 15:45:02 +0000 (16:45 +0100)]
KVM: SVM: Add support for perf-kvm
This patch adds the necessary code to run perf-kvm on AMD
machines.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 3 Jan 2011 12:28:52 +0000 (14:28 +0200)]
KVM: VMX: Avoid leaking fake realmode state to userspace
When emulating real mode, we fake some state:
- tr.base points to a fake vm86 tss
- segment registers are made to conform to vm86 restrictions
change vmx_get_segment() not to expose this fake state to userspace;
instead, return the original state.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Mon, 3 Jan 2011 12:28:51 +0000 (14:28 +0200)]
KVM: VMX: Save and restore tr selector across mode switches
When emulating real mode we play with tr hidden state, but leave
tr.selector alone. That works well, except for save/restore, since
loading TR writes it to the hidden state in vmx->rmode.
Fix by also saving and restoring the tr selector; this makes things
more consistent and allows migration to work during the early
boot stages of Windows XP.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Peter Tyser [Wed, 29 Dec 2010 19:51:25 +0000 (13:51 -0600)]
KVM: PPC: Fix SPRG get/set for Book3S and BookE
Previously SPRGs 4-7 were improperly read and written in
kvm_arch_vcpu_ioctl_get_regs() and kvm_arch_vcpu_ioctl_set_regs();
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>