git.openwrt.org Git - openwrt/staging/blogic.git/log

KVM: x86: fix KVM_GET_MSR for PV EOI

KVM_GET_MSR was missing support for PV EOI,
which is needed for migration.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

kvm: Fix nonsense handling of compat ioctl

KVM_SET_SIGNAL_MASK passed a NULL argument leaves the on stack signal
sets uninitialized. It then passes them through to
kvm_vcpu_ioctl_set_sigmask.

We should be passing a NULL in this case not translated garbage.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Merge tag 'fixes-3.6-rc3' of git://git./linux/kernel/git/arm/arm-soc

Pull arm-soc fixes from Arnd Bergmann:
"Bug fixes for various ARM platforms.  About half of these are for OMAP
  and submitted before but did not make it into v3.6-rc2."

* tag 'fixes-3.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (39 commits)
  ARM: ux500: don't select LEDS_GPIO for snowball
  ARM: imx: build i.MX6 functions only when needed
  ARM: imx: select CPU_FREQ_TABLE when needed
  ARM: imx: fix ksz9021rn_phy_fixup
  ARM: imx: build pm-imx5 code only when PM is enabled
  ARM: omap: allow building omap44xx without SMP
  ARM: dts: imx51-babbage: fix esdhc cd/wp properties
  ARM: imx6: spin the cpu until hardware takes it down
  ARM: ux500: Ensure probing of Audio devices when Device Tree is enabled
  ARM: ux500: Fix merge error, no matching driver name for 'snd_soc_u8500'
  ARM i.MX6q: Add virtual 1/3.5 dividers in the LDB clock path
  ARM: Kirkwood: fix Makefile.boot
  ARM: Kirkwood: Fix iconnect leds
  ARM: Orion: Set eth packet size csum offload limit
  ARM: mv78xx0: fix win_cfg_base prototype
  ARM: OMAP: dmtimers: Fix locking issue in omap_dm_timer_request*()
  ARM: mmp: fix potential NULL dereference
  ARM: OMAP4: Register the OPP table only for 4430 device
  cpufreq: OMAP: Handle missing frequency table on SMP systems
  ARM: OMAP4: sleep: Save the complete used register stack frame
  ...

Merge tag 'stable/for-linus-3.6-rc3-tag' of git://git./linux/kernel/git/konrad/xen

Pull three xen bug-fixes from Konrad Rzeszutek Wilk:
- Revert the kexec fix which caused on non-kexec shutdowns a race.
- Reuse existing P2M leafs - instead of requiring to allocate a large
   area of bootup virtual address estate.
- Fix a one-off error when adding PFNs for balloon pages.

* tag 'stable/for-linus-3.6-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M.
  xen/p2m: Reuse existing P2M leafs if they are filled with 1:1 PFNs or INVALID.
  Revert "xen PVonHVM: move shared_info to MMIO before kexec"

Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc

Pull powerpc fixes from Benjamin Herrenschmidt:
"I meant to sent that earlier but got swamped with other things, so
  here are some powerpc fixes for 3.6.  A few regression fixes and some
  bug fixes that I deemed should still make it.

  There's a FSL update from Kumar with a bunch of defconfig updates
  along with a few embedded fixes.

  I also reverted my g5_defconfig update that I merged earlier as it was
  completely busted, not too sure what happened there, I'll do a new one
  later."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  Revert "powerpc: Update g5_defconfig"
  powerpc/perf: Use pmc_overflow() to detect rolled back events
  powerpc: Fix VMX in interrupt check in POWER7 copy loops
  powerpc: POWER7 copy_to_user/copy_from_user patch applied twice
  powerpc: Fix personality handling in ppc64_personality()
  powerpc/dma-iommu: Fix IOMMU window check
  powerpc: Remove unnecessary ifdefs
  powerpc/kgdb: Restore current_thread_info properly
  powerpc/kgdb: Bail out of KGDB when we've been triggered
  powerpc/kgdb: Do not set kgdb_single_step on ppc
  powerpc/mpic_msgr: Add missing includes
  powerpc: Fix null pointer deref in perf hardware breakpoints
  powerpc: Fixup whitespace in xmon
  powerpc: Fix xmon dl command for new printk implementation
  powerpc/fsl: fix "Failed to mount /dev: No such device" errors
  powerpc/fsl: update defconfigs
  booke/wdt: some ioctls do not return values properly
  powerpc/p4080ds: dts - add usb controller version info and port0
  powerpc/85xx: mpc85xx_defconfig - add VIA PATA support for MPC85xxCDS
  powerpc/fsl-pci: Only scan PCI bus if configured as a host

Merge git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Marcelo Tosatti.

* git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86 emulator: use stack size attribute to mask rsp in stack ops
  KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended
  ppc: e500_tlb memset clears nothing
  KVM: PPC: Add cache flush on page map
  KVM: PPC: Book3S HV: Fix incorrect branch in H_CEDE code
  KVM: x86: update KVM_SAVE_MSRS_BEGIN to correct value

Merge tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs

Pull xfs bugfixes from Ben Myers:
- fix uninitialised variable in xfs_rtbuf_get()
- unlock the AGI buffer when looping in xfs_dialloc
- check for possible overflow in xfs_ioc_trim

* tag 'for-linus-v3.6-rc4' of git://oss.sgi.com/xfs/xfs:
  xfs: check for possible overflow in xfs_ioc_trim
  xfs: unlock the AGI buffer when looping in xfs_dialloc
  xfs: fix uninitialised variable in xfs_rtbuf_get()

Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
"Random fixes across the MIPS tree.  The two hotspots are several bugs
  in the module loader and the ath79 SOC support; also noteworthy is the
  restructuring of the code to synchronize CPU timers across CPUs on
  startup; the old code recently ceased to work due to unrelated
  changes.

  All except one of these patches have sat for a significant time in
  linux-next for testing."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: pci-ar724x: avoid data bus error due to a missing PCIe module
  MIPS: Malta: Delete duplicate PCI fixup.
  MIPS: ath79: don't hardcode the unavailability of the DSP ASE
  MIPS: Synchronize MIPS count one CPU at a time
  MIPS: BCM63xx: Fix SPI message control register handling for BCM6338/6348.
  MIPS: Module: Deal with malformed HI16/LO16 relocation sequences.
  MIPS: Fix race condition in module relocation code.
  MIPS: Fix memory leak in error path of HI16/LO16 relocation handling.
  MIPS: MTX-1: Add udelay to mtx1_pci_idsel
  MIPS: ath79: select HAVE_CLK
  MIPS: ath79: Use correct IRQ number for the OHCI controller on AR7240
  MIPS: ath79: Fix number of GPIO lines for AR724[12]
  MIPS: Octeon: Fix broken interrupt controller code.

Merge branch 'for-3.6' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfixes from J. Bruce Fields:
"Particular thanks to Michael Tokarev, Malahal Naineni, and Jamie
  Heilman for their testing and debugging help."

* 'for-3.6' of git://linux-nfs.org/~bfields/linux:
  svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping
  svcrpc: sends on closed socket should stop immediately
  svcrpc: fix BUG() in svc_tcp_clear_pages
  nfsd4: fix security flavor of NFSv4.0 callback

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block-related fixes from Jens Axboe:

- Improvements to the buffered and direct write IO plugging from
   Fengguang.

- Abstract out the mapping of a bio in a request, and use that to
   provide a blk_bio_map_sg() helper.  Useful for mapping just a bio
   instead of a full request.

- Regression fix from Hugh, fixing up a patch that went into the
   previous release cycle (and marked stable, too) attempting to prevent
   a loop in __getblk_slow().

- Updates to discard requests, fixing up the sizing and how we align
   them.  Also a change to disallow merging of discard requests, since
   that doesn't really work properly yet.

- A few drbd fixes.

- Documentation updates.

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: replace __getblk_slow misfix by grow_dev_page fix
  drbd: Write all pages of the bitmap after an online resize
  drbd: Finish requests that completed while IO was frozen
  drbd: fix drbd wire compatibility for empty flushes
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update tunable options in block/cfq-iosched.txt
  Documentation: update missing index files in block/00-INDEX
  block: move down direct IO plugging
  block: remove plugging at buffered write time
  block: disable discard request merge temporarily
  bio: Fix potential memory leak in bio_find_or_create_slab()
  block: Don't use static to define "void *p" in show_partition_start()
  block: Add blk_bio_map_sg() helper
  block: Introduce __blk_segment_map_sg() helper
  fs/block-dev.c:fix performance regression in O_DIRECT writes to md block devices
  block: split discard into aligned requests
  block: reorganize rounding of max_discard_sectors

Merge tag 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

Pull libata fixes from Jeff Garzik:
- libata-acpi regression fix
- additional or corrected drive quirks for ata_blacklist
- Kconfig text tweaking
- new PCI IDs
- pata_atiixp: quirk for MSI motherboard
- export ahci_dev_classify for an ahci_platform driver

* tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry
  [libata] new quirk, lift bridge limits for Buffalo DriveStation Quattro
  [libata] Kconfig: Elaborate that SFF is meant for legacy and PATA stuff
  [libata] acpi: call ata_acpi_gtm during ata port init time
  ata_piix: Add Device IDs for Intel Lynx Point-LP PCH
  ahci: Add Device IDs for Intel Lynx Point-LP PCH
  pata_atiixp: override cable detection on MSI E350DM-E33
  ahci: un-staticize ahci_dev_classify

libata: Add a space to " 2GB ATA Flash Disk" DMA blacklist entry

commit d70e551c8e1ecb6f20422f8db6bfe6a0049edcb8, Add " 2GB ATA Flash
Disk"/"ADMA428M" to DMA blacklist, should have added a space before 2GB.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

Revert "powerpc: Update g5_defconfig"

This reverts commit b1acf1bb544cf28c1f4be0a45620fa899c74b7e9.

Something went horribly wrong when I did savedefconfig, not sure what,
but what's in there is busted so let's revert it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/perf: Use pmc_overflow() to detect rolled back events

For certain speculative events on Power7, 'perf stat' reports far higher
event count than 'perf record' for the same event.

As described in following commit, a performance monitor exception is raised
even when the the performance events are rolled back.

        commit 0837e3242c73566fc1c0196b4ec61779c25ffc93
        Author: Anton Blanchard <anton@samba.org>
        Date:   Wed Mar 9 14:38:42 2011 +1100

perf_event_interrupt() records an event only when an overflow occurs. But
this check for overflow is a simple 'if (val < 0)'.

Because the events are rolled back, this check for overflow fails and the
event is not recorded. perf_event_interrupt() later uses pmc_overflow() to
detect the overflow and resets the counters and the events are lost completely.

To properly detect the overflow of rolled back events, use pmc_overflow()
even when recording events.

To reproduce:
        $ cat strcpy.c
        #include <stdio.h>
        #include <string.h>
        main()
        {
                char buf[256];

                alarm(5);
                while(1)
                        strcpy(buf, "string1");
        }

        $ perf record -e r20014 ./strcpy
        $ perf report -n > report.1
        $ perf stat -e r20014 > report.2
        # Compare report.1 and report.2

Reported-by: Maynard Johnson <mpjohn@us.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Fix VMX in interrupt check in POWER7 copy loops

The enhanced prefetch hint patches corrupt the condition register
that was used to check if we are in interrupt. Fix this by using cr1.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: POWER7 copy_to_user/copy_from_user patch applied twice

"powerpc: Use enhanced touch instructions in POWER7
copy_to_user/copy_from_user" was applied twice. Remove one.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Fix personality handling in ppc64_personality()

Directly comparing current->personality against PER_LINUX32 doesn't work
in cases when any of the personality flags stored in the top three bytes
are used.

Directly forcefully setting personality to PER_LINUX32 or PER_LINUX
discards any flags stored in the top three bytes

Use personality() macro to compare only PER_MASK bytes and make sure that
we are setting only the bits that should be set, instead of overwriting
the whole value.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/dma-iommu: Fix IOMMU window check

Checking for device mask to cover the whole IOMMU table is too strict.
IOMMU allocators should handle mask constraint properly for each
allocation.

The patch enables to use old AirPort Extreme cards on PowerMacs with
more than 1GB of memory; without the patch the driver init fails with:

  b43-pci-bridge 0001:01:01.0: Warning: IOMMU window too big for device mask
  b43-pci-bridge 0001:01:01.0: mask: 0x3fffffff, table end: 0x80000000
  b43-phy0 ERROR: The machine/kernel does not support the required 30-bit DMA mask

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Remove unnecessary ifdefs

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/kgdb: Restore current_thread_info properly

For powerpc BooKE and e200, singlestep is handled on the critical/dbg
exception stack. This causes current_thread_info() to fail for kgdb
internal, so previously We work around this issue by copying
the thread_info from the kernel stack before calling kgdb_handle_exception,
and copying it back afterwards.

But actually we don't do this properly. We should backup current_thread_info
then restore that when exit.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/kgdb: Bail out of KGDB when we've been triggered

We need to skip a breakpoint exception when it occurs after
a breakpoint has already been removed.

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/kgdb: Do not set kgdb_single_step on ppc

The kgdb_single_step flag has the possibility to indefinitely
hang the system on an SMP system.

The x86 arch have the same problem, and that problem was fixed by
commit 8097551d9ab9b9e3630(kgdb,x86: do not set kgdb_single_step
on x86). This patch does the same behaviors as x86's patch.

Signed-off-by: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc/mpic_msgr: Add missing includes

Add several #includes that mpic_msgr relies on being pulled implicitly,
which only happens on certain configs.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Meador Inge <meador_inge@mentor.com>
Cc: Jia Hongtao <B38951@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Fix null pointer deref in perf hardware breakpoints

Currently if you are doing a global perf recording with hardware
breakpoints (ie perf record -e mem:0xdeadbeef -a), you can oops with:

  Faulting instruction address: 0xc000000000738890
  cpu 0xc: Vector: 300 (Data Access) at [c0000003f76af8d0]
      pc: c000000000738890: .hw_breakpoint_handler+0xa0/0x1e0
      lr: c000000000738830: .hw_breakpoint_handler+0x40/0x1e0
      sp: c0000003f76afb50
     msr: 8000000000001032
     dar: 6f0
   dsisr: 42000000
    current = 0xc0000003f765ac00
    paca    = 0xc00000000f262a00   softe: 0        irq_happened: 0x01
    pid   = 6810, comm = loop-read
  enter ? for help
  [c0000003f76afbe0] c00000000073cd04 .notifier_call_chain.isra.0+0x84/0xe0
  [c0000003f76afc80] c00000000073cdbc .notify_die+0x3c/0x60
  [c0000003f76afd20] c0000000000139f0 .do_dabr+0x40/0xf0
  [c0000003f76afe30] c000000000005a9c handle_dabr_fault+0x14/0x48
  --- Exception: 300 (Data Access) at 0000000010000480
  SP (ff8679e0) is in userspace

This is because we don't check to see if the break point is associated
with task before we deference the task_struct pointer.

This changes the update to use current.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Fixup whitespace in xmon

There are a few whitespace goolies in xmon.c, some of them appear to
be my fault. Fix them all in one go.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

powerpc: Fix xmon dl command for new printk implementation

Since the printk internals were reworked the xmon 'dl' command which
dumps the content of __log_buf has stopped working.

It is now a structured buffer, so just dumping it doesn't really work.

Use the helpers added for kgdb to print out the content.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Merge git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
"This push fixes a build error on 32-bit archs in the hifn driver as
  well as a potential deadlock in the caam driver."

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: caam - fix possible deadlock condition
  crypto: hifn_795x - fix 64bit division and undefined __divdi3 on 32bit archs

Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull UDF, ext3 & reiserfs fixes from Jan Kara:
"A couple of fixes (udf, reiserfs, ext3) that accumulated over my
  vacation."

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: fix retun value on error path in udf_load_logicalvol
  jbd: don't write superblock when unmounting an ro filesystem
  reiserfs: fix deadlocks with quotas
  quota: Move down dqptr_sem read after initializing default warn[] type at __dquot_alloc_space().
  UDF: During mount free lvid_bh before rescanning with different blocksize
  udf: fix udf_setsize() for file data in ICB

Merge tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs

Pull UBIFS fixes from Artem Bityutskiy:
- Fix crash on error which prevents emulated power-cut testing.
- Fix log reply regression introduced in 3.6-rc1.
- Fix UBIFS complaints about too small debug buffer size which.
- Fix error message spelling, and remove incorrect commentary.

* tag 'upstream-3.6-rc3' of git://git.infradead.org/linux-ubifs:
  UBIFS: fix error messages spelling
  UBIFS: fix complaints about too small debug buffer size
  UBIFS: fix replay regression
  UBIFS: fix crash on error path
  UBIFS: remove stale commentary

Merge git://git./linux/kernel/git/davem/ide

Pull IDE power management bugfix from David S. Miller.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: fix generic_ide_suspend/resume Oops

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"This tree contains misc fixlets: a perf script python binding fix, a
  uprobes fix and a syscall tracing fix."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Add missing files to build the python binding
  uprobes: Fix mmap_region()'s mm->mm_rb corruption if uprobe_mmap() fails
  tracing/syscalls: Fix perf syscall tracing when syscall_nr == -1

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"This tree contains assorted fixlets: an alternatives patching crash
  fix, an irq migration/hotplug interaction fix, a fix for large AMD
  microcode images and a comment fixlet."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode, AMD: Fix broken ucode patch size check
  x86/alternatives: Fix p6 nops on non-modular kernels
  x86/fixup_irq: Use cpu_online_mask instead of cpu_all_mask
  x86/spinlocks: Fix comment in spinlock.h

Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
"Mostly small fixes for the fallout of the timekeeping overhaul in 3.6
  along with stable fixes to address an accumulation problem and missing
  sanity checks for RTC readouts and user space provided values."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Avoid making adjustments if we haven't accumulated anything
  time: Avoid potential shift overflow with large shift values
  time: Fix casting issue in timekeeping_forward_now
  time: Ensure we normalize the timekeeper in tk_xtime_add
  time: Improve sanity checking of timekeeping inputs

Merge branch 'upstream-fixes' of git://git./linux/kernel/git/jikos/hid

Pull HID fix from Jiri Kosina:
"Fix for one particular device not being properly claimed by
hid-multitouch driver"

* 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: Remove QUANTA from special drivers list

xfs: check for possible overflow in xfs_ioc_trim

If range.start or range.minlen is bigger than filesystem size, return
invalid value error. This fixes possible overflow in BTOBB macro when
passed value was nearly ULLONG_MAX.

Signed-off-by: Tomas Racek <tracek@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: unlock the AGI buffer when looping in xfs_dialloc

Also update some commens in the area to make the code easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

xfs: fix uninitialised variable in xfs_rtbuf_get()

Results in this assert failure in generic/090:

XFS: Assertion failed: *nmap >= 1, file: fs/xfs/xfs_bmap.c, line: 4363
.....
Call Trace:
[<ffffffff814680db>] xfs_bmapi_read+0x6b/0x370
[<ffffffff814b64b2>] xfs_rtbuf_get+0x42/0x130
[<ffffffff814b6f09>] xfs_rtget_summary+0x89/0x120
[<ffffffff814b7bfe>] xfs_rtallocate_extent_size+0xce/0x340
[<ffffffff814b89f0>] xfs_rtallocate_extent+0x240/0x290
[<ffffffff81462c1a>] xfs_bmap_rtalloc+0x1ba/0x340
[<ffffffff81463a65>] xfs_bmap_alloc+0x35/0x40
[<ffffffff8146f111>] xfs_bmapi_allocate+0xf1/0x350
[<ffffffff8146f9de>] xfs_bmapi_write+0x66e/0xa60
[<ffffffff8144538a>] xfs_iomap_write_direct+0x22a/0x3f0
[<ffffffff8143707b>] __xfs_get_blocks+0x38b/0x5d0
[<ffffffff814372d4>] xfs_get_blocks_direct+0x14/0x20
[<ffffffff811b0081>] do_blockdev_direct_IO+0xf71/0x1eb0
[<ffffffff811b1015>] __blockdev_direct_IO+0x55/0x60
[<ffffffff814355ca>] xfs_vm_direct_IO+0x11a/0x1e0
[<ffffffff8112d617>] generic_file_direct_write+0xd7/0x1b0
[<ffffffff8143e16c>] xfs_file_dio_aio_write+0x13c/0x320
[<ffffffff8143e6f2>] xfs_file_aio_write+0x1c2/0x1d0
[<ffffffff81174a07>] do_sync_write+0xa7/0xe0
[<ffffffff81175288>] vfs_write+0xa8/0x160
[<ffffffff81175702>] sys_pwrite64+0x92/0xb0
[<ffffffff81b68f69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

powerpc/fsl: fix "Failed to mount /dev: No such device" errors

Yocto (Built by Poky 7.0) 1.2 root filesystems fail to boot,
at least over nfs, with:

Failed to mount /dev: No such device

Configuring DEVTMPFS fixes it.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

powerpc/fsl: update defconfigs

run make savedefconfig on fsl defconfigs.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

Merge branch 'randconfig/mach' into fixes

Small platform specific bug fixes for problems found in randconfig builds.

* randconfig/mach:
  ARM: ux500: don't select LEDS_GPIO for snowball
  ARM: imx: build i.MX6 functions only when needed
  ARM: imx: select CPU_FREQ_TABLE when needed
  ARM: imx: fix ksz9021rn_phy_fixup
  ARM: imx: build pm-imx5 code only when PM is enabled
  ARM: omap: allow building omap44xx without SMP

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

ARM: ux500: don't select LEDS_GPIO for snowball

Using 'select' in Kconfig is hard, a platform cannot just
enable a driver without also making sure that its subsystem
is there. Also, there is no actual code dependency between
the platform and the gpio leds driver.

Without this patch, building without LEDS_CLASS esults in:

drivers/built-in.o: In function `create_gpio_led.part.2':
governor_userspace.c:(.devinit.text+0x5a58): undefined reference to `led_classdev_register'
drivers/built-in.o: In function `gpio_led_remove':
governor_userspace.c:(.devexit.text+0x6b8): undefined reference to `led_classdev_unregister'

This reverts 8733f53c6 "ARM: ux500: Kconfig: Compile in leds-gpio
support for Snowball" that introduced the regression and did not
provide a helpful explanation.

In order to leave the GPIO LED code still present in normal
builds, this also enables the symbol in u8500_defconfig, in addition
to the other LED drivers that are already selected there.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Lee Jones <lee.jones@linaro.org>

ARM: imx: build i.MX6 functions only when needed

The head-v7.S contains a call to the generic cpu_suspend function,
which is only available when selected by the i.MX6 code. As
pointed out by Shawn Guo, i.MX5 does not actually use any
functions defined in head-v7.S. It is also needed only for
the i.MX6 power management code and for the SMP code, so
we can restrict building this file to situations in which
at least one of those two is present.

Finally, other platforms with a similar file call it headsmp.S,
so we can rename it to the same for consistency.

Without this patch, building imx5 standalone results in:

arch/arm/mach-imx/built-in.o: In function `v7_cpu_resume':
arch/arm/mach-imx/head-v7.S:104: undefined reference to `cpu_resume'

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Eric Miao <eric.miao@linaro.org>
Cc: stable@vger.kernel.org

ARM: imx: select CPU_FREQ_TABLE when needed

The i.MX cpufreq implementation uses the CPU_FREQ_TABLE helpers,
so it needs to select that code to be built. This problem has
apparently existed since the i.MX cpufreq code was first merged
in v2.6.37.

Building IMX without CPU_FREQ_TABLE results in:

arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_exit':
arch/arm/plat-mxc/cpufreq.c:173: undefined reference to `cpufreq_frequency_table_put_attr'
arch/arm/plat-mxc/built-in.o: In function `mxc_set_target':
arch/arm/plat-mxc/cpufreq.c:84: undefined reference to `cpufreq_frequency_table_target'
arch/arm/plat-mxc/built-in.o: In function `mxc_verify_speed':
arch/arm/plat-mxc/cpufreq.c:65: undefined reference to `cpufreq_frequency_table_verify'
arch/arm/plat-mxc/built-in.o: In function `mxc_cpufreq_init':
arch/arm/plat-mxc/cpufreq.c:154: undefined reference to `cpufreq_frequency_table_cpuinfo'
arch/arm/plat-mxc/cpufreq.c:162: undefined reference to `cpufreq_frequency_table_get_attr'

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Yong Shen <yong.shen@linaro.org>
Cc: stable@vger.kernel.org

ARM: imx: fix ksz9021rn_phy_fixup

The ksz9021rn_phy_fixup and mx6q_sabrelite functions try to
set up an ethernet phy if they can. They do check whether
phylib is enabled, but unfortunately the functions can only
be called from platform code if phylib is builtin, not
if it is a module

Without this patch, building with a modular phylib results in:

arch/arm/mach-imx/mach-imx6q.c: In function 'imx6q_sabrelite_init':
arch/arm/mach-imx/mach-imx6q.c:120:5: error: 'ksz9021rn_phy_fixup' undeclared (first use in this function)
arch/arm/mach-imx/mach-imx6q.c:120:5: note: each undeclared identifier is reported only once for each function it appears in

The bug was originally reported by Artem Bityutskiy but only
partially fixed in ef441806 "ARM: imx6q: register phy fixup only when
CONFIG_PHYLIB is enabled".

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>

ARM: imx: build pm-imx5 code only when PM is enabled

This moves the imx5 pm code out of the list of unconditionally
compiled files for imx5, mirroring what we already do for imx6
and how it was done before the code was move from mach-mx5 to
mach-imx in v3.3.

Without this patch, building with CONFIG_PM disabled results in:

arch/arm/mach-imx/pm-imx5.c:202:116: error: redefinition of 'imx51_pm_init'
arch/arm/mach-imx/include/mach-imx/common.h:154:91: note: previous definition of 'imx51_pm_init' was here
arch/arm/mach-imx/pm-imx5.c:209:116: error: redefinition of 'imx53_pm_init'
arch/arm/mach-imx/include/mach-imx/common.h:155:91: note: previous definition of 'imx53_pm_init' was here

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: stable@vger.kernel.org

ARM: omap: allow building omap44xx without SMP

The new omap4 cpuidle implementation currently requires
ARCH_NEEDS_CPU_IDLE_COUPLED, which only works on SMP.

This patch makes it possible to build a non-SMP kernel
for that platform. This is not normally desired for
end-users but can be useful for testing.

Without this patch, building rand-0y2jSKT results in:

drivers/cpuidle/coupled.c: In function 'cpuidle_coupled_poke':
drivers/cpuidle/coupled.c:317:3: error: implicit declaration of function '__smp_call_function_single' [-Werror=implicit-function-declaration]

It's not clear if this patch is the best solution for
the problem at hand. I have made sure that we can now
build the kernel in all configurations, but that does
not mean it will actually work on an OMAP44xx.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: Kevin Hilman <khilman@ti.com>
Cc: Tony Lindgren <tony@atomide.com>

Merge tag 'ux500-fixes-v3.6-rc2' of git://git./linux/kernel/git/linusw/linux-stericsson into fixes

From Linus Walleij <linus.walleij@linaro.org>:
Here are two audio fixes for the ux500 found by Lee Jones.

* tag 'ux500-fixes-v3.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson:
ARM: ux500: Ensure probing of Audio devices when Device Tree is enabled
ARM: ux500: Fix merge error, no matching driver name for 'snd_soc_u8500'

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge branch 'v3.6-samsung-fixes-1' of git://git./linux/kernel/git/kgene/linux-samsung into fixes

From Kukjin Kim <kgene.kim@samsung.com>:

For HDMI, already HDMI support for EXYNOS in mainline kernel is broken
because its configuration moved to platform data but regarding platform
data didn't support yet. And others are for fix warnings.

* 'v3.6-samsung-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Set HDMI platform data in Origen board
  ARM: EXYNOS: Set HDMI platform data in SMDKV310
  ARM: SAMSUNG: Add API to set platform data for s5p-tv driver
  ARM: SAMSUNG: Set HDMI platform data for Exynos4x12 SoCs
  ARM: Samsung: Make uart_save static in pm.c file
  ARM: S3C24XX: Fix s3c2410_dma_enqueue parameters
  ARM: S3C24XX: Add missing DMACH_DT_PROP

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Merge branch 'imx/fixes-for-3.6' of git://git.linaro.org/people/shawnguo/linux-2.6 into fixes

* 'imx/fixes-for-3.6' of git://git.linaro.org/people/shawnguo/linux-2.6:
  ARM: dts: imx51-babbage: fix esdhc cd/wp properties
  ARM: imx6: spin the cpu until hardware takes it down
  ARM i.MX6q: Add virtual 1/3.5 dividers in the LDB clock path

Also updates to Linux 3.6-rc2

Signed-off-by: Arnd Bergmann <arnd@arndb.de>

xen/setup: Fix one-off error when adding for-balloon PFNs to the P2M.

When we are finished with return PFNs to the hypervisor, then
populate it back, and also mark the E820 MMIO and E820 gaps
as IDENTITY_FRAMEs, we then call P2M to set areas that can
be used for ballooning. We were off by one, and ended up
over-writting a P2M entry that most likely was an IDENTITY_FRAME.
For example:

1-1 mapping on 40000->40200
1-1 mapping on bc558->bc5ac
1-1 mapping on bc5b4->bc8c5
1-1 mapping on bc8c6->bcb7c
1-1 mapping on bcd00->100000
Released 614 pages of unused memory
Set 277889 page(s) to 1-1 mapping
Populating 40200-40466 pfn range: 614 pages added

=> here we set from 40466 up to bc559 P2M tree to be
INVALID_P2M_ENTRY. We should have done it up to bc558.

The end result is that if anybody is trying to construct
a PTE for PFN bc558 they end up with ~PAGE_PRESENT.

CC: stable@vger.kernel.org
Reported-by-and-Tested-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

MIPS: pci-ar724x: avoid data bus error due to a missing PCIe module

If the controller has no PCIe module attached, accessing of the device
configuration space causes a data bus error. Avoid this by checking the
status of the PCIe link in advance, and indicate an error if the link
is down.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: stable@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/4293/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

ARM: dts: imx51-babbage: fix esdhc cd/wp properties

The binding doc and dts use properties "fsl,{cd,wp}-internal" while
esdhc driver uses "fsl,{cd,wp}-controller". Fix binding doc and dts
to get them match driver code.

Reported-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: <stable@vger.kernel.org>
Acked-by: Chris Ball <cjb@laptop.org>

ARM: imx6: spin the cpu until hardware takes it down

Though commit 602bf40 (ARM: imx6: exit coherency when shutting down
a cpu) improves the stability of imx6q cpu hotplug a lot, there are
still hangs seen with a more stressful hotplug testing.

It's expected that once imx_enable_cpu(cpu, false) is called, the cpu
will be taken down by hardware immediately, and the code after that
will not get any chance to execute.  However, this is not always the
case from the testing.  The cpu could possibly be alive for a few
cycles before hardware actually takes it down.  So rather than letting
cpu execute some code that could cause a hang in these cycles, let's
make the cpu spin there and wait for hardware to take it down.

Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>

block: replace __getblk_slow misfix by grow_dev_page fix

Commit 91f68c89d8f3 ("block: fix infinite loop in __getblk_slow")
is not good: a successful call to grow_buffers() cannot guarantee
that the page won't be reclaimed before the immediate next call to
__find_get_block(), which is why there was always a loop there.

Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
inode #19278: block 664: comm cc1: unable to read itable block" on console,
which pointed to this commit.

I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
sometimes fails from a missing header file, under memory pressure on
ppc G5. I've never seen this on x86, and I've never seen it on 3.5-rc7
itself, despite that commit being in there: bisection pointed to an
irrelevant pinctrl merge, but hard to tell when failure takes between
18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).

(I've since found such __ext4_get_inode_loc errors in /var/log/messages
from previous weeks: why the message never appeared on console until
yesterday morning is a mystery for another day.)

Revert 91f68c89d8f3, restoring __getblk_slow() to how it was (plus
a checkpatch nitfix). Simplify the interface between grow_buffers()
and grow_dev_page(), and avoid the infinite loop beyond end of device
by instead checking init_page_buffers()'s end_block there (I presume
that's more efficient than a repeated call to blkdev_max_block()),
returning -ENXIO to __getblk_slow() in that case.

And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
comment, but that is worrying: are all users of __getblk() really
now prepared for a NULL bh beyond end of device, or will some oops??

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org # 3.0 3.2 3.4 3.5
Signed-off-by: Jens Axboe <axboe@kernel.dk>

x86, microcode, AMD: Fix broken ucode patch size check

This issue was recently observed on an AMD C-50 CPU where a patch of
maximum size was applied.

Commit be62adb49294 ("x86, microcode, AMD: Simplify ucode verification")
added current_size in get_matching_microcode(). This is calculated as
size of the ucode patch + 8 (ie. size of the header). Later this is
compared against the maximum possible ucode patch size for a CPU family.
And of course this fails if the patch has already maximum size.

Cc: <stable@vger.kernel.org> [3.3+]
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Link: http://lkml.kernel.org/r/1344361461-10076-1-git-send-email-bp@amd64.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>

KVM: x86 emulator: use stack size attribute to mask rsp in stack ops

The sub-register used to access the stack (sp, esp, or rsp) is not
determined by the address size attribute like other memory references,
but by the stack segment's B bit (if not in x86_64 mode).

Fix by using the existing stack_mask() to figure out the correct mask.

This long-existing bug was exposed by a combination of a27685c33acccce
(emulate invalid guest state by default), which causes many more
instructions to be emulated, and a seabios change (possibly a bug) which
causes the high 16 bits of esp to become polluted across calls to real
mode software interrupts.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Linux 3.6-rc3

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"Intel: edid fixes, power consumption fix, s/r fix, haswell fix

  Radeon: BIOS loading fixes for UEFI and Thunderbolt machines, better
  MSAA validation, lockup timeout fixes, modesetting fixes

  One udl dpms fix, one vmwgfx fix, a couple of trivial core changes.

  There is an export added to ACPI as part of the radeon bios fixes.

  I've also included the fbcon flashing cursor vs deinit race fix, that
  seems the simplest place to start"

Trivial conflict in drivers/video/console/fbcon.c due to me having
already applied the fbcon flashing cursor vs deinit race fix, and Dave
had added a comment in there too.

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (22 commits)
  fbcon: fix race condition between console lock and cursor timer (v1.1)
  drm: Add missing static storage class specifiers in drm_proc.c file
  drm/udl: dpms off the crtc when disabled.
  drm: Remove two unused fields from struct drm_display_mode
  drm: stop vmgfx driver explosion
  drm/radeon/ss: use num_crtc rather than hardcoded 6
  Revert "drm/radeon: fix bo creation retry path"
  drm/i915: use hsw rps tuning values everywhere on gen6+
  drm/radeon: split ATRM support out from the ATPX handler (v3)
  drm/radeon: convert radeon vfct code to use acpi_get_table_with_size
  ACPI: export symbol acpi_get_table_with_size
  drm/radeon: implement ACPI VFCT vbios fetch (v3)
  drm/radeon/kms: extend the Fujitsu D3003-S2 board connector quirk to cover later silicon stepping
  drm/radeon: fix checking of MSAA renderbuffers on r600-r700
  drm/radeon: allow CMASK and FMASK in the CS checker on r600-r700
  drm/radeon: init lockup timeout on ring init
  drm/radeon: avoid turning off spread spectrum for used pll
  drm/i915: fall back to bit-banging if GMBUS fails in CRT EDID reads
  drm/i915: extract connector update from intel_ddc_get_modes() for reuse
  drm/i915: fix hsw uncached pte
  ...

Merge git://git./linux/kernel/git/nab/target-pending

Pull SCSI target fixes from Nicholas Bellinger:
"The executive summary includes:

   - Post-merge review comments for tcm_vhost (MST + nab)
   - Avoid debugging overhead when not debugging for tcm-fc(FCoE) (MDR)
   - Fix NULL pointer dereference bug on alloc_page failulre (Yi Zou)
   - Fix REPORT_LUNs regression bug with pSCSI export (AlexE + nab)
   - Fix regression bug with handling of zero-length data CDBs (nab)
   - Fix vhost_scsi_target structure alignment (MST)

  Thanks again to everyone who contributed a bugfix patch, gave review
  feedback on tcm_vhost code, and/or reported a bug during their own
  testing over the last weeks.

  There is one other outstanding bug reported by Roland recently related
  to SCSI transfer length overflow handling, for which the current
  proposed bugfix has been left in queue pending further testing with
  other non iscsi-target based fabric drivers.

  As the patch is verified with loopback (local SGL memory from SCSI
  LLD) + tcm_qla2xxx (TCM allocated SGL memory mapped to PCI HW) fabric
  ports, it will be included into the next 3.6-rc-fixes PULL request."

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Remove unused se_cmd.cmd_spdtl
  tcm_fc: rcu_deref outside rcu lock/unlock section
  tcm_vhost: Fix vhost_scsi_target structure alignment
  target: Fix regression bug with handling of zero-length data CDBs
  target/pscsi: Fix bug with REPORT_LUNs handling for SCSI passthrough
  tcm_vhost: Change vhost_scsi_target->vhost_wwpn to char *
  target: fix NULL pointer dereference bug alloc_page() fails to get memory
  tcm_fc: Avoid debug overhead when not debugging
  tcm_vhost: Post-merge review changes requested by MST
  tcm_vhost: Fix incorrect IS_ERR() usage in vhost_scsi_map_iov_to_sgl

Merge branch 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux

Pull i2c-embedded fixes from Wolfram Sang:
"Some bugfixes for the "embedded" part of the I2C subsystem.  The fixes
  affect mostly drivers which have been largely reworked lately and
  where regressions appeared."

* 'i2c-embedded/for-current' of git://git.pengutronix.de/git/wsa/linux:
  i2c: tegra: protect suspend/resume callbacks with CONFIG_PM_SLEEP
  i2c: diolan-u2c: Fix master_xfer return code
  I2C: OMAP: xfer: fix runtime PM get/put balance on error
  i2c: nomadik: Add default configuration into the Nomadik I2C driver

Merge tag 'for-3.6-rc3' of git://gitorious.org/linux-pwm/linux-pwm

Pull pwm fixes from Thierry Reding:
"These patches fix the Samsung PWM driver and perform some minor
  cleanups like fixing checkpatch and sparse warnings.

  Two redundant error messages are removed and the Kconfig help text for
  the PWM subsystem is made more descriptive."

* tag 'for-3.6-rc3' of git://gitorious.org/linux-pwm/linux-pwm:
  pwm: Improve Kconfig help text
  pwm: core: Fix coding style issues
  pwm: vt8500: Fix coding style issue
  pwm: Remove a redundant error message when devm_request_and_ioremap fails
  pwm: samsung: add missing device pointer to struct pwm_chip
  pwm: Add missing static storage class specifiers in core.c file

Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client

Pull ceph fixes from Sage Weil:
"Jim's fix closes a narrow race introduced with the msgr changes.  One
  fix resolves problems with debugfs initialization that Yan found when
  multiple client instances are created (e.g., two clusters mounted, or
  rbd + cephfs), another one fixes problems with mounting a nonexistent
  server subdirectory, and the last one fixes a divide by zero error
  from unsanitized ioctl input that Dan Carpenter found."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: avoid divide by zero in __validate_layout()
  libceph: avoid truncation due to racing banners
  ceph: tolerate (and warn on) extraneous dentry from mds
  libceph: delay debugfs initialization until we learn global_id

Merge tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
- NFSv3 mounts need to fail if the FSINFO rpc call fails
- Ensure that the NFS commit cache gets torn down when we unload the
   NFS module.
- Fix memory scribble issues when interrupting a LAYOUTGET rpc call
- Fix NFSv4 legacy idmapper regressions
- Fix issues with the NFSv4 getacl command
- Fix a regression when using the legacy "mount -t nfs4"

* tag 'nfs-for-3.6-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFSv3: Ensure that do_proc_get_root() reports errors correctly
  NFSv4: Ensure that nfs4_alloc_client cleans up on error.
  NFS: return -ENOKEY when the upcall fails to map the name
  NFS: Clear key construction data if the idmap upcall fails
  NFSv4: Don't use private xdr_stream fields in decode_getacl
  NFSv4: Fix the acl cache size calculation
  NFSv4: Fix pointer arithmetic in decode_getacl
  NFS: Alias the nfs module to nfs4
  NFS: Fix a regression when loading the NFS v4 module
  NFSv4.1: Remove a bogus BUG_ON() in nfs4_layoutreturn_done
  pnfs-obj: Better IO pattern in case of unaligned offset
  NFS41: add pg_layout_private to nfs_pageio_descriptor
  pnfs: nfs4_proc_layoutget returns void
  pnfs: defer release of pages in layoutget
  nfs: tear down caches in nfs_init_writepagecache when allocation fails

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull assorted fixes - mostly vfs - from Al Viro:
"Assorted fixes, with an unexpected detour into vfio refcounting logics
  (fell out when digging in an analog of eventpoll race in there)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  task_work: add a scheduling point in task_work_run()
  fs: fix fs/namei.c kernel-doc warnings
  eventpoll: use-after-possible-free in epoll_create1()
  vfio: grab vfio_device reference *before* exposing the sucker via fd_install()
  vfio: get rid of vfio_device_put()/vfio_group_get_device* races
  vfio: get rid of open-coding kref_put_mutex
  introduce kref_put_mutex()
  vfio: don't dereference after kfree...
  mqueue: lift mnt_want_write() outside ->i_mutex, clean up a bit

HID: Remove QUANTA from special drivers list

This QUANTA device is driven by the generic hid-multitouch.ko driver, and
therefore shouldn't be in the special drivers list.

This has been an oversight in 4fa3a58 ("HID: hid-multitouch: Switch to
device groups").

Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>

UBIFS: fix error messages spelling

Corruptio -> corruption.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>

task_work: add a scheduling point in task_work_run()

It seems commit 4a9d4b02 (switch fput to task_work_add) reintroduced
the problem addressed in commit 944be0b2 (close_files(): add scheduling
point)

If a server process with a lot of files (say 2 million tcp sockets)
is killed, we can spend a lot of time in task_work_run() and trigger
a soft lockup.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

fs: fix fs/namei.c kernel-doc warnings

Fix kernel-doc warnings in fs/namei.c:

Warning(fs/namei.c:360): No description found for parameter 'inode'
Warning(fs/namei.c:672): No description found for parameter 'nd'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

eventpoll: use-after-possible-free in epoll_create1()

As soon as we'd installed the file into descriptor table, it can
get closed by another thread. Freeing ep in process...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfio: grab vfio_device reference *before* exposing the sucker via fd_install()

It's not critical (anymore) since another thread closing the file will block
on ->device_lock before it gets to dropping the final reference, but it's
definitely cleaner that way...

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfio: get rid of vfio_device_put()/vfio_group_get_device* races

we really need to make sure that dropping the last reference happens
under the group->device_lock; otherwise a loop (under device_lock)
might find vfio_device instance that is being freed right now, has
already dropped the last reference and waits on device_lock to exclude
the sucker from the list.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfio: get rid of open-coding kref_put_mutex

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

introduce kref_put_mutex()

equivalent of
mutex_lock(mutex);
if (!kref_put(kref, release))
mutex_unlock(mutex);

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

vfio: don't dereference after kfree...

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended

Although the possible race described in

  commit 85b7059169e128c57a3a8a3e588fb89cb2031da1
  KVM: MMU: fix shrinking page from the empty mmu

was correct, the real cause of that issue was a more trivial bug of
mmu_shrink() introduced by

  commit 1952639665e92481c34c34c3e2a71bf3e66ba362
  KVM: MMU: do not iterate over all VMs in mmu_shrink()

Here is the bug:

if (kvm->arch.n_used_mmu_pages > 0) {
if (!nr_to_scan--)
break;
continue;
}

We skip VMs whose n_used_mmu_pages is not zero and try to shrink others:
in other words we try to shrink empty ones by mistake.

This patch reverses the logic so that mmu_shrink() can free pages from
the first VM whose n_used_mmu_pages is not zero.  Note that we also add
comments explaining the role of nr_to_scan which is not practically
important now, hoping this will be improved in the future.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>

x86/alternatives: Fix p6 nops on non-modular kernels

Probably a leftover from the early days of self-patching, p6nops
are marked __initconst_or_module, which causes them to be
discarded in a non-modular kernel. If something later triggers
patching, it will overwrite kernel code with garbage.

Reported-by: Tomas Racek <tracek@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/5034AE84.90708@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

UBIFS: fix complaints about too small debug buffer size

When debugging is enabled, we use a temporary on-stack buffer for formatting
the key strings like "(11368871, direntry, 0xcd0750)". The buffer size is
32 bytes and sometimes it is not enough to fit the key string - e.g., when
inode numbers are high. This is not fatal, but the key strings are incomplete
and UBIFS complains like this:

UBIFS assert failed in dbg_snprintf_key at 137 (pid 1)

This is a regression caused by "515315a UBIFS: fix key printing".

Fix the issue by increasing the buffer to 48 bytes.

Reported-by: Michael Hench <michaelhench@gmail.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Michael Hench <michaelhench@gmail.com>
Cc: stable@vger.kernel.org [v3.3+]

time: Avoid making adjustments if we haven't accumulated anything

If update_wall_time() is called and the current offset isn't large
enough to accumulate, avoid re-calling timekeeping_adjust which may
change the clock freq and can cause 1ns inconsistencies with
CLOCK_REALTIME_COARSE/CLOCK_MONOTONIC_COARSE.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1345595449-34965-5-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

time: Avoid potential shift overflow with large shift values

Andreas Schwab noticed that the 1 << tk->shift could overflow if the
shift value was greater than 30, since 1 would be a 32bit long on
32bit architectures. This issue was introduced by 1e75fa8be (time:
Condense timekeeper.xtime into xtime_sec)

Use 1ULL instead to ensure we don't overflow on the shift.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

time: Fix casting issue in timekeeping_forward_now

arch_gettimeoffset returns a u32 value which when shifted by tk->shift
can overflow. This issue was introduced with 1e75fa8be (time: Condense
timekeeper.xtime into xtime_sec)

Cast it to u64 first.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

time: Ensure we normalize the timekeeper in tk_xtime_add

Andreas noticed problems with resume on specific hardware after commit
1e75fa8b (time: Condense timekeeper.xtime into xtime_sec) combined
with commit b44d50dca (time: Fix casting issue in tk_set_xtime and
tk_xtime_add)

After some digging I realized we aren't normalizing the timekeeper
after the add. Add the missing normalize call.

Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1345595449-34965-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

x86/fixup_irq: Use cpu_online_mask instead of cpu_all_mask

When one CPU is going down and this CPU is the last one in irq
affinity, current code is setting cpu_all_mask as the new
affinity for that irq.

But for some systems (such as in Medfield Android mobile) the
firmware sends the interrupt to each CPU in the irq affinity
mask, averaged, and cpu_all_mask includes all potential CPUs,
i.e. offline ones as well.

So replace cpu_all_mask with cpu_online_mask.

Signed-off-by: liu chuansheng <chuansheng.liu@intel.com>
Acked-by: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A137286@SHSMSX101.ccr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>

x86/spinlocks: Fix comment in spinlock.h

This comment is no longer true. We support up to 2^16 CPUs
because __ticket_t is an u16 if NR_CPUS is larger than 256.

Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

fbcon: fix race condition between console lock and cursor timer (v1.1)

So we've had a fair few reports of fbcon handover breakage between
efi/vesafb and i915 surface recently, so I dedicated a couple of
days to finding the problem.

Essentially the last thing we saw was the conflicting framebuffer
message and that was all.

So after much tracing with direct netconsole writes (printks
under console_lock not so useful), I think I found the race.

Thread A (driver load)    Thread B (timer thread)
  unbind_con_driver ->              |
  bind_con_driver ->                |
  vc->vc_sw->con_deinit ->          |
  fbcon_deinit ->                   |
  console_lock()                    |
      |                             |
      |                       fbcon_flashcursor timer fires
      |                       console_lock() <- blocked for A
      |
      |
fbcon_del_cursor_timer ->
  del_timer_sync
  (BOOM)

Of course because all of this is under the console lock,
we never see anything, also since we also just unbound the active
console guess what we never see anything.

Hopefully this fixes the problem for anyone seeing vesafb->kms
driver handoff.

v1.1: add comment suggestion from Alan.

Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'akpm' (Andrew's patch-bomb)

Merge fixes from Andrew Morton.

Random drivers and some VM fixes.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (17 commits)
  mm: compaction: Abort async compaction if locks are contended or taking too long
  mm: have order > 0 compaction start near a pageblock with free pages
  rapidio/tsi721: fix unused variable compiler warning
  rapidio/tsi721: fix inbound doorbell interrupt handling
  drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode
  mm: correct page->pfmemalloc to fix deactivate_slab regression
  drivers/rtc/rtc-pcf2123.c: initialize dynamic sysfs attributes
  mm/compaction.c: fix deferring compaction mistake
  drivers/misc/sgi-xp/xpc_uv.c: SGI XPC fails to load when cpu 0 is out of IRQ resources
  string: do not export memweight() to userspace
  hugetlb: update hugetlbpage.txt
  checkpatch: add control statement test to SINGLE_STATEMENT_DO_WHILE_MACRO
  mm: hugetlbfs: correctly populate shared pmd
  cciss: fix incorrect scsi status reporting
  Documentation: update mount option in filesystem/vfat.txt
  mm: change nr_ptes BUG_ON to WARN_ON
  cs5535-clockevt: typo, it's MFGPT, not MFPGT

Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
"For bug fixes, at soc_camera, si470x, uvcvideo, iguanaworks IR driver,
  radio_shark Kbuild fixes, and at the V4L2 core (radio fixes)."

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] media: soc_camera: don't clear pix->sizeimage in JPEG mode
  [media] media: mx2_camera: Fix clock handling for i.MX27
  [media] video: mx2_camera: Use clk_prepare_enable/clk_disable_unprepare
  [media] video: mx1_camera: Use clk_prepare_enable/clk_disable_unprepare
  [media] media: mx3_camera: buf_init() add buffer state check
  [media] radio-shark2: Only compile led support when CONFIG_LED_CLASS is set
  [media] radio-shark: Only compile led support when CONFIG_LED_CLASS is set
  [media] radio-shark*: Call cancel_work_sync from disconnect rather then release
  [media] radio-shark*: Remove work-around for dangling pointer in usb intfdata
  [media] Add USB dependency for IguanaWorks USB IR Transceiver
  [media] Add missing logging for rangelow/high of hwseek
  [media] VIDIOC_ENUM_FREQ_BANDS fix
  [media] mem2mem_testdev: fix querycap regression
  [media] si470x: v4l2-compliance fixes
  [media] DocBook: Remove a spurious character
  [media] uvcvideo: Reset the bytesused field when recycling an erroneous buffer

Merge git://git./linux/kernel/git/davem/net

Pull networking update from David Miller:
"A couple weeks of bug fixing in there.  The largest chunk is all the
  broken crap Amerigo Wang found in the netpoll layer."

1) netpoll and it's users has several serious bugs:
    a) uses GFP_KERNEL with locks held
    b) interfaces requiring interrupts disabled are called with them
       enabled
    c) and vice versa
    d) VLAN tag demuxing, as per all other RX packet input paths, is not
       applied

    All from Amerigo Wang.

2) Hopefully cure the ipv4 mapped ipv6 address TCP early demux bugs for
    good, from Neal Cardwell.

3) Unlike AF_UNIX, AF_PACKET sockets don't set a default credentials
    when the user doesn't specify one explicitly during sendmsg().
    Instead we attach an empty (zero) SCM credential block which is
    definitely not what we want.  Fix from Eric Dumazet.

4) IPv6 illegally invokes netdevice notifiers with RCU lock held, fix
    from Ben Hutchings.

5) inet_csk_route_child_sock() checks wrong inet options pointer, fix
    from Christoph Paasch.

6) When AF_PACKET is used for transmit, packet loopback doesn't behave
    properly when a socket fanout is enabled, from Eric Leblond.

7) On bluetooth l2cap channel create failure, we leak the socket, from
    Jaganath Kanakkassery.

8) Fix all the netprio file handling bugs found by Al Viro, from John
    Fastabend.

9) Several error return and NULL deref bug fixes in networking drivers
    from Julia Lawall.

10) A large smattering of struct padding et al.  kernel memory leaks to
    userspace found of Mathias Krause.

11) Conntrack expections in netfilter can access an uninitialized timer,
    fix from Pablo Neira Ayuso.

12) Several netfilter SIP tracker bug fixes from Patrick McHardy.

13) IPSEC ipv6 routes are not initialized correctly all the time,
    resulting in an OOPS in inet_putpeer().  Also from Patrick McHardy.

14) Bridging does rcu_dereference() outside of RCU protected area, from
    Stephen Hemminger.

15) Fix routing cache removal performance regression when looking up
    output routes that have a local destination.  From Zheng Yan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
  af_netlink: force credentials passing [CVE-2012-3520]
  ipv4: fix ip header ident selection in __ip_make_skb()
  ipv4: Use newinet->inet_opt in inet_csk_route_child_sock()
  tcp: fix possible socket refcount problem
  net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child()
  net/core/dev.c: fix kernel-doc warning
  netconsole: remove a redundant netconsole_target_put()
  net: ipv6: fix oops in inet_putpeer()
  net/stmmac: fix issue of clk_get for Loongson1B.
  caif: Do not dereference NULL in chnl_recv_cb()
  af_packet: don't emit packet on orig fanout group
  drivers/net/irda: fix error return code
  drivers/net/wan/dscc4.c: fix error return code
  drivers/net/wimax/i2400m/fw.c: fix error return code
  smsc75xx: add missing entry to MAINTAINERS
  net: qmi_wwan: new devices: UML290 and K5006-Z
  net: sh_eth: Add eth support for R8A7779 device
  netdev/phy: skip disabled mdio-mux nodes
  dt: introduce for_each_available_child_of_node, of_get_next_available_child
  net: netprio: fix cgrp create and write priomap race
  ...

mm: compaction: Abort async compaction if locks are contended or taking too long

Jim Schutt reported a problem that pointed at compaction contending
heavily on locks.  The workload is straight-forward and in his own words;

The systems in question have 24 SAS drives spread across 3 HBAs,
running 24 Ceph OSD instances, one per drive.  FWIW these servers
are dual-socket Intel 5675 Xeons w/48 GB memory.  I've got ~160
Ceph Linux clients doing dd simultaneously to a Ceph file system
backed by 12 of these servers.

Early in the test everything looks fine

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  31 15          0     287216        576   38606628    0    0     2  1158    2   14   1  3  95  0  0
  27 15          0     225288        576   38583384    0    0    18 2222016 203357 134876  11 56  17 15  0
  28 17          0     219256        576   38544736    0    0    11 2305932 203141 146296  11 49  23 17  0
   6 18          0     215596        576   38552872    0    0     7 2363207 215264 166502  12 45  22 20  0
  22 18          0     226984        576   38596404    0    0     3 2445741 223114 179527  12 43  23 22  0

and then it goes to pot

  procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
   r  b       swpd       free       buff      cache   si   so    bi    bo   in   cs  us sy  id wa st
  163  8          0     464308        576   36791368    0    0    11 22210  866  536   3 13  79  4  0
  207 14          0     917752        576   36181928    0    0   712 1345376 134598 47367   7 90   1  2  0
  123 12          0     685516        576   36296148    0    0   429 1386615 158494 60077   8 84   5  3  0
  123 12          0     598572        576   36333728    0    0  1107 1233281 147542 62351   7 84   5  4  0
  622  7          0     660768        576   36118264    0    0   557 1345548 151394 59353   7 85   4  3  0
  223 11          0     283960        576   36463868    0    0    46 1107160 121846 33006   6 93   1  1  0

Note that system CPU usage is very high blocks being written out has
dropped by 42%. He analysed this with perf and found

  perf record -g -a sleep 10
  perf report --sort symbol --call-graph fractal,5
    34.63%  [k] _raw_spin_lock_irqsave
            |
            |--97.30%-- isolate_freepages
            |          compaction_alloc
            |          unmap_and_move
            |          migrate_pages
            |          compact_zone
            |          compact_zone_order
            |          try_to_compact_pages
            |          __alloc_pages_direct_compact
            |          __alloc_pages_slowpath
            |          __alloc_pages_nodemask
            |          alloc_pages_vma
            |          do_huge_pmd_anonymous_page
            |          handle_mm_fault
            |          do_page_fault
            |          page_fault
            |          |
            |          |--87.39%-- skb_copy_datagram_iovec
            |          |          tcp_recvmsg
            |          |          inet_recvmsg
            |          |          sock_recvmsg
            |          |          sys_recvfrom
            |          |          system_call
            |          |          __recv
            |          |          |
            |          |           --100.00%-- (nil)
            |          |
            |           --12.61%-- memcpy
             --2.70%-- [...]

There was other data but primarily it is all showing that compaction is
contended heavily on the zone->lock and zone->lru_lock.

commit [b2eef8c0: mm: compaction: minimise the time IRQs are disabled
while isolating pages for migration] noted that it was possible for
migration to hold the lru_lock for an excessive amount of time. Very
broadly speaking this patch expands the concept.

This patch introduces compact_checklock_irqsave() to check if a lock
is contended or the process needs to be scheduled. If either condition
is true then async compaction is aborted and the caller is informed.
The page allocator will fail a THP allocation if compaction failed due
to contention. This patch also introduces compact_trylock_irqsave()
which will acquire the lock only if it is not contended and the process
does not need to schedule.

Reported-by: Jim Schutt <jaschut@sandia.gov>
Tested-by: Jim Schutt <jaschut@sandia.gov>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: have order > 0 compaction start near a pageblock with free pages

Commit 7db8889ab05b ("mm: have order > 0 compaction start off where it
left") introduced a caching mechanism to reduce the amount work the free
page scanner does in compaction.  However, it has a problem.  Consider
two process simultaneously scanning free pages

     C
Process A M     S      F
|---------------------------------------|
Process B M FS

C is zone->compact_cached_free_pfn
S is cc->start_pfree_pfn
M is cc->migrate_pfn
F is cc->free_pfn

In this diagram, Process A has just reached its migrate scanner, wrapped
around and updated compact_cached_free_pfn accordingly.

Simultaneously, Process B finishes isolating in a block and updates
compact_cached_free_pfn again to the location of its free scanner.

Process A moves to "end_of_zone - one_pageblock" and runs this check

                if (cc->order > 0 && (!cc->wrapped ||
                                      zone->compact_cached_free_pfn >
                                      cc->start_free_pfn))
                        pfn = min(pfn, zone->compact_cached_free_pfn);

compact_cached_free_pfn is above where it started so the free scanner
skips almost the entire space it should have scanned.  When there are
multiple processes compacting it can end in a situation where the entire
zone is not being scanned at all.  Further, it is possible for two
processes to ping-pong update to compact_cached_free_pfn which is just
random.

Overall, the end result wrecks allocation success rates.

There is not an obvious way around this problem without introducing new
locking and state so this patch takes a different approach.

First, it gets rid of the skip logic because it's not clear that it
matters if two free scanners happen to be in the same block but with
racing updates it's too easy for it to skip over blocks it should not.

Second, it updates compact_cached_free_pfn in a more limited set of
circumstances.

If a scanner has wrapped, it updates compact_cached_free_pfn to the end
of the zone. When a wrapped scanner isolates a page, it updates
compact_cached_free_pfn to point to the highest pageblock it
can isolate pages from.

If a scanner has not wrapped when it has finished isolated pages it
checks if compact_cached_free_pfn is pointing to the end of the
zone. If so, the value is updated to point to the highest
pageblock that pages were isolated from. This value will not
be updated again until a free page scanner wraps and resets
compact_cached_free_pfn.

This is not optimal and it can still race but the compact_cached_free_pfn
will be pointing to or very near a pageblock with free pages.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rapidio/tsi721: fix unused variable compiler warning

Fix unused variable compiler warning when built with CONFIG_RAPIDIO_DEBUG
option off.

This patch is applicable to kernel versions starting from v3.2

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rapidio/tsi721: fix inbound doorbell interrupt handling

Make sure that there is no doorbell messages left behind due to disabled
interrupts during inbound doorbell processing.

The most common case for this bug is loss of rionet JOIN messages in
systems with three or more rionet participants and MSI or MSI-X enabled.
As result, requests for packet transfers may finish with "destination
unreachable" error message.

This patch is applicable to kernel versions starting from v3.2.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode

Correct the offset by subtracting 20 from tm_hour before taking the
modulo 12.

[ "Why 20?" I hear you ask. Or at least I did.

  Here's the reason why: RS5C348_BIT_PM is 32, and is - stupidly -
  included in the RS5C348_HOURS_MASK define.  So it's really subtracting
  out that bit to get "hour+12".  But then because it does things modulo
  12, it needs to add the 12 in again afterwards anyway.

  This code is confused.  It would be much clearer if RS5C348_HOURS_MASK
  just didn't include the RS5C348_BIT_PM bit at all, then it wouldn't
  need to do the silly subtract either.

  Whatever. It's all just math, the end result is the same.   - Linus ]

Reported-by: James Nute <newten82@gmail.com>
Tested-by: James Nute <newten82@gmail.com>
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: correct page->pfmemalloc to fix deactivate_slab regression

Commit cfd19c5a9ecf ("mm: only set page->pfmemalloc when
ALLOC_NO_WATERMARKS was used") tried to narrow down page->pfmemalloc
setting, but it missed some places the pfmemalloc should be set.

So, in __slab_alloc, the unalignment pfmemalloc and ALLOC_NO_WATERMARKS
cause incorrect deactivate_slab() on our core2 server:

    64.73%           fio  [kernel.kallsyms]     [k] _raw_spin_lock
                     |
                     --- _raw_spin_lock
                        |
                        |---0.34%-- deactivate_slab
                        |          __slab_alloc
                        |          kmem_cache_alloc
                        |          |

That causes our fio sync write performance to have a 40% regression.

Move the checking in get_page_from_freelist() which resolves this issue.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Sage Weil <sage@inktank.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/rtc/rtc-pcf2123.c: initialize dynamic sysfs attributes

Dynamically allocated sysfs attributes must be initialized using
sysfs_attr_init(), otherwise lockdep complains: BUG: key <address> not in
.data!

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Ilya Shchepetkov <shchepetkov@ispras.ru>
Cc: Chris Verges <chrisv@cyberswitching.com>
Cc: Christian Pellegrin <chripell@fsfe.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/compaction.c: fix deferring compaction mistake

Commit aff622495c9a ("vmscan: only defer compaction for failed order and
higher") fixed bad deferring policy but made mistake about checking
compact_order_failed in __compact_pgdat().  So it can't update
compact_order_failed with the new order.  This ends up preventing
correct operation of policy deferral.  This patch fixes it.

Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/misc/sgi-xp/xpc_uv.c: SGI XPC fails to load when cpu 0 is out of IRQ resources

On many of our larger systems, CPU 0 has had all of its IRQ resources
consumed before XPC loads. Worst cases on machines with multiple 10
GigE cards and multiple IB cards have depleted the entire first socket
of IRQs.

This patch makes selecting the node upon which IRQs are allocated (as
well as all the other GRU Message Queue structures) specifiable as a
module load param and has a default behavior of searching all nodes/cpus
for an available resources.

[akpm@linux-foundation.org: fix build: include cpu.h and module.h]
Signed-off-by: Robin Holt <holt@sgi.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

string: do not export memweight() to userspace

Fix the following warning:

usr/include/linux/string.h:8: userspace cannot reference function or variable defined in the kernel

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

hugetlb: update hugetlbpage.txt

Commit f0f57b2b1488 ("mm: move hugepage test examples to
tools/testing/selftests/vm") moved map_hugetlb.c, hugepage-shm.c and
hugepage-mmap.c tests into tools/testing/selftests/vm/ directory, but it
didn't update hugetlbpage.txt

Signed-off-by: Zhouping Liu <sanweidaying@gmail.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch: add control statement test to SINGLE_STATEMENT_DO_WHILE_MACRO

Commit b13edf7ff2dd ("checkpatch: add checks for do {} while (0) macro
misuses") added a test that is overly simplistic for single statement
macros.

Macros that start with control tests should be enclosed in a do {} while
(0) loop.

Add the necessary control tests to the check.

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Tested-by: Franz Schrober <franzschrober@yahoo.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: hugetlbfs: correctly populate shared pmd

Each page mapped in a process's address space must be correctly
accounted for in _mapcount.  Normally the rules for this are
straightforward but hugetlbfs page table sharing is different.  The page
table pages at the PMD level are reference counted while the mapcount
remains the same.

If this accounting is wrong, it causes bugs like this one reported by
Larry Woodman:

  kernel BUG at mm/filemap.c:135!
  invalid opcode: 0000 [#1] SMP
  CPU 22
  Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi]
  Pid: 18001, comm: mpitest Tainted: G        W    3.3.0+ #4 Dell Inc. PowerEdge R620/07NDJ2
  RIP: 0010:[<ffffffff8112cfed>]  [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170
  Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20)
  Call Trace:
    delete_from_page_cache+0x40/0x80
    truncate_hugepages+0x115/0x1f0
    hugetlbfs_evict_inode+0x18/0x30
    evict+0x9f/0x1b0
    iput_final+0xe3/0x1e0
    iput+0x3e/0x50
    d_kill+0xf8/0x110
    dput+0xe2/0x1b0
    __fput+0x162/0x240

During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc()
shared page tables with the check dst_pte == src_pte.  The logic is if
the PMD page is the same, they must be shared.  This assumes that the
sharing is between the parent and child.  However, if the sharing is
with a different process entirely then this check fails as in this
diagram:

  parent
    |
    ------------>pmd
                 src_pte----------> data page
                                        ^
  other--------->pmd--------------------|
                  ^
  child-----------|
                 dst_pte

For this situation to occur, it must be possible for Parent and Other to
have faulted and failed to share page tables with each other.  This is
possible due to the following style of race.

  PROC A                                          PROC B
  copy_hugetlb_page_range                         copy_hugetlb_page_range
    src_pte == huge_pte_offset                      src_pte == huge_pte_offset
    !src_pte so no sharing                          !src_pte so no sharing

  (time passes)

  hugetlb_fault                                   hugetlb_fault
    huge_pte_alloc                                  huge_pte_alloc
      huge_pmd_share                                 huge_pmd_share
        LOCK(i_mmap_mutex)
        find nothing, no sharing
        UNLOCK(i_mmap_mutex)
                                                      LOCK(i_mmap_mutex)
                                                      find nothing, no sharing
                                                      UNLOCK(i_mmap_mutex)
      pmd_alloc                                       pmd_alloc
      LOCK(instantiation_mutex)
      fault
      UNLOCK(instantiation_mutex)
                                                  LOCK(instantiation_mutex)
                                                  fault
                                                  UNLOCK(instantiation_mutex)

These two processes are not poing to the same data page but are not
sharing page tables because the opportunity was missed.  When either
process later forks, the src_pte == dst pte is potentially insufficient.
As the check falls through, the wrong PTE information is copied in
(harmless but wrong) and the mapcount is bumped for a page mapped by a
shared page table leading to the BUG_ON.

This patch addresses the issue by moving pmd_alloc into huge_pmd_share
which guarantees that the shared pud is populated in the same critical
section as pmd.  This also means that huge_pte_offset test in
huge_pmd_share is serialized correctly now which in turn means that the
success of the sharing will be higher as the racing tasks see the pud
and pmd populated together.

Race identified and changelog written mostly by Mel Gorman.

{akpm@linux-foundation.org: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style]
Reported-by: Larry Woodman <lwoodman@redhat.com>
Tested-by: Larry Woodman <lwoodman@redhat.com>
Reviewed-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Ken Chen <kenchen@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>