openwrt/staging/blogic.git
14 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzi...
Linus Torvalds [Sat, 28 Aug 2010 20:54:55 +0000 (13:54 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  libata-sff: remove harmful BUG_ON from ata_bmdma_qc_issue
  sata_mv: fix broken DSM/TRIM support (v2)
  libata: be less of a drama queen on empty data commands
  [libata] sata_dwc_460ex: signdness bug
  ahci: add HFLAG_YES_FBS and apply it to 88SE9128
  libata: remove no longer needed pata_winbond driver
  pata_cmd64x: revert commit d62f5576

14 years agomm: fix hang on anon_vma->root->lock
Hugh Dickins [Thu, 26 Aug 2010 06:12:54 +0000 (23:12 -0700)]
mm: fix hang on anon_vma->root->lock

After several hours, kbuild tests hang with anon_vma_prepare() spinning on
a newly allocated anon_vma's lock - on a box with CONFIG_TREE_PREEMPT_RCU=y
(which makes this very much more likely, but it could happen without).

The ever-subtle page_lock_anon_vma() now needs a further twist: since
anon_vma_prepare() and anon_vma_fork() are liable to change the ->root
of a reused anon_vma structure at any moment, page_lock_anon_vma()
needs to check page_mapped() again before succeeding, otherwise
page_unlock_anon_vma() might address a different root->lock.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agolibata-sff: remove harmful BUG_ON from ata_bmdma_qc_issue
Mark Lord [Fri, 20 Aug 2010 14:13:16 +0000 (10:13 -0400)]
libata-sff: remove harmful BUG_ON from ata_bmdma_qc_issue

Remove harmful BUG_ON() from ata_bmdma_qc_issue(),
as it casts too wide of a net and breaks sata_mv.
It also crashes the kernel while doing the BUG_ON().

There's already a WARN_ON_ONCE() further down to catch
the case of POLLING for a BMDMA operation.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: stable@kernel.org
14 years agosata_mv: fix broken DSM/TRIM support (v2)
Mark Lord [Fri, 20 Aug 2010 01:40:44 +0000 (21:40 -0400)]
sata_mv: fix broken DSM/TRIM support (v2)

Fix DSM/TRIM commands in sata_mv (v2).
These need to be issued using old-school "BM DMA",
rather than via the EDMA host queue.

Since the chips don't have proper BM DMA status,
we need to be more careful with setting the ATA_DMA_INTR bit,
since DSM/TRIM often has a long delay between "DMA complete"
and "command complete".

GEN_I chips don't have BM DMA, so no TRIM for them.

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: stable@kernel.org
14 years agolibata: be less of a drama queen on empty data commands
Tejun Heo [Mon, 23 Aug 2010 09:27:27 +0000 (11:27 +0200)]
libata: be less of a drama queen on empty data commands

ata_qc_issue() BUG_ON()s on data commands w/o data, which may be
submitted via SG_IO.  Be less of a drama queen and just trigger
WARN_ON_ONCE() and fail the command with AC_ERR_SYSTEM.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Stefan Hübner <stefan.huebner@stud.tu-ilmenau.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years ago[libata] sata_dwc_460ex: signdness bug
Dan Carpenter [Sat, 21 Aug 2010 08:43:25 +0000 (10:43 +0200)]
[libata] sata_dwc_460ex: signdness bug

dma_dwc_xfer_setup() returns an int and "dma_chan" needs to be signed
for the error handling to work.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agoahci: add HFLAG_YES_FBS and apply it to 88SE9128
Tejun Heo [Sat, 24 Jul 2010 14:53:48 +0000 (16:53 +0200)]
ahci: add HFLAG_YES_FBS and apply it to 88SE9128

88SE9128 can do FBS and sets it in HOST_CAP but forgets to set FBSCP
in PORT_CMD.  Implement AHCI_HFLAG_YES_FBS and apply it to 88SE9128.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agolibata: remove no longer needed pata_winbond driver
Bartlomiej Zolnierkiewicz [Wed, 25 Nov 2009 07:08:33 +0000 (07:08 +0000)]
libata: remove no longer needed pata_winbond driver

Winbond W83759A controller is fully supported by pata_legacy driver
so remove no longer needed pata_winbond driver.

Leave PATA_WINBOND_VLB config option for compatibility reasons
and teach pata_legacy to preserve the old behavior of pata_winbond
driver.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agopata_cmd64x: revert commit d62f5576
Tejun Heo [Tue, 17 Aug 2010 12:13:42 +0000 (14:13 +0200)]
pata_cmd64x: revert commit d62f5576

Commit d62f5576 (pata_cmd64x: fix handling of address setup timings)
incorrectly called ata_timing_compute() on UDMA mode on 0 @UT leading
to devide by zero fault.  Revert it until better fix is available.
This is reported in bko#16607 by Milan Kocian who also root caused it.

  https://bugzilla.kernel.org/show_bug.cgi?id=16607

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-root-caused-by: Milan Kocian <milan.kocian@wq.cz>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
14 years agoMerge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 25 Aug 2010 17:50:07 +0000 (10:50 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag
  tracing/trace_stack: Fix stack trace on ppc64

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Wed, 25 Aug 2010 16:57:59 +0000 (09:57 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  Eliminate sparse warning - bad constant expression
  cifs: check for NULL session password
  missing changes during ntlmv2/ntlmssp auth and sign
  [CIFS] Fix ntlmv2 auth with ntlmssp
  cifs: correction of unicode header files
  cifs: fix NULL pointer dereference in cifs_find_smb_ses
  cifs: consolidate error handling in several functions
  cifs: clean up error handling in cifs_mknod

14 years agoMerge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelv...
Linus Torvalds [Wed, 25 Aug 2010 15:41:18 +0000 (08:41 -0700)]
Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  MAINTAINERS: hwmon/coretemp: Change maintainers
  hwmon: (k8temp) Differentiate between AM2 and ASB1
  hwmon: (ads7871) Fix ads7871_probe error paths
  hwmon: (coretemp) Fix harmless build warning

14 years agoMerge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 25 Aug 2010 15:40:56 +0000 (08:40 -0700)]
Merge branch 'sched-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states
  sched: Fix rq->clock synchronization when migrating tasks

14 years agoMerge branch 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
Linus Torvalds [Wed, 25 Aug 2010 15:40:31 +0000 (08:40 -0700)]
Merge branch 'upstream/core' of git://git./linux/kernel/git/jeremy/xen

* 'upstream/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xen: handle events as edge-triggered
  xen: use percpu interrupts for IPIs and VIRQs

14 years agoMerge branch '2.6.36-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc...
Linus Torvalds [Wed, 25 Aug 2010 15:39:07 +0000 (08:39 -0700)]
Merge branch '2.6.36-fixes' of git://git./linux/kernel/git/dgc/xfsdev

* '2.6.36-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
  xfs: do not discard page cache data on EAGAIN
  xfs: don't do memory allocation under the CIL context lock
  xfs: Reduce log force overhead for delayed logging
  xfs: dummy transactions should not dirty VFS state
  xfs: ensure f_ffree returned by statfs() is non-negative
  xfs: handle negative wbc->nr_to_write during sync writeback
  writeback: write_cache_pages doesn't terminate at nr_to_write <= 0
  xfs: fix untrusted inode number lookup
  xfs: ensure we mark all inodes in a freed cluster XFS_ISTALE
  xfs: unlock items before allowing the CIL to commit

14 years agoMAINTAINERS: hwmon/coretemp: Change maintainers
Fenghua Yu [Wed, 25 Aug 2010 13:42:14 +0000 (15:42 +0200)]
MAINTAINERS: hwmon/coretemp: Change maintainers

Huaxu and Rudolf want me to be the hwmon coretemp driver maintainer and
remove their names from the coretemp maintainer entry.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Rudolf Marek <r.marek@assembler.cz>
Acked-by: Huaxu Wan <huaxu.wan@intel.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agohwmon: (k8temp) Differentiate between AM2 and ASB1
Andreas Herrmann [Wed, 25 Aug 2010 13:42:12 +0000 (15:42 +0200)]
hwmon: (k8temp) Differentiate between AM2 and ASB1

Commit 8bf0223ed515be24de0c671eedaff49e78bebc9c (hwmon, k8temp: Fix
temperature reporting for ASB1 processor revisions) fixed temperature
reporting for ASB1 CPUs. But those CPU models (model 0x6b, 0x6f, 0x7f)
were packaged both as AM2 (desktop) and ASB1 (mobile). Thus the commit
leads to wrong temperature reporting for AM2 CPU parts.

The solution is to determine the package type for models 0x6b, 0x6f,
0x7f.

This is done using BrandId from CPUID Fn8000_0001_EBX[15:0]. See
"Constructing the processor Name String" in "Revision Guide for AMD
NPT Family 0Fh Processors" (Rev. 3.46).

Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: stable@kernel.org [.32.x, .33.x, .34.x, .35.x]
Reported-by: Vladislav Guberinic <neosisani@gmail.com>
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agohwmon: (ads7871) Fix ads7871_probe error paths
Axel Lin [Wed, 25 Aug 2010 13:42:10 +0000 (15:42 +0200)]
hwmon: (ads7871) Fix ads7871_probe error paths

1. remove 'status' variable
2. remove unneeded initialization of 'err' variable
3. return missing error code if sysfs_create_group fail.
4. fix the init sequence as:
   - check hardware existence
   - kzalloc for ads7871_data
   - sysfs_create_group
   - hwmon_device_register

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Jean Delvare <khali@linux-fr.org>
14 years agohwmon: (coretemp) Fix harmless build warning
Jean Delvare [Wed, 25 Aug 2010 13:42:08 +0000 (15:42 +0200)]
hwmon: (coretemp) Fix harmless build warning

Fix the following build warning:

  CC [M]  drivers/hwmon/coretemp.o
drivers/hwmon/coretemp.c: In function "coretemp_init":
drivers/hwmon/coretemp.c:521: warning: unused variable "n"
drivers/hwmon/coretemp.c:521: warning: unused variable "p"

Introduced by commit 851b29cb3b196cb66452ec964ab5f66c9c9cd1ed. When
you drop code, you also have to drop the variables this code was
using.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Rudolf Marek <r.marek@assembler.cz>
Cc: Huaxu Wan <huaxu.wan@intel.com>
14 years agoperf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag
Lin Ming [Wed, 25 Aug 2010 21:06:32 +0000 (21:06 +0000)]
perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag

If on Pentium4 CPUs the FORCE_OVF flag is set then an NMI happens
on every event, which can generate a flood of NMIs. Clear it.

Reported-by: Vince Weaver <vweaver1@eecs.utk.edu>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agotracing/trace_stack: Fix stack trace on ppc64
Anton Blanchard [Wed, 25 Aug 2010 01:32:38 +0000 (11:32 +1000)]
tracing/trace_stack: Fix stack trace on ppc64

save_stack_trace() stores the instruction pointer, not the
function descriptor. On ppc64 the trace stack code currently
dereferences the instruction pointer and shows 8 bytes of
instructions in our backtraces:

 # cat /sys/kernel/debug/tracing/stack_trace
        Depth    Size   Location    (26 entries)
        -----    ----   --------
  0)     5424     112   0x6000000048000004
  1)     5312     160   0x60000000ebad01b0
  2)     5152     160   0x2c23000041c20030
  3)     4992     240   0x600000007c781b79
  4)     4752     160   0xe84100284800000c
  5)     4592     192   0x600000002fa30000
  6)     4400     256   0x7f1800347b7407e0
  7)     4144     208   0xe89f0108f87f0070
  8)     3936     272   0xe84100282fa30000

Since we aren't dealing with function descriptors, use %pS
instead of %pF to fix it:

 # cat /sys/kernel/debug/tracing/stack_trace
        Depth    Size   Location    (26 entries)
        -----    ----   --------
  0)     5424     112   ftrace_call+0x4/0x8
  1)     5312     160   .current_io_context+0x28/0x74
  2)     5152     160   .get_io_context+0x48/0xa0
  3)     4992     240   .cfq_set_request+0x94/0x4c4
  4)     4752     160   .elv_set_request+0x60/0x84
  5)     4592     192   .get_request+0x2d4/0x468
  6)     4400     256   .get_request_wait+0x7c/0x258
  7)     4144     208   .__make_request+0x49c/0x610
  8)     3936     272   .generic_make_request+0x390/0x434

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: rostedt@goodmis.org
Cc: fweisbec@gmail.com
LKML-Reference: <20100825013238.GE28360@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
14 years agoMerge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 24 Aug 2010 19:21:49 +0000 (12:21 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  watchdog: Don't throttle the watchdog
  tracing: Fix timer tracing

14 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 24 Aug 2010 19:21:02 +0000 (12:21 -0700)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  mutex: Improve the scalability of optimistic spinning

14 years agoguard page for stacks that grow upwards
Luck, Tony [Tue, 24 Aug 2010 18:44:18 +0000 (11:44 -0700)]
guard page for stacks that grow upwards

pa-risc and ia64 have stacks that grow upwards. Check that
they do not run into other mappings. By making VM_GROWSUP
0x0 on architectures that do not ever use it, we can avoid
some unpleasant #ifdefs in check_stack_guard_page().

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agodrm/i915: fix vblank wait test condition
Jesse Barnes [Tue, 24 Aug 2010 18:31:16 +0000 (11:31 -0700)]
drm/i915: fix vblank wait test condition

When converting this to the new wait_for macro I inverted the wait
condition, which causes all sorts of problems.  So correct it to fix
several failures caused by the bad wait (flickering, bad output
detection, tearing, etc.).

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoxen: handle events as edge-triggered
Jeremy Fitzhardinge [Sat, 21 Aug 2010 02:10:01 +0000 (19:10 -0700)]
xen: handle events as edge-triggered

Xen events are logically edge triggered, as Xen only calls the event
upcall when an event is newly set, but not continuously as it remains set.
As a result, use handle_edge_irq rather than handle_level_irq.

This has the important side-effect of fixing a long-standing bug of
events getting lost if:
 - an event's interrupt handler is running
 - the event is migrated to a different vcpu
 - the event is re-triggered

The most noticable symptom of these lost events is occasional lockups
of blkfront.

Many thanks to Tom Kopec and Daniel Stodden in tracking this down.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Tom Kopec <tek@acm.org>
Cc: Daniel Stodden <daniel.stodden@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
14 years agoxen: use percpu interrupts for IPIs and VIRQs
Jeremy Fitzhardinge [Sat, 21 Aug 2010 01:57:53 +0000 (18:57 -0700)]
xen: use percpu interrupts for IPIs and VIRQs

IPIs and VIRQs are inherently per-cpu event types, so treat them as such:
 - use a specific percpu irq_chip implementation, and
 - handle them with handle_percpu_irq

This makes the path for delivering these interrupts more efficient
(no masking/unmasking, no locks), and it avoid problems with attempts
to migrate them.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
14 years agoEliminate sparse warning - bad constant expression
shirishpargaonkar@gmail.com [Tue, 24 Aug 2010 16:53:48 +0000 (11:53 -0500)]
Eliminate sparse warning - bad constant expression

Eliminiate sparse warning during usage of crypto_shash_* APIs
       error: bad constant expression

Allocate memory for shash descriptors once, so that we do not kmalloc/kfree it
for every signature generation (shash descriptor for md5 hash).

From ed7538619817777decc44b5660b52268077b74f3 Mon Sep 17 00:00:00 2001
From: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Date: Tue, 24 Aug 2010 11:47:43 -0500
Subject: [PATCH] eliminate sparse warnings during crypto_shash_* APis usage

Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
14 years agoMerge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
Linus Torvalds [Tue, 24 Aug 2010 17:43:08 +0000 (10:43 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/linux-2.6

* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] fix tlb flushing vs. concurrent /proc accesses
  [S390] s390: fix build error (sys_execve)

14 years agointel_scu_ipc: fix IPC i2c write bug
Jianwei Yang [Tue, 24 Aug 2010 13:32:38 +0000 (14:32 +0100)]
intel_scu_ipc: fix IPC i2c write bug

We should pass the data to the data register.

Signed-off-by: Jianwei Yang <jianwei.yang@intel.com>
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agorar: Fix off by one error
Ossama Othman [Tue, 24 Aug 2010 11:55:14 +0000 (12:55 +0100)]
rar: Fix off by one error

It looks like there is an off-by-one error in one of your changes to
drivers/staging/rar_register/rar_register.c:

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoV4L/DVB: mantis: Fix IR_CORE dependency
Ingo Molnar [Tue, 24 Aug 2010 08:41:33 +0000 (10:41 +0200)]
V4L/DVB: mantis: Fix IR_CORE dependency

This build bug triggers:

 drivers/built-in.o: In function `mantis_exit':
 (.text+0x377413): undefined reference to `ir_input_unregister'
 drivers/built-in.o: In function `mantis_input_init':
 (.text+0x3774ff): undefined reference to `__ir_input_register'

If MANTIS_CORE is enabled but IR_CORE is not. Add the correct
dependency.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Tue, 24 Aug 2010 17:10:13 +0000 (10:10 -0700)]
Merge git://git./linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc64: Get rid of indirect p1275 PROM call buffer.
  sparc64: Fill a missing delay slot.
  sparc64: Make lock backoff really a NOP on UP builds.
  sparc64: simple microoptimizations for atomic functions
  sparc64: Make rwsems 64-bit.
  sparc64: Really fix atomic64_t interface types.

14 years ago[S390] fix tlb flushing vs. concurrent /proc accesses
Martin Schwidefsky [Tue, 24 Aug 2010 07:26:21 +0000 (09:26 +0200)]
[S390] fix tlb flushing vs. concurrent /proc accesses

The tlb flushing code uses the mm_users field of the mm_struct to
decide if each page table entry needs to be flushed individually with
IPTE or if a global flush for the mm_struct is sufficient after all page
table updates have been done. The comment for mm_users says "How many
users with user space?" but the /proc code increases mm_users after it
found the process structure by pid without creating a new user process.
Which makes mm_users useless for the decision between the two tlb
flusing methods. The current code can be confused to not flush tlb
entries by a concurrent access to /proc files if e.g. a fork is in
progres. The solution for this problem is to make the tlb flushing
logic independent from the mm_users field.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Linus Torvalds [Tue, 24 Aug 2010 07:26:34 +0000 (00:26 -0700)]
Merge git://git./linux/kernel/git/benh/powerpc

* git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
  powerpc: Fix config dependency problem with MPIC_U3_HT_IRQS
  via-pmu: Add compat_pmu_ioctl
  powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls
  powerpc/pci: Fix checking for child bridges in PCI code.
  powerpc: Fix typo in uImage target
  powerpc: Initialise paca->kstack before early_setup_secondary
  powerpc: Fix bogus it_blocksize in VIO iommu code
  powerpc: Inline ppc64_runlatch_off
  powerpc: Correct smt_enabled=X boot option for > 2 threads per core
  powerpc: Silence xics_migrate_irqs_away() during cpu offline
  powerpc: Silence __cpu_up() under normal operation
  powerpc: Re-enable preemption before cpu_die()
  powerpc/pci: Drop unnecessary null test
  powerpc/powermac: Drop unnecessary null test
  powerpc/powermac: Drop unnecessary of_node_put
  powerpc/kdump: Stop all other CPUs before running crash handlers
  powerpc/mm: Fix vsid_scrample typo
  powerpc: Use is_32bit_task() helper to test 32 bit binary
  powerpc: Export memstart_addr and kernstart_addr on ppc64
  powerpc: Make rwsem use "long" type
  ...

14 years ago[S390] s390: fix build error (sys_execve)
Sebastian Ott [Tue, 24 Aug 2010 07:26:20 +0000 (09:26 +0200)]
[S390] s390: fix build error (sys_execve)

fix this build error:
arch/s390/kernel/process.c:272: error: conflicting types for 'sys_execve'
arch/s390/kernel/entry.h:45: error: previous declaration of 'sys_execve' was here
make[1]: *** [arch/s390/kernel/process.o] Error 1
make: *** [arch/s390/kernel] Error 2

introduced by d7627467b7a8dd6944885290a03a07ceb28c10eb

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
Linus Torvalds [Tue, 24 Aug 2010 07:21:45 +0000 (00:21 -0700)]
Merge git://git./linux/kernel/git/gregkh/staging-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
  Staging: sep: remove driver
  Staging: batman-adv: Don't write in not allocated packet_buff
  Staging: batman-adv: Don't use net_dev after dev_put
  Staging: batman-adv: Create batman_if only on register event
  Staging: batman-adv: fix own mac address detection
  Staging: batman-adv: always reply batman icmp packets with primary mac
  Staging: batman-adv: fix batman icmp originating from secondary interface
  Staging: batman-adv: unify orig_hash_lock spinlock handling to avoid deadlocks
  Staging: batman-adv: Fix merge of linus tree
  Staging: spectra: removes unused functions
  Staging: spectra: initializa lblk variable
  Staging: spectra: removes unused variable
  Staging: spectra: remove duplicate GLOB_VERSION definition
  Staging: spectra: don't use locked_ioctl, fix build
  Staging: use new REQ_FLUSH flag, fix build breakage
  Staging: spectra: removes q->prepare_flush_fn, fix build breakage

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
Linus Torvalds [Tue, 24 Aug 2010 07:21:27 +0000 (00:21 -0700)]
Merge git://git./linux/kernel/git/gregkh/tty-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  68328serial: check return value of copy_*_user() instead of access_ok()
  synclink: add mutex_unlock() on error path
  rocket: add a mutex_unlock()
  ip2: return -EFAULT on copy_to_user errors
  ip2: remove unneeded NULL check
  serial: print early console device address in hex

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
Linus Torvalds [Tue, 24 Aug 2010 07:21:02 +0000 (00:21 -0700)]
Merge git://git./linux/kernel/git/gregkh/driver-core-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  kobject_uevent: fix typo in comments
  firmware_class: fix typo in error path
  kobject: Break the kobject namespace defs into their own header

14 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Tue, 24 Aug 2010 07:20:44 +0000 (00:20 -0700)]
Merge git://git./linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (29 commits)
  ARM: imx: fix build failure concerning otg/ulpi
  USB: ftdi_sio: add product ID for Lenz LI-USB
  USB: adutux: fix misuse of return value of copy_to_user()
  USB: iowarrior: fix misuse of return value of copy_to_user()
  USB: xHCI: update ring dequeue pointer when process missed tds
  USB: xhci: Remove buggy assignment in next_trb()
  USB: ftdi_sio: Add ID for Ionics PlugComputer
  USB: serial: io_ti.c: don't return 0 if writing the download record failed
  USB: otg: twl4030: fix wrong assumption of starting state
  USB: gadget: Return -ENOMEM on memory allocation failure
  USB: gadget: fix composite kernel-doc warnings
  USB: ssu100: set tty_flags in ssu100_process_packet
  USB: ssu100: add disconnect function for ssu100
  USB: serial: export symbol usb_serial_generic_disconnect
  USB: ssu100: rework logic for TIOCMIWAIT
  USB: ssu100: add register parameter to ssu100_setregister
  USB: ssu100: remove duplicate #defines in ssu100
  USB: ssu100: refine process_packet in ssu100
  USB: ssu100: add locking for port private data in ssu100
  USB: r8a66597-udc: return -ENOMEM if kzalloc() fails
  ...

14 years agosparc64: Get rid of indirect p1275 PROM call buffer.
David S. Miller [Tue, 24 Aug 2010 06:10:57 +0000 (23:10 -0700)]
sparc64: Get rid of indirect p1275 PROM call buffer.

This is based upon a report by Meelis Roos showing that it's possible
that we'll try to fetch a property that is 32K in size with some
devices.  With the current fixed 3K buffer we use for moving data in
and out of the firmware during PROM calls, that simply won't work.

In fact, it will scramble random kernel data during bootup.

The reasoning behind the temporary buffer is entirely historical.  It
used to be the case that we had problems referencing dynamic kernel
memory (including the stack) early in the boot process before we
explicitly told the firwmare to switch us over to the kernel trap
table.

So what we did was always give the firmware buffers that were locked
into the main kernel image.

But we no longer have problems like that, so get rid of all of this
indirect bounce buffering.

Besides fixing Meelis's bug, this also makes the kernel data about 3K
smaller.

It was also discovered during these conversions that the
implementation of prom_retain() was completely wrong, so that was
fixed here as well.  Currently that interface is not in use.

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agopowerpc: Fix config dependency problem with MPIC_U3_HT_IRQS
Andreas Schwab [Mon, 23 Aug 2010 07:36:41 +0000 (07:36 +0000)]
powerpc: Fix config dependency problem with MPIC_U3_HT_IRQS

MPIC_U3_HT_IRQS is selected both by PPC_PMAC64 and PPC_MAPLE, but depends
on PPC_MAPLE, so a PPC_PMAC64-only config gets this warning:

warning: (PPC_PMAC64 && PPC_PMAC && POWER4 || PPC_MAPLE && PPC64 && PPC_BOOK3S) selects MPIC_U3_HT_IRQS which has unmet direct dependencies (PPC_MAPLE)

Fix that by removing the dependency on PPC_MAPLE.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agovia-pmu: Add compat_pmu_ioctl
Andreas Schwab [Sun, 22 Aug 2010 06:23:17 +0000 (06:23 +0000)]
via-pmu: Add compat_pmu_ioctl

The ioctls are actually compatible, but due to historical mistake the
numbers differ between 32bit and 64bit.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls
Andreas Schwab [Thu, 19 Aug 2010 05:15:37 +0000 (05:15 +0000)]
powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscalls

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pci: Fix checking for child bridges in PCI code.
Grant Likely [Wed, 18 Aug 2010 08:27:55 +0000 (08:27 +0000)]
powerpc/pci: Fix checking for child bridges in PCI code.

pci_device_to_OF_node() can return null, and list_for_each_entry will
never enter the loop when dev is NULL, so it looks like this test is
a typo.

Reported-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Fix typo in uImage target
Anatolij Gustschin [Sun, 15 Aug 2010 22:26:56 +0000 (22:26 +0000)]
powerpc: Fix typo in uImage target

Commit e32e78c5ee8aadef020fbaecbe6fb741ed9029fd
(powerpc: fix build with make 3.82) introduced a
typo in uImage target and broke building uImage:

make: *** No rule to make target `uImage'.  Stop.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Initialise paca->kstack before early_setup_secondary
Matt Evans [Thu, 12 Aug 2010 20:58:28 +0000 (20:58 +0000)]
powerpc: Initialise paca->kstack before early_setup_secondary

As early setup calls down to slb_initialize(), we must have kstack
initialised before checking "should we add a bolted SLB entry for our kstack?"

Failing to do so means stack access requires an SLB miss exception to refill
an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text
& static data).  It's not always allowable to take such a miss, and
intermittent crashes will result.

Primary CPUs don't have this issue; an SLB entry is not bolted for their
stack anyway (as that lives within SLB(0)).  This patch therefore only
affects the init of secondaries.

Signed-off-by: Matt Evans <matt@ozlabs.org>
Cc: stable <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Fix bogus it_blocksize in VIO iommu code
Anton Blanchard [Wed, 11 Aug 2010 16:42:48 +0000 (16:42 +0000)]
powerpc: Fix bogus it_blocksize in VIO iommu code

When looking at some issues with the virtual ethernet driver I noticed
that TCE allocation was following a very strange pattern:

address 00e9000 length 2048
address 0409000 length 2048 <-----
address 0429000 length 2048
address 0449000 length 2048
address 0469000 length 2048
address 0489000 length 2048
address 04a9000 length 2048
address 04c9000 length 2048
address 04e9000 length 2048
address 4009000 length 2048 <-----
address 4029000 length 2048

Huge unexplained gaps in what should be an empty TCE table. It turns out
it_blocksize, the amount we want to align the next allocation to, was
c0000000fe903b20. Completely bogus.

Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc
everywhere to protect against this when we next add a non compulsary
field to iommu code and forget to initialise it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Inline ppc64_runlatch_off
Anton Blanchard [Fri, 6 Aug 2010 03:28:19 +0000 (03:28 +0000)]
powerpc: Inline ppc64_runlatch_off

I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it
into the callers. To avoid a mess of circular includes I didn't add
it as an inline function.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Correct smt_enabled=X boot option for > 2 threads per core
Nathan Fontenot [Thu, 5 Aug 2010 07:42:11 +0000 (07:42 +0000)]
powerpc: Correct smt_enabled=X boot option for > 2 threads per core

The 'smt_enabled=X' boot option does not handle values of X > 2.
For Power 7 processors with smt modes of 0,1,2,3, and 4 this does
not work.  This patch allows the smt_enabled option to be set to
any value limited to a max equal to the number of threads per
core.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Silence xics_migrate_irqs_away() during cpu offline
Signed-off-by: Darren Hart [Wed, 4 Aug 2010 18:28:35 +0000 (18:28 +0000)]
powerpc: Silence xics_migrate_irqs_away() during cpu offline

All IRQs are migrated away from a CPU that is being offlined so the
following messages suggest a problem when the system is behaving as
designed:

IRQ 262 affinity broken off cpu 1
IRQ 17 affinity broken off cpu 0
IRQ 18 affinity broken off cpu 0
IRQ 19 affinity broken off cpu 0
IRQ 256 affinity broken off cpu 0
IRQ 261 affinity broken off cpu 0
IRQ 262 affinity broken off cpu 0

Don't print these messages when the CPU is not online.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Silence __cpu_up() under normal operation
Signed-off-by: Darren Hart [Wed, 4 Aug 2010 18:28:34 +0000 (18:28 +0000)]
powerpc: Silence __cpu_up() under normal operation

During CPU offline/online tests __cpu_up would flood the logs with
the following message:

Processor 0 found.

This provides no useful information to the user as there is no context
provided, and since the operation was a success (to this point) it is expected
that the CPU will come back online, providing all the feedback necessary.

Change the "Processor found" message to DBG() similar to other such messages in
the same function. Also, add an appropriate log level for the "Processor is
stuck" message.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Re-enable preemption before cpu_die()
Signed-off-by: Darren Hart [Wed, 4 Aug 2010 18:28:33 +0000 (18:28 +0000)]
powerpc: Re-enable preemption before cpu_die()

start_secondary() is called shortly after _start and also via

cpu_idle()->cpu_die()->pseries_mach_cpu_die()

start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is
called via the cpu_idle() routine with preemption disabled, resulting in the
following repeating message during rapid cpu offline/online tests
with CONFIG_PREEMPT=y:

BUG: scheduling while atomic: swapper/0/0x00000002
Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan]
Call Trace:
[c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable)
[c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c
[c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c
[c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800
[c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8
[c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4
[c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14

Move the cpu_die() call inside the existing preemption enabled block of
cpu_idle(). This is safe as the idle task is affined to a single CPU so the
debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are
in a "migration disabled" region.

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/pci: Drop unnecessary null test
Julia Lawall [Tue, 3 Aug 2010 11:35:17 +0000 (11:35 +0000)]
powerpc/pci: Drop unnecessary null test

list_for_each_entry binds its first argument to a non-null value, and thus
any null test on the value of that argument is superfluous.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
iterator I;
expression x,E,E1,E2;
statement S,S1,S2;
@@

I(x,...) { <...
- if (x != NULL || ...)
  S
  ...> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/powermac: Drop unnecessary null test
Julia Lawall [Tue, 3 Aug 2010 11:33:43 +0000 (11:33 +0000)]
powerpc/powermac: Drop unnecessary null test

for_each_node_by_name binds its first argument to a non-null value, and
thus any null test on the value of that argument is superfluous.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
iterator I;
expression x,E;
@@

I(x,...) { <...
(
- (x != NULL) &&
  E
  ...> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/powermac: Drop unnecessary of_node_put
Julia Lawall [Tue, 3 Aug 2010 09:50:32 +0000 (09:50 +0000)]
powerpc/powermac: Drop unnecessary of_node_put

for_each_node_by_name only exits when its first argument is NULL, and a
subsequent call to of_node_put on that argument is unnecessary.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@
iterator name for_each_node_by_name;
expression np,E;
identifier l;
@@

for_each_node_by_name(np,...) {
  ... when != break;
      when != goto l;
}
... when != np = E
- of_node_put(np);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/kdump: Stop all other CPUs before running crash handlers
Anton Blanchard [Mon, 2 Aug 2010 20:39:41 +0000 (20:39 +0000)]
powerpc/kdump: Stop all other CPUs before running crash handlers

During kdump we run the crash handlers first then stop all other CPUs.
We really want to stop all CPUs as close to the fail as possible and also
have a very controlled environment for running the crash handlers, so it
makes sense to reverse the order.

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc/mm: Fix vsid_scrample typo
Anton Blanchard [Mon, 2 Aug 2010 20:35:18 +0000 (20:35 +0000)]
powerpc/mm: Fix vsid_scrample typo

The code is wrapped in an #if 0, but it's wrong so we may as well fix it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Use is_32bit_task() helper to test 32 bit binary
Denis Kirjanov [Thu, 29 Jul 2010 22:04:39 +0000 (22:04 +0000)]
powerpc: Use is_32bit_task() helper to test 32 bit binary

Use is_32bit_task() helper to test 32 bit binary.

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Export memstart_addr and kernstart_addr on ppc64
Sonny Rao [Thu, 19 Aug 2010 18:08:09 +0000 (18:08 +0000)]
powerpc: Export memstart_addr and kernstart_addr on ppc64

Some modules (like eHCA) want to map all of kernel memory, for this to
work with a relocated kernel, we need to export kernstart_addr so
modules can use PHYSICAL_START and memstart_addr so they could use
MEMORY_START.  Note that the 32bit code already exports these symbols.

Signed-off-By: Sonny Rao <sonnyrao@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agopowerpc: Make rwsem use "long" type
Benjamin Herrenschmidt [Tue, 24 Aug 2010 04:41:48 +0000 (14:41 +1000)]
powerpc: Make rwsem use "long" type

This makes the 64-bit kernel use 64-bit signed integers for the counter
(effectively supporting 32-bit of active count in the semaphore), thus
avoiding things like overflow of the mmap_sem if you use a really crazy
number of threads

Note: Ideally the type in the structure should be atomic_long_t rather
than "long". However, there's some nasty issues with that. It needs to
be initialized statically -and- lib/rwsem.c does things like

        sem->count = RWSEM_UNLOCKED_VALUE;

Now, if you mix in the fact that atomic_* types are actually structures
with one member and note typedefs of a scalar, it makes its really nasty.

So I stuck to what we did before using a long and casts for now.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
14 years agoMerge remote branch 'jwb/merge' into merge
Benjamin Herrenschmidt [Tue, 24 Aug 2010 04:36:45 +0000 (14:36 +1000)]
Merge remote branch 'jwb/merge' into merge

14 years agoARM: imx: fix build failure concerning otg/ulpi
Uwe Kleine-König [Fri, 13 Aug 2010 12:06:50 +0000 (14:06 +0200)]
ARM: imx: fix build failure concerning otg/ulpi

The build failure was introduced by

13dd0c9 (USB: otg/ulpi: extend the generic ulpi driver.)

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Igor Grinberg <grinberg@compulab.co.il>
Cc: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ftdi_sio: add product ID for Lenz LI-USB
Galen Seitz [Thu, 19 Aug 2010 18:15:20 +0000 (11:15 -0700)]
USB: ftdi_sio: add product ID for Lenz LI-USB

Add ftdi product ID for Lenz LI-USB, a model train interface.  This
was NOT tested against 2.6.35, but a similar patch was tested with the
CentOS 2.6.18-194.11.1.el5 kernel.  It wasn't clear to me what
ordering is being used in ftdi_sio.c, so I inserted the ID after another
model train entry(SPROG_II).

Signed-off-by: Galen Seitz <galens@seitzassoc.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: adutux: fix misuse of return value of copy_to_user()
Kulikov Vasiliy [Sat, 31 Jul 2010 17:40:07 +0000 (21:40 +0400)]
USB: adutux: fix misuse of return value of copy_to_user()

copy_to_user() returns number of not copied bytes, not error code.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: iowarrior: fix misuse of return value of copy_to_user()
Kulikov Vasiliy [Sat, 31 Jul 2010 17:39:46 +0000 (21:39 +0400)]
USB: iowarrior: fix misuse of return value of copy_to_user()

copy_to_user() returns number of not copied bytes, not error code.

Signed-off-by: Kulikov Vasiliy <segooon@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: xHCI: update ring dequeue pointer when process missed tds
Andiry Xu [Mon, 9 Aug 2010 20:56:15 +0000 (13:56 -0700)]
USB: xHCI: update ring dequeue pointer when process missed tds

This patch fixes a isoc transfer bug reported by Sander Eikelenboom.
When ep->skip is set, endpoint ring dequeue pointer should be updated
when processed every missed td. Although ring dequeue pointer will also
be updated when ep->skip is clear, leave it intact during missed tds
processing may cause two issues:

1). If the very next valid transfer following missed tds is a short
transfer, its actual_length will be miscalculated;
2). If there are too many missed tds during transfer, new inserted tds
may found the transfer ring full and urb enqueue fails.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Andiry Xu <andiry.xu@amd.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: xhci: Remove buggy assignment in next_trb()
John Youn [Mon, 9 Aug 2010 20:56:11 +0000 (13:56 -0700)]
USB: xhci: Remove buggy assignment in next_trb()

The code to increment the TRB pointer has a slight ambiguity that could
lead to a bug on different compilers.  The ANSI C specification does not
specify the precedence of the assignment operator over the postfix
operator.  gcc 4.4 produced the correct code (increment the pointer and
assign the value), but a MIPS compiler that one of John's clients used
assigned the old (unincremented) value.

Remove the unnecessary assignment to make all compilers produce the
correct assembly.

Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ftdi_sio: Add ID for Ionics PlugComputer
Martin Michlmayr [Tue, 10 Aug 2010 19:31:21 +0000 (20:31 +0100)]
USB: ftdi_sio: Add ID for Ionics PlugComputer

Add the ID for the Ionics PlugComputer (<http://ionicsplug.com/>).

Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: serial: io_ti.c: don't return 0 if writing the download record failed
Roel Kluin [Tue, 10 Aug 2010 21:29:19 +0000 (14:29 -0700)]
USB: serial: io_ti.c: don't return 0 if writing the download record failed

If the write download record failed we shouldn't return 0.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: otg: twl4030: fix wrong assumption of starting state
Felipe Balbi [Wed, 11 Aug 2010 10:02:32 +0000 (13:02 +0300)]
USB: otg: twl4030: fix wrong assumption of starting state

The reset state of twl4030-usb is not sleeping, it starts
up awaken and we need to disable it if we have booted
with a disconnected cable to avoid over consumption on
the default state.

To avoid problems later, we read the current state of the
transceiver from the PHY_PWR_CTRL register. The bootloader
can, anyways, put the device to sleep before us.

Tested on a custom OMAP board.

Signed-off-by: Felipe Balbi <felipe.balbi@nokia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: gadget: Return -ENOMEM on memory allocation failure
Julia Lawall [Wed, 11 Aug 2010 10:10:48 +0000 (12:10 +0200)]
USB: gadget: Return -ENOMEM on memory allocation failure

In this code, 0 is returned on memory allocation failure, even though other
failures return -ENOMEM or other similar values.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression ret;
expression x,e1,e2,e3;
@@

ret = 0
... when != ret = e1
*x = \(kmalloc\|kcalloc\|kzalloc\)(...)
... when != ret = e2
if (x == NULL) { ... when != ret = e3
  return ret;
}
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: gadget: fix composite kernel-doc warnings
Randy Dunlap [Wed, 11 Aug 2010 19:07:13 +0000 (12:07 -0700)]
USB: gadget: fix composite kernel-doc warnings

Warning(include/linux/usb/composite.h:284): No description found for parameter 'disconnect'
Warning(drivers/usb/gadget/composite.c:744): No description found for parameter 'c'
Warning(drivers/usb/gadget/composite.c:744): Excess function parameter 'cdev' description in 'usb_string_ids_n'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: set tty_flags in ssu100_process_packet
Bill Pemberton [Fri, 13 Aug 2010 13:59:31 +0000 (09:59 -0400)]
USB: ssu100: set tty_flags in ssu100_process_packet

flag was never set in ssu100_process_packet.  Add logic to set it
before calling tty_insert_flip_*

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: add disconnect function for ssu100
Bill Pemberton [Thu, 5 Aug 2010 21:01:11 +0000 (17:01 -0400)]
USB: ssu100: add disconnect function for ssu100

Add a disconnect function to the functions of this device.  The
disconnect is a call to usb_serial_generic_disconnect() so it requires
that symbol to be exported from generic.c.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: serial: export symbol usb_serial_generic_disconnect
Bill Pemberton [Thu, 5 Aug 2010 21:01:10 +0000 (17:01 -0400)]
USB: serial: export symbol usb_serial_generic_disconnect

This is needed by the ssu100 driver to use this function.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: rework logic for TIOCMIWAIT
Bill Pemberton [Thu, 5 Aug 2010 21:01:09 +0000 (17:01 -0400)]
USB: ssu100: rework logic for TIOCMIWAIT

Rework the logic for TIOCMIWAIT to use wait_event_interruptible.

This also adds support for TIOCGICOUNT.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: add register parameter to ssu100_setregister
Bill Pemberton [Thu, 5 Aug 2010 21:01:08 +0000 (17:01 -0400)]
USB: ssu100: add register parameter to ssu100_setregister

The function ssu100_setregister was hard coded to only set the MCR
register.  Add a register parameter so that other registers can be
set.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: remove duplicate #defines in ssu100
Bill Pemberton [Thu, 5 Aug 2010 21:01:07 +0000 (17:01 -0400)]
USB: ssu100: remove duplicate #defines in ssu100

The ssu100 uses a TI16C550C UART so the SERIAL_ defines in this code
are duplicates of those found in serial_reg.h.  Remove the defines in
ssu100.c and use the ones in the header file.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: refine process_packet in ssu100
Bill Pemberton [Thu, 5 Aug 2010 21:01:06 +0000 (17:01 -0400)]
USB: ssu100: refine process_packet in ssu100

The status information does not appear at the start of each incoming
packet so the check for len < 4 at the start of ssu100_process_packet
is wrong.  Remove it.

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ssu100: add locking for port private data in ssu100
Bill Pemberton [Thu, 5 Aug 2010 21:01:05 +0000 (17:01 -0400)]
USB: ssu100: add locking for port private data in ssu100

Signed-off-by: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: r8a66597-udc: return -ENOMEM if kzalloc() fails
Axel Lin [Tue, 17 Aug 2010 01:41:29 +0000 (09:41 +0800)]
USB: r8a66597-udc: return -ENOMEM if kzalloc() fails

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: io_ti: check firmware version before updating
Greg Kroah-Hartman [Tue, 17 Aug 2010 22:15:37 +0000 (15:15 -0700)]
USB: io_ti: check firmware version before updating

If we can't read the firmware for a device from the disk, and yet the
device already has a valid firmware image in it, we don't want to
replace the firmware with something invalid.  So check the version
number to be less than the current one to verify this is the correct
thing to do.

Reported-by: Chris Beauchamp <chris@chillibean.tv>
Tested-by: Chris Beauchamp <chris@chillibean.tv>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: ftdi_sio: fix endianess of max packet size
Michael Wileczka [Wed, 18 Aug 2010 14:14:37 +0000 (07:14 -0700)]
USB: ftdi_sio: fix endianess of max packet size

The USB max packet size (always little-endian) was not being byte
swapped on big-endian systems.

Applicable since [USB: ftdi_sio: fix hi-speed device packet size calculation] approx 2.6.31

Signed-off-by: Michael Wileczka <mikewileczka@yahoo.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: CP210x Fix Break On/Off
Craig Shelley [Wed, 18 Aug 2010 21:13:39 +0000 (22:13 +0100)]
USB: CP210x Fix Break On/Off

The definitions for BREAK_ON and BREAK_OFF are inverted, causing break
requests to fail. This patch sets BREAK_ON and BREAK_OFF to the correct
values.

Signed-off-by: Craig Shelley <craig@microtron.org.uk>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: pl2303: New vendor and product id
Jef Driesen [Mon, 9 Aug 2010 13:55:32 +0000 (15:55 +0200)]
USB: pl2303: New vendor and product id

Add support for the Zeagle N2iTiON3 dive computer interface. Since
Zeagle devices are actually manufactured by Seiko, this patch will
support other Seiko based models as well.

Signed-off-by: Jef Driesen <jefdriesen@telenet.be>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: serial: fix leak of usb serial module refrence count
Ming Lei [Sat, 7 Aug 2010 08:20:35 +0000 (16:20 +0800)]
USB: serial: fix leak of usb serial module refrence count

The patch with title below makes reference count of usb serial module
always more than one after driver is bound.

USB-BKL: Remove BKL use for usb serial driver probing

In fact, the patch above only replaces lock_kernel() with try_module_get()
, and does not use module_put() to do what unlock_kernel() did, so casue leak
of reference count of usb serial module and the module can not be unloaded
after serial driver is bound with device.

This patch fixes the issue, also simplifies such things:
-only call try_module_get() once in the entry of usb_serial_probe()
-only call module_put() once in the exit of usb_serial_probe

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: Johan Hovold <jhovold@gmail.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: add device IDs for igotu to navman
Ross Burton [Fri, 6 Aug 2010 15:36:39 +0000 (16:36 +0100)]
USB: add device IDs for igotu to navman

I recently bought a i-gotU USB GPS, and whilst hunting around for linux
support discovered this post by you back in 2009:

http://kerneltrap.org/mailarchive/linux-usb/2009/3/12/5148644

>Try the navman driver instead.  You can either add the device id to the
> driver and rebuild it, or do this before you plug the device in:
>  modprobe navman
>  echo -n "0x0df7 0x0900" > /sys/bus/usb-serial/drivers/navman/new_id
>
> and then plug your device in and see if that works.

I can confirm that the navman driver works with the right device IDs on
my i-gotU GT-600, which has the same device IDs.  Attached is a patch
adding the IDs.

From: Ross Burton <ross@linux.intel.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: isp1760: use a write barrier to ensure proper ndelay timing
Michael Hennerich [Thu, 5 Aug 2010 21:53:57 +0000 (17:53 -0400)]
USB: isp1760: use a write barrier to ensure proper ndelay timing

The ISP1760 has some timing requirements where it has to delay a short
period after a write to a register has started.  However, this delay is
from the time the write hits the USB chip (the ISP1760), not from the
time where the processor started processing the write.  So on a quick
enough processor, it is sometimes possible for the write to not hit the
device before we start delaying, and we then violate the part's timing
requirements, so things stop working.

To avoid all this, insert a write barrier after the register write and
before the timing delay/register read so we can guarantee we only start
counting time after the write has hit the device.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: option: add Celot CT-650
Michael Tokarev [Fri, 6 Aug 2010 14:49:21 +0000 (18:49 +0400)]
USB: option: add Celot CT-650

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: uvc_v4l2: cleanup test for end of loop
Dan Carpenter [Thu, 12 Aug 2010 07:59:58 +0000 (09:59 +0200)]
USB: uvc_v4l2: cleanup test for end of loop

We're trying to test for the the end of the loop here.  "format" is
never NULL.  We don't know what "format->fcc" is because we're past the
end of the loop and I think "fmt->fmt.pix.pixelformat" comes from the
user so we don't know what that is either.  It works, but it's cleaner
to just test to see if (i == ARRAY_SIZE(uvc_formats).

Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoxfs: do not discard page cache data on EAGAIN
Christoph Hellwig [Tue, 24 Aug 2010 01:47:51 +0000 (11:47 +1000)]
xfs: do not discard page cache data on EAGAIN

If xfs_map_blocks returns EAGAIN because of lock contention we must redirty the
page and not disard the pagecache content and return an error from writepage.
We used to do this correctly, but the logic got lost during the recent
reshuffle of the writepage code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Mike Gao <ygao.linux@gmail.com>
Tested-by: Mike Gao <ygao.linux@gmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
14 years agoxfs: don't do memory allocation under the CIL context lock
Dave Chinner [Tue, 24 Aug 2010 01:45:53 +0000 (11:45 +1000)]
xfs: don't do memory allocation under the CIL context lock

Formatting items requires memory allocation when using delayed
logging. Currently that memory allocation is done while holding the
CIL context lock in read mode. This means that if memory allocation
takes some time (e.g. enters reclaim), we cannot push on the CIL
until the allocation(s) required by formatting complete. This can
stall CIL pushes for some time, and once a push is stalled so are
all new transaction commits.

Fix this splitting the item formatting into two steps. The first
step which does the allocation and memcpy() into the allocated
buffer is now done outside the CIL context lock, and only the CIL
insert is done inside the CIL context lock. This avoids the stall
issue.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: Reduce log force overhead for delayed logging
Dave Chinner [Tue, 24 Aug 2010 01:40:03 +0000 (11:40 +1000)]
xfs: Reduce log force overhead for delayed logging

Delayed logging adds some serialisation to the log force process to
ensure that it does not deference a bad commit context structure
when determining if a CIL push is necessary or not. It does this by
grabing the CIL context lock exclusively, then dropping it before
pushing the CIL if necessary. This causes serialisation of all log
forces and pushes regardless of whether a force is necessary or not.
As a result fsync heavy workloads (like dbench) can be significantly
slower with delayed logging than without.

To avoid this penalty, copy the current sequence from the context to
the CIL structure when they are swapped. This allows us to do
unlocked checks on the current sequence without having to worry
about dereferencing context structures that may have already been
freed. Hence we can remove the CIL context locking in the forcing
code and only call into the push code if the current context matches
the sequence we need to force.

By passing the sequence into the push code, we can check the
sequence again once we have the CIL lock held exclusive and abort if
the sequence has already been pushed. This avoids a lock round-trip
and unnecessary CIL pushes when we have racing push calls.

The result is that the regression in dbench performance goes away -
this change improves dbench performance on a ramdisk from ~2100MB/s
to ~2500MB/s. This compares favourably to not using delayed logging
which retuns ~2500MB/s for the same workload.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: dummy transactions should not dirty VFS state
Dave Chinner [Tue, 24 Aug 2010 01:46:31 +0000 (11:46 +1000)]
xfs: dummy transactions should not dirty VFS state

When we  need to cover the log, we issue dummy transactions to ensure
the current log tail is on disk. Unfortunately we currently use the
root inode in the dummy transaction, and the act of committing the
transaction dirties the inode at the VFS level.

As a result, the VFS writeback of the dirty inode will prevent the
filesystem from idling long enough for the log covering state
machine to complete. The state machine gets stuck in a loop issuing
new dummy transactions to cover the log and never makes progress.

To avoid this problem, the dummy transactions should not cause
externally visible state changes. To ensure this occurs, make sure
that dummy transactions log an unchanging field in the superblock as
it's state is never propagated outside the filesystem. This allows
the log covering state machine to complete successfully and the
filesystem now correctly enters a fully idle state about 90s after
the last modification was made.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: ensure f_ffree returned by statfs() is non-negative
Stuart Brodsky [Tue, 24 Aug 2010 01:46:05 +0000 (11:46 +1000)]
xfs: ensure f_ffree returned by statfs() is non-negative

Because of delayed updates to sb_icount field in the super block, it
is possible to allocate over maxicount number of inodes.  This
causes the arithmetic to calculate a negative number of free inodes
in user commands like df or stat -f.

Since maxicount is a somewhat arbitrary number, a slight over
allocation is not critical but user commands should be displayed as
0 or greater and never go negative.  To do this the value in the
stats buffer f_ffree is capped to never go negative.

[ Modified to use max_t as per Christoph's comment. ]

Signed-off-by: Stu Brodsky <sbrodsky@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
14 years agoxfs: handle negative wbc->nr_to_write during sync writeback
Dave Chinner [Tue, 24 Aug 2010 01:44:56 +0000 (11:44 +1000)]
xfs: handle negative wbc->nr_to_write during sync writeback

During data integrity (WB_SYNC_ALL) writeback, wbc->nr_to_write will
go negative on inodes with more than 1024 dirty pages due to
implementation details of write_cache_pages(). Currently XFS will
abort page clustering in writeback once nr_to_write drops below
zero, and so for data integrity writeback we will do very
inefficient page at a time allocation and IO submission for inodes
with large numbers of dirty pages.

Fix this by only aborting the page clustering code when
wbc->nr_to_write is negative and the sync mode is WB_SYNC_NONE.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agowriteback: write_cache_pages doesn't terminate at nr_to_write <= 0
Dave Chinner [Tue, 24 Aug 2010 01:44:34 +0000 (11:44 +1000)]
writeback: write_cache_pages doesn't terminate at nr_to_write <= 0

I noticed XFS writeback in 2.6.36-rc1 was much slower than it should have
been. Enabling writeback tracing showed:

    flush-253:16-8516  [007] 1342952.351608: wbc_writepage: bdi 253:16: towrt=1024 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [007] 1342952.351654: wbc_writepage: bdi 253:16: towrt=1023 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [000] 1342952.369520: wbc_writepage: bdi 253:16: towrt=0 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [000] 1342952.369542: wbc_writepage: bdi 253:16: towrt=-1 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0
    flush-253:16-8516  [000] 1342952.369549: wbc_writepage: bdi 253:16: towrt=-2 skip=0 mode=0 kupd=0 bgrd=1 reclm=0 cyclic=1 more=0 older=0x0 start=0x0 end=0x0

Writeback is not terminating in background writeback if ->writepage is
returning with wbc->nr_to_write == 0, resulting in sub-optimal single page
writeback on XFS.

Fix the write_cache_pages loop to terminate correctly when this situation
occurs and so prevent this sub-optimal background writeback pattern. This
improves sustained sequential buffered write performance from around
250MB/s to 750MB/s for a 100GB file on an XFS filesystem on my 8p test VM.

Cc:<stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: fix untrusted inode number lookup
Dave Chinner [Tue, 24 Aug 2010 01:42:30 +0000 (11:42 +1000)]
xfs: fix untrusted inode number lookup

Commit 7124fe0a5b619d65b739477b3b55a20bf805b06d ("xfs: validate untrusted inode
numbers during lookup") changes the inode lookup code to do btree lookups for
untrusted inode numbers. This change made an invalid assumption about the
alignment of inodes and hence incorrectly calculated the first inode in the
cluster. As a result, some inode numbers were being incorrectly considered
invalid when they were actually valid.

The issue was not picked up by the xfstests suite because it always runs fsr
and dump (the two utilities that utilise the bulkstat interface) on cache hot
inodes and hence the lookup code in the cold cache path was not sufficiently
exercised to uncover this intermittent problem.

Fix the issue by relaxing the btree lookup criteria and then checking if the
record returned contains the inode number we are lookup for. If it we get an
incorrect record, then the inode number is invalid.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
14 years agoxfs: ensure we mark all inodes in a freed cluster XFS_ISTALE
Dave Chinner [Tue, 24 Aug 2010 01:42:41 +0000 (11:42 +1000)]
xfs: ensure we mark all inodes in a freed cluster XFS_ISTALE

Under heavy load parallel metadata loads (e.g. dbench), we can fail
to mark all the inodes in a cluster being freed as XFS_ISTALE as we
skip inodes we cannot get the XFS_ILOCK_EXCL or the flush lock on.
When this happens and the inode cluster buffer has already been
marked stale and freed, inode reclaim can try to write the inode out
as it is dirty and not marked stale. This can result in writing th
metadata to an freed extent, or in the case it has already
been overwritten trigger a magic number check failure and return an
EUCLEAN error such as:

Filesystem "ram0": inode 0x442ba1 background reclaim flush failed with 117

Fix this by ensuring that we hoover up all in memory inodes in the
cluster and mark them XFS_ISTALE when freeing the cluster.

Cc: <stable@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>