openwrt/staging/blogic.git
13 years agoautofs4: split autofs4_init_ino()
Al Viro [Sun, 16 Jan 2011 23:43:40 +0000 (18:43 -0500)]
autofs4: split autofs4_init_ino()

split init_ino into new_ino and clean_ino; the former is
what used to be init_ino(NULL, sbi), the latter is for cases
where we passed non-NULL ino.  Lose unused arguments.

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4: mkdir and symlink always get a dentry that had passed lookup
Al Viro [Sun, 16 Jan 2011 23:29:35 +0000 (18:29 -0500)]
autofs4: mkdir and symlink always get a dentry that had passed lookup

... so ->d_fsdata will have been set up before we get there

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4: autofs4_get_inode() doesn't need autofs_info * argument anymore
Al Viro [Sun, 16 Jan 2011 22:43:52 +0000 (17:43 -0500)]
autofs4: autofs4_get_inode() doesn't need autofs_info * argument anymore

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4: kill ->size in autofs_info
Al Viro [Sun, 16 Jan 2011 22:39:15 +0000 (17:39 -0500)]
autofs4: kill ->size in autofs_info

It's used only to pass the length of symlink body to
autofs4_get_inode() in autofs4_dir_symlink().  We can
bloody well set inode->i_size in autofs4_dir_symlink()
directly and be done with that.

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4: pass mode to autofs4_get_inode() explicitly
Al Viro [Sun, 16 Jan 2011 22:20:23 +0000 (17:20 -0500)]
autofs4: pass mode to autofs4_get_inode() explicitly

In all cases we'd set inf->mode to know value just before
passing it to autofs4_get_inode().  That kills the need
to store it in autofs_info and pass it to autofs_init_ino()

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4: autofs4_mkroot() is not different from autofs4_init_ino()
Al Viro [Sun, 16 Jan 2011 22:13:33 +0000 (17:13 -0500)]
autofs4: autofs4_mkroot() is not different from autofs4_init_ino()

Kill it.  Mind you, it's been an obfuscated call of autofs4_init_ino()
ever since 2.3.99pre6-4...

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4: keep symlink body in inode->i_private
Al Viro [Mon, 17 Jan 2011 05:47:38 +0000 (00:47 -0500)]
autofs4: keep symlink body in inode->i_private

gets rid of all ->free()/->u.symlink machinery in autofs; we simply
keep symlink bodies in inode->i_private and free them in ->evict_inode().

Acked-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4 - fix debug print in autofs4_lookup()
Ian Kent [Tue, 18 Jan 2011 04:06:15 +0000 (12:06 +0800)]
autofs4 - fix debug print in autofs4_lookup()

oz_mode isn't defined any more, use autofs4_oz_mode(sbi) instead.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agovfs - fix dentry ref count in do_lookup()
Ian Kent [Tue, 18 Jan 2011 04:06:10 +0000 (12:06 +0800)]
vfs - fix dentry ref count in do_lookup()

There is a ref count problem in fs/namei.c:do_lookup().

When walking in ref-walk mode, if follow_managed() returns a fail we
need to drop dentry and possibly vfsmount.  Clean up properly,
as we do in the other caller of follow_managed().

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoautofs4 - fix get_next_positive_dentry()
Ian Kent [Tue, 18 Jan 2011 04:06:04 +0000 (12:06 +0800)]
autofs4 - fix get_next_positive_dentry()

The initialization condition in fs/autofs4/expire.c:get_next_positive_dentry()
appears to be incorrect. If prev == NULL I believe that root should be
returned.

Further down, at the current dentry check for it being simple_positive()
it looks like the d_lock for dentry p should be dropped instead of dentry
ret, otherwise when p is assinged to ret we end up with no lock on p and
a lost lock on ret, which leads to a deadlock.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland...
Linus Torvalds [Mon, 17 Jan 2011 22:45:48 +0000 (14:45 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/roland/infiniband

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA: Update workqueue usage
  RDMA/nes: Fix incorrect SFP+ link status detection on driver init
  RDMA/nes: Fix SFP+ link down detection issue with switch port disable
  RDMA/nes: Generate IB_EVENT_PORT_ERR/PORT_ACTIVE events
  RDMA/nes: Fix bonding on iw_nes
  IB/srp: Test only once whether iu allocation succeeded
  IB/mlx4: Handle protocol field in multicast table
  RDMA: Use vzalloc() to replace vmalloc()+memset(0)
  mlx4_{core, ib, en}: Fix driver when sizeof (phys_addr_t) > sizeof (long)
  IB/mthca: Fix driver when sizeof (phys_addr_t) > sizeof (long)

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs...
Linus Torvalds [Mon, 17 Jan 2011 22:43:43 +0000 (14:43 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/btrfs-unstable

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: (25 commits)
  Btrfs: forced readonly mounts on errors
  btrfs: Require CAP_SYS_ADMIN for filesystem rebalance
  Btrfs: don't warn if we get ENOSPC in btrfs_block_rsv_check
  btrfs: Fix memory leak in btrfs_read_fs_root_no_radix()
  btrfs: check NULL or not
  btrfs: Don't pass NULL ptr to func that may deref it.
  btrfs: mount failure return value fix
  btrfs: Mem leak in btrfs_get_acl()
  btrfs: fix wrong free space information of btrfs
  btrfs: make the chunk allocator utilize the devices better
  btrfs: restructure find_free_dev_extent()
  btrfs: fix wrong calculation of stripe size
  btrfs: try to reclaim some space when chunk allocation fails
  btrfs: fix wrong data space statistics
  fs/btrfs: Fix build of ctree
  Btrfs: fix off by one while setting block groups readonly
  Btrfs: Add BTRFS_IOC_SUBVOL_GETFLAGS/SETFLAGS ioctls
  Btrfs: Add readonly snapshots support
  Btrfs: Refactor btrfs_ioctl_snap_create()
  btrfs: Extract duplicate decompress code
  ...

13 years agoRevert "mm: simplify code of swap.c"
Linus Torvalds [Mon, 17 Jan 2011 22:42:34 +0000 (14:42 -0800)]
Revert "mm: simplify code of swap.c"

This reverts commit d8505dee1a87b8d41b9c4ee1325cd72258226fbc.

Chris Mason ended up chasing down some page allocation errors and pages
stuck waiting on the IO scheduler, and was able to narrow it down to two
commits: commit 744ed1442757 ("mm: batch activate_page() to reduce lock
contention") and d8505dee1a87 ("mm: simplify code of swap.c").

This reverts the second one.

Reported-and-debugged-by: Chris Mason <chris.mason@oracle.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: linux-mm <linux-mm@kvack.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoRevert "mm: batch activate_page() to reduce lock contention"
Linus Torvalds [Mon, 17 Jan 2011 22:42:19 +0000 (14:42 -0800)]
Revert "mm: batch activate_page() to reduce lock contention"

This reverts commit 744ed1442757767ffede5008bb13e0805085902e.

Chris Mason ended up chasing down some page allocation errors and pages
stuck waiting on the IO scheduler, and was able to narrow it down to two
commits: commit 744ed1442757 ("mm: batch activate_page() to reduce lock
contention") and d8505dee1a87 ("mm: simplify code of swap.c").

This reverts the first of them.

Reported-and-debugged-by: Chris Mason <chris.mason@oracle.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jaxboe@fusionio.com>
Cc: linux-mm <linux-mm@kvack.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs...
Linus Torvalds [Mon, 17 Jan 2011 20:39:57 +0000 (12:39 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ecryptfs/ecryptfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6:
  ecryptfs: remove unnecessary decrypt when extending a file
  ecryptfs: Fix ecryptfs_printk() size_t warnings
  fs/ecryptfs: Add printf format/argument verification and fix fallout
  ecryptfs: fixed testing of file descriptor flags
  ecryptfs: test lower_file pointer when lower_file_mutex is locked
  ecryptfs: missing initialization of the superblock 'magic' field
  ecryptfs: moved ECRYPTFS_SUPER_MAGIC definition to linux/magic.h
  ecryptfs: fix truncation error in ecryptfs_read_update_atime

13 years agoxfs: Do not name variables "panic"
Geert Uytterhoeven [Mon, 17 Jan 2011 20:21:14 +0000 (21:21 +0100)]
xfs: Do not name variables "panic"

On platforms that call panic() inside their BUG() macro (m68k/sun3, and
all platforms that don't set HAVE_ARCH_BUG), compilation fails with:

| fs/xfs/support/debug.c: In function ‘xfs_cmn_err’:
| fs/xfs/support/debug.c:92: error: called object ‘panic’ is not a function

as the local variable "panic" conflicts with the "panic()" function.
Rename the local variable to resolve this.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoBtrfs: forced readonly mounts on errors
liubo [Thu, 6 Jan 2011 11:30:25 +0000 (19:30 +0800)]
Btrfs: forced readonly mounts on errors

This patch comes from "Forced readonly mounts on errors" ideas.

As we know, this is the first step in being more fault tolerant of disk
corruptions instead of just using BUG() statements.

The major content:
- add a framework for generating errors that should result in filesystems
  going readonly.
- keep FS state in disk super block.
- make sure that all of resource will be freed and released at umount time.
- make sure that fter FS is forced readonly on error, there will be no more
  disk change before FS is corrected. For this, we should stop write operation.

After this patch is applied, the conversion from BUG() to such a framework can
happen incrementally.

Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoMerge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6
Linus Torvalds [Mon, 17 Jan 2011 19:18:40 +0000 (11:18 -0800)]
Merge branch 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6

* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
  spi/spi_sh_msiof: fix a wrong free_irq() parameter
  dt/flattree: Return virtual address from early_init_dt_alloc_memory_arch()

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Mon, 17 Jan 2011 19:17:51 +0000 (11:17 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: add cruid= mount option
  cifs: cFYI the entire error code in map_smb_to_linux_error

13 years agoMerge git://git.infradead.org/mtd-2.6
Linus Torvalds [Mon, 17 Jan 2011 19:15:30 +0000 (11:15 -0800)]
Merge git://git.infradead.org/mtd-2.6

* git://git.infradead.org/mtd-2.6: (59 commits)
  mtd: mtdpart: disallow reading OOB past the end of the partition
  mtd: pxa3xx_nand: NULL dereference in pxa3xx_nand_probe
  UBI: use mtd->writebufsize to set minimal I/O unit size
  mtd: initialize writebufsize in the MTD object of a partition
  mtd: onenand: add mtd->writebufsize initialization
  mtd: nand: add mtd->writebufsize initialization
  mtd: cfi: add writebufsize initialization
  mtd: add writebufsize field to mtd_info struct
  mtd: OneNAND: OMAP2/3: prevent regulator sleeping while OneNAND is in use
  mtd: OneNAND: add enable / disable methods to onenand_chip
  mtd: m25p80: Fix JEDEC ID for AT26DF321
  mtd: txx9ndfmc: limit transfer bytes to 512 (ECC provides 6 bytes max)
  mtd: cfi_cmdset_0002: add support for Samsung K8D3x16UxC NOR chips
  mtd: cfi_cmdset_0002: add support for Samsung K8D6x16UxM NOR chips
  mtd: nand: ams-delta: drop omap_read/write, use ioremap
  mtd: m25p80: add debugging trace in sst_write
  mtd: nand: ams-delta: select for built-in by default
  mtd: OneNAND: lighten scary initial bad block messages
  mtd: OneNAND: OMAP2/3: add support for command line partitioning
  mtd: nand: rearrange ONFI revision checking, add ONFI 2.3
  ...

Fix up trivial conflict in drivers/mtd/Kconfig as per DavidW.

13 years agoecryptfs: remove unnecessary decrypt when extending a file
Frank Swiderski [Mon, 15 Nov 2010 18:43:22 +0000 (10:43 -0800)]
ecryptfs: remove unnecessary decrypt when extending a file

Removes an unecessary page decrypt from ecryptfs_begin_write when the
page is beyond the current file size. Previously, the call to
ecryptfs_decrypt_page would result in a read of 0 bytes, but still
attempt to decrypt an entire page. This patch detects that case and
merely zeros the page before marking it up-to-date.

Signed-off-by: Frank Swiderski <fes@chromium.org>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agoecryptfs: Fix ecryptfs_printk() size_t warnings
Tyler Hicks [Mon, 15 Nov 2010 23:36:38 +0000 (17:36 -0600)]
ecryptfs: Fix ecryptfs_printk() size_t warnings

Commit cb55d21f6fa19d8c6c2680d90317ce88c1f57269 revealed a number of
missing 'z' length modifiers in calls to ecryptfs_printk() when
printing variables of type size_t. This patch fixes those compiler
warnings.

Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agofs/ecryptfs: Add printf format/argument verification and fix fallout
Joe Perches [Wed, 10 Nov 2010 23:46:16 +0000 (15:46 -0800)]
fs/ecryptfs: Add printf format/argument verification and fix fallout

Add __attribute__((format... to __ecryptfs_printk
Make formats and arguments match.
Add casts to (unsigned long long) for %llu.

Signed-off-by: Joe Perches <joe@perches.com>
[tyhicks: 80 columns cleanup and fixed typo]
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Mon, 17 Jan 2011 19:00:09 +0000 (11:00 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  fs: fix address space warnings in ioctl_fiemap()
  aio: check return value of create_workqueue()
  hpfs_setattr error case avoids unlock_kernel
  compat: copy missing fields in compat_statfs64 to user
  compat: update comment of compat statfs syscalls
  compat: remove unnecessary assignment in compat_rw_copy_check_uvector()
  fs: FS_POSIX_ACL does not depend on BLOCK
  fs: Remove unlikely() from fget_light()
  fs: Remove unlikely() from fput_light()
  fallocate should be a file operation
  make the feature checks in ->fallocate future proof
  staging: smbfs building fix
  tidy up around finish_automount()
  don't drop newmnt on error in do_add_mount()
  Take the completion of automount into new helper

13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88...
Linus Torvalds [Mon, 17 Jan 2011 18:58:44 +0000 (10:58 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mattst88/alpha-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: fix WARN_ON in __local_bh_enable()
  alpha: fix breakage caused by df9ee29270
  alpha: add GENERIC_HARDIRQS_NO__DO_IRQ to Kconfig
  alpha/osf_sys: remove unused MAX_SELECT_SECONDS
  alpha: change to new Makefile flag variables
  alpha: kill off alpha_do_IRQ
  alpha: irq clean up
  alpha: use set_irq_chip and push down __do_IRQ to the machine types

13 years agoMerge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied...
Linus Torvalds [Mon, 17 Jan 2011 18:57:59 +0000 (10:57 -0800)]
Merge branch 'drm-fixes' of git://git./linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: balance asic_reset functions
  drm/radeon/kms: remove duplicate card_posted() functions
  drm/radeon/kms: add module option for pcie gen2
  drm/radeon/kms: fix typo in evergreen safe reg
  drm/nouveau: fix gpu page faults triggered by plymouthd
  drm/nouveau: greatly simplify mm, killing some bugs in the process
  drm/nvc0: enable protection of system-use-only structures in vm
  drm/nv40: initialise 0x17xx on all chipsets that have it
  drm/nv40: make detection of 0x4097-ful chipsets available everywhere

13 years agoMerge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
Linus Torvalds [Mon, 17 Jan 2011 18:54:41 +0000 (10:54 -0800)]
Merge branch 'next' of git://git./linux/kernel/git/djbw/async_tx

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (63 commits)
  ARM: PL08x: cleanup comments
  Update CONFIG_MD_RAID6_PQ to CONFIG_RAID6_PQ in drivers/dma/iop-adma.c
  ARM: PL08x: fix a warning
  Fix dmaengine_submit() return type
  dmaengine: at_hdmac: fix race while monitoring channel status
  dmaengine: at_hdmac: flags located in first descriptor
  dmaengine: at_hdmac: use subsys_initcall instead of module_init
  dmaengine: at_hdmac: no need set ACK in new descriptor
  dmaengine: at_hdmac: trivial add precision to unmapping comment
  dmaengine: at_hdmac: use dma_address to program DMA hardware
  pch_dma: support new device ML7213 IOH
  ARM: PL08x: prevent dma_set_runtime_config() reconfiguring memcpy channels
  ARM: PL08x: allow dma_set_runtime_config() to return errors
  ARM: PL08x: fix locking between prepare function and submit function
  ARM: PL08x: introduce 'phychan_hold' to hold on to physical channels
  ARM: PL08x: put txd's on the pending list in pl08x_tx_submit()
  ARM: PL08x: rename 'desc_list' as 'pend_list'
  ARM: PL08x: implement unmapping of memcpy buffers
  ARM: PL08x: store prep_* flags in async_tx structure
  ARM: PL08x: shrink srcbus/dstbus in txd structure
  ...

13 years agoecryptfs: fixed testing of file descriptor flags
Roberto Sassu [Wed, 3 Nov 2010 10:11:34 +0000 (11:11 +0100)]
ecryptfs: fixed testing of file descriptor flags

This patch replaces the check (lower_file->f_flags & O_RDONLY) with
((lower_file & O_ACCMODE) == O_RDONLY).

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agoecryptfs: test lower_file pointer when lower_file_mutex is locked
Roberto Sassu [Wed, 3 Nov 2010 10:11:28 +0000 (11:11 +0100)]
ecryptfs: test lower_file pointer when lower_file_mutex is locked

This patch prevents the lower_file pointer in the 'ecryptfs_inode_info'
structure to be checked when the mutex 'lower_file_mutex' is not locked.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agoecryptfs: missing initialization of the superblock 'magic' field
Roberto Sassu [Wed, 3 Nov 2010 10:11:22 +0000 (11:11 +0100)]
ecryptfs: missing initialization of the superblock 'magic' field

This patch initializes the 'magic' field of ecryptfs filesystems to
ECRYPTFS_SUPER_MAGIC.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
[tyhicks: merge with 66cb76666d69]
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agospi/spi_sh_msiof: fix a wrong free_irq() parameter
Guennadi Liakhovetski [Mon, 17 Jan 2011 16:01:07 +0000 (17:01 +0100)]
spi/spi_sh_msiof: fix a wrong free_irq() parameter

Without this fix reloading of the driver is impossible.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
13 years agoecryptfs: moved ECRYPTFS_SUPER_MAGIC definition to linux/magic.h
Roberto Sassu [Wed, 3 Nov 2010 10:11:15 +0000 (11:11 +0100)]
ecryptfs: moved ECRYPTFS_SUPER_MAGIC definition to linux/magic.h

The definition of ECRYPTFS_SUPER_MAGIC has been moved to the include
file 'linux/magic.h' to become available to other kernel subsystems.

Signed-off-by: Roberto Sassu <roberto.sassu@polito.it>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agoecryptfs: fix truncation error in ecryptfs_read_update_atime
Edward Shishkin [Fri, 29 Oct 2010 22:11:50 +0000 (00:11 +0200)]
ecryptfs: fix truncation error in ecryptfs_read_update_atime

This is similar to the bug found in direct-io not so long ago.

Fix up truncation (ssize_t->int).  This only matters with >2G
reads/writes, which the kernel doesn't permit.

Signed-off-by: Edward Shishkin <edward.shishkin@gmail.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
13 years agomtd: mtdpart: disallow reading OOB past the end of the partition
Artem Bityutskiy [Sun, 16 Jan 2011 15:50:54 +0000 (17:50 +0200)]
mtd: mtdpart: disallow reading OOB past the end of the partition

This patch fixes the mtdpart bug which allows users reading OOB past the
end of the partition. This happens because 'part_read_oob()' allows reading
multiple OOB areas in one go, and mtdparts does not validate the OOB
length in the request.

Although there is such check in 'nand_do_read_oob()' in nand_base.c, but
it checks that we do not read past the flash chip, not the partition,
because in nand_base.c we work with the whole chip (e.g., mtd->size
in nand_base.c is the size of the whole chip). So this check cannot
be done correctly in nand_base.c and should be instead done in mtdparts.c.

This problem was reported by Jason Liu <r64343@freescale.com> and reproduced
with nandsim:

$ modprobe nandsim first_id_byte=0x20 second_id_byte=0xaa third_id_byte=0x00 \
                   fourth_id_byte=0x15 parts=0x400,0x400
$ modprobe nandsim mtd_oobtest.ko dev=0
$ dmesg
= snip =
mtd_oobtest: attempting to read past end of device
mtd_oobtest: an error is expected...
mtd_oobtest: error: read past end of device
= snip =
mtd_oobtest: finished with 2 errors

Reported-by: Jason Liu <liu.h.jason@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
13 years agofs: fix address space warnings in ioctl_fiemap()
Namhyung Kim [Sun, 16 Jan 2011 14:28:17 +0000 (23:28 +0900)]
fs: fix address space warnings in ioctl_fiemap()

The fi_extents_start field of struct fiemap_extent_info is a
user pointer but was not marked as __user. This makes sparse
emit following warnings:

  CHECK   fs/ioctl.c
fs/ioctl.c:114:26: warning: incorrect type in argument 1 (different address spaces)
fs/ioctl.c:114:26:    expected void [noderef] <asn:1>*dst
fs/ioctl.c:114:26:    got struct fiemap_extent *[assigned] dest
fs/ioctl.c:202:14: warning: incorrect type in argument 1 (different address spaces)
fs/ioctl.c:202:14:    expected void const volatile [noderef] <asn:1>*<noident>
fs/ioctl.c:202:14:    got struct fiemap_extent *[assigned] fi_extents_start
fs/ioctl.c:212:27: warning: incorrect type in argument 1 (different address spaces)
fs/ioctl.c:212:27:    expected void [noderef] <asn:1>*dst
fs/ioctl.c:212:27:    got char *<noident>

Also add 'ufiemap' variable to eliminate unnecessary casts.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoaio: check return value of create_workqueue()
Namhyung Kim [Mon, 13 Dec 2010 15:06:25 +0000 (00:06 +0900)]
aio: check return value of create_workqueue()

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agohpfs_setattr error case avoids unlock_kernel
Dr. David Alan Gilbert [Mon, 13 Dec 2010 17:09:52 +0000 (17:09 +0000)]
hpfs_setattr error case avoids unlock_kernel

This fixed a case that 'sparse' spotted where hpfs_setattr has an error return
that didn't go through it's path that unlocks.

This is against git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
version 6313e3c21743cc88bb5bd8aa72948ee1e83937b6.

Build tested only, I don't have an hpfs file system to test.

Dave

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agocompat: copy missing fields in compat_statfs64 to user
Namhyung Kim [Sun, 26 Dec 2010 16:41:54 +0000 (01:41 +0900)]
compat: copy missing fields in compat_statfs64 to user

f_flags and f_spare fields were not copied to userspace when
compat_sys_[f]statfs64 called.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agocompat: update comment of compat statfs syscalls
Namhyung Kim [Sun, 26 Dec 2010 16:41:53 +0000 (01:41 +0900)]
compat: update comment of compat statfs syscalls

The commit 7ed1ee6118ae ("Take statfs variants to fs/statfs.c")
separates out statfs syscalls from fs/open.c. Thus the comment
should be changed also.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agocompat: remove unnecessary assignment in compat_rw_copy_check_uvector()
Namhyung Kim [Sun, 26 Dec 2010 16:41:52 +0000 (01:41 +0900)]
compat: remove unnecessary assignment in compat_rw_copy_check_uvector()

*@ret_pointer is initialized to @fast_pointer thus the assignment is
redundant.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofs: FS_POSIX_ACL does not depend on BLOCK
Randy Dunlap [Sun, 2 Jan 2011 22:44:00 +0000 (14:44 -0800)]
fs: FS_POSIX_ACL does not depend on BLOCK

- Fix a kconfig unmet dependency warning.
- Remove the comment that identifies which filesystems use POSIX ACL
  utility routines.
- Move the FS_POSIX_ACL symbol outside of the BLOCK symbol if/endif block
  because its functions do not depend on BLOCK and some of the filesystems
  that use it do not depend on BLOCK.

warning: (GENERIC_ACL && JFFS2_FS_POSIX_ACL && NFSD_V4 && NFS_ACL_SUPPORT && 9P_FS_POSIX_ACL) selects FS_POSIX_ACL which has unmet direct dependencies (BLOCK)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofs: Remove unlikely() from fget_light()
Steven Rostedt [Tue, 14 Dec 2010 00:38:09 +0000 (19:38 -0500)]
fs: Remove unlikely() from fget_light()

There's an unlikely() in fget_light() that assumes the file ref count
will be 1. Running the annotate branch profiler on a desktop that is
performing daily tasks (running firefox, evolution, xchat and is also part
of a distcc farm), it shows that the ref count is not 1 that often.

 correct incorrect      %    Function                  File              Line
 ------- ---------      -    --------                  ----              ----
1035099358 6209599193  85    fget_light              file_table.c         315

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofs: Remove unlikely() from fput_light()
Steven Rostedt [Tue, 14 Dec 2010 00:38:08 +0000 (19:38 -0500)]
fs: Remove unlikely() from fput_light()

In fput_light(), there's an unlikely(fput_needed), which running on
my normal desktop doing firefox, xchat, evolution and part of my distcc farm,
and running the annotate branch profiler shows that the unlikely is not
very unlikely.

 correct incorrect  %        Function             File              Line
 ------- ---------  -        --------             ----              ----
       0       48 100 fput_light                file.h               26
115828710 897415279  88 fput_light              file.h               26
865271179 5286128445  85 fput_light             file.h               26
19568539  8923664  31 fput_light                file.h               26
12353677  3562279  22 fput_light                file.h               26
  267691    67062  20 fput_light                file.h               26
15014853   348172   2 fput_light                file.h               26
  209258      205   0 fput_light                file.h               26
 1364164        0   0 fput_light                file.h               26

Which gives 1032903812 times it was correct and 6203351846 times it was
incorrect, or 85% incorrect.

Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofallocate should be a file operation
Christoph Hellwig [Fri, 14 Jan 2011 12:07:43 +0000 (13:07 +0100)]
fallocate should be a file operation

Currently all filesystems except XFS implement fallocate asynchronously,
while XFS forced a commit.  Both of these are suboptimal - in case of O_SYNC
I/O we really want our allocation on disk, especially for the !KEEP_SIZE
case where we actually grow the file with user-visible zeroes.  On the
other hand always commiting the transaction is a bad idea for fast-path
uses of fallocate like for example in recent Samba versions.   Given
that block allocation is a data plane operation anyway change it from
an inode operation to a file operation so that we have the file structure
available that lets us check for O_SYNC.

This also includes moving the code around for a few of the filesystems,
and remove the already unnedded S_ISDIR checks given that we only wire
up fallocate for regular files.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agomake the feature checks in ->fallocate future proof
Christoph Hellwig [Fri, 14 Jan 2011 12:07:30 +0000 (13:07 +0100)]
make the feature checks in ->fallocate future proof

Instead of various home grown checks that might need updates for new
flags just check for any bit outside the mask of the features supported
by the filesystem.  This makes the check future proof for any newly
added flag.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agostaging: smbfs building fix
Yang Ruirui [Fri, 14 Jan 2011 07:51:04 +0000 (15:51 +0800)]
staging: smbfs building fix

Building error for smbfs:

drivers/staging/smbfs/dir.c:286: error: static declaration of 'smbfs_dentry_operations' follows non-static declaration
drivers/staging/smbfs/proto.h:42: error: previous declaration of 'smbfs_dentry_operations' was here
drivers/staging/smbfs/dir.c:294: error: static declaration of 'smbfs_dentry_operations_case' follows non-static declaration
drivers/staging/smbfs/proto.h:41: error: previous declaration of 'smbfs_dentry_operations_case' was here
make[3]: *** [drivers/staging/smbfs/dir.o] Error 1
make[2]: *** [drivers/staging/smbfs] Error 2
make[1]: *** [drivers/staging] Error 2
make[1]: *** Waiting for unfinished jobs....

Fix it by removing static keywords

Signed-off-by: Yang Ruirui <ruirui.r.yang@tieto.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agotidy up around finish_automount()
Al Viro [Mon, 17 Jan 2011 06:47:59 +0000 (01:47 -0500)]
tidy up around finish_automount()

do_add_mount() and mnt_clear_expiry() are not needed outside of
namespace.c anymore, now that namei has finish_automount() to
use.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agodon't drop newmnt on error in do_add_mount()
Al Viro [Mon, 17 Jan 2011 06:41:58 +0000 (01:41 -0500)]
don't drop newmnt on error in do_add_mount()

That gets rid of the kludge in finish_automount() - we need
to keep refcount on the vfsmount as-is until we evict it from
expiry list.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoTake the completion of automount into new helper
Al Viro [Mon, 17 Jan 2011 06:35:23 +0000 (01:35 -0500)]
Take the completion of automount into new helper

... and shift it from namei.c to namespace.c

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoMerge branches 'misc', 'mlx4', 'mthca', 'nes' and 'srp' into for-next
Roland Dreier [Mon, 17 Jan 2011 05:22:41 +0000 (21:22 -0800)]
Merge branches 'misc', 'mlx4', 'mthca', 'nes' and 'srp' into for-next

13 years agoRDMA: Update workqueue usage
Tejun Heo [Tue, 19 Oct 2010 15:24:36 +0000 (15:24 +0000)]
RDMA: Update workqueue usage

* ib_wq is added, which is used as the common workqueue for infiniband
  instead of the system workqueue.  All system workqueue usages
  including flush_scheduled_work() callers are converted to use and
  flush ib_wq.

* cancel_delayed_work() + flush_scheduled_work() converted to
  cancel_delayed_work_sync().

* qib_wq is removed and ib_wq is used instead.

This is to prepare for deprecation of flush_scheduled_work().

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
13 years agoalpha: fix WARN_ON in __local_bh_enable()
Ivan Kokshaysky [Wed, 12 Jan 2011 22:02:24 +0000 (22:02 +0000)]
alpha: fix WARN_ON in __local_bh_enable()

Interrupts ought to be disabled _before_ irq_enter().

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matt Turner <mattst88@monolith.freenet-rz.de>
13 years agoalpha: fix breakage caused by df9ee29270
Ivan Kokshaysky [Wed, 12 Jan 2011 05:37:25 +0000 (00:37 -0500)]
alpha: fix breakage caused by df9ee29270

Commit df9ee29270 made arch_local_irq_save and arch_local_irq_restore
static inline which with -Werror trips up on __set_hae() and _set_hae()
which are extern inline.  The naive solution is to make __set_hae() and
set_hae() static inline but for reasons described in commit d559d4a24a3fe
this breaks the generic kernel build.  Instead, since this is architecture
specific code, this patch hard wires in the architecture specific method
f disabling and enabling interrupts.

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agoalpha: add GENERIC_HARDIRQS_NO__DO_IRQ to Kconfig
Matt Turner [Fri, 29 Oct 2010 21:22:00 +0000 (16:22 -0500)]
alpha: add GENERIC_HARDIRQS_NO__DO_IRQ to Kconfig

Acked-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agoalpha/osf_sys: remove unused MAX_SELECT_SECONDS
Namhyung Kim [Thu, 9 Dec 2010 09:07:21 +0000 (04:07 -0500)]
alpha/osf_sys: remove unused MAX_SELECT_SECONDS

Remove the leftover from the commit 14e2acd86865 ("select:
fix alpha OSF wrapper").

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agoalpha: change to new Makefile flag variables
matt mooney [Wed, 12 Jan 2011 03:09:09 +0000 (22:09 -0500)]
alpha: change to new Makefile flag variables

Signed-off-by: matt mooney <mfm@muteddisk.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agoalpha: kill off alpha_do_IRQ
Kyle McMartin [Fri, 15 Oct 2010 02:31:34 +0000 (22:31 -0400)]
alpha: kill off alpha_do_IRQ

Good riddance... Nuke a pile of redundant handlers that the
generic code takes care of as well.

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agoalpha: irq clean up
Kyle McMartin [Fri, 15 Oct 2010 02:31:25 +0000 (22:31 -0400)]
alpha: irq clean up

Stop touching irq_desc[irq] directly, instead use accessor
functions provided. Use irq_has_action instead of directly
testing the irq_desc.

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agoalpha: use set_irq_chip and push down __do_IRQ to the machine types
Kyle McMartin [Fri, 15 Oct 2010 02:31:11 +0000 (22:31 -0400)]
alpha: use set_irq_chip and push down __do_IRQ to the machine types

Also kill superfluous IRQ_DISABLED initialization, since that's the
default state of the irq_desc[i].status field.

Tested-by: Michael Cree <mcree@orcon.net.nz>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Matt Turner <mattst88@gmail.com>
13 years agodrm/radeon/kms: balance asic_reset functions
Alex Deucher [Tue, 11 Jan 2011 18:36:55 +0000 (13:36 -0500)]
drm/radeon/kms: balance asic_reset functions

First, we were calling mc_stop() at the top of the function
which turns off all MC (memory controller) clients,
then checking if the GPU is idle.  If it was idle we
returned without re-enabling the MC clients which would
lead to a blank screen, etc.  This patch checks if the
GPU is idle before calling mc_stop().

Second, if the reset failed, we were returning without
re-enabling the MC clients.  This patch re-enables
the MC clients before returning regardless of whether
the reset was successful or not.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: remove duplicate card_posted() functions
Alex Deucher [Tue, 11 Jan 2011 23:08:59 +0000 (18:08 -0500)]
drm/radeon/kms: remove duplicate card_posted() functions

Use the common one for all asics.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: add module option for pcie gen2
Alex Deucher [Thu, 13 Jan 2011 01:05:11 +0000 (20:05 -0500)]
drm/radeon/kms: add module option for pcie gen2

Switching to pcie gen2 causes problems on some
boards.  Add a module option to turn it on/off.

There are gen2 compatability issues with some
motherboards it seems.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=33027

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agodrm/radeon/kms: fix typo in evergreen safe reg
Alex Deucher [Thu, 13 Jan 2011 18:55:12 +0000 (13:55 -0500)]
drm/radeon/kms: fix typo in evergreen safe reg

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
13 years agoMerge remote branch 'nouveau/drm-nouveau-next' of /ssd/git/drm-nouveau-next into...
Dave Airlie [Mon, 17 Jan 2011 02:20:31 +0000 (12:20 +1000)]
Merge remote branch 'nouveau/drm-nouveau-next' of /ssd/git/drm-nouveau-next into drm-fixes

* 'nouveau/drm-nouveau-next' of /ssd/git/drm-nouveau-next:
  drm/nouveau: fix gpu page faults triggered by plymouthd
  drm/nouveau: greatly simplify mm, killing some bugs in the process
  drm/nvc0: enable protection of system-use-only structures in vm
  drm/nv40: initialise 0x17xx on all chipsets that have it
  drm/nv40: make detection of 0x4097-ful chipsets available everywhere

13 years agodrm/nouveau: fix gpu page faults triggered by plymouthd
Ben Skeggs [Mon, 17 Jan 2011 01:22:38 +0000 (11:22 +1000)]
drm/nouveau: fix gpu page faults triggered by plymouthd

The switch to separate BAR and channel address spaces made the fbcon memory
address calculation incorrect on NV50+ boards, this commit fixes that.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
13 years agodrm/nouveau: greatly simplify mm, killing some bugs in the process
Ben Skeggs [Fri, 14 Jan 2011 05:46:30 +0000 (15:46 +1000)]
drm/nouveau: greatly simplify mm, killing some bugs in the process

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
13 years agodrm/nvc0: enable protection of system-use-only structures in vm
Ben Skeggs [Fri, 14 Jan 2011 00:27:02 +0000 (10:27 +1000)]
drm/nvc0: enable protection of system-use-only structures in vm

Somehow missed this in the original merge of the nvc0 code.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
13 years agodrm/nv40: initialise 0x17xx on all chipsets that have it
Ben Skeggs [Tue, 11 Jan 2011 07:22:33 +0000 (17:22 +1000)]
drm/nv40: initialise 0x17xx on all chipsets that have it

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
13 years agodrm/nv40: make detection of 0x4097-ful chipsets available everywhere
Ben Skeggs [Tue, 11 Jan 2011 04:23:12 +0000 (14:23 +1000)]
drm/nv40: make detection of 0x4097-ful chipsets available everywhere

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
13 years agodrivers/nfc/pn544.c: fix min_t warnings
Andrew Morton [Mon, 17 Jan 2011 00:55:23 +0000 (16:55 -0800)]
drivers/nfc/pn544.c: fix min_t warnings

Fix these:

  drivers/nfc/pn544.c: In function 'pn544_read':
  drivers/nfc/pn544.c:356: warning: comparison of distinct pointer types lacks a cast
  drivers/nfc/pn544.c:377: warning: comparison of distinct pointer types lacks a cast
  drivers/nfc/pn544.c: In function 'pn544_write':
  drivers/nfc/pn544.c:463: warning: comparison of distinct pointer types lacks a cast
  drivers/nfc/pn544.c:485: warning: comparison of distinct pointer types lacks a cast

Cc: "Matti J. Aaltonen" <matti.j.aaltonen@nokia.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoARM: PL08x: cleanup comments
Russell King - ARM Linux [Sun, 16 Jan 2011 20:18:05 +0000 (20:18 +0000)]
ARM: PL08x: cleanup comments

Cleanup the formatting of comments, remove some which don't make sense
anymore.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[fix conflict with 96a608a4]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi...
Linus Torvalds [Sun, 16 Jan 2011 23:06:43 +0000 (15:06 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/nab/scsi-post-merge-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/scsi-post-merge-2.6:
  ocfs2: Make OCFS2_FS depend on CONFIGFS_FS
  dlm: Make DLM depend on CONFIGFS_FS
  net: Make NETCONSOLE_DYNAMIC depend on CONFIGFS_FS
  configfs: change depends -> select SYSFS
  [SCSI] sd,sr: kill compat SDEV_MEDIA_CHANGE event
  [SCSI] sd: implement sd_check_events()

13 years agoparisc: fix compile breakage caused by inlining maybe_mkwrite
James Bottomley [Sun, 16 Jan 2011 22:06:14 +0000 (23:06 +0100)]
parisc: fix compile breakage caused by inlining maybe_mkwrite

On PARISC, we have an include of linux/mm.h inside our asm/pgtable.h, so
this patch

  commit 14fd403f2146f740942d78af4e0ee59396ad8eab
  Author: Andrea Arcangeli <aarcange@redhat.com>
  Date:   Thu Jan 13 15:46:37 2011 -0800

      thp: export maybe_mkwrite

causes us an unsatisfiable use of pte_mkwrite in linux/mm.h.

The fix is to avoid including linux/mm.h in our pgtable.h, which
unbreaks the build.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agofix non-x86 build failure in pmdp_get_and_clear
Andrea Arcangeli [Sun, 16 Jan 2011 21:10:39 +0000 (13:10 -0800)]
fix non-x86 build failure in pmdp_get_and_clear

pmdp_get_and_clear/pmdp_clear_flush/pmdp_splitting_flush were trapped as
BUG() and they were defined only to diminish the risk of build issues on
not-x86 archs and to be consistent with the generic pte methods previously
defined in include/asm-generic/pgtable.h.

But they are causing more trouble than they were supposed to solve, so
it's simpler not to define them when THP is off.

This is also correcting the export of pmdp_splitting_flush which is
currently unused (x86 isn't using the generic implementation in
mm/pgtable-generic.c and no other arch needs that [yet]).

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Sam Ravnborg <sam@ravnborg.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoVFS: Fix UP compile error in fs/namespace.c
Al Viro [Sun, 16 Jan 2011 21:32:11 +0000 (16:32 -0500)]
VFS: Fix UP compile error in fs/namespace.c

mnt_longterm is there only on SMP

Reported-and-tested-by: Joachim Eastwood <manabian@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoRDMA/nes: Fix incorrect SFP+ link status detection on driver init
Maciej Sosnowski [Wed, 24 Nov 2010 17:29:54 +0000 (17:29 +0000)]
RDMA/nes: Fix incorrect SFP+ link status detection on driver init

During iw_nes initialization the link status for SFP+ PHY is always
detected as "up" regardless of real state (cable either connected or
disconnected).  Add SFP+ PHY specific link status detection to the
iw_nes initialization procedure.  Use link status recheck for
netdev_open to detect delayed state updates.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
13 years agoRDMA/nes: Fix SFP+ link down detection issue with switch port disable
Maciej Sosnowski [Wed, 24 Nov 2010 17:29:46 +0000 (17:29 +0000)]
RDMA/nes: Fix SFP+ link down detection issue with switch port disable

In case of SFP+ PHY, link status check at interrupt processing can
give false results.  For proper link status change detection a delayed
recheck is needed to give nes registers time to settle.  Add a
periodic link status recheck scheduled at interrupt to detect
potential delayed registers state changes.

Addresses: http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2117
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
13 years agoRDMA/nes: Generate IB_EVENT_PORT_ERR/PORT_ACTIVE events
Maciej Sosnowski [Wed, 24 Nov 2010 17:29:38 +0000 (17:29 +0000)]
RDMA/nes: Generate IB_EVENT_PORT_ERR/PORT_ACTIVE events

Depending on link state change, IB_EVENT_PORT_ERR or
IB_EVENT_PORT_ACTIVE should be generated when handling MAC interrupts.

Plugging in a cable happens to result in series of interrupts changing
driver's link state a number of times before finally staying at link
up (e.g. link up, link down, link up, link down, ..., link up).  To
prevent sending series of redundant IB_EVENT_PORT_ACTIVE and
IB_EVENT_PORT_ERR events, we use a timer to debounce them in
nes_port_ibevent().

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
13 years agoRDMA/nes: Fix bonding on iw_nes
Maciej Sosnowski [Wed, 24 Nov 2010 17:29:30 +0000 (17:29 +0000)]
RDMA/nes: Fix bonding on iw_nes

Enable configuring bonds on nes devices by adding missing support for
master net_device to the driver.

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
13 years agoocfs2: Make OCFS2_FS depend on CONFIGFS_FS
Nicholas Bellinger [Sun, 16 Jan 2011 21:16:07 +0000 (21:16 +0000)]
ocfs2: Make OCFS2_FS depend on CONFIGFS_FS

This patch fixes the following kconfig error after changing
CONFIGFS_FS -> select SYSFS:

fs/sysfs/Kconfig:1:error: recursive dependency detected!
fs/sysfs/Kconfig:1: symbol SYSFS is selected by CONFIGFS_FS
fs/configfs/Kconfig:1: symbol CONFIGFS_FS is selected by OCFS2_FS
fs/ocfs2/Kconfig:1: symbol OCFS2_FS depends on SYSFS

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: James Bottomley <James.Bottomley@suse.de>
13 years agodlm: Make DLM depend on CONFIGFS_FS
Nicholas Bellinger [Sun, 16 Jan 2011 21:14:52 +0000 (21:14 +0000)]
dlm: Make DLM depend on CONFIGFS_FS

This patch fixes the following kconfig error after changing
CONFIGFS_FS -> select SYSFS:

fs/sysfs/Kconfig:1:error: recursive dependency detected!
fs/sysfs/Kconfig:1: symbol SYSFS is selected by CONFIGFS_FS
fs/configfs/Kconfig:1: symbol CONFIGFS_FS is selected by DLM
fs/dlm/Kconfig:1: symbol DLM depends on SYSFS

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: James Bottomley <James.Bottomley@suse.de>
13 years agonet: Make NETCONSOLE_DYNAMIC depend on CONFIGFS_FS
Nicholas Bellinger [Sun, 16 Jan 2011 21:13:59 +0000 (21:13 +0000)]
net: Make NETCONSOLE_DYNAMIC depend on CONFIGFS_FS

This patch fixes the following kconfig error after changing
CONFIGFS_FS -> select SYSFS:

fs/sysfs/Kconfig:1:error: recursive dependency detected!
fs/sysfs/Kconfig:1: symbol SYSFS is selected by CONFIGFS_FS
fs/configfs/Kconfig:1: symbol CONFIGFS_FS is selected by NETCONSOLE_DYNAMIC
drivers/net/Kconfig:3390: symbol NETCONSOLE_DYNAMIC depends on SYSFS

Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: James Bottomley <James.Bottomley@suse.de>
13 years agoconfigfs: change depends -> select SYSFS
Nicholas Bellinger [Sun, 16 Jan 2011 21:11:25 +0000 (21:11 +0000)]
configfs: change depends -> select SYSFS

This patch changes configfs to select SYSFS to fix the following:

warning: (TARGET_CORE && GFS2_FS) selects CONFIGFS_FS which has unmet direct dependencies (SYSFS)

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Acked-by: Joel Becker <jlbec@evilplan.org>
13 years agoMerge branch 'master' of /pub/scm/linux/kernel/git/jejb/scsi-post-merge-2.6 into...
Nicholas Bellinger [Sun, 16 Jan 2011 21:21:04 +0000 (21:21 +0000)]
Merge branch 'master' of /linux/kernel/git/jejb/scsi-post-merge-2.6 into for-linus

13 years agofs/btrfs: Fix build of ctree
Stefan Schmidt [Wed, 12 Jan 2011 09:30:42 +0000 (10:30 +0100)]
fs/btrfs: Fix build of ctree

Fix the build failure in some configurations:

     CC [M]  fs/btrfs/ctree.o
  In file included from fs/btrfs/ctree.c:21:0:
  fs/btrfs/ctree.h:1003:17: error: field 'super_kobj' has incomplete type
  fs/btrfs/ctree.h:1074:17: error: field 'root_kobj' has incomplete type
  make[2]: *** [fs/btrfs/ctree.o] Error 1
  make[1]: *** [fs/btrfs] Error 2
  make: *** [fs] Error 2

caused by commit 57cc7215b708 ("headers: kobject.h redux")

We need to include kobject.h here.

Reported-by: Jeff Garzik <jeff@garzik.org>
Fix-suggested-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus
Linus Torvalds [Sun, 16 Jan 2011 20:11:13 +0000 (12:11 -0800)]
Merge git://git./linux/kernel/git/pkl/squashfs-linus

* git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus:
  Squashfs: simplify CONFIG_SQUASHFS_LZO handling
  Squashfs: move squashfs_i() definition from squashfs.h
  Squashfs: get rid of default n in Kconfig
  Squashfs: add missing check in zlib_wrapper
  Squashfs: remove unnecessary variable in zlib_wrapper
  Squashfs: Add XZ compression configuration option
  Squashfs: add XZ compression support

13 years agoACPI: Fix boot problem related to APEI with acpi_disabled set
Rafael J. Wysocki [Sun, 16 Jan 2011 19:44:22 +0000 (20:44 +0100)]
ACPI: Fix boot problem related to APEI with acpi_disabled set

Commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each root
bridge (v3)") put the acpi_hest_init() call in acpi_pci_root_init() into
a wrong place, presumably because the author confused acpi_pci_disabled
with acpi_disabled.  Bring the code ordering in acpi_pci_root_init()
back to sanity.

Additionally, make sure that hest_disable is set when acpi_disabled is
set, which is going to prevent acpi_hest_parse(), that still may be
executed for acpi_disabled=1 through aer_acpi_firmware_first(), from
crashing because of uninitialized hest_tab.

Reported-and-tested-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoPCI / ACPI: Fix build of the AER driver for CONFIG_ACPI unset
Rafael J. Wysocki [Sun, 16 Jan 2011 19:42:43 +0000 (20:42 +0100)]
PCI / ACPI: Fix build of the AER driver for CONFIG_ACPI unset

After commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each
root bridge (v3)") include/linux/pci-acpi.h is included by
drivers/pci/pcie/aer/aerdrv.c and if CONFIG_ACPI is unset, the bogus and
unnecessary alternative definition of acpi_find_root_bridge_handle()
causes a build error to occur.

Remove the offending piece of garbage.

Reported-and-tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
Linus Torvalds [Sun, 16 Jan 2011 19:31:50 +0000 (11:31 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (23 commits)
  sanitize vfsmount refcounting changes
  fix old umount_tree() breakage
  autofs4: Merge the remaining dentry ops tables
  Unexport do_add_mount() and add in follow_automount(), not ->d_automount()
  Allow d_manage() to be used in RCU-walk mode
  Remove a further kludge from __do_follow_link()
  autofs4: Bump version
  autofs4: Add v4 pseudo direct mount support
  autofs4: Fix wait validation
  autofs4: Clean up autofs4_free_ino()
  autofs4: Clean up dentry operations
  autofs4: Clean up inode operations
  autofs4: Remove unused code
  autofs4: Add d_manage() dentry operation
  autofs4: Add d_automount() dentry operation
  Remove the automount through follow_link() kludge code from pathwalk
  CIFS: Use d_automount() rather than abusing follow_link()
  NFS: Use d_automount() rather than abusing follow_link()
  AFS: Use d_automount() rather than abusing follow_link()
  Add an AT_NO_AUTOMOUNT flag to suppress terminal automount
  ...

13 years agosanitize vfsmount refcounting changes
Al Viro [Sat, 15 Jan 2011 03:30:21 +0000 (22:30 -0500)]
sanitize vfsmount refcounting changes

Instead of splitting refcount between (per-cpu) mnt_count
and (SMP-only) mnt_longrefs, make all references contribute
to mnt_count again and keep track of how many are longterm
ones.

Accounting rules for longterm count:
* 1 for each fs_struct.root.mnt
* 1 for each fs_struct.pwd.mnt
* 1 for having non-NULL ->mnt_ns
* decrement to 0 happens only under vfsmount lock exclusive

That allows nice common case for mntput() - since we can't drop the
final reference until after mnt_longterm has reached 0 due to the rules
above, mntput() can grab vfsmount lock shared and check mnt_longterm.
If it turns out to be non-zero (which is the common case), we know
that this is not the final mntput() and can just blindly decrement
percpu mnt_count.  Otherwise we grab vfsmount lock exclusive and
do usual decrement-and-check of percpu mnt_count.

For fs_struct.c we have mnt_make_longterm() and mnt_make_shortterm();
namespace.c uses the latter in places where we don't already hold
vfsmount lock exclusive and opencodes a few remaining spots where
we need to manipulate mnt_longterm.

Note that we mostly revert the code outside of fs/namespace.c back
to what we used to have; in particular, normal code doesn't need
to care about two kinds of references, etc.  And we get to keep
the optimization Nick's variant had bought us...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agofix old umount_tree() breakage
Al Viro [Sun, 16 Jan 2011 01:08:44 +0000 (20:08 -0500)]
fix old umount_tree() breakage

Expiry-related code calls umount_tree() several times with
the same list to collect vfsmounts to.  Which is fine, except
that umount_tree() implicitly assumed that the list would
be empty on each call - it moves the victims over there and
then iterates through the list kicking them out.  It's *almost*
idempotent, so everything nearly worked.  However, mnt->ghosts
handling (and thus expirability checks) had been broken - that
part was not idempotent...

The fix is trivial - use local temporary list, splice it to
the the collector list when we are through.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agobtrfs: Require CAP_SYS_ADMIN for filesystem rebalance
Ben Hutchings [Wed, 29 Dec 2010 14:55:03 +0000 (14:55 +0000)]
btrfs: Require CAP_SYS_ADMIN for filesystem rebalance

Filesystem rebalancing (BTRFS_IOC_BALANCE) affects the entire
filesystem and may run uninterruptibly for a long time.  This does not
seem to be something that an unprivileged user should be able to do.

Reported-by: Aron Xu <happyaron.xu@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agoBtrfs: don't warn if we get ENOSPC in btrfs_block_rsv_check
Josef Bacik [Wed, 12 Jan 2011 21:04:22 +0000 (21:04 +0000)]
Btrfs: don't warn if we get ENOSPC in btrfs_block_rsv_check

If we run low on space we could get a bunch of warnings out of
btrfs_block_rsv_check, but this is mostly just called via the transaction code
to see if we need to end the transaction, it expects to see failures, so let's
not WARN and freak everybody out for no reason.  Thanks,

Signed-off-by: Josef Bacik <josef@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: Fix memory leak in btrfs_read_fs_root_no_radix()
Tsutomu Itoh [Mon, 27 Dec 2010 06:53:10 +0000 (06:53 +0000)]
btrfs: Fix memory leak in btrfs_read_fs_root_no_radix()

In btrfs_read_fs_root_no_radix(), 'root' is not freed if
btrfs_search_slot() returns error.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: check NULL or not
Tsutomu Itoh [Wed, 5 Jan 2011 02:32:22 +0000 (02:32 +0000)]
btrfs: check NULL or not

Should check if functions returns NULL or not.

Signed-off-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: Don't pass NULL ptr to func that may deref it.
Jesper Juhl [Sat, 25 Dec 2010 21:22:30 +0000 (21:22 +0000)]
btrfs: Don't pass NULL ptr to func that may deref it.

Hi,

In fs/btrfs/inode.c::fixup_tree_root_location() we have this code:

...
  if (!path) {
  err = -ENOMEM;
  goto out;
  }
...
  out:
  btrfs_free_path(path);
  return err;

btrfs_free_path() passes its argument on to other functions and some of
them end up dereferencing the pointer.
In the code above that pointer is clearly NULL, so btrfs_free_path() will
eventually cause a NULL dereference.

There are many ways to cut this cake (fix the bug). The one I chose was to
make btrfs_free_path() deal gracefully with NULL pointers. If you
disagree, feel free to come up with an alternative patch.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: mount failure return value fix
Dave Young [Sat, 8 Jan 2011 10:09:13 +0000 (10:09 +0000)]
btrfs: mount failure return value fix

I happened to pass swap partition as root partition in cmdline,
then kernel panic and tell me about "Cannot open root device".
It is not correct, in fact it is a fs type mismatch instead of 'no device'.

Eventually I found btrfs mounting failed with -EIO, it should be -EINVAL.
The logic in init/do_mounts.c:
        for (p = fs_names; *p; p += strlen(p)+1) {
                int err = do_mount_root(name, p, flags, root_mount_data);
                switch (err) {
                        case 0:
                                goto out;
                        case -EACCES:
                                flags |= MS_RDONLY;
                                goto retry;
                        case -EINVAL:
                                continue;
                }
print "Cannot open root device"
panic
}
SO fs type after btrfs will have no chance to mount

Here fix the return value as -EINVAL

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: Mem leak in btrfs_get_acl()
Jesper Juhl [Thu, 6 Jan 2011 21:45:21 +0000 (21:45 +0000)]
btrfs: Mem leak in btrfs_get_acl()

It seems to me that we leak the memory allocated to 'value' in
btrfs_get_acl() if the call to posix_acl_from_xattr() fails.
Here's a patch that attempts to correct that problem.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: fix wrong free space information of btrfs
Miao Xie [Wed, 5 Jan 2011 10:07:31 +0000 (10:07 +0000)]
btrfs: fix wrong free space information of btrfs

When we store data by raid profile in btrfs with two or more different size
disks, df command shows there is some free space in the filesystem, but the
user can not write any data in fact, df command shows the wrong free space
information of btrfs.

 # mkfs.btrfs -d raid1 /dev/sda9 /dev/sda10
 # btrfs-show
 Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
  Total devices 2 FS bytes used 28.00KB
  devid    1 size 5.01GB used 2.03GB path /dev/sda9
  devid    2 size 10.00GB used 2.01GB path /dev/sda10
 # btrfs device scan /dev/sda9 /dev/sda10
 # mount /dev/sda9 /mnt
 # dd if=/dev/zero of=tmpfile0 bs=4K count=9999999999
   (fill the filesystem)
 # sync
 # df -TH
 Filesystem Type Size Used Avail Use% Mounted on
 /dev/sda9 btrfs 17G 8.6G 5.4G 62% /mnt
 # btrfs-show
 Label: none  uuid: a95cd49e-6e33-45b8-8741-a36153ce4b64
  Total devices 2 FS bytes used 3.99GB
  devid    1 size 5.01GB used 5.01GB path /dev/sda9
  devid    2 size 10.00GB used 4.99GB path /dev/sda10

It is because btrfs cannot allocate chunks when one of the pairing disks has
no space, the free space on the other disks can not be used for ever, and should
be subtracted from the total space, but btrfs doesn't subtract this space from
the total. It is strange to the user.

This patch fixes it by calcing the free space that can be used to allocate
chunks.

Implementation:
1. get all the devices free space, and align them by stripe length.
2. sort the devices by the free space.
3. check the free space of the devices,
   3.1. if it is not zero, and then check the number of the devices that has
        more free space than this device,
        if the number of the devices is beyond the min stripe number, the free
        space can be used, and add into total free space.
        if the number of the devices is below the min stripe number, we can not
        use the free space, the check ends.
   3.2. if the free space is zero, check the next devices, goto 3.1

This implementation is just likely fake chunk allocation.

After appling this patch, df can show correct space information:
 # df -TH
 Filesystem Type Size Used Avail Use% Mounted on
 /dev/sda9 btrfs 17G 8.6G 0 100% /mnt

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
13 years agobtrfs: make the chunk allocator utilize the devices better
Miao Xie [Wed, 5 Jan 2011 10:07:28 +0000 (10:07 +0000)]
btrfs: make the chunk allocator utilize the devices better

With this patch, we change the handling method when we can not get enough free
extents with default size.

Implementation:
1. Look up the suitable free extent on each device and keep the search result.
   If not find a suitable free extent, keep the max free extent
2. If we get enough suitable free extents with default size, chunk allocation
   succeeds.
3. If we can not get enough free extents, but the number of the extent with
   default size is >= min_stripes, we just change the mapping information
   (reduce the number of stripes in the extent map), and chunk allocation
   succeeds.
4. If the number of the extent with default size is < min_stripes, sort the
   devices by its max free extent's size descending
5. Use the size of the max free extent on the (num_stripes - 1)th device as the
   stripe size to allocate the device space

By this way, the chunk allocator can allocate chunks as large as possible when
the devices' space is not enough and make full use of the devices.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>