Jan Kara [Fri, 1 Aug 2014 10:20:02 +0000 (12:20 +0200)]
timer: Fix lock inversion between hrtimer_bases.lock and scheduler locks
clockevents_increase_min_delta() calls printk() from under
hrtimer_bases.lock. That causes lock inversion on scheduler locks because
printk() can call into the scheduler. Lockdep puts it as:
======================================================
[ INFO: possible circular locking dependency detected ]
3.15.0-rc8-06195-g939f04b #2 Not tainted
-------------------------------------------------------
trinity-main/74 is trying to acquire lock:
(&port_lock_key){-.....}, at: [<
811c60be>] serial8250_console_write+0x8c/0x10c
but task is already holding lock:
(hrtimer_bases.lock){-.-...}, at: [<
8103caeb>] hrtimer_try_to_cancel+0x13/0x66
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #5 (hrtimer_bases.lock){-.-...}:
[<
8104a942>] lock_acquire+0x92/0x101
[<
8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
[<
8103c918>] __hrtimer_start_range_ns+0x1c/0x197
[<
8107ec20>] perf_swevent_start_hrtimer.part.41+0x7a/0x85
[<
81080792>] task_clock_event_start+0x3a/0x3f
[<
810807a4>] task_clock_event_add+0xd/0x14
[<
8108259a>] event_sched_in+0xb6/0x17a
[<
810826a2>] group_sched_in+0x44/0x122
[<
81082885>] ctx_sched_in.isra.67+0x105/0x11f
[<
810828e6>] perf_event_sched_in.isra.70+0x47/0x4b
[<
81082bf6>] __perf_install_in_context+0x8b/0xa3
[<
8107eb8e>] remote_function+0x12/0x2a
[<
8105f5af>] smp_call_function_single+0x2d/0x53
[<
8107e17d>] task_function_call+0x30/0x36
[<
8107fb82>] perf_install_in_context+0x87/0xbb
[<
810852c9>] SYSC_perf_event_open+0x5c6/0x701
[<
810856f9>] SyS_perf_event_open+0x17/0x19
[<
8142f8ee>] syscall_call+0x7/0xb
-> #4 (&ctx->lock){......}:
[<
8104a942>] lock_acquire+0x92/0x101
[<
8142f04c>] _raw_spin_lock+0x21/0x30
[<
81081df3>] __perf_event_task_sched_out+0x1dc/0x34f
[<
8142cacc>] __schedule+0x4c6/0x4cb
[<
8142cae0>] schedule+0xf/0x11
[<
8142f9a6>] work_resched+0x5/0x30
-> #3 (&rq->lock){-.-.-.}:
[<
8104a942>] lock_acquire+0x92/0x101
[<
8142f04c>] _raw_spin_lock+0x21/0x30
[<
81040873>] __task_rq_lock+0x33/0x3a
[<
8104184c>] wake_up_new_task+0x25/0xc2
[<
8102474b>] do_fork+0x15c/0x2a0
[<
810248a9>] kernel_thread+0x1a/0x1f
[<
814232a2>] rest_init+0x1a/0x10e
[<
817af949>] start_kernel+0x303/0x308
[<
817af2ab>] i386_start_kernel+0x79/0x7d
-> #2 (&p->pi_lock){-.-...}:
[<
8104a942>] lock_acquire+0x92/0x101
[<
8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
[<
810413dd>] try_to_wake_up+0x1d/0xd6
[<
810414cd>] default_wake_function+0xb/0xd
[<
810461f3>] __wake_up_common+0x39/0x59
[<
81046346>] __wake_up+0x29/0x3b
[<
811b8733>] tty_wakeup+0x49/0x51
[<
811c3568>] uart_write_wakeup+0x17/0x19
[<
811c5dc1>] serial8250_tx_chars+0xbc/0xfb
[<
811c5f28>] serial8250_handle_irq+0x54/0x6a
[<
811c5f57>] serial8250_default_handle_irq+0x19/0x1c
[<
811c56d8>] serial8250_interrupt+0x38/0x9e
[<
810510e7>] handle_irq_event_percpu+0x5f/0x1e2
[<
81051296>] handle_irq_event+0x2c/0x43
[<
81052cee>] handle_level_irq+0x57/0x80
[<
81002a72>] handle_irq+0x46/0x5c
[<
810027df>] do_IRQ+0x32/0x89
[<
8143036e>] common_interrupt+0x2e/0x33
[<
8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49
[<
811c25a4>] uart_start+0x2d/0x32
[<
811c2c04>] uart_write+0xc7/0xd6
[<
811bc6f6>] n_tty_write+0xb8/0x35e
[<
811b9beb>] tty_write+0x163/0x1e4
[<
811b9cd9>] redirected_tty_write+0x6d/0x75
[<
810b6ed6>] vfs_write+0x75/0xb0
[<
810b7265>] SyS_write+0x44/0x77
[<
8142f8ee>] syscall_call+0x7/0xb
-> #1 (&tty->write_wait){-.....}:
[<
8104a942>] lock_acquire+0x92/0x101
[<
8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
[<
81046332>] __wake_up+0x15/0x3b
[<
811b8733>] tty_wakeup+0x49/0x51
[<
811c3568>] uart_write_wakeup+0x17/0x19
[<
811c5dc1>] serial8250_tx_chars+0xbc/0xfb
[<
811c5f28>] serial8250_handle_irq+0x54/0x6a
[<
811c5f57>] serial8250_default_handle_irq+0x19/0x1c
[<
811c56d8>] serial8250_interrupt+0x38/0x9e
[<
810510e7>] handle_irq_event_percpu+0x5f/0x1e2
[<
81051296>] handle_irq_event+0x2c/0x43
[<
81052cee>] handle_level_irq+0x57/0x80
[<
81002a72>] handle_irq+0x46/0x5c
[<
810027df>] do_IRQ+0x32/0x89
[<
8143036e>] common_interrupt+0x2e/0x33
[<
8142f23c>] _raw_spin_unlock_irqrestore+0x3f/0x49
[<
811c25a4>] uart_start+0x2d/0x32
[<
811c2c04>] uart_write+0xc7/0xd6
[<
811bc6f6>] n_tty_write+0xb8/0x35e
[<
811b9beb>] tty_write+0x163/0x1e4
[<
811b9cd9>] redirected_tty_write+0x6d/0x75
[<
810b6ed6>] vfs_write+0x75/0xb0
[<
810b7265>] SyS_write+0x44/0x77
[<
8142f8ee>] syscall_call+0x7/0xb
-> #0 (&port_lock_key){-.....}:
[<
8104a62d>] __lock_acquire+0x9ea/0xc6d
[<
8104a942>] lock_acquire+0x92/0x101
[<
8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
[<
811c60be>] serial8250_console_write+0x8c/0x10c
[<
8104e402>] call_console_drivers.constprop.31+0x87/0x118
[<
8104f5d5>] console_unlock+0x1d7/0x398
[<
8104fb70>] vprintk_emit+0x3da/0x3e4
[<
81425f76>] printk+0x17/0x19
[<
8105bfa0>] clockevents_program_min_delta+0x104/0x116
[<
8105c548>] clockevents_program_event+0xe7/0xf3
[<
8105cc1c>] tick_program_event+0x1e/0x23
[<
8103c43c>] hrtimer_force_reprogram+0x88/0x8f
[<
8103c49e>] __remove_hrtimer+0x5b/0x79
[<
8103cb21>] hrtimer_try_to_cancel+0x49/0x66
[<
8103cb4b>] hrtimer_cancel+0xd/0x18
[<
8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
[<
81080705>] task_clock_event_stop+0x20/0x64
[<
81080756>] task_clock_event_del+0xd/0xf
[<
81081350>] event_sched_out+0xab/0x11e
[<
810813e0>] group_sched_out+0x1d/0x66
[<
81081682>] ctx_sched_out+0xaf/0xbf
[<
81081e04>] __perf_event_task_sched_out+0x1ed/0x34f
[<
8142cacc>] __schedule+0x4c6/0x4cb
[<
8142cae0>] schedule+0xf/0x11
[<
8142f9a6>] work_resched+0x5/0x30
other info that might help us debug this:
Chain exists of:
&port_lock_key --> &ctx->lock --> hrtimer_bases.lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(hrtimer_bases.lock);
lock(&ctx->lock);
lock(hrtimer_bases.lock);
lock(&port_lock_key);
*** DEADLOCK ***
4 locks held by trinity-main/74:
#0: (&rq->lock){-.-.-.}, at: [<
8142c6f3>] __schedule+0xed/0x4cb
#1: (&ctx->lock){......}, at: [<
81081df3>] __perf_event_task_sched_out+0x1dc/0x34f
#2: (hrtimer_bases.lock){-.-...}, at: [<
8103caeb>] hrtimer_try_to_cancel+0x13/0x66
#3: (console_lock){+.+...}, at: [<
8104fb5d>] vprintk_emit+0x3c7/0x3e4
stack backtrace:
CPU: 0 PID: 74 Comm: trinity-main Not tainted
3.15.0-rc8-06195-g939f04b #2
00000000 81c3a310 8b995c14 81426f69 8b995c44 81425a99 8161f671 8161f570
8161f538 8161f559 8161f538 8b995c78 8b142bb0 00000004 8b142fdc 8b142bb0
8b995ca8 8104a62d 8b142fac 000016f2 81c3a310 00000001 00000001 00000003
Call Trace:
[<
81426f69>] dump_stack+0x16/0x18
[<
81425a99>] print_circular_bug+0x18f/0x19c
[<
8104a62d>] __lock_acquire+0x9ea/0xc6d
[<
8104a942>] lock_acquire+0x92/0x101
[<
811c60be>] ? serial8250_console_write+0x8c/0x10c
[<
811c6032>] ? wait_for_xmitr+0x76/0x76
[<
8142f11d>] _raw_spin_lock_irqsave+0x2e/0x3e
[<
811c60be>] ? serial8250_console_write+0x8c/0x10c
[<
811c60be>] serial8250_console_write+0x8c/0x10c
[<
8104af87>] ? lock_release+0x191/0x223
[<
811c6032>] ? wait_for_xmitr+0x76/0x76
[<
8104e402>] call_console_drivers.constprop.31+0x87/0x118
[<
8104f5d5>] console_unlock+0x1d7/0x398
[<
8104fb70>] vprintk_emit+0x3da/0x3e4
[<
81425f76>] printk+0x17/0x19
[<
8105bfa0>] clockevents_program_min_delta+0x104/0x116
[<
8105cc1c>] tick_program_event+0x1e/0x23
[<
8103c43c>] hrtimer_force_reprogram+0x88/0x8f
[<
8103c49e>] __remove_hrtimer+0x5b/0x79
[<
8103cb21>] hrtimer_try_to_cancel+0x49/0x66
[<
8103cb4b>] hrtimer_cancel+0xd/0x18
[<
8107f102>] perf_swevent_cancel_hrtimer.part.60+0x2b/0x30
[<
81080705>] task_clock_event_stop+0x20/0x64
[<
81080756>] task_clock_event_del+0xd/0xf
[<
81081350>] event_sched_out+0xab/0x11e
[<
810813e0>] group_sched_out+0x1d/0x66
[<
81081682>] ctx_sched_out+0xaf/0xbf
[<
81081e04>] __perf_event_task_sched_out+0x1ed/0x34f
[<
8104416d>] ? __dequeue_entity+0x23/0x27
[<
81044505>] ? pick_next_task_fair+0xb1/0x120
[<
8142cacc>] __schedule+0x4c6/0x4cb
[<
81047574>] ? trace_hardirqs_off_caller+0xd7/0x108
[<
810475b0>] ? trace_hardirqs_off+0xb/0xd
[<
81056346>] ? rcu_irq_exit+0x64/0x77
Fix the problem by using printk_deferred() which does not call into the
scheduler.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Stephen Boyd [Thu, 24 Jul 2014 04:03:50 +0000 (21:03 -0700)]
sched_clock: Avoid corrupting hrtimer tree during suspend
During suspend we call sched_clock_poll() to update the epoch and
accumulated time and reprogram the sched_clock_timer to fire
before the next wrap-around time. Unfortunately,
sched_clock_poll() doesn't restart the timer, instead it relies
on the hrtimer layer to do that and during suspend we aren't
calling that function from the hrtimer layer. Instead, we're
reprogramming the expires time while the hrtimer is enqueued,
which can cause the hrtimer tree to be corrupted. Furthermore, we
restart the timer during suspend but we update the epoch during
resume which seems counter-intuitive.
Let's fix this by saving the accumulated state and canceling the
timer during suspend. On resume we can update the epoch and
restart the timer similar to what we would do if we were starting
the clock for the first time.
Fixes: a08ca5d1089d "sched_clock: Use an hrtimer instead of timer"
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/1406174630-23458-1-git-send-email-john.stultz@linaro.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Thu, 24 Jul 2014 00:55:11 +0000 (17:55 -0700)]
Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfix from Bruce Fields:
"Another regression from the xdr encoding rewrite"
* 'for-3.16' of git://linux-nfs.org/~bfields/linux:
NFSD: Fix crash encoding lock reply on 32-bit
Linus Torvalds [Thu, 24 Jul 2014 00:47:36 +0000 (17:47 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Fix arm64 regression introduced by limiting the CMA buffer to ZONE_DMA
on platforms where RAM starts above 4GB (and ZONE_DMA becoming 0)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Create non-empty ZONE_DMA when DRAM starts above 4GB
Linus Torvalds [Thu, 24 Jul 2014 00:46:46 +0000 (17:46 -0700)]
Merge tag 'xtensa-next-
20140721' of git://github.com/czankel/xtensa-linux
Pull Xtensa fixes from Chris Zankel:
- resolve FIXMEs in double exception handler for window overflow. This
fix makes native building of linux on xtensa host possible;
- fix sysmem region removal issue introduced in 3.15.
* tag 'xtensa-next-
20140721' of git://github.com/czankel/xtensa-linux:
xtensa: fix sysmem reservation at the end of existing block
xtensa: add fixup for double exception raised in window overflow
Linus Torvalds [Thu, 24 Jul 2014 00:42:37 +0000 (17:42 -0700)]
Merge tag 'pinctrl-v3.16-3' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Here are three pin control fixes for the v3.16 series. Sorry that
some of these arrive late, the summer heat in Sweden makes me slow.
- an IRQ handling fix for the STi driver, also for stable
- another IRQ fix for the RCAR GPIO driver
- a MAINTAINERS entry"
* tag 'pinctrl-v3.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
gpio: rcar: Add support for DT IRQ flags
MAINTAINERS: Add entry for the Renesas pin controller driver
pinctrl: st: Fix irqmux handler
Linus Torvalds [Thu, 24 Jul 2014 00:39:28 +0000 (17:39 -0700)]
Merge branch 'for-3.16-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata regression fix from Tejun Heo:
"The last libata/for-3.16-fixes pull contained a regression introduced
by
1871ee134b73 ("libata: support the ata host which implements a
queue depth less than 32") which in turn was a fix for a regression
introduced earlier while changing queue tag order to accomodate hard
drives which perform poorly if tags are not allocated in circular
order (ugh...).
The regression happens only for SAS controllers making use of libata
to serve ATA devices. They don't fill an ata_host field which is used
by the new tag allocation function leading to NULL dereference.
This patch adds a new intermediate field ata_host->n_tags which is
initialized for both SAS and !SAS cases to fix the issue"
* 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: introduce ata_host->n_tags to avoid oops on SAS controllers
Linus Torvalds [Wed, 23 Jul 2014 22:42:53 +0000 (15:42 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input layer fixes from Dmitry Torokhov:
"A few fixups for the input subsystem"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: document INPUT_PROP_TOPBUTTONPAD
Input: fix defuzzing logic
Input: sirfsoc-onkey - fix GPL v2 license string typo
Input: st-keyscan - fix 'defined but not used' compiler warnings
Input: synaptics - add min/max quirk for pnp-id LEN2002 (Edge E531)
Input: i8042 - add Acer Aspire 5710 to nomux blacklist
Input: ti_am335x_tsc - warn about incorrect spelling
Input: wacom - cleanup multitouch code when touch_max is 2
Linus Torvalds [Wed, 23 Jul 2014 22:34:13 +0000 (15:34 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here is a handful of powerpc fixes for 3.16. They are all pretty
simple and self contained and should still make this release"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc: use _GLOBAL_TOC for memmove
powerpc/pseries: dynamically added OF nodes need to call of_node_init
powerpc: subpage_protect: Increase the array size to take care of 64TB
powerpc: Fix bugs in emulate_step()
powerpc: Disable doorbells on Power8 DD1.x
Linus Torvalds [Wed, 23 Jul 2014 22:14:46 +0000 (15:14 -0700)]
Merge tag 'urgent-slab-fix' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull slab fix from Mike Snitzer:
"This fixes the broken duplicate slab name check in
kmem_cache_sanity_check() that has been repeatedly reported (as
recently as today against Fedora rawhide).
Pekka seemed to have it staged for a late 3.15-rc in his 'slab/urgent'
branch but never sent a pull request, see:
https://lkml.org/lkml/2014/5/23/648"
* tag 'urgent-slab-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
slab_common: fix the check for duplicate slab names
Linus Torvalds [Wed, 23 Jul 2014 22:11:11 +0000 (15:11 -0700)]
Merge branch 'akpm' (patches from Andrew Morton)
Merge fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm: hugetlb: fix copy_hugetlb_page_range()
simple_xattr: permit 0-size extended attributes
mm/fs: fix pessimization in hole-punching pagecache
shmem: fix splicing from a hole while it's punched
shmem: fix faulting into a hole, not taking i_mutex
mm: do not call do_fault_around for non-linear fault
sh: also try passing -m4-nofpu for SH2A builds
zram: avoid lockdep splat by revalidate_disk
mm/rmap.c: fix pgoff calculation to handle hugepage correctly
coredump: fix the setting of PF_DUMPCORE
Naoya Horiguchi [Wed, 23 Jul 2014 21:00:19 +0000 (14:00 -0700)]
mm: hugetlb: fix copy_hugetlb_page_range()
Commit
4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
migration/hwpoisoned entry") changed the order of
huge_ptep_set_wrprotect() and huge_ptep_get(), which leads to breakage
in some workloads like hugepage-backed heap allocation via libhugetlbfs.
This patch fixes it.
The test program for the problem is shown below:
$ cat heap.c
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#define HPS 0x200000
int main() {
int i;
char *p = malloc(HPS);
memset(p, '1', HPS);
for (i = 0; i < 5; i++) {
if (!fork()) {
memset(p, '2', HPS);
p = malloc(HPS);
memset(p, '3', HPS);
free(p);
return 0;
}
}
sleep(1);
free(p);
return 0;
}
$ export HUGETLB_MORECORE=yes ; export HUGETLB_NO_PREFAULT= ; hugectl --heap ./heap
Fixes
4a705fef9862 ("hugetlb: fix copy_hugetlb_page_range() to handle
migration/hwpoisoned entry"), so is applicable to -stable kernels which
include it.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Suggested-by: Guillaume Morin <guillaume@morinfr.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org> [2.6.37+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 23 Jul 2014 21:00:17 +0000 (14:00 -0700)]
simple_xattr: permit 0-size extended attributes
If a filesystem uses simple_xattr to support user extended attributes,
LTP setxattr01 and xfstests generic/062 fail with "Cannot allocate
memory": simple_xattr_alloc()'s wrap-around test mistakenly excludes
values of zero size. Fix that off-by-one (but apparently no filesystem
needs them yet).
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 23 Jul 2014 21:00:15 +0000 (14:00 -0700)]
mm/fs: fix pessimization in hole-punching pagecache
I wanted to revert my v3.1 commit
d0823576bf4b ("mm: pincer in
truncate_inode_pages_range"), to keep truncate_inode_pages_range() in
synch with shmem_undo_range(); but have stepped back - a change to
hole-punching in truncate_inode_pages_range() is a change to
hole-punching in every filesystem (except tmpfs) that supports it.
If there's a logical proof why no filesystem can depend for its own
correctness on the pincer guarantee in truncate_inode_pages_range() - an
instant when the entire hole is removed from pagecache - then let's
revisit later. But the evidence is that only tmpfs suffered from the
livelock, and we have no intention of extending hole-punch to ramfs. So
for now just add a few comments (to match or differ from those in
shmem_undo_range()), and fix one silliness noticed in
d0823576bf4b...
Its "index == start" addition to the hole-punch termination test was
incomplete: it opened a way for the end condition to be missed, and the
loop go on looking through the radix_tree, all the way to end of file.
Fix that pessimization by resetting index when detected in inner loop.
Note that it's actually hard to hit this case, without the obsessive
concurrent faulting that trinity does: normally all pages are removed in
the initial trylock_page() pass, and this loop finds nothing to do. I
had to "#if 0" out the initial pass to reproduce bug and test fix.
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 23 Jul 2014 21:00:13 +0000 (14:00 -0700)]
shmem: fix splicing from a hole while it's punched
shmem_fault() is the actual culprit in trinity's hole-punch starvation,
and the most significant cause of such problems: since a page faulted is
one that then appears page_mapped(), needing unmap_mapping_range() and
i_mmap_mutex to be unmapped again.
But it is not the only way in which a page can be brought into a hole in
the radix_tree while that hole is being punched; and Vlastimil's testing
implies that if enough other processors are busy filling in the hole,
then shmem_undo_range() can be kept from completing indefinitely.
shmem_file_splice_read() is the main other user of SGP_CACHE, which can
instantiate shmem pagecache pages in the read-only case (without holding
i_mutex, so perhaps concurrently with a hole-punch). Probably it's
silly not to use SGP_READ already (using the ZERO_PAGE for holes): which
ought to be safe, but might bring surprises - not a change to be rushed.
shmem_read_mapping_page_gfp() is an internal interface used by
drivers/gpu/drm GEM (and next by uprobes): it should be okay. And
shmem_file_read_iter() uses the SGP_DIRTY variant of SGP_CACHE, when
called internally by the kernel (perhaps for a stacking filesystem,
which might rely on holes to be reserved): it's unclear whether it could
be provoked to keep hole-punch busy or not.
We could apply the same umbrella as now used in shmem_fault() to
shmem_file_splice_read() and the others; but it looks ugly, and use over
a range raises questions - should it actually be per page? can these get
starved themselves?
The origin of this part of the problem is my v3.1 commit
d0823576bf4b
("mm: pincer in truncate_inode_pages_range"), once it was duplicated
into shmem.c. It seemed like a nice idea at the time, to ensure
(barring RCU lookup fuzziness) that there's an instant when the entire
hole is empty; but the indefinitely repeated scans to ensure that make
it vulnerable.
Revert that "enhancement" to hole-punch from shmem_undo_range(), but
retain the unproblematic rescanning when it's truncating; add a couple
of comments there.
Remove the "indices[0] >= end" test: that is now handled satisfactorily
by the inner loop, and mem_cgroup_uncharge_start()/end() are too light
to be worth avoiding here.
But if we do not always loop indefinitely, we do need to handle the case
of swap swizzled back to page before shmem_free_swap() gets it: add a
retry for that case, as suggested by Konstantin Khlebnikov; and for the
case of page swizzled back to swap, as suggested by Johannes Weiner.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <stable@vger.kernel.org> [3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Wed, 23 Jul 2014 21:00:10 +0000 (14:00 -0700)]
shmem: fix faulting into a hole, not taking i_mutex
Commit
f00cdc6df7d7 ("shmem: fix faulting into a hole while it's
punched") was buggy: Sasha sent a lockdep report to remind us that
grabbing i_mutex in the fault path is a no-no (write syscall may already
hold i_mutex while faulting user buffer).
We tried a completely different approach (see following patch) but that
proved inadequate: good enough for a rational workload, but not good
enough against trinity - which forks off so many mappings of the object
that contention on i_mmap_mutex while hole-puncher holds i_mutex builds
into serious starvation when concurrent faults force the puncher to fall
back to single-page unmap_mapping_range() searches of the i_mmap tree.
So return to the original umbrella approach, but keep away from i_mutex
this time. We really don't want to bloat every shmem inode with a new
mutex or completion, just to protect this unlikely case from trinity.
So extend the original with wait_queue_head on stack at the hole-punch
end, and wait_queue item on the stack at the fault end.
This involves further use of i_lock to guard against the races: lockdep
has been happy so far, and I see fs/inode.c:unlock_new_inode() holds
i_lock around wake_up_bit(), which is comparable to what we do here.
i_lock is more convenient, but we could switch to shmem's info->lock.
This issue has been tagged with CVE-2014-4171, which will require commit
f00cdc6df7d7 and this and the following patch to be backported: we
suggest to 3.1+, though in fact the trinity forkbomb effect might go
back as far as 2.6.16, when madvise(,,MADV_REMOVE) came in - or might
not, since much has changed, with i_mmap_mutex a spinlock before 3.0.
Anyone running trinity on 3.0 and earlier? I don't think we need care.
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <stable@vger.kernel.org> [3.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Wed, 23 Jul 2014 21:00:08 +0000 (14:00 -0700)]
mm: do not call do_fault_around for non-linear fault
Ingo Korb reported that "repeated mapping of the same file on tmpfs
using remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when
the process exits".
He bisected the bug to
d7c1755179b8 ("mm: implement ->map_pages for
shmem/tmpfs"), although the bug was actually added by commit
8c6e50b0290c ("mm: introduce vm_ops->map_pages()").
The problem is caused by calling do_fault_around for a _non-linear_
fault. In this case pgoff is shifted and might become negative during
calculation.
Faulting around non-linear page-fault makes no sense and breaks the
logic in do_fault_around because pgoff is shifted.
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Ingo Korb <ingo.korb@tu-dortmund.de>
Tested-by: Ingo Korb <ingo.korb@tu-dortmund.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Ning Qu <quning@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [3.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Wed, 23 Jul 2014 21:00:06 +0000 (14:00 -0700)]
sh: also try passing -m4-nofpu for SH2A builds
When compiling a SH2A kernel (e.g. se7206_defconfig or rsk7203_defconfig)
using sh4-linux-gcc, linking fails with:
net/built-in.o: In function `__sk_run_filter':
net/core/filter.c:566: undefined reference to `__fpscr_values'
net/core/filter.c:269: undefined reference to `__fpscr_values'
...
net/built-in.o:net/core/filter.c:580: more undefined references to `__fpscr_values' follow
This happens because sh4-linux-gcc doesn't support the "-m2a-nofpu",
which is thus filtered out by "$(call cc-option, ...)".
As compiling using sh4-linux-gcc is useful for compile coverage, also
try passing "-m4-nofpu" (which is presumably filtered out when using a
real sh2a-linux toolchain) to disable the generation of FPU instructions
and references to __fpscr_values[].
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Tony Breeds <tony@bakeyournoodle.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Wed, 23 Jul 2014 21:00:04 +0000 (14:00 -0700)]
zram: avoid lockdep splat by revalidate_disk
Sasha reported lockdep warning [1] introduced by [2].
It could be fixed by doing disk revalidation out of the init_lock. It's
okay because disk capacity change is protected by init_lock so that
revalidate_disk always sees up-to-date value so there is no race.
[1] https://lkml.org/lkml/2014/7/3/735
[2] zram: revalidate disk after capacity change
Fixes
2e32baea46ce ("zram: revalidate disk after capacity change").
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: "Alexander E. Patrakov" <patrakov@gmail.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Wed, 23 Jul 2014 21:00:01 +0000 (14:00 -0700)]
mm/rmap.c: fix pgoff calculation to handle hugepage correctly
I triggered VM_BUG_ON() in vma_address() when I tried to migrate an
anonymous hugepage with mbind() in the kernel v3.16-rc3. This is
because pgoff's calculation in rmap_walk_anon() fails to consider
compound_order() only to have an incorrect value.
This patch introduces page_to_pgoff(), which gets the page's offset in
PAGE_CACHE_SIZE.
Kirill pointed out that page cache tree should natively handle
hugepages, and in order to make hugetlbfs fit it, page->index of
hugetlbfs page should be in PAGE_CACHE_SIZE. This is beyond this patch,
but page_to_pgoff() contains the point to be fixed in a single function.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Silesh C V [Wed, 23 Jul 2014 20:59:59 +0000 (13:59 -0700)]
coredump: fix the setting of PF_DUMPCORE
Commit
079148b919d0 ("coredump: factor out the setting of PF_DUMPCORE")
cleaned up the setting of PF_DUMPCORE by removing it from all the
linux_binfmt->core_dump() and moving it to zap_threads().But this ended
up clearing all the previously set flags. This causes issues during
core generation when tsk->flags is checked again (eg. for PF_USED_MATH
to dump floating point registers). Fix this.
Signed-off-by: Silesh C V <svellattu@mvista.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kinglong Mee [Mon, 7 Jul 2014 14:10:56 +0000 (22:10 +0800)]
NFSD: Fix crash encoding lock reply on 32-bit
Commit
8c7424cff6 "nfsd4: don't try to encode conflicting owner if low
on space" forgot to free conf->data in nfsd4_encode_lockt and before
sign conf->data to NULL in nfsd4_encode_lock_denied, causing a leak.
Worse, kfree() can be called on an uninitialized pointer in the case of
a succesful lock (or one that fails for a reason other than a conflict).
(Note that lock->lk_denied.ld_owner.data appears it should be zero here,
until you notice that it's one arm of a union the other arm of which is
written to in the succesful case by the
memcpy(&lock->lk_resp_stateid, &lock_stp->st_stid.sc_stateid,
sizeof(stateid_t));
in nfsd4_lock(). In the 32-bit case this overwrites ld_owner.data.)
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Fixes: 8c7424cff6 ""nfsd4: don't try to encode conflicting owner if low on space"
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Tejun Heo [Wed, 23 Jul 2014 13:05:27 +0000 (09:05 -0400)]
libata: introduce ata_host->n_tags to avoid oops on SAS controllers
1871ee134b73 ("libata: support the ata host which implements a queue
depth less than 32") directly used ata_port->scsi_host->can_queue from
ata_qc_new() to determine the number of tags supported by the host;
unfortunately, SAS controllers doing SATA don't initialize ->scsi_host
leading to the following oops.
BUG: unable to handle kernel NULL pointer dereference at
0000000000000058
IP: [<
ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
PGD 0
Oops: 0002 [#1] SMP
Modules linked in: isci libsas scsi_transport_sas mgag200 drm_kms_helper ttm
CPU: 1 PID: 518 Comm: udevd Not tainted 3.16.0-rc6+ #62
Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.02.0002.
122320131210 12/23/2013
task:
ffff880c1a00b280 ti:
ffff88061a000000 task.ti:
ffff88061a000000
RIP: 0010:[<
ffffffff814e0618>] [<
ffffffff814e0618>] ata_qc_new_init+0x188/0x1b0
RSP: 0018:
ffff88061a003ae8 EFLAGS:
00010012
RAX:
0000000000000001 RBX:
ffff88000241ca80 RCX:
00000000000000fa
RDX:
0000000000000020 RSI:
0000000000000020 RDI:
ffff8806194aa298
RBP:
ffff88061a003ae8 R08:
ffff8806194a8000 R09:
0000000000000000
R10:
0000000000000000 R11:
ffff88000241ca80 R12:
ffff88061ad58200
R13:
ffff8806194aa298 R14:
ffffffff814e67a0 R15:
ffff8806194a8000
FS:
00007f3ad7fe3840(0000) GS:
ffff880627620000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000058 CR3:
000000061a118000 CR4:
00000000001407e0
Stack:
ffff88061a003b20 ffffffff814e96e1 ffff88000241ca80 ffff88061ad58200
ffff8800b6bf6000 ffff880c1c988000 ffff880619903850 ffff88061a003b68
ffffffffa0056ce1 ffff88061a003b48 0000000013d6e6f8 ffff88000241ca80
Call Trace:
[<
ffffffff814e96e1>] ata_sas_queuecmd+0xa1/0x430
[<
ffffffffa0056ce1>] sas_queuecommand+0x191/0x220 [libsas]
[<
ffffffff8149afee>] scsi_dispatch_cmd+0x10e/0x300
[<
ffffffff814a3bc5>] scsi_request_fn+0x2f5/0x550
[<
ffffffff81317613>] __blk_run_queue+0x33/0x40
[<
ffffffff8131781a>] queue_unplugged+0x2a/0x90
[<
ffffffff8131ceb4>] blk_flush_plug_list+0x1b4/0x210
[<
ffffffff8131d274>] blk_finish_plug+0x14/0x50
[<
ffffffff8117eaa8>] __do_page_cache_readahead+0x198/0x1f0
[<
ffffffff8117ee21>] force_page_cache_readahead+0x31/0x50
[<
ffffffff8117ee7e>] page_cache_sync_readahead+0x3e/0x50
[<
ffffffff81172ac6>] generic_file_read_iter+0x496/0x5a0
[<
ffffffff81219897>] blkdev_read_iter+0x37/0x40
[<
ffffffff811e307e>] new_sync_read+0x7e/0xb0
[<
ffffffff811e3734>] vfs_read+0x94/0x170
[<
ffffffff811e43c6>] SyS_read+0x46/0xb0
[<
ffffffff811e33d1>] ? SyS_lseek+0x91/0xb0
[<
ffffffff8171ee29>] system_call_fastpath+0x16/0x1b
Code: 00 00 00 88 50 29 83 7f 08 01 19 d2 83 e2 f0 83 ea 50 88 50 34 c6 81 1d 02 00 00 40 c6 81 17 02 00 00 00 5d c3 66 0f 1f 44 00 00 <89> 14 25 58 00 00 00
Fix it by introducing ata_host->n_tags which is initialized to
ATA_MAX_QUEUE - 1 in ata_host_init() for SAS controllers and set to
scsi_host_template->can_queue in ata_host_register() for !SAS ones.
As SAS hosts are never registered, this will give them the same
ATA_MAX_QUEUE - 1 as before. Note that we can't use
scsi_host->can_queue directly for SAS hosts anyway as they can go
higher than the libata maximum.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Reported-by: Jesse Brandeburg <jesse.brandeburg@gmail.com>
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Fixes: 1871ee134b73 ("libata: support the ata host which implements a queue depth less than 32")
Cc: Kevin Hao <haokexin@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org
Catalin Marinas [Fri, 18 Jul 2014 10:54:37 +0000 (11:54 +0100)]
arm64: Create non-empty ZONE_DMA when DRAM starts above 4GB
ZONE_DMA is created to allow 32-bit only devices to access memory in the
absence of an IOMMU. On systems where the memory starts above 4GB, it is
expected that some devices have a DMA offset hardwired to be able to
access the bottom of the memory. Linux currently supports DT bindings
for the DMA offsets but they are not (easily) available early during
boot.
This patch tries to guess a DMA offset and assumes that ZONE_DMA
corresponds to the 32-bit mask above the start of DRAM.
Fixes: 2d5a5612bc (arm64: Limit the CMA buffer to 32-bit if ZONE_DMA)
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Salter <msalter@redhat.com>
Tested-by: Anup Patel <anup.patel@linaro.org>
Peter Hutterer [Tue, 22 Jul 2014 00:51:35 +0000 (17:51 -0700)]
Input: document INPUT_PROP_TOPBUTTONPAD
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Mike Snitzer [Tue, 22 Jul 2014 22:38:27 +0000 (18:38 -0400)]
Merge branch 'slab/urgent' of git://git./linux/kernel/git/penberg/linux into for-3.16-rcX
Li Zhong [Mon, 21 Jul 2014 09:55:13 +0000 (17:55 +0800)]
powerpc: use _GLOBAL_TOC for memmove
memmove may be called from module code copy_pages(btrfs), and it may
call memcpy, which may call back to C code, so it needs to use
_GLOBAL_TOC to set up r2 correctly.
This fixes following error when I tried to boot an le guest:
Vector: 300 (Data Access) at [
c000000073f97210]
pc:
c000000000015004: enable_kernel_altivec+0x24/0x80
lr:
c000000000058fbc: enter_vmx_copy+0x3c/0x60
sp:
c000000073f97490
msr:
8000000002009033
dar:
d000000001d50170
dsisr:
40000000
current = 0xc0000000734c0000
paca = 0xc00000000fff0000 softe: 0 irq_happened: 0x01
pid = 815, comm = mktemp
enter ? for help
[
c000000073f974f0]
c000000000058fbc enter_vmx_copy+0x3c/0x60
[
c000000073f97510]
c000000000057d34 memcpy_power7+0x274/0x840
[
c000000073f97610]
d000000001c3179c copy_pages+0xfc/0x110 [btrfs]
[
c000000073f97660]
d000000001c3c248 memcpy_extent_buffer+0xe8/0x160 [btrfs]
[
c000000073f97700]
d000000001be4be8 setup_items_for_insert+0x208/0x4a0 [btrfs]
[
c000000073f97820]
d000000001be50b4 btrfs_insert_empty_items+0xf4/0x140 [btrfs]
[
c000000073f97890]
d000000001bfed30 insert_with_overflow+0x70/0x180 [btrfs]
[
c000000073f97900]
d000000001bff174 btrfs_insert_dir_item+0x114/0x2f0 [btrfs]
[
c000000073f979a0]
d000000001c1f92c btrfs_add_link+0x10c/0x370 [btrfs]
[
c000000073f97a40]
d000000001c20e94 btrfs_create+0x204/0x270 [btrfs]
[
c000000073f97b00]
c00000000026d438 vfs_create+0x178/0x210
[
c000000073f97b50]
c000000000270a70 do_last+0x9f0/0xe90
[
c000000073f97c20]
c000000000271010 path_openat+0x100/0x810
[
c000000073f97ce0]
c000000000272ea8 do_filp_open+0x58/0xd0
[
c000000073f97dc0]
c00000000025ade8 do_sys_open+0x1b8/0x300
[
c000000073f97e30]
c00000000000a008 syscall_exit+0x0/0x7c
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tyrel Datwyler [Thu, 10 Jul 2014 18:50:57 +0000 (14:50 -0400)]
powerpc/pseries: dynamically added OF nodes need to call of_node_init
Commit
75b57ecf9 refactored device tree nodes to use kobjects such that they
can be exposed via /sysfs. A secondary commit
0829f6d1f furthered this rework
by moving the kobect initialization logic out of of_node_add into its own
of_node_init function. The inital commit removed the existing kref_init calls
in the pseries dlpar code with the assumption kobject initialization would
occur in of_node_add. The second commit had the side effect of triggering a
BUG_ON during DLPAR, migration and suspend/resume operations as a result of
dynamically added nodes being uninitialized.
This patch fixes this by adding of_node_init calls in place of the previously
removed kref_init calls.
Fixes: 0829f6d1f69e ("of: device_node kobject lifecycle fixes")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aneesh Kumar K.V [Tue, 15 Jul 2014 14:52:30 +0000 (20:22 +0530)]
powerpc: subpage_protect: Increase the array size to take care of 64TB
We now support TASK_SIZE of 16TB, hence the array should be 8.
Fixes the below crash:
Unable to handle kernel paging request for data at address 0x000100bd
Faulting instruction address: 0xc00000000004f914
cpu 0x13: Vector: 300 (Data Access) at [
c000000fea75fa90]
pc:
c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0
lr:
c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0
sp:
c000000fea75fd10
msr:
9000000000009032
dar: 100bd
dsisr:
40000000
current = 0xc000000fea6ae490
paca = 0xc00000000fb8ab00 softe: 0 irq_happened: 0x00
pid = 8237, comm = a.out
enter ? for help
[
c000000fea75fe30]
c00000000000a164 syscall_exit+0x0/0x98
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Paul Mackerras [Sat, 19 Jul 2014 07:47:57 +0000 (17:47 +1000)]
powerpc: Fix bugs in emulate_step()
This fixes some bugs in emulate_step(). First, the setting of the carry
bit for the arithmetic right-shift instructions was not correct on 64-bit
machines because we were masking with a mask of type int rather than
unsigned long. Secondly, the sld (shift left doubleword) instruction was
using the wrong instruction field for the register containing the shift
count.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Joel Stanley [Fri, 18 Jul 2014 02:11:37 +0000 (11:41 +0930)]
powerpc: Disable doorbells on Power8 DD1.x
These processors do not currently support doorbell IPIs, so remove them
from the feature list if we are at DD 1.xx for the 0x004d part.
This fixes a regression caused by
d4e58e5928f8 (powerpc/powernv: Enable
POWER8 doorbell IPIs). With that patch the kernel would hang at boot
when calling smp_call_function_many, as the doorbell would not be
received by the target CPUs:
.smp_call_function_many+0x2bc/0x3c0 (unreliable)
.on_each_cpu_mask+0x30/0x100
.cpuidle_register_driver+0x158/0x1a0
.cpuidle_register+0x2c/0x110
.powernv_processor_idle_init+0x23c/0x2c0
.do_one_initcall+0xd4/0x260
.kernel_init_freeable+0x25c/0x33c
.kernel_init+0x1c/0x120
.ret_from_kernel_thread+0x58/0x7c
Fixes: d4e58e5928f8 (powerpc/powernv: Enable POWER8 doorbell IPIs)
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Tue, 22 Jul 2014 05:46:01 +0000 (22:46 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Null termination fix in dns_resolver got the pointer dereferncing
wrong, fix from Ben Hutchings.
2) ip_options_compile() has a benign but real buffer overflow when
parsing options. From Eric Dumazet.
3) Table updates can crash in netfilter's nftables if none of the state
flags indicate an actual change, from Pablo Neira Ayuso.
4) Fix race in nf_tables dumping, also from Pablo.
5) GRE-GRO support broke the forwarding path because the segmentation
state was not fully initialized in these paths, from Jerry Chu.
6) sunvnet driver leaks objects and potentially crashes on module
unload, from Sowmini Varadhan.
7) We can accidently generate the same handle for several u32
classifier filters, fix from Cong Wang.
8) Several edge case bug fixes in fragment handling in xen-netback,
from Zoltan Kiss.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
ipv4: fix buffer overflow in ip_options_compile()
batman-adv: fix TT VLAN inconsistency on VLAN re-add
batman-adv: drop QinQ claim frames in bridge loop avoidance
dns_resolver: Null-terminate the right string
xen-netback: Fix pointer incrementation to avoid incorrect logging
xen-netback: Fix releasing header slot on error path
xen-netback: Fix releasing frag_list skbs in error path
xen-netback: Fix handling frag_list on grant op error path
net_sched: avoid generating same handle for u32 filters
net: huawei_cdc_ncm: add "subclass 3" devices
net: qmi_wwan: add two Sierra Wireless/Netgear devices
wan/x25_asy: integer overflow in x25_asy_change_mtu()
net: ppp: fix creating PPP pass and active filters
net/mlx4_en: cq->irq_desc wasn't set in legacy EQ's
sunvnet: clean up objects created in vnet_new() on vnet_exit()
r8169: Enable RX_MULTI_EN for RTL_GIGA_MAC_VER_40
net-gre-gro: Fix a bug that breaks the forwarding path
netfilter: nf_tables: 64bit stats need some extra synchronization
netfilter: nf_tables: set NLM_F_DUMP_INTR if netlink dumping is stale
netfilter: nf_tables: safe RCU iteration on list when dumping
...
Linus Torvalds [Tue, 22 Jul 2014 05:45:28 +0000 (22:45 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fix from David Miller:
"Need to hook up the new renameat2 system call"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Hook up renameat2 syscall.
Linus Torvalds [Tue, 22 Jul 2014 05:44:24 +0000 (22:44 -0700)]
Merge git://git./linux/kernel/git/davem/ide
Pull IDE fixes from David Miller:
- fix interrupt registry for some Atari IDE chipsets.
- adjust Kconfig dependencies for x86_32 specific chips.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: Fix SC1200 dependencies
ide: Fix CS5520 and CS5530 dependencies
m68k/atari - ide: do not register interrupt if host->get_lock is set
Linus Torvalds [Tue, 22 Jul 2014 05:43:15 +0000 (22:43 -0700)]
Merge tag 'trace-fixes-v3.16-rc6' of git://git./linux/kernel/git/rostedt/linux-trace
Pull trace fix from Steven Rostedt:
"Tony Luck found that using the "uptime" trace clock that uses jiffies
as a counter was converted to nanoseconds (silly), and after 1 hour 11
minutes and 34 seconds, this monotonic clock would wrap, causing havoc
with the tracing system and making the clock useless.
He converted that clock to use jiffies_64 and made it into a counter
instead of nanosecond conversions, and displayed the clock with the
straight jiffy count, which works much better than it did in the past"
* tag 'trace-fixes-v3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix wraparound problems in "uptime" trace clock
David S. Miller [Tue, 22 Jul 2014 05:27:56 +0000 (22:27 -0700)]
sparc: Hook up renameat2 syscall.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 22 Jul 2014 03:19:09 +0000 (20:19 -0700)]
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
Antonio Quartulli says:
====================
pull request [net]: batman-adv
20140721
here you have two fixes that we have been testing for quite some time
(this is why they arrived a bit late in the rc cycle).
Patch 1) ensures that BLA packets get dropped and not forwarded to the
mesh even if they reach batman-adv within QinQ frames. Forwarding them
into the mesh means messing up with the TT database of other nodes which
can generate all kind of unexpected behaviours during route computation.
Patch 2) avoids a couple of race conditions triggered upon fast VLAN
deletion-addition. Such race conditions are pretty dangerous because
they not only create inconsistencies in the TT database of the nodes
in the network, but such scenario is also unrecoverable (unless
nodes are rebooted).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 21 Jul 2014 05:17:42 +0000 (07:17 +0200)]
ipv4: fix buffer overflow in ip_options_compile()
There is a benign buffer overflow in ip_options_compile spotted by
AddressSanitizer[1] :
Its benign because we always can access one extra byte in skb->head
(because header is followed by struct skb_shared_info), and in this case
this byte is not even used.
[28504.910798] ==================================================================
[28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
[28504.913170] Read of size 1 by thread T15843:
[28504.914026] [<
ffffffff81802f91>] ip_options_compile+0x121/0x9c0
[28504.915394] [<
ffffffff81804a0d>] ip_options_get_from_user+0xad/0x120
[28504.916843] [<
ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
[28504.918175] [<
ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
[28504.919490] [<
ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
[28504.920835] [<
ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
[28504.922208] [<
ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
[28504.923459] [<
ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
[28504.924722]
[28504.925106] Allocated by thread T15843:
[28504.925815] [<
ffffffff81804995>] ip_options_get_from_user+0x35/0x120
[28504.926884] [<
ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
[28504.927975] [<
ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
[28504.929175] [<
ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
[28504.930400] [<
ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
[28504.931677] [<
ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
[28504.932851] [<
ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
[28504.934018]
[28504.934377] The buggy address
ffff880026382828 is located 0 bytes to the right
[28504.934377] of 40-byte region [
ffff880026382800,
ffff880026382828)
[28504.937144]
[28504.937474] Memory state around the buggy address:
[28504.938430]
ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
[28504.939884]
ffff880026382400:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28504.941294]
ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
[28504.942504]
ffff880026382600:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28504.943483]
ffff880026382700:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28504.944511] >
ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
[28504.945573] ^
[28504.946277]
ffff880026382900:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.094949]
ffff880026382a00:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.096114]
ffff880026382b00:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.097116]
ffff880026382c00:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.098472]
ffff880026382d00:
ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
[28505.099804] Legend:
[28505.100269] f - 8 freed bytes
[28505.100884] r - 8 redzone bytes
[28505.101649] . - 8 allocated bytes
[28505.102406] x=1..7 - x allocated bytes + (8-x) redzone bytes
[28505.103637] ==================================================================
[1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 21 Jul 2014 18:44:34 +0000 (11:44 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
"A series of driver fixes:
- fix DVB-S tuning with tda1071
- fix tuner probe on af9035 when the device has a bad eeprom
- some fixes for the new si2168/2157 drivers
- one Kconfig build fix (for omap4iss)
- fixes at vpif error path
- don't lock saa7134 ioctl at driver's base core level, as it now
uses V4L2 and VB2 locking schema
- fix audio at hdpvr driver
- fix the aspect ratio at the digital timings table
- one new USB ID (at gspca_pac7302): Genius i-Look 317 webcam"
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] gspca_pac7302: Add new usb-id for Genius i-Look 317
[media] tda10071: fix returned symbol rate calculation
[media] tda10071: fix spec inversion reporting
[media] tda10071: add missing DVB-S2/PSK-8 FEC AUTO
[media] tda10071: force modulation to QPSK on DVB-S
[media] hdpvr: fix two audio bugs
[media] davinci: vpif: missing unlocks on error
[media] af9035: override tuner id when bad value set into eeprom
[media] saa7134: use unlocked_ioctl instead of ioctl
[media] media: v4l2-core: v4l2-dv-timings.c: Cleaning up code wrong value used in aspect ratio
[media] si2168: firmware download fix
[media] si2157: add one missing parenthesis
[media] si2168: add one missing parenthesis
[media] staging: tighten omap4iss dependencies
Linus Torvalds [Mon, 21 Jul 2014 18:31:17 +0000 (11:31 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Final block fixes for 3.16
Four small fixes that should go into 3.16, have been queued up for a
bit and delayed due to vacation and other euro duties. But here they
are. The pull request contains:
- Fix for a reported crash with shared tagging on SCSI from Christoph
- A regression fix for drbd. From Lars Ellenberg.
- Hooking up the compat ioctl for BLKZEROOUT, which requires no
translation. From Mikulas.
- A fix for a regression where we woud crash on queue exit if the
root_blkg is gone/not there. From Tejun"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: provide compat ioctl for BLKZEROOUT
blkcg: don't call into policy draining if root_blkg is already gone
drbd: fix regression 'out of mem, failed to invoke fence-peer helper'
block: don't assume last put of shared tags is for the host
Linus Torvalds [Mon, 21 Jul 2014 18:25:44 +0000 (11:25 -0700)]
Merge branch 'for-3.16-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Late libata fixes.
The most important one is from Kevin Hao which makes sure that libata
only allocates tags inside the max tag number the controller supports.
libata always had this problem but the recent tag allocation change
and addition of support for sata_fsl which only supports queue depth
of 16 exposed the issue.
Hans de Goede agreed to become the maintainer of libahci_platform
which is under higher than usual development pressure from all the new
controllers popping up from the ARM world"
* 'for-3.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
drivers/ata/pata_ep93xx.c: use signed int type for result of platform_get_irq()
libata: EH should handle AMNF error condition as a media error
libata: support the ata host which implements a queue depth less than 32
MAINTAINERS: Add Hans de Goede as ahci-platform maintainer
Linus Torvalds [Mon, 21 Jul 2014 18:19:18 +0000 (11:19 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"These are mostly PPC changes for 3.16-new things. However, there is
an x86 change too and it is a regression from 3.14. As it only
affects nested virtualization and there were other changes in this
area in 3.16, I am not nominating it for 3.15-stable"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Check for nested events if there is an injectable interrupt
KVM: PPC: RTAS: Do byte swaps explicitly
KVM: PPC: Book3S PR: Fix ABIv2 on LE
KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC()
PPC: Add _GLOBAL_TOC for 32bit
KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value
KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute
Linus Torvalds [Mon, 21 Jul 2014 18:18:31 +0000 (11:18 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"A couple of last minute bug fixes for 3.16, including a fix for ptrace
to close a hole which allowed a user space program to write to the
kernel address space"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix restore of invalid floating-point-control
s390/zcrypt: improve device probing for zcrypt adapter cards
s390/ptrace: fix PSW mask check
s390/MSI: Use standard mask and unmask funtions
s390/3270: correct size detection with the read-partition command
s390: require mvcos facility, not tod clock steering facility
Tony Luck [Fri, 18 Jul 2014 18:43:01 +0000 (11:43 -0700)]
tracing: Fix wraparound problems in "uptime" trace clock
The "uptime" trace clock added in:
commit
8aacf017b065a805d27467843490c976835eb4a5
tracing: Add "uptime" trace clock that uses jiffies
has wraparound problems when the system has been up more
than 1 hour 11 minutes and 34 seconds. It converts jiffies
to nanoseconds using:
(u64)jiffies_to_usecs(jiffy) * 1000ULL
but since jiffies_to_usecs() only returns a 32-bit value, it
truncates at 2^32 microseconds. An additional problem on 32-bit
systems is that the argument is "unsigned long", so fixing the
return value only helps until 2^32 jiffies (49.7 days on a HZ=1000
system).
Avoid these problems by using jiffies_64 as our basis, and
not converting to nanoseconds (we do convert to clock_t because
user facing API must not be dependent on internal kernel
HZ values).
Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com
Cc: stable@vger.kernel.org # 3.10+
Fixes: 8aacf017b065 "tracing: Add "uptime" trace clock that uses jiffies"
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Antonio Quartulli [Thu, 8 May 2014 15:13:15 +0000 (17:13 +0200)]
batman-adv: fix TT VLAN inconsistency on VLAN re-add
When a VLAN interface (on top of batX) is removed and
re-added within a short timeframe TT does not have enough
time to properly cleanup. This creates an internal TT state
mismatch as the newly created softif_vlan will be
initialized from scratch with a TT client count of zero
(even if TT entries for this VLAN still exist). The
resulting TT messages are bogus due to the counter / tt
client listing mismatch, thus creating inconsistencies on
every node in the network
To fix this issue destroy_vlan() has to not free the VLAN
object immediately but it has to be kept alive until all the
TT entries for this VLAN have been removed. destroy_vlan()
still removes the sysfs folder so that the user has the
feeling that everything went fine.
If the same VLAN is re-added before the old object is free'd,
then the latter is resurrected and re-used.
Implement such behaviour by increasing the reference counter
of a softif_vlan object every time a new local TT entry for
such VLAN is created and remove the object from the list
only when all the TT entries have been destroyed.
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Simon Wunderlich [Mon, 23 Jun 2014 13:55:36 +0000 (15:55 +0200)]
batman-adv: drop QinQ claim frames in bridge loop avoidance
Since bridge loop avoidance only supports untagged or simple 802.1q
tagged VLAN claim frames, claim frames with stacked VLAN headers (QinQ)
should be detected and dropped. Transporting the over the mesh may cause
problems on the receivers, or create bogus entries in the local tt
tables.
Reported-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Ben Hutchings [Sun, 20 Jul 2014 23:06:48 +0000 (00:06 +0100)]
dns_resolver: Null-terminate the right string
*_result[len] is parsed as *(_result[len]) which is not at all what we
want to touch here.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 84a7c0b1db1c ("dns_resolver: assure that dns_query() result is null-terminated")
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 21 Jul 2014 04:04:16 +0000 (21:04 -0700)]
Linux 3.16-rc6
David S. Miller [Mon, 21 Jul 2014 03:56:53 +0000 (20:56 -0700)]
Merge branch 'xen-netback'
Zoltan Kiss says:
====================
xen-netback: Fixing up xenvif_tx_check_gop
This series fixes a lot of bugs on the error path around this function, which
were introduced with my grant mapping series in 3.15. They apply to the latest
net tree, but probably to net-next as well without any modification.
I'll post an another series which applies to 3.15 stable, as the problem was
first discovered there. The only difference is that the "queue" variable name is
replaced to "vif".
====================
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss [Fri, 18 Jul 2014 18:08:05 +0000 (19:08 +0100)]
xen-netback: Fix pointer incrementation to avoid incorrect logging
Due to this pointer is increased prematurely, the error log contains rubbish.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss [Fri, 18 Jul 2014 18:08:04 +0000 (19:08 +0100)]
xen-netback: Fix releasing header slot on error path
This patch makes this function aware that the first frag and the header might
share the same ring slot. That could happen if the first slot is bigger than
PKT_PROT_LEN. Due to this the error path might release that slot twice or never,
depending on the error scenario.
xenvif_idx_release is also removed from xenvif_idx_unmap, and called separately.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss [Fri, 18 Jul 2014 18:08:03 +0000 (19:08 +0100)]
xen-netback: Fix releasing frag_list skbs in error path
When the grant operations failed, the skb is freed up eventually, and it tries
to release the frags, if there is any. For the main skb nr_frags is set to 0 to
avoid this, but on the frag_list it iterates through the frags array, and tries
to call put_page on the page pointer which contains garbage at that time.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Zoltan Kiss [Fri, 18 Jul 2014 18:08:02 +0000 (19:08 +0100)]
xen-netback: Fix handling frag_list on grant op error path
The error handling for skb's with frag_list was completely wrong, it caused
double unmap attempts to happen if the error was on the first skb. Move it to
the right place in the loop.
Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Reported-by: Armin Zentai <armin.zentai@ezit.hu>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Fri, 18 Jul 2014 00:34:53 +0000 (17:34 -0700)]
net_sched: avoid generating same handle for u32 filters
When kernel generates a handle for a u32 filter, it tries to start
from the max in the bucket. So when we have a filter with the max (fff)
handle, it will cause kernel always generates the same handle for new
filters. This can be shown by the following command:
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip pref 770 handle 800::fff u32 match ip protocol 1 0xff
tc filter add dev eth0 parent ffff: protocol ip pref 770 u32 match ip protocol 1 0xff
...
we will get some u32 filters with same handle:
# tc filter show dev eth0 parent ffff:
filter protocol ip pref 770 u32
filter protocol ip pref 770 u32 fh 800: ht divisor 1
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match
00010000/
00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match
00010000/
00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match
00010000/
00ff0000 at 8
filter protocol ip pref 770 u32 fh 800::fff order 4095 key ht 800 bkt 0
match
00010000/
00ff0000 at 8
handles should be unique. This patch fixes it by looking up a bitmap,
so that can guarantee the handle is as unique as possible. For compatibility,
we still start from 0x800.
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 21 Jul 2014 03:44:53 +0000 (20:44 -0700)]
Merge tag 'staging-3.16-rc6' of git://git./linux/kernel/git/gregkh/staging
Pull more IIO driver fixes from Greg KH:
"Here are two IIO driver fixes for 3.16-rc6 that resolve some reported
issues"
* tag 'staging-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio: mma8452: Use correct acceleration units.
iio:core: Handle error when mask type is not separate
Linus Torvalds [Mon, 21 Jul 2014 03:44:18 +0000 (20:44 -0700)]
Merge tag 'usb-3.16-rc6' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are two USB patches that resolve some reported issues, one with
an odd HUB, and one in the chipidea driver"
* tag 'usb-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: Check if port status is equal to RxDetect
usb: chipidea: udc: Disable auto ZLP generation on ep0
Linus Torvalds [Mon, 21 Jul 2014 03:43:46 +0000 (20:43 -0700)]
Merge tag 'driver-core-3.16-rc6' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix that reverts an older patch that has
been causing a number of reported problems with the platform devices.
This revert has been in linux-next for a while with no reported issues"
* tag 'driver-core-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
platform_get_irq: Revert to platform_get_resource if of_irq_get fails
Linus Torvalds [Mon, 21 Jul 2014 03:43:14 +0000 (20:43 -0700)]
Merge tag 'char-misc-3.16-rc6' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc fix from Greg KH:
"Here's a single hyper-v driver fix for a reported issue"
* tag 'char-misc-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: hv_fcopy: fix a race condition for SMP guest
Linus Torvalds [Mon, 21 Jul 2014 03:39:28 +0000 (20:39 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull intel drm fixes from Dave Airlie:
"Intel fixes came in late, but since I debugged one of them I'll send
them on,
Two reverts, a quirk and one warn regression"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
Revert "drm/i915: reverse dp link param selection, prefer fast over wide again"
drm/i915: Track the primary plane correctly when reassigning planes
drm/i915: Ignore VBT backlight presence check on HP Chromebook 14
Revert "drm/i915: Don't set the 8to6 dither flag when not scaling"
Linus Torvalds [Mon, 21 Jul 2014 03:28:04 +0000 (20:28 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/rw/uml
Pull UML fixes from Richard Weinberger:
"Four fixes, all discovered by Trinity"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: segv: Save regs only in case of a kernel mode fault
um: Fix hung task in fix_range_common()
um: Ensure that a stub page cannot get unmapped
Revert "um: Fix wait_stub_done() error handling"
Linus Torvalds [Mon, 21 Jul 2014 03:21:05 +0000 (20:21 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"We have two more fixes in my for-linus branch.
I was hoping to also include a fix for a btrfs deadlock with
compression enabled, but we're still nailing that one down"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: test for valid bdev before kobj removal in btrfs_rm_device
Btrfs: fix abnormal long waiting in fsync
Linus Torvalds [Mon, 21 Jul 2014 02:55:44 +0000 (19:55 -0700)]
Merge tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Apologies for the relative lateness of this pull request, however the
commits fix some issues with the NFS read/write code updates in
3.16-rc1 that can cause serious Oopsing when using small r/wsize. The
delay was mainly due to extra testing to make sure that the fixes
behave correctly.
Highlights include;
- Stable fix for an NFSv3 posix ACL regression
- Multiple fixes for regressions to the NFS generic read/write code:
- Fix page splitting bugs that come into play when a small
rsize/wsize read/write needs to be sent again (due to error
conditions or page redirty)
- Fix nfs_wb_page_cancel, which is called by the "invalidatepage"
method
- Fix 2 compile warnings about unused variables
- Fix a performance issue affecting unstable writes"
* tag 'nfs-for-3.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Don't reset pg_moreio in __nfs_pageio_add_request
NFS: Remove 2 unused variables
nfs: handle multiple reqs in nfs_wb_page_cancel
nfs: handle multiple reqs in nfs_page_async_flush
nfs: change find_request to find_head_request
nfs: nfs_page should take a ref on the head req
nfs: mark nfs_page reqs with flag for extra ref
nfs: only show Posix ACLs in listxattr if actually present
Dmitry Torokhov [Sat, 19 Jul 2014 23:30:31 +0000 (16:30 -0700)]
Input: fix defuzzing logic
We attempt to remove noise from coordinates reported by devices in
input_handle_abs_event(), unfortunately, unless we were dropping the
event altogether, we were ignoring the adjusted value and were passing
on the original value instead.
Cc: stable@vger.kernel.org
Reviewed-by: Andrew de los Reyes <adlr@chromium.org>
Reviewed-by: Benson Leung <bleung@chromium.org>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Richard Weinberger [Sun, 20 Jul 2014 11:39:27 +0000 (13:39 +0200)]
um: segv: Save regs only in case of a kernel mode fault
...otherwise me lose user mode regs and the resulting
stack trace is useless.
Signed-off-by: Richard Weinberger <richard@nod.at>
Richard Weinberger [Sun, 20 Jul 2014 11:16:20 +0000 (13:16 +0200)]
um: Fix hung task in fix_range_common()
If do_ops() fails we have to release current->mm->mmap_sem
otherwise the failing task will never terminate.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Richard Weinberger [Sun, 20 Jul 2014 11:09:15 +0000 (13:09 +0200)]
um: Ensure that a stub page cannot get unmapped
Trinity discovered an execution path such that a task
can unmap his stub page.
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Richard Weinberger [Sun, 20 Jul 2014 10:56:34 +0000 (12:56 +0200)]
Revert "um: Fix wait_stub_done() error handling"
This reverts commit
0974a9cadc7886f7baaa458bb0c89f5c5f9d458e.
The real for for that issue is to release current->mm->mmap_sem in
fix_range_common().
Signed-off-by: Richard Weinberger <richard@nod.at>
Eric Sandeen [Mon, 7 Jul 2014 17:34:49 +0000 (12:34 -0500)]
btrfs: test for valid bdev before kobj removal in btrfs_rm_device
commit
99994cd btrfs: dev delete should remove sysfs entry
added a btrfs_kobj_rm_device, which dereferences device->bdev...
right after we check whether device->bdev might be NULL.
I don't honestly know if it's possible to have a NULL device->bdev
here, but assuming that it is (given the test), we need to move
the kobject removal to be under that test.
(Coverity spotted this)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
Liu Bo [Thu, 17 Jul 2014 08:08:36 +0000 (16:08 +0800)]
Btrfs: fix abnormal long waiting in fsync
xfstests generic/127 detected this problem.
With commit
7fc34a62ca4434a79c68e23e70ed26111b7a4cf8, now fsync will only flush
data within the passed range. This is the cause of the above problem,
-- btrfs's fsync has a stage called 'sync log' which will wait for all the
ordered extents it've recorded to finish.
In xfstests/generic/127, with mixed operations such as truncate, fallocate,
punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will
mmap, and then msync. And I find that msync will wait for quite a long time
(about 20s in my case), thanks to ftrace, it turns out that the previous
fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the
range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants,
there can be some ordered extents created but not getting corresponding pages
flushed, then they're left in memory until we fsync which runs into the
stage 'sync log', and fsync will just wait for the system writeback thread
to flush those pages and get ordered extents finished, so the latency is
inevitable.
This adds a flush similar to btrfs_start_ordered_extent() in
btrfs_wait_logged_extents() to fix that.
Reviewed-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Linus Torvalds [Sat, 19 Jul 2014 16:27:55 +0000 (06:27 -1000)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"The locking department delivers:
- A rather large and intrusive bundle of fixes to address serious
performance regressions introduced by the new rwsem / mcs
technology. Simpler solutions have been discussed, but they would
have been ugly bandaids with more risk than doing the right thing.
- Make the rwsem spin on owner technology opt-in for architectures
and enable it only on the known to work ones.
- A few fixes to the lockdep userspace library"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER
locking/mutex: Disable optimistic spinning on some architectures
locking/rwsem: Reduce the size of struct rw_semaphore
locking/rwsem: Rename 'activity' to 'count'
locking/spinlocks/mcs: Micro-optimize osq_unlock()
locking/spinlocks/mcs: Introduce and use init macro and function for osq locks
locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead
locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node()
locking/rwsem: Allow conservative optimistic spinning when readers have lock
tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire
tools/liblockdep: Remove debug print left over from development
tools/liblockdep: Fix comparison of a boolean value with a value of 2
Linus Torvalds [Sat, 19 Jul 2014 16:26:43 +0000 (06:26 -1000)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
"Prevent a possible divide by zero in the debugging code"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix possible divide by zero in avg_atom() calculation
Linus Torvalds [Sat, 19 Jul 2014 16:26:01 +0000 (06:26 -1000)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"Three patches addressing shortcomings in the ARM gic interrupt chip
driver"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip: gic: Fix core ID calculation when topology is read from DT
irqchip: gic: Add binding probe for ARM GIC400
irqchip: gic: Add support for cortex a7 compatible string
Linus Torvalds [Sat, 19 Jul 2014 16:25:03 +0000 (06:25 -1000)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix for a long standing issue in the alarm timer subsystem,
which was noticed recently when people finally started to use alarm
timers for serious work"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
alarmtimer: Fix bug where relative alarm timers were treated as absolute
Linus Torvalds [Sat, 19 Jul 2014 16:23:27 +0000 (06:23 -1000)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RCU fixes from Thomas Gleixner:
"Two RCU patches:
- Address a serious performance regression on open/close caused by
commit
ac1bea85781e ("Make cond_resched() report RCU quiescent
states")
- Export RCU debug functions. Not a regression, but enablement to
address a serious recursion bug in the sl*b allocators in 3.17"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Reduce overhead of cond_resched() checks for RCU
rcu: Export debug_init_rcu_head() and and debug_init_rcu_head()
Linus Torvalds [Sat, 19 Jul 2014 06:49:47 +0000 (20:49 -1000)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smaller set of fixes this week, and all regression fixes:
- a handful of issues fixed on at91 with common clock conversion
- a set of fixes for Marvell mvebu (SMP, coherency, PM)
- a clock fix for i.MX6Q.
- ... and a SMP/hotplug fix for Exynos"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: EXYNOS: Fix core ID used by platsmp and hotplug code
ARM: at91/dt: add missing clocks property to pwm node in sam9x5.dtsi
ARM: at91/dt: fix usb0 clocks definition in sam9n12 dtsi
ARM: at91: at91sam9x5: correct typo error for ohci clock
ARM: clk-imx6q: parent lvds_sel input from upstream clock gates
ARM: mvebu: Fix coherency bus notifiers by using separate notifiers
ARM: mvebu: Fix the operand list in the inline asm of armada_370_xp_pmsu_idle_enter
ARM: mvebu: fix SMP boot for Armada 38x and Armada 375 Z1 in big endian
Dave Airlie [Sat, 19 Jul 2014 06:48:38 +0000 (16:48 +1000)]
Merge tag 'drm-intel-fixes-2014-07-18' of git://anongit.freedesktop.org/drm-intel
But in any case nothing really shocking in
here, 2 reverts, 1 quirk and a regression fix a WARN.
* tag 'drm-intel-fixes-2014-07-18' of git://anongit.freedesktop.org/drm-intel:
Revert "drm/i915: reverse dp link param selection, prefer fast over wide again"
drm/i915: Track the primary plane correctly when reassigning planes
drm/i915: Ignore VBT backlight presence check on HP Chromebook 14
Revert "drm/i915: Don't set the 8to6 dither flag when not scaling"
Linus Torvalds [Sat, 19 Jul 2014 06:46:55 +0000 (20:46 -1000)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
"A couple of key fixes and a few less critical ones. The main ones
are:
- add a .bss section to the PE/COFF headers when building with EFI
stub
- invoke the correct paravirt magic when building the espfix page
tables
Unfortunately both of these areas also have at least one additional
fix each still in thie pipeline, but which are not yet ready to push"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: Remove unused variable "polling"
x86/espfix/xen: Fix allocation of pages for paravirt page tables
x86/efi: Include a .bss section within the PE/COFF headers
efi: fdt: Do not report an error during boot if UEFI is not available
efi/arm64: efistub: remove local copy of linux_banner
Linus Torvalds [Sat, 19 Jul 2014 06:39:34 +0000 (20:39 -1000)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
- cxgb4 hardware driver regression fixes
- mlx5 hardware driver regression fixes
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx5: Enable "block multicast loopback" for kernel consumers
RDMA/cxgb4: Call iwpm_init() only once
mlx5_core: Fix possible race between mr tree insert/delete
RDMA/cxgb4: Initialize the device status page
RDMA/cxgb4: Clean up connection on ARP error
RDMA/cxgb4: Fix skb_leak in reject_cr()
Linus Torvalds [Sat, 19 Jul 2014 06:37:24 +0000 (20:37 -1000)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
"More fallout from module tests and code inspection.
Fixes to temperature limit write operations in adt7470 driver. Also,
dashes are not allowed in hwmon 'name' attributes. Fix drivers where
necessary"
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (adt7470) Fix writes to temperature limit registers
hwmon: (da9055) Don't use dash in the name attribute
hwmon: (da9052) Don't use dash in the name attribute
Linus Torvalds [Sat, 19 Jul 2014 06:36:13 +0000 (20:36 -1000)]
Merge tag 'iommu-fixes-v3.16-rc5' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"A couple of fixes for the Freescale PAMU driver queued up:
- fix PAMU window size check.
- fix the device domain attach condition.
- fix the error condition during iommu group"
* tag 'iommu-fixes-v3.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/fsl: Fix the error condition during iommu group
iommu/fsl: Fix the device domain attach condition.
iommu/fsl: Fix PAMU window size check.
Linus Torvalds [Sat, 19 Jul 2014 06:28:27 +0000 (20:28 -1000)]
Merge tag 'pm+acpi-3.16-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
"These are a few recent regression fixes, a revert of the ACPI video
commit I promised, a system resume fix related to request_firmware(),
an ACPI video quirk for one more Win8-oriented BIOS, an ACPI device
enumeration documentation update and a few fixes for ARM cpufreq
drivers.
Specifics:
- Fix for a recently introduced NULL pointer dereference in the core
system suspend code occuring when platforms without ACPI attempt to
use the "freeze" sleep state from Zhang Rui.
- Fix for a recently introduced build warning in cpufreq headers from
Brian W Hart.
- Fix for a 3.13 cpufreq regression related to sysem resume that
triggers on some systems with multiple CPU clusters from Viresh
Kumar.
- Fix for a 3.4 regression in request_firmware() resulting in
WARN_ON()s on some systems during system resume from Takashi Iwai.
- Revert of the ACPI video commit that changed the default value of
the video.brightness_switch_enabled command line argument to 0 as
it has been reported to break existing setups.
- ACPI device enumeration documentation update to take recent code
changes into account and make the documentation match the code
again from Darren Hart.
- Fixes for the sa1110, imx6q, kirkwood, and cpu0 cpufreq drivers
from Linus Walleij, Nicolas Del Piano, Quentin Armitage, Viresh
Kumar.
- New ACPI video blacklist entry for HP ProBook 4540s from Hans de
Goede"
* tag 'pm+acpi-3.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: make table sentinel macros unsigned to match use
cpufreq: move policy kobj to policy->cpu at resume
cpufreq: cpu0: OPPs can be populated at runtime
cpufreq: kirkwood: Reinstate cpufreq driver for ARCH_KIRKWOOD
cpufreq: imx6q: Select PM_OPP
cpufreq: sa1110: set memory type for h3600
ACPI / video: Add use_native_backlight quirk for HP ProBook 4540s
PM / sleep: fix freeze_ops NULL pointer dereferences
PM / sleep: Fix request_firmware() error at resume
Revert "ACPI / video: change acpi-video brightness_switch_enabled default to 0"
ACPI / documentation: Remove reference to acpi_platform_device_ids from enumeration.txt
Linus Torvalds [Sat, 19 Jul 2014 06:27:23 +0000 (20:27 -1000)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"One nouveau deadlock fix, one qxl irq handling fix, and a set of
radeon pageflipping changes that fix regressions in pageflipping since
-rc1 along with a leak and backlight fix.
The pageflipping fixes are a bit bigger than I'd like, but there has
been a few people focused on testing them"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Make classic pageflip completion path less racy.
drm/radeon: Add missing vblank_put in pageflip ioctl error path.
drm/radeon: Remove redundant fence unref in pageflip path.
drm/radeon: Complete page flip even if waiting on the BO fence fails
drm/radeon: Move pinning the BO back to radeon_crtc_page_flip()
drm/radeon: Prevent too early kms-pageflips triggered by vblank.
drm/radeon: set default bl level to something reasonable
drm/radeon: avoid leaking edid data
drm/qxl: return IRQ_NONE if it was not our irq
drm/nouveau/therm: fix a potential deadlock in the therm monitoring code
Linus Torvalds [Sat, 19 Jul 2014 06:26:46 +0000 (20:26 -1000)]
Merge tag 'random_for_linus_stable' of git://git./linux/kernel/git/tytso/random
Pull /dev/random fix from Ted Ts'o:
"Fix a BUG splat found by trinity"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: check for increase of entropy_count because of signed conversion
Linus Torvalds [Sat, 19 Jul 2014 06:25:54 +0000 (20:25 -1000)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This push fixes a boot hang in virt guests when the virtio RNG is
enabled"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: virtio - ensure reads happen after successful probe
hwrng: fetch randomness only after device init
Hannes Frederic Sowa [Fri, 18 Jul 2014 21:26:41 +0000 (17:26 -0400)]
random: check for increase of entropy_count because of signed conversion
The expression entropy_count -= ibytes << (ENTROPY_SHIFT + 3) could
actually increase entropy_count if during assignment of the unsigned
expression on the RHS (mind the -=) we reduce the value modulo
2^width(int) and assign it to entropy_count. Trinity found this.
[ Commit modified by tytso to add an additional safety check for a
negative entropy_count -- which should never happen, and to also add
an additional paranoia check to prevent overly large count values to
be passed into urandom_read(). ]
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Tomasz Figa [Tue, 15 Jul 2014 17:59:18 +0000 (02:59 +0900)]
ARM: EXYNOS: Fix core ID used by platsmp and hotplug code
When CPU topology is specified in device tree, cpu_logical_map() does
not return core ID anymore, but rather full MPIDR value. This breaks
existing calculation of PMU register offsets on Exynos SoCs.
This patch fixes the problem by adjusting the code to use only core ID
bits of the value returned by cpu_logical_map() to allow CPU topology to
be specified in device tree on Exynos SoCs.
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Romain Degez [Fri, 11 Jul 2014 16:08:13 +0000 (18:08 +0200)]
ahci: add support for the Promise FastTrak TX8660 SATA HBA (ahci mode)
Add support of the Promise FastTrak TX8660 SATA HBA in ahci mode by
registering the board in the ahci_pci_tbl[].
Note: this HBA also provide a hardware RAID mode when activated in
BIOS but specific drivers from the manufacturer are required in this
case.
Signed-off-by: Romain Degez <romain.degez@gmail.com>
Tested-by: Romain Degez <romain.degez@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Olof Johansson [Fri, 18 Jul 2014 21:40:17 +0000 (14:40 -0700)]
Merge tag 'imx-fixes-3.16-2' of git://git./linux/kernel/git/shawnguo/linux into fixes
Merge "ARM: imx: fixes for 3.16, 2nd take" from Shawn Guo:
The i.MX fixes for 3.16, 2nd take:
It fixes a hard machine hang regression for boards where only pcie is
active but no sata, as the latest imx6-pcie driver is no longer enabling
the upstream clock directly but only lvds clk out.
* tag 'imx-fixes-3.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: clk-imx6q: parent lvds_sel input from upstream clock gates
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Fri, 18 Jul 2014 21:39:18 +0000 (14:39 -0700)]
Merge tag 'at91-fixes' of git://github.com/at91linux/linux-at91 into fixes
Merge "at91: fixes for 3.16 #2" from Nicolas Ferre:
Second AT91 fixes series for 3.16
- fix clock definitions after the move to CCF for:
* at91sam9n12 (ohci)
* at91sam9x5 (ohci, pwm)
* tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
ARM: at91/dt: add missing clocks property to pwm node in sam9x5.dtsi
ARM: at91/dt: fix usb0 clocks definition in sam9n12 dtsi
ARM: at91: at91sam9x5: correct typo error for ohci clock
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Fri, 18 Jul 2014 21:38:28 +0000 (14:38 -0700)]
Merge tag 'mvebu-fixes-3.16-3' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu fixes for v3.16 (round 3)" from Jason Cooper:
- Fix SMP boot on 38x/375 in big endian
- Fix operand list for pmsu on 370/XP
- Fix coherency bus notifiers
* tag 'mvebu-fixes-3.16-3' of git://git.infradead.org/linux-mvebu:
ARM: mvebu: Fix coherency bus notifiers by using separate notifiers
ARM: mvebu: Fix the operand list in the inline asm of armada_370_xp_pmsu_idle_enter
ARM: mvebu: fix SMP boot for Armada 38x and Armada 375 Z1 in big endian
Signed-off-by: Olof Johansson <olof@lixom.net>
Bjorn Helgaas [Fri, 18 Jul 2014 17:05:38 +0000 (10:05 -0700)]
Input: sirfsoc-onkey - fix GPL v2 license string typo
Per license_is_gpl_compatible(), the MODULE_LICENSE() string for GPL v2 is
"GPL v2", not "GPLv2". Use "GPL v2" so this module doesn't taint the
kernel.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tobias Klauser [Tue, 8 Jul 2014 22:18:07 +0000 (15:18 -0700)]
Input: st-keyscan - fix 'defined but not used' compiler warnings
Add #ifdef CONFIG_PM_SLEEP around keyscan_supend() and keyscan_resume() to
fix the following compiler warnings occuring if CONFIG_PM_SLEEP is unset:
+ /scratch/kisskb/src/drivers/input/keyboard/st-keyscan.c: warning: 'keyscan_resume' defined but not used [-Wunused-function]: => 235:12
+ /scratch/kisskb/src/drivers/input/keyboard/st-keyscan.c: warning: 'keyscan_suspend' defined but not used [-Wunused-function]: => 218:12
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lkml.org/lkml/2014/7/8/109
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Linus Torvalds [Fri, 18 Jul 2014 16:26:04 +0000 (06:26 -1000)]
Merge tag 'gfs2-fixes' of git://git./linux/kernel/git/steve/gfs2-3.0-fixes
Pull gfs2 fixes from Steven Whitehouse:
"This patch set contains two minor docs/spelling fixes, some fixes for
flock, a change to use GFP_NOFS to avoid recursion on a rarely used
code path and a fix for a race relating to the glock lru"
* tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes:
GFS2: fs/gfs2/rgrp.c: kernel-doc warning fixes
GFS2: memcontrol: Spelling s/invlidate/invalidate/
GFS2: Allow caching of glocks for flock
GFS2: Allow flocks to use normal glock dq rather than dq_wait
GFS2: replace count*size kzalloc by kcalloc
GFS2: Use GFP_NOFS when allocating glocks
GFS2: Fix race in glock lru glock disposal
GFS2: Only wait for demote when last holder is dequeued
Linus Torvalds [Fri, 18 Jul 2014 16:25:05 +0000 (06:25 -1000)]
Merge tag 'dm-3.16-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
"Fix the dm-thinp and dm-cache targets to disallow changing the data
device's block size"
* tag 'dm-3.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache metadata: do not allow the data block size to change
dm thin metadata: do not allow the data block size to change
Linus Torvalds [Fri, 18 Jul 2014 16:23:34 +0000 (06:23 -1000)]
Merge tag 'upstream-3.16-rc6' of git://git.infradead.org/linux-ubifs
Pull UBI fixes from Artem Bityutskiy:
"Two UBI fastmap-related fixes for v3.16:
- fix UBI fastmap support which we broke in 3.16-rc1 by reversing the
volumes RB-tree sorting criteria.
- make sure that we scrub all PEBs where we see bit-flips - we were
missing some of them when the fastmap feature was enabled"
* tag 'upstream-3.16-rc6' of git://git.infradead.org/linux-ubifs:
UBI: fastmap: do not miss bit-flips
UBI: fix the volumes tree sorting criteria
Linus Torvalds [Fri, 18 Jul 2014 16:21:43 +0000 (06:21 -1000)]
Merge tag 'xfs-for-linus-3.16-rc5' of git://oss.sgi.com/xfs/xfs
Pull xfs fixes from Dave Chinner:
"Fixes for low memory perforamnce regressions and a quota inode
handling regression.
These are regression fixes for issues recently introduced - the change
in the stack switch location is fairly important, so I've held off
sending this update until I was sure that it still addresses the stack
usage problem the original solved. So while the commits in the xfs
tree are recent, it has been under tested for several weeks now"
* tag 'xfs-for-linus-3.16-rc5' of git://oss.sgi.com/xfs/xfs:
xfs: null unused quota inodes when quota is on
xfs: refine the allocation stack switch
Revert "xfs: block allocation work needs to be kswapd aware"
Boris BREZILLON [Thu, 17 Jul 2014 19:03:58 +0000 (21:03 +0200)]
ARM: at91/dt: add missing clocks property to pwm node in sam9x5.dtsi
The pwm driver requires a clocks property referencing the pwm peripheral
clk.
Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Boris BREZILLON [Mon, 14 Jul 2014 06:39:27 +0000 (08:39 +0200)]
ARM: at91/dt: fix usb0 clocks definition in sam9n12 dtsi
udphs_clk (USB Device Controller clock) is referenced instead of
uhphs_clk (USB Host Controller clock).
Signed-off-by: Boris BREZILLON <boris.brezillon@free-electrons.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Bo Shen [Mon, 14 Jul 2014 03:08:14 +0000 (11:08 +0800)]
ARM: at91: at91sam9x5: correct typo error for ohci clock
Correct the typo error for the second "uhphs_clk".
Signed-off-by: Bo Shen <voice.shen@atmel.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Tomasz Figa [Thu, 17 Jul 2014 15:23:44 +0000 (17:23 +0200)]
irqchip: gic: Fix core ID calculation when topology is read from DT
Certain GIC implementation, namely those found on earlier, single
cluster, Exynos SoCs, have registers mapped without per-CPU banking,
which means that the driver needs to use different offset for each CPU.
Currently the driver calculates the offset by multiplying value returned
by cpu_logical_map() by CPU offset parsed from DT. This is correct when
CPU topology is not specified in DT and aforementioned function returns
core ID alone. However when DT contains CPU topology, the function
changes to return cluster ID as well, which is non-zero on mentioned
SoCs and so breaks the calculation in GIC driver.
This patch fixes this by masking out cluster ID in CPU offset
calculation so that only core ID is considered. Multi-cluster Exynos
SoCs already have banked GIC implementations, so this simple fix should
be enough.
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Fixes: db0d4db22a78d ("ARM: gic: allow GIC to support non-banked setups")
Cc: <stable@vger.kernel.org> # v3.3+
Link: https://lkml.kernel.org/r/1405610624-18722-1-git-send-email-t.figa@samsung.com
Signed-off-by: Jason Cooper <jason@lakedaemon.net>