Liu Ying [Tue, 7 May 2013 08:38:40 +0000 (16:38 +0800)]
ktime: Use macro NSEC_PER_USEC where appropriate
The macro NSEC_PER_USEC is already defined in the header file
include/linux/time.h. To make the code more readable, this patch
replaces the literal 1000 used to convert between a
time value in microseconds and one in nanoseconds with the
macro NSEC_PER_USEC.
Signed-off-by: Liu Ying <Ying.Liu@freescale.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
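As a rough illustration of the kind of conversion this patch touches, the sketch below does the same arithmetic in user-space C. NSEC_PER_USEC is redefined locally with the value 1000 so the example builds outside the kernel. The helper names usecs_to_nsecs_sketch() and nsecs_to_usecs_sketch() are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the constant so the sketch builds in user space. */
#define NSEC_PER_USEC 1000L

/* Hypothetical helper: microseconds to nanoseconds via the named constant. */
static inline int64_t usecs_to_nsecs_sketch(int64_t usecs)
{
	/* Guard against overflow before scaling up. */
	assert(usecs <= INT64_MAX / NSEC_PER_USEC);
	assert(usecs >= INT64_MIN / NSEC_PER_USEC);
	return usecs * NSEC_PER_USEC;
}

/* Hypothetical helper: nanoseconds to whole microseconds (truncating). */
static inline int64_t nsecs_to_usecs_sketch(int64_t nsecs)
{
	return nsecs / NSEC_PER_USEC;
}
```

The named constant documents the unit change at each call site, which a bare 1000 does not.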
Thomas Gleixner [Tue, 28 May 2013 07:48:46 +0000 (09:48 +0200)]
clocksource: Implement clocksource_select_fallback() for CONFIG_ARCH_USES_GETTIMEOFFSET=y
commit 7eaeb34305 (clocksource: Provide unbind interface in sysfs)
implemented clocksource_select_fallback() which is not defined for
CONFIG_ARCH_USES_GETTIMEOFFSET=y. Add an empty inline function for
that.
Reported-by: Ingo Molnar <mingo@kernel.org>
Reported-by: fengguang.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
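The "empty inline function" pattern used here can be sketched in plain C as follows. This is only an illustration: SKETCH_ARCH_USES_GETTIMEOFFSET stands in for the real config symbol, and sketch_select_fallback() and sketch_unbind() are hypothetical names, not the kernel's code.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_ARCH_USES_GETTIMEOFFSET=y. */
#define SKETCH_ARCH_USES_GETTIMEOFFSET 1

/* Counts how often a real fallback selection ran. */
static int sketch_fallback_calls;

#ifndef SKETCH_ARCH_USES_GETTIMEOFFSET
/* Real implementation, built only when clocksource selection exists. */
static void sketch_select_fallback(void)
{
	sketch_fallback_calls++;
}
#else
/* Empty inline stub: callers compile unchanged and the call folds away. */
static inline void sketch_select_fallback(void) { }
#endif

/* The caller needs no #ifdef of its own. */
static int sketch_unbind(void)
{
	sketch_select_fallback();
	return sketch_fallback_calls;
}
```

Keeping the conditional next to the declaration means the calling code stays free of #ifdef blocks.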
Thomas Gleixner [Tue, 28 May 2013 07:28:02 +0000 (09:28 +0200)]
clockevents: Define CS_NAME_LEN unconditionally
Unbreak architectures which do not use clockevents but need to
build some of the core timekeeping infrastructure.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:50 +0000 (20:31 +0000)]
clockevents: Implement unbind functionality
Provide a sysfs interface to allow unbinding of clockevent
devices. The device is unbound if it is unused or if there is a
replacement device available. Unbinding of broadcast devices is not
supported as we don't want to foster that nonsense. If no replacement
device is available the unbind returns -EBUSY. Unbind is available
from the kernel and through sysfs, which is necessary to drop the
module refcount.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.499216659@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
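The unbind rules described above can be modelled in a few lines of user-space C. This is a sketch under assumptions: struct sketch_clockevent and sketch_unbind() are hypothetical, and returning -EBUSY for broadcast devices is this sketch's choice, since the message only says that case is unsupported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much reduced model of a clockevent device. */
struct sketch_clockevent {
	bool in_use;		/* currently drives a tick device */
	bool is_broadcast;	/* broadcast devices are never unbound */
	bool bound;		/* cleared once the device is unbound */
};

/*
 * Unbind dev. An unused device is always unbound. A device in use is
 * unbound only when a replacement can take over; otherwise -EBUSY.
 */
static int sketch_unbind(struct sketch_clockevent *dev,
			 struct sketch_clockevent *replacement)
{
	assert(dev != NULL);

	if (dev->is_broadcast)
		return -EBUSY;	/* assumption: error code chosen by this sketch */

	if (dev->in_use) {
		if (replacement == NULL)
			return -EBUSY;
		replacement->in_use = true;	/* hand over the tick duty */
		dev->in_use = false;
	}

	dev->bound = false;
	return 0;
}
```

A caller that wants to unload a module would treat a zero return as permission to proceed.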
Thomas Gleixner [Thu, 25 Apr 2013 20:31:50 +0000 (20:31 +0000)]
clockevents: Split out selection logic
Split out the clockevent device selection logic. Preparatory patch to
allow unbinding active clockevent devices.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.431796247@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:49 +0000 (20:31 +0000)]
clockevents: Provide sysfs interface
Provide a simple sysfs interface for the clockevent devices. Show the
current active clockevent device.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.371634778@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:49 +0000 (20:31 +0000)]
clockevents: Add module refcount
We want to be able to remove clockevent modules as well. Add a
refcount so we don't remove a module with an active clock event
device.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.307435149@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:48 +0000 (20:31 +0000)]
clockevents: Move the tick_notify() switch case to clockevents_notify()
No need to call another function and have duplicated cases.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.235746557@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:48 +0000 (20:31 +0000)]
clockevents: Simplify locking
Now that the notifier chain is gone there are no other users and it's
pointless to nest tick_device_lock inside of clockevents_lock because
there is no other use case.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.162888472@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:47 +0000 (20:31 +0000)]
clockevents: Get rid of the notifier chain
7+ years and still a single user. Kill it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.098520211@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:46 +0000 (20:31 +0000)]
clocksource: Let clocksource_unregister() return success/error
The unregister call can fail, if the clocksource is the current one
and there is no replacement clocksource available. It can also fail,
if the clocksource is the watchdog clocksource and I'm not going to
provide support for this.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143436.029915527@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:46 +0000 (20:31 +0000)]
clocksource: Provide unbind interface in sysfs
With the module refcount held for the current clocksource there is no
way to unload the module.
Provide a sysfs interface which allows unbinding the clocksource. One
could argue that the clocksource override could be (ab)used to do so,
but the clocksource override cannot be used from the kernel itself,
while an unbind function can be used to programmatically check whether
a clocksource can be shut down or not.
The unbind functionality uses the new skip current feature of
clocksource_select and verifies that a fallback clocksource has been
installed. If the clocksource which should be unbound is the current
clocksource and no fallback can be found, unbind returns -EBUSY.
This does not support the unbinding of a clocksource which is used as
the watchdog clocksource. No point in fostering crappy hardware.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143435.964218245@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:45 +0000 (20:31 +0000)]
clocksource: Split out user string input
Split out the user string input for clocksource override. Preparatory
patch for unbind.
[ jstultz: Fix an off by one error ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143435.895851338@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:45 +0000 (20:31 +0000)]
clocksource: Allow clocksource select to skip current clocksource
Preparatory patch for clocksource unbind support.
Split out code from clocksource_select and modify it, so it skips the
current clocksource on request and tries to find a fallback
clocksource. Convert all existing users. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143435.834965397@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
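A minimal sketch of "select the best clocksource, optionally skipping the current one" could look like the code below. It assumes selection is driven by a numeric rating only. The struct and the function sketch_find_best() are hypothetical simplifications, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduced clocksource: a name and a rating. */
struct sketch_clocksource {
	const char *name;
	int rating;
};

/*
 * Return the highest rated entry. With skipcur set, the entry that is
 * currently in use is ignored, so the result is a fallback candidate.
 * Returns NULL when nothing qualifies.
 */
static const struct sketch_clocksource *
sketch_find_best(const struct sketch_clocksource *list, size_t n,
		 const struct sketch_clocksource *cur, bool skipcur)
{
	const struct sketch_clocksource *best = NULL;
	size_t i;

	assert(list != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (skipcur && &list[i] == cur)
			continue;
		if (best == NULL || list[i].rating > best->rating)
			best = &list[i];
	}
	return best;
}
```

A NULL result with skipcur set corresponds to the "no fallback available" case that later makes unbind fail.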
Thomas Gleixner [Thu, 25 Apr 2013 20:31:44 +0000 (20:31 +0000)]
clocksource: Add module refcount
Add a module refcount, so the current clocksource cannot be removed
unconditionally.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143435.762417789@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:44 +0000 (20:31 +0000)]
clocksource: Let timekeeping_notify return success/error
timekeeping_notify() can fail due to a cs->enable() failure, but the
caller does not notice and happily keeps the wrong clocksource as the
current one.
Let the caller know about failure, so the current clocksource will be
shown correctly in sysfs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143435.696321912@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
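The idea of propagating the failure to the caller can be sketched as follows. All names here (sketch_notify(), sketch_select(), the sample clocksources) are hypothetical, and the error value -1 is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced clocksource with an optional enable hook. */
struct sketch_clocksource {
	const char *name;
	int (*enable)(void);	/* returns 0 on success */
};

static const struct sketch_clocksource *sketch_current;

static int sketch_enable_ok(void)   { return 0; }
static int sketch_enable_fail(void) { return -1; }

static const struct sketch_clocksource sketch_good = { "good", sketch_enable_ok };
static const struct sketch_clocksource sketch_broken = { "broken", sketch_enable_fail };

/* Try to switch; report failure instead of silently ignoring it. */
static int sketch_notify(const struct sketch_clocksource *cs)
{
	assert(cs != NULL);
	if (cs->enable && cs->enable() != 0)
		return -1;
	sketch_current = cs;
	return 0;
}

/* The caller only records the new clocksource when the switch worked. */
static const struct sketch_clocksource *
sketch_select(const struct sketch_clocksource *best)
{
	if (sketch_notify(best) != 0)
		return sketch_current;	/* keep reporting the old one */
	return best;
}
```

With the return value checked, whatever is reported as current always matches what is actually installed.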
Thomas Gleixner [Thu, 25 Apr 2013 20:31:43 +0000 (20:31 +0000)]
clocksource: Always verify highres capability
If a clocksource has a (wrong) high rating, but can't be used as a
timebase for oneshot tick mode, it is unconditionally selected even
when the system is already in oneshot tick mode. This causes full
system failure.
Verify the clocksource selection against the oneshot mode.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130425143435.635040849@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 25 Apr 2013 20:31:43 +0000 (20:31 +0000)]
clocksource: apb_timer: Remove unused function
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Acked-by: Jamie Iles <jamie@jamieiles.com>
Link: http://lkml.kernel.org/r/20130425143435.558006195@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Wed, 15 May 2013 21:08:57 +0000 (14:08 -0700)]
Merge tag 'trace-fixes-v3.10-rc1' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This includes a fix to a memory leak when adding filters to traces.
Also, Masami Hiramatsu fixed up some minor bugs that were discovered
by sparse."
* tag 'trace-fixes-v3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Make print_*probe_event static
tracing/kprobes: Fix a sparse warning for incorrect type in assignment
tracing/kprobes: Use rcu_dereference_raw for tp->files
tracing: Fix leaks of filter preds
Linus Torvalds [Wed, 15 May 2013 21:07:53 +0000 (14:07 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
- Fix for a CPU hot-add deadlock in microcode update code
- Fix for idle consolidation fallout
- Documentation update for initial kernel direct mapping
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Add missing comments for initial kernel direct mapping
x86/microcode: Add local mutex to fix physical CPU hot-add deadlock
x86: Fix idle consolidation fallout
Linus Torvalds [Wed, 15 May 2013 21:07:02 +0000 (14:07 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
- Fix for a task exit cleanup race caused by a missing preempt
disable
- Cleanup of the event notification functions with a massive reduction
of duplicated code
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Factor out auxiliary events notification
perf: Fix EXIT event notification
Linus Torvalds [Wed, 15 May 2013 21:05:17 +0000 (14:05 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
- Cure for not using zalloc in the first place, which leads to random
crashes with CPUMASK_OFF_STACK.
- Revert a user space visible change which broke udev
- Add a missing cpu_online early return introduced by the new full
dyntick conversions
- Plug a long standing race in the timer wheel cpu hotplug code.
Sigh...
- Cleanup NOHZ per cpu data on cpu down to prevent stale data on cpu
up.
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizations
timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE
tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
tick: Cleanup NOHZ per cpu data on cpu down
tick: Use zalloc_cpumask_var for allocating offstack cpumasks
Linus Torvalds [Wed, 15 May 2013 21:04:00 +0000 (14:04 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
- Two fixlets for the fallout of the generic idle task conversion
- Documentation update
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit
idle: Fix hlt/nohlt command-line handling in new generic idle
kthread: Document ways of reducing OS jitter due to per-CPU kthreads
Linus Torvalds [Wed, 15 May 2013 20:37:54 +0000 (13:37 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"A small number of fixes for stuff from the last merge window, and in
one case (IRQ time accounting) the previous merge window."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7720/1: ARM v6/v7 cmpxchg64 shouldn't clear upper 32 bits of the old/new value
ARM: 7715/1: MCPM: adapt to GIC changes after upstream merge
ARM: 7714/1: mmc: mmci: Ensure return value of regulator_enable() is checked
ARM: 7712/1: Remove trailing whitespace in arch/arm/Makefile
ARM: 7711/1: dove: fix Dove cpu type from V7 to PJ4
ARM: finally enable IRQ time accounting config
Linus Torvalds [Wed, 15 May 2013 20:36:19 +0000 (13:36 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fixes from Sage Weil:
"Yes, this is a much larger pull than I would like after -rc1. There
are a few things included:
- a few fixes for leaks and incorrect assertions
- a few patches fixing behavior when mapped images are resized
- handling for cloned/layered images that are flattened out from
underneath the client
The last bit was non-trivial, and there is some code movement and
associated cleanup mixed in. This was ready and was meant to go in
last week but I missed the boat on Friday. My only excuse is that I
was waiting for an all clear from the testing and there were many
other shiny things to distract me.
Strictly speaking, handling the flatten case isn't a regression and
could wait, so if you like we can try to pull the series apart, but
Alex and I would much prefer to have it all in as it is a case real
users will hit with 3.10."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (33 commits)
rbd: re-submit flattened write request (part 2)
rbd: re-submit write request for flattened clone
rbd: re-submit read request for flattened clone
rbd: detect when clone image is flattened
rbd: reference count parent requests
rbd: define parent image request routines
rbd: define rbd_dev_unparent()
rbd: don't release write request until necessary
rbd: get parent info on refresh
rbd: ignore zero-overlap parent
rbd: support reading parent page data for writes
rbd: fix parent request size assumption
libceph: init sent and completed when starting
rbd: kill rbd_img_request_get()
rbd: only set up watch for mapped images
rbd: set mapping read-only flag in rbd_add()
rbd: support reading parent page data
rbd: fix an incorrect assertion condition
rbd: define rbd_dev_v2_header_info()
rbd: get rid of trivial v1 header wrappers
...
Masami Hiramatsu [Mon, 13 May 2013 11:58:39 +0000 (20:58 +0900)]
tracing/kprobes: Make print_*probe_event static
Following a sparse warning, make print_*probe_event static because
those functions are not called directly from outside.
Link: http://lkml.kernel.org/r/20130513115839.6545.83067.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 13 May 2013 11:58:37 +0000 (20:58 +0900)]
tracing/kprobes: Fix a sparse warning for incorrect type in assignment
Fix a sparse warning about an RCU-operated pointer being
defined without the __rcu address space.
Link: http://lkml.kernel.org/r/20130513115837.6545.23322.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Masami Hiramatsu [Mon, 13 May 2013 11:58:34 +0000 (20:58 +0900)]
tracing/kprobes: Use rcu_dereference_raw for tp->files
Use rcu_dereference_raw() for accessing tp->files. Because the
write side uses rcu_assign_pointer() for its memory barrier,
the read side also has to use rcu_dereference_raw() with its
read memory barrier.
Link: http://lkml.kernel.org/r/20130513115834.6545.17022.stgit@mhiramat-M0-7522
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tom Zanussi <tom.zanussi@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
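The pairing of a publishing store with an ordered load can be approximated in user space with C11 atomics. This is an analogy only, not the kernel API: the release store stands in for rcu_assign_pointer(), and an acquire load is used as a conservative stand-in for rcu_dereference_raw(). The names sketch_publish() and sketch_read() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_files {
	int nr;
};

/* Shared pointer, NULL until something is published. */
static _Atomic(struct sketch_files *) sketch_shared;

/* Write side: initialize the object fully, then publish the pointer. */
static void sketch_publish(struct sketch_files *f)
{
	atomic_store_explicit(&sketch_shared, f, memory_order_release);
}

/*
 * Read side: load the pointer with ordering that pairs with the
 * release store, so the pointed-to fields are seen initialized.
 */
static struct sketch_files *sketch_read(void)
{
	return atomic_load_explicit(&sketch_shared, memory_order_acquire);
}
```

A plain load on the read side would give the compiler and CPU no reason to order the pointer fetch before the accesses through it.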
Steven Rostedt (Red Hat) [Tue, 14 May 2013 19:40:48 +0000 (15:40 -0400)]
tracing: Fix leaks of filter preds
Special preds are created when folding a series of preds that
can be done in serial. These are allocated in an ops field of
the pred structure. But they were never freed, causing memory
leaks.
This was discovered using the kmemleak checker:
unreferenced object 0xffff8800797fd5e0 (size 32):
comm "swapper/0", pid 1, jiffies 4294690605 (age 104.608s)
hex dump (first 32 bytes):
00 00 01 00 03 00 05 00 07 00 09 00 0b 00 0d 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff814b52af>] kmemleak_alloc+0x73/0x98
[<ffffffff8111ff84>] kmemleak_alloc_recursive.constprop.42+0x16/0x18
[<ffffffff81120e68>] __kmalloc+0xd7/0x125
[<ffffffff810d47eb>] kcalloc.constprop.24+0x2d/0x2f
[<ffffffff810d4896>] fold_pred_tree_cb+0xa9/0xf4
[<ffffffff810d3781>] walk_pred_tree+0x47/0xcc
[<ffffffff810d5030>] replace_preds.isra.20+0x6f8/0x72f
[<ffffffff810d50b5>] create_filter+0x4e/0x8b
[<ffffffff81b1c30d>] ftrace_test_event_filter+0x5a/0x155
[<ffffffff8100028d>] do_one_initcall+0xa0/0x137
[<ffffffff81afbedf>] kernel_init_freeable+0x14d/0x1dc
[<ffffffff814b24b7>] kernel_init+0xe/0xdb
[<ffffffff814d539c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: stable@vger.kernel.org # 2.6.39+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
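The class of bug fixed here, an inner allocation that is never released when its owner is freed, can be shown with a small user-space sketch. The structure and function names are hypothetical, and the allocation counter exists only to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reduced pred: owns a separately allocated ops array. */
struct sketch_pred {
	unsigned short *ops;
	size_t nr_ops;
};

/* Number of allocations not yet released. */
static int sketch_live_allocs;

static void *sketch_zalloc(size_t n, size_t size)
{
	void *p = calloc(n, size);

	if (p)
		sketch_live_allocs++;
	return p;
}

static void sketch_release(void *p)
{
	if (p) {
		sketch_live_allocs--;
		free(p);
	}
}

/* Create a folded pred together with its ops array. */
static struct sketch_pred *sketch_pred_fold(size_t nr_ops)
{
	struct sketch_pred *pred = sketch_zalloc(1, sizeof(*pred));

	if (!pred)
		return NULL;
	pred->ops = sketch_zalloc(nr_ops, sizeof(*pred->ops));
	if (!pred->ops) {
		sketch_release(pred);
		return NULL;
	}
	pred->nr_ops = nr_ops;
	return pred;
}

/* Free the inner array first; skipping this step is the leak. */
static void sketch_pred_free(struct sketch_pred *pred)
{
	if (!pred)
		return;
	sketch_release(pred->ops);
	sketch_release(pred);
}
```

Dropping the sketch_release(pred->ops) call leaves the counter at one after the free, which is the pattern kmemleak reports above.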
John Stultz [Wed, 24 Apr 2013 18:32:56 +0000 (11:32 -0700)]
time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizations
Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config,
which enables some minor compile time optimization to avoid
unnecessary code, mostly in the suspend/resume path, could cause
problems for userland.
In particular, the dependency for RTC_HCTOSYS on
!ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time
twice and simplifies suspend/resume, has the side effect
of causing the /sys/class/rtc/rtcN/hctosys flag to always be
zero, and this flag is commonly used by udev to setup the
/dev/rtc symlink to /dev/rtcN, which can cause pain for
older applications.
While the udev rules could use some work to be less fragile,
breaking userland should strongly be avoided. Additionally
the compile time optimizations are fairly minor, and the code
being optimized is likely to be reworked in the future, so
let's revert this change.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: stable <stable@vger.kernel.org> #3.9
Cc: Feng Tang <feng.tang@intel.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 14 May 2013 16:30:54 +0000 (09:30 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 update from Ted Ts'o:
"Fixed regressions (two stability regressions and a performance
regression) introduced during the 3.10-rc1 merge window.
Also included is a bug fix relating to allocating blocks after
resizing an ext3 file system when using the ext4 file system driver"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd,jbd2: fix oops in jbd2_journal_put_journal_head()
ext4: revert "ext4: use io_end for multiple bios"
ext4: limit group search loop for non-extent files
ext4: fix fio regression
Linus Torvalds [Tue, 14 May 2013 16:06:29 +0000 (09:06 -0700)]
Merge branch 'for-3.10-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fix from Tejun Heo:
"A fix for a workqueue_congested() regression that broke fscache"
* 'for-3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: workqueue_congested() shouldn't translate WORK_CPU_UNBOUND into node number
Tirupathi Reddy [Tue, 14 May 2013 08:29:02 +0000 (13:59 +0530)]
timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE
An inactive timer's base can refer to an offline CPU's base.
In the current code, the cpu_base's lock is blindly reinitialized each
time a CPU is brought up. If a CPU is brought online while another
thread is modifying an inactive timer on that CPU and holding its
timer base lock, then the lock is reinitialized under that thread's
feet. This leads to the following SPIN_BUG().
<0> BUG: spinlock already unlocked on CPU#3, kworker/u:3/1466
<0> lock: 0xe3ebe000, .magic: dead4ead, .owner: kworker/u:3/1466, .owner_cpu: 1
<4> [<c0013dc4>] (unwind_backtrace+0x0/0x11c) from [<c026e794>] (do_raw_spin_unlock+0x40/0xcc)
<4> [<c026e794>] (do_raw_spin_unlock+0x40/0xcc) from [<c076c160>] (_raw_spin_unlock+0x8/0x30)
<4> [<c076c160>] (_raw_spin_unlock+0x8/0x30) from [<c009b858>] (mod_timer+0x294/0x310)
<4> [<c009b858>] (mod_timer+0x294/0x310) from [<c00a5e04>] (queue_delayed_work_on+0x104/0x120)
<4> [<c00a5e04>] (queue_delayed_work_on+0x104/0x120) from [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c)
<4> [<c04eae00>] (sdhci_msm_bus_voting+0x88/0x9c) from [<c04d8780>] (sdhci_disable+0x40/0x48)
<4> [<c04d8780>] (sdhci_disable+0x40/0x48) from [<c04bf300>] (mmc_release_host+0x4c/0xb0)
<4> [<c04bf300>] (mmc_release_host+0x4c/0xb0) from [<c04c7aac>] (mmc_sd_detect+0x90/0xfc)
<4> [<c04c7aac>] (mmc_sd_detect+0x90/0xfc) from [<c04c2504>] (mmc_rescan+0x7c/0x2c4)
<4> [<c04c2504>] (mmc_rescan+0x7c/0x2c4) from [<c00a6a7c>] (process_one_work+0x27c/0x484)
<4> [<c00a6a7c>] (process_one_work+0x27c/0x484) from [<c00a6e94>] (worker_thread+0x210/0x3b0)
<4> [<c00a6e94>] (worker_thread+0x210/0x3b0) from [<c00aad9c>] (kthread+0x80/0x8c)
<4> [<c00aad9c>] (kthread+0x80/0x8c) from [<c000ea80>] (kernel_thread_exit+0x0/0x8)
As an example, this particular crash occurred when CPU #3 was executing
mod_timer() on an inactive timer whose base referred to the offlined CPU
#2. The code locked the timer_base corresponding to CPU #2. Before it
could proceed, CPU #2 came online and reinitialized the spinlock
corresponding to its base. Thus CPU #3 now held a lock which had been
reinitialized. When CPU #3 finally unlocked the old cpu_base
corresponding to CPU #2, we hit the above SPIN_BUG().
CPU #0          CPU #3                                  CPU #2
------          -------                                 -------
.....           ......                                  <Offline>
                mod_timer()
                 lock_timer_base
                   spin_lock_irqsave(&base->lock)
cpu_up(2)       .....                                   ......
                                                        init_timers_cpu()
....            .....                                   spin_lock_init(&base->lock)
.....           spin_unlock_irqrestore(&base->lock)     ......
                <spin_bug>
Allocation of the per_cpu timer vector bases is done only once, under
the "tvec_base_done[]" check. In the current code, the spinlock
initialization of base->lock is not under this check, so the base lock
is reinitialized each time a CPU is brought up. Move the base spinlock
initialization under the check.
Signed-off-by: Tirupathi Reddy <tirupath@codeaurora.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1368520142-4136-1-git-send-email-tirupath@codeaurora.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
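The fix amounts to "initialize the lock once, not on every bring-up". The sketch below models that with a fake lock so the effect is visible without threads. All names are hypothetical, and the lock is only a flag plus an init counter.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical per-CPU timer base with a toy lock. */
struct sketch_base {
	bool locked;		/* toy lock state */
	int lock_inits;		/* how often the lock was initialized */
};

static struct sketch_base sketch_bases[SKETCH_NR_CPUS];
static bool sketch_base_done[SKETCH_NR_CPUS];

/* Initializing a lock discards whatever state it had. */
static void sketch_lock_init(struct sketch_base *base)
{
	base->locked = false;
	base->lock_inits++;
}

/* Called on every CPU bring-up. */
static void sketch_init_timers_cpu(int cpu)
{
	struct sketch_base *base;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	base = &sketch_bases[cpu];

	if (!sketch_base_done[cpu]) {
		/* One-time setup: the lock init belongs here. */
		sketch_lock_init(base);
		sketch_base_done[cpu] = true;
	}
	/* State that is safe to reset on each bring-up would go here. */
}
```

With the init inside the one-time branch, a second bring-up leaves a lock held by another CPU untouched.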
Srivatsa S. Bhat [Mon, 13 May 2013 22:31:27 +0000 (04:01 +0530)]
rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit
Bjørn Mork reported the following warning when running powertop.
[ 49.289034] ------------[ cut here ]------------
[ 49.289055] WARNING: at kernel/rcutree.c:502 rcu_eqs_exit_common.isra.48+0x3d/0x125()
[ 49.289244] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-bisect-rcu-warn+ #107
[ 49.289251] ffffffff8157d8c8 ffffffff81801e28 ffffffff8137e4e3 ffffffff81801e68
[ 49.289260] ffffffff8103094f ffffffff81801e68 0000000000000000 ffff88023afcd9b0
[ 49.289268] 0000000000000000 0140000000000000 ffff88023bee7700 ffffffff81801e78
[ 49.289276] Call Trace:
[ 49.289285] [<ffffffff8137e4e3>] dump_stack+0x19/0x1b
[ 49.289293] [<ffffffff8103094f>] warn_slowpath_common+0x62/0x7b
[ 49.289300] [<ffffffff8103097d>] warn_slowpath_null+0x15/0x17
[ 49.289306] [<ffffffff810a9006>] rcu_eqs_exit_common.isra.48+0x3d/0x125
[ 49.289314] [<ffffffff81079b49>] ? trace_hardirqs_off_caller+0x37/0xa6
[ 49.289320] [<ffffffff810a9692>] rcu_idle_exit+0x85/0xa8
[ 49.289327] [<ffffffff8107076e>] trace_cpu_idle_rcuidle+0xae/0xff
[ 49.289334] [<ffffffff810708b1>] cpu_startup_entry+0x72/0x115
[ 49.289341] [<ffffffff813689e5>] rest_init+0x149/0x150
[ 49.289347] [<ffffffff8136889c>] ? csum_partial_copy_generic+0x16c/0x16c
[ 49.289355] [<ffffffff81a82d34>] start_kernel+0x3f0/0x3fd
[ 49.289362] [<ffffffff81a8274c>] ? repair_env_string+0x5a/0x5a
[ 49.289368] [<ffffffff81a82481>] x86_64_start_reservations+0x2a/0x2c
[ 49.289375] [<ffffffff81a82550>] x86_64_start_kernel+0xcd/0xd1
[ 49.289379] ---[ end trace 07a1cc95e29e9036 ]---
The warning is that 'rdtp->dynticks' has an unexpected value, which roughly
means that the calls to rcu_idle_enter() and rcu_idle_exit() were not
made in the correct order, or were otherwise messed up.
And Bjørn's painstaking debugging indicated that this happens when the idle
loop enters the poll mode. Looking at the poll function cpu_idle_poll(), and
the implementation of trace_cpu_idle_rcuidle(), the problem becomes very clear:
cpu_idle_poll() lacks calls to rcu_idle_enter/exit(), and trace_cpu_idle_rcuidle()
calls them in the reverse order - first rcu_idle_exit(), and then rcu_idle_enter().
Hence the even/odd alternative sequencing of rdtp->dynticks goes for a toss.
And powertop readily triggers this because powertop uses the idle-tracing
infrastructure extensively.
So, to fix this, wrap the code in cpu_idle_poll() within rcu_idle_enter/exit(),
so that it blends properly with the calls inside trace_cpu_idle_rcuidle() and
the function ordering comes out right.
Reported-and-tested-by: Bjørn Mork <bjorn@mork.no>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/519169BF.4080208@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
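The even/odd sequencing can be illustrated with a deliberately simplified model. It is not the kernel implementation (the real code also tracks nesting); it only assumes a counter that is odd while the CPU is non-idle and even while it is idle, with each enter/exit call incrementing it. All function names are hypothetical.

```c
#include <assert.h>

/* Odd: CPU is not idle. Even: CPU is idle. Starts non-idle. */
static unsigned int sketch_dynticks = 1;
/* Counts parity violations, standing in for the WARNING above. */
static int sketch_warnings;

static void sketch_idle_enter(void)
{
	sketch_dynticks++;
	if (sketch_dynticks & 1)	/* must be even after entering idle */
		sketch_warnings++;
}

static void sketch_idle_exit(void)
{
	sketch_dynticks++;
	if (!(sketch_dynticks & 1))	/* must be odd after leaving idle */
		sketch_warnings++;
}

/* Tracing from idle: leave idle briefly, trace, go back to idle. */
static void sketch_trace_idle(void)
{
	sketch_idle_exit();
	/* the tracepoint itself would run here */
	sketch_idle_enter();
}

/* Poll loop without the wrapping: tracing runs while not idle. */
static void sketch_poll_unwrapped(void)
{
	sketch_trace_idle();
}

/* Poll loop wrapped in enter/exit, as the fix describes. */
static void sketch_poll_wrapped(void)
{
	sketch_idle_enter();
	sketch_trace_idle();
	sketch_idle_exit();
}
```

In this model the wrapped variant keeps the parity consistent, while the unwrapped one trips the check on both calls inside the tracing helper.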
Thomas Gleixner [Mon, 13 May 2013 19:40:27 +0000 (21:40 +0200)]
tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
commit 5b39939a4 (nohz: Move ts->idle_calls incrementation into strict
idle logic) moved code out of tick_nohz_stop_sched_tick() and failed
to bail out when the cpu is offline. That's causing subsequent
failures as an offline CPU is supposed to die and not to fiddle with
nohz magic.
Return false in can_stop_idle_tick() if the cpu is offline.
Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Reported-and-tested-by: Prarit Bhargava <prarit@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305132138160.2863@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 14 May 2013 14:43:11 +0000 (07:43 -0700)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
Pull powerpc fixes from Benjamin Herrenschmidt:
"This is mostly bug fixes (some of them regressions, some of them I
deemed worth merging now) along with some patches from Li Zhong
hooking up the new context tracking stuff (for the new full NO_HZ)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
powerpc: Set show_unhandled_signals to 1 by default
powerpc/perf: Fix setting of "to" addresses for BHRB
powerpc/pmu: Fix order of interpreting BHRB target entries
powerpc/perf: Move BHRB code into CONFIG_PPC64 region
powerpc: select HAVE_CONTEXT_TRACKING for pSeries
powerpc: Use the new schedule_user API on userspace preemption
powerpc: Exit user context on notify resume
powerpc: Exception hooks for context tracking subsystem
powerpc: Syscall hooks for context tracking subsystem
powerpc/booke64: Fix kernel hangs at kernel_dbg_exc
powerpc: Fix irq_set_affinity() return values
powerpc: Provide __bswapdi2
powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3
powerpc/powernv: Detect OPAL v3 API version
powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again
powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS
powerpc: Bring all threads online prior to migration/hibernation
powerpc/rtas_flash: Fix validate_flash buffer overflow issue
powerpc/kexec: Fix kexec when using VMX optimised memcpy
powerpc: Fix build errors STRICT_MM_TYPECHECKS
...
Benjamin Herrenschmidt [Tue, 14 May 2013 07:02:11 +0000 (17:02 +1000)]
powerpc: Set show_unhandled_signals to 1 by default
Just like other architectures
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Mon, 13 May 2013 18:44:58 +0000 (18:44 +0000)]
powerpc/perf: Fix setting of "to" addresses for BHRB
Currently we only set the "to" address in the branch stack when the CPU
explicitly gives us a value. Unfortunately it only does this for XL form
branches (eg blr, bctr, bctar) and not I and B form branches (eg b, bc).
Fortunately if we read the instruction from memory we can extract the offset of
a branch and calculate the target address.
This adds a function power_pmu_bhrb_to() to calculate the target/to address of
the corresponding I and B form branches. It handles branches in both user and
kernel spaces. It also plumbs this into the perf BHRB reading code.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Mon, 13 May 2013 18:44:57 +0000 (18:44 +0000)]
powerpc/pmu: Fix order of interpreting BHRB target entries
The current Branch History Rolling Buffer (BHRB) code misinterprets the order
of entries in the hardware buffer. It assumes that a branch target address
will be read _after_ its corresponding branch. In reality the branch target
comes before (lower mfbhrb entry) its corresponding branch.
This is a rewrite of the code to take this into account.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Neuling [Mon, 13 May 2013 18:44:56 +0000 (18:44 +0000)]
powerpc/perf: Move BHRB code into CONFIG_PPC64 region
The new Branch History Rolling buffer (BHRB) code is only useful on 64bit
processors, so move it into the #ifdef CONFIG_PPC64 region.
This avoids code bloat on 32bit systems.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 13 May 2013 16:16:44 +0000 (16:16 +0000)]
powerpc: select HAVE_CONTEXT_TRACKING for pSeries
Start context tracking support from pSeries.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 13 May 2013 16:16:43 +0000 (16:16 +0000)]
powerpc: Use the new schedule_user API on userspace preemption
This patch corresponds to
[PATCH] x86: Use the new schedule_user API on userspace preemption
commit 0430499ce9d78691f3985962021b16bf8f8a8048
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 13 May 2013 16:16:42 +0000 (16:16 +0000)]
powerpc: Exit user context on notify resume
This patch allows RCU usage in do_notify_resume, e.g. signal handling.
It corresponds to
[PATCH] x86: Exit RCU extended QS on notify resume
commit edf55fda35c7dc7f2d9241c3abaddaf759b457c6
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 13 May 2013 16:16:41 +0000 (16:16 +0000)]
powerpc: Exception hooks for context tracking subsystem
These are the exception hooks for the context tracking subsystem, covering
data access, program check, single step, instruction breakpoint, machine check,
alignment, fp unavailable, altivec assist, and unknown exception, whose handlers
might use RCU.
This patch corresponds to
[PATCH] x86: Exception hooks for userspace RCU extended QS
commit 6ba3c97a38803883c2eee489505796cb0a727122
However, exception handling has since moved to generic code, with further
changes in the following two commits:
56dd9470d7c8734f055da2a6bac553caf4a468eb context_tracking: Move exception handling to generic code
6c1e0256fad84a843d915414e4b5973b7443d48d context_tracking: Restore correct previous context state on exception exit
so the exception hooks can use that generic code instead of a
redundant arch implementation.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 13 May 2013 16:16:40 +0000 (16:16 +0000)]
powerpc: Syscall hooks for context tracking subsystem
These are the syscall slow path hooks for the context tracking subsystem,
corresponding to
[PATCH] x86: Syscall hooks for userspace RCU extended QS
commit bf5a3c13b939813d28ce26c01425054c740d6731
TIF_MEMDIE is moved to the second 16 bits (with value 17), as no asm code
appears to use it. TIF_NOHZ is added to _TIF_SYSCALL_T_OR_A, so it should sit
in the same 16 bits as the other flags in the group; that way an andi. against
the group works in the asm code.
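For illustration only: a small C sketch of why the flags tested together
need to live in the low 16 bits. andi. takes a 16-bit unsigned immediate, so
a group mask with any bit above bit 15 cannot be tested with one andi. The
bit numbers for the trace and NOHZ flags below are made up for the sketch;
only the value 17 for TIF_MEMDIE comes from the text above.

```c
#include <assert.h>

/* Illustrative bit numbers; the real ones live in the powerpc thread_info header. */
#define TIF_SYSCALL_TRACE_SK 0
#define TIF_NOHZ_SK          9
#define TIF_MEMDIE_SK        17 /* moved above bit 15, as described above */

#define TIF_MASK_SK(bit) (1ul << (bit))

/* Hypothetical group of flags tested together on the syscall path. */
#define SYSCALL_GROUP_SK \
	(TIF_MASK_SK(TIF_SYSCALL_TRACE_SK) | TIF_MASK_SK(TIF_NOHZ_SK))

/* Build-time check: the group mask must fit a 16-bit immediate. */
_Static_assert(SYSCALL_GROUP_SK <= 0xfffful,
	       "group mask does not fit an andi. immediate");

/* Run-time form of the same check, usable on any mask. */
static int fits_andi_immediate(unsigned long mask)
{
	return mask <= 0xfffful;
}
```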
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Scott Wood [Mon, 13 May 2013 14:14:53 +0000 (14:14 +0000)]
powerpc/booke64: Fix kernel hangs at kernel_dbg_exc
MSR_DE is not cleared on entry to the kernel, and we don't clear it
explicitly outside of debug code. If we have MSR_DE set in
prime_debug_regs(), and the new thread has events enabled in DBCR0
(e.g. ICMP is set in thread->dbsr0, even though it was cleared in the
real DBCR0 when the thread got scheduled out), we'll end up taking a
debug exception in the kernel when DBCR0 is loaded. DSRR0 will not
point to an exception vector, and the kernel ends up hanging at
kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new
debug state.
Another observed source of kernel_dbg_exc hangs is the branch
taken event. If this event is active, but we take a non-debug trap
(e.g. a TLB miss or an asynchronous interrupt) before the next branch,
we end up taking a branch-taken debug exception on the initial branch
instruction of the exception vector, but because the debug exception is
DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even
checking the DSRR0 address. Fix this by checking for DBSR_BT as well
as DBSR_IC, which is what 32-bit does and what the comments suggest was
intended in the 64-bit code as well.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Alexander Gordeev [Mon, 13 May 2013 00:57:49 +0000 (00:57 +0000)]
powerpc: Fix irq_set_affinity() return values
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
David Woodhouse [Mon, 13 May 2013 00:23:38 +0000 (00:23 +0000)]
powerpc: Provide __bswapdi2
Some versions of GCC apparently expect this to be provided by libgcc.
Updates from Mikey to fix 32 bit version and adding "r" to registers.
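For illustration only: a portable C sketch of what a __bswapdi2 helper
computes, namely reversing the eight bytes of a 64-bit value. The name
bswapdi2_sketch() is hypothetical, and this is plain C rather than the
assembly the patch provides.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value in three swap passes. */
static uint64_t bswapdi2_sketch(uint64_t x)
{
	/* swap the two 32-bit halves */
	x = ((x & 0x00000000ffffffffull) << 32) | (x >> 32);
	/* swap 16-bit pairs inside each half */
	x = ((x & 0x0000ffff0000ffffull) << 16) |
	    ((x >> 16) & 0x0000ffff0000ffffull);
	/* swap adjacent bytes */
	x = ((x & 0x00ff00ff00ff00ffull) << 8) |
	    ((x >> 8) & 0x00ff00ff00ff00ffull);
	return x;
}
```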
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Tue, 14 May 2013 05:12:31 +0000 (15:12 +1000)]
powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3
The current code fails to handle kexec on OPALv2. This fixes it
and adds code to improve the situation on OPALv3 where we can
query the CPU status from the firmware and decide what to do
based on that.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Benjamin Herrenschmidt [Tue, 14 May 2013 05:10:02 +0000 (15:10 +1000)]
powerpc/powernv: Detect OPAL v3 API version
Future firmwares will support that new version
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Li Zhong [Mon, 6 May 2013 22:44:41 +0000 (22:44 +0000)]
powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again
Saw this warning again, this time from the ret_from_fork path.
It seems we could clear the back chain earlier in copy_thread(), which
would cover both paths and also fix potential lockdep usage in
schedule_tail(), or an exception occurring before we clear the back chain.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Mon, 6 May 2013 18:43:39 +0000 (18:43 +0000)]
powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS
We are getting build errors with CONFIG_PROC_FS=n:
arch/powerpc/kernel/rtas_flash.c
In function 'rtas_flash_init':
745:33: error: unused variable 'f' [-Werror=unused-variable]
But rtas_flash.c should not be built when CONFIG_PROC_FS=n, because all
it does is provide a /proc interface to the RTAS flash routines.
CONFIG_RTAS_FLASH already depends on CONFIG_RTAS_PROC, to indicate that
it depends on the RTAS proc support, but CONFIG_RTAS_PROC does not
depend on CONFIG_PROC_FS. So fix that.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Robert Jennings [Tue, 7 May 2013 04:34:11 +0000 (04:34 +0000)]
powerpc: Bring all threads online prior to migration/hibernation
This patch brings online all threads which are present but not online
prior to migration/hibernation. After migration/hibernation those
threads are taken back offline.
During migration/hibernation all online CPUs must call H_JOIN, this is
required by the hypervisor. Without this patch, threads that are offline
(H_CEDE'd) will not be woken to make the H_JOIN call and the OS will be
deadlocked (all threads either JOIN'd or CEDE'd).
Cc: <stable@kernel.org>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Vasant Hegde [Tue, 7 May 2013 16:54:47 +0000 (16:54 +0000)]
powerpc/rtas_flash: Fix validate_flash buffer overflow issue
The ibm,validate-flash-image RTAS call output buffer contains 150 - 200
bytes of data on the latest systems. The local output buffer is
currently 64 bytes, and we use sprintf to copy data from the
RTAS buffer to the local buffer. This causes a kernel oops (see the
call trace below).
This patch increases local buffer size to 256 and also uses
snprintf instead of sprintf to copy data from RTAS buffer.
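For illustration only: a minimal C sketch of the bounded-copy pattern
described above. The helper copy_validate_output() is hypothetical; the
256-byte size is the one this message mentions. snprintf() never writes more
than the given size and NUL-terminates when the size is non-zero, so an
oversized source is truncated instead of overrunning the destination.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOCAL_BUF_SIZE 256 /* local buffer size after the fix */

/*
 * Hypothetical helper: copy a firmware-provided string into a fixed-size
 * buffer. Returns 1 if the output had to be truncated, 0 otherwise.
 */
static int copy_validate_output(char *dst, size_t size, const char *src)
{
	int needed = snprintf(dst, size, "%s", src);

	return needed < 0 || (size_t)needed >= size;
}
```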
Kernel call trace :
-------------------
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod
Supported: Yes
NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000
REGS: c0000001b91779b0 TRAP: 0400 Tainted: G X (3.0.13-0.27-ppc64)
MSR: 8000000040009032 <EE,ME,IR,DR> CR: 44022488 XER: 20000018
TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36
GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b
GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031
GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4
GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90
GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000
GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000
GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031
GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130
NIP [4520323031333130] 0x4520323031333130
LR [4520323031333130] 0x4520323031333130
Call Trace:
[c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable)
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Sun, 12 May 2013 15:04:53 +0000 (15:04 +0000)]
powerpc/kexec: Fix kexec when using VMX optimised memcpy
commit b3f271e86e5a (powerpc: POWER7 optimised memcpy using VMX and
enhanced prefetch) uses VMX when it is safe to do so (ie not in
interrupt). It also looks at the task struct to decide if we have to
save the current task's VMX state.
kexec calls memcpy() at a point where the task struct may have been
overwritten by the new kexec segments. If it has been overwritten
then when memcpy -> enable_altivec looks up current->thread.regs->msr
we get a cryptic oops or lockup.
I also notice we aren't initialising thread_info->cpu, which means
smp_processor_id is broken. Fix that too.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@vger.kernel.org> # 3.6+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aneesh Kumar K.V [Mon, 6 May 2013 10:51:00 +0000 (10:51 +0000)]
powerpc: Fix build errors STRICT_MM_TYPECHECKS
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Aneesh Kumar K.V [Sat, 11 May 2013 22:33:19 +0000 (22:33 +0000)]
powerpc/mm: Use the correct mask value when looking at pgtable address
Our page tables are 2*sizeof(pte_t)*PTRS_PER_PTE in size, which is PTE_FRAG_SIZE.
Instead of depending on frag size, mask with PMD_MASKED_BITS.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Linus Torvalds [Tue, 14 May 2013 02:03:49 +0000 (19:03 -0700)]
Merge tag 'fixes-for-3.10-rc2-tag' of git://git./linux/kernel/git/sstabellini/xen
Pull Xen/arm fixes from Stefano Stabellini:
"This contains a couple of Xen on ARM initialization fixes and a patch
to improve error handling"
* tag 'fixes-for-3.10-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen:
xen/arm: rename xen_secondary_init and run it on every online cpu
xen/arm: do not handle VCPUOP_register_vcpu_info failures
xen/arm: initialize pm functions later
Linus Torvalds [Mon, 13 May 2013 23:49:59 +0000 (16:49 -0700)]
Merge branch 'parisc-for-3.10' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc update from Helge Deller:
"The second round of parisc updates for 3.10 includes build fixes and
enhancements to utilize irq stacks, fixes SMP races when updating PTE
and TLB entries by proper locking and makes the search for the correct
cross compiler more robust on Debian and Gentoo."
* 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: make default cross compiler search more robust (v3)
parisc: fix SMP races when updating PTE and TLB entries in entry.S
parisc: implement irq stacks - part 2 (v2)
Jaccon Bastiaansen [Mon, 13 May 2013 16:28:27 +0000 (17:28 +0100)]
ARM: 7720/1: ARM v6/v7 cmpxchg64 shouldn't clear upper 32 bits of the old/new value
The implementation of cmpxchg64() for the ARM v6 and v7 architecture
casts parameters 2 and 3 (the old and new 64bit values) to an unsigned
long before calling the atomic_cmpxchg64() function. This clears
the top 32 bits of the old and new values, resulting in the wrong
values being compare-exchanged. Luckily, this only appears to be used
for 64-bit sched_clock, which we don't (yet) have on ARM.
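For illustration only: a small C sketch of the truncation described above.
On 32-bit ARM an unsigned long is 32 bits wide; the sketch uses uint32_t as
a stand-in so that it behaves the same on any host. The function names are
hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for "unsigned long" on a 32-bit ARM build. */
typedef uint32_t ulong32_sketch;

/* Buggy pattern: the cast drops bits 63..32 before the value is widened again. */
static uint64_t through_ulong_cast(uint64_t v)
{
	return (uint64_t)(ulong32_sketch)v;
}

/* Fixed pattern: keep the full 64-bit type all the way through. */
static uint64_t through_u64_cast(uint64_t v)
{
	return (uint64_t)v;
}
```

With the buggy pattern, two values that differ only in their upper halves
compare equal, so a compare-exchange can succeed when it should fail.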
This bug was introduced by commit 3e0f5a15f500 ("ARM: 7404/1: cmpxchg64:
use atomic64 and local64 routines for cmpxchg64").
Cc: <stable@vger.kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jaccon Bastiaansen <jaccon.bastiaansen@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Mon, 13 May 2013 20:25:36 +0000 (13:25 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"Several small bug fixes all over:
1) be2net driver uses wrong payload length when submitting MAC list
get requests to the chip. From Sathya Perla.
2) Fix mwifiex memory leak on driver unload, from Amitkumar Karwar.
3) Prevent random memory access in batman-adv, from Marek Lindner.
4) batman-adv doesn't check for pskb_trim_rcsum() errors, also from
Marek Lindner.
5) Fix fec crashes on rapid link up/down, from Frank Li.
6) Fix inner protocol grovelling in GSO, from Pravin B Shelar.
7) Link event validation fix in qlcnic from Rajesh Borundia.
8) Not all FEC chips can support checksum offload, fix from Shawn
Guo.
9) EXPORT_SYMBOL + inline doesn't make any sense, from Denis Efremov.
10) Fix race in passthru mode during device removal in macvlan, from
Jiri Pirko.
11) Fix RCU hash table lookup socket state race in ipv6, leading to
NULL pointer derefs, from Eric Dumazet.
12) Add several missing HAS_DMA kconfig dependencies, from Geert
Uytterhoeven.
13) Fix bogus PCI resource management in 3c59x driver, from Sergei
Shtylyov.
14) Fix info leak in ipv6 GRE tunnel driver, from Amerigo Wang.
15) Fix device leak in ipv6 IPSEC policy layer, from Cong Wang.
16) DMA mapping leak fix in qlge from Thadeu Lima de Souza Cascardo.
17) Missing iounmap on probe failure in bna driver, from Wei Yongjun."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits)
bna: add missing iounmap() on error in bnad_init()
qlge: fix dma map leak when the last chunk is not allocated
xfrm6: release dev before returning error
ipv6,gre: do not leak info to user-space
virtio_net: use default napi weight by default
emac: Fix EMAC soft reset on 460EX/GT
3c59x: fix PCI resource management
caif: CAIF_VIRTIO should depend on HAS_DMA
net/ethernet: MACB should depend on HAS_DMA
net/ethernet: ARM_AT91_ETHER should depend on HAS_DMA
net/wireless: ATH9K should depend on HAS_DMA
net/ethernet: STMMAC_ETH should depend on HAS_DMA
net/ethernet: NET_CALXEDA_XGMAC should depend on HAS_DMA
ipv6: do not clear pinet6 field
macvlan: fix passthru mode race between dev removal and rx path
ipv4: ip_output: remove inline marking of EXPORT_SYMBOL functions
net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode
net/mlx4_core: Add missing report on VST and spoof-checking dev caps
net: fec: enable hardware checksum only on imx6q-fec
qlcnic: Fix validation of link event command.
...
Helge Deller [Sat, 11 May 2013 19:04:09 +0000 (19:04 +0000)]
parisc: make default cross compiler search more robust (v3)
People/distros vary how they prefix the toolchain name for 64bit builds.
Rather than enforce one convention over another, add a for loop which
does a search for all the general prefixes.
For 64bit builds, we now search for (in order):
hppa64-unknown-linux-gnu
hppa64-linux-gnu
hppa64-linux
For 32bit builds, we look for:
hppa-unknown-linux-gnu
hppa-linux-gnu
hppa-linux
hppa2.0-unknown-linux-gnu
hppa2.0-linux-gnu
hppa2.0-linux
hppa1.1-unknown-linux-gnu
hppa1.1-linux-gnu
hppa1.1-linux
This patch was initiated by Mike Frysinger, with feedback from Jeroen
Roovers, John David Anglin and Helge Deller.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jeroen Roovers <jer@gentoo.org>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Alex Elder [Mon, 6 May 2013 22:40:33 +0000 (17:40 -0500)]
rbd: re-submit flattened write request (part 2)
Add code to rbd_img_obj_exists_callback() to detect when a clone's
parent image has disappeared, and re-submit the original write
request in that case.
Kill off some redundant assertions.
This completes the resolution for:
http://tracker.ceph.com/issues/3763
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Mon, 6 May 2013 22:40:33 +0000 (17:40 -0500)]
rbd: re-submit write request for flattened clone
Add code to rbd_img_parent_read_full_callback() to detect when a
clone's parent image has disappeared, and re-submit the original
write request in that case. (See the previous commit for more
reasoning about why this is appropriate.)
Rename some variables in rbd_img_obj_parent_read_full_callback()
to match the convention used in the previous patch.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Mon, 6 May 2013 22:40:33 +0000 (17:40 -0500)]
rbd: re-submit read request for flattened clone
If a clone image gets flattened while a parent read request is
underway, the original rbd object request needs to be resubmitted.
The reason is that by the time we get the response to the parent
read request, the data read from the parent may be out of date.
In other words, we could see this sequence of events:
rbd client parent image/osd
---------- ----------------
original object ENOENT;
issue parent read
respond to parent read
child image flattened
original image header refresh
<--- original object written independently here
parent read response received
Add code to rbd_img_parent_read_callback() to detect when a clone's
parent image has disappeared (as evidenced by its parent overlap
becoming 0), and re-submit the original read request in that case.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Mon, 6 May 2013 22:40:33 +0000 (17:40 -0500)]
rbd: detect when clone image is flattened
A format 2 clone image can be the subject of a "flatten" operation,
during which all of its data gets "copied up" from its parent image,
leaving the image fully populated. Once this is complete, the
clone's association with the parent is abolished.
Since this can occur when a clone is mapped, we need to detect when
it has occurred and handle it accordingly. We know an image has
been flattened when we know it at one time had a parent, but we have
learned (via a "get_parent" object class method call) it no longer
has one.
There might be in-flight requests at the point we learn an image has
been flattened, so we can't simply clean up parent data structures
right away. Instead, we'll drop the initial parent reference when
the parent has disappeared (rather than when the image gets
destroyed), which will allow the last in-flight reference to clean
things up when it's complete.
We leverage the fact that a zero parent overlap renders an image
effectively unlayered. We set the overlap to 0 at the point we
detect the clone image has flattened, which allows the unlayered
behavior to take effect immediately, while keeping other parent
structures in place until in-flight requests complete.
This and the next few patches resolve:
http://tracker.ceph.com/issues/3763
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 9 May 2013 03:50:04 +0000 (22:50 -0500)]
rbd: reference count parent requests
Keep a reference count for uses of the parent information for an rbd
device.
An initial reference is set in rbd_img_request_create() if the
target image has a parent (with non-zero overlap). Each image
request for an image with a non-zero parent overlap gets another
reference when it's created, and that reference is dropped when the
request is destroyed.
The initial reference is dropped when the image gets torn down.
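For illustration only: a single-threaded C sketch of the reference counting
scheme described above. The struct and function names are hypothetical
stand-ins, not the rbd driver's own, and a real implementation would need
atomic operations for the counter.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the parent bookkeeping of an rbd device. */
struct parent_info_sketch {
	int      ref;       /* number of users of the parent information */
	uint64_t overlap;   /* bytes of the image backed by the parent */
	int      torn_down; /* set once the parent structures are cleaned up */
};

/* Take a reference only while the image still has a usable parent. */
static int parent_get(struct parent_info_sketch *p)
{
	if (p->overlap == 0 || p->ref == 0)
		return 0;
	p->ref++;
	return 1;
}

/* Drop a reference; the last put cleans up the parent structures. */
static void parent_put(struct parent_info_sketch *p)
{
	assert(p->ref > 0);
	if (--p->ref == 0) {
		p->overlap = 0;
		p->torn_down = 1;
	}
}
```

Because cleanup happens on the last put, dropping the initial reference
while a request is still in flight leaves the parent data intact until that
request finishes.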
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 9 May 2013 03:50:04 +0000 (22:50 -0500)]
rbd: define parent image request routines
Define rbd_parent_request_create() and rbd_parent_request_destroy()
to handle the creation of parent image requests submitted for
layered image objects. For simplicity, let rbd_img_request_put()
handle dropping the reference to any image request (parent or not),
and call whichever destructor is appropriate on the last put.
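For illustration only: a C sketch of a single put routine that selects the
destructor on the last put. All names carry a _sketch suffix because they
are hypothetical; the destructors here only record which one ran.

```c
#include <assert.h>

enum destroyed_by_sketch { NOT_DESTROYED, BY_IMG_DESTROY, BY_PARENT_DESTROY };

struct img_request_sketch {
	int ref;
	int is_parent_request; /* set for requests created for a parent image */
	enum destroyed_by_sketch destroyed_by;
};

static void img_request_destroy_sketch(struct img_request_sketch *r)
{
	r->destroyed_by = BY_IMG_DESTROY;
}

static void parent_request_destroy_sketch(struct img_request_sketch *r)
{
	r->destroyed_by = BY_PARENT_DESTROY;
}

/* One put routine for both kinds of request; the last put picks the destructor. */
static void img_request_put_sketch(struct img_request_sketch *r)
{
	assert(r->ref > 0);
	if (--r->ref > 0)
		return;
	if (r->is_parent_request)
		parent_request_destroy_sketch(r);
	else
		img_request_destroy_sketch(r);
}
```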
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 9 May 2013 03:50:04 +0000 (22:50 -0500)]
rbd: define rbd_dev_unparent()
Define rbd_dev_unparent() to encapsulate cleaning up parent data
structures from a layered rbd image.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 9 May 2013 15:08:49 +0000 (10:08 -0500)]
rbd: don't release write request until necessary
Previously when a layered write was going to involve a copyup
request, the original osd request was released before submitting the
parent full-object read. The osd request for the copyup would then
be allocated in rbd_img_obj_parent_read_full_callback().
Shortly we will be handling the event of mapped layered images
getting flattened, and when that occurs we need to resubmit the
original request. We therefore don't want to release the osd
request until we really know we're going to replace it--in the
callback function.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Mon, 6 May 2013 22:40:33 +0000 (17:40 -0500)]
rbd: get parent info on refresh
Get parent info for format 2 images on every refresh (rather than
just during the initial probe). This will be needed to detect the
disappearance of the parent image in the event a mapped image
becomes unlayered (i.e., flattened). Avoid leaking the previous
parent spec on the second and subsequent times this information is
requested by dropping the previous one (if any) before updating it.
(Also, extract the pool id into a local variable before assigning
it into the parent spec.)
Switch to using a non-zero parent overlap value rather than the
existence of a parent (a non-null parent_spec pointer) to determine
whether to mark a request layered. It will soon be possible for
a layered image to become unlayered while a request is in flight.
This means that the layered flag for an image request indicates that
there was a non-zero parent overlap at the time the image request
was created. The parent overlap can change thereafter, which may
lead to special handling at request submission or completion time.
This and the next several patches are related to:
http://tracker.ceph.com/issues/3763
NOTE:
If an error occurs while refreshing the parent info (i.e.,
requesting it after initial probe), the old parent info will
persist. This is not really correct, and is a scenario that needs
to be addressed. For now we'll assert that the failure mode is
unlikely, but the issue has been documented in tracker issue 5040.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Wei Yongjun [Mon, 13 May 2013 04:26:06 +0000 (04:26 +0000)]
bna: add missing iounmap() on error in bnad_init()
Add the missing iounmap() before return from bnad_init()
in the error handling case.
Introduced by commit 01b54b1451853593739816a392485c4e2bee7dda
(bna: tx rx cleanup fix).
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thadeu Lima de Souza Cascardo [Sat, 11 May 2013 09:15:37 +0000 (09:15 +0000)]
qlge: fix dma map leak when the last chunk is not allocated
qlge allocates chunks from a page that it maps and unmaps that page when
the last chunk is released. When the driver is unloaded or the card is
removed, all chunks are released and the page is unmapped for the last
chunk.
However, when the last chunk of a page is not allocated and the device
is removed, that page is not unmapped. In fact, its last reference is
not put and there's also a page leak. This bug prevents a device from
being properly hotplugged.
When the DMA API debug option is enabled, the following messages show
the pending DMA allocation after we remove the driver.
This patch fixes the bug by unmapping and putting the page from the ring
if its last chunk has not been allocated.
pci 0005:98:00.0: DMA-API: device driver has pending DMA allocations while released from device [count=1]
One of leaked entries details: [device address=0x0000000060a80000] [size=65536 bytes] [mapped with DMA_FROM_DEVICE] [mapped as page]
------------[ cut here ]------------
WARNING: at lib/dma-debug.c:746
Modules linked in: qlge(-) rpadlpar_io rpaphp pci_hotplug fuse [last unloaded: qlge]
NIP: c0000000003fc3ec LR: c0000000003fc3e8 CTR: c00000000054de60
REGS: c0000003ee9c74e0 TRAP: 0700 Tainted: G O (3.7.2)
MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI> CR: 28002424 XER: 00000001
SOFTE: 1
CFAR: c0000000007a39c8
TASK = c0000003ee8d5c90[8406] 'rmmod' THREAD: c0000003ee9c4000 CPU: 31
GPR00: c0000000003fc3e8 c0000003ee9c7760 c000000000c789f8 00000000000000ee
GPR04: 0000000000000000 00000000000000ef 0000000000004000 0000000000010000
GPR08: 00000000000000be c000000000b22088 c000000000c4c218 00000000007c0000
GPR12: 0000000028002422 c00000000ff26c80 0000000000000000 000001001b0f1b40
GPR16: 00000000100cb9d8 0000000010093088 c000000000cdf910 0000000000000001
GPR20: 0000000000000000 c000000000dbfc00 0000000000000000 c000000000dbfb80
GPR24: c0000003fafc9d80 0000000000000001 000000000001ff80 c0000003f38f7888
GPR28: c000000000ddfc00 0000000000000400 c000000000bd7790 c000000000ddfb80
NIP [c0000000003fc3ec] .dma_debug_device_change+0x22c/0x2b0
LR [c0000000003fc3e8] .dma_debug_device_change+0x228/0x2b0
Call Trace:
[c0000003ee9c7760] [c0000000003fc3e8] .dma_debug_device_change+0x228/0x2b0 (unreliable)
[c0000003ee9c7840] [c00000000079a098] .notifier_call_chain+0x78/0xf0
[c0000003ee9c78e0] [c0000000000acc20] .__blocking_notifier_call_chain+0x70/0xb0
[c0000003ee9c7990] [c0000000004a9580] .__device_release_driver+0x100/0x140
[c0000003ee9c7a20] [c0000000004a9708] .driver_detach+0x148/0x150
[c0000003ee9c7ac0] [c0000000004a8144] .bus_remove_driver+0xc4/0x150
[c0000003ee9c7b60] [c0000000004aa58c] .driver_unregister+0x8c/0xe0
[c0000003ee9c7bf0] [c0000000004090b4] .pci_unregister_driver+0x34/0xf0
[c0000003ee9c7ca0] [d000000002231194] .qlge_exit+0x1c/0x34 [qlge]
[c0000003ee9c7d20] [c0000000000e36d8] .SyS_delete_module+0x1e8/0x290
[c0000003ee9c7e30] [c0000000000098d4] syscall_exit+0x0/0x94
Instruction dump:
7f26cb78 e818003a e87e81a0 e8f80028 e9180030 796b1f24 78001f24 7d6a5a14
7d2a002a e94b0020 483a7595 60000000 <0fe00000> 2fb80000 40de0048 80120050
---[ end trace 4294f9abdb01031d ]---
Mapped at:
[<d000000002222f54>] .ql_update_lbq+0x384/0x580 [qlge]
[<d000000002227bd0>] .ql_clean_inbound_rx_ring+0x300/0xc60 [qlge]
[<d0000000022288cc>] .ql_napi_poll_msix+0x39c/0x5a0 [qlge]
[<c0000000006b3c50>] .net_rx_action+0x170/0x300
[<c000000000081840>] .__do_softirq+0x170/0x300
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Jitendra Kalsaria <Jitendra.kalsaria@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Mon, 6 May 2013 22:40:33 +0000 (17:40 -0500)]
rbd: ignore zero-overlap parent
An rbd clone image that has an overlap with its parent of 0 is
effectively not a layered image at all. Detect this case and treat
such an image as non-layered. Issue a warning to be sure the user
knows what's going on.
This resolves:
http://tracker.ceph.com/issues/5028
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Fri, 10 May 2013 21:29:22 +0000 (16:29 -0500)]
rbd: support reading parent page data for writes
Currently, rbd_img_obj_parent_read_full() assumes the incoming
object request contains bio data. But if a layered image is part of
a multi-layer stack of images it will result in read requests of
page data to parent images.
This is handling the same kind of issue as was resolved by this
commit:
5b2ab72d rbd: support reading parent page data
This resolves:
http://tracker.ceph.com/issues/5027
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Fri, 10 May 2013 21:29:22 +0000 (16:29 -0500)]
rbd: fix parent request size assumption
The code that reads object data from the parent for a copyup on
write request currently assumes that the size of that request is the
size of a "full" object from the original target image.
That is not necessarily the case. The parent overlap could reduce
the request size below that. To fix that assumption we need to
record the number of pages in the copyup_pages array, for both an
image request and an object request. Rename a local variable in
rbd_img_obj_parent_read_full_callback() to reflect we're recording
the length of the parent read request, not the size of the target
object.
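For illustration only: a C sketch of how the parent overlap can shorten the
parent read, and how the page count follows from the resulting length. The
helper names and the 4096-byte page size are assumptions made for the
sketch, not taken from the rbd driver.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SK 4096ull /* assumed page size for the sketch */

/*
 * Length of the parent read for a copyup of the object that starts at image
 * offset obj_start: a full object, unless the parent overlap ends sooner.
 */
static uint64_t parent_read_length(uint64_t obj_start, uint64_t obj_size,
				   uint64_t overlap)
{
	if (overlap <= obj_start)
		return 0;
	if (overlap - obj_start < obj_size)
		return overlap - obj_start;
	return obj_size;
}

/* Number of pages needed to hold `length` bytes, rounded up. */
static uint32_t copyup_page_count(uint64_t length)
{
	return (uint32_t)((length + PAGE_SIZE_SK - 1) / PAGE_SIZE_SK);
}
```

Recording the page count computed from the actual read length, rather than
assuming a full object, is what lets the pages be released correctly later.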
This resolves:
http://tracker.ceph.com/issues/5038
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Alex Elder [Thu, 9 May 2013 19:56:32 +0000 (14:56 -0500)]
libceph: init sent and completed when starting
The rbd code has a need to be able to restart an osd request that
has already been started and completed once before. This currently
wouldn't work right because the osd client code assumes an osd
request will be started exactly once. Certain fields in a request
are never cleared and this leads to trouble if you try to reuse it.
Specifically, the r_sent, r_got_reply, and r_completed fields are
never cleared. The r_sent field records the osd incarnation at the
time the request was sent to that osd. If that's non-zero, the
message won't get re-mapped to a target osd properly, and won't be
put on the unsafe requests list the first time it's sent as it
should. The r_got_reply field is used in handle_reply() to ensure
the reply to a request is processed only once. And the r_completed
field is used for lingering requests to avoid calling the callback
function every time the osd client re-sends the request on behalf of
its initiator.
Each osd request passes through ceph_osdc_start_request() when
responsibility for the request is handed over to the osd client for
completion. We can safely zero these three fields there each time a
request gets started.
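For illustration only: a C sketch of resetting per-submission state each
time a request is started, so that a request can be submitted more than
once. The struct is a simplified stand-in holding only the three fields
named above, and start_request_sketch() is hypothetical, not the real
ceph_osdc_start_request().

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for an osd request. */
struct osd_request_sketch {
	uint32_t r_sent;      /* osd incarnation when sent, 0 if not yet sent */
	int      r_got_reply; /* reply has already been processed */
	int      r_completed; /* callback has already been invoked */
};

/* Clear state left over from an earlier submission before starting again. */
static void start_request_sketch(struct osd_request_sketch *req)
{
	req->r_sent = 0;
	req->r_got_reply = 0;
	req->r_completed = 0;
	/* ... a real client would go on to map and queue the request ... */
}
```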
One last related change--clear the r_linger flag when a request
is no longer registered as a linger request.
This resolves:
http://tracker.ceph.com/issues/5026
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Stefano Stabellini [Wed, 8 May 2013 11:59:01 +0000 (11:59 +0000)]
xen/arm: rename xen_secondary_init and run it on every online cpu
Rename xen_secondary_init to xen_percpu_init.
Run xen_percpu_init on each online cpu, reusing the current on_each_cpu call.
Merge xen_percpu_enable_events into xen_percpu_init.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Stefano Stabellini [Wed, 8 May 2013 13:02:38 +0000 (13:02 +0000)]
xen/arm: do not handle VCPUOP_register_vcpu_info failures
We expect VCPUOP_register_vcpu_info to succeed, so do not try to handle
failures.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Stefano Stabellini [Wed, 8 May 2013 11:59:01 +0000 (11:59 +0000)]
xen/arm: initialize pm functions later
If we are running in dom0, we have to wait for the arch specific code to
complete the initialization in order for us to successfully reset the
power_off and pm_restart functions.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Linus Torvalds [Mon, 13 May 2013 15:12:18 +0000 (08:12 -0700)]
Merge tag 'spi-v3.10-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"A few driver specific fixes plus improved error handling in the
generic DT GPIO chipselect handling - not exciting but useful."
* tag 'spi-v3.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi/spi-atmel: BUG: fix doesn' support 16 bits transfers using PIO
spi/davinci: fix module build error
spi: Return error from of_spi_register_master on bad "cs-gpios" property
spi: Initialize cs_gpio and cs_gpios with -ENOENT
spi/atmel: fix speed_hz check in atmel_spi_transfer()
Linus Torvalds [Mon, 13 May 2013 14:59:59 +0000 (07:59 -0700)]
Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Just a few straggling fixes I hoovered up, and an intel fixes pull
from Daniel which fixes some regressions, and some mgag200 fixes from
Matrox."
* 'drm-next' of git://people.freedesktop.org/~airlied/linux:
drm/mgag200: Fix framebuffer base address programming
drm/mgag200: Convert counter delays to jiffies
drm/mgag200: Fix writes into MGA1064_PIX_CLK_CTL register
drm/mgag200: Don't change unrelated registers during modeset
drm: Only print a debug message when the polled connector has changed
drm: Make the HPD status updates debug logs more readable
drm: Use names of ioctls in debug traces
drm: Remove pointless '-' characters from drm_fb_helper documentation
drm: Add kernel-doc for drm_fb_helper_funcs->initial_config
drm: refactor call to request_module
drm: Don't prune modes loudly when a connector is disconnected
drm: Add missing break in the command line mode parsing code
drm/i915: clear the stolen fb before resuming
Revert "drm/i915: Calculate correct stolen size for GEN7+"
drm/i915: hsw: fix link training for eDP on port-A
Revert "drm/i915: revert eDP bpp clamping code changes"
drm: don't check modeset locks in panic handler
drm/i915: Fix pipe enabled mask for pipe C in WM calculations
drm/mm: fix dump table BUG
drm/i915: Always normalize return timeout for wait_timeout_ioctl
Linus Torvalds [Mon, 13 May 2013 14:59:08 +0000 (07:59 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux
Pull virtio/lguest fixes from Rusty Russell:
"Missing license tag and some fallout from the lguest pagetable rework"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
lguest: clear cached last cpu when guest_set_pgd() called.
Add missing module license tag to vring helpers.
Mark Brown [Mon, 13 May 2013 14:27:18 +0000 (18:27 +0400)]
Merge remote-tracking branch 'spi/fix/grant' into spi-linus
Mark Brown [Mon, 13 May 2013 14:27:16 +0000 (18:27 +0400)]
Merge remote-tracking branch 'spi/fix/atmel' into spi-linus
Jan Kara [Mon, 13 May 2013 13:45:01 +0000 (09:45 -0400)]
jbd,jbd2: fix oops in jbd2_journal_put_journal_head()
Commit ae4647fb (jbd2: reduce journal_head size) introduced a
regression where we occasionally hit a panic in
jbd2_journal_put_journal_head() because of a wrong b_jcount. The bug is
caused by gcc making a 64-bit access to a 32-bit bitfield and thus
clobbering b_jcount.
At least for now, those 8 bytes saved in struct journal_head are not
worth the trouble with gcc bitfield handling so revert that part of
the patch.
Reported-by: EUNBONG SONG <eunb.song@samsung.com>
Reported-by: Tony Luck <tony.luck@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Christopher Harvey [Wed, 8 May 2013 19:10:38 +0000 (19:10 +0000)]
drm/mgag200: Fix framebuffer base address programming
Higher bits of the base address of framebuffers weren't being
programmed properly. This caused framebuffers that didn't happen to be
allocated at a low enough address to not be displayed properly.
Signed-off-by: Christopher Harvey <charvey@matrox.com>
Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Acked-by: Julia Lemire <jlemire@matrox.com>
Tested-by: Julia Lemire <jlemire@matrox.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Christopher Harvey [Mon, 6 May 2013 15:56:17 +0000 (15:56 +0000)]
drm/mgag200: Convert counter delays to jiffies
Signed-off-by: Christopher Harvey <charvey@matrox.com>
Acked-by: Julia Lemire <jlemire@matrox.com>
Tested-by: Julia Lemire <jlemire@matrox.com>
Acked-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Christopher Harvey [Fri, 12 Apr 2013 22:24:05 +0000 (22:24 +0000)]
drm/mgag200: Fix writes into MGA1064_PIX_CLK_CTL register
The original line,
WREG_DAC(MGA1064_PIX_CLK_CTL_CLK_DIS, tmp);
wrote tmp into MGA1064_PIX_CLK_CTL_CLK_DIS, where
MGA1064_PIX_CLK_CTL_CLK_DIS is an offset into
MGA1064_PIX_CLK_CTL. Change the line to write properly into
MGA1064_PIX_CLK_CTL. There were other chunks of code nearby that use
the same pattern (but work correctly), so this patch updates them all
to use this new (slightly more efficient) write pattern. The WREG_DAC
macro was causing the DAC_INDEX register to be set to the same value
twice. WREG8(DAC_DATA, foo) takes advantage of the fact that DAC_INDEX
is already at the value we want.
Signed-off-by: Christopher Harvey <charvey@matrox.com>
Acked-by: Julia Lemire <jlemire@matrox.com>
Tested-by: Julia Lemire <jlemire@matrox.com>
Acked-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
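The index/data write pattern this commit describes can be sketched against a simulated register file. Everything here is an assumption for illustration: struct demo_dac, the demo_* helpers and the DEMO_* constants are hypothetical and do not reproduce the mgag200 macros or register values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PIX_CLK_CTL	0x1a	/* hypothetical register index */
#define DEMO_CLK_DIS		0x04	/* hypothetical bit inside that register */

/* Simulated indexed register file (not the real hardware interface). */
struct demo_dac {
	uint8_t index;			/* models the index register */
	uint8_t regs[256];		/* registers reached through the data port */
	unsigned int index_writes;	/* counts writes to the index register */
};

static void demo_write_index(struct demo_dac *dac, uint8_t idx)
{
	dac->index = idx;
	dac->index_writes++;
}

static uint8_t demo_read_data(const struct demo_dac *dac)
{
	return dac->regs[dac->index];
}

static void demo_write_data(struct demo_dac *dac, uint8_t val)
{
	dac->regs[dac->index] = val;
}

/* Read-modify-write: select the register once, then read and write
 * through the data port while the index still points at it. */
static void demo_set_bits(struct demo_dac *dac, uint8_t reg, uint8_t bits)
{
	uint8_t tmp;

	assert(dac != NULL);

	demo_write_index(dac, reg);
	tmp = demo_read_data(dac);
	tmp |= bits;
	demo_write_data(dac, tmp);
}
```

The bug class fixed above corresponds to passing the bit mask where the register index belongs; keeping the index and the bits as separate parameters makes that mistake easier to spot.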
Christopher Harvey [Fri, 12 Apr 2013 20:42:19 +0000 (20:42 +0000)]
drm/mgag200: Don't change unrelated registers during modeset
Registers in indices below 0x18 are totally unrelated to modesetting,
so don't write 0's, or anything else into them on modeset. Most of
these registers are hardware cursor related, so this existing code
interferes with hardware cursor development.
Signed-off-by: Christopher Harvey <charvey@matrox.com>
Tested-by: Julia Lemire <jlemire@matrox.com>
Acked-by: Julia Lemire <jlemire@matrox.com>
Acked-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lespiau, Damien [Fri, 10 May 2013 12:36:44 +0000 (12:36 +0000)]
drm: Only print a debug message when the polled connector has changed
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lespiau, Damien [Fri, 10 May 2013 12:36:42 +0000 (12:36 +0000)]
drm: Make the HPD status updates debug logs more readable
Instead of just printing "status updated from 1 to 2", make those enum
numbers immediately readable.
v2: Also patch output_poll_execute() (Daniel Vetter)
v3: Use drm_get_connector_status_name (Ville Syrjälä)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (for v1)
Signed-off-by: Dave Airlie <airlied@redhat.com>
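A small stand-alone sketch of the idea: map the status enum to a name before logging it. The enum, demo_status_name() and demo_format_update() are hypothetical; the commit itself uses drm_get_connector_status_name(), whose implementation is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status values, chosen only for this illustration. */
enum demo_status {
	DEMO_STATUS_CONNECTED = 1,
	DEMO_STATUS_DISCONNECTED = 2,
	DEMO_STATUS_UNKNOWN = 3,
};

/* Translate a status value into a human-readable name. */
static const char *demo_status_name(enum demo_status status)
{
	switch (status) {
	case DEMO_STATUS_CONNECTED:
		return "connected";
	case DEMO_STATUS_DISCONNECTED:
		return "disconnected";
	default:
		return "unknown";
	}
}

/* Format the log line with names instead of raw enum numbers.
 * Returns what snprintf() returns. */
static int demo_format_update(char *buf, size_t len,
			      enum demo_status old_status,
			      enum demo_status new_status)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len, "status updated from %s to %s",
			demo_status_name(old_status),
			demo_status_name(new_status));
}
```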
Thomas Gleixner [Fri, 3 May 2013 13:02:50 +0000 (15:02 +0200)]
tick: Cleanup NOHZ per cpu data on cpu down
Prarit reported a crash on CPU offline/online. The reason is that on
CPU down the NOHZ related per cpu data of the dead cpu is not cleaned
up. If at cpu online an interrupt happens before the per cpu tick
device is registered the irq_enter() check potentially sees stale data
and dereferences a NULL pointer.
Clean up the data after the cpu is dead.
Reported-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Cc: Mike Galbraith <bitbucket@online.de>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1305031451561.2886@ionos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cong Wang [Thu, 9 May 2013 22:40:00 +0000 (22:40 +0000)]
xfrm6: release dev before returning error
We forget to call dev_put() on the error path in xfrm6_fill_dst(),
and its caller doesn't handle this.
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
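The pattern behind this fix, dropping a reference that was taken earlier in the same function before returning an error, can be sketched in user-space C. The types and helpers below (demo_dev, demo_dst, demo_fill_dst() and its lookup_ok parameter) are hypothetical and only mirror the shape of the problem, not the xfrm6 code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical refcounted device. */
struct demo_dev {
	int refcnt;
};

struct demo_dst {
	struct demo_dev *dev;
};

static void demo_dev_hold(struct demo_dev *dev)
{
	dev->refcnt++;
}

static void demo_dev_put(struct demo_dev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}

/* Takes a reference on dev. If a later step fails, the reference is
 * dropped here, because the caller does not clean up after a failure.
 * lookup_ok stands in for whatever step can fail. */
static int demo_fill_dst(struct demo_dst *dst, struct demo_dev *dev,
			 int lookup_ok)
{
	demo_dev_hold(dev);
	dst->dev = dev;

	if (!lookup_ok) {
		demo_dev_put(dev);	/* release before returning the error */
		dst->dev = NULL;
		return -ENODEV;
	}

	return 0;
}
```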
Amerigo Wang [Thu, 9 May 2013 21:56:37 +0000 (21:56 +0000)]
ipv6,gre: do not leak info to user-space
There is a hole in struct ip6_tnl_parm2, so we have to
zero the struct on stack before copying it to user-space.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
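A minimal sketch of the technique, assuming a hypothetical struct demo_parm with padding between its members and using memcpy() into a caller buffer as a stand-in for the copy to user space. It is not the ip6_tnl_parm2 code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: on typical ABIs there are padding bytes
 * between 'proto' and 'link'. */
struct demo_parm {
	char proto;
	int link;
};

/* Build the struct on the stack and copy it out as raw bytes.
 * The memset() clears the padding as well, so no stale stack
 * contents travel along with the named fields. */
static void demo_fill_parm(void *out, char proto, int link)
{
	struct demo_parm p;

	assert(out != NULL);

	memset(&p, 0, sizeof(p));
	p.proto = proto;
	p.link = link;

	memcpy(out, &p, sizeof(p));	/* stand-in for the copy to user space */
}
```

Initializing only the named members would leave the padding bytes holding whatever was on the stack before, which is the leak the commit closes.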
Amerigo Wang [Thu, 9 May 2013 19:50:51 +0000 (19:50 +0000)]
virtio_net: use default napi weight by default
Since commit 82dc3c63c692b1e1d5937 ("net: introduce NAPI_POLL_WEIGHT")
we warn drivers when they use a napi weight higher than NAPI_POLL_WEIGHT,
but virtio_net still uses 128 by default. This patch sets its default
value to NAPI_POLL_WEIGHT.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petri Gynther [Thu, 9 May 2013 16:50:00 +0000 (16:50 +0000)]
emac: Fix EMAC soft reset on 460EX/GT
Fix EMAC soft reset on 460EX/GT to select the right PHY clock source
before and after the soft reset.
EMAC with PHY should use the clock from PHY during soft reset.
EMAC without PHY should use the internal clock during soft reset.
PPC460EX/GT Embedded Processor Advanced User's Manual
section 28.10.1 Mode Register 0 (EMACx_MR0) states:
Note: The PHY must provide a TX Clk in order to perform a soft reset
of the EMAC. If none is present, select the internal clock
(SDR0_ETH_CFG[EMACx_PHY_CLK] = 1).
After a soft reset, select the external clock.
Without the fix, 460EX/GT-based boards with RGMII PHYs attached to
EMACs experience an EMAC interrupt storm and a system watchdog reset when
issuing "ifconfig eth0 down" + "ifconfig eth0 up" a few times.
The system enters an endless loop of serving emac_irq() with the EMACx_ISR
register stuck at value 0x10000000 (Rx parity error).
With the fix, the above issue is no longer observed.
Signed-off-by: Petri Gynther <pgynther@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
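The clock selection rule quoted from the manual can be sketched as a tiny state model. This is an illustration under stated assumptions only: struct demo_emac, enum demo_clk and demo_soft_reset() are hypothetical, and no real register access is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clock sources. */
enum demo_clk {
	DEMO_CLK_EXTERNAL = 0,	/* clock provided by the PHY */
	DEMO_CLK_INTERNAL = 1,	/* internal clock */
};

struct demo_emac {
	bool has_phy;			/* a PHY providing a TX clock is attached */
	enum demo_clk clk;		/* currently selected clock */
	enum demo_clk clk_during_reset;	/* recorded for illustration only */
};

/* Select the clock for the reset, perform the (simulated) reset,
 * then switch back to the external clock. */
static void demo_soft_reset(struct demo_emac *emac)
{
	assert(emac != NULL);

	/* Before the reset: PHY clock if a PHY exists, else internal. */
	emac->clk = emac->has_phy ? DEMO_CLK_EXTERNAL : DEMO_CLK_INTERNAL;

	/* The soft reset itself would happen here. */
	emac->clk_during_reset = emac->clk;

	/* After the reset: select the external clock. */
	emac->clk = DEMO_CLK_EXTERNAL;
}
```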
Sergei Shtylyov [Thu, 9 May 2013 11:14:07 +0000 (11:14 +0000)]
3c59x: fix PCI resource management
The driver wrongly claimed I/O ports at an address returned by pci_iomap() --
even if it was passed an MMIO address. Fix this by claiming/releasing all PCI
resources in the PCI driver's probe()/remove() methods instead and get rid of
'must_free_region' flag weirdness (why would Cardbus claim anything for us?).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 12 May 2013 00:14:08 +0000 (17:14 -0700)]
Linux 3.10-rc1
Linus Torvalds [Sun, 12 May 2013 00:04:59 +0000 (17:04 -0700)]
Merge tag 'trace-fixes-v3.10' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing/kprobes update from Steven Rostedt:
"The majority of these changes are from Masami Hiramatsu bringing
kprobes up to par with the latest changes to ftrace (multi buffering
and the new function probes).
He also discovered and fixed some bugs in doing so. When pulling in
his patches, I also found a few minor bugs as well and fixed them.
This also includes a compile fix for some archs that select the ring
buffer but not tracing.
I based this off of the last patch you took from me that fixed the
merge conflict error, as that was the commit that had all the changes
I needed for this set of changes."
* tag 'trace-fixes-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Support soft-mode disabling
tracing/kprobes: Support ftrace_event_file base multibuffer
tracing/kprobes: Pass trace_probe directly from dispatcher
tracing/kprobes: Increment probe hit-count even if it is used by perf
tracing/kprobes: Use bool for retprobe checker
ftrace: Fix function probe when more than one probe is added
ftrace: Fix the output of enabled_functions debug file
ftrace: Fix locking in register_ftrace_function_probe()
tracing: Add helper function trace_create_new_event() to remove duplicate code
tracing: Modify soft-mode only if there's no other referrer
tracing: Indicate enabled soft-mode in enable file
tracing/kprobes: Fix to increment return event probe hit-count
ftrace: Cleanup regex_lock and ftrace_lock around hash updating
ftrace, kprobes: Fix a deadlock on ftrace_regex_lock
ftrace: Have ftrace_regex_write() return either read or error
tracing: Return error if register_ftrace_function_probe() fails for event_enable_func()
tracing: Don't succeed if event_enable_func did not register anything
ring-buffer: Select IRQ_WORK