openwrt/staging/blogic.git
7 years agoMerge branch 'topic/ppc-kvm' into next
Michael Ellerman [Fri, 20 Oct 2017 00:10:30 +0000 (11:10 +1100)]
Merge branch 'topic/ppc-kvm' into next

Bring in some KVM commits we need (the TM one in particular).

7 years agoKVM: PPC: Tie KVM_CAP_PPC_HTM to the user-visible TM feature
Michael Ellerman [Thu, 12 Oct 2017 11:58:54 +0000 (22:58 +1100)]
KVM: PPC: Tie KVM_CAP_PPC_HTM to the user-visible TM feature

Currently we use CPU_FTR_TM to decide if the CPU/kernel can support
TM (Transactional Memory), and if it's true we advertise that to
Qemu (or similar) via KVM_CAP_PPC_HTM.

PPC_FEATURE2_HTM is the user-visible feature bit, which indicates that
the CPU and kernel can support TM. Currently CPU_FTR_TM and
PPC_FEATURE2_HTM always have the same value, either true or false, so
using the former for KVM_CAP_PPC_HTM is correct.

However some Power9 CPUs can operate in a mode where TM is enabled but
TM suspended state is disabled. In this mode CPU_FTR_TM is true, but
PPC_FEATURE2_HTM is false. Instead a different PPC_FEATURE2 bit is
set, to indicate that this different mode of TM is available.

It is not safe to let guests use TM as-is, when the CPU is in this
mode. So to prevent that from happening, use PPC_FEATURE2_HTM to
determine the value of KVM_CAP_PPC_HTM.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoRevert "KVM: PPC: Book3S HV: POWER9 does not require secondary thread management"
Paul Mackerras [Thu, 19 Oct 2017 04:14:20 +0000 (15:14 +1100)]
Revert "KVM: PPC: Book3S HV: POWER9 does not require secondary thread management"

This reverts commit 94a04bc25a2c6296bd0c5e82c10e8231c2b11f77.

In order to run HPT guests on a radix POWER9 host, we will have to run
the host in single-threaded mode, because POWER9 processors do not
currently support running some threads of a core in HPT mode while
others are in radix mode ("mixed mode").

That means that we will need the same mechanisms that are used on
POWER8 to make the secondary threads available to KVM, which were
disabled on POWER9 by commit 94a04bc25a2c.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/vphn: Fix numa update end-loop bug
Michael Bringmann [Fri, 8 Sep 2017 20:47:56 +0000 (15:47 -0500)]
powerpc/vphn: Fix numa update end-loop bug

powerpc/vphn: On Power systems with shared configurations of CPUs
and memory, there are some issues with the association of additional
CPUs and memory to nodes when hot-adding resources.  This patch
fixes an end-of-updates processing problem observed occasionally
in numa_update_cpu_topology().

Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/hotplug: Improve responsiveness of hotplug change
Michael Bringmann [Fri, 8 Sep 2017 20:47:47 +0000 (15:47 -0500)]
powerpc/hotplug: Improve responsiveness of hotplug change

powerpc/hotplug: On Power systems with shared configurations of CPUs
and memory, there are some issues with the association of additional
CPUs and memory to nodes when hot-adding resources.  During hotplug
CPU operations, this patch resets the timer on topology update work
function to a small value to better ensure that the CPU topology is
detected and configured sooner.

Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/vphn: Improve recognition of PRRN/VPHN
Michael Bringmann [Fri, 8 Sep 2017 20:47:36 +0000 (15:47 -0500)]
powerpc/vphn: Improve recognition of PRRN/VPHN

powerpc/vphn: On Power systems with shared configurations of CPUs
and memory, there are some issues with the association of additional
CPUs and memory to nodes when hot-adding resources.  This patch
updates the initialization checks to independently recognize PRRN
or VPHN support.

Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/vphn: Update CPU topology when VPHN enabled
Michael Bringmann [Fri, 8 Sep 2017 20:47:27 +0000 (15:47 -0500)]
powerpc/vphn: Update CPU topology when VPHN enabled

powerpc/vphn: On Power systems with shared configurations of CPUs
and memory, there are some issues with the association of additional
CPUs and memory to nodes when hot-adding resources.  This patch
corrects the currently broken capability to set the topology for
shared CPUs in LPARs.  At boot time for shared CPU lpars, the
topology for each CPU was being set to node zero.  Now when
numa_update_cpu_topology() is called appropriately, the Virtual
Processor Home Node (VPHN) capabilities information provided by the
pHyp allows the appropriate node in the shared configuration to be
selected for the CPU.

Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/mce: hookup memory_failure for UE errors
Balbir Singh [Fri, 29 Sep 2017 04:26:55 +0000 (14:26 +1000)]
powerpc/mce: hookup memory_failure for UE errors

If we are in user space and hit a UE error, we now have the
basic infrastructure to walk the page tables and find out
the effective address that was accessed, since the DAR
is not valid.

We use a work_queue content to hookup the bad pfn, any
other context causes problems, since memory_failure itself
can call into schedule() via lru_drain_ bits.

We could probably poison the struct page to avoid a race
between detection and taking corrective action.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/mce: Hookup ierror (instruction) UE errors
Balbir Singh [Fri, 29 Sep 2017 04:26:54 +0000 (14:26 +1000)]
powerpc/mce: Hookup ierror (instruction) UE errors

Hookup instruction errors (UE) for memory offling via memory_failure()
in a manner similar to load/store errors (derror). Since we have access
to the NIP, the conversion is a one step process in this case.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/mce: Hookup derror (load/store) UE errors
Balbir Singh [Fri, 29 Sep 2017 04:26:53 +0000 (14:26 +1000)]
powerpc/mce: Hookup derror (load/store) UE errors

Extract physical_address for UE errors by walking the page
tables for the mm and address at the NIP, to extract the
instruction. Then use the instruction to find the effective
address via analyse_instr().

We might have page table walking races, but we expect them to
be rare, the physical address extraction is best effort. The idea
is to then hook up this infrastructure to memory failure eventually.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/mce: Align the print of physical address better
Balbir Singh [Fri, 29 Sep 2017 04:26:52 +0000 (14:26 +1000)]
powerpc/mce: Align the print of physical address better

Use the same alignment as Effective address.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/mce: Remove unused function get_mce_fault_addr()
Balbir Singh [Fri, 29 Sep 2017 04:26:51 +0000 (14:26 +1000)]
powerpc/mce: Remove unused function get_mce_fault_addr()

There are no users of get_mce_fault_addr() since commit
1363875bdb63 ("powerpc/64s: fix handling of non-synchronous machine
checks") removed the last usage.

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/modules: Use WARN_ON() in stub_for_addr()
Kamalesh Babulal [Tue, 10 Oct 2017 14:47:32 +0000 (20:17 +0530)]
powerpc/modules: Use WARN_ON() in stub_for_addr()

Use WARN_ON(), while running out of stubs in stub_for_addr()
and abort loading of the module instead of BUG_ON().

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: Dump PSL_FIR register on PSL9 error irq
Vaibhav Jain [Wed, 11 Oct 2017 06:14:41 +0000 (11:44 +0530)]
cxl: Dump PSL_FIR register on PSL9 error irq

For PSL9 currently we aren't dumping the PSL FIR register when a
PSL error interrupt is triggered. Contents of this register are useful
in debugging AFU issues.

This patch fixes issue by adding a new service_layer_ops callback
cxl_native_err_irq_dump_regs_psl9() to dump the PSL_FIR registers on a
PSL error interrupt thereby bringing the behavior in line with PSL on
POWER-8. Also the existing service_layer_ops callback
for PSL8 has been renamed to cxl_native_err_irq_dump_regs_psl8().

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: context_switch: Fix pthread errors
Cyril Bur [Tue, 4 Jul 2017 01:21:15 +0000 (11:21 +1000)]
selftests/powerpc: context_switch: Fix pthread errors

Turns out pthreads returns an errno and doesn't set errno. This doesn't
play well with perror().

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: Rename register PSL9_FIR2 to PSL9_FIR_MASK
Vaibhav Jain [Mon, 9 Oct 2017 17:56:27 +0000 (23:26 +0530)]
cxl: Rename register PSL9_FIR2 to PSL9_FIR_MASK

PSL9 doesn't have a FIR2 register as was the case with PSL8. However
currently the register definitions in 'cxl.h' have a definition for
PSL9_FIR2 that actually points to PSL9_FIR_MASK register in the P1
area at offset 0x308.

So this patch renames the def PSL9_FIR2 to PSL9_FIR_MASK and updates
the references in the code to point to the new identifier. It also
removes the code to dump contents of FIR2 (FIR_MASK actually) in
cxl_native_irq_dump_regs_psl9().

Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: Add support for POWER9 DD2
Christophe Lombard [Fri, 8 Sep 2017 13:52:11 +0000 (15:52 +0200)]
cxl: Add support for POWER9 DD2

The PSL initialization sequence has been updated to DD2.
This patch adapts to the changes, retaining compatibility with DD1.
The patch includes some changes to DD1 fix-ups as well.

Tests performed on some of the old/new hardware.

The function is_page_fault(), for POWER9, lists the Translation Checkout
Responses where the page fault will be handled by copro_handle_mm_fault().
This list is too restrictive and not necessary.

This patches removes this restriction and all page faults, whatever the
reason, will be handled. In this case, the interruption is always
acknowledged.

The following features will be added soon:
- phb reset when switching to capi mode.
- cxllib update to support new functions.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: get_wchan(): solve possible race scenario due to parallel wakeup
Kautuk Consul [Tue, 19 Apr 2016 10:18:21 +0000 (15:48 +0530)]
powerpc: get_wchan(): solve possible race scenario due to parallel wakeup

Add a check for p->state == TASK_RUNNING so that any wake-ups on
task_struct p in the interim lead to 0 being returned by get_wchan().

Signed-off-by: Kautuk Consul <kautuk.consul.1980@gmail.com>
[mpe: Confirmed other architectures do similar]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoselftests/powerpc: Use snprintf to construct DSCR sysfs interface paths
Seth Forshee [Thu, 28 Sep 2017 13:34:26 +0000 (09:34 -0400)]
selftests/powerpc: Use snprintf to construct DSCR sysfs interface paths

Currently sprintf is used, and while paths should never exceed
the size of the buffer it is theoretically possible since
dirent.d_name is 256 bytes. As a result this trips
-Wformat-overflow, and since the test is built with -Wall -Werror
the causes the build to fail. Switch to using snprintf and skip
any paths which are too long for the filename buffer.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Always initialize input array when calling epapr_hypercall()
Seth Forshee [Thu, 28 Sep 2017 13:33:39 +0000 (09:33 -0400)]
powerpc: Always initialize input array when calling epapr_hypercall()

Several callers to epapr_hypercall() pass an uninitialized stack
allocated array for the input arguments, presumably because they
have no input arguments. However this can produce errors like
this one

 arch/powerpc/include/asm/epapr_hcalls.h:470:42: error: 'in' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  unsigned long register r3 asm("r3") = in[0];
                                        ~~^~~

Fix callers to this function to always zero-initialize the input
arguments array to prevent this.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc: Add PPC_EMULATED_STATS to powernv_defconfig
Michael Neuling [Thu, 21 Sep 2017 05:24:49 +0000 (15:24 +1000)]
powerpc: Add PPC_EMULATED_STATS to powernv_defconfig

This is useful, especially for developers.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/xmon: Add option to show uptime information
Guilherme G. Piccoli [Mon, 18 Sep 2017 14:16:58 +0000 (11:16 -0300)]
powerpc/xmon: Add option to show uptime information

It might be useful to quickly get the uptime of a running system on
xmon, without needing to grab data from memory and doing math on
struct addresses.

For example, it'd be useful to check for how long after a crash a
system is on xmon shell or if some test was started after the first
test crashed (and this 2nd test crashed too into xmon).

This small patch adds the 'U' command, to accomplish this.

Suggested-by: Murilo Fossa Vicentini <muvic@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
[mpe: Display units (seconds), add sync()/__delay() sequence]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Make opal_event_shutdown() callable from IRQ context
Michael Ellerman [Fri, 29 Sep 2017 03:58:02 +0000 (13:58 +1000)]
powerpc/powernv: Make opal_event_shutdown() callable from IRQ context

In opal_event_shutdown() we free all the IRQs hanging off the
opal_event_irqchip. However it's not safe to do so if we're called
from IRQ context, because free_irq() wants to synchronise versus IRQ
context. This can lead to warnings and a stuck system.

For example from sysrq-b:

  Trying to free IRQ 17 from IRQ context!
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 0 at kernel/irq/manage.c:1461 __free_irq+0x398/0x8d0
  ...
  NIP __free_irq+0x398/0x8d0
  LR __free_irq+0x394/0x8d0
  Call Trace:
    __free_irq+0x394/0x8d0 (unreliable)
    free_irq+0xa4/0x140
    opal_event_shutdown+0x128/0x180
    opal_shutdown+0x1c/0xb0
    pnv_shutdown+0x20/0x40
    machine_restart+0x38/0x90
    emergency_restart+0x28/0x40
    sysrq_handle_reboot+0x24/0x40
    __handle_sysrq+0x198/0x590
    hvc_poll+0x48c/0x8c0
    hvc_handle_interrupt+0x1c/0x50
    __handle_irq_event_percpu+0xe8/0x6e0
    handle_irq_event_percpu+0x34/0xe0
    handle_irq_event+0xc4/0x210
    handle_level_irq+0x250/0x770
    generic_handle_irq+0x5c/0xa0
    opal_handle_events+0x11c/0x240
    opal_interrupt+0x38/0x50
    __handle_irq_event_percpu+0xe8/0x6e0
    handle_irq_event_percpu+0x34/0xe0
    handle_irq_event+0xc4/0x210
    handle_fasteoi_irq+0x174/0xa10
    generic_handle_irq+0x5c/0xa0
    __do_irq+0xbc/0x4e0
    call_do_irq+0x14/0x24
    do_IRQ+0x18c/0x540
    hardware_interrupt_common+0x158/0x180

We can avoid that by using disable_irq_nosync() rather than
free_irq(). Although it doesn't fully free the IRQ, it should be
sufficient when we're shutting down, particularly in an emergency.

Add an in_interrupt() check and use free_irq() when we're shutting
down normally. It's probably OK to use disable_irq_nosync() in that
case too, but for now it's safer to leave that behaviour as-is.

Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events")
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/jprobes: Validate break handler invocation as being due to a jprobe_return()
Naveen N. Rao [Fri, 22 Sep 2017 09:10:48 +0000 (14:40 +0530)]
powerpc/jprobes: Validate break handler invocation as being due to a jprobe_return()

Fix a circa 2005 FIXME by implementing a check to ensure that we
actually got into the jprobe break handler() due to the trap in
jprobe_return().

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/jprobes: Disable preemption when triggered through ftrace
Naveen N. Rao [Fri, 22 Sep 2017 09:10:47 +0000 (14:40 +0530)]
powerpc/jprobes: Disable preemption when triggered through ftrace

KPROBES_SANITY_TEST throws the below splat when CONFIG_PREEMPT is
enabled:

  Kprobe smoke test: started
  DEBUG_LOCKS_WARN_ON(val > preempt_count())
  ------------[ cut here ]------------
  WARNING: CPU: 19 PID: 1 at kernel/sched/core.c:3094 preempt_count_sub+0xcc/0x140
  Modules linked in:
  CPU: 19 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc7-nnr+ #97
  task: c0000000fea80000 task.stack: c0000000feb00000
  NIP:  c00000000011d3dc LR: c00000000011d3d8 CTR: c000000000a090d0
  REGS: c0000000feb03400 TRAP: 0700   Not tainted  (4.13.0-rc7-nnr+)
  MSR:  8000000000021033 <SF,ME,IR,DR,RI,LE>  CR: 28000282  XER: 00000000
  CFAR: c00000000015aa18 SOFTE: 0
  <snip>
  NIP preempt_count_sub+0xcc/0x140
  LR  preempt_count_sub+0xc8/0x140
  Call Trace:
    preempt_count_sub+0xc8/0x140 (unreliable)
    kprobe_handler+0x228/0x4b0
    program_check_exception+0x58/0x3b0
    program_check_common+0x16c/0x170
    --- interrupt: 0 at kprobe_target+0x8/0x20
                     LR = init_test_probes+0x248/0x7d0
    kp+0x0/0x80 (unreliable)
    livepatch_handler+0x38/0x74
    init_kprobes+0x1d8/0x208
    do_one_initcall+0x68/0x1d0
    kernel_init_freeable+0x298/0x374
    kernel_init+0x24/0x160
    ret_from_kernel_thread+0x5c/0x70
  Instruction dump:
  419effdc 3d22001b 39299240 81290000 2f890000 409effc8 3c82ffcb 3c62ffcb
  3884bc68 3863bc18 4803d5fd 60000000 <0fe000004bffffa8 60000000 60000000
  ---[ end trace 432dd46b4ce3d29f ]---
  Kprobe smoke test: passed successfully

The issue is that we aren't disabling preemption in
kprobe_ftrace_handler(). Disable it.

Fixes: ead514d5fb30a0 ("powerpc/kprobes: Add support for KPROBES_ON_FTRACE")
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Trim oops a little for formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/kprobes: Fix warnings from __this_cpu_read() on preempt kernels
Naveen N. Rao [Fri, 22 Sep 2017 09:10:46 +0000 (14:40 +0530)]
powerpc/kprobes: Fix warnings from __this_cpu_read() on preempt kernels

Kamalesh pointed out that we are getting the below call traces with
livepatched functions when we enable CONFIG_PREEMPT:

[  495.470721] BUG: using __this_cpu_read() in preemptible [00000000] code: cat/8394
[  495.471167] caller is is_current_kprobe_addr+0x30/0x90
[  495.471171] CPU: 4 PID: 8394 Comm: cat Tainted: G              K 4.13.0-rc7-nnr+ #95
[  495.471173] Call Trace:
[  495.471178] [c00000008fd9b960] [c0000000009f039c] dump_stack+0xec/0x160 (unreliable)
[  495.471184] [c00000008fd9b9a0] [c00000000059169c] check_preemption_disabled+0x15c/0x170
[  495.471187] [c00000008fd9ba30] [c000000000046460] is_current_kprobe_addr+0x30/0x90
[  495.471191] [c00000008fd9ba60] [c00000000004e9a0] ftrace_call+0x1c/0xb8
[  495.471195] [c00000008fd9bc30] [c000000000376fd8] seq_read+0x238/0x5c0
[  495.471199] [c00000008fd9bcd0] [c0000000003cfd78] proc_reg_read+0x88/0xd0
[  495.471203] [c00000008fd9bd00] [c00000000033e5d4] __vfs_read+0x44/0x1b0
[  495.471206] [c00000008fd9bd90] [c0000000003402ec] vfs_read+0xbc/0x1b0
[  495.471210] [c00000008fd9bde0] [c000000000342138] SyS_read+0x68/0x110
[  495.471214] [c00000008fd9be30] [c00000000000bc6c] system_call+0x58/0x6c

Commit c05b8c4474c030 ("powerpc/kprobes: Skip livepatch_handler() for
jprobes") introduced a helper is_current_kprobe_addr() to help determine
if the current function has been livepatched or if it has a jprobe
installed, both of which modify the NIP. This was subsequently renamed
to __is_active_jprobe().

In the case of a jprobe, kprobe_ftrace_handler() disables pre-emption
before calling into setjmp_pre_handler() which returns without disabling
pre-emption. This is done to ensure that the jprobe handler won't
disappear beneath us if the jprobe is unregistered between the
setjmp_pre_handler() and the subsequent longjmp_break_handler() called
from the jprobe handler. Due to this, we can use __this_cpu_read() in
__is_active_jprobe() with the pre-emption check as we know that
pre-emption will be disabled.

However, if this function has been livepatched, we are still doing this
check and when we do so, pre-emption won't necessarily be disabled. This
results in the call trace shown above.

Fix this by only invoking __is_active_jprobe() when pre-emption is
disabled. And since we now guard this within a pre-emption check, we can
instead use raw_cpu_read() to get the current_kprobe value skipping the
check done by __this_cpu_read().

Fixes: c05b8c4474c030 ("powerpc/kprobes: Skip livepatch_handler() for jprobes")
Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/kprobes: Clean up jprobe detection in livepatch handler
Naveen N. Rao [Fri, 22 Sep 2017 09:10:45 +0000 (14:40 +0530)]
powerpc/kprobes: Clean up jprobe detection in livepatch handler

In commit c05b8c4474c03 ("powerpc/kprobes: Skip livepatch_handler() for
jprobes"), we added a helper is_current_kprobe_addr() to help detect if
the modified regs->nip was due to a jprobe or livepatch. Masami felt
that the function name was not quite clear. To that end, this patch
renames is_current_kprobe_addr() to __is_active_jprobe() and adds a
comment to (hopefully) better clarify the purpose of this helper. The
helper has also now been moved to kprobes-ftrace.c so that it is only
available for KPROBES_ON_FTRACE.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/kprobes: Do not suppress instruction emulation if a single run failed
Naveen N. Rao [Fri, 22 Sep 2017 09:10:44 +0000 (14:40 +0530)]
powerpc/kprobes: Do not suppress instruction emulation if a single run failed

Currently, we disable instruction emulation if emulate_step() fails for
any reason. However, such failures could be transient and specific to a
particular run. Instead, only disable instruction emulation if we have
never been able to emulate this. If we had emulated this instruction
successfully at least once, then we single step only this probe hit and
continue to try emulating the instruction in subsequent probe hits.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/kprobes: Some cosmetic updates to try_to_emulate()
Naveen N. Rao [Fri, 22 Sep 2017 09:10:43 +0000 (14:40 +0530)]
powerpc/kprobes: Some cosmetic updates to try_to_emulate()

1. This is only used in kprobes.c, so make it static.
2. Remove the un-necessary (ret == 0) comparison in the else clause.

Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/configs: Add Skiroot defconfig
Joel Stanley [Wed, 4 Oct 2017 04:23:24 +0000 (14:53 +1030)]
powerpc/configs: Add Skiroot defconfig

This configuration is used by the OpenPower firmware for it's
Linux-as-bootloader implementation. Also known as the Petitboot
kernel, this configuration broke in 4.12 (CPU_HOTPLUG=n), so add it to
the upstream tree in order to get better coverage.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/lib/sstep: Fix fixed-point shift instructions that set CA32
Sandipan Das [Fri, 29 Sep 2017 05:44:10 +0000 (11:14 +0530)]
powerpc/lib/sstep: Fix fixed-point shift instructions that set CA32

This fixes the emulated behaviour of existing fixed-point shift right
algebraic instructions that are supposed to set both the CA and CA32
bits of XER when running on a system that is compliant with POWER ISA
v3.0 independent of whether the system is executing in 32-bit mode or
64-bit mode. The following instructions are affected:
  * Shift Right Algebraic Word Immediate (srawi[.])
  * Shift Right Algebraic Word (sraw[.])
  * Shift Right Algebraic Doubleword Immediate (sradi[.])
  * Shift Right Algebraic Doubleword (srad[.])

Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()")
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/lib/sstep: Fix fixed-point arithmetic instructions that set CA32
Sandipan Das [Fri, 29 Sep 2017 05:44:09 +0000 (11:14 +0530)]
powerpc/lib/sstep: Fix fixed-point arithmetic instructions that set CA32

There are existing fixed-point arithmetic instructions that always set the
CA bit of XER to reflect the carry out of bit 0 in 64-bit mode and out of
bit 32 in 32-bit mode. In ISA v3.0, these instructions also always set the
CA32 bit of XER to reflect the carry out of bit 32.

This fixes the emulated behaviour of such instructions when running on a
system that is compliant with POWER ISA v3.0. The following instructions
are affected:
  * Add Immediate Carrying (addic)
  * Add Immediate Carrying and Record (addic.)
  * Subtract From Immediate Carrying (subfic)
  * Add Carrying (addc[.])
  * Subtract From Carrying (subfc[.])
  * Add Extended (adde[.])
  * Subtract From Extended (subfe[.])
  * Add to Minus One Extended (addme[.])
  * Subtract From Minus One Extended (subfme[.])
  * Add to Zero Extended (addze[.])
  * Subtract From Zero Extended (subfze[.])

Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()")
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/lib/sstep: Add XER bits introduced in POWER ISA v3.0
Sandipan Das [Fri, 29 Sep 2017 05:44:08 +0000 (11:14 +0530)]
powerpc/lib/sstep: Add XER bits introduced in POWER ISA v3.0

This adds definitions for the OV32 and CA32 bits of XER that
were introduced in POWER ISA v3.0. There are some existing
instructions that currently set the OV and CA bits based on
certain conditions.

The emulation behaviour of all these instructions needs to
be updated to set these new bits accordingly.

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powermac: Use setup_timer() helper
Allen Pais [Fri, 22 Sep 2017 11:35:00 +0000 (17:05 +0530)]
powerpc/powermac: Use setup_timer() helper

Use setup_timer function instead of initializing timer with the
function and data fields.

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/6xx: Use setup_timer() helper
Allen Pais [Fri, 22 Sep 2017 11:34:59 +0000 (17:04 +0530)]
powerpc/6xx: Use setup_timer() helper

Use setup_timer function instead of initializing timer with the
function and data fields.

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/oprofile: Use setup_timer() helper
Allen Pais [Fri, 22 Sep 2017 11:34:58 +0000 (17:04 +0530)]
powerpc/oprofile: Use setup_timer() helper

Use setup_timer function instead of initializing timer with the
function and data fields.

Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Use early_radix_enabled in POWER9 tlb flush
Nicholas Piggin [Wed, 27 Sep 2017 05:45:58 +0000 (15:45 +1000)]
powerpc/powernv: Use early_radix_enabled in POWER9 tlb flush

This code is used at boot and machine checks, so it should be using
early_radix_enabled() (which is usable any time).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET
Nicholas Piggin [Fri, 29 Sep 2017 03:29:42 +0000 (13:29 +1000)]
powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET

This allows MSR[EE]=0 lockups to be detected on an OPAL (bare metal)
system similarly to the hcall NMI IPI on pseries guests, when the
platform/firmware supports it.

This is an example of CPU10 spinning with interrupts hard disabled:

  Watchdog CPU:32 detected Hard LOCKUP other CPUS:10
  Watchdog CPU:10 Hard LOCKUP
  CPU: 10 PID: 4410 Comm: bash Not tainted 4.13.0-rc7-00074-ge89ce1f89f62-dirty #34
  task: c0000003a82b4400 task.stack: c0000003af55c000
  NIP: c0000000000a7b38 LR: c000000000659044 CTR: c0000000000a7b00
  REGS: c00000000fd23d80 TRAP: 0100   Not tainted  (4.13.0-rc7-00074-ge89ce1f89f62-dirty)
  MSR: 90000000000c1033 <SF,HV,ME,IR,DR,RI,LE>
  CR: 28422222  XER: 20000000
  CFAR: c0000000000a7b38 SOFTE: 0
  GPR00: c000000000659044 c0000003af55fbb0 c000000001072a00 0000000000000078
  GPR04: c0000003c81b5c80 c0000003c81cc7e8 9000000000009033 0000000000000000
  GPR08: 0000000000000000 c0000000000a7b00 0000000000000001 9000000000001003
  GPR12: c0000000000a7b00 c00000000fd83200 0000000010180df8 0000000010189e60
  GPR16: 0000000010189ed8 0000000010151270 000000001018bd88 000000001018de78
  GPR20: 00000000370a0668 0000000000000001 00000000101645e0 0000000010163c10
  GPR24: 00007fffd14d6294 00007fffd14d6290 c000000000fba6f0 0000000000000004
  GPR28: c000000000f351d8 0000000000000078 c000000000f4095c 0000000000000000
  NIP [c0000000000a7b38] sysrq_handle_xmon+0x38/0x40
  LR [c000000000659044] __handle_sysrq+0xe4/0x270
  Call Trace:
  [c0000003af55fbd0] [c000000000659044] __handle_sysrq+0xe4/0x270
  [c0000003af55fc70] [c000000000659810] write_sysrq_trigger+0x70/0xa0
  [c0000003af55fca0] [c0000000003da650] proc_reg_write+0xb0/0x110
  [c0000003af55fcf0] [c0000000003423bc] __vfs_write+0x6c/0x1b0
  [c0000003af55fd90] [c000000000344398] vfs_write+0xd8/0x240
  [c0000003af55fde0] [c00000000034632c] SyS_write+0x6c/0x110
  [c0000003af55fe30] [c00000000000b220] system_call+0x58/0x6c

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use kernel types for opal_signal_system_reset()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Implement system reset idle wakeup reason
Nicholas Piggin [Fri, 29 Sep 2017 03:29:41 +0000 (13:29 +1000)]
powerpc/64s: Implement system reset idle wakeup reason

It is possible to wake from idle due to a system reset exception, in
which case the CPU takes a system reset interrupt to wake from idle,
with system reset as the wakeup reason.

The regular (not idle wakeup) system reset interrupt handler must be
invoked in this case, otherwise the system reset interrupt is lost.

Handle the system reset interrupt immediately after CPU state has been
restored.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/xmon: Avoid tripping SMP hardlockup watchdog
Nicholas Piggin [Fri, 29 Sep 2017 03:29:40 +0000 (13:29 +1000)]
powerpc/xmon: Avoid tripping SMP hardlockup watchdog

The SMP hardlockup watchdog cross-checks other CPUs for lockups, which
causes xmon headaches because it's assuming interrupts hard disabled
means no watchdog troubles. Try to improve that by calling
touch_nmi_watchdog() in obvious places where secondaries are spinning.

Also annotate these spin loops with spin_begin/end calls.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog
Nicholas Piggin [Fri, 29 Sep 2017 03:29:39 +0000 (13:29 +1000)]
powerpc/watchdog: Do not trigger SMP crash from touch_nmi_watchdog

In xmon, touch_nmi_watchdog() is not expected to be checking that
other CPUs have not touched the watchdog, so the code will just call
touch_nmi_watchdog() once before re-enabling hard interrupts.

Just update our CPU's state, and ignore apparently stuck SMP threads.

Arguably touch_nmi_watchdog should check for SMP lockups, and callers
should be fixed, but that's not trivial for the input code of xmon.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/watchdog: Do not backtrace locked CPUs twice if allcpus backtrace is enabled
Nicholas Piggin [Fri, 29 Sep 2017 03:29:38 +0000 (13:29 +1000)]
powerpc/watchdog: Do not backtrace locked CPUs twice if allcpus backtrace is enabled

If sysctl_hardlockup_all_cpu_backtrace is enabled, there is no need to
IPI stuck CPUs for backtrace before trigger_allbutself_cpu_backtrace(),
which does the same thing again.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/watchdog: Do not panic from locked CPU's IPI handler
Nicholas Piggin [Fri, 29 Sep 2017 03:29:37 +0000 (13:29 +1000)]
powerpc/watchdog: Do not panic from locked CPU's IPI handler

The SMP watchdog will detect locked CPUs and IPI them to print a
backtrace and registers. If panic on hard lockup is enabled, do not
panic from this handler, because that can cause recursion into the IPI
layer during the panic.

The caller already panics in this case.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: Set the valid bit in PE for dedicated mode
Vaibhav Jain [Mon, 4 Sep 2017 08:48:25 +0000 (14:18 +0530)]
cxl: Set the valid bit in PE for dedicated mode

Make sure to set the valid-bit in software-state field of the
populated PE. This was earlier missing for dedicated mode AFUs, hence
was causing a PSL freeze when the AFU was activated.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agocxl: Enable global TLBIs for cxl contexts
Frederic Barrat [Sun, 3 Sep 2017 18:15:13 +0000 (20:15 +0200)]
cxl: Enable global TLBIs for cxl contexts

The PSL and nMMU need to see all TLB invalidations for the memory
contexts used on the adapter. For the hash memory model, it is done by
making all TLBIs global as soon as the cxl driver is in use. For
radix, we need something similar, but we can refine and only convert
to global the invalidations for contexts actually used by the device.

The new mm_context_add_copro() API increments the 'active_cpus' count
for the contexts attached to the cxl adapter. As soon as there's more
than 1 active cpu, the TLBIs for the context become global. Active cpu
count must be decremented when detaching to restore locality if
possible and to avoid overflowing the counter.

The hash memory model support is somewhat limited, as we can't
decrement the active cpus count when mm_context_remove_copro() is
called, because we can't flush the TLB for a mm on hash. So TLBIs
remain global on hash.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
Tested-by: Alistair Popple <alistair@popple.id.au>
[mpe: Fold in updated comment on the barrier from Fred]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/mm: Export flush_all_mm()
Frederic Barrat [Sun, 3 Sep 2017 18:15:12 +0000 (20:15 +0200)]
powerpc/mm: Export flush_all_mm()

With the optimizations introduced by commit a46cc7a90fd8
("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no
longer flushes the page walk cache (PWC) with radix. This patch
introduces flush_all_mm(), which flushes everything, TLB and PWC, for
a given mm.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-By: Alistair Popple <alistair@popple.id.au>
[mpe: Add a WARN_ON_ONCE() in the empty hash routines]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/64s: Add workaround for P9 vector CI load issue
Michael Neuling [Fri, 15 Sep 2017 05:25:48 +0000 (15:25 +1000)]
powerpc/64s: Add workaround for P9 vector CI load issue

POWER9 DD2.1 and earlier has an issue where some cache inhibited
vector load will return bad data. The workaround is two part, one
firmware/microcode part triggers HMI interrupts when hitting such
loads, the other part is this patch which then emulates the
instructions in Linux.

The affected instructions are limited to lxvd2x, lxvw4x, lxvb16x and
lxvh8x.

When an instruction triggers the HMI, all threads in the core will be
sent to the HMI handler, not just the one running the vector load.

In general, these spurious HMIs are detected by the emulation code and
we just return back to the running process. Unfortunately, if a
spurious interrupt occurs on a vector load that's to normal memory we
have no way to detect that it's spurious (unless we walk the page
tables, which is very expensive). In this case we emulate the load but
we need do so using a vector load itself to ensure 128bit atomicity is
preserved.

Some additional debugfs emulated instruction counters are added also.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[mpe: Switch CONFIG_PPC_BOOK3S_64 to CONFIG_VSX to unbreak the build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agopowerpc/powernv: Rework EEH initialization on powernv
Benjamin Herrenschmidt [Thu, 7 Sep 2017 06:35:44 +0000 (16:35 +1000)]
powerpc/powernv: Rework EEH initialization on powernv

Remove the post_init callback which is only used
by powernv, we can just call it explicitly from
the powernv code.

This partially kills the ability to "disable" eeh at
runtime via debugfs as this was calling that same
callback again, but this is both unused and broken
in several ways. If we want to revive it, we need
to create a dedicated enable/disable callback on the
backend that does the right thing.

Let the bulk of eeh initialize normally at
core_initcall() like it does on pseries by removing
the hack in eeh_init() that delays it.

Instead we make sure our eeh->probe cleanly bails
out of the PEs haven't been created yet and we force
a re-probe where we used to call eeh_init() again.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
7 years agoLinux 4.14-rc2
Linus Torvalds [Sun, 24 Sep 2017 23:38:56 +0000 (16:38 -0700)]
Linux 4.14-rc2

7 years agoMerge tag 'devicetree-fixes-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 24 Sep 2017 23:04:12 +0000 (16:04 -0700)]
Merge tag 'devicetree-fixes-for-4.14' of git://git./linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:

 - fix build for !OF providing empty of_find_device_by_node

 - fix Abracon vendor prefix

 - sync dtx_diff include paths (again)

 - a stm32h7 clock binding doc fix

* tag 'devicetree-fixes-for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: clk: stm32h7: fix clock-cell size
  scripts/dtc: dtx_diff - 2nd update of include dts paths to match build
  dt-bindings: fix vendor prefix for Abracon
  of: provide inline helper for of_find_device_by_node

7 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 24 Sep 2017 19:33:58 +0000 (12:33 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
 "Another round of CR3/PCID related fixes (I think this addresses all
  but one of the known problems with PCID support), an objtool fix plus
  a Clang fix that (finally) solves all Clang quirks to build a bootable
  x86 kernel as-is"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Fix inline asm call constraints for Clang
  objtool: Handle another GCC stack pointer adjustment bug
  x86/mm/32: Load a sane CR3 before cpu_init() on secondary CPUs
  x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier
  x86/mm/64: Stop using CR3.PCID == 0 in ASID-aware code
  x86/mm: Factor out CR3-building code

7 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 24 Sep 2017 19:28:55 +0000 (12:28 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "A clocksource driver section mismatch fix"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/integrator: Fix section mismatch warning

7 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 24 Sep 2017 18:57:07 +0000 (11:57 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
 "Three irqchip driver fixes, and an affinity mask helper function bug
  fix affecting x86"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "genirq: Restrict effective affinity to interrupts actually using it"
  irqchip.mips-gic: Fix shared interrupt mask writes
  irqchip/gic-v4: Fix building with ancient gcc
  irqchip/gic-v3: Iterate over possible CPUs by for_each_possible_cpu()

7 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 24 Sep 2017 18:53:13 +0000 (11:53 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull address-limit checking fixes from Ingo Molnar:
 "This fixes a number of bugs in the address-limit (USER_DS) checks that
  got introduced in the merge window, (mostly) affecting the ARM and
  ARM64 platforms"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64/syscalls: Move address limit check in loop
  arm/syscalls: Optimize address limit check
  Revert "arm/syscalls: Check address limit on user-mode return"
  syscalls: Use CHECK_DATA_CORRUPTION for addr_limit_user_check

7 years agoMerge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Sun, 24 Sep 2017 18:40:41 +0000 (11:40 -0700)]
Merge branch 'next-general' of git://git./linux/kernel/git/jmorris/linux-security

Pull misc security layer update from James Morris:
 "This is the remaining 'general' change in the security tree for v4.14,
  following the direct merging of SELinux (+ TOMOYO), AppArmor, and
  seccomp.

  That's everything now for the security tree except IMA, which will
  follow shortly (I've been traveling for the past week with patchy
  internet)"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security: fix description of values returned by cap_inode_need_killpriv

7 years agoMerge branch 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Sun, 24 Sep 2017 18:34:28 +0000 (11:34 -0700)]
Merge branch 'next-tpm' of git://git./linux/kernel/git/jmorris/linux-security

Pull TPM updates from James Morris:
 "Here are the TPM updates from Jarkko for v4.14, which I've placed in
  their own branch (next-tpm). I ended up cherry-picking them as other
  changes had been made in Jarkko's branch after he sent me his original
  pull request.

  I plan on maintaining a separate branch for TPM (and other security
  subsystems) from now on.

  From Jarkko: 'Not much this time except a few fixes'"

* 'next-tpm' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  tpm: ibmvtpm: simplify crq initialization and document crq format
  tpm: replace msleep() with  usleep_range() in TPM 1.2/2.0 generic drivers
  Documentation: tpm: add powered-while-suspended binding documentation
  tpm: tpm_crb: constify acpi_device_id.
  tpm: vtpm: constify vio_device_id

7 years agotpm: ibmvtpm: simplify crq initialization and document crq format
Michal Suchanek [Fri, 24 Feb 2017 19:35:16 +0000 (20:35 +0100)]
tpm: ibmvtpm: simplify crq initialization and document crq format

The crq is passed in registers and is the same on BE and LE hosts.
However, current implementation allocates a structure on-stack to
represent the crq, initializes the members swapping them to BE, and
loads the structure swapping it from BE. This is pointless and causes
GCC warnings about ununitialized members. Get rid of the structure and
the warnings.

Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
7 years agotpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers
Hamza Attak [Mon, 14 Aug 2017 18:09:16 +0000 (19:09 +0100)]
tpm: replace msleep() with  usleep_range() in TPM 1.2/2.0 generic drivers

The patch simply replaces all msleep function calls with usleep_range calls
in the generic drivers.

Tested with an Infineon TPM 1.2, using the generic tpm-tis module, for a
thousand PCR extends, we see results going from 1m57s unpatched to 40s
with the new patch. We obtain similar results when using the original and
patched tpm_infineon driver, which is also part of the patch.
Similarly with a STM TPM 2.0, using the CRB driver, it takes about 20ms per
extend unpatched and around 7ms with the new patch.

Note that the PCR consistency is untouched with this patch, each TPM has
been tested with 10 million extends and the aggregated PCR value is
continuously verified to be correct.

As an extension of this work, this could potentially and easily be applied
to other vendor's drivers. Still, these changes are not included in the
proposed patch as they are untested.

Signed-off-by: Hamza Attak <hamza@hpe.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
7 years agoDocumentation: tpm: add powered-while-suspended binding documentation
Enric Balletbo i Serra [Tue, 27 Jun 2017 10:27:23 +0000 (12:27 +0200)]
Documentation: tpm: add powered-while-suspended binding documentation

Add a new powered-while-suspended property to control the behavior of the
TPM suspend/resume.

Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
7 years agotpm: tpm_crb: constify acpi_device_id.
Arvind Yadav [Thu, 6 Jul 2017 17:48:39 +0000 (23:18 +0530)]
tpm: tpm_crb: constify acpi_device_id.

acpi_device_id are not supposed to change at runtime. All functions
working with acpi_device_id provided by <acpi/acpi_bus.h> work with
const acpi_device_id. So mark the non-const structs as const.

File size before:
   text    data     bss     dec     hex filename
   4198     608       0    4806    12c6 drivers/char/tpm/tpm_crb.o

File size After adding 'const':
   text    data     bss     dec     hex filename
   4262     520       0    4782    12ae drivers/char/tpm/tpm_crb.o

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
7 years agotpm: vtpm: constify vio_device_id
Arvind Yadav [Thu, 17 Aug 2017 17:34:21 +0000 (23:04 +0530)]
tpm: vtpm: constify vio_device_id

vio_device_id are not supposed to change at runtime. All functions
working with vio_device_id provided by <asm/vio.h> work with
const vio_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
7 years agosecurity: fix description of values returned by cap_inode_need_killpriv
Stefan Berger [Thu, 27 Jul 2017 02:27:05 +0000 (22:27 -0400)]
security: fix description of values returned by cap_inode_need_killpriv

cap_inode_need_killpriv returns 1 if security.capability exists and
has a value and inode_killpriv() is required, 0 otherwise. Fix the
description of the return value to reflect this.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
7 years agoMerge branch 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Sat, 23 Sep 2017 16:14:06 +0000 (06:14 -1000)]
Merge branch 'parisc-4.14-2' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:

 - Unbreak parisc bootloader by avoiding a gcc-7 optimization to convert
   multiple byte-accesses into one word-access.

 - Add missing HWPOISON page fault handler code. I completely missed
   that when I added HWPOISON support during this merge window and it
   only showed up now with the madvise07 LTP test case.

 - Fix backtrace unwinding to stop when stack start has been reached.

 - Issue warning if initrd has been loaded into memory regions with
   broken RAM modules.

 - Fix HPMC handler (parisc hardware fault handler) to comply with
   architecture specification.

 - Avoid compiler warnings about too large frame sizes.

 - Minor init-section fixes.

* 'parisc-4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Unbreak bootloader due to gcc-7 optimizations
  parisc: Reintroduce option to gzip-compress the kernel
  parisc: Add HWPOISON page fault handler code
  parisc: Move init_per_cpu() into init section
  parisc: Check if initrd was loaded into broken RAM
  parisc: Add PDCE_CHECK instruction to HPMC handler
  parisc: Add wrapper for pdc_instr() firmware function
  parisc: Move start_parisc() into init section
  parisc: Stop unwinding at start of stack
  parisc: Fix too large frame size warnings

7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Sat, 23 Sep 2017 15:47:04 +0000 (05:47 -1000)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:

 - Smattering of miscellanous fixes

 - A five patch series for i40iw that had a patch (5/5) that was larger
   than I would like, but I took it because it's needed for large scale
   users

 - An 8 patch series for bnxt_re that landed right as I was leaving on
   PTO and so had to wait until now...they are all appropriate fixes for
   -rc IMO

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (22 commits)
  bnxt_re: Don't issue cmd to delete GID for QP1 GID entry before the QP is destroyed
  bnxt_re: Fix memory leak in FRMR path
  bnxt_re: Remove RTNL lock dependency in bnxt_re_query_port
  bnxt_re: Fix race between the netdev register and unregister events
  bnxt_re: Free up devices in module_exit path
  bnxt_re: Fix compare and swap atomic operands
  bnxt_re: Stop issuing further cmds to FW once a cmd times out
  bnxt_re: Fix update of qplib_qp.mtu when modified
  i40iw: Add support for port reuse on active side connections
  i40iw: Add missing VLAN priority
  i40iw: Call i40iw_cm_disconn on modify QP to disconnect
  i40iw: Prevent multiple netdev event notifier registrations
  i40iw: Fail open if there are no available MSI-X vectors
  RDMA/vmw_pvrdma: Fix reporting correct opcodes for completion
  IB/bnxt_re: Fix frame stack compilation warning
  IB/mlx5: fix debugfs cleanup
  IB/ocrdma: fix incorrect fall-through on switch statement
  IB/ipoib: Suppress the retry related completion errors
  iw_cxgb4: remove the stid on listen create failure
  iw_cxgb4: drop listen destroy replies if no ep found
  ...

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sat, 23 Sep 2017 15:41:27 +0000 (05:41 -1000)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix NAPI poll list corruption in enic driver, from Christian
    Lamparter.

 2) Fix route use after free, from Eric Dumazet.

 3) Fix regression in reuseaddr handling, from Josef Bacik.

 4) Assert the size of control messages in compat handling since we copy
    it in from userspace twice. From Meng Xu.

 5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
    from Ursula Braun.

 6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.

 7) Don't use ARRAY_SIZE on spinlock array which might have zero
    entries, from Geert Uytterhoeven.

 8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
    Kasiviswanathan.

 9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
    Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
  inet: fix improper empty comparison
  net: use inet6_rcv_saddr to compare sockets
  net: set tb->fast_sk_family
  net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
  MAINTAINERS: update git tree locations for ieee802154 subsystem
  net: prevent dst uses after free
  net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
  net/smc: no close wait in case of process shut down
  net/smc: introduce a delay
  net/smc: terminate link group if out-of-sync is received
  net/smc: longer delay for client link group removal
  net/smc: adapt send request completion notification
  net/smc: adjust net_device refcount
  net/smc: take RCU read lock for routing cache lookup
  net/smc: add receive timeout check
  net/smc: add missing dev_put
  net: stmmac: Cocci spatch "of_table"
  lan78xx: Use default values loaded from EEPROM/OTP after reset
  lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
  lan78xx: Fix for eeprom read/write when device auto suspend
  ...

7 years agoMerge tag 'apparmor-pr-2017-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 23 Sep 2017 15:33:29 +0000 (05:33 -1000)]
Merge tag 'apparmor-pr-2017-09-22' of git://git./linux/kernel/git/jj/linux-apparmor

Pull apparmor updates from John Johansen:
 "This is the apparmor pull request, similar to SELinux and seccomp.

  It's the same series that I was sent to James' security tree + one
  regression fix that was found after the series was sent to James and
  would have been sent for v4.14-rc2.

  Features:
  - in preparation for secid mapping add support for absolute root view
    based labels
  - add base infastructure for socket mediation
  - add mount mediation
  - add signal mediation

  minor cleanups and changes:
  - be defensive, ensure unconfined profiles have dfas initialized
  - add more debug asserts to apparmorfs
  - enable policy unpacking to audit different reasons for failure
  - cleanup conditional check for label in label_print
  - Redundant condition: prev_ns. in [label.c:1498]

  Bug Fixes:
  - fix regression in apparmorfs DAC access permissions
  - fix build failure on sparc caused by undeclared signals
  - fix sparse report of incorrect type assignment when freeing label proxies
  - fix race condition in null profile creation
  - Fix an error code in aafs_create()
  - Fix logical error in verify_header()
  - Fix shadowed local variable in unpack_trans_table()"

* tag 'apparmor-pr-2017-09-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
  apparmor: fix apparmorfs DAC access permissions
  apparmor: fix build failure on sparc caused by undeclared signals
  apparmor: fix incorrect type assignment when freeing proxies
  apparmor: ensure unconfined profiles have dfas initialized
  apparmor: fix race condition in null profile creation
  apparmor: move new_null_profile to after profile lookup fns()
  apparmor: add base infastructure for socket mediation
  apparmor: add more debug asserts to apparmorfs
  apparmor: make policy_unpack able to audit different info messages
  apparmor: add support for absolute root view based labels
  apparmor: cleanup conditional check for label in label_print
  apparmor: add mount mediation
  apparmor: add the ability to mediate signals
  apparmor: Redundant condition: prev_ns. in [label.c:1498]
  apparmor: Fix an error code in aafs_create()
  apparmor: Fix logical error in verify_header()
  apparmor: Fix shadowed local variable in unpack_trans_table()

7 years agox86/asm: Fix inline asm call constraints for Clang
Josh Poimboeuf [Wed, 20 Sep 2017 21:24:33 +0000 (16:24 -0500)]
x86/asm: Fix inline asm call constraints for Clang

For inline asm statements which have a CALL instruction, we list the
stack pointer as a constraint to convince GCC to ensure the frame
pointer is set up first:

  static inline void foo()
  {
register void *__sp asm(_ASM_SP);
asm("call bar" : "+r" (__sp))
  }

Unfortunately, that pattern causes Clang to corrupt the stack pointer.

The fix is easy: convert the stack pointer register variable to a global
variable.

It should be noted that the end result is different based on the GCC
version.  With GCC 6.4, this patch has exactly the same result as
before:

defconfig defconfig-nofp distro distro-nofp
 before 9820389 9491555 8816046 8516940
 after 9820389 9491555 8816046 8516940

With GCC 7.2, however, GCC's behavior has changed.  It now changes its
behavior based on the conversion of the register variable to a global.
That somehow convinces it to *always* set up the frame pointer before
inserting *any* inline asm.  (Therefore, listing the variable as an
output constraint is a no-op and is no longer necessary.)  It's a bit
overkill, but the performance impact should be negligible.  And in fact,
there's a nice improvement with frame pointers disabled:

defconfig defconfig-nofp distro distro-nofp
 before 9796316 9468236 9076191 8790305
 after 9796957 9464267 9076381 8785949

So in summary, while listing the stack pointer as an output constraint
is no longer necessary for newer versions of GCC, it's still needed for
older versions.

Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3db862e970c432ae823cf515c52b54fec8270e0e.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoobjtool: Handle another GCC stack pointer adjustment bug
Josh Poimboeuf [Wed, 20 Sep 2017 21:24:32 +0000 (16:24 -0500)]
objtool: Handle another GCC stack pointer adjustment bug

The kbuild bot reported the following warning with GCC 4.4 and a
randconfig:

  net/socket.o: warning: objtool: compat_sock_ioctl()+0x1083: stack state mismatch: cfa1=7+160 cfa2=-1+0

This is caused by another GCC non-optimization, where it backs up and
restores the stack pointer for no apparent reason:

    2f91:       48 89 e0                mov    %rsp,%rax
    2f94:       4c 89 e7                mov    %r12,%rdi
    2f97:       4c 89 f6                mov    %r14,%rsi
    2f9a:       ba 20 00 00 00          mov    $0x20,%edx
    2f9f:       48 89 c4                mov    %rax,%rsp

This issue would have been happily ignored before the following commit:

  dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug")

But now that objtool is paying attention to such stack pointer writes
to/from a register, it needs to understand them properly.  In this case
that means recognizing that the "mov %rsp, %rax" instruction is
potentially a backup of the stack pointer.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: dd88a0a0c861 ("objtool: Handle GCC stack pointer adjustment bug")
Link: http://lkml.kernel.org/r/8c7aa8e9a36fbbb6655d9d8e7cea58958c912da8.1505942196.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoMerge tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Sat, 23 Sep 2017 03:40:11 +0000 (17:40 -1000)]
Merge tag 'acpi-4.14-rc2' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix the initialization of resources in the ACPI WDAT watchdog
  driver, a recent regression in the ACPI device properties handling, a
  recent change in behavior causing the ACPI_HANDLE() macro to only work
  for GPL code and create a MAINTAINERS entry for ACPI PMIC drivers in
  order to specify the official reviewers for that code.

  Specifics:

   - Fix the initialization of resources in the ACPI WDAT watchdog
     driver that uses unititialized memory which causes compiler
     warnings to be triggered (Arnd Bergmann).

   - Fix a recent regression in the ACPI device properties handling that
     causes some device properties data to be skipped during enumeration
     (Sakari Ailus).

   - Fix a recent change in behavior that caused the ACPI_HANDLE() macro
     to stop working for non-GPL code which is a problem for the NVidia
     binary graphics driver, for example (John Hubbard).

   - Add a MAINTAINERS entry for the ACPI PMIC drivers to specify the
     official reviewers for that code (Rafael Wysocki)"

* tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: properties: Return _DSD hierarchical extension (data) sub-nodes correctly
  ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again
  ACPI / watchdog: properly initialize resources
  ACPI / PMIC: Add code reviewers to MAINTAINERS

7 years agoMerge branch 'net-fix-reuseaddr-regression'
David S. Miller [Sat, 23 Sep 2017 03:33:18 +0000 (20:33 -0700)]
Merge branch 'net-fix-reuseaddr-regression'

Josef Bacik says:

====================
net: fix reuseaddr regression

I introduced a regression when reworking the fastreuse port stuff that allows
bind conflicts to occur once a reuseaddr successfully opens on an existing tb.
The root cause is I reversed an if statement which caused us to set the tb as if
there were no owners on the socket if there were, which obviously is not
correct.

Dave could you please queue these changes up for -stable, I've run them through
the net tests and added another test to check for this problem specifically.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoinet: fix improper empty comparison
Josef Bacik [Sat, 23 Sep 2017 00:20:08 +0000 (20:20 -0400)]
inet: fix improper empty comparison

When doing my reuseport rework I screwed up and changed a

if (hlist_empty(&tb->owners))

to

if (!hlist_empty(&tb->owners))

This is obviously bad as all of the reuseport/reuse logic was reversed,
which caused weird problems like allowing an ipv4 bind conflict if we
opened an ipv4 only socket on a port followed by an ipv6 only socket on
the same port.

Fixes: b9470c27607b ("inet: kill smallest_size and smallest_port")
Reported-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: use inet6_rcv_saddr to compare sockets
Josef Bacik [Sat, 23 Sep 2017 00:20:07 +0000 (20:20 -0400)]
net: use inet6_rcv_saddr to compare sockets

In ipv6_rcv_saddr_equal() we need to use inet6_rcv_saddr(sk) for the
ipv6 compare with the fast socket information to make sure we're doing
the proper comparisons.

Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: set tb->fast_sk_family
Josef Bacik [Sat, 23 Sep 2017 00:20:06 +0000 (20:20 -0400)]
net: set tb->fast_sk_family

We need to set the tb->fast_sk_family properly so we can use the proper
comparison function for all subsequent reuseport bind requests.

Fixes: 637bc8bbe6c0 ("inet: reset tb->fastreuseport when adding a reuseport sk")
Reported-and-tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: orphan frags on stand-alone ptype in dev_queue_xmit_nit
Willem de Bruijn [Fri, 22 Sep 2017 23:42:37 +0000 (19:42 -0400)]
net: orphan frags on stand-alone ptype in dev_queue_xmit_nit

Zerocopy skbs frags are copied when the skb is looped to a local sock.
Commit 1080e512d44d ("net: orphan frags on receive") introduced calls
to skb_orphan_frags to deliver_skb and __netif_receive_skb for this.

With msg_zerocopy, these skbs can also exist in the tx path and thus
loop from dev_queue_xmit_nit. This already calls deliver_skb in its
loop. But it does not orphan before a separate pt_prev->func().

Add the missing skb_orphan_frags_rx.

Changes
  v1->v2: handle skb_orphan_frags_rx failure

Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Sat, 23 Sep 2017 03:28:59 +0000 (17:28 -1000)]
Merge tag 'pm-4.14-rc2' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a cpufreq regression introduced by recent changes related to
  the generic DT driver, an initialization time memory leak in cpuidle
  on ARM, a PM core bug that may cause system suspend/resume to fail on
  some systems, a request type validation issue in the PM QoS framework
  and two documentation-related issues.

  Specifics:

   - Fix a regression in cpufreq on systems using DT as the source of
     CPU configuration information where two different code paths
     attempt to create the cpufreq-dt device object (there can be only
     one) and fix up the "compatible" matching for some TI platforms on
     top of that (Viresh Kumar, Dave Gerlach).

   - Fix an initialization time memory leak in cpuidle on ARM which
     occurs if the cpuidle driver initialization fails (Stefan Wahren).

   - Fix a PM core function that checks whether or not there are any
     system suspend/resume callbacks for a device, but forgets to check
     legacy callbacks which then may be skipped incorrectly and the
     system may crash and/or the device may become unusable after a
     suspend-resume cycle (Rafael Wysocki).

   - Fix request type validation for latency tolerance PM QoS requests
     which may lead to unexpected behavior (Jan Schönherr).

   - Fix a broken link to PM documentation from a header file and a typo
     in a PM document (Geert Uytterhoeven, Rafael Wysocki)"

* tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: ti-cpufreq: Support additional am43xx platforms
  ARM: cpuidle: Avoid memleak if init fail
  cpufreq: dt-platdev: Add some missing platforms to the blacklist
  PM: core: Fix device_pm_check_callbacks()
  PM: docs: Drop an excess character from devices.rst
  PM / QoS: Use the correct variable to check the QoS request type
  driver core: Fix link to device power management documentation

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sat, 23 Sep 2017 03:23:41 +0000 (17:23 -1000)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - fixes for two long standing issues (lock up and a crash) in force
   feedback handling in uinput driver

 - tweak to firmware update timing in Elan I2C touchpad driver.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elan_i2c - extend Flash-Write delay
  Input: uinput - avoid crash when sending FF request to device going away
  Input: uinput - avoid FF flush when destroying device

7 years agoMerge tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Sat, 23 Sep 2017 02:16:41 +0000 (16:16 -1000)]
Merge tag 'seccomp-v4.14-rc2' of git://git./linux/kernel/git/kees/linux

Pull seccomp updates from Kees Cook:
 "Major additions:

   - sysctl and seccomp operation to discover available actions
     (tyhicks)

   - new per-filter configurable logging infrastructure and sysctl
     (tyhicks)

   - SECCOMP_RET_LOG to log allowed syscalls (tyhicks)

   - SECCOMP_RET_KILL_PROCESS as the new strictest possible action

   - self-tests for new behaviors"

[ This is the seccomp part of the security pull request during the merge
  window that was nixed due to unrelated problems   - Linus ]

* tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  samples: Unrename SECCOMP_RET_KILL
  selftests/seccomp: Test thread vs process killing
  seccomp: Implement SECCOMP_RET_KILL_PROCESS action
  seccomp: Introduce SECCOMP_RET_KILL_PROCESS
  seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
  seccomp: Action to log before allowing
  seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
  seccomp: Selftest for detection of filter flag support
  seccomp: Sysctl to configure actions that are allowed to be logged
  seccomp: Operation for checking if an action is available
  seccomp: Sysctl to display available actions
  seccomp: Provide matching filter for introspection
  selftests/seccomp: Refactor RET_ERRNO tests
  selftests/seccomp: Add simple seccomp overhead benchmark
  selftests/seccomp: Add tests for basic ptrace actions

7 years agoMerge tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba...
Linus Torvalds [Sat, 23 Sep 2017 02:11:48 +0000 (16:11 -1000)]
Merge tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Various SMB3 fixes for stable and security improvements from the
  recently completed SMB3/Samba test events

* tag '4.14-smb3-fixes-from-recent-test-events-for-stable' of git://git.samba.org/sfrench/cifs-2.6:
  SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
  SMB3: handle new statx fields
  SMB: Validate negotiate (to protect against downgrade) even if signing off
  cifs: release auth_key.response for reconnect.
  cifs: release cifs root_cred after exit_cifs
  CIFS: make arrays static const, reduces object code size
  [SMB3] Update session and share information displayed for debugging SMB2/SMB3
  cifs: show 'soft' in the mount options for hard mounts
  SMB3: Warn user if trying to sign connection that authenticated as guest
  SMB3: Fix endian warning
  Fix SMB3.1.1 guest authentication to Samba

7 years agoMerge tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client
Linus Torvalds [Sat, 23 Sep 2017 02:09:31 +0000 (16:09 -1000)]
Merge tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Two small but important fixes: RADOS semantic change in upcoming v12.2.1
  release and a rare NULL dereference in create_session_open_msg()"

* tag 'ceph-for-4.14-rc2' of git://github.com/ceph/ceph-client:
  ceph: avoid panic in create_session_open_msg() if utsname() returns NULL
  libceph: don't allow bidirectional swap of pg-upmap-items

7 years agoMAINTAINERS: update git tree locations for ieee802154 subsystem
Stefan Schmidt [Fri, 22 Sep 2017 12:28:46 +0000 (14:28 +0200)]
MAINTAINERS: update git tree locations for ieee802154 subsystem

Patches for ieee802154 will go through my new trees towards netdev from
now on. The 6LoWPAN subsystem will stay as is (shared between ieee802154
and bluetooth) and go through the bluetooth tree as usual.

Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoSMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags
Steve French [Fri, 22 Sep 2017 06:40:27 +0000 (01:40 -0500)]
SMB3: Don't ignore O_SYNC/O_DSYNC and O_DIRECT flags

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
7 years agoMerge tag 'pci-v4.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Fri, 22 Sep 2017 23:09:11 +0000 (13:09 -1000)]
Merge tag 'pci-v4.14-fixes-2' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - fix endpoint "end of test" interrupt issue (introduced in v4.14-rc1)
   (John Keeping)

 - fix MIPS use-after-free map_irq() issue (introduced in v4.14-rc1)
   (Lorenzo Pieralisi)

* tag 'pci-v4.14-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: endpoint: Use correct "end of test" interrupt
  MIPS: PCI: Move map_irq() hooks out of initdata

7 years agoMerge tag 'iommu-fixes-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 22 Sep 2017 23:06:05 +0000 (13:06 -1000)]
Merge tag 'iommu-fixes-v4.14-rc1' of git://git./linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:

 - two Kconfig fixes to fix dependencies that cause compile failures
   when they are not fulfilled.

 - a section mismatch fix for Intel VT-d

 - a fix for PCI topology detection in ARM device-tree code

* tag 'iommu-fixes-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/of: Remove PCI host bridge node check
  iommu/qcom: Depend on HAS_DMA to fix compile error
  iommu/vt-d: Fix harmless section mismatch warning
  iommu: Add missing dependencies

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile
Linus Torvalds [Fri, 22 Sep 2017 23:02:54 +0000 (13:02 -1000)]
Merge git://git./linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile fixes from Chris Metcalf:
 "These are a code cleanup and config cleanup, respectively"

* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  tile: array underflow in setup_maxnodemem()
  tile: defconfig: Cleanup from old Kconfig options

7 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 22 Sep 2017 23:01:16 +0000 (13:01 -1000)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - #ifdef CONFIG_EFI around __efi_fpsimd_begin/end

 - Assembly code alignment reduced to 4 bytes from 16

 - Ensure the kernel is compiled for LP64 (there are some arm64
   compilers around defaulting to ILP32)

 - Fix arm_pmu_acpi memory leak on the error path

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  drivers/perf: arm_pmu_acpi: Release memory obtained by kasprintf
  arm64: ensure the kernel is compiled for LP64
  arm64: relax assembly code alignment from 16 byte to 4 byte
  arm64: efi: Don't include EFI fpsimd save/restore code in non-EFI kernels

7 years agoSMB3: handle new statx fields
Steve French [Fri, 22 Sep 2017 02:32:29 +0000 (21:32 -0500)]
SMB3: handle new statx fields

We weren't returning the creation time or the two easily supported
attributes (ENCRYPTED or COMPRESSED) for the getattr call to
allow statx to return these fields.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>\
Acked-by: Jeff Layton <jlayton@poochiereds.net>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
7 years agoarch: remove unused *_segments() macros/functions
Tobias Klauser [Fri, 22 Sep 2017 07:42:42 +0000 (09:42 +0200)]
arch: remove unused *_segments() macros/functions

Some architectures define the no-op macros/functions copy_segments,
release_segments and forget_segments. These are used nowhere in the
tree, so removed them.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge branches 'acpi-pmic', 'acpi-bus', 'acpi-wdat' and 'acpi-properties'
Rafael J. Wysocki [Fri, 22 Sep 2017 21:38:45 +0000 (23:38 +0200)]
Merge branches 'acpi-pmic', 'acpi-bus', 'acpi-wdat' and 'acpi-properties'

* acpi-pmic:
  ACPI / PMIC: Add code reviewers to MAINTAINERS

* acpi-bus:
  ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again

* acpi-wdat:
  ACPI / watchdog: properly initialize resources

* acpi-properties:
  ACPI: properties: Return _DSD hierarchical extension (data) sub-nodes correctly

7 years agoMerge branches 'pm-cpufreq' and 'pm-cpuidle'
Rafael J. Wysocki [Fri, 22 Sep 2017 20:45:54 +0000 (22:45 +0200)]
Merge branches 'pm-cpufreq' and 'pm-cpuidle'

* pm-cpufreq:
  cpufreq: ti-cpufreq: Support additional am43xx platforms
  cpufreq: dt-platdev: Add some missing platforms to the blacklist

* pm-cpuidle:
  ARM: cpuidle: Avoid memleak if init fail

7 years agoMerge branches 'pm-core', 'pm-qos' and 'pm-docs'
Rafael J. Wysocki [Fri, 22 Sep 2017 20:45:28 +0000 (22:45 +0200)]
Merge branches 'pm-core', 'pm-qos' and 'pm-docs'

* pm-core:
  PM: core: Fix device_pm_check_callbacks()

* pm-qos:
  PM / QoS: Use the correct variable to check the QoS request type

* pm-docs:
  PM: docs: Drop an excess character from devices.rst
  driver core: Fix link to device power management documentation

7 years agoparisc: Unbreak bootloader due to gcc-7 optimizations
Helge Deller [Fri, 22 Sep 2017 19:57:11 +0000 (21:57 +0200)]
parisc: Unbreak bootloader due to gcc-7 optimizations

gcc-7 optimizes the byte-wise accesses of get_unaligned_le32() into
word-wise accesses if the 32-bit integer output_len is declared as
external. This panics then the bootloader since we don't have the
unaligned access fault trap handler installed during boot time.

Avoid this optimization by declaring output_len as byte-aligned and thus
unbreak the bootloader code.

Additionally, compile the boot code optimized for size.

Signed-off-by: Helge Deller <deller@gmx.de>
7 years agoparisc: Reintroduce option to gzip-compress the kernel
Helge Deller [Fri, 22 Sep 2017 20:24:02 +0000 (22:24 +0200)]
parisc: Reintroduce option to gzip-compress the kernel

By adding the feature to build the kernel as self-extracting
executeable, the possibility to simply compress the kernel with gzip was
lost.

This patch now reintroduces this possibilty again and leaves it up to
the user to decide how the kernel should be built.

The palo bootloader is able to natively load both formats.

Signed-off-by: Helge Deller <deller@gmx.de>
7 years agoapparmor: fix apparmorfs DAC access permissions
John Johansen [Thu, 31 Aug 2017 16:54:43 +0000 (09:54 -0700)]
apparmor: fix apparmorfs DAC access permissions

The DAC access permissions for several apparmorfs files are wrong.

.access - needs to be writable by all tasks to perform queries
the others in the set only provide a read fn so should be read only.

With policy namespace virtualization all apparmor needs to control
the permission and visibility checks directly which means DAC
access has to be allowed for all user, group, and other.

BugLink: http://bugs.launchpad.net/bugs/1713103
Fixes: c97204baf840b ("apparmor: rename apparmor file fns and data to indicate use")
Signed-off-by: John Johansen <john.johansen@canonical.com>
7 years agoapparmor: fix build failure on sparc caused by undeclared signals
John Johansen [Wed, 23 Aug 2017 19:10:39 +0000 (12:10 -0700)]
apparmor: fix build failure on sparc caused by undeclared signals

  In file included from security/apparmor/ipc.c:23:0:
  security/apparmor/include/sig_names.h:26:3: error: 'SIGSTKFLT' undeclared here (not in a function)
    [SIGSTKFLT] = 16, /* -, 16, - */
     ^
  security/apparmor/include/sig_names.h:26:3: error: array index in initializer not of integer type
  security/apparmor/include/sig_names.h:26:3: note: (near initialization for 'sig_map')
  security/apparmor/include/sig_names.h:51:3: error: 'SIGUNUSED' undeclared here (not in a function)
    [SIGUNUSED] = 34, /* -, 31, - */
     ^
  security/apparmor/include/sig_names.h:51:3: error: array index in initializer not of integer type
  security/apparmor/include/sig_names.h:51:3: note: (near initialization for 'sig_map')

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: c6bf1adaecaa ("apparmor: add the ability to mediate signals")
Signed-off-by: John Johansen <john.johansen@canonical.com>
7 years agoapparmor: fix incorrect type assignment when freeing proxies
John Johansen [Wed, 16 Aug 2017 16:33:48 +0000 (09:33 -0700)]
apparmor: fix incorrect type assignment when freeing proxies

sparse reports

poisoning the proxy->label before freeing the struct is resulting in
a sparse build warning.
../security/apparmor/label.c:52:30: warning: incorrect type in assignment (different address spaces)
../security/apparmor/label.c:52:30:    expected struct aa_label [noderef] <asn:4>*label
../security/apparmor/label.c:52:30:    got struct aa_label *<noident>

fix with RCU_INIT_POINTER as this is one of those cases where
rcu_assign_pointer() is not needed.

Signed-off-by: John Johansen <john.johansen@canonical.com>
7 years agoapparmor: ensure unconfined profiles have dfas initialized
John Johansen [Wed, 16 Aug 2017 12:48:06 +0000 (05:48 -0700)]
apparmor: ensure unconfined profiles have dfas initialized

Generally unconfined has early bailout tests and does not need the
dfas initialized, however if an early bailout test is ever missed
it will result in an oops.

Be defensive and initialize the unconfined profile to have null dfas
(no permission) so if an early bailout test is missed we fail
closed (no perms granted) instead of oopsing.

Signed-off-by: John Johansen <john.johansen@canonical.com>
7 years agoapparmor: fix race condition in null profile creation
John Johansen [Wed, 16 Aug 2017 12:40:49 +0000 (05:40 -0700)]
apparmor: fix race condition in null profile creation

There is a race when null- profile is being created between the
initial lookup/creation of the profile and lock/addition of the
profile. This could result in multiple version of a profile being
added to the list which need to be removed/replaced.

Since these are learning profile their is no affect on mediation.

Signed-off-by: John Johansen <john.johansen@canonical.com>
7 years agoapparmor: move new_null_profile to after profile lookup fns()
John Johansen [Wed, 16 Aug 2017 15:59:57 +0000 (08:59 -0700)]
apparmor: move new_null_profile to after profile lookup fns()

new_null_profile will need to use some of the profile lookup fns()
so move instead of doing forward fn declarations.

Signed-off-by: John Johansen <john.johansen@canonical.com>
7 years agoapparmor: add base infastructure for socket mediation
John Johansen [Wed, 19 Jul 2017 06:18:33 +0000 (23:18 -0700)]
apparmor: add base infastructure for socket mediation

Provide a basic mediation of sockets. This is not a full net mediation
but just whether a spcific family of socket can be used by an
application, along with setting up some basic infrastructure for
network mediation to follow.

the user space rule hav the basic form of
  NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ]
                 [ TYPE | PROTOCOL ]

  DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' |
             'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' |
     'netbeui' | 'security' | 'key' | 'packet' | 'ash' |
     'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' |
     'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' |
     'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' |
     'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' |
     'vsock' | 'mpls' | 'ib' | 'kcm' ) ','

  TYPE = ( 'stream' | 'dgram' | 'seqpacket' |  'rdm' | 'raw' |
           'packet' )

  PROTOCOL = ( 'tcp' | 'udp' | 'icmp' )

eg.
  network,
  network inet,

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>
7 years agoapparmor: add more debug asserts to apparmorfs
John Johansen [Wed, 19 Jul 2017 06:41:13 +0000 (23:41 -0700)]
apparmor: add more debug asserts to apparmorfs

Signed-off-by: John Johansen <john.johansen@canonical.com>
Acked-by: Seth Arnold <seth.arnold@canonical.com>