Alastair D'Silva [Mon, 4 Nov 2019 02:32:58 +0000 (13:32 +1100)]
powerpc: Don't flush caches when adding memory
This operation takes a significant amount of time when hotplugging
large amounts of memory (~50 seconds with 890GB of persistent memory).
This was orignally in commit
fb5924fddf9e
("powerpc/mm: Flush cache on memory hot(un)plug") to support memtrace,
but the flush on add is not needed as it is flushed on remove.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191104023305.9581-7-alastair@au1.ibm.com
Alastair D'Silva [Mon, 4 Nov 2019 02:32:57 +0000 (13:32 +1100)]
powerpc: Chunk calls to flush_dcache_range in arch_*_memory
When presented with large amounts of memory being hotplugged
(in my test case, ~890GB), the call to flush_dcache_range takes
a while (~50 seconds), triggering RCU stalls.
This patch breaks up the call into 1GB chunks, calling
cond_resched() inbetween to allow the scheduler to run.
Fixes: fb5924fddf9e ("powerpc/mm: Flush cache on memory hot(un)plug")
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191104023305.9581-6-alastair@au1.ibm.com
Alastair D'Silva [Mon, 4 Nov 2019 02:32:56 +0000 (13:32 +1100)]
powerpc: Convert flush_icache_range & friends to C
Similar to commit
22e9c88d486a
("powerpc/64: reuse PPC32 static inline flush_dcache_range()")
this patch converts the following ASM symbols to C:
flush_icache_range()
__flush_dcache_icache()
__flush_dcache_icache_phys()
This was done as we discovered a long-standing bug where the length of the
range was truncated due to using a 32 bit shift instead of a 64 bit one.
By converting these functions to C, it becomes easier to maintain.
flush_dcache_icache_phys() retains a critical assembler section as we must
ensure there are no memory accesses while the data MMU is disabled
(authored by Christophe Leroy). Since this has no external callers, it has
also been made static, allowing the compiler to inline it within
flush_dcache_icache_page().
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[mpe: Minor fixups, don't export __flush_dcache_icache()]
Link: https://lore.kernel.org/r/20191104023305.9581-5-alastair@au1.ibm.com
Alastair D'Silva [Mon, 4 Nov 2019 02:32:55 +0000 (13:32 +1100)]
powerpc: define helpers to get L1 icache sizes
This patch adds helpers to retrieve icache sizes, and renames the existing
helpers to make it clear that they are for dcache.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191104023305.9581-4-alastair@au1.ibm.com
Alastair D'Silva [Mon, 4 Nov 2019 02:32:54 +0000 (13:32 +1100)]
powerpc: Allow 64bit VDSO __kernel_sync_dicache to work across ranges >4GB
When calling __kernel_sync_dicache with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.
This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191104023305.9581-3-alastair@au1.ibm.com
Alastair D'Silva [Mon, 4 Nov 2019 02:32:53 +0000 (13:32 +1100)]
powerpc: Allow flush_icache_range to work across ranges >4GB
When calling flush_icache_range with a size >4GB, we were masking
off the upper 32 bits, so we would incorrectly flush a range smaller
than intended.
This patch replaces the 32 bit shifts with 64 bit ones, so that
the full size is accounted for.
Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191104023305.9581-2-alastair@au1.ibm.com
Chris Packham [Thu, 1 Aug 2019 22:50:06 +0000 (10:50 +1200)]
powerpc: Support CMDLINE_EXTEND
Bring powerpc in line with other architectures that support extending or
overriding the bootloader provided command line.
The current behaviour is most like CMDLINE_FROM_BOOTLOADER where the
bootloader command line is preferred but the kernel config can provide a
fallback so CMDLINE_FROM_BOOTLOADER is the default. CMDLINE_EXTEND can
be used to append the CMDLINE from the kernel config to the one provided
by the bootloader.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190801225006.21952-1-chris.packham@alliedtelesis.co.nz
Michael Ellerman [Wed, 6 Nov 2019 02:30:25 +0000 (13:30 +1100)]
powerpc/64s: Always disable branch profiling for prom_init.o
Otherwise the build fails because prom_init is calling symbols it's
not allowed to, eg:
Error: External symbol 'ftrace_likely_update' referenced from prom_init.c
make[3]: *** [arch/powerpc/kernel/Makefile:197: arch/powerpc/kernel/prom_init_check] Error 1
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191106051129.7626-1-mpe@ellerman.id.au
Rasmus Villemoes [Fri, 2 Nov 2018 21:17:06 +0000 (22:17 +0100)]
macintosh: ans-lcd: make anslcd_logo static and __initconst
This variable has no reason to have external linkage, and since it is
only used in an __init function, it might as well be made __initconst
also.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20181102211707.10229-1-linux@rasmusvillemoes.dk
Geert Uytterhoeven [Mon, 21 Oct 2019 14:23:09 +0000 (16:23 +0200)]
powerpc/security: Fix debugfs data leak on 32-bit
"powerpc_security_features" is "unsigned long", i.e. 32-bit or 64-bit,
depending on the platform (PPC_FSL_BOOK3E or PPC_BOOK3S_64). Hence
casting its address to "u64 *", and calling debugfs_create_x64() is
wrong, and leaks 32-bit of nearby data to userspace on 32-bit platforms.
While all currently defined SEC_FTR_* security feature flags fit in
32-bit, they all have "ULL" suffixes to make them 64-bit constants.
Hence fix the leak by changing the type of "powerpc_security_features"
(and the parameter types of its accessors) to "u64". This also allows
to drop the cast.
Fixes: 398af571128fe75f ("powerpc/security: Show powerpc_security_features in debugfs")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191021142309.28105-1-geert+renesas@glider.be
Aneesh Kumar K.V [Tue, 1 Oct 2019 08:46:56 +0000 (14:16 +0530)]
powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning
With large memory (8TB and more) hotplug, we can get soft lockup
warnings as below. These were caused by a long loop without any
explicit cond_resched which is a problem for !PREEMPT kernels.
Avoid this using cond_resched() while inserting hash page table
entries. We already do similar cond_resched() in __add_pages(), see
commit
f64ac5e6e306 ("mm, memory_hotplug: add scheduling point to
__add_pages").
rcu: 3-....: (24002 ticks this GP) idle=13e/1/0x4000000000000002 softirq=722/722 fqs=12001
(t=24003 jiffies g=4285 q=2002)
NMI backtrace for cpu 3
CPU: 3 PID: 3870 Comm: ndctl Not tainted 5.3.0-197.18-default+ #2
Call Trace:
dump_stack+0xb0/0xf4 (unreliable)
nmi_cpu_backtrace+0x124/0x130
nmi_trigger_cpumask_backtrace+0x1ac/0x1f0
arch_trigger_cpumask_backtrace+0x28/0x3c
rcu_dump_cpu_stacks+0xf8/0x154
rcu_sched_clock_irq+0x878/0xb40
update_process_times+0x48/0x90
tick_sched_handle.isra.16+0x4c/0x80
tick_sched_timer+0x68/0xe0
__hrtimer_run_queues+0x180/0x430
hrtimer_interrupt+0x110/0x300
timer_interrupt+0x108/0x2f0
decrementer_common+0x114/0x120
--- interrupt: 901 at arch_add_memory+0xc0/0x130
LR = arch_add_memory+0x74/0x130
memremap_pages+0x494/0x650
devm_memremap_pages+0x3c/0xa0
pmem_attach_disk+0x188/0x750
nvdimm_bus_probe+0xac/0x2c0
really_probe+0x148/0x570
driver_probe_device+0x19c/0x1d0
device_driver_attach+0xcc/0x100
bind_store+0x134/0x1c0
drv_attr_store+0x44/0x60
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1a0/0x270
__vfs_write+0x3c/0x70
vfs_write+0xd0/0x260
ksys_write+0xdc/0x130
system_call+0x5c/0x68
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191001084656.31277-1-aneesh.kumar@linux.ibm.com
Aneesh Kumar K.V [Thu, 24 Oct 2019 07:58:01 +0000 (13:28 +0530)]
powerpc/mm/book3s64/radix: Flush the full mm even when need_flush_all is set
With the previous patch, we should now not be using need_flush_all for
powerpc. But then make sure we force a PID tlbie flush with RIC=2 if
we ever find need_flush_all set. Also don't reset it after a mmu
gather flush.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024075801.22434-3-aneesh.kumar@linux.ibm.com
Aneesh Kumar K.V [Thu, 24 Oct 2019 07:58:00 +0000 (13:28 +0530)]
powerpc/mm/book3s64/radix: Use freed_tables instead of need_flush_all
With commit
22a61c3c4f13 ("asm-generic/tlb: Track freeing of
page-table directories in struct mmu_gather") we now track whether we
freed page table in mmu_gather. Use that to decide whether to flush
Page Walk Cache.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024075801.22434-2-aneesh.kumar@linux.ibm.com
Aneesh Kumar K.V [Thu, 24 Oct 2019 07:57:59 +0000 (13:27 +0530)]
powerpc/mm/book3s64/radix: Remove unused code.
mm_tlb_flush_nested change was added in the mmu gather tlb flush to
handle the case of parallel pte invalidate happening with mmap_sem
held in read mode. This fix was done by commit
02390f66bd23 ("powerpc/64s/radix: Fix MADV_[FREE|DONTNEED] TLB flush
miss problem with THP") and the problem is explained in detail in
commit
99baac21e458 ("mm: fix MADV_[FREE|DONTNEED] TLB flush miss
problem")
This was later updated by commit
7a30df49f63a ("mm: mmu_gather: remove
__tlb_reset_range() for force flush") to do a full mm flush rather
than a range flush. By commit
dd2283f2605e ("mm: mmap: zap pages with
read mmap_sem in munmap") we are also now allowing a page table free
in mmap_sem read mode which means we should do a PWC flush too. Our
current full mm flush imply a PWC flush.
With all the above change the mm_tlb_flush_nested(mm) branch in
radix__tlb_flush will never be taken because for the nested case we
would have taken the if (tlb->fullmm) branch. This patch removes the
unused code. Also, remove the gflush change in
__radix__flush_tlb_range that was added to handle the range tlb flush
code. We only check for THP there because hugetlb is flushed via a
different code path where page size is explicitly specified.
This is a partial revert of commit
02390f66bd23 ("powerpc/64s/radix:
Fix MADV_[FREE|DONTNEED] TLB flush miss problem with THP")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024075801.22434-1-aneesh.kumar@linux.ibm.com
Anthony Steinhauser [Tue, 29 Oct 2019 19:07:59 +0000 (12:07 -0700)]
powerpc/security/book3s64: Report L1TF status in sysfs
Some PowerPC CPUs are vulnerable to L1TF to the same extent as to
Meltdown. It is also mitigated by flushing the L1D on privilege
transition.
Currently the sysfs gives a false negative on L1TF on CPUs that I
verified to be vulnerable, a Power9 Talos II Boston 004e 1202, PowerNV
T2P9D01.
Signed-off-by: Anthony Steinhauser <asteinhauser@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[mpe: Just have cpu_show_l1tf() call cpu_show_meltdown() directly]
Link: https://lore.kernel.org/r/20191029190759.84821-1-asteinhauser@google.com
Nathan Lynch [Wed, 16 Oct 2019 18:36:11 +0000 (13:36 -0500)]
powerpc/pseries: safely roll back failed DLPAR cpu add
dlpar_online_cpu() attempts to online all threads of a core that has
been added to an LPAR. If onlining a non-primary thread
fails (e.g. due to an allocation failure), the core is left with at
least one thread online. dlpar_cpu_add() attempts to roll back the
whole operation, releasing the core back to the platform. However,
since some threads of the core being removed are still online, the
BUG_ON(cpu_online(cpu)) in pseries_remove_processor() strikes:
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 3 PID: 8587 Comm: drmgr Not tainted
5.3.0-rc2-00190-g9b123d1ea237-dirty #46
NIP:
c0000000000eeb2c LR:
c0000000000eeac4 CTR:
c0000000000ee9e0
REGS:
c0000001f745b6c0 TRAP: 0700 Not tainted (
5.3.0-rc2-00190-g9b123d1ea237-dirty)
MSR:
800000010282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR:
44002448 XER:
00000000
CFAR:
c00000000195d718 IRQMASK: 0
GPR00:
c0000000000eeac4 c0000001f745b950 c0000000032f6200 0000000000000008
GPR04:
0000000000000008 c000000003349c78 0000000000000040 00000000000001ff
GPR08:
0000000000000008 0000000000000000 0000000000000001 0007ffffffffffff
GPR12:
0000000084002844 c00000001ecacb80 0000000000000000 0000000000000000
GPR16:
0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20:
0000000000000000 0000000000000000 0000000000000000 0000000000000008
GPR24:
c000000003349ee0 c00000000334a2e4 c0000000fca4d7a8 c000000001d20048
GPR28:
0000000000000001 ffffffffffffffff ffffffffffffffff c0000000fca4d7c4
NIP [
c0000000000eeb2c] pseries_smp_notifier+0x14c/0x2e0
LR [
c0000000000eeac4] pseries_smp_notifier+0xe4/0x2e0
Call Trace:
[
c0000001f745b950] [
c0000000000eeac4] pseries_smp_notifier+0xe4/0x2e0 (unreliable)
[
c0000001f745ba10] [
c0000000001ac774] notifier_call_chain+0xb4/0x190
[
c0000001f745bab0] [
c0000000001ad62c] blocking_notifier_call_chain+0x7c/0xb0
[
c0000001f745baf0] [
c00000000167bda0] of_detach_node+0xc0/0x110
[
c0000001f745bb50] [
c0000000000e7ae4] dlpar_detach_node+0x64/0xa0
[
c0000001f745bb80] [
c0000000000edefc] dlpar_cpu_add+0x31c/0x360
[
c0000001f745bc10] [
c0000000000ee980] dlpar_cpu_probe+0x50/0xb0
[
c0000001f745bc50] [
c00000000002cf70] arch_cpu_probe+0x40/0x70
[
c0000001f745bc70] [
c000000000ccd808] cpu_probe_store+0x48/0x80
[
c0000001f745bcb0] [
c000000000cbcef8] dev_attr_store+0x38/0x60
[
c0000001f745bcd0] [
c00000000059c980] sysfs_kf_write+0x70/0xb0
[
c0000001f745bd10] [
c00000000059afb8] kernfs_fop_write+0xf8/0x280
[
c0000001f745bd60] [
c0000000004b437c] __vfs_write+0x3c/0x70
[
c0000001f745bd80] [
c0000000004b8710] vfs_write+0xd0/0x220
[
c0000001f745bdd0] [
c0000000004b8acc] ksys_write+0x7c/0x140
[
c0000001f745be20] [
c00000000000bbd8] system_call+0x5c/0x68
Move dlpar_offline_cpu() up in the file so that dlpar_online_cpu() can
use it to re-offline any threads that have been onlined when an error
is encountered.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: e666ae0b10aa ("powerpc/pseries: Update CPU hotplug error recovery")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191016183611.10867-3-nathanl@linux.ibm.com
Nathan Lynch [Wed, 16 Oct 2019 18:36:10 +0000 (13:36 -0500)]
powerpc/pseries: address checkpatch warnings in dlpar_offline_cpu
Remove some stray blank lines, convert a printk to pr_warn, and
address a line length violation.
One functional change: use WARN_ON instead of BUG_ON in case H_PROD of
a ceded thread yields an unexpected result from the platform. We can
expect this code path to get uninterruptibly stuck in __cpu_die() if
this happens, but that's more desirable than crashing.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Fixes: b6db63d1a7f0 ("pseries/pseries: Add code to online/offline CPUs of a DLPAR node")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191016183611.10867-2-nathanl@linux.ibm.com
Michael Ellerman [Mon, 4 Nov 2019 23:15:56 +0000 (10:15 +1100)]
selftests/powerpc: Skip tm-signal-sigreturn-nt if TM not available
On systems where TM (Transactional Memory) is disabled the
tm-signal-sigreturn-nt test causes a SIGILL:
test: tm_signal_sigreturn_nt
tags: git_version:
7c202575ef63
!! child died by signal 4
failure: tm_signal_sigreturn_nt
We should skip the test if TM is not available.
Fixes: 34642d70ac7e ("selftests/powerpc: Add checks for transactional sigreturn")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191104233524.24348-1-mpe@ellerman.id.au
Michael Ellerman [Mon, 4 Nov 2019 10:01:59 +0000 (21:01 +1100)]
Merge branch 'fixes' into next
Merge our fixes branch, primarily to bring in the powernv CPU hotplug
warning fix.
Michael Ellerman [Thu, 24 Oct 2019 00:47:30 +0000 (11:47 +1100)]
powerpc/tools: Don't quote $objdump in scripts
Some of our scripts are passed $objdump and then call it as
"$objdump". This doesn't work if it contains spaces because we're
using ccache, for example you get errors such as:
./arch/powerpc/tools/relocs_check.sh: line 48: ccache ppc64le-objdump: No such file or directory
./arch/powerpc/tools/unrel_branch_check.sh: line 26: ccache ppc64le-objdump: No such file or directory
Fix it by not quoting the string when we expand it, allowing the shell
to do the right thing for us.
Fixes: a71aa05e1416 ("powerpc: Convert relocs_check to a shell script using grep")
Fixes: 4ea80652dc75 ("powerpc/64s: Tool to flag direct branches from unrelocated interrupt vectors")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024004730.32135-1-mpe@ellerman.id.au
Michael Ellerman [Wed, 30 Oct 2019 11:12:31 +0000 (22:12 +1100)]
powerpc: Add build-time check of ptrace PT_xx defines
As part of the uapi we export a lot of PT_xx defines for each register
in struct pt_regs. These are expressed as an index from gpr[0], in
units of unsigned long.
Currently there's nothing tying the values of those defines to the
actual layout of the struct.
But we *don't* want to change the uapi defines to derive the PT_xx
values based on the layout of the struct, those values are ABI and
must never change.
Instead we want to do the reverse, make sure that the layout of the
struct never changes vs the PT_xx defines. So add build time checks of
that.
This probably seems paranoid, but at least once in the past someone
has sent a patch that would have broken the ABI if it hadn't been
spotted. Although it probably would have been detected via testing,
it's preferable to just quash any issues at the source.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191030111231.22720-1-mpe@ellerman.id.au
Mathieu Malaterre [Sat, 8 Dec 2018 15:46:23 +0000 (16:46 +0100)]
powerpc/ptrace: Add prototype for function pt_regs_check
`pt_regs_check` is a dummy function, its purpose is to break the build
if struct pt_regs and struct user_pt_regs don't match.
This function has no functionnal purpose, and will get eliminated at
link time or after init depending on CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
This commit adds a prototype to fix warning at W=1:
arch/powerpc/kernel/ptrace.c:3339:13: error: no previous prototype for ‘pt_regs_check’ [-Werror=missing-prototypes]
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20181208154624.6504-1-malat@debian.org
Michael Ellerman [Mon, 20 May 2019 10:55:20 +0000 (20:55 +1000)]
selftests/powerpc: Add a test of spectre_v2 mitigations
This test uses the PMU to count branch prediction hits/misses for a
known loop, and compare the result to the reported spectre v2
mitigation.
This gives us a way of sanity checking that the reported mitigation is
actually in effect.
Sample output for some cases, eg:
Power9:
sysfs reports: 'Vulnerable'
PM_BR_PRED_CCACHE: result 368 running/enabled
5792777124
PM_BR_MPRED_CCACHE: result 319 running/enabled
5792775546
PM_BR_PRED_PCACHE: result
2147483281 running/enabled
5792773128
PM_BR_MPRED_PCACHE: result
213604201 running/enabled
5792771640
Miss percent 9 %
OK - Measured branch prediction rates match reported spectre v2 mitigation.
sysfs reports: 'Mitigation: Indirect branch serialisation (kernel only)'
PM_BR_PRED_CCACHE: result 895 running/enabled
5780320920
PM_BR_MPRED_CCACHE: result 822 running/enabled
5780312414
PM_BR_PRED_PCACHE: result
2147482754 running/enabled
5780308836
PM_BR_MPRED_PCACHE: result
213639731 running/enabled
5780307912
Miss percent 9 %
OK - Measured branch prediction rates match reported spectre v2 mitigation.
sysfs reports: 'Mitigation: Indirect branch cache disabled'
PM_BR_PRED_CCACHE: result
2147483649 running/enabled
20540186160
PM_BR_MPRED_CCACHE: result
2147483649 running/enabled
20540180056
PM_BR_PRED_PCACHE: result 0 running/enabled
20540176090
PM_BR_MPRED_PCACHE: result 0 running/enabled
20540174182
Miss percent 100 %
OK - Measured branch prediction rates match reported spectre v2 mitigation.
Power8:
sysfs reports: 'Vulnerable'
PM_BR_PRED_CCACHE: result
2147483649 running/enabled
3505888142
PM_BR_MPRED_CCACHE: result 9 running/enabled
3505882788
Miss percent 0 %
OK - Measured branch prediction rates match reported spectre v2 mitigation.
sysfs reports: 'Mitigation: Indirect branch cache disabled'
PM_BR_PRED_CCACHE: result
2147483649 running/enabled
16931421988
PM_BR_MPRED_CCACHE: result
2147483649 running/enabled
16931416478
Miss percent 100 %
OK - Measured branch prediction rates match reported spectre v2 mitigation.
success: spectre_v2
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190520105520.22274-1-mpe@ellerman.id.au
Nicholas Piggin [Tue, 22 Oct 2019 11:58:14 +0000 (21:58 +1000)]
powerpc/powernv: Fix CPU idle to be called with IRQs disabled
Commit
e78a7614f3876 ("idle: Prevent late-arriving interrupts from
disrupting offline") changes arch_cpu_idle_dead to be called with
interrupts disabled, which triggers the WARN in pnv_smp_cpu_kill_self.
Fix this by fixing up irq_happened after hard disabling, rather than
requiring there are no pending interrupts, similarly to what was done
done until commit
2525db04d1cc5 ("powerpc/powernv: Simplify lazy IRQ
handling in CPU offline").
Fixes: e78a7614f3876 ("idle: Prevent late-arriving interrupts from disrupting offline")
Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add unexpected_mask rather than checking for known bad values,
change the WARN_ON() to a WARN_ON_ONCE()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191022115814.22456-1-npiggin@gmail.com
Michael Ellerman [Mon, 14 Oct 2019 02:30:43 +0000 (13:30 +1100)]
selftests/powerpc: Fixup clobbers for TM tests
Some of our TM (Transactional Memory) tests, list "r1" (the stack
pointer) as a clobbered register.
GCC >= 9 doesn't accept this, and the build breaks:
ptrace-tm-spd-tar.c: In function 'tm_spd_tar':
ptrace-tm-spd-tar.c:31:2: error: listing the stack pointer register 'r1' in a clobber list is deprecated [-Werror=deprecated]
31 | asm __volatile__(
| ^~~
ptrace-tm-spd-tar.c:31:2: note: the value of the stack pointer after an 'asm' statement must be the same as it was before the statement
We do have some fairly large inline asm blocks in these tests, and
some of them do change the value of r1. However they should all return
to C with the value in r1 restored, so I think it's legitimate to say
r1 is not clobbered.
As Segher points out, the r1 clobbers may have been added because of
the use of `or 1,1,1`, however that doesn't actually clobber r1.
Segher also points out that some of these tests do clobber LR, because
they call functions, and that is not listed in the clobbers, so add
that where appropriate.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191029095324.14669-1-mpe@ellerman.id.au
Thiago Jung Bauermann [Wed, 11 Sep 2019 16:34:33 +0000 (13:34 -0300)]
powerpc/prom_init: Undo relocation before entering secure mode
The ultravisor will do an integrity check of the kernel image but we
relocated it so the check will fail. Restore the original image by
relocating it back to the kernel virtual base address.
This works because during build vmlinux is linked with an expected
virtual runtime address of KERNELBASE.
Fixes: 6a9c930bd775 ("powerpc/prom_init: Add the ESM call to prom_init")
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Tested-by: Michael Anderson <andmike@linux.ibm.com>
[mpe: Add IS_ENABLED() to fix the CONFIG_RELOCATABLE=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190911163433.12822-1-bauerman@linux.ibm.com
Aneesh Kumar K.V [Thu, 24 Oct 2019 09:35:42 +0000 (15:05 +0530)]
powerpc/book3s64/hash: Use secondary hash for bolted mapping if the primary is full
With bolted hash page table entry, kernel currently only use primary hash group
when inserting the hash page table entry. In the rare case where kernel find all the
8 primary hash slot occupied by bolted entries, this can result in hash page
table insert failure for bolted entries. Avoid this by using the secondary hash
group.
This is different from what kernel does for the non-bolted mapping. With
non-bolted entries kernel will try secondary before removing an existing entry
from hash page table group. With bolted prefer primary hash group and hence
try to insert the page table entry by removing a slot from primary before trying
the secondary hash group.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024093542.29777-3-aneesh.kumar@linux.ibm.com
Aneesh Kumar K.V [Thu, 24 Oct 2019 09:35:41 +0000 (15:05 +0530)]
powerpc/pseries: Don't fail hash page table insert for bolted mapping
If the hypervisor returned H_PTEG_FULL for H_ENTER hcall, retry a hash page table
insert by removing a random entry from the group.
After some runtime, it is very well possible to find all the 8 hash page table
entry slot in the hpte group used for mapping. Don't fail a bolted entry insert
in that case. With Storage class memory a user can find this error easily since
a namespace enable/disable is equivalent to memory add/remove.
This results in failures as reported below:
$ ndctl create-namespace -r region1 -t pmem -m devdax -a 65536 -s 100M
libndctl: ndctl_dax_enable: dax1.3: failed to enable
Error: namespace1.2: failed to enable
failed to create namespace: No such device or address
In kernel log we find the details as below:
Unable to create mapping for hot added memory 0xc000042006000000..0xc00004200d000000: -1
dax_pmem: probe of dax1.3 failed with error -14
This indicates that we failed to create a bolted hash table entry for direct-map
address backing the namespace.
We also observe failures such that not all namespaces will be enabled with
ndctl enable-namespace all command.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024093542.29777-2-aneesh.kumar@linux.ibm.com
Aneesh Kumar K.V [Thu, 24 Oct 2019 09:35:40 +0000 (15:05 +0530)]
powerpc/pseries: Don't opencode HPTE_V_BOLTED
No functional change in this patch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024093542.29777-1-aneesh.kumar@linux.ibm.com
Michael Ellerman [Sun, 13 Oct 2019 10:23:51 +0000 (21:23 +1100)]
powerpc/pseries: Mark accumulate_stolen_time() as notrace
accumulate_stolen_time() is called prior to interrupt state being
reconciled, which can trip the warning in arch_local_irq_restore():
WARNING: CPU: 5 PID: 1017 at arch/powerpc/kernel/irq.c:258 .arch_local_irq_restore+0x9c/0x130
...
NIP .arch_local_irq_restore+0x9c/0x130
LR .rb_start_commit+0x38/0x80
Call Trace:
.ring_buffer_lock_reserve+0xe4/0x620
.trace_function+0x44/0x210
.function_trace_call+0x148/0x170
.ftrace_ops_no_ops+0x180/0x1d0
ftrace_call+0x4/0x8
.accumulate_stolen_time+0x1c/0xb0
decrementer_common+0x124/0x160
For now just mark it as notrace. We may change the ordering to call it
after interrupt state has been reconciled, but that is a larger
change.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191024055932.27940-1-mpe@ellerman.id.au
Michael Ellerman [Tue, 28 May 2019 08:16:14 +0000 (18:16 +1000)]
powerpc/configs: Rename foo_basic_defconfig to foo_base.config
We have several "defconfigs" that are not actually full defconfigs
they are just a base set of options which are then merged with other
fragments to produce a working defconfig.
The most obvious example is corenet_basic_defconfig which only
contains one symbol CONFIG_CORENET_GENERIC=y. And in fact if you build
it as a "defconfig" that one symbol ends up undefined, because its
prerequisites are missing.
There is also mpc85xx_base_defconfig which doesn't actually enable
CONFIG_PPC_85xx.
To avoid confusion, rename these config fragments to "foo_base.config"
to make it clearer that they are not full defconfigs and are instaed
just fragments that are used to generate real defconfigs.
Reported-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190528081614.26096-1-mpe@ellerman.id.au
Andrew Donnellan [Thu, 1 Aug 2019 04:58:55 +0000 (14:58 +1000)]
powerpc/configs: Add debug config fragment
Add a debug config fragment that we can use to put useful debug
options into.
It can be used like:
# make foo_defconfig
# make debug.config
Currently the only option included is to enable debugfs SCOM access.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
[mpe: Drop the special targets, just use the fragment directly]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190801045855.5822-1-ajd@linux.ibm.com
Aneesh Kumar K.V [Tue, 17 Sep 2019 12:38:51 +0000 (18:08 +0530)]
powerpc/nvdimm: Update vmemmap_populated to check sub-section range
With commit:
7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap")
pmem namespaces are remapped in 2M chunks. On architectures like ppc64 we
can map the memmap area using 16MB hugepage size and that can cover
a memory range of 16G.
While enabling new pmem namespaces, since memory is added in sub-section chunks,
before creating a new memmap mapping, kernel should check whether there is an
existing memmap mapping covering the new pmem namespace. Currently, this is
validated by checking whether the section covering the range is already
initialized or not. Considering there can be multiple namespaces in the same
section this can result in wrong validation. Update this to check for
sub-sections in the range. This is done by checking for all pfns in the range we
are mapping.
We could optimize this by checking only just one pfn in each sub-section. But
since this is not fast-path we keep this simple.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190917123851.22553-1-aneesh.kumar@linux.ibm.com
Christopher M. Riedl [Sat, 7 Sep 2019 06:11:24 +0000 (01:11 -0500)]
powerpc/xmon: Restrict when kernel is locked down
Xmon should be either fully or partially disabled depending on the
kernel lockdown state.
Put xmon into read-only mode for lockdown=integrity and prevent user
entry into xmon when lockdown=confidentiality. Xmon checks the lockdown
state on every attempted entry:
(1) during early xmon'ing
(2) when triggered via sysrq
(3) when toggled via debugfs
(4) when triggered via a previously enabled breakpoint
The following lockdown state transitions are handled:
(1) lockdown=none -> lockdown=integrity
set xmon read-only mode
(2) lockdown=none -> lockdown=confidentiality
clear all breakpoints, set xmon read-only mode,
prevent user re-entry into xmon
(3) lockdown=integrity -> lockdown=confidentiality
clear all breakpoints, set xmon read-only mode,
prevent user re-entry into xmon
Suggested-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190907061124.1947-3-cmr@informatik.wtf
Christopher M. Riedl [Sat, 7 Sep 2019 06:11:23 +0000 (01:11 -0500)]
powerpc/xmon: Allow listing and clearing breakpoints in read-only mode
Read-only mode should not prevent listing and clearing any active
breakpoints.
Tested-by: Daniel Axtens <dja@axtens.net>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190907061124.1947-2-cmr@informatik.wtf
Frederic Barrat [Wed, 16 Oct 2019 16:28:33 +0000 (18:28 +0200)]
powerpc/powernv/eeh: Fix oops when probing cxl devices
Recent cleanup in the way EEH support is added to a device causes a
kernel oops when the cxl driver probes a device and creates virtual
devices discovered on the FPGA:
BUG: Kernel NULL pointer dereference at 0x000000a0
Faulting instruction address: 0xc000000000048070
Oops: Kernel access of bad area, sig: 7 [#1]
...
NIP eeh_add_device_late.part.9+0x50/0x1e0
LR eeh_add_device_late.part.9+0x3c/0x1e0
Call Trace:
_dev_info+0x5c/0x6c (unreliable)
pnv_pcibios_bus_add_device+0x60/0xb0
pcibios_bus_add_device+0x40/0x60
pci_bus_add_device+0x30/0x100
pci_bus_add_devices+0x64/0xd0
cxl_pci_vphb_add+0xe0/0x130 [cxl]
cxl_probe+0x504/0x5b0 [cxl]
local_pci_probe+0x6c/0x110
work_for_cpu_fn+0x38/0x60
The root cause is that those cxl virtual devices don't have a
representation in the device tree and therefore no associated pci_dn
structure. In eeh_add_device_late(), pdn is NULL, so edev is NULL and
we oops.
We never had explicit support for EEH for those virtual devices.
Instead, EEH events are reported to the (real) pci device and handled
by the cxl driver. Which can then forward to the virtual devices and
handle dependencies. The fact that we try adding EEH support for the
virtual devices is new and a side-effect of the recent cleanup.
This patch fixes it by skipping adding EEH support on powernv for
devices which don't have a pci_dn structure.
The cxl driver doesn't create virtual devices on pseries so this patch
doesn't fix it there intentionally.
Fixes: b905f8cdca77 ("powerpc/eeh: EEH for pSeries hot plug")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191016162833.22509-1-fbarrat@linux.ibm.com
Michael Ellerman [Sun, 13 Oct 2019 23:26:34 +0000 (10:26 +1100)]
selftests/powerpc: Reduce sigfuz runtime to ~60s
The defaults for the sigfuz test is to run for 4000 iterations, but
that can take quite a while and the test harness may kill the test.
Reduce the number of iterations to 600, which gives a runtime of
roughly 1 minute on a Power8 system.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191013234643.3430-1-mpe@ellerman.id.au
Christophe Leroy [Mon, 14 Oct 2019 16:51:28 +0000 (16:51 +0000)]
powerpc/32s: fix allow/prevent_user_access() when crossing segment boundaries.
Make sure starting addr is aligned to segment boundary so that when
incrementing the segment, the starting address of the new segment is
below the end address. Otherwise the last segment might get missed.
Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/067a1b09f15f421d40797c2d04c22d4049a1cee8.1571071875.git.christophe.leroy@c-s.fr
Michael Ellerman [Sun, 13 Oct 2019 10:21:06 +0000 (21:21 +1100)]
Merge branch 'fixes' into next
Merge our fixes branch, to bring in the fixes for the KVM PCR bug and
the spufs crash.
Deb McLemore [Mon, 21 May 2018 02:04:38 +0000 (21:04 -0500)]
powerpc/powernv: Add queue mechanism for early messages
When issuing a BMC soft poweroff during IPL, the poweroff can be lost
so the machine would not poweroff.
This is because opal messages can be received before the opal-power
code registered its notifiers.
Fix it by buffering messages. If we receive a message and do not yet
have a handler for that type, store the message and replay when a
handler for that type is registered.
Signed-off-by: Deb McLemore <debmc@linux.vnet.ibm.com>
[mpe: Single unlock path in opal_message_notifier_register(), tweak
comments/formatting and change log.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1526868278-4204-1-git-send-email-debmc@linux.vnet.ibm.com
Qian Cai [Tue, 17 Sep 2019 15:22:30 +0000 (11:22 -0400)]
powerpc/pkeys: remove unused pkey_allows_readwrite
pkey_allows_readwrite() was first introduced in the commit
5586cf61e108
("powerpc: introduce execute-only pkey"), but the usage was removed
entirely in the commit
a4fcc877d4e1 ("powerpc/pkeys: Preallocate
execute-only key").
Found by the "-Wunused-function" compiler warning flag.
Fixes: a4fcc877d4e1 ("powerpc/pkeys: Preallocate execute-only key")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1568733750-14580-1-git-send-email-cai@lca.pw
Qian Cai [Mon, 15 Jul 2019 18:32:32 +0000 (14:32 -0400)]
powerpc/setup_64: fix -Wempty-body warnings
At the beginning of setup_64.c, it has,
#ifdef DEBUG
#define DBG(fmt...) udbg_printf(fmt)
#else
#define DBG(fmt...)
#endif
where DBG() could be compiled away, and generate warnings,
arch/powerpc/kernel/setup_64.c: In function 'initialize_cache_info':
arch/powerpc/kernel/setup_64.c:579:49: warning: suggest braces around
empty body in an 'if' statement [-Wempty-body]
DBG("Argh, can't find dcache properties !\n");
^
arch/powerpc/kernel/setup_64.c:582:49: warning: suggest braces around
empty body in an 'if' statement [-Wempty-body]
DBG("Argh, can't find icache properties !\n");
Fix it by using the suggestions from Michael:
"Neither of those sites should use DBG(), that's not really early
boot code, they should just use pr_warn().
And the other uses of DBG() in initialize_cache_info() should just
be removed.
In smp_release_cpus() the entry/exit DBG's should just be removed,
and the spinning_secondaries line should just be pr_debug().
That would just leave the two calls in early_setup(). If we taught
udbg_printf() to return early when udbg_putc is NULL, then we could
just call udbg_printf() unconditionally and get rid of the DBG macro
entirely."
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Qian Cai <cai@lca.pw>
[mpe: Split udbg change out into previous patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1563215552-8166-1-git-send-email-cai@lca.pw
Michael Ellerman [Fri, 11 Oct 2019 08:30:39 +0000 (19:30 +1100)]
powerpc/udbg: Make it safe to call udbg_printf() always
Make udbg_printf() check if udbg_putc is set, and if not just return.
This makes it safe to call udbg_printf() anytime, even when a udbg
backend has not been registered, which means we can avoid some ifdefs
at call sites.
Signed-off-by: Qian Cai <cai@lca.pw>
[mpe: Split out of larger patch, write change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Hari Bathini [Wed, 9 Oct 2019 15:27:20 +0000 (20:57 +0530)]
powerpc: make syntax for FADump config options in kernel/Makefile readable
arch/powerpc/kernel/fadump.c file needs to be compiled in if 'config
FA_DUMP' or 'config PRESERVE_FA_DUMP' is set. The current syntax
achieves that but looks a bit odd. Fix it for better readability.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/157063484064.11906.3586824898111397624.stgit@hbathini.in.ibm.com
Hari Bathini [Wed, 9 Oct 2019 14:04:29 +0000 (19:34 +0530)]
powerpc/configs: add FADump awareness to skiroot_defconfig
FADump is supported on PowerNV platform. To fulfill this support, the
petitboot kernel must be FADump aware. Enable config PRESERVE_FA_DUMP
to make the petitboot kernel FADump aware.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/157062986936.23016.10146169203560084401.stgit@hbathini.in.ibm.com
Emmanuel Nicolet [Tue, 8 Oct 2019 14:13:42 +0000 (16:13 +0200)]
spufs: fix a crash in spufs_create_root()
The spu_fs_context was not set in fc->fs_private, this caused a crash
when accessing ctx->mode in spufs_create_root().
Fixes: d2e0981c3b9a ("vfs: Convert spufs to use the new mount API")
Signed-off-by: Emmanuel Nicolet <emmanuel.nicolet@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20191008141342.GA266797@gmail.com
Vaibhav Jain [Fri, 27 Sep 2019 06:20:02 +0000 (11:50 +0530)]
powerpc/papr_scm: Fix an off-by-one check in papr_scm_meta_{get, set}
A validation check to prevent out of bounds read/write inside
functions papr_scm_meta_{get,set}() is off-by-one that prevent reads
and writes to the last byte of the label area.
This bug manifests as a failure to probe a dimm when libnvdimm is
unable to read the entire config-area as advertised by
ND_CMD_GET_CONFIG_SIZE. This usually happens when there are large
number of namespaces created in the region backed by the dimm and the
label-index spans max possible config-area. An error of the form below
usually reported in the kernel logs:
[ 255.293912] nvdimm: probe of nmem0 failed with error -22
The patch fixes these validation checks there by letting libnvdimm
access the entire config-area.
Fixes: 53e80bd042773('powerpc/nvdimm: Add support for multibyte read/write for metadata')
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190927062002.3169-1-vaibhav@linux.ibm.com
Jordan Niethe [Fri, 4 Oct 2019 02:53:17 +0000 (12:53 +1000)]
powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
kvmhv_switch_to_host() in arch/powerpc/kvm/book3s_hv_rmhandlers.S
needs to set kvmppc_vcore->in_guest to 0 to signal secondary CPUs to
continue. This happens after resetting the PCR. Before commit
13c7bb3c57dc ("powerpc/64s: Set reserved PCR bits"), r0 would always
be 0 before it was stored to kvmppc_vcore->in_guest. However because
of this change in the commit:
/* Reset PCR */
ld r0, VCORE_PCR(r5)
- cmpdi r0, 0
+ LOAD_REG_IMMEDIATE(r6, PCR_MASK)
+ cmpld r0, r6
beq 18f
- li r0, 0
- mtspr SPRN_PCR, r0
+ mtspr SPRN_PCR, r6
18:
/* Signal secondary CPUs to continue */
stb r0,VCORE_IN_GUEST(r5)
We are no longer comparing r0 against 0 and loading it with 0 if it
contains something else. Hence when we store r0 to
kvmppc_vcore->in_guest, it might not be 0. This means that secondary
CPUs will not be signalled to continue. Those CPUs get stuck and
errors like the following are logged:
KVM: CPU 1 seems to be stuck
KVM: CPU 2 seems to be stuck
KVM: CPU 3 seems to be stuck
KVM: CPU 4 seems to be stuck
KVM: CPU 5 seems to be stuck
KVM: CPU 6 seems to be stuck
KVM: CPU 7 seems to be stuck
This can be reproduced with:
$ for i in `seq 1 7` ; do chcpu -d $i ; done ;
$ taskset -c 0 qemu-system-ppc64 -smp 8,threads=8 \
-M pseries,accel=kvm,kvm-type=HV -m 1G -nographic -vga none \
-kernel vmlinux -initrd initrd.cpio.xz
Fix by making sure r0 is 0 before storing it to
kvmppc_vcore->in_guest.
Fixes: 13c7bb3c57dc ("powerpc/64s: Set reserved PCR bits")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Alistair Popple <alistair@popple.id.au>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191004025317.19340-1-jniethe5@gmail.com
Desnes A. Nunes do Rosario [Thu, 3 Oct 2019 21:10:10 +0000 (18:10 -0300)]
selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
Newer versions of GCC (>= 9) demand that the size of the string to be
copied must be explicitly smaller than the size of the destination.
Thus, the NULL char has to be taken into account on strncpy.
This will avoid the following compiling error:
tlbie_test.c: In function 'main':
tlbie_test.c:639:4: error: 'strncpy' specified bound 100 equals destination size
strncpy(logdir, optarg, LOGDIR_NAME_SIZE);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191003211010.9711-1-desnesn@linux.ibm.com
Laurent Dufour [Tue, 1 Oct 2019 13:29:28 +0000 (15:29 +0200)]
powerpc/pseries: Remove confusing warning message.
Since commit
1211ee61b4a8 ("powerpc/pseries: Read TLB Block Invalidate
Characteristics"), a warning message is displayed when booting a guest
on top of KVM:
lpar: arch/powerpc/platforms/pseries/lpar.c pseries_lpar_read_hblkrm_characteristics Error calling get-system-parameter (0xfffffffd)
This message is displayed because this hypervisor is not supporting
the H_BLOCK_REMOVE hcall and thus is not exposing the corresponding
feature.
Reading the TLB Block Invalidate Characteristics should not be done if
the feature is not exposed.
Fixes: 1211ee61b4a8 ("powerpc/pseries: Read TLB Block Invalidate Characteristics")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191001132928.72555-1-ldufour@linux.ibm.com
Stephen Rothwell [Mon, 30 Sep 2019 00:13:42 +0000 (10:13 +1000)]
powerpc/64s/radix: Fix build failure with RADIX_MMU=n
After merging the powerpc tree, today's linux-next build (powerpc64
allnoconfig) failed like this:
arch/powerpc/mm/book3s64/pgtable.c:216:3:
error: implicit declaration of function 'radix__flush_all_lpid_guest'
radix__flush_all_lpid_guest() is only declared for
CONFIG_PPC_RADIX_MMU which is not set for this build.
Fix it by adding an empty version for the RADIX_MMU=n case, which
should never be called.
Fixes: 99161de3a283 ("powerpc/64s/radix: tidy up TLB flushing code")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
[mpe: Munge change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190930101342.36c1afa0@canb.auug.org.au
Linus Torvalds [Sun, 6 Oct 2019 21:27:30 +0000 (14:27 -0700)]
Linux 5.4-rc2
Linus Torvalds [Sun, 6 Oct 2019 20:53:27 +0000 (13:53 -0700)]
elf: don't use MAP_FIXED_NOREPLACE for elf executable mappings
In commit
4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") we
changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the
executable mappings.
Then, people reported that it broke some binaries that had overlapping
segments from the same file, and commit
ad55eac74f20 ("elf: enforce
MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some
overlaying elf segment cases. But only some - despite the summary line
of that commit, it only did it when it also does a temporary brk vma for
one obvious overlapping case.
Now Russell King reports another overlapping case with old 32-bit x86
binaries, which doesn't trigger that limited case. End result: we had
better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED.
Yes, it's a sign of old binaries generated with old tool-chains, but we
do pride ourselves on not breaking existing setups.
This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp()
and the old load_elf_library() use-cases, because nobody has reported
breakage for those. Yet.
Note that in all the cases seen so far, the overlapping elf sections
seem to be just re-mapping of the same executable with different section
attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE
flag or similar, which acts like NOREPLACE, but allows just remapping
the same executable file using different protection flags.
It's not clear that would make a huge difference to anything, but if
people really hate that "elf remaps over previous maps" behavior, maybe
at least a more limited form of remapping would alleviate some concerns.
Alternatively, we should take a look at our elf_map() logic to see if we
end up not mapping things properly the first time.
In the meantime, this is the minimal "don't do that then" patch while
people hopefully think about it more.
Reported-by: Russell King <linux@armlinux.org.uk>
Fixes: 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map")
Fixes: ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments")
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 6 Oct 2019 18:10:15 +0000 (11:10 -0700)]
Merge tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping regression fix from Christoph Hellwig:
"Revert an incorret hunk from a patch that caused problems on various
arm boards (Andrey Smirnov)"
* tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: fix false positive warnings in dma_common_free_remap()
Linus Torvalds [Sun, 6 Oct 2019 00:18:43 +0000 (17:18 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"A few fixes this time around:
- Fixup of some clock specifications for DRA7 (device-tree fix)
- Removal of some dead/legacy CPU OPP/PM code for OMAP that throws
warnings at boot
- A few more minor fixups for OMAPs, most around display
- Enable STM32 QSPI as =y since their rootfs sometimes comes from
there
- Switch CONFIG_REMOTEPROC to =y since it went from tristate to bool
- Fix of thermal zone definition for ux500 (5.4 regression)"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support
ARM: dts: ux500: Fix up the CPU thermal zone
arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y
ARM: dts: am4372: Set memory bandwidth limit for DISPC
ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage()
ARM: OMAP2+: Add missing LCDC midlemode for am335x
ARM: OMAP2+: Fix missing reset done flag for am3 and am43
ARM: dts: Fix gpio0 flags for am335x-icev2
ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules
ARM: omap2plus_defconfig: Enable DRM_TI_TFP410
DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again
ARM: dts: Fix wrong clocks for dra7 mcasp
clk: ti: dra7: Fix mcasp8 clock bits
Linus Torvalds [Sat, 5 Oct 2019 19:56:59 +0000 (12:56 -0700)]
Merge tag 'kbuild-fixes-v5.4' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- remove unneeded ar-option and KBUILD_ARFLAGS
- remove long-deprecated SUBDIRS
- fix modpost to suppress false-positive warnings for UML builds
- fix namespace.pl to handle relative paths to ${objtree}, ${srctree}
- make setlocalversion work for /bin/sh
- make header archive reproducible
- fix some Makefiles and documents
* tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kheaders: make headers archive reproducible
kbuild: update compile-test header list for v5.4-rc2
kbuild: two minor updates for Documentation/kbuild/modules.rst
scripts/setlocalversion: clear local variable to make it work for sh
namespace: fix namespace.pl script to support relative paths
video/logo: do not generate unneeded logo C files
video/logo: remove unneeded *.o pattern from clean-files
integrity: remove pointless subdir-$(CONFIG_...)
integrity: remove unneeded, broken attempt to add -fshort-wchar
modpost: fix static EXPORT_SYMBOL warnings for UML build
kbuild: correct formatting of header in kbuild module docs
kbuild: remove SUBDIRS support
kbuild: remove ar-option and KBUILD_ARFLAGS
Linus Torvalds [Sat, 5 Oct 2019 19:53:27 +0000 (12:53 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Twelve patches mostly small but obvious fixes or cosmetic but small
updates"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Fix Nport ID display value
scsi: qla2xxx: Fix N2N link up fail
scsi: qla2xxx: Fix N2N link reset
scsi: qla2xxx: Optimize NPIV tear down process
scsi: qla2xxx: Fix stale mem access on driver unload
scsi: qla2xxx: Fix unbound sleep in fcport delete path.
scsi: qla2xxx: Silence fwdump template message
scsi: hisi_sas: Make three functions static
scsi: megaraid: disable device when probe failed after enabled device
scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
scsi: qedf: Remove always false 'tmp_prio < 0' statement
scsi: ufs: skip shutdown if hba is not powered
scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF
Linus Torvalds [Sat, 5 Oct 2019 19:03:27 +0000 (12:03 -0700)]
Merge branch 'readdir' (readdir speedup and sanity checking)
This makes getdents() and getdents64() do sanity checking on the
pathname that it gives to user space. And to mitigate the performance
impact of that, it first cleans up the way it does the user copying, so
that the code avoids doing the SMAP/PAN updates between each part of the
dirent structure write.
I really wanted to do this during the merge window, but didn't have
time. The conversion of filldir to unsafe_put_user() is something I've
had around for years now in a private branch, but the extra pathname
checking finally made me clean it up to the point where it is mergable.
It's worth noting that the filename validity checking really should be a
bit smarter: it would be much better to delay the error reporting until
the end of the readdir, so that non-corrupted filenames are still
returned. But that involves bigger changes, so let's see if anybody
actually hits the corrupt directory entry case before worrying about it
further.
* branch 'readdir':
Make filldir[64]() verify the directory entry filename is valid
Convert filldir[64]() from __put_user() to unsafe_put_user()
Linus Torvalds [Sat, 5 Oct 2019 18:32:52 +0000 (11:32 -0700)]
Make filldir[64]() verify the directory entry filename is valid
This has been discussed several times, and now filesystem people are
talking about doing it individually at the filesystem layer, so head
that off at the pass and just do it in getdents{64}().
This is partially based on a patch by Jann Horn, but checks for NUL
bytes as well, and somewhat simplified.
There's also commentary about how it might be better if invalid names
due to filesystem corruption don't cause an immediate failure, but only
an error at the end of the readdir(), so that people can still see the
filenames that are ok.
There's also been discussion about just how much POSIX strictly speaking
requires this since it's about filesystem corruption. It's really more
"protect user space from bad behavior" as pointed out by Jann. But
since Eric Biederman looked up the POSIX wording, here it is for context:
"From readdir:
The readdir() function shall return a pointer to a structure
representing the directory entry at the current position in the
directory stream specified by the argument dirp, and position the
directory stream at the next entry. It shall return a null pointer
upon reaching the end of the directory stream. The structure dirent
defined in the <dirent.h> header describes a directory entry.
From definitions:
3.129 Directory Entry (or Link)
An object that associates a filename with a file. Several directory
entries can associate names with the same file.
...
3.169 Filename
A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
characters composing the name may be selected from the set of all
character values excluding the slash character and the null byte. The
filenames dot and dot-dot have special meaning. A filename is
sometimes referred to as a 'pathname component'."
Note that I didn't bother adding the checks to any legacy interfaces
that nobody uses.
Also note that if this ends up being noticeable as a performance
regression, we can fix that to do a much more optimized model that
checks for both NUL and '/' at the same time one word at a time.
We haven't really tended to optimize 'memchr()', and it only checks for
one pattern at a time anyway, and we really _should_ check for NUL too
(but see the comment about "soft errors" in the code about why it
currently only checks for '/')
See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
lookup code looks for pathname terminating characters in parallel.
Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 22 May 2016 04:59:07 +0000 (21:59 -0700)]
Convert filldir[64]() from __put_user() to unsafe_put_user()
We really should avoid the "__{get,put}_user()" functions entirely,
because they can easily be mis-used and the original intent of being
used for simple direct user accesses no longer holds in a post-SMAP/PAN
world.
Manually optimizing away the user access range check makes no sense any
more, when the range check is generally much cheaper than the "enable
user accesses" code that the __{get,put}_user() functions still need.
So instead of __put_user(), use the unsafe_put_user() interface with
user_access_{begin,end}() that really does generate better code these
days, and which is generally a nicer interface. Under some loads, the
multiple user writes that filldir() does are actually quite noticeable.
This also makes the dirent name copy use unsafe_put_user() with a couple
of macros. We do not want to make function calls with SMAP/PAN
disabled, and the code this generates is quite good when the
architecture uses "asm goto" for unsafe_put_user() like x86 does.
Note that this doesn't bother with the legacy cases. Nobody should use
them anyway, so performance doesn't really matter there.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 5 Oct 2019 15:50:15 +0000 (08:50 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Fix ieeeu02154 atusb driver use-after-free, from Johan Hovold.
2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric
Dumazet.
3) txq null deref in mac80211, from Miaoqing Pan.
4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann.
5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso.
6) Avoid division by zero in taprio scheduler, from Vladimir Oltean.
7) Various xgmac fixes in stmmac driver from Jose Abreu.
8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from
Michal Kubecek.
9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij.
10) Fix sleep while atomic in sja1105, from Vladimir Oltean.
11) Suspend/resume deadlock in stmmac, from Thierry Reding.
12) Various UDP GSO fixes from Josh Hunt.
13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric
Dumazet.
14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern.
15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
selftests/net: add nettest to .gitignore
net: qlogic: Fix memory leak in ql_alloc_large_buffers
nfc: fix memory leak in llcp_sock_bind()
sch_dsmark: fix potential NULL deref in dsmark_init()
net: phy: at803x: use operating parameters from PHY-specific status
net: phy: extract pause mode
net: phy: extract link partner advertisement reading
net: phy: fix write to mii-ctrl1000 register
ipv6: Handle missing host route in __ipv6_ifa_notify
net: phy: allow for reset line to be tied to a sleepy GPIO controller
net: ipv4: avoid mixed n_redirects and rate_tokens usage
r8152: Set macpassthru in reset_resume callback
cxgb4:Fix out-of-bounds MSI-X info array access
Revert "ipv6: Handle race in addrconf_dad_work"
net: make sock_prot_memory_pressure() return "const char *"
rxrpc: Fix rxrpc_recvmsg tracepoint
qmi_wwan: add support for Cinterion CLS8 devices
tcp: fix slab-out-of-bounds in tcp_zerocopy_receive()
lib: textsearch: fix escapes in example code
udp: only do GSO if # of segs > 1
...
Linus Torvalds [Sat, 5 Oct 2019 15:44:02 +0000 (08:44 -0700)]
Merge tag 's390-5.4-3' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- defconfig updates
- Fix build errors with CC_OPTIMIZE_FOR_SIZE due to usage of "i"
constraint for function arguments. Two kvm changes acked-by Christian
Borntraeger.
- Fix -Wunused-but-set-variable warnings in mm code.
- Avoid a constant misuse in qdio.
- Handle a case when cpumf is temporarily unavailable.
* tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
KVM: s390: mark __insn32_query() as __always_inline
KVM: s390: fix __insn32_query() inline assembly
s390: update defconfigs
s390/pci: mark function(s) __always_inline
s390/mm: mark function(s) __always_inline
s390/jump_label: mark function(s) __always_inline
s390/cpu_mf: mark function(s) __always_inline
s390/atomic,bitops: mark function(s) __always_inline
s390/mm: fix -Wunused-but-set-variable warnings
s390: mark __cpacf_query() as __always_inline
s390/qdio: clarify size of the QIB parm area
s390/cpumf: Fix indentation in sampling device driver
s390/cpumsf: Check for CPU Measurement sampling
s390/cpumf: Use consistant debug print format
Heiko Carstens [Wed, 2 Oct 2019 12:34:37 +0000 (14:34 +0200)]
KVM: s390: mark __insn32_query() as __always_inline
__insn32_query() will not compile if the compiler decides to not
inline it, since it contains an inline assembly with an "i" constraint
with variable contents.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Heiko Carstens [Wed, 2 Oct 2019 12:24:47 +0000 (14:24 +0200)]
KVM: s390: fix __insn32_query() inline assembly
The inline assembly constraints of __insn32_query() tell the compiler
that only the first byte of "query" is being written to. Intended was
probably that 32 bytes are written to.
Fix and simplify the code and just use a "memory" clobber.
Fixes: d668139718a9 ("KVM: s390: provide query function for instructions returning 32 byte")
Cc: stable@vger.kernel.org # v5.2+
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Andrey Smirnov [Sat, 5 Oct 2019 08:23:30 +0000 (10:23 +0200)]
dma-mapping: fix false positivse warnings in dma_common_free_remap()
Commit
5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages
helper") changed invalid input check in dma_common_free_remap() from:
if (!area || !area->flags != VM_DMA_COHERENT)
to
if (!area || !area->flags != VM_DMA_COHERENT || !area->pages)
which seem to produce false positives for memory obtained via
dma_common_contiguous_remap()
This triggers the following warning message when doing "reboot" on ZII
VF610 Dev Board Rev B:
WARNING: CPU: 0 PID: 1 at kernel/dma/remap.c:112 dma_common_free_remap+0x88/0x8c
trying to free invalid coherent area:
9ef82980
Modules linked in:
CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.3.0-rc6-next-
20190820 #119
Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
Backtrace:
[<
8010d1ec>] (dump_backtrace) from [<
8010d588>] (show_stack+0x20/0x24)
r7:
8015ed78 r6:
00000009 r5:
00000000 r4:
9f4d9b14
[<
8010d568>] (show_stack) from [<
8077e3f0>] (dump_stack+0x24/0x28)
[<
8077e3cc>] (dump_stack) from [<
801197a0>] (__warn.part.3+0xcc/0xe4)
[<
801196d4>] (__warn.part.3) from [<
80119830>] (warn_slowpath_fmt+0x78/0x94)
r6:
00000070 r5:
808e540c r4:
81c03048
[<
801197bc>] (warn_slowpath_fmt) from [<
8015ed78>] (dma_common_free_remap+0x88/0x8c)
r3:
9ef82980 r2:
808e53e0
r7:
00001000 r6:
a0b1e000 r5:
a0b1e000 r4:
00001000
[<
8015ecf0>] (dma_common_free_remap) from [<
8010fa9c>] (remap_allocator_free+0x60/0x68)
r5:
81c03048 r4:
9f4d9b78
[<
8010fa3c>] (remap_allocator_free) from [<
801100d0>] (__arm_dma_free.constprop.3+0xf8/0x148)
r5:
81c03048 r4:
9ef82900
[<
8010ffd8>] (__arm_dma_free.constprop.3) from [<
80110144>] (arm_dma_free+0x24/0x2c)
r5:
9f563410 r4:
80110120
[<
80110120>] (arm_dma_free) from [<
8015d80c>] (dma_free_attrs+0xa0/0xdc)
[<
8015d76c>] (dma_free_attrs) from [<
8020f3e4>] (dma_pool_destroy+0xc0/0x154)
r8:
9efa8860 r7:
808f02f0 r6:
808f02d0 r5:
9ef82880 r4:
9ef82780
[<
8020f324>] (dma_pool_destroy) from [<
805525d0>] (ehci_mem_cleanup+0x6c/0x150)
r7:
9f563410 r6:
9efa8810 r5:
00000000 r4:
9efd0148
[<
80552564>] (ehci_mem_cleanup) from [<
80558e0c>] (ehci_stop+0xac/0xc0)
r5:
9efd0148 r4:
9efd0000
[<
80558d60>] (ehci_stop) from [<
8053c4bc>] (usb_remove_hcd+0xf4/0x1b0)
r7:
9f563410 r6:
9efd0074 r5:
81c03048 r4:
9efd0000
[<
8053c3c8>] (usb_remove_hcd) from [<
8056361c>] (host_stop+0x48/0xb8)
r7:
9f563410 r6:
9efd0000 r5:
9f5f4040 r4:
9f5f5040
[<
805635d4>] (host_stop) from [<
80563d0c>] (ci_hdrc_host_destroy+0x34/0x38)
r7:
9f563410 r6:
9f5f5040 r5:
9efa8800 r4:
9f5f4040
[<
80563cd8>] (ci_hdrc_host_destroy) from [<
8055ef18>] (ci_hdrc_remove+0x50/0x10c)
[<
8055eec8>] (ci_hdrc_remove) from [<
804a2ed8>] (platform_drv_remove+0x34/0x4c)
r7:
9f563410 r6:
81c4f99c r5:
9efa8810 r4:
9efa8810
[<
804a2ea4>] (platform_drv_remove) from [<
804a18a8>] (device_release_driver_internal+0xec/0x19c)
r5:
00000000 r4:
9efa8810
[<
804a17bc>] (device_release_driver_internal) from [<
804a1978>] (device_release_driver+0x20/0x24)
r7:
9f563410 r6:
81c41ed0 r5:
9efa8810 r4:
9f4a1dac
[<
804a1958>] (device_release_driver) from [<
804a01b8>] (bus_remove_device+0xdc/0x108)
[<
804a00dc>] (bus_remove_device) from [<
8049c204>] (device_del+0x150/0x36c)
r7:
9f563410 r6:
81c03048 r5:
9efa8854 r4:
9efa8810
[<
8049c0b4>] (device_del) from [<
804a3368>] (platform_device_del.part.2+0x20/0x84)
r10:
9f563414 r9:
809177e0 r8:
81cb07dc r7:
81c78320 r6:
9f563454 r5:
9efa8800
r4:
9efa8800
[<
804a3348>] (platform_device_del.part.2) from [<
804a3420>] (platform_device_unregister+0x28/0x34)
r5:
9f563400 r4:
9efa8800
[<
804a33f8>] (platform_device_unregister) from [<
8055dce0>] (ci_hdrc_remove_device+0x1c/0x30)
r5:
9f563400 r4:
00000001
[<
8055dcc4>] (ci_hdrc_remove_device) from [<
805652ac>] (ci_hdrc_imx_remove+0x38/0x118)
r7:
81c78320 r6:
9f563454 r5:
9f563410 r4:
9f541010
[<
8056538c>] (ci_hdrc_imx_shutdown) from [<
804a2970>] (platform_drv_shutdown+0x2c/0x30)
[<
804a2944>] (platform_drv_shutdown) from [<
8049e4fc>] (device_shutdown+0x158/0x1f0)
[<
8049e3a4>] (device_shutdown) from [<
8013ac80>] (kernel_restart_prepare+0x44/0x48)
r10:
00000058 r9:
9f4d8000 r8:
fee1dead r7:
379ce700 r6:
81c0b280 r5:
81c03048
r4:
00000000
[<
8013ac3c>] (kernel_restart_prepare) from [<
8013ad14>] (kernel_restart+0x1c/0x60)
[<
8013acf8>] (kernel_restart) from [<
8013af84>] (__do_sys_reboot+0xe0/0x1d8)
r5:
81c03048 r4:
00000000
[<
8013aea4>] (__do_sys_reboot) from [<
8013b0ec>] (sys_reboot+0x18/0x1c)
r8:
80101204 r7:
00000058 r6:
00000000 r5:
00000000 r4:
00000000
[<
8013b0d4>] (sys_reboot) from [<
80101000>] (ret_fast_syscall+0x0/0x54)
Exception stack(0x9f4d9fa8 to 0x9f4d9ff0)
9fa0:
00000000 00000000 fee1dead 28121969 01234567 379ce700
9fc0:
00000000 00000000 00000000 00000058 00000000 00000000 00000000 00016d04
9fe0:
00028e0c 7ec87c64 000135ec 76c1f410
Restore original invalid input check in dma_common_free_remap() to
avoid this problem.
Fixes: 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages helper")
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
[hch: just revert the offending hunk instead of creating a new helper]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Dmitry Goldin [Fri, 4 Oct 2019 10:40:07 +0000 (10:40 +0000)]
kheaders: make headers archive reproducible
In commit
43d8ce9d65a5 ("Provide in-kernel headers to make
extending kernel easier") a new mechanism was introduced, for kernels
>=5.2, which embeds the kernel headers in the kernel image or a module
and exposes them in procfs for use by userland tools.
The archive containing the header files has nondeterminism caused by
header files metadata. This patch normalizes the metadata and utilizes
KBUILD_BUILD_TIMESTAMP if provided and otherwise falls back to the
default behaviour.
In commit
f7b101d33046 ("kheaders: Move from proc to sysfs") it was
modified to use sysfs and the script for generation of the archive was
renamed to what is being patched.
Signed-off-by: Dmitry Goldin <dgoldin+lkml@protonmail.ch>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Thu, 3 Oct 2019 02:36:29 +0000 (11:36 +0900)]
kbuild: update compile-test header list for v5.4-rc2
Commit
6dc280ebeed2 ("coda: remove uapi/linux/coda_psdev.h") removed
a header in question. Some more build errors were fixed. Add more
headers into the test coverage.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Thu, 3 Oct 2019 10:29:12 +0000 (19:29 +0900)]
kbuild: two minor updates for Documentation/kbuild/modules.rst
Capitalize the first word in the sentence.
Use obj-m instead of obj-y. obj-y still works, but we have no built-in
objects in external module builds. So, obj-m is better IMHO.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Tue, 1 Oct 2019 12:17:24 +0000 (21:17 +0900)]
scripts/setlocalversion: clear local variable to make it work for sh
Geert Uytterhoeven reports a strange side-effect of commit
858805b336be
("kbuild: add $(BASH) to run scripts with bash-extension"), which
inserts the contents of a localversion file in the build directory twice.
[Steps to Reproduce]
$ echo bar > localversion
$ mkdir build
$ cd build/
$ echo foo > localversion
$ make -s -f ../Makefile defconfig include/config/kernel.release
$ cat include/config/kernel.release
5.4.0-rc1foofoobar
This comes down to the behavior change of local variables.
The 'man sh' on my Ubuntu machine, where sh is an alias to dash,
explains as follows:
When a variable is made local, it inherits the initial value and
exported and readonly flags from the variable with the same name
in the surrounding scope, if there is one. Otherwise, the variable
is initially unset.
[Test Code]
foo ()
{
local res
echo "res: $res"
}
res=1
foo
[Result]
$ sh test.sh
res: 1
$ bash test.sh
res:
So, scripts/setlocalversion correctly works only for bash in spite of
its hashbang being #!/bin/sh. Nobody had noticed it before because
CONFIG_SHELL was previously set to bash almost all the time.
Now that CONFIG_SHELL is set to sh, we must write portable and correct
code. I gave the Fixes tag to the commit that uncovered the issue.
Clear the variable 'res' in collect_files() to make it work for sh
(and it also works on distributions where sh is an alias to bash).
Fixes: 858805b336be ("kbuild: add $(BASH) to run scripts with bash-extension")
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Jacob Keller [Fri, 27 Sep 2019 23:30:27 +0000 (16:30 -0700)]
namespace: fix namespace.pl script to support relative paths
The namespace.pl script does not work properly if objtree is not set to
an absolute path. The do_nm function is run from within the find
function, which changes directories.
Because of this, appending objtree, $File::Find::dir, and $source, will
return a path which is not valid from the current directory.
This used to work when objtree was set to an absolute path when using
"make namespacecheck". It appears to have not worked when calling
./scripts/namespace.pl directly.
This behavior was changed in
7e1c04779efd ("kbuild: Use relative path
for $(objtree)", 2014-05-14)
Rather than fixing the Makefile to set objtree to an absolute path, just
fix namespace.pl to work when srctree and objtree are relative. Also fix
the script to use an absolute path for these by default.
Use the File::Spec module for this purpose. It's been part of perl
5 since 5.005.
The curdir() function is used to get the current directory when the
objtree and srctree aren't set in the environment.
rel2abs() is used to convert possibly relative objtree and srctree
environment variables to absolute paths.
Finally, the catfile() function is used instead of string appending
paths together, since this is more robust when joining paths together.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Wed, 21 Aug 2019 04:12:35 +0000 (13:12 +0900)]
video/logo: do not generate unneeded logo C files
Currently, all the logo C files are generated irrespective of the
CONFIG options. Adding them to extra-y is wrong. What we need to do
here is to add them to 'targets' so that if_changed works properly.
Files listed in 'targets' are cleaned, so clean-files is unneeded.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Wed, 21 Aug 2019 04:12:34 +0000 (13:12 +0900)]
video/logo: remove unneeded *.o pattern from clean-files
The pattern *.o is cleaned up globally by the top Makefile.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Fri, 26 Jul 2019 02:10:55 +0000 (11:10 +0900)]
integrity: remove pointless subdir-$(CONFIG_...)
The ima/ and evm/ sub-directories contain built-in objects, so
obj-$(CONFIG_...) is the correct way to descend into them.
subdir-$(CONFIG_...) is redundant.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Fri, 26 Jul 2019 02:10:54 +0000 (11:10 +0900)]
integrity: remove unneeded, broken attempt to add -fshort-wchar
I guess commit
15ea0e1e3e18 ("efi: Import certificates from UEFI Secure
Boot") attempted to add -fshort-wchar for building load_uefi.o, but it
has never worked as intended.
load_uefi.o is created in the platform_certs/ sub-directory. If you
really want to add -fshort-wchar, the correct code is:
$(obj)/platform_certs/load_uefi.o: KBUILD_CFLAGS += -fshort-wchar
But, you do not need to fix it.
Commit
8c97023cf051 ("Kbuild: use -fshort-wchar globally") had already
added -fshort-wchar globally. This code was unneeded in the first place.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Jakub Kicinski [Sat, 5 Oct 2019 00:36:50 +0000 (17:36 -0700)]
selftests/net: add nettest to .gitignore
nettest is missing from gitignore.
Fixes: acda655fefae ("selftests: Add nettest")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Navid Emamdoost [Fri, 4 Oct 2019 20:24:39 +0000 (15:24 -0500)]
net: qlogic: Fix memory leak in ql_alloc_large_buffers
In ql_alloc_large_buffers, a new skb is allocated via netdev_alloc_skb.
This skb should be released if pci_dma_mapping_error fails.
Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() in ql_release_to_lrg_buf_free_list(), ql_populate_free_queue(), ql_alloc_large_buffers(), and ql3xxx_send()")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Oct 2019 18:08:34 +0000 (11:08 -0700)]
nfc: fix memory leak in llcp_sock_bind()
sysbot reported a memory leak after a bind() has failed.
While we are at it, abort the operation if kmemdup() has failed.
BUG: memory leak
unreferenced object 0xffff888105d83ec0 (size 32):
comm "syz-executor067", pid 7207, jiffies
4294956228 (age 19.430s)
hex dump (first 32 bytes):
00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34 .ile read.net:[4
30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00
026533097]......
backtrace:
[<
0000000036bac473>] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline]
[<
0000000036bac473>] slab_post_alloc_hook /mm/slab.h:522 [inline]
[<
0000000036bac473>] slab_alloc /mm/slab.c:3319 [inline]
[<
0000000036bac473>] __do_kmalloc /mm/slab.c:3653 [inline]
[<
0000000036bac473>] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670
[<
000000000cd39d07>] kmemdup+0x27/0x60 /mm/util.c:120
[<
000000008e57e5fc>] kmemdup /./include/linux/string.h:432 [inline]
[<
000000008e57e5fc>] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107
[<
000000009cb0b5d3>] __sys_bind+0x11c/0x140 /net/socket.c:1647
[<
00000000492c3bbc>] __do_sys_bind /net/socket.c:1658 [inline]
[<
00000000492c3bbc>] __se_sys_bind /net/socket.c:1656 [inline]
[<
00000000492c3bbc>] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656
[<
0000000008704b2a>] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296
[<
000000009f4c57a4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 30cc4587659e ("NFC: Move LLCP code to the NFC top level diirectory")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Oct 2019 17:34:45 +0000 (10:34 -0700)]
sch_dsmark: fix potential NULL deref in dsmark_init()
Make sure TCA_DSMARK_INDICES was provided by the user.
syzbot reported :
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8799 Comm: syz-executor235 Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nla_get_u16 include/net/netlink.h:1501 [inline]
RIP: 0010:dsmark_init net/sched/sch_dsmark.c:364 [inline]
RIP: 0010:dsmark_init+0x193/0x640 net/sched/sch_dsmark.c:339
Code: 85 db 58 0f 88 7d 03 00 00 e8 e9 1a ac fb 48 8b 9d 70 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 04 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 ca
RSP: 0018:
ffff88809426f3b8 EFLAGS:
00010247
RAX:
dffffc0000000000 RBX:
0000000000000000 RCX:
ffffffff85c6eb09
RDX:
0000000000000000 RSI:
ffffffff85c6eb17 RDI:
0000000000000004
RBP:
ffff88809426f4b0 R08:
ffff88808c4085c0 R09:
ffffed1015d26159
R10:
ffffed1015d26158 R11:
ffff8880ae930ac7 R12:
ffff8880a7e96940
R13:
dffffc0000000000 R14:
ffff88809426f8c0 R15:
0000000000000000
FS:
0000000001292880(0000) GS:
ffff8880ae900000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000020000080 CR3:
000000008ca1b000 CR4:
00000000001406e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
qdisc_create+0x4ee/0x1210 net/sched/sch_api.c:1237
tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1653
rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:657
___sys_sendmsg+0x803/0x920 net/socket.c:2311
__sys_sendmsg+0x105/0x1d0 net/socket.c:2356
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg net/socket.c:2363 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440369
Fixes: 758cc43c6d73 ("[PKT_SCHED]: Fix dsmark to apply changes consistent")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 5 Oct 2019 01:11:08 +0000 (18:11 -0700)]
Merge branch 'Fix-regression-with-AR8035-speed-downgrade'
Russell King says:
====================
Fix regression with AR8035 speed downgrade
The following series attempts to address an issue spotted by tinywrkb
with the AR8035 on the Cubox-i2 in a situation where the PHY downgrades
the negotiated link.
This is version 2, not much has changed other than rebasing on the
current net tree. Changes have happend to patch 2 due to conflicts,
so I dropped Andrew's reviewed-by. Minor context changes to patch 4
which I don't consider important enough to warrant dropping the
reviewed-by.
Before commit
5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in
genphy_read_status"), we would read not only the link partner's
advertisement, but also our own advertisement from the PHY registers,
and use both to derive the PHYs current link mode. This works when the
AR8035 downgrades the speed, because it appears that the AR8035 clears
link mode bits in the advertisement registers as part of the downgrade.
Commentary: what is not yet known is whether the AR8035 restores the
advertisement register when the link goes down to the
previous state.
However, since the above referenced commit, we no longer use the PHYs
advertisement registers, instead converting the link partner's
advertisement to the ethtool link mode array, and combine that with
phylib's cached version of our advertisement - which is not updated on
speed downgrade.
This results in phylib disagreeing with the actual operating mode of
the PHY.
Commentary: I wonder how many more PHY drivers are broken by this
commit, but have yet to be discovered.
The obvious way to address this would be to disable the downgrade
feature, and indeed this does fix the problem in tinywrkb's case - his
link partner instead downgrades the speed by reducing its
advertisement, resulting in phylib correctly evaluating a slower speed.
However, it has a serious drawback - the gigabit control register (MII
register 9) appears to become read only. It seems the only way to
update the register is to re-enable the downgrade feature, reset the
PHY, changing register 9, disable the downgrade feature, and reset the
PHY again.
This series attempts to address the problem using a different approach,
similar to the approach taken with Marvell PHYs. The AR8031, AR8033
and AR8035 have a PHY-Specific Status register which reports the
actual operating mode of the PHY - both speed and duplex. This
register correctly reports the operating mode irrespective of whether
autoneg is enabled or not. We use this register to fill in phylib's
speed and duplex parameters.
In detail:
Patch 1 fixes a bug where writing to register 9 does not update
phylib's advertisement mask in the same way that writing register 4
does; this looks like an omission from when gigabit PHY support came
into being.
Patch 2 seperates the generic phylib code which reads the link partners
advertisement from the PHY, so that we can re-use this in the Atheros
PHY driver.
Patch 3 seperates the generic phylib pause mode; phylib provides no
help for MAC drivers to ascertain the negotiated pause mode, it merely
copies the link partner's pause mode bits into its own variables.
Commentary: Both the aforementioned Atheros PHYs and Marvell PHYs
provide the resolved pause modes in terms of whether
we should transmit pause frames, or whether we should
allow reception of pause frames. Surely the resolution
of this should be in phylib?
Patch 4 provides the Atheros PHY driver with a private "read_status"
implementation that fills in phylib's speed and duplex settings
depending on the PHY-Specific status register. This ensures that
phylib and the MAC driver match the operating mode that the PHY has
decided to use. Since the register also gives us MDIX status, we
can trivially fill that status in as well.
Note that, although the bits mentioned in this patch for this register
match those in th Marvell PHY driver, and it is located at the same
address, the meaning of other register bits varies between the PHYs.
Therefore, I do not feel that it would be appropriate to make this some
kind of generic function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Fri, 4 Oct 2019 16:06:14 +0000 (17:06 +0100)]
net: phy: at803x: use operating parameters from PHY-specific status
Read the PHY-specific status register for the current operating mode
(speed and duplex) of the PHY. This register reflects the actual
mode that the PHY has resolved depending on either the advertisements
of autoneg is enabled, or the forced mode if autoneg is disabled.
This ensures that phylib's software state always tracks the hardware
state.
It seems both AR8033 (which uses the AR8031 ID) and AR8035 support
this status register. AR8030 is not known at the present time.
This patch depends on "net: phy: extract pause mode" and "net: phy:
extract link partner advertisement reading".
Reported-by: tinywrkb <tinywrkb@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: tinywrkb <tinywrkb@gmail.com>
Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Fri, 4 Oct 2019 16:06:09 +0000 (17:06 +0100)]
net: phy: extract pause mode
Extract the update of phylib's software pause mode state from
genphy_read_status(), so that we can re-use this functionality with
PHYs that have alternative ways to read the negotiation results.
Tested-by: tinywrkb <tinywrkb@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Fri, 4 Oct 2019 16:06:04 +0000 (17:06 +0100)]
net: phy: extract link partner advertisement reading
Move reading the link partner advertisement out of genphy_read_status()
into its own separate function. This will allow re-use of this code by
PHY drivers that are able to read the resolved status from the PHY.
Tested-by: tinywrkb <tinywrkb@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Fri, 4 Oct 2019 16:05:58 +0000 (17:05 +0100)]
net: phy: fix write to mii-ctrl1000 register
When userspace writes to the MII_ADVERTISE register, we update phylib's
advertising mask and trigger a renegotiation. However, writing to the
MII_CTRL1000 register, which contains the gigabit advertisement, does
neither. This can lead to phylib's copy of the advertisement becoming
de-synced with the values in the PHY register set, which can result in
incorrect negotiation resolution.
Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 4 Oct 2019 15:03:09 +0000 (08:03 -0700)]
ipv6: Handle missing host route in __ipv6_ifa_notify
Rajendra reported a kernel panic when a link was taken down:
[ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at
00000000000000a8
[ 6870.271856] IP: [<
ffffffff8efc5764>] __ipv6_ifa_notify+0x154/0x290
<snip>
[ 6870.570501] Call Trace:
[ 6870.573238] [<
ffffffff8efc58c6>] ? ipv6_ifa_notify+0x26/0x40
[ 6870.579665] [<
ffffffff8efc98ec>] ? addrconf_dad_completed+0x4c/0x2c0
[ 6870.586869] [<
ffffffff8efe70c6>] ? ipv6_dev_mc_inc+0x196/0x260
[ 6870.593491] [<
ffffffff8efc9c6a>] ? addrconf_dad_work+0x10a/0x430
[ 6870.600305] [<
ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
[ 6870.606732] [<
ffffffff8ea93a7a>] ? process_one_work+0x18a/0x430
[ 6870.613449] [<
ffffffff8ea93d6d>] ? worker_thread+0x4d/0x490
[ 6870.619778] [<
ffffffff8ea93d20>] ? process_one_work+0x430/0x430
[ 6870.626495] [<
ffffffff8ea99dd9>] ? kthread+0xd9/0xf0
[ 6870.632145] [<
ffffffff8f01ade4>] ? __switch_to_asm+0x34/0x70
[ 6870.638573] [<
ffffffff8ea99d00>] ? kthread_park+0x60/0x60
[ 6870.644707] [<
ffffffff8f01ae77>] ? ret_from_fork+0x57/0x70
[ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0
addrconf_dad_work is kicked to be scheduled when a device is brought
up. There is a race between addrcond_dad_work getting scheduled and
taking the rtnl lock and a process taking the link down (under rtnl).
The latter removes the host route from the inet6_addr as part of
addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
to use the host route in __ipv6_ifa_notify. If the down event removes
the host route due to the race to the rtnl, then the BUG listed above
occurs.
Since the DAD sequence can not be aborted, add a check for the missing
host route in __ipv6_ifa_notify. The only way this should happen is due
to the previously mentioned race. The host route is created when the
address is added to an interface; it is only removed on a down event
where the address is kept. Add a warning if the host route is missing
AND the device is up; this is a situation that should never happen.
Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Rajendra Dendukuri <rajendra.dendukuri@broadcom.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrea Merello [Fri, 4 Oct 2019 13:53:32 +0000 (15:53 +0200)]
net: phy: allow for reset line to be tied to a sleepy GPIO controller
mdio_device_reset() makes use of the atomic-pretending API flavor for
handling the PHY reset GPIO line.
I found no hint that mdio_device_reset() is called from atomic context
and indeed it uses usleep_range() since long time, so I would assume that
it is OK to sleep there.
This patch switch to gpiod_set_value_cansleep() in mdio_device_reset().
This is relevant if e.g. the PHY reset line is tied to a I2C GPIO
controller.
This has been tested on a ZynqMP board running an upstream 4.19 kernel and
then hand-ported on current kernel tree.
Signed-off-by: Andrea Merello <andrea.merello@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 4 Oct 2019 13:11:17 +0000 (15:11 +0200)]
net: ipv4: avoid mixed n_redirects and rate_tokens usage
Since commit
c09551c6ff7f ("net: ipv4: use a dedicated counter
for icmp_v4 redirect packets") we use 'n_redirects' to account
for redirect packets, but we still use 'rate_tokens' to compute
the redirect packets exponential backoff.
If the device sent to the relevant peer any ICMP error packet
after sending a redirect, it will also update 'rate_token' according
to the leaking bucket schema; typically 'rate_token' will raise
above BITS_PER_LONG and the redirect packets backoff algorithm
will produce undefined behavior.
Fix the issue using 'n_redirects' to compute the exponential backoff
in ip_rt_send_redirect().
Note that we still clear rate_tokens after a redirect silence period,
to avoid changing an established behaviour.
The root cause predates git history; before the mentioned commit in
the critical scenario, the kernel stopped sending redirects, after
the mentioned commit the behavior more randomic.
Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Fixes: c09551c6ff7f ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kai-Heng Feng [Fri, 4 Oct 2019 12:51:04 +0000 (20:51 +0800)]
r8152: Set macpassthru in reset_resume callback
r8152 may fail to establish network connection after resume from system
suspend.
If the USB port connects to r8152 lost its power during system suspend,
the MAC address was written before is lost. The reason is that The MAC
address doesn't get written again in its reset_resume callback.
So let's set MAC address again in reset_resume callback. Also remove
unnecessary lock as no other locking attempt will happen during
reset_resume.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vishal Kulkarni [Thu, 3 Oct 2019 22:36:15 +0000 (04:06 +0530)]
cxgb4:Fix out-of-bounds MSI-X info array access
When fetching free MSI-X vectors for ULDs, check for the error code
before accessing MSI-X info array. Otherwise, an out-of-bounds access is
attempted, which results in kernel panic.
Fixes: 94cdb8bb993a ("cxgb4: Add support for dynamic allocation of resources for ULD")
Signed-off-by: Shahjada Abul Husain <shahjada@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 3 Oct 2019 21:46:15 +0000 (14:46 -0700)]
Revert "ipv6: Handle race in addrconf_dad_work"
This reverts commit
a3ce2a21bb8969ae27917281244fa91bf5f286d7.
Eric reported tests failings with commit. After digging into it,
the bottom line is that the DAD sequence is not to be messed with.
There are too many cases that are expected to proceed regardless
of whether a device is up.
Revert the patch and I will send a different solution for the
problem Rajendra reported.
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Thu, 3 Oct 2019 21:44:40 +0000 (00:44 +0300)]
net: make sock_prot_memory_pressure() return "const char *"
This function returns string literals which are "const char *".
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Thu, 3 Oct 2019 16:44:44 +0000 (17:44 +0100)]
rxrpc: Fix rxrpc_recvmsg tracepoint
Fix the rxrpc_recvmsg tracepoint to handle being called with a NULL call
parameter.
Fixes: a25e21f0bcd2 ("rxrpc, afs: Use debug_ids rather than pointers in traces")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reinhard Speyerer [Thu, 3 Oct 2019 16:34:39 +0000 (18:34 +0200)]
qmi_wwan: add support for Cinterion CLS8 devices
Add support for Cinterion CLS8 devices.
Use QMI_QUIRK_SET_DTR as required for Qualcomm MDM9x07 chipsets.
T: Bus=01 Lev=03 Prnt=05 Port=01 Cnt=02 Dev#= 25 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1e2d ProdID=00b0 Rev= 3.18
S: Manufacturer=GEMALTO
S: Product=USB Modem
C:* #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E: Ad=89(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
E: Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
Signed-off-by: Reinhard Speyerer <rspmn@arcor.de>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 4 Oct 2019 20:31:56 +0000 (13:31 -0700)]
Merge tag 'mips_fixes_5.4_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:
- Build fixes for Cavium Octeon & PMC-Sierra MSP systems, as well as
all pre-MIPSr6 configurations built with binutils < 2.25.
- Boot fixes for 64-bit Loongson systems & SGI IP28 systems.
- Wire up the new clone3 syscall.
- Clean ups for a few build-time warnings.
* tag 'mips_fixes_5.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: fw/arc: Remove unused addr variable
MIPS: pmcs-msp71xx: Remove unused addr variable
MIPS: pmcs-msp71xx: Add missing MAX_PROM_MEM definition
mips: Loongson: Fix the link time qualifier of 'serial_exit()'
MIPS: init: Prevent adding memory before PHYS_OFFSET
MIPS: init: Fix reservation of memory between PHYS_OFFSET and mem start
MIPS: VDSO: Fix build for binutils < 2.25
MIPS: VDSO: Remove unused gettimeofday.c
MIPS: Wire up clone3 syscall
MIPS: octeon: Include required header; fix octeon ethernet build
MIPS: cpu-bugs64: Mark inline functions as __always_inline
MIPS: dts: ar9331: fix interrupt-controller size
MIPS: Loongson64: Fix boot failure after dropping boot_mem_map
Linus Torvalds [Fri, 4 Oct 2019 20:02:47 +0000 (13:02 -0700)]
Merge tag 'riscv/for-v5.4-rc2' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
- Ensure that exclusive-load reservations are terminated after system
call or exception handling. This primarily affects QEMU, which does
not expire load reservations.
- Fix an issue primarily affecting RV32 platforms that can cause the DT
header to be corrupted, causing boot failures.
* tag 'riscv/for-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Fix memblock reservation for device tree blob
RISC-V: Clear load reservations while restoring hart contexts
Linus Torvalds [Fri, 4 Oct 2019 19:57:45 +0000 (12:57 -0700)]
Merge tag 'devicetree-fixes-for-5.4' of git://git./linux/kernel/git/robh/linux
Pull DeviceTree fixes from Rob Herring:
"Fix several 'dt_binding_check' build failures"
* tag 'devicetree-fixes-for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: phy: lantiq: Fix Property Name
dt-bindings: iio: ad7192: Fix DTC warning in the example
dt-bindings: iio: ad7192: Fix Regulator Properties
dt-bindings: media: rc: Fix redundant string
dt-bindings: dsp: Fix fsl,dsp example
Paul Burton [Fri, 4 Oct 2019 17:41:02 +0000 (17:41 +0000)]
MIPS: fw/arc: Remove unused addr variable
The addr variable in prom_free_prom_memory() has been unused since
commit
0df1007677d5 ("MIPS: fw: Record prom memory"), leading to a
compiler warning:
arch/mips/fw/arc/memory.c:163:16:
warning: unused variable 'addr' [-Wunused-variable]
Fix this by removing the unused variable.
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 0df1007677d5 ("MIPS: fw: Record prom memory")
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: linux-mips@vger.kernel.org
Linus Torvalds [Fri, 4 Oct 2019 18:17:51 +0000 (11:17 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"ARM and x86 bugfixes of all kinds.
The most visible one is that migrating a nested hypervisor has always
been busted on Broadwell and newer processors, and that has finally
been fixed"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits)
KVM: x86: omit "impossible" pmu MSRs from MSR list
KVM: nVMX: Fix consistency check on injected exception error code
KVM: x86: omit absent pmu MSRs from MSR list
selftests: kvm: Fix libkvm build error
kvm: vmx: Limit guest PMCs to those supported on the host
kvm: x86, powerpc: do not allow clearing largepages debugfs entry
KVM: selftests: x86: clarify what is reported on KVM_GET_MSRS failure
KVM: VMX: Set VMENTER_L1D_FLUSH_NOT_REQUIRED if !X86_BUG_L1TF
selftests: kvm: add test for dirty logging inside nested guests
KVM: x86: fix nested guest live migration with PML
KVM: x86: assign two bits to track SPTE kinds
KVM: x86: Expose XSAVEERPTR to the guest
kvm: x86: Enumerate support for CLZERO instruction
kvm: x86: Use AMD CPUID semantics for AMD vCPUs
kvm: x86: Improve emulation of CPUID leaves 0BH and 1FH
KVM: X86: Fix userspace set invalid CR4
kvm: x86: Fix a spurious -E2BIG in __do_cpuid_func
KVM: LAPIC: Loosen filter for adaptive tuning of lapic_timer_advance_ns
KVM: arm/arm64: vgic: Use the appropriate TRACE_INCLUDE_PATH
arm64: KVM: Kill hyp_alternate_select()
...
Linus Torvalds [Fri, 4 Oct 2019 18:13:09 +0000 (11:13 -0700)]
Merge tag 'for-linus-5.4-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes and cleanups from Juergen Gross:
- a fix in the Xen balloon driver avoiding hitting a BUG_ON() in some
cases, plus a follow-on cleanup series for that driver
- a patch for introducing non-blocking EFI callbacks in Xen's EFI
driver, plu a cleanup patch for Xen EFI handling merging the x86 and
ARM arch specific initialization into the Xen EFI driver
- a fix of the Xen xenbus driver avoiding a self-deadlock when cleaning
up after a user process has died
- a fix for Xen on ARM after removal of ZONE_DMA
- a cleanup patch for avoiding build warnings for Xen on ARM
* tag 'for-linus-5.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/xenbus: fix self-deadlock after killing user process
xen/efi: have a common runtime setup function
arm: xen: mm: use __GPF_DMA32 for arm64
xen/balloon: Clear PG_offline in balloon_retrieve()
xen/balloon: Mark pages PG_offline in balloon_append()
xen/balloon: Drop __balloon_append()
xen/balloon: Set pages PageOffline() in balloon_add_region()
ARM: xen: unexport HYPERVISOR_platform_op function
xen/efi: Set nonblocking callbacks
Linus Torvalds [Fri, 4 Oct 2019 17:36:31 +0000 (10:36 -0700)]
Merge tag 'copy-struct-from-user-v5.4-rc2' of git://git./linux/kernel/git/brauner/linux
Pull copy_struct_from_user() helper from Christian Brauner:
"This contains the copy_struct_from_user() helper which got split out
from the openat2() patchset. It is a generic interface designed to
copy a struct from userspace.
The helper will be especially useful for structs versioned by size of
which we have quite a few. This allows for backwards compatibility,
i.e. an extended struct can be passed to an older kernel, or a legacy
struct can be passed to a newer kernel. For the first case (extended
struct, older kernel) the new fields in an extended struct can be set
to zero and the struct safely passed to an older kernel.
The most obvious benefit is that this helper lets us get rid of
duplicate code present in at least sched_setattr(), perf_event_open(),
and clone3(). More importantly it will also help to ensure that users
implementing versioning-by-size end up with the same core semantics.
This point is especially crucial since we have at least one case where
versioning-by-size is used but with slighly different semantics:
sched_setattr(), perf_event_open(), and clone3() all do do similar
checks to copy_struct_from_user() while rt_sigprocmask(2) always
rejects differently-sized struct arguments.
With this pull request we also switch over sched_setattr(),
perf_event_open(), and clone3() to use the new helper"
* tag 'copy-struct-from-user-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
usercopy: Add parentheses around assignment in test_copy_struct_from_user
perf_event_open: switch to copy_struct_from_user()
sched_setattr: switch to copy_struct_from_user()
clone3: switch to copy_struct_from_user()
lib: introduce copy_struct_from_user() helper
Linus Torvalds [Fri, 4 Oct 2019 17:18:56 +0000 (10:18 -0700)]
Merge tag 'for-linus-
20191003' of git://git./linux/kernel/git/brauner/linux
Pull clone3/pidfd fixes from Christian Brauner:
"This contains a couple of fixes:
- Fix pidfd selftest compilation (Shuah Kahn)
Due to a false linking instruction in the Makefile compilation for
the pidfd selftests would fail on some systems.
- Fix compilation for glibc on RISC-V systems (Seth Forshee)
In some scenarios linux/uapi/linux/sched.h is included where
__ASSEMBLY__ is defined causing a build failure because struct
clone_args was not guarded by an #ifndef __ASSEMBLY__.
- Add missing clone3() and struct clone_args kernel-doc (Christian Brauner)
clone3() and struct clone_args were missing kernel-docs. (The goal
is to use kernel-doc for any function or type where it's worth it.)
For struct clone_args this also contains a comment about the fact
that it's versioned by size"
* tag 'for-linus-
20191003' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
sched: add kernel-doc for struct clone_args
fork: add kernel-doc for clone3
selftests: pidfd: Fix undefined reference to pthread_create()
sched: Add __ASSEMBLY__ guards around struct clone_args