Sudeep Holla [Fri, 6 Jul 2018 11:02:44 +0000 (12:02 +0100)]
arm64: topology: add support to remove cpu topology sibling masks
This patch adds support to remove all the CPU topology information using
clear_cpu_topology and also resetting the sibling information on other
sibling CPUs. This will be used in cpu_disable so that all the topology
sibling information is removed on CPU hotplug out.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Sudeep Holla [Fri, 6 Jul 2018 11:02:43 +0000 (12:02 +0100)]
arm64: numa: separate out updates to percpu nodeid and NUMA node cpumap
Currently numa_clear_node removes both cpu information from the NUMA
node cpumap as well as the NUMA node id from the cpu. Similarly
numa_store_cpu_info updates both percpu nodeid and NUMA cpumap.
However we need to retain the numa node id for the cpu and only remove
the cpu information from the numa node cpumap during CPU hotplug out.
The same can be extended for hotplugging in the CPU.
This patch separates out numa_{add,remove}_cpu from numa_clear_node and
numa_store_cpu_info.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Sudeep Holla [Fri, 6 Jul 2018 11:02:42 +0000 (12:02 +0100)]
arm64: topology: refactor reset_cpu_topology to add support for removing topology
Currently reset_cpu_topology clears all the CPU topology information
and resets to default values. However we may need to just clear the
information when we hotplug out the CPU. In preparation to add the
support the same, let's refactor reset_cpu_topology to just reset
the information and move clearing out the topology information to
clear_cpu_topology.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Fri, 6 Jul 2018 08:57:45 +0000 (09:57 +0100)]
arm64: errata: Don't define type field twice for arm64_errata[] entries
The ERRATA_MIDR_REV_RANGE macro assigns ARM64_CPUCAP_LOCAL_CPU_ERRATUM
to the '.type' field of the 'struct arm64_cpu_capabilities', so there's
no need to assign it explicitly as well.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Chintan Pandya [Wed, 6 Jun 2018 07:01:21 +0000 (12:31 +0530)]
arm64: Implement page table free interfaces
arm64 requires break-before-make. Originally, before
setting up new pmd/pud entry for huge mapping, in few
cases, the modifying pmd/pud entry was still valid
and pointing to next level page table as we only
clear off leaf PTE in unmap leg.
a) This was resulting into stale entry in TLBs (as few
TLBs also cache intermediate mapping for performance
reasons)
b) Also, modifying pmd/pud was the only reference to
next level page table and it was getting lost without
freeing it. So, page leaks were happening.
Implement pud_free_pmd_page() and pmd_free_pte_page() to
enforce BBM and also free the leaking page tables.
Implementation requires,
1) Clearing off the current pud/pmd entry
2) Invalidation of TLB
3) Freeing of the un-used next level page tables
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Chintan Pandya [Wed, 6 Jun 2018 07:01:20 +0000 (12:31 +0530)]
arm64: tlbflush: Introduce __flush_tlb_kernel_pgtable
Add an interface to invalidate intermediate page tables
from TLB for kernel.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Fri, 6 Jul 2018 12:15:06 +0000 (13:15 +0100)]
Merge branch 'x86/mm' of git://git./linux/kernel/git/tip/tip into aarch64/for-next/core
Pull in core ioremap changes from -tip, since we depend on these for
re-enabling huge I/O mappings on arm64.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 19 Jun 2018 16:55:28 +0000 (17:55 +0100)]
arm64: insn: Don't fallback on nosync path for general insn patching
Patching kernel instructions at runtime requires other CPUs to undergo
a context synchronisation event via an explicit ISB or an IPI in order
to ensure that the new instructions are visible. This is required even
for "hotpatch" instructions such as NOP and BL, so avoid optimising in
this case and always go via stop_machine() when performing general
patching.
ftrace isn't quite as strict, so it can continue to call the nosync
code directly.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Mon, 11 Jun 2018 13:22:09 +0000 (14:22 +0100)]
arm64: IPI each CPU after invalidating the I-cache for kernel mappings
When invalidating the instruction cache for a kernel mapping via
flush_icache_range(), it is also necessary to flush the pipeline for
other CPUs so that instructions fetched into the pipeline before the
I-cache invalidation are discarded. For example, if module 'foo' is
unloaded and then module 'bar' is loaded into the same area of memory,
a CPU could end up executing instructions from 'foo' when branching into
'bar' if these instructions were fetched into the pipeline before 'foo'
was unloaded.
Whilst this is highly unlikely to occur in practice, particularly as
any exception acts as a context-synchronizing operation, following the
letter of the architecture requires us to execute an ISB on each CPU
in order for the new instruction stream to be visible.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:54 +0000 (15:16 +0100)]
arm64: remove unused COMPAT_PSR definitions
Now that users have been migrated to PSR_AA32, kill the unused
COMPAT_PSR definitions.
The only difference we need a definition for is COMPAT_PSR_DIT_BIT,
which differs from PSR_AA32_DIT_BIT.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:53 +0000 (15:16 +0100)]
kvm/arm: use PSR_AA32 definitions
Some code cares about the SPSR_ELx format for exceptions taken from
AArch32 to inspect or manipulate the SPSR_ELx value, which is already in
the SPSR_ELx format, and not in the AArch32 PSR format.
To separate these from cases where we care about the AArch32 PSR format,
migrate these cases to use the PSR_AA32_* definitions rather than
COMPAT_PSR_*.
There should be no functional change as a result of this patch.
Note that arm64 KVM does not support a compat KVM API, and always uses
the SPSR_ELx format, even for AArch32 guests.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:52 +0000 (15:16 +0100)]
arm64: use PSR_AA32 definitions
Some code cares about the SPSR_ELx format for exceptions taken from
AArch32 to inspect or manipulate the SPSR_ELx value, which is already in
the SPSR_ELx format, and not in the AArch32 PSR format.
To separate these from cases where we care about the AArch32 PSR format,
migrate these cases to use the PSR_AA32_* definitions rather than
COMPAT_PSR_*.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:51 +0000 (15:16 +0100)]
arm64: ptrace: map SPSR_ELx<->PSR for compat tasks
The SPSR_ELx format for exceptions taken from AArch32 is slightly
different to the AArch32 PSR format.
Map between the two in the compat ptrace code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 7206dc93a58fb764 ("arm64: Expose Arm v8.4 features")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:50 +0000 (15:16 +0100)]
arm64: compat: map SPSR_ELx<->PSR for signals
The SPSR_ELx format for exceptions taken from AArch32 differs from the
AArch32 PSR format. Thus, we must translate between the two when setting
up a compat sigframe, or restoring context from a compat sigframe.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 7206dc93a58fb764 ("arm64: Expose Arm v8.4 features")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:49 +0000 (15:16 +0100)]
arm64: don't zero DIT on signal return
Currently valid_user_regs() treats SPSR_ELx.DIT as a RES0 bit, causing
it to be zeroed upon exception return, rather than preserved. Thus, code
relying on DIT will not function as expected, and may expose an
unexpected timing sidechannel.
Let's remove DIT from the set of RES0 bits, such that it is preserved.
At the same time, the related comment is updated to better describe the
situation, and to take into account the most recent documentation of
SPSR_ELx, in ARM DDI 0487C.a.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 7206dc93a58fb764 ("arm64: Expose Arm v8.4 features")
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mark Rutland [Thu, 5 Jul 2018 14:16:48 +0000 (15:16 +0100)]
arm64: add PSR_AA32_* definitions
The AArch32 CPSR/SPSR format is *almost* identical to the AArch64
SPSR_ELx format for exceptions taken from AArch32, but the two have
diverged with the addition of DIT, and we need to treat the two as
logically distinct.
This patch adds new definitions for the SPSR_ELx format for exceptions
taken from AArch32, with a consistent PSR_AA32_ prefix. The existing
COMPAT_PSR_ definitions will be used for the PSR format as seen from
AArch32.
Definitions of DIT are provided for both, and inline functions are
provided to map between the two formats. Note that for SPSR_ELx, the
(RES0) J bit has been re-allocated as the DIT bit.
Once users of the COMPAT_PSR definitions have been migrated over to the
PSR_AA32 definitions, the (majority of) the former will be removed, so
no efforts is made to avoid duplication until then.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Suzuki K Poulose [Wed, 4 Jul 2018 22:07:46 +0000 (23:07 +0100)]
arm64: Handle mismatched cache type
Track mismatches in the cache type register (CTR_EL0), other
than the D/I min line sizes and trap user accesses if there are any.
Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions")
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Suzuki K Poulose [Wed, 4 Jul 2018 22:07:45 +0000 (23:07 +0100)]
arm64: Fix mismatched cache line size detection
If there is a mismatch in the I/D min line size, we must
always use the system wide safe value both in applications
and in the kernel, while performing cache operations. However,
we have been checking more bits than just the min line sizes,
which triggers false negatives. We may need to trap the user
accesses in such cases, but not necessarily patch the kernel.
This patch fixes the check to do the right thing as advertised.
A new capability will be added to check mismatches in other
fields and ensure we trap the CTR accesses.
Fixes: be68a8aaf925 ("arm64: cpufeature: Fix CTR_EL0 field definitions")
Cc: <stable@vger.kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 13 Mar 2018 21:17:01 +0000 (21:17 +0000)]
arm64: kconfig: Ensure spinlock fastpaths are inlined if !PREEMPT
When running with CONFIG_PREEMPT=n, the spinlock fastpaths fit inside
64 bytes, which typically coincides with the L1 I-cache line size.
Inline the spinlock fastpaths, like we do already for rwlocks.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Tue, 13 Mar 2018 20:45:45 +0000 (20:45 +0000)]
arm64: locking: Replace ticket lock implementation with qspinlock
It's fair to say that our ticket lock has served us well over time, but
it's time to bite the bullet and start using the generic qspinlock code
so we can make use of explicit MCS queuing and potentially better PV
performance in future.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Will Deacon [Wed, 31 Jan 2018 15:55:24 +0000 (15:55 +0000)]
arm64: barrier: Implement smp_cond_load_relaxed
We can provide an implementation of smp_cond_load_relaxed using READ_ONCE
and __cmpwait_relaxed.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Toshi Kani [Wed, 27 Jun 2018 14:13:48 +0000 (08:13 -0600)]
x86/mm: Add TLB purge to free pmd/pte page interfaces
ioremap() calls pud_free_pmd_page() / pmd_free_pte_page() when it creates
a pud / pmd map. The following preconditions are met at their entry.
- All pte entries for a target pud/pmd address range have been cleared.
- System-wide TLB purges have been peformed for a target pud/pmd address
range.
The preconditions assure that there is no stale TLB entry for the range.
Speculation may not cache TLB entries since it requires all levels of page
entries, including ptes, to have P & A-bits set for an associated address.
However, speculation may cache pud/pmd entries (paging-structure caches)
when they have P-bit set.
Add a system-wide TLB purge (INVLPG) to a single page after clearing
pud/pmd entry's P-bit.
SDM 4.10.4.1, Operation that Invalidate TLBs and Paging-Structure Caches,
states that:
INVLPG invalidates all paging-structure caches associated with the
current PCID regardless of the liner addresses to which they correspond.
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: cpandya@codeaurora.org
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Joerg Roedel <joro@8bytes.org>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-4-toshi.kani@hpe.com
Chintan Pandya [Wed, 27 Jun 2018 14:13:47 +0000 (08:13 -0600)]
ioremap: Update pgtable free interfaces with addr
The following kernel panic was observed on ARM64 platform due to a stale
TLB entry.
1. ioremap with 4K size, a valid pte page table is set.
2. iounmap it, its pte entry is set to 0.
3. ioremap the same address with 2M size, update its pmd entry with
a new value.
4. CPU may hit an exception because the old pmd entry is still in TLB,
which leads to a kernel panic.
Commit
b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page
table") has addressed this panic by falling to pte mappings in the above
case on ARM64.
To support pmd mappings in all cases, TLB purge needs to be performed
in this case on ARM64.
Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
so that TLB purge can be added later in seprate patches.
[toshi.kani@hpe.com: merge changes, rewrite patch description]
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
Toshi Kani [Wed, 27 Jun 2018 14:13:46 +0000 (08:13 -0600)]
x86/mm: Disable ioremap free page handling on x86-PAE
ioremap() supports pmd mappings on x86-PAE. However, kernel's pmd
tables are not shared among processes on x86-PAE. Therefore, any
update to sync'd pmd entries need re-syncing. Freeing a pte page
also leads to a vmalloc fault and hits the BUG_ON in vmalloc_sync_one().
Disable free page handling on x86-PAE. pud_free_pmd_page() and
pmd_free_pte_page() simply return 0 if a given pud/pmd entry is present.
This assures that ioremap() does not update sync'd pmd entries at the
cost of falling back to pte mappings.
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Reported-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: cpandya@codeaurora.org
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-2-toshi.kani@hpe.com
Mark Rutland [Mon, 2 Jul 2018 13:17:53 +0000 (14:17 +0100)]
arm64: kexec: always reset to EL2 if present
Currently machine_kexec() doesn't reset to EL2 in the case of a
crashdump kernel. This leaves potentially dodgy state active at EL2, and
means that if the crashdump kernel attempts to online secondary CPUs,
these will be booted as mismatched ELs.
Let's reset to EL2, as we do in all other cases, and simplify things. If
EL2 state is corrupt, things are already sufficiently bad that kdump is
unlikely to work, and it's best-effort regardless.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Mikulas Patocka [Thu, 14 Jun 2018 18:58:21 +0000 (14:58 -0400)]
arm64: fix infinite stacktrace
I've got this infinite stacktrace when debugging another problem:
[ 908.795225] INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 908.796176] 1-...!: (1 GPs behind) idle=952/1/
4611686018427387904 softirq=1462/1462 fqs=355
[ 908.797692] 2-...!: (1 GPs behind) idle=f42/1/
4611686018427387904 softirq=1550/1551 fqs=355
[ 908.799189] (detected by 0, t=2109 jiffies, g=130, c=129, q=235)
[ 908.800284] Task dump for CPU 1:
[ 908.800871] kworker/1:1 R running task 0 32 2 0x00000022
[ 908.802127] Workqueue: writecache-writeabck writecache_writeback [dm_writecache]
[ 908.820285] Call trace:
[ 908.824785] __switch_to+0x68/0x90
[ 908.837661] 0xfffffe00603afd90
[ 908.844119] 0xfffffe00603afd90
[ 908.850091] 0xfffffe00603afd90
[ 908.854285] 0xfffffe00603afd90
[ 908.863538] 0xfffffe00603afd90
[ 908.865523] 0xfffffe00603afd90
The machine just locked up and kept on printing the same line over and
over again. This patch fixes it.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Peng Donglin [Wed, 13 Jun 2018 01:11:56 +0000 (09:11 +0800)]
ARM64: dump: Convert to use DEFINE_SHOW_ATTRIBUTE macro
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Peng Donglin <dolinux.peng@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Linus Torvalds [Sun, 1 Jul 2018 23:04:53 +0000 (16:04 -0700)]
Linux 4.18-rc3
Linus Torvalds [Sun, 1 Jul 2018 19:38:16 +0000 (12:38 -0700)]
Merge tag 'for-4.18-rc2-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"We have a few regression fixes for qgroup rescan status tracking and
the vm_fault_t conversion that mixed up the error values"
* tag 'for-4.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: fix mount failure when qgroup rescan is in progress
Btrfs: fix regression in btrfs_page_mkwrite() from vm_fault_t conversion
btrfs: quota: Set rescan progress to (u64)-1 if we hit last leaf
Linus Torvalds [Sun, 1 Jul 2018 19:32:19 +0000 (12:32 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
"Followup to procfs-seq_file series this window"
This fixes a memory leak by making sure that proc seq files release any
private data on close. The 'proc_seq_open' has to be properly paired
with 'proc_seq_release' that releases the extra private data.
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
proc: add proc_seq_release
Linus Torvalds [Sun, 1 Jul 2018 19:20:20 +0000 (12:20 -0700)]
Merge tag 'staging-4.18-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Greg KH:
"Here are a few small staging and IIO driver fixes for 4.18-rc3.
Nothing major or big, all just fixes for reported problems since
4.18-rc1. All of these have been in linux-next this week with no
reported problems"
* tag 'staging-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: android: ion: Return an ERR_PTR in ion_map_kernel
staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write()
iio: imu: inv_mpu6050: Fix probe() failure on older ACPI based machines
iio: buffer: fix the function signature to match implementation
iio: mma8452: Fix ignoring MMA8452_INT_DRDY
iio: tsl2x7x/tsl2772: avoid potential division by zero
iio: pressure: bmp280: fix relative humidity unit
Linus Torvalds [Sun, 1 Jul 2018 19:05:53 +0000 (12:05 -0700)]
Merge tag 'tty-4.18-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are five fixes for the tty core and some serial drivers.
The tty core ones fix some security and other issues reported by the
syzbot that I have taken too long in responding to (sorry Tetsuo!).
The 8350 serial driver fix resolves an issue of devices that used to
work properly stopping working as they shouldn't have been added to a
blacklist.
All of these have been in linux-next for a few days with no reported
issues"
* tag 'tty-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: prevent leaking uninitialized data to userspace via /dev/vcs*
serdev: fix memleak on module unload
serial: 8250_pci: Remove stalled entries in blacklist
n_tty: Access echo_* variables carefully.
n_tty: Fix stall at n_tty_receive_char_special().
Linus Torvalds [Sun, 1 Jul 2018 18:50:16 +0000 (11:50 -0700)]
Merge tag 'usb-4.18-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here is a number of USB gadget and other driver fixes for 4.18-rc3.
There's a bunch of them here, most of them being gadget driver and
xhci host controller fixes for reported issues (as normal), but there
are also some new device ids, and some fixes for the typec code.
There is an acpi core patch in here that was acked by the acpi
maintainer as it is needed for the typec fixes in order to properly
solve a problem in that driver.
All of these have been in linux-next this week with no reported
issues"
* tag 'usb-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (33 commits)
usb: chipidea: host: fix disconnection detect issue
usb: typec: tcpm: fix logbuffer index is wrong if _tcpm_log is re-entered
typec: tcpm: Fix a msecs vs jiffies bug
NFC: pn533: Fix wrong GFP flag usage
usb: cdc_acm: Add quirk for Uniden UBC125 scanner
staging/typec: fix tcpci_rt1711h build errors
usb: typec: ucsi: Fix for incorrect status data issue
usb: typec: ucsi: acpi: Workaround for cache mode issue
acpi: Add helper for deactivating memory region
usb: xhci: increase CRS timeout value
usb: xhci: tegra: fix runtime PM error handling
usb: xhci: remove the code build warning
xhci: Fix kernel oops in trace_xhci_free_virt_device
xhci: Fix perceived dead host due to runtime suspend race with event handler
dwc2: gadget: Fix ISOC IN DDMA PID bitfield value calculation
usb: gadget: dwc2: fix memory leak in gadget_init()
usb: gadget: composite: fix delayed_status race condition when set_interface
usb: dwc2: fix isoc split in transfer with no data
usb: dwc2: alloc dma aligned buffer for isoc split in
usb: dwc2: fix the incorrect bitmaps for the ports of multi_tt hub
...
Linus Torvalds [Sun, 1 Jul 2018 17:45:13 +0000 (10:45 -0700)]
Merge tag 'dma-mapping-4.18-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma mapping fixlet from Christoph Hellwig:
"Add a missing export required by riscv and unicore"
* tag 'dma-mapping-4.18-2' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: export swiotlb_dma_ops
Linus Torvalds [Sat, 30 Jun 2018 21:16:30 +0000 (14:16 -0700)]
Merge branch 'parisc-4.18-1' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes and cleanups from Helge Deller:
"Nothing exiting in this patchset, just
- small cleanups of header files
- default to 4 CPUs when building a SMP kernel
- mark 16kB and 64kB page sizes broken
- addition of the new io_pgetevents syscall"
* 'parisc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Build kernel without -ffunction-sections
parisc: Reduce debug output in unwind code
parisc: Wire up io_pgetevents syscall
parisc: Default to 4 SMP CPUs
parisc: Convert printk(KERN_LEVEL) to pr_lvl()
parisc: Mark 16kB and 64kB page sizes BROKEN
parisc: Drop struct sigaction from not exported header file
Linus Torvalds [Sat, 30 Jun 2018 21:08:06 +0000 (14:08 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A smaller batch for the end of the week (let's see if I can keep the
weekly cadence going for once).
All medium-grade fixes here, nothing worrisome:
- Fixes for some fairly old bugs around SD card write-protect
detection and GPIO interrupt assignments on Davinci.
- Wifi module suspend fix for Hikey.
- Minor DT tweaks to fix inaccuracies for Amlogic platforms, one
of which solves booting with third-party u-boot"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: dts: hikey960: Define wl1837 power capabilities
arm64: dts: hikey: Define wl1835 power capabilities
ARM64: dts: meson-gxl: fix Mali GPU compatible string
ARM64: dts: meson-axg: fix ethernet stability issue
ARM64: dts: meson-gx: fix ATF reserved memory region
ARM64: dts: meson-gxl-s905x-p212: Add phy-supply for usb0
ARM64: dts: meson: fix register ranges for SD/eMMC
ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
ARM: dts: da850: Fix interrups property for gpio
ARM: davinci: board-da850-evm: fix WP pin polarity for MMC/SD
Linus Torvalds [Sat, 30 Jun 2018 20:05:30 +0000 (13:05 -0700)]
Merge tag 'kbuild-fixes-v4.18' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- introduce __diag_* macros and suppress -Wattribute-alias warnings
from GCC 8
- fix stack protector test script for x86_64
- fix line number handling in Kconfig
- document that '#' starts a comment in Kconfig
- handle P_SYMBOL property in dump debugging of Kconfig
- correct help message of LD_DEAD_CODE_DATA_ELIMINATION
- fix occasional segmentation faults in Kconfig
* tag 'kbuild-fixes-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: loop boundary condition fix
kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION
kconfig: handle P_SYMBOL in print_symbol()
kconfig: document Kconfig source file comments
kconfig: fix line numbers for if-entries in menu tree
stack-protector: Fix test with 32-bit userland and CONFIG_64BIT=y
powerpc: Remove -Wattribute-alias pragmas
disable -Wattribute-alias warning for SYSCALL_DEFINEx()
kbuild: add macro for controlling warnings to linux/compiler.h
Linus Torvalds [Sat, 30 Jun 2018 18:42:14 +0000 (11:42 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"The biggest diffstat comes from self-test updates, plus there's entry
code fixes, 5-level paging related fixes, console debug output fixes,
and misc fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Clean up the printk()s in show_fault_oops()
x86/mm: Drop unneeded __always_inline for p4d page table helpers
x86/efi: Fix efi_call_phys_epilog() with CONFIG_X86_5LEVEL=y
selftests/x86/sigreturn: Do minor cleanups
selftests/x86/sigreturn/64: Fix spurious failures on AMD CPUs
x86/entry/64/compat: Fix "x86/entry/64/compat: Preserve r8-r11 in int $0x80"
x86/mm: Don't free P4D table when it is folded at runtime
x86/entry/32: Add explicit 'l' instruction suffix
x86/mm: Get rid of KERN_CONT in show_fault_oops()
Linus Torvalds [Sat, 30 Jun 2018 18:26:25 +0000 (11:26 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Tooling fixes mostly, plus a build warning fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
perf/core: Move inline keyword at the beginning of declaration
tools/headers: Pick up latest kernel ABIs
perf tools: Fix crash caused by accessing feat_ops[HEADER_LAST_FEATURE]
perf script: Fix crash because of missing evsel->priv
perf script: Add missing output fields in a hint
perf bench: Fix numa report output code
perf stat: Remove duplicate event counting
perf alias: Rebuild alias expression string to make it comparable
perf alias: Remove trailing newline when reading sysfs files
perf tools: Fix a clang 7.0 compilation error
tools include uapi: Synchronize bpf.h with the kernel
tools include uapi: Update if_link.h to pick IFLA_{BRPORT_ISOLATED,VXLAN_TTL_INHERIT}
tools include powerpc: Update arch/powerpc/include/uapi/asm/unistd.h copy to get 'rseq' syscall
perf tools: Update x86's syscall_64.tbl, adding 'io_pgetevents' and 'rseq'
tools headers uapi: Synchronize drm/drm.h
perf intel-pt: Fix packet decoding of CYC packets
perf tests: Add valid callback for parse-events test
perf tests: Add event parsing error handling to parse events test
perf report powerpc: Fix crash if callchain is empty
perf test session topology: Fix test on s390
...
Linus Torvalds [Sat, 30 Jun 2018 18:15:12 +0000 (11:15 -0700)]
Merge tag 'selinux-pr-
20180629' of git://git./linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One fairly straightforward patch to fix a longstanding issue where a
process could stall while accessing files in selinuxfs and block
everyone else due to a held mutex.
The patch passes all our tests and looks to apply cleanly to your
current tree"
* tag 'selinux-pr-
20180629' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: move user accesses in selinuxfs out of locked regions
Linus Torvalds [Sat, 30 Jun 2018 17:47:46 +0000 (10:47 -0700)]
Merge tag 'for-linus-
20180629' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Small set of fixes for this series. Mostly just minor fixes, the only
oddball in here is the sg change.
The sg change came out of the stall fix for NVMe, where we added a
mempool and limited us to a single page allocation. CONFIG_SG_DEBUG
sort-of ruins that, since we'd need to account for that. That's
actually a generic problem, since lots of drivers need to allocate SG
lists. So this just removes support for CONFIG_SG_DEBUG, which I added
back in 2007 and to my knowledge it was never useful.
Anyway, outside of that, this pull contains:
- clone of request with special payload fix (Bart)
- drbd discard handling fix (Bart)
- SATA blk-mq stall fix (me)
- chunk size fix (Keith)
- double free nvme rdma fix (Sagi)"
* tag 'for-linus-
20180629' of git://git.kernel.dk/linux-block:
sg: remove ->sg_magic member
drbd: Fix drbd_request_prepare() discard handling
blk-mq: don't queue more if we get a busy return
block: Fix cloning of requests with a special payload
nvme-rdma: fix possible double free of controller async event buffer
block: Fix transfer when chunk sectors exceeds max
Linus Torvalds [Sat, 30 Jun 2018 02:28:26 +0000 (19:28 -0700)]
Merge tag 'powerpc-4.18-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Two regression fixes, and a new syscall wire-up:
- A fix for the recent conversion to time64_t in the powermac RTC
routines, which caused time to go backward.
- Another fix for fallout from the split PMD PTL conversion.
- Wire up the new io_pgetevents() syscall.
Thanks to: Aneesh Kumar K.V, Arnd Bergmann, Breno Leitao, Mathieu
Malaterre"
* tag 'powerpc-4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powermac: Fix rtc read/write functions
powerpc/mm/32: Fix pgtable_page_dtor call
powerpc: Wire up io_pgetevents
Olof Johansson [Fri, 29 Jun 2018 21:08:27 +0000 (14:08 -0700)]
Merge tag 'davinci-fixes-for-v4.18' of git://git./linux/kernel/git/nsekhar/linux-davinci into fixes
This fixes polarity of SD card write-protect pin on DA850 EVM
and fixes interrupt property for DA850 SoC GPIO as defined in
device-tree.
Both of these are not introduced with v4.18 merge but have
existed prior.
* tag 'davinci-fixes-for-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nsekhar/linux-davinci:
ARM: dts: da850: Fix interrups property for gpio
ARM: davinci: board-da850-evm: fix WP pin polarity for MMC/SD
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Fri, 29 Jun 2018 21:06:49 +0000 (14:06 -0700)]
Merge tag 'hisi-fixes-for-4.18' of git://github.com/hisilicon/linux-hisi into fixes
ARM64: hisi fixes for 4.18
- Added power capabilities for the mmc host controller on the
hikey and hikey960 boards to avoid broken wifi.
* tag 'hisi-fixes-for-4.18' of git://github.com/hisilicon/linux-hisi:
arm64: dts: hikey960: Define wl1837 power capabilities
arm64: dts: hikey: Define wl1835 power capabilities
Signed-off-by: Olof Johansson <olof@lixom.net>
Olof Johansson [Fri, 29 Jun 2018 21:04:39 +0000 (14:04 -0700)]
Merge tag 'amlogic-fixes' of https://git./linux/kernel/git/khilman/linux-amlogic into fixes
Amlogic fixes for v4.18-rc
- minor 64-bit DT fixes
* tag 'amlogic-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic:
ARM64: dts: meson-gxl: fix Mali GPU compatible string
ARM64: dts: meson-axg: fix ethernet stability issue
ARM64: dts: meson-gx: fix ATF reserved memory region
ARM64: dts: meson-gxl-s905x-p212: Add phy-supply for usb0
ARM64: dts: meson: fix register ranges for SD/eMMC
ARM64: dts: meson: disable sd-uhs modes on the libretech-cc
Signed-off-by: Olof Johansson <olof@lixom.net>
Linus Torvalds [Fri, 29 Jun 2018 19:25:26 +0000 (12:25 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- The alternatives patching code uses flush_icache_range() which itself
uses alternatives. Change the code to use an unpatched variant of
cache maintenance
- Remove unnecessary ISBs from set_{pte,pmd,pud}
- perf: xgene_pmu: Fix IOB SLOW PMU parser error
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Remove unnecessary ISBs from set_{pte,pmd,pud}
arm64: Avoid flush_icache_range() in alternatives patching code
drivers/perf: xgene_pmu: Fix IOB SLOW PMU parser error
Linus Torvalds [Fri, 29 Jun 2018 19:21:12 +0000 (12:21 -0700)]
Merge branch 'i2c/for-current-fixed' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- a revert because of bugzilla #200045 (and some documentation about
it)
- another regression fix in the i2c-gpio driver
- a leak fix for the i2c core
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: gpio: initialize SCL to HIGH again
i2c: smbus: kill memory leak on emulated and failed DMA SMBus xfers
i2c: algos: bit: mention our experience about initial states
Revert "i2c: algo-bit: init the bus to a known state"
Linus Torvalds [Fri, 29 Jun 2018 19:19:47 +0000 (12:19 -0700)]
Merge tag 'ceph-for-4.18-rc3' of git://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"A trivial dentry leak fix from Zheng"
* tag 'ceph-for-4.18-rc3' of git://github.com/ceph/ceph-client:
ceph: fix dentry leak in splice_dentry()
Helge Deller [Fri, 20 Apr 2018 21:13:44 +0000 (23:13 +0200)]
parisc: Build kernel without -ffunction-sections
As suggested by Nick Piggin it seems we can drop the -ffunction-sections
compile flag, now that the kernel uses thin archives. Testing with 32-
and 64-bit kernel showed no difference in kernel size.
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Jens Axboe [Fri, 29 Jun 2018 14:48:06 +0000 (08:48 -0600)]
sg: remove ->sg_magic member
This was introduced more than a decade ago when sg chaining was
added, but we never really caught anything with it. The scatterlist
entry size can be critical, since drivers allocate it, so remove
the magic member. Recently it's been triggering allocation stalls
and failures in NVMe.
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 29 Jun 2018 14:22:46 +0000 (07:22 -0700)]
Merge tag 'pci-v4.18-fixes-1' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Fix crash caused by endpoint library initialization order change
(Alan Douglas)
- Fix shpchp NULL pointer dereference regression on non-ACPI platforms
(Bjorn Helgaas)
- Move PCI_DOMAINS selection to fix build regression (Lorenzo
Pieralisi)
* tag 'pci-v4.18-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: controller: Move PCI_DOMAINS selection to arch Kconfig
PCI: Initialize endpoint library before controllers
PCI: shpchp: Manage SHPC unconditionally on non-ACPI systems
Linus Torvalds [Fri, 29 Jun 2018 14:14:41 +0000 (07:14 -0700)]
Merge tag 'pm-4.18-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix up recently added features (the Kryo cpufreq driver and
performance states coverage in the generic power domains framework),
add missing documentation for a recently added sysfs knob in the
intel_pstate driver and fix an error in its documentation.
Specifics:
- Fix the initialization time error handling in the recently added
Kryo cpufreq driver (Dan Carpenter).
- Fix up the recently added coverage of performance states in the
generic power domains (genpd) framework (Viresh Kumar).
- Add missing documentation of the new hwp_dynamic_boost sysfs knob
in the intel_pstate driver (Rafael Wysocki).
- Fix incorrect sysfs path in the intel_pstate driver documentation
(Rafael Wysocki)"
* tag 'pm-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Documentation: intel_pstate: Describe hwp_dynamic_boost sysfs knob
Documentation: admin-guide: intel_pstate: Fix sysfs path
PM / Domains: Rename opp_node to np
PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
cpufreq: qcom-kryo: Fix error handling in probe()
Linus Torvalds [Fri, 29 Jun 2018 14:11:03 +0000 (07:11 -0700)]
Merge tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Nothing too major this round:
- small set of mali-dp fixes
- single meson fix
- a bunch of amdgpu fixes (one makes non-4k page sizes not be a bad
experience)"
* tag 'drm-fixes-2018-06-29' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: release spinlock before committing updates to stream
drm/amdgpu:Support new VCN FW version naming convention
drm/amdgpu: fix UBSAN: Undefined behaviour for amdgpu_fence.c
drm/meson: Fix an un-handled error path in 'meson_drv_bind_master()'
drm/amdgpu: GPU vs CPU page size fixes in amdgpu_vm_bo_split_mapping
drm/amdgpu: Count disabled CRTCs in commit tail earlier
drm/mali-dp: Rectify the width and height passed to rotmem_required()
drm/arm/malidp: Preserve LAYER_FORMAT contents when setting format
drm: mali-dp: Enable Global SE interrupts mask for DP500
drm/arm/malidp: Ensure that the crtcs are shutdown before removing any encoder/connector
Linus Torvalds [Fri, 29 Jun 2018 14:07:25 +0000 (07:07 -0700)]
Merge tag 'for-4.18/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix dm core to use more efficient bio_split() instead of
bio_clone_bioset(). Also fixes splitting bio that has integrity
payload.
- Three fixes related to properly validating DAX capabilities of a
stacked DM device that will advertise DAX support.
- Update DM writecache target to use 2-factor allocator arguments. Kees
says this is the last related change for 4.18.
- Fix DM zoned target to use GFP_NOIO to avoid triggering reclaim
during IO submission (caught by lockdep).
- Fix DM thinp to gracefully recover from running out of data space
while a previous async discard completes (whereby freeing space).
- Fix DM thinp's metadata transaction commit to avoid needless work.
* tag 'for-4.18/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: prevent DAX mounts if not supported
dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
pmem: only set QUEUE_FLAG_DAX for fsdax mode
dm thin: handle running out of data space vs concurrent discard
dm raid: don't use 'const' in function return
dm zoned: avoid triggering reclaim from inside dmz_map()
dm writecache: use 2-factor allocator arguments
dm thin metadata: remove needless work from __commit_transaction
dm: use bio_split() when splitting out the already processed bio
Jens Axboe [Fri, 29 Jun 2018 13:55:41 +0000 (07:55 -0600)]
Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linus
Pull single NVMe fix from Christoph.
* 'nvme-4.18' of git://git.infradead.org/nvme:
nvme-rdma: fix possible double free of controller async event buffer
Bart Van Assche [Mon, 25 Jun 2018 22:51:30 +0000 (15:51 -0700)]
drbd: Fix drbd_request_prepare() discard handling
Fix the test that verifies whether bio_op(bio) represents a discard
or write zeroes operation. Compile-tested only.
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Fixes: 7435e9018f91 ("drbd: zero-out partial unaligned discards on local backend")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 28 Jun 2018 17:54:01 +0000 (11:54 -0600)]
blk-mq: don't queue more if we get a busy return
Some devices have different queue limits depending on the type of IO. A
classic case is SATA NCQ, where some commands can queue, but others
cannot. If we have NCQ commands inflight and encounter a non-queueable
command, the driver returns busy. Currently we attempt to dispatch more
from the scheduler, if we were able to queue some commands. But for the
case where we ended up stopping due to BUSY, we should not attempt to
retrieve more from the scheduler. If we do, we can get into a situation
where we attempt to queue a non-queueable command, get BUSY, then
successfully retrieve more commands from that scheduler and queue those.
This can repeat forever, starving the non-queuable command indefinitely.
Fix this by NOT attempting to pull more commands from the scheduler, if
we get a BUSY return. This should also be more optimal in terms of
letting requests stay in the scheduler for as long as possible, if we
get a BUSY due to the regular out-of-tags condition.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Avi Kivity [Fri, 29 Jun 2018 13:37:25 +0000 (15:37 +0200)]
aio: mark __aio_sigset::sigmask const
io_pgetevents() will not change the signal mask. Mark it const to make
it clear and to reduce the need for casts in user code.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Avi Kivity <avi@scylladb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[hch: reapply the patch that got incorrectly reverted]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Fri, 29 Jun 2018 13:37:24 +0000 (15:37 +0200)]
net: handle NULL ->poll gracefully
The big aio poll revert broke various network protocols that don't
implement ->poll as a patch in the aio poll serie removed sock_no_poll
and made the common code handle this case.
Reported-by: syzbot+57727883dbad76db2ef0@syzkaller.appspotmail.com
Reported-by: syzbot+cdb0d3176b53d35ad454@syzkaller.appspotmail.com
Reported-by: syzbot+2c7e8f74f8b2571c87e8@syzkaller.appspotmail.com
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: a11e1d432b51 ("Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Fri, 29 Jun 2018 07:54:20 +0000 (09:54 +0200)]
Merge branch 'pm-domains'
Merge fixups for the recent extenstion of the generic power domains
(genpd) framework covering performance states.
* pm-domains:
PM / Domains: Rename opp_node to np
PM / Domains: Fix return value of of_genpd_opp_to_performance_state()
Wolfram Sang [Sat, 16 Jun 2018 12:56:36 +0000 (21:56 +0900)]
i2c: gpio: initialize SCL to HIGH again
It seems that during the conversion from gpio* to gpiod*, the initial
state of SCL was wrongly switched to LOW. Fix it to be HIGH again.
Fixes: 7bb75029ef34 ("i2c: gpio: Enforce open drain through gpiolib")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Peter Rosin [Wed, 20 Jun 2018 09:43:23 +0000 (11:43 +0200)]
i2c: smbus: kill memory leak on emulated and failed DMA SMBus xfers
If DMA safe memory was allocated, but the subsequent I2C transfer
fails the memory is leaked. Plug this leak.
Fixes: 8a77821e74d6 ("i2c: smbus: use DMA safe buffers for emulated SMBus transactions")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Wolfram Sang [Sat, 16 Jun 2018 13:37:57 +0000 (22:37 +0900)]
i2c: algos: bit: mention our experience about initial states
So, if somebody wants to re-implement this in the future, we pinpoint to
a problem case.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Sat, 16 Jun 2018 13:37:56 +0000 (22:37 +0900)]
Revert "i2c: algo-bit: init the bus to a known state"
This reverts commit
3e5f06bed72fe72166a6778f630241a893f67799. As per
bugzilla #200045, this caused a regression. I don't really see a way to
fix it without having the hardware. So, revert the patch and I will fix
the issue I was seeing originally in the i2c-gpio driver itself. I
couldn't find new users of this algorithm since, so there should be no
one depending on the new behaviour.
Reported-by: Sergey Larin <cerg2010cerg2010@mail.ru>
Fixes: 3e5f06bed72f ("i2c: algo-bit: init the bus to a known state")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Sergey Larin <cerg2010cerg2010@mail.ru>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Jann Horn [Fri, 29 Jun 2018 00:39:54 +0000 (20:39 -0400)]
selinux: move user accesses in selinuxfs out of locked regions
If a user is accessing a file in selinuxfs with a pointer to a userspace
buffer that is backed by e.g. a userfaultfd, the userspace access can
stall indefinitely, which can block fsi->mutex if it is held.
For sel_read_policy(), remove the locking, since this method doesn't seem
to access anything that requires locking.
For sel_read_bool(), move the user access below the locked region.
For sel_write_bool() and sel_commit_bools_write(), move the user access
up above the locked region.
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: removed an unused variable in sel_read_policy()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Helge Deller [Thu, 28 Jun 2018 20:21:24 +0000 (22:21 +0200)]
parisc: Reduce debug output in unwind code
Signed-off-by: Helge Deller <deller@gmx.de>
Dave Airlie [Thu, 28 Jun 2018 20:25:01 +0000 (06:25 +1000)]
Merge tag 'drm-misc-fixes-2018-06-28' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v4.18-rc3:
- A single fix in meson for an unhandled error path in meson_drv_bind_master().
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/fa740f31-5a8d-ed45-5e8a-aecd3f6f11b7@linux.intel.com
Dave Airlie [Thu, 28 Jun 2018 20:21:12 +0000 (06:21 +1000)]
Merge branch 'drm-fixes-4.18' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few fixes for 4.18:
- fix a read past the end of an array due to vega20 changes
- fix driver on systems with non-4K pages
- fix locking with pageflipping in DC that could lead to a sleep while atomic
- fix VCN firmware version reporting for upcoming firmware
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180628032641.2765-1-alexander.deucher@amd.com
Ross Zwisler [Tue, 26 Jun 2018 22:30:41 +0000 (16:30 -0600)]
dm: prevent DAX mounts if not supported
Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
flag is set on the device's request queue to decide whether or not the
device supports filesystem DAX. Really we should be using
bdev_dax_supported() like filesystems do at mount time. This performs
other tests like checking to make sure the dax_direct_access() path works.
We also explicitly clear QUEUE_FLAG_DAX on the DM device's request queue if
any of the underlying devices do not support DAX. This makes the handling
of QUEUE_FLAG_DAX consistent with the setting/clearing of most other flags
in dm_table_set_restrictions().
Now that bdev_dax_supported() explicitly checks for QUEUE_FLAG_DAX, this
will ensure that filesystems built upon DM devices will only be able to
mount with DAX if all underlying devices also support DAX.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Ross Zwisler [Tue, 26 Jun 2018 22:30:40 +0000 (16:30 -0600)]
dax: check for QUEUE_FLAG_DAX in bdev_dax_supported()
Add an explicit check for QUEUE_FLAG_DAX to __bdev_dax_supported(). This
is needed for DM configurations where the first element in the dm-linear or
dm-stripe target supports DAX, but other elements do not. Without this
check __bdev_dax_supported() will pass for such devices, letting a
filesystem on that device mount with the DAX option.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Mike Snitzer <snitzer@redhat.com>
Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Ross Zwisler [Tue, 26 Jun 2018 22:30:39 +0000 (16:30 -0600)]
pmem: only set QUEUE_FLAG_DAX for fsdax mode
QUEUE_FLAG_DAX is an indication that a given block device supports
filesystem DAX and should not be set for PMEM namespaces which are in "raw"
mode. These namespaces lack struct page and are prevented from
participating in filesystem DAX as of commit
569d0365f571 ("dax: require
'struct page' by default for filesystem dax").
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 569d0365f571 ("dax: require 'struct page' by default for filesystem dax")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Linus Torvalds [Thu, 28 Jun 2018 19:45:34 +0000 (12:45 -0700)]
Merge tag 'printk-for-4.18-rc3' of git://git./linux/kernel/git/pmladek/printk
Pull printk fix from Petr Mladek:
"Revert a commit that went in by mistake. I already have a better fix
in the queue for 4.19"
* tag 'printk-for-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk:
Revert "lib/test_printf.c: call wait_for_random_bytes() before plain %p tests"
Linus Torvalds [Thu, 28 Jun 2018 19:43:37 +0000 (12:43 -0700)]
Merge tag 'sound-4.18-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Over a dozen changes, but all small and clear fixes.
Half of them are the regression fixes for CA0132 HD-audio codec, and
the rest are, again, a few more fixups for HD-audio, two UBSAN fixes
in the core ioctls, and a trivial fix in the error path handling in
lx6464es driver"
* tag 'sound-4.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: seq: Fix UBSAN warning at SNDRV_SEQ_IOCTL_QUERY_NEXT_CLIENT ioctl
ALSA: timer: Fix UBSAN warning at SNDRV_TIMER_IOCTL_NEXT_DEVICE ioctl
ALSA: hda/realtek - Fix the problem of two front mics on more machines
ALSA: hda/realtek - Add a quirk for FSC ESPRIMO U9210
ALSA: hda/ca0132: make array ca0132_alt_chmaps static
ALSA: hda - Force to link down at runtime suspend on ATI/AMD HDMI
ALSA: lx6464es: Missing error code in snd_lx6464es_create()
ALSA: hda/ca0132: Fix DMic data rate for Alienware M17x R4
ALSA: hda/ca0132: Restore PCM Analog Mic-In2
ALSA: hda/ca0132: Don't test for QUIRK_NONE
ALSA: hda/ca0132: Restore behavior of QUIRK_ALIENWARE
ALSA: hda/ca0132: Delete redundant UNSOL event requests
ALSA: hda/ca0132: Delete pointless assignments to struct auto_pin_cfg fields
ALSA: hda/realtek - Fix pop noise on Lenovo P50 & co
Linus Torvalds [Thu, 28 Jun 2018 19:31:59 +0000 (12:31 -0700)]
Merge tag 'mtd/fixes-for-4.18-rc3' of git://git.infradead.org/linux-mtd
Pull mtd fixes from Boris Brezillon:
"NAND fixes:
- add a quirk for a bunch of broken Macronix chips
- fix nand_block_bad() when chip->ecc.read_oob() returns a positive
value encoding the number of bitflips
- fix OOB handling in the MXC driver fo V2.1 controllers
- flag the ONFI_FEATURE_ON_DIE_ECC as supported in the Micron driver
- hardcode clk rate in the denali_dt driver to address a bad DT
representation (the proper fix will be queued for 4.19)
SPI NOR fixes:
- add an ULL constant to some ID definitions so that the ID is not
truncated on 32-bit platforms
MTD fixes:
- fix the sector unlocking logic in the CFI driver"
* tag 'mtd/fixes-for-4.18-rc3' of git://git.infradead.org/linux-mtd:
mtd: rawnand: denali_dt: set clk_x_rate to 200 MHz unconditionally
mtd: dataflash: Use ULL suffix for 64-bit constants
mtd: cfi_cmdset_0002: Avoid walking all chips when unlocking.
mtd: cfi_cmdset_0002: Fix unlocking requests crossing a chip boudary
mtd: cfi_cmdset_0002: fix SEGV unlocking multiple chips
mtd: cfi_cmdset_0002: Use right chip in do_ppb_xxlock()
mtd: rawnand: All AC chips have a broken GET_FEATURES(TIMINGS).
mtd: rawnand: fix return value check for bad block status
mtd: rawnand: mxc: set spare area size register explicitly
mtd: rawnand: micron: add ONFI_FEATURE_ON_DIE_ECC to supported features
Linus Torvalds [Thu, 28 Jun 2018 18:42:56 +0000 (11:42 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"7 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
proc: add Alexey to MAINTAINERS
kasan: depend on CONFIG_SLUB_DEBUG
include/linux/dax.h: dax_iomap_fault() returns vm_fault_t
x86/e820: put !E820_TYPE_RAM regions into memblock.reserved
slub: fix failure when we delete and create a slab cache
Revert mm/vmstat.c: fix vmstat_update() preemption BUG
lib/percpu_ida.c: don't do alloc from per-CPU list if there is none
Alexey Dobriyan [Thu, 28 Jun 2018 06:26:24 +0000 (23:26 -0700)]
proc: add Alexey to MAINTAINERS
I know I'll regret it.
Link: http://lkml.kernel.org/r/20180627194840.GA18113@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jason A. Donenfeld [Thu, 28 Jun 2018 06:26:20 +0000 (23:26 -0700)]
kasan: depend on CONFIG_SLUB_DEBUG
KASAN depends on having access to some of the accounting that SLUB_DEBUG
does; without it, there are immediate crashes [1]. So, the natural
thing to do is to make KASAN select SLUB_DEBUG.
[1] http://lkml.kernel.org/r/CAHmME9rtoPwxUSnktxzKso14iuVCWT7BE_-_8PAC=pGw1iJnQg@mail.gmail.com
Link: http://lkml.kernel.org/r/20180622154623.25388-1-Jason@zx2c4.com
Fixes: f9e13c0a5a33 ("slab, slub: skip unnecessary kasan_cache_shutdown()")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Souptick Joarder [Thu, 28 Jun 2018 06:26:17 +0000 (23:26 -0700)]
include/linux/dax.h: dax_iomap_fault() returns vm_fault_t
Commit
1c8f422059ae ("mm: change return type to vm_fault_t") missed a
conversion. It's not a big problem at present because mainline is still
using
typedef int vm_fault_t;
Fixes: 1c8f422059ae ("mm: change return type to vm_fault_t")
Link: http://lkml.kernel.org/r/20180620172046.GA27894@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Thu, 28 Jun 2018 06:26:13 +0000 (23:26 -0700)]
x86/e820: put !E820_TYPE_RAM regions into memblock.reserved
There is a kernel panic that is triggered when reading /proc/kpageflags
on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]':
BUG: unable to handle kernel paging request at
fffffffffffffffe
PGD
9b20e067 P4D
9b20e067 PUD
9b210067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 2 PID: 1728 Comm: page-types Not tainted
4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
RIP: 0010:stable_page_flags+0x27/0x3c0
Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7
RSP: 0018:
ffffbbd44111fde0 EFLAGS:
00010202
RAX:
fffffffffffffffe RBX:
00007fffffffeff9 RCX:
0000000000000000
RDX:
0000000000000001 RSI:
0000000000000202 RDI:
ffffed1182fff5c0
RBP:
ffffffffffffffff R08:
0000000000000001 R09:
0000000000000001
R10:
ffffbbd44111fed8 R11:
0000000000000000 R12:
ffffed1182fff5c0
R13:
00000000000bffd7 R14:
0000000002fff5c0 R15:
ffffbbd44111ff10
FS:
00007efc4335a500(0000) GS:
ffff93a5bfc00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
fffffffffffffffe CR3:
00000000b2a58000 CR4:
00000000001406e0
Call Trace:
kpageflags_read+0xc7/0x120
proc_reg_read+0x3c/0x60
__vfs_read+0x36/0x170
vfs_read+0x89/0x130
ksys_pread64+0x71/0x90
do_syscall_64+0x5b/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7efc42e75e23
Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24
According to kernel bisection, this problem became visible due to commit
f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
which changes how struct pages are initialized.
Memblock layout affects the pfn ranges covered by node/zone. Consider
that we have a VM with 2 NUMA nodes and each node has 4GB memory, and
the default (no memmap= given) memblock layout is like below:
MEMBLOCK configuration:
memory size = 0x00000001fff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x4
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000100000000-0x000000013fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0
memory[0x3] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
If you give memmap=1G!4G (so it just covers memory[0x2]),
the range [0x100000000-0x13fffffff] is gone:
MEMBLOCK configuration:
memory size = 0x00000001bff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x3
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
This causes shrinking node 0's pfn range because it is calculated by the
address range of memblock.memory. So some of struct pages in the gap
range are left uninitialized.
We have a function zero_resv_unavail() which does zeroing the struct pages
within the reserved unavailable range (i.e. memblock.memory &&
!memblock.reserved). This patch utilizes it to cover all unavailable
ranges by putting them into memblock.reserved.
Link: http://lkml.kernel.org/r/20180615072947.GB23273@hori1.linux.bs1.fc.nec.co.jp
Fixes: f7f99100d8d9 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Oscar Salvador <osalvador@suse.de>
Tested-by: "Herton R. Krzesinski" <herton@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Thu, 28 Jun 2018 06:26:09 +0000 (23:26 -0700)]
slub: fix failure when we delete and create a slab cache
In kernel 4.17 I removed some code from dm-bufio that did slab cache
merging (commit
21bb13276768: "dm bufio: remove code that merges slab
caches") - both slab and slub support merging caches with identical
attributes, so dm-bufio now just calls kmem_cache_create and relies on
implicit merging.
This uncovered a bug in the slub subsystem - if we delete a cache and
immediatelly create another cache with the same attributes, it fails
because of duplicate filename in /sys/kernel/slab/. The slub subsystem
offloads freeing the cache to a workqueue - and if we create the new
cache before the workqueue runs, it complains because of duplicate
filename in sysfs.
This patch fixes the bug by moving the call of kobject_del from
sysfs_slab_remove_workfn to shutdown_cache. kobject_del must be called
while we hold slab_mutex - so that the sysfs entry is deleted before a
cache with the same attributes could be created.
Running device-mapper-test-suite with:
dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/
triggered:
Buffer I/O error on dev dm-0, logical block
1572848, async page read
device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
device-mapper: thin: 253:1: aborting current metadata transaction
sysfs: cannot create duplicate filename '/kernel/slab/:a-
0000144'
CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
Workqueue: dm-thin do_worker [dm_thin_pool]
Call Trace:
dump_stack+0x5a/0x73
sysfs_warn_dup+0x58/0x70
sysfs_create_dir_ns+0x77/0x80
kobject_add_internal+0xba/0x2e0
kobject_init_and_add+0x70/0xb0
sysfs_slab_add+0xb1/0x250
__kmem_cache_create+0x116/0x150
create_cache+0xd9/0x1f0
kmem_cache_create_usercopy+0x1c1/0x250
kmem_cache_create+0x18/0x20
dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
__create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
metadata_operation_failed+0x59/0x100 [dm_thin_pool]
alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
process_cell+0x2a3/0x550 [dm_thin_pool]
do_worker+0x28d/0x8f0 [dm_thin_pool]
process_one_work+0x171/0x370
worker_thread+0x49/0x3f0
kthread+0xf8/0x130
ret_from_fork+0x35/0x40
kobject_add_internal failed for :a-
0000144 with -EEXIST, don't try to register things with the same name in the same directory.
kmem_cache_create(dm_bufio_buffer-16) failed with error -17
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Andrzej Siewior [Thu, 28 Jun 2018 06:26:05 +0000 (23:26 -0700)]
Revert mm/vmstat.c: fix vmstat_update() preemption BUG
Revert commit
c7f26ccfb2c3 ("mm/vmstat.c: fix vmstat_update() preemption
BUG"). Steven saw a "using smp_processor_id() in preemptible" message
and added a preempt_disable() section around it to keep it quiet. This
is not the right thing to do it does not fix the real problem.
vmstat_update() is invoked by a kworker on a specific CPU. This worker
it bound to this CPU. The name of the worker was "kworker/1:1" so it
should have been a worker which was bound to CPU1. A worker which can
run on any CPU would have a `u' before the first digit.
smp_processor_id() can be used in a preempt-enabled region as long as
the task is bound to a single CPU which is the case here. If it could
run on an arbitrary CPU then this is the problem we have an should seek
to resolve.
Not only this smp_processor_id() must not be migrated to another CPU but
also refresh_cpu_vm_stats() which might access wrong per-CPU variables.
Not to mention that other code relies on the fact that such a worker
runs on one specific CPU only.
Therefore revert that commit and we should look instead what broke the
affinity mask of the kworker.
Link: http://lkml.kernel.org/r/20180504104451.20278-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sebastian Andrzej Siewior [Thu, 28 Jun 2018 06:26:01 +0000 (23:26 -0700)]
lib/percpu_ida.c: don't do alloc from per-CPU list if there is none
In commit
804209d8a009 ("lib/percpu_ida.c: use _irqsave() instead of
local_irq_save() + spin_lock") I inlined alloc_local_tag() and mixed up
the >= check from percpu_ida_alloc() with the one in alloc_local_tag().
Don't alloc from per-CPU freelist if ->nr_free is zero.
Link: http://lkml.kernel.org/r/20180613075830.c3zeva52fuj6fxxv@linutronix.de
Fixes: 804209d8a009 ("lib/percpu_ida.c: use _irqsave() instead of local_irq_save() + spin_lock")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: David Disseldorp <ddiss@suse.de>
Tested-by: David Disseldorp <ddiss@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Shaohua Li <shli@fb.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 28 Jun 2018 16:43:44 +0000 (09:43 -0700)]
Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL
The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression, because
"->poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.
Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"->get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead. That gets rid of one of the new indirections.
But that doesn't fix the new complexity that is completely unwarranted
for the regular case. The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.
[ This revert is a revert of about 30 different commits, not reverted
individually because that would just be unnecessarily messy - Linus ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
oscardagrach [Wed, 13 Jun 2018 18:03:21 +0000 (13:03 -0500)]
arm64: dts: hikey960: Define wl1837 power capabilities
These properties are required for compatibility with runtime PM.
Without these properties, MMC host controller will not be aware
of power capabilities. When the wlcore driver attempts to power
on the device, it will erroneously fail with -EACCES. This fixes
a regression found here: https://lkml.org/lkml/2018/6/12/930
Fixes: 60f36637bbbd ("wlcore: sdio: allow pm to handle sdio power")
Signed-off-by: Ryan Grachek <ryan@edited.us>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
oscardagrach [Wed, 13 Jun 2018 15:13:05 +0000 (10:13 -0500)]
arm64: dts: hikey: Define wl1835 power capabilities
These properties are required for compatibility with runtime PM.
Without these properties, MMC host controller will not be aware
of power capabilities. When the wlcore driver attempts to power
on the device, it will erroneously fail with -EACCES.
Fixes: 60f36637bbbd ("wlcore: sdio: allow pm to handle sdio power")
Signed-off-by: Ryan Grachek <ryan@edited.us>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Bart Van Assche [Wed, 27 Jun 2018 19:55:18 +0000 (12:55 -0700)]
block: Fix cloning of requests with a special payload
This patch avoids that removing a path controlled by the dm-mpath driver
while mkfs is running triggers the following kernel bug:
kernel BUG at block/blk-core.c:3347!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 20 PID: 24369 Comm: mkfs.ext4 Not tainted 4.18.0-rc1-dbg+ #2
RIP: 0010:blk_end_request_all+0x68/0x70
Call Trace:
<IRQ>
dm_softirq_done+0x326/0x3d0 [dm_mod]
blk_done_softirq+0x19b/0x1e0
__do_softirq+0x128/0x60d
irq_exit+0x100/0x110
smp_call_function_single_interrupt+0x90/0x330
call_function_single_interrupt+0xf/0x20
</IRQ>
Fixes: f9d03f96b988 ("block: improve handling of the magic discard payload")
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Helge Deller [Thu, 28 Jun 2018 15:41:58 +0000 (17:41 +0200)]
parisc: Wire up io_pgetevents syscall
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Fri, 15 Jun 2018 20:38:22 +0000 (22:38 +0200)]
parisc: Default to 4 SMP CPUs
I haven't seen any real SMP machine yet with > 4 CPUs (we don't suport
SuperDomes yet), so reducing the default maximum number of CPUs may speed up
various bitop functions which depend on number of CPUs in the system.
bload-o-meter on a typical 64-bit kernel shows:
Data: add/remove: 0/0 grow/shrink: 0/10 up/down: 0/-3724 (-3724)
Total: Before=
1910404, After=
1906680, chg -0.19%
Code: add/remove: 0/2 grow/shrink: 42/38 up/down: 2320/-3500 (-1180)
Total: Before=
11053099, After=
11051919, chg -0.01%
Signed-off-by: Helge Deller <deller@gmx.de>
Andy Shevchenko [Tue, 29 May 2018 19:38:01 +0000 (22:38 +0300)]
parisc: Convert printk(KERN_LEVEL) to pr_lvl()
Convert printk(KERN_LEVEL) type of calls to pr_lvl() macros.
While here,
- convert printk() to pr_info()
- join back string literal to be on one line
- use %*phN (note, it gives 1 byte more for sake of simplicity)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Fri, 15 Jun 2018 20:33:44 +0000 (22:33 +0200)]
parisc: Mark 16kB and 64kB page sizes BROKEN
A full boot only succeeds with 4kB page sizes currently.
For 16kB and 64kB page size support somone needs to fix the LBA PCI code
at least, so mark those broken for now.
Signed-off-by: Helge Deller <deller@gmx.de>
Helge Deller [Fri, 18 May 2018 14:27:15 +0000 (16:27 +0200)]
parisc: Drop struct sigaction from not exported header file
This header file isn't exported to userspace, so there is no benefit in
defining struct sigaction for userspace here.
Signed-off-by: Helge Deller <deller@gmx.de>
Sagi Grimberg [Mon, 25 Jun 2018 17:58:17 +0000 (20:58 +0300)]
nvme-rdma: fix possible double free of controller async event buffer
If reconnect/reset failed where the controller async event buffer
was freed, we might end up freeing it again as we call
nvme_rdma_destroy_admin_queue again in the remove path. Given that
the sequence is guaranteed to serialize by .ctrl_stop, we simply
set ctrl->async_event_sqe.data to NULL and don't free it in future
visits.
Reported-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Jerry James [Sat, 23 Jun 2018 20:49:04 +0000 (22:49 +0200)]
kconfig: loop boundary condition fix
If buf[-1] just happens to hold the byte 0x0A, then nread can wrap around
to (size_t)-1, leading to invalid memory accesses.
This has caused segmentation faults when trying to build the latest
kernel snapshots for i686 in Fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=
1592374
Signed-off-by: Jerry James <loganjerry@gmail.com>
[alexpl@fedoraproject.org: reformatted patch for submission]
Signed-off-by: Alexander Ploumistos <alexpl@fedoraproject.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Masahiro Yamada [Sat, 23 Jun 2018 16:41:51 +0000 (01:41 +0900)]
kbuild: reword help of LD_DEAD_CODE_DATA_ELIMINATION
Since commit
5d20ee3192a5 ("kbuild: Allow LD_DEAD_CODE_DATA_ELIMINATION
to be selectable if enabled"), HAVE_LD_DEAD_CODE_DATA_ELIMINATION is
supposed to be selected by architectures that are capable of this
functionality. LD_DEAD_CODE_DATA_ELIMINATION is now users' selection.
Update the help message.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Dirk Gouders [Fri, 22 Jun 2018 19:27:38 +0000 (21:27 +0200)]
kconfig: handle P_SYMBOL in print_symbol()
Each symbol has a property of type P_SYMBOL since commit
59e89e3ddf85 (kconfig: save location of config symbols).
Handle those properties in print_symbol().
Further, place a pointer to print_symbol() in the comment above the
list of known property type.
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Alexander Potapenko [Thu, 14 Jun 2018 10:23:09 +0000 (12:23 +0200)]
vt: prevent leaking uninitialized data to userspace via /dev/vcs*
KMSAN reported an infoleak when reading from /dev/vcs*:
BUG: KMSAN: kernel-infoleak in vcs_read+0x18ba/0x1cc0
Call Trace:
...
kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1253
copy_to_user ./include/linux/uaccess.h:184
vcs_read+0x18ba/0x1cc0 drivers/tty/vt/vc_screen.c:352
__vfs_read+0x1b2/0x9d0 fs/read_write.c:416
vfs_read+0x36c/0x6b0 fs/read_write.c:452
...
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279
kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
__kmalloc+0x13a/0x350 mm/slub.c:3818
kmalloc ./include/linux/slab.h:517
vc_allocate+0x438/0x800 drivers/tty/vt/vt.c:787
con_install+0x8c/0x640 drivers/tty/vt/vt.c:2880
tty_driver_install_tty drivers/tty/tty_io.c:1224
tty_init_dev+0x1b5/0x1020 drivers/tty/tty_io.c:1324
tty_open_by_driver drivers/tty/tty_io.c:1959
tty_open+0x17b4/0x2ed0 drivers/tty/tty_io.c:2007
chrdev_open+0xc25/0xd90 fs/char_dev.c:417
do_dentry_open+0xccc/0x1440 fs/open.c:794
vfs_open+0x1b6/0x2f0 fs/open.c:908
...
Bytes 0-79 of 240 are uninitialized
Consistently allocating |vc_screenbuf| with kzalloc() fixes the problem
Reported-by: syzbot+17a8efdf800000@syzkaller.appspotmail.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Wed, 13 Jun 2018 15:08:59 +0000 (17:08 +0200)]
serdev: fix memleak on module unload
Make sure to free all resources associated with the ida on module
exit.
Fixes: cd6484e1830b ("serdev: Introduce new bus for serial attached devices")
Cc: stable <stable@vger.kernel.org> # 4.11
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Shevchenko [Wed, 6 Jun 2018 18:00:41 +0000 (21:00 +0300)]
serial: 8250_pci: Remove stalled entries in blacklist
After the commit
7d8905d06405 ("serial: 8250_pci: Enable device after we check black list")
pure serial multi-port cards, such as CH355, got blacklisted and thus
not being enumerated anymore. Previously, it seems, blacklisting them
was on purpose to shut up pciserial_init_one() about record duplication.
So, remove the entries from blacklist in order to get cards enumerated.
Fixes: 7d8905d06405 ("serial: 8250_pci: Enable device after we check black list")
Reported-by: Matt Turner <mattst88@gmail.com>
Cc: Sergej Pupykin <ml@sergej.pp.ru>
Cc: Alexandr Petrenko <petrenkoas83@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tetsuo Handa [Sat, 26 May 2018 00:53:14 +0000 (09:53 +0900)]
n_tty: Access echo_* variables carefully.
syzbot is reporting stalls at __process_echoes() [1]. This is because
since ldata->echo_commit < ldata->echo_tail becomes true for some reason,
the discard loop is serving as almost infinite loop. This patch tries to
avoid falling into ldata->echo_commit < ldata->echo_tail situation by
making access to echo_* variables more carefully.
Since reset_buffer_flags() is called without output_lock held, it should
not touch echo_* variables. And omit a call to reset_buffer_flags() from
n_tty_open() by using vzalloc().
Since add_echo_byte() is called without output_lock held, it needs memory
barrier between storing into echo_buf[] and incrementing echo_head counter.
echo_buf() needs corresponding memory barrier before reading echo_buf[].
Lack of handling the possibility of not-yet-stored multi-byte operation
might be the reason of falling into ldata->echo_commit < ldata->echo_tail
situation, for if I do WARN_ON(ldata->echo_commit == tail + 1) prior to
echo_buf(ldata, tail + 1), the WARN_ON() fires.
Also, explicitly masking with buffer for the former "while" loop, and
use ldata->echo_commit > tail for the latter "while" loop.
[1] https://syzkaller.appspot.com/bug?id=
17f23b094cd80df750e5b0f8982c521ee6bcbf40
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+108696293d7a21ab688f@syzkaller.appspotmail.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tetsuo Handa [Sat, 26 May 2018 00:53:13 +0000 (09:53 +0900)]
n_tty: Fix stall at n_tty_receive_char_special().
syzbot is reporting stalls at n_tty_receive_char_special() [1]. This is
because comparison is not working as expected since ldata->read_head can
change at any moment. Mitigate this by explicitly masking with buffer size
when checking condition for "while" loops.
[1] https://syzkaller.appspot.com/bug?id=
3d7481a346958d9469bebbeb0537d5f056bdd6e8
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+18df353d7540aa6b5467@syzkaller.appspotmail.com>
Fixes: bc5a5e3f45d04784 ("n_tty: Don't wrap input buffer indices at buffer size")
Cc: stable <stable@vger.kernel.org>
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>