Paul Burton [Mon, 17 Oct 2016 14:34:36 +0000 (15:34 +0100)]
MIPS: Cleanup LLBit handling in switch_to
Commit
7c151d3d5d7a ("MIPS: Make use of the ERETNC instruction on MIPS
R6") began clearing LLBit during context switches, but did so on all
systems where it is writable for unclear reasons & did so from a macro
with "software_ll_bit" in its name, which is intended to operate on the
ll_bit variable used by ll/sc emulation for old CPUs.
We do now need to clear LLBit on MIPSr6 systems where we'll use eretnc
to return to userland, but we don't need to do so on MIPSr5 systems with
a writable LLBit.
Move the clear to its own appropriately named macro, do it only for
MIPSr6 systems & comment about why.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14409/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 17 Oct 2016 14:34:35 +0000 (15:34 +0100)]
MIPS: Remove r2_emul_return from struct thread_info
The r2_emul_return field in struct thread_info was used in order to take
an alternate codepath when returning to userland, which (besides not
implementing certain features) effectively used the eretnc instruction
in place of eret. The difference is that eretnc doesn't clear LLBit, and
therefore doesn't cause a linked load & store sequence to fail due to
emulation like eret would.
The reason eret would usually be used to clear LLBit is so that after
context switching we ensure that a load performed by one task doesn't
influence another task. However commit
7c151d3d5d7a ("MIPS: Make use of
the ERETNC instruction on MIPS R6") which introduced the r2_emul_return
field and conditional use of eretnc also for some reason began
explicitly clearing LLBit during context switches - despite retaining
the use of eret for everything but returns from the pre-r6 instruction
emulation code.
As LLBit is cleared upon context switches anyway, simplify this by using
eretnc unconditionally for MIPSr6 kernels. This allows us to remove the
4 byte r2_emul_return boolean from struct thread_info, simplify the
return to user code in entry.S and avoid the overhead of tracking &
checking state which we don't need.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14408/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Rahul Bedarkar [Fri, 14 Oct 2016 05:55:55 +0000 (11:25 +0530)]
MIPS: DTS: img: add device tree for Marduk board
Add support for Imagination Technologies' Marduk board which is based
on Pistachio SoC. It is also known as Creator Ci40. Marduk is legacy
name and will be there for decades.
Documentation for this board can be found on
https://docs.creatordev.io/ci40/
This patch adds initial support for board with following peripherals:
* PWM based heartbeat LED
* GPIO based buttons
* SPI NOR flash on SPI1
* UART0 and UART1
* SD card
* Ethernet
* USB
* PWM
* ADC
* I2C
Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: James Hartley <james.hartley@imgtec.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14394/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Rahul Bedarkar [Fri, 14 Oct 2016 05:55:54 +0000 (11:25 +0530)]
MIPS: DTS: Add base device tree for Pistachio SoC
Add support for the base Device Tree for Imagination Technologies'
Pistachio SoC.
This commit supports the following peripherals:
* Clocks
* Pinctrl and GPIO
* UART
* SPI
* I2C
* PWM
* ADC
* Watchdog
* Ethernet
* MMC
* DMA engine
* Crypto
* I2S
* SPDIF
* Internal DAC
* Timer
* USB
* IR
* Interrupt Controller
Signed-off-by: Rahul Bedarkar <rahul.bedarkar@imgtec.com>
Acked-by: James Hartley <james.hartley@imgtec.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-mips@linux-mips.org
Cc: devicetree@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14393/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 17 Oct 2016 15:01:07 +0000 (16:01 +0100)]
MIPS: traps: Ensure L1 & L2 ECC checking match for CM3 systems
On systems with CM3, we must ensure that the L1 & L2 ECC enables are set
to the same value. This is presumed by the hardware & cache corruption
can occur when it is not the case. Support enabling & disabling the L2
ECC checking on CM3 systems where this is controlled via a GCR, and
ensure that it matches the state of L1 ECC checking. Remove I6400 from
the switch statement it will no longer hit, and which was incorrect
since the L2 ECC enable bit isn't in the CP0 ErrCtl register.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14413/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Leonid Yegoshin [Thu, 25 Aug 2016 17:37:38 +0000 (10:37 -0700)]
MIPS: R2-on-R6 MULTU/MADDU/MSUBU emulation bugfix
MIPS instructions MULTU, MADDU and MSUBU emulation requires registers HI/LO
to be converted to signed 32bits before 64bit sign extension on MIPS64.
Bug was found on running MIPS32 R2 test application on MIPS64 R6 kernel.
Fixes: b0a668fb2038 ("MIPS: kernel: mips-r2-to-r6-emul: Add R2 emulator for MIPS R6")
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Reported-by: Nikola.Veljkovic@imgtec.com
Cc: paul.burton@imgtec.com
Cc: yamada.masahiro@socionext.com
Cc: akpm@linux-foundation.org
Cc: andrea.gelmini@gelma.net
Cc: macro@imgtec.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14043/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Kelvin Cheung [Tue, 9 Aug 2016 10:52:59 +0000 (18:52 +0800)]
MIPS: Loongson1B: Modify DEFAULT_MEMSIZE
This patch changes DEFAULT_MEMSIZE to 64MB
which is the memory size of latest EVB.
Signed-off-by: Kelvin Cheung <keguang.zhang@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13856/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Kelvin Cheung [Tue, 9 Aug 2016 08:23:11 +0000 (16:23 +0800)]
MIPS: Loongson1C: Remove ARCH_WANT_OPTIONAL_GPIOLIB
This patch removes ARCH_WANT_OPTIONAL_GPIOLIB due to upstream changes.
Signed-off-by: Kelvin Cheung <keguang.zhang@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Yang Ling <gnaygnil@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/13855/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Florian Fainelli [Mon, 31 Oct 2016 21:17:36 +0000 (14:17 -0700)]
MIPS: BMIPS: Migrate interrupts during bmips_cpu_disable
While we properly disabled the per-CPU timer interrupt, we also need to
make sure that all interrupts that can possibly have this CPU in their
smp_affinity mask also have a chance to see this interrupt migrated to a
CPU not being taken offline.
[ralf@linux-mips.org: Fix merge conflict.]
Fixes: 230b6ff57552 ("MIPS: BMIPS: Mask off timer IRQs when hot-unplugging a CPU")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: cernekee@gmail.com
Cc: jaedon.shin@gmail.com
Cc: justinpopo6@gmail.com
Cc: tglx@linutronix.de
Cc: marc.zyngier@arm.com
Cc: jason@lakedaemon.net
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14488/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Zubair Lutfullah Kakakhel [Tue, 22 Nov 2016 17:52:43 +0000 (17:52 +0000)]
MIPS: xilfpga: Update defconfig
Update defconfig to enable emaclite, i2c and temp sensor on the
xilfpga platform
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14595/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Zubair Lutfullah Kakakhel [Tue, 22 Nov 2016 17:52:42 +0000 (17:52 +0000)]
MIPS: xilfpga: Add DT node for AXI emaclite
The xilfpga platform has a Xilinx AXI emaclite block.
Add the DT node to use it.
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14596/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Zubair Lutfullah Kakakhel [Tue, 22 Nov 2016 17:52:41 +0000 (17:52 +0000)]
MIPS: xilfpga: Add DT node for AXI I2C
The xilfpga platform has an AXI I2C Bus master with a temperature
sensor connected to it.
Add the device tree node to use them.
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14594/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Zubair Lutfullah Kakakhel [Tue, 22 Nov 2016 17:52:40 +0000 (17:52 +0000)]
MIPS: xilfpga: Update DT node and specify uart irq
Update the DT node with the UART irq
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14593/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Zubair Lutfullah Kakakhel [Tue, 22 Nov 2016 17:52:39 +0000 (17:52 +0000)]
MIPS: xilfpga: Use Xilinx Interrupt Controller
IRQs from peripherals such as i2c/uart/ethernet come via
the AXI Interrupt controller.
Select it in Kconfig for xilfpga and add the DT node
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14592/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Zubair Lutfullah Kakakhel [Tue, 22 Nov 2016 17:52:38 +0000 (17:52 +0000)]
MIPS: xilfpga: Use irqchip instead of the legacy way
This prepares the code to use the Xilinx Interrupt Controller
driver in drivers/irqchip/irq-xilinx-intc.c
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: Zubair.Kakakhel@imgtec.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14591/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Christophe JAILLET [Sun, 30 Oct 2016 08:25:46 +0000 (09:25 +0100)]
MIPS: ath79: Fix error handling
'clk_register_fixed_rate()' returns an error pointer in case of error, not
NULL. So test it with IS_ERR.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Aban Bedel <albeu@free.fr>
Cc: antonynpavlov@gmail.com
Cc: hackpascal@gmail.com
Cc: amitoj1606@gmail.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: kernel-janitors@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14464/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Fri, 4 Nov 2016 09:28:58 +0000 (09:28 +0000)]
MIPS: SMP-CPS: Don't BUG if a CPU fails to start
If there is no online CPU within a core which could receive the IPI to
start another VP in that core, a BUG() is triggered. Instead print a
warning and gracefully handle the failure such that the system remains
usable, albeit without the requested secondary CPU.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14504/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Fri, 4 Nov 2016 09:28:57 +0000 (09:28 +0000)]
MIPS: SMP: Remove cpu_callin_map
The previous commit made cpu_callin_map redundant, since it is no longer
used to signal secondary CPUs starting, or going offline. Remove it now.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Kevin Cernekee <cernekee@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Yang Shi <yang.shi@windriver.com>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14503/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Fri, 4 Nov 2016 09:28:56 +0000 (09:28 +0000)]
MIPS: SMP: Use a completion event to signal CPU up
If a secondary CPU failed to start, for any reason, the CPU requesting
the secondary to start would get stuck in the loop waiting for the
secondary to be present in the cpu_callin_map.
Rather than that, use a completion event to signal that the secondary
CPU has started and is waiting to synchronise counters.
Since the CPU presence will no longer be marked in cpu_callin_map,
remove the redundant test from arch_cpu_idle_dead().
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Qais Yousef <qsyousef@gmail.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14502/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 11:30:41 +0000 (11:30 +0000)]
MIPS: Netlogic: Exclude netlogic,xlp-pic code from XLR builds
Code in arch/mips/netlogic/common/irq.c which handles the XLP PIC fails
to build in XLR configurations due to cpu_is_xlp9xx not being defined,
leading to the following build failure:
arch/mips/netlogic/common/irq.c: In function ‘xlp_of_pic_init’:
arch/mips/netlogic/common/irq.c:298:2: error: implicit declaration
of function ‘cpu_is_xlp9xx’ [-Werror=implicit-function-declaration]
if (cpu_is_xlp9xx()) {
^
Although the code was conditional upon CONFIG_OF which is indirectly
selected by CONFIG_NLM_XLP_BOARD but not CONFIG_NLM_XLR_BOARD, the
failing XLR with CONFIG_OF configuration can be configured manually or
by randconfig.
Fix the build failure by making the affected XLP PIC code conditional
upon CONFIG_CPU_XLP which is used to guard the inclusion of
asm/netlogic/xlp-hal/xlp.h that provides the required cpu_is_xlp9xx
function.
[ralf@linux-mips.org: Fixed up as per Jayachandran's suggestion.]
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Jayachandran C <jchandra@broadcom.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14524/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 11:28:12 +0000 (11:28 +0000)]
MIPS: Remove unused HIGHMEM_DEBUG macro
We have a HIGHMEM_DEBUG macro defined in asm/highmem.h with a comment
stating that it should be removed for production, and no users... Kill
it.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14523/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Thu, 10 Nov 2016 10:02:13 +0000 (10:02 +0000)]
MIPS: Use Makefile.postlink to insert relocations into vmlinux
When relocatable support for MIPS was merged, there was no support for
an architecture to add a postlink step for vmlinux. This meant that only
invoking a target within the boot directory, such as uImage, caused the
relocations to be inserted into vmlinux. Building just the vmlinux
target would result in a relocatable kernel with no relocation
information present.
Commit
fbe6e37dab97 ("kbuild: add arch specific post-link Makefile")
recified this situation, so MIPS can now define a postlink step to add
relocation information into vmlinux, and remove the additional steps
tacked onto boot targets.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Tested-by: Steven J. Hill <steven.hill@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14554/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 15:07:07 +0000 (15:07 +0000)]
MIPS: Handle microMIPS jumps in the same way as MIPS32/MIPS64 jumps
is_jump_ins() checks for plain jump ("j") instructions since commit
e7438c4b893e ("MIPS: Fix sibling call handling in get_frame_info") but
that commit didn't make the same change to the microMIPS code, leaving
it inconsistent with the MIPS32/MIPS64 code. Handle the microMIPS
encoding of the jump instruction too such that it behaves consistently.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: e7438c4b893e ("MIPS: Fix sibling call handling in get_frame_info")
Cc: Tony Wu <tung7970@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+
Patchwork: https://patchwork.linux-mips.org/patch/14533/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 15:07:06 +0000 (15:07 +0000)]
MIPS: Calculate microMIPS ra properly when unwinding the stack
get_frame_info() calculates the offset of the return address within a
stack frame simply by dividing a the bottom 16 bits of the instruction,
treated as a signed integer, by the size of a long. Whilst this works
for MIPS32 & MIPS64 ISAs where the sw or sd instructions are used, it's
incorrect for microMIPS where encodings differ. The result is that we
typically completely fail to unwind the stack on microMIPS.
Fix this by adjusting is_ra_save_ins() to calculate the return address
offset, and take into account the various different encodings there in
the same place as we consider whether an instruction is storing the
ra/$31 register.
With this we are now able to unwind the stack for kernels targetting the
microMIPS ISA, for example we can produce:
Call Trace:
[<
80109e1f>] show_stack+0x63/0x7c
[<
8011ea17>] __warn+0x9b/0xac
[<
8011ea45>] warn_slowpath_fmt+0x1d/0x20
[<
8013fe53>] register_console+0x43/0x314
[<
8067c58d>] of_setup_earlycon+0x1dd/0x1ec
[<
8067f63f>] early_init_dt_scan_chosen_stdout+0xe7/0xf8
[<
8066c115>] do_early_param+0x75/0xac
[<
801302f9>] parse_args+0x1dd/0x308
[<
8066c459>] parse_early_options+0x25/0x28
[<
8066c48b>] parse_early_param+0x2f/0x38
[<
8066e8cf>] setup_arch+0x113/0x488
[<
8066c4f3>] start_kernel+0x57/0x328
---[ end trace
0000000000000000 ]---
Whereas previously we only produced:
Call Trace:
[<
80109e1f>] show_stack+0x63/0x7c
---[ end trace
0000000000000000 ]---
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0f6 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+
Patchwork: https://patchwork.linux-mips.org/patch/14532/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 15:07:05 +0000 (15:07 +0000)]
MIPS: Fix is_jump_ins() handling of 16b microMIPS instructions
is_jump_ins() checks 16b instruction fields without verifying that the
instruction is indeed 16b, as is done by is_ra_save_ins() &
is_sp_move_ins(). Add the appropriate check.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0f6 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+
Patchwork: https://patchwork.linux-mips.org/patch/14531/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 15:07:04 +0000 (15:07 +0000)]
MIPS: Fix get_frame_info() handling of microMIPS function size
get_frame_info() is meant to iterate over up to the first 128
instructions within a function, but for microMIPS kernels it will not
reach that many instructions unless the function is 512 bytes long since
we calculate the maximum number of instructions to check by dividing the
function length by the 4 byte size of a union mips_instruction. In
microMIPS kernels this won't do since instructions are variable length.
Fix this by instead checking whether the pointer to the current
instruction has reached the end of the function, and use max_insns as a
simple constant to check the number of iterations against.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0f6 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+
Patchwork: https://patchwork.linux-mips.org/patch/14530/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 15:07:03 +0000 (15:07 +0000)]
MIPS: Prevent unaligned accesses during stack unwinding
During stack unwinding we call a number of functions to determine what
type of instruction we're looking at. The union mips_instruction pointer
provided to them may be pointing at a 2 byte, but not 4 byte, aligned
address & we thus cannot directly access the 4 byte wide members of the
union mips_instruction. To avoid this is_ra_save_ins() copies the
required half-words of the microMIPS instruction to a correctly aligned
union mips_instruction on the stack, which it can then access safely.
The is_jump_ins() & is_sp_move_ins() functions do not correctly perform
this temporary copy, and instead attempt to directly dereference 4 byte
fields which may be misaligned and lead to an address exception.
Fix this by copying the instruction halfwords to a temporary union
mips_instruction in get_frame_info() such that we can provide a 4 byte
aligned union mips_instruction to the is_*_ins() functions and they do
not need to deal with misalignment themselves.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0f6 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+
Patchwork: https://patchwork.linux-mips.org/patch/14529/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 15:07:02 +0000 (15:07 +0000)]
MIPS: Clear ISA bit correctly in get_frame_info()
get_frame_info() can be called in microMIPS kernels with the ISA bit
already clear. For example this happens when unwind_stack_by_address()
is called because we begin with a PC that has the ISA bit set & subtract
the (odd) offset from the preceding symbol (which does not have the ISA
bit set). Since get_frame_info() unconditionally subtracts 1 from the PC
in microMIPS kernels it incorrectly misaligns the address it then
attempts to access code at, leading to an address error exception.
Fix this by using msk_isa16_mode() to clear the ISA bit, which allows
get_frame_info() to function regardless of whether it is provided with a
PC that has the ISA bit set or not.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0f6 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # v3.10+
Patchwork: https://patchwork.linux-mips.org/patch/14528/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 17 Oct 2016 14:56:21 +0000 (15:56 +0100)]
MIPS: Use generic asm/unaligned.h
The MIPS-specific asm/unaligned.h provides nothing that the generic
version doesn't - it simply uses MIPS-specific endianness macros in
place of generic ones & lacks support for
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS. Remove it & switch to using the
generic version to remove duplication.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14412/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Mon, 7 Nov 2016 11:52:19 +0000 (11:52 +0000)]
MIPS: Ensure bss section ends on a long-aligned address
When clearing the .bss section in kernel_entry we do so using LONG_S
instructions, and branch whilst the current write address doesn't equal
the end of the .bss section minus the size of a long integer. The .bss
section always begins at a long-aligned address and we always increment
the write pointer by the size of a long integer - we therefore rely upon
the .bss section ending at a long-aligned address. If this is not the
case then the long-aligned write address can never be equal to the
non-long-aligned end address & we will continue to increment past the
end of the .bss section, attempting to zero the rest of memory.
Despite this requirement that .bss end at a long-aligned address we pass
0 as the end alignment requirement to the BSS_SECTION macro and thus
don't guarantee any particular alignment, allowing us to hit the error
condition described above.
Fix this by instead passing 8 bytes as the end alignment argument to
the BSS_SECTION macro, ensuring that the end of the .bss section is
always at least long-aligned.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14526/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Steven J. Hill [Fri, 9 Dec 2016 08:36:22 +0000 (02:36 -0600)]
MIPS: Relocatable: Provide plat_post_relocation hook
This hook provides the platform the chance to perform any required
setup before the boot processor switches to the relocated kernel.
The relocated kernel has been copied and fixed up ready for execution
at this point. Secondary CPUs may wish to switch to it early. There
is also the opportunity for the platform to abort jumping to the
relocated kernel if there is anything wrong with the chosen offset.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Signed-off-by: Steven J. Hill <Steven.Hill@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14651/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Steven J. Hill [Tue, 13 Dec 2016 20:25:37 +0000 (14:25 -0600)]
MIPS: Octeon: Enable KASLR
This patch enables KASLR for Octeon systems. The SMP startup code is
such that the secondaries monitor the volatile variable
'octeon_processor_relocated_kernel_entry' for any non-zero value.
The 'plat_post_relocation hook' is used to set that value to the
kernel entry point of the relocated kernel. The secondary CPUs will
then jusmp to the new kernel, perform their initialization again
and begin waiting for the boot CPU to start them via the relocated
loop 'octeon_spin_wait_boot'. Inspired by Steven's code from Cavium.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14669/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Steven J. Hill [Tue, 22 Nov 2016 19:44:20 +0000 (13:44 -0600)]
MIPS: Octeon: Add plat_get_fdt() function for Cavium platforms.
Add in the function needed for Octeon platforms to support KASLR.
Signed-off-by: Steven J. Hill <Steven.Hill@cavium.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Steven J. Hill [Tue, 22 Nov 2016 19:44:10 +0000 (13:44 -0600)]
MIPS: Octeon: Add fw_init_cmdline() for Cavium platforms.
Add platform-specific kernel command line processing for Octeon.
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14599/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Mon, 19 Dec 2016 14:21:00 +0000 (14:21 +0000)]
MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK
Since do_IRQ is now invoked on a separate IRQ stack, we select
HAVE_IRQ_EXIT_ON_IRQ_STACK so that softirq's may be invoked directly
from irq_exit(), rather than requiring do_softirq_own_stack.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14744/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Mon, 19 Dec 2016 14:20:59 +0000 (14:20 +0000)]
MIPS: Switch to the irq_stack in interrupts
When enterring interrupt context via handle_int or except_vec_vi, switch
to the irq_stack of the current CPU if it is not already in use.
The current stack pointer is masked with the thread size and compared to
the base or the irq stack. If it does not match then the stack pointer
is set to the top of that stack, otherwise this is a nested irq being
handled on the irq stack so the stack pointer should be left as it was.
The in-use stack pointer is placed in the callee saved register s1. It
will be saved to the stack when plat_irq_dispatch is invoked and can be
restored once control returns here.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14743/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Mon, 19 Dec 2016 14:20:58 +0000 (14:20 +0000)]
MIPS: Only change $28 to thread_info if coming from user mode
The SAVE_SOME macro is used to save the execution context on all
exceptions.
If an exception occurs while executing user code, the stack is switched
to the kernel's stack for the current task, and register $28 is switched
to point to the current_thread_info, which is at the bottom of the stack
region.
If the exception occurs while executing kernel code, the stack is left,
and this change ensures that register $28 is not updated. This is the
correct behaviour when the kernel can be executing on the separate irq
stack, because the thread_info will not be at the base of it.
With this change, register $28 is only switched to it's kernel
conventional usage of the currrent thread info pointer at the point at
which execution enters kernel space. Doing it on every exception was
redundant, but OK without an IRQ stack, but will be erroneous once that
is introduced.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14742/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Mon, 19 Dec 2016 14:20:57 +0000 (14:20 +0000)]
MIPS: Stack unwinding while on IRQ stack
Within unwind stack, check if the stack pointer being unwound is within
the CPU's irq_stack and if so use that page rather than the task's stack
page.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Maciej W. Rozycki <macro@imgtec.com>
Cc: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14741/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Matt Redfearn [Mon, 19 Dec 2016 14:20:56 +0000 (14:20 +0000)]
MIPS: Introduce irq_stack
Allocate a per-cpu irq stack for use within interrupt handlers.
Also add a utility function on_irq_stack to determine if a given stack
pointer is within the irq stack for that cpu.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14740/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Thu, 15 Dec 2016 11:39:22 +0000 (12:39 +0100)]
MIPS: IP22: Fix build error due to binutils 2.25 uselessnes.
Fix the following build error with binutils 2.25.
CC arch/mips/mm/sc-ip22.o
{standard input}: Assembler messages:
{standard input}:132: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:159: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:200: Error: number (0x9000000080000000) larger than 32 bits
scripts/Makefile.build:293: recipe for target 'arch/mips/mm/sc-ip22.o' failed
make[1]: *** [arch/mips/mm/sc-ip22.o] Error 1
MIPS has used .set mips3 to temporarily switch the assembler to 64 bit
mode in 64 bit kernels virtually forever. Binutils 2.25 broke this
behavious partially by happily accepting 64 bit instructions in .set mips3
mode but puking on 64 bit constants when generating 32 bit ELF. Binutils
2.26 restored the old behaviour again.
Fix build with binutils 2.25 by open coding the offending
dli $1, 0x9000000080000000
as
li $1, 0x9000
dsll $1, $1, 48
which is ugly be the only thing that will build on all binutils vintages.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: stable@vger.kernel.org
Ralf Baechle [Thu, 15 Dec 2016 11:27:21 +0000 (12:27 +0100)]
MIPS: IP22: Reformat inline assembler code to modern standards.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Bolle [Fri, 25 Nov 2016 12:36:18 +0000 (13:36 +0100)]
MIPS: Zboot: Don't use $(LINUXINCLUDE) twice
The make variables KBUILD_CFLAGS and KBUILD_AFLAGS both contain
$(LINUXINCLUDE). But the build already picks up $(LINUXINCLUDE) from
scripts/Makefile.lib. The net effect is that the (long) list of include
directories is used twice.
This is harmless but pointless. So stop using $(LINUXINCLUDE) twice.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14622/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Geert Uytterhoeven [Wed, 7 Dec 2016 09:05:15 +0000 (10:05 +0100)]
MIPS: TXx9: Modernize printing of kernel messages
- Convert from printk() to pr_*(),
- Add missing continuations, to fix user-visible breakage,
- Drop superfluous casts (u64 has been unsigned long long on all
architectures for many years).
On rbtx4927, this restores the kernel output like:
-TX4927 SDRAMC --
- CR0:
0000007e00000544
- TR:
32800030e
+TX4927 SDRAMC -- CR0:
0000007e00000544 TR:
32800030e
and:
-PCIC -- PCICLK:
-Internal(33.3MHz)
-
+PCIC -- PCICLK:Internal(33.3MHz)
Fixes: 4bcc595ccd80decb ("printk: reinstate KERN_CONT for printing continuation lines")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14646/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Aaro Koskinen [Sun, 27 Nov 2016 00:30:51 +0000 (02:30 +0200)]
MIPS: Octeon: Kill cvmx_helper_link_autoconf()
Kill cvmx_helper_link_autoconf(). Nobody uses this function.
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14626/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Julia Lawall [Sat, 29 Oct 2016 19:36:59 +0000 (21:36 +0200)]
MIPS: TXx9: 7segled: use permission-specific DEVICE_ATTR variants
Use DEVICE_ATTR_WO for write only attributes. This simplifies the
source code, improves readbility, and reduces the chance of
inconsistencies.
The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@wo@
declarer name DEVICE_ATTR;
identifier x,x_store;
@@
DEVICE_ATTR(x, \(0200\|S_IWUSR\), NULL, x_store);
@script:ocaml@
x << wo.x;
x_store << wo.x_store;
@@
if not (x^"_store" = x_store) then Coccilib.include_match false
@@
declarer name DEVICE_ATTR_WO;
identifier wo.x,wo.x_store;
@@
- DEVICE_ATTR(x, \(0200\|S_IWUSR\), NULL, x_store);
+ DEVICE_ATTR_WO(x);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: kernel-janitors@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14463/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Maarten ter Huurne [Mon, 31 Oct 2016 14:19:55 +0000 (15:19 +0100)]
MIPS: zboot: Add "uzImage.bin" target
uzImage.bin is vmlinuz.bin wrapped in a legacy U-Boot image. Since
the extraction code is inside the image, it does not depend on the
boot loader to extract the kernel.
Signed-off-by: Maarten ter Huurne <maarten@treewalker.org>
Cc: Alban Bedel <albeu@free.fr>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14473/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Linus Torvalds [Sun, 1 Jan 2017 22:31:53 +0000 (14:31 -0800)]
Linux 4.10-rc2
Linus Torvalds [Sun, 1 Jan 2017 20:27:05 +0000 (12:27 -0800)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull DAX updates from Dan Williams:
"The completion of Jan's DAX work for 4.10.
As I mentioned in the libnvdimm-for-4.10 pull request, these are some
final fixes for the DAX dirty-cacheline-tracking invalidation work
that was merged through the -mm, ext4, and xfs trees in -rc1. These
patches were prepared prior to the merge window, but we waited for
4.10-rc1 to have a stable merge base after all the prerequisites were
merged.
Quoting Jan on the overall changes in these patches:
"So I'd like all these 6 patches to go for rc2. The first three
patches fix invalidation of exceptional DAX entries (a bug which
is there for a long time) - without these patches data loss can
occur on power failure even though user called fsync(2). The other
three patches change locking of DAX faults so that ->iomap_begin()
is called in a more relaxed locking context and we are safe to
start a transaction there for ext4"
These have received a build success notification from the kbuild
robot, and pass the latest libnvdimm unit tests. There have not been
any -next releases since -rc1, so they have not appeared there"
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
ext4: Simplify DAX fault path
dax: Call ->iomap_begin without entry lock during dax fault
dax: Finish fault completely when loading holes
dax: Avoid page invalidation races and unnecessary radix tree traversals
mm: Invalidate DAX radix tree entries only if appropriate
ext2: Return BH_New buffers for zeroed blocks
Linus Torvalds [Fri, 30 Dec 2016 17:32:26 +0000 (09:32 -0800)]
Merge tag 'docs-4.10-rc1-fix' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"Two small fixes:
- A merge error on my part broke the DocBook build. I've
requisitioned one of tglx's frozen sharks for appropriate
disciplinary action and resolved to be more careful about testing
the DocBook stuff as long as it's still around.
- Fix an error in unaligned-memory-access.txt"
* tag 'docs-4.10-rc1-fix' of git://git.lwn.net/linux:
Documentation/unaligned-memory-access.txt: fix incorrect comparison operator
docs: Fix build failure
Linus Torvalds [Fri, 30 Dec 2016 17:29:50 +0000 (09:29 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a boot failure on some platforms when crypto self test is
enabled along with the new acomp interface"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: testmgr - Use heap buffer for acomp test input
Olof Johansson [Thu, 29 Dec 2016 22:16:07 +0000 (14:16 -0800)]
mm/filemap: fix parameters to test_bit()
mm/filemap.c: In function 'clear_bit_unlock_is_negative_byte':
mm/filemap.c:933:9: error: too few arguments to function 'test_bit'
return test_bit(PG_waiters);
^~~~~~~~
Fixes: b91e1302ad9b ('mm: optimize PageWaiters bit use for unlock_page()')
Signed-off-by: Olof Johansson <olof@lixom.net>
Brown-paper-bag-by: Linus Torvalds <dummy@duh.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 27 Dec 2016 19:40:38 +0000 (11:40 -0800)]
mm: optimize PageWaiters bit use for unlock_page()
In commit
62906027091f ("mm: add PageWaiters indicating tasks are
waiting for a page bit") Nick Piggin made our page locking no longer
unconditionally touch the hashed page waitqueue, which not only helps
performance in general, but is particularly helpful on NUMA machines
where the hashed wait queues can bounce around a lot.
However, the "clear lock bit atomically and then test the waiters bit"
sequence turns out to be much more expensive than it needs to be,
because you get a nasty stall when trying to access the same word that
just got updated atomically.
On architectures where locking is done with LL/SC, this would be trivial
to fix with a new primitive that clears one bit and tests another
atomically, but that ends up not working on x86, where the only atomic
operations that return the result end up being cmpxchg and xadd. The
atomic bit operations return the old value of the same bit we changed,
not the value of an unrelated bit.
On x86, we could put the lock bit in the high bit of the byte, and use
"xadd" with that bit (where the overflow ends up not touching other
bits), and look at the other bits of the result. However, an even
simpler model is to just use a regular atomic "and" to clear the lock
bit, and then the sign bit in eflags will indicate the resulting state
of the unrelated bit #7.
So by moving the PageWaiters bit up to bit #7, we can atomically clear
the lock bit and test the waiters bit on x86 too. And architectures
with LL/SC (which is all the usual RISC suspects), the particular bit
doesn't matter, so they are fine with this approach too.
This avoids the extra access to the same atomic word, and thus avoids
the costly stall at page unlock time.
The only downside is that the interface ends up being a bit odd and
specialized: clear a bit in a byte, and test the sign bit. Nick doesn't
love the resulting name of the new primitive, but I'd rather make the
name be descriptive and very clear about the limitation imposed by
trying to work across all relevant architectures than make it be some
generic thing that doesn't make the odd semantics explicit.
So this introduces the new architecture primitive
clear_bit_unlock_is_negative_byte();
and adds the trivial implementation for x86. We have a generic
non-optimized fallback (that just does a "clear_bit()"+"test_bit(7)"
combination) which can be overridden by any architecture that can do
better. According to Nick, Power has the same hickup x86 has, for
example, but some other architectures may not even care.
All these optimizations mean that my page locking stress-test (which is
just executing a lot of small short-lived shell scripts: "make test" in
the git source tree) no longer makes our page locking look horribly bad.
Before all these optimizations, just the unlock_page() costs were just
over 3% of all CPU overhead on "make test". After this, it's down to
0.66%, so just a quarter of the cost it used to be.
(The difference on NUMA is bigger, but there this micro-optimization is
likely less noticeable, since the big issue on NUMA was not the accesses
to 'struct page', but the waitqueue accesses that were already removed
by Nick's earlier commit).
Acked-by: Nick Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 28 Dec 2016 01:51:36 +0000 (17:51 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a hash corruption bug in the marvell driver"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: marvell - Copy IVDIG before launching partial DMA ahash requests
Linus Torvalds [Wed, 28 Dec 2016 00:04:37 +0000 (16:04 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Various ipvlan fixes from Eric Dumazet and Mahesh Bandewar.
The most important is to not assume the packet is RX just because
the destination address matches that of the device. Such an
assumption causes problems when an interface is put into loopback
mode.
2) If we retry when creating a new tc entry (because we dropped the
RTNL mutex in order to load a module, for example) we end up with
-EAGAIN and then loop trying to replay the request. But we didn't
reset some state when looping back to the top like this, and if
another thread meanwhile inserted the same tc entry we were trying
to, we re-link it creating an enless loop in the tc chain. Fix from
Daniel Borkmann.
3) There are two different WRITE bits in the MDIO address register for
the stmmac chip, depending upon the chip variant. Due to a bug we
could set them both, fix from Hock Leong Kweh.
4) Fix mlx4 bug in XDP_TX handling, from Tariq Toukan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: stmmac: fix incorrect bit set in gmac4 mdio addr register
r8169: add support for RTL8168 series add-on card.
net: xdp: remove unused bfp_warn_invalid_xdp_buffer()
openvswitch: upcall: Fix vlan handling.
ipv4: Namespaceify tcp_tw_reuse knob
net: korina: Fix NAPI versus resources freeing
net, sched: fix soft lockup in tc_classify
net/mlx4_en: Fix user prio field in XDP forward
tipc: don't send FIN message from connectionless socket
ipvlan: fix multicast processing
ipvlan: fix various issues in ipvlan_process_multicast()
Cihangir Akturk [Sat, 17 Dec 2016 17:42:17 +0000 (19:42 +0200)]
Documentation/unaligned-memory-access.txt: fix incorrect comparison operator
In the actual implementation ether_addr_equal function tests for equality to 0
when returning. It seems in commit 0d74c4 it is somehow overlooked to change
this operator to reflect the actual function.
Signed-off-by: Cihangir Akturk <cakturk@gmail.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
John Brooks [Fri, 23 Dec 2016 00:53:10 +0000 (00:53 +0000)]
docs: Fix build failure
The 80211.tmpl DocBook file was removed in commit
819bf593767c ("docs-rst:
sphinxify 802.11 documentation"), but the 80211.xml target was re-added to
the Makefile by commit
7ddedebb03b7 ("ALSA: doc: ReSTize
writing-an-alsa-driver document"), leading to a failure when building the
documentation:
*** No rule to make target 'Documentation/DocBook/80211.xml', needed by
'Documentation/DocBook/80211.aux.xml'.
cc: stable@vger.kernel.org
Signed-off-by: John Brooks <john@fastquake.com>
Mea-culpa-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Jonathan Corbet [Tue, 27 Dec 2016 19:53:44 +0000 (12:53 -0700)]
Merge tag 'v4.10-rc1' into docs-next
Linux 4.10-rc1
Kweh, Hock Leong [Tue, 27 Dec 2016 20:07:41 +0000 (04:07 +0800)]
net: stmmac: fix incorrect bit set in gmac4 mdio addr register
Fixing the gmac4 mdio write access to use MII_GMAC4_WRITE only instead of
OR together with MII_WRITE.
Signed-off-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chun-Hao Lin [Tue, 27 Dec 2016 08:29:43 +0000 (16:29 +0800)]
r8169: add support for RTL8168 series add-on card.
This chip is the same as RTL8168, but its device id is 0x8161.
Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 27 Dec 2016 02:49:54 +0000 (10:49 +0800)]
net: xdp: remove unused bfp_warn_invalid_xdp_buffer()
After commit
73b62bd085f4737679ea9afc7867fa5f99ba7d1b ("virtio-net:
remove the warning before XDP linearizing"), there's no users for
bpf_warn_invalid_xdp_buffer(), so remove it. This is a revert for
commit
f23bc46c30ca5ef58b8549434899fcbac41b2cfc.
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
pravin shelar [Mon, 26 Dec 2016 16:31:27 +0000 (08:31 -0800)]
openvswitch: upcall: Fix vlan handling.
Networking stack accelerate vlan tag handling by
keeping topmost vlan header in skb. This works as
long as packet remains in OVS datapath. But during
OVS upcall vlan header is pushed on to the packet.
When such packet is sent back to OVS datapath, core
networking stack might not handle it correctly. Following
patch avoids this issue by accelerating the vlan tag
during flow key extract. This simplifies datapath by
bringing uniform packet processing for packets from
all code paths.
Fixes: 5108bbaddc ("openvswitch: add processing of L3 packets").
CC: Jarno Rajahalme <jarno@ovn.org>
CC: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haishuang Yan [Sun, 25 Dec 2016 06:33:16 +0000 (14:33 +0800)]
ipv4: Namespaceify tcp_tw_reuse knob
Different namespaces might have different requirements to reuse
TIME-WAIT sockets for new connections. This might be required in
cases where different namespace applications are in place which
require TIME_WAIT socket connections to be reduced independently
of the host.
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Laura Abbott [Wed, 21 Dec 2016 20:32:54 +0000 (12:32 -0800)]
crypto: testmgr - Use heap buffer for acomp test input
Christopher Covington reported a crash on aarch64 on recent Fedora
kernels:
kernel BUG at ./include/linux/scatterlist.h:140!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 2 PID: 752 Comm: cryptomgr_test Not tainted
4.9.0-11815-ge93b1cc #162
Hardware name: linux,dummy-virt (DT)
task:
ffff80007c650080 task.stack:
ffff800008910000
PC is at sg_init_one+0xa0/0xb8
LR is at sg_init_one+0x24/0xb8
...
[<
ffff000008398db8>] sg_init_one+0xa0/0xb8
[<
ffff000008350a44>] test_acomp+0x10c/0x438
[<
ffff000008350e20>] alg_test_comp+0xb0/0x118
[<
ffff00000834f28c>] alg_test+0x17c/0x2f0
[<
ffff00000834c6a4>] cryptomgr_test+0x44/0x50
[<
ffff0000080dac70>] kthread+0xf8/0x128
[<
ffff000008082ec0>] ret_from_fork+0x10/0x50
The test vectors used for input are part of the kernel image. These
inputs are passed as a buffer to sg_init_one which eventually blows up
with BUG_ON(!virt_addr_valid(buf)). On arm64, virt_addr_valid returns
false for the kernel image since virt_to_page will not return the
correct page. Fix this by copying the input vectors to heap buffer
before setting up the scatterlist.
Reported-by: Christopher Covington <cov@codeaurora.org>
Fixes: d7db7a882deb ("crypto: acomp - update testmgr with support for acomp")
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Jan Kara [Fri, 21 Oct 2016 09:33:49 +0000 (11:33 +0200)]
ext4: Simplify DAX fault path
Now that dax_iomap_fault() calls ->iomap_begin() without entry lock, we
can use transaction starting in ext4_iomap_begin() and thus simplify
ext4_dax_fault(). It also provides us proper retries in case of ENOSPC.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jan Kara [Wed, 19 Oct 2016 12:34:31 +0000 (14:34 +0200)]
dax: Call ->iomap_begin without entry lock during dax fault
Currently ->iomap_begin() handler is called with entry lock held. If the
filesystem held any locks between ->iomap_begin() and ->iomap_end()
(such as ext4 which will want to hold transaction open), this would cause
lock inversion with the iomap_apply() from standard IO path which first
calls ->iomap_begin() and only then calls ->actor() callback which grabs
entry locks for DAX (if it faults when copying from/to user provided
buffers).
Fix the problem by nesting grabbing of entry lock inside ->iomap_begin()
- ->iomap_end() pair.
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jan Kara [Wed, 19 Oct 2016 12:48:38 +0000 (14:48 +0200)]
dax: Finish fault completely when loading holes
The only case when we do not finish the page fault completely is when we
are loading hole pages into a radix tree. Avoid this special case and
finish the fault in that case as well inside the DAX fault handler. It
will allow us for easier iomap handling.
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jan Kara [Wed, 10 Aug 2016 15:10:28 +0000 (17:10 +0200)]
dax: Avoid page invalidation races and unnecessary radix tree traversals
Currently dax_iomap_rw() takes care of invalidating page tables and
evicting hole pages from the radix tree when write(2) to the file
happens. This invalidation is only necessary when there is some block
allocation resulting from write(2). Furthermore in current place the
invalidation is racy wrt page fault instantiating a hole page just after
we have invalidated it.
So perform the page invalidation inside dax_iomap_actor() where we can
do it only when really necessary and after blocks have been allocated so
nobody will be instantiating new hole pages anymore.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jan Kara [Wed, 10 Aug 2016 15:22:44 +0000 (17:22 +0200)]
mm: Invalidate DAX radix tree entries only if appropriate
Currently invalidate_inode_pages2_range() and invalidate_mapping_pages()
just delete all exceptional radix tree entries they find. For DAX this
is not desirable as we track cache dirtiness in these entries and when
they are evicted, we may not flush caches although it is necessary. This
can for example manifest when we write to the same block both via mmap
and via write(2) (to different offsets) and fsync(2) then does not
properly flush CPU caches when modification via write(2) was the last
one.
Create appropriate DAX functions to handle invalidation of DAX entries
for invalidate_inode_pages2_range() and invalidate_mapping_pages() and
wire them up into the corresponding mm functions.
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Jan Kara [Wed, 10 Aug 2016 14:42:53 +0000 (16:42 +0200)]
ext2: Return BH_New buffers for zeroed blocks
So far we did not return BH_New buffers from ext2_get_blocks() when we
allocated and zeroed-out a block for DAX inode to avoid racy zeroing in
DAX code. This zeroing is gone these days so we can remove the
workaround.
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Thomas Gleixner [Mon, 26 Dec 2016 21:58:20 +0000 (22:58 +0100)]
x86/mce/AMD: Make the init code more robust
If mce_device_init() fails then the mce device pointer is NULL and the
AMD mce code happily dereferences it.
Add a sanity check.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Mon, 26 Dec 2016 21:58:19 +0000 (22:58 +0100)]
smp/hotplug: Undo tglxs brainfart
The attempt to prevent overwriting an active state resulted in a
disaster which effectively disables all dynamically allocated hotplug
states.
Cleanup the mess.
Fixes: dc280d936239 ("cpu/hotplug: Prevent overwriting of callbacks")
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 26 Dec 2016 09:10:19 +0000 (04:10 -0500)]
arm64: don't pull uaccess.h into *.S
Split asm-only parts of arm64 uaccess.h into a new header and use that
from *.S.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Florian Fainelli [Sat, 24 Dec 2016 03:56:56 +0000 (19:56 -0800)]
net: korina: Fix NAPI versus resources freeing
Commit
beb0babfb77e ("korina: disable napi on close and restart")
introduced calls to napi_disable() that were missing before,
unfortunately this leaves a small window during which NAPI has a chance
to run, yet we just freed resources since korina_free_ring() has been
called:
Fix this by disabling NAPI first then freeing resource, and make sure
that we also cancel the restart task before doing the resource freeing.
Fixes: beb0babfb77e ("korina: disable napi on close and restart")
Reported-by: Alexandros C. Couloumbis <alex@ozo.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 21 Dec 2016 17:04:11 +0000 (18:04 +0100)]
net, sched: fix soft lockup in tc_classify
Shahar reported a soft lockup in tc_classify(), where we run into an
endless loop when walking the classifier chain due to tp->next == tp
which is a state we should never run into. The issue only seems to
trigger under load in the tc control path.
What happens is that in tc_ctl_tfilter(), thread A allocates a new
tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
with it. In that classifier callback we had to unlock/lock the rtnl
mutex and returned with -EAGAIN. One reason why we need to drop there
is, for example, that we need to request an action module to be loaded.
This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning
after we loaded and found the requested action, we need to redo the
whole request so we don't race against others. While we had to unlock
rtnl in that time, thread B's request was processed next on that CPU.
Thread B added a new tp instance successfully to the classifier chain.
When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN
and destroying its tp instance which never got linked, we goto replay
and redo A's request.
This time when walking the classifier chain in tc_ctl_tfilter() for
checking for existing tp instances we had a priority match and found
the tp instance that was created and linked by thread B. Now calling
again into tp->ops->change() with that tp was successful and returned
without error.
tp_created was never cleared in the second round, thus kernel thinks
that we need to link it into the classifier chain (once again). tp and
*back point to the same object due to the match we had earlier on. Thus
for thread B's already public tp, we reset tp->next to tp itself and
link it into the chain, which eventually causes the mentioned endless
loop in tc_classify() once a packet hits the data path.
Fix is to clear tp_created at the beginning of each request, also when
we replay it. On the paths that can cause -EAGAIN we already destroy
the original tp instance we had and on replay we really need to start
from scratch. It seems that this issue was first introduced in commit
12186be7d2e1 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
and avoid kernel panic when we use cls_cgroup").
Fixes: 12186be7d2e1 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
Reported-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 26 Dec 2016 00:13:08 +0000 (16:13 -0800)]
Linux 4.10-rc1
Larry Finger [Fri, 23 Dec 2016 03:06:53 +0000 (21:06 -0600)]
powerpc: Fix build warning on 32-bit PPC
I am getting the following warning when I build kernel 4.9-git on my
PowerBook G4 with a 32-bit PPC processor:
AS arch/powerpc/kernel/misc_32.o
arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]
This problem is evident after commit
989cea5c14be ("kbuild: prevent
lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
error that has been in the code since 2005 when this source file was
created. That was with commit
9994a33865f4 ("powerpc: Introduce
entry_{32,64}.S, misc_{32,64}.S, systbl.S").
The offending line does not make a lot of sense. This error does not
seem to cause any errors in the executable, thus I am not recommending
that it be applied to any stable versions.
Thanks to Nicholas Piggin for suggesting this solution.
Fixes: 9994a33865f4 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 25 Dec 2016 22:56:58 +0000 (14:56 -0800)]
avoid spurious "may be used uninitialized" warning
The timer type simplifications caused a new gcc warning:
drivers/base/power/domain.c: In function ‘genpd_runtime_suspend’:
drivers/base/power/domain.c:562:14: warning: ‘time_start’ may be used uninitialized in this function [-Wmaybe-uninitialized]
elapsed_ns = ktime_to_ns(ktime_sub(ktime_get(), time_start));
despite the actual use of "time_start" not having changed in any way.
It appears that simply changing the type of ktime_t from a union to a
plain scalar type made gcc check the use.
The variable wasn't actually used uninitialized, but gcc apparently
failed to notice that the conditional around the use was exactly the
same as the conditional around the initialization of that variable.
Add an unnecessary initialization just to shut up the compiler.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 25 Dec 2016 22:30:04 +0000 (14:30 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer type cleanups from Thomas Gleixner:
"This series does a tree wide cleanup of types related to
timers/timekeeping.
- Get rid of cycles_t and use a plain u64. The type is not really
helpful and caused more confusion than clarity
- Get rid of the ktime union. The union has become useless as we use
the scalar nanoseconds storage unconditionally now. The 32bit
timespec alike storage got removed due to the Y2038 limitations
some time ago.
That leaves the odd union access around for no reason. Clean it up.
Both changes have been done with coccinelle and a small amount of
manual mopping up"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
ktime: Get rid of ktime_equal()
ktime: Cleanup ktime_set() usage
ktime: Get rid of the union
clocksource: Use a plain u64 instead of cycle_t
Linus Torvalds [Sun, 25 Dec 2016 22:05:56 +0000 (14:05 -0800)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull SMP hotplug notifier removal from Thomas Gleixner:
"This is the final cleanup of the hotplug notifier infrastructure. The
series has been reintgrated in the last two days because there came a
new driver using the old infrastructure via the SCSI tree.
Summary:
- convert the last leftover drivers utilizing notifiers
- fixup for a completely broken hotplug user
- prevent setup of already used states
- removal of the notifiers
- treewide cleanup of hotplug state names
- consolidation of state space
There is a sphinx based documentation pending, but that needs review
from the documentation folks"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/armada-xp: Consolidate hotplug state space
irqchip/gic: Consolidate hotplug state space
coresight/etm3/4x: Consolidate hotplug state space
cpu/hotplug: Cleanup state names
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
staging/lustre/libcfs: Convert to hotplug state machine
scsi/bnx2i: Convert to hotplug state machine
scsi/bnx2fc: Convert to hotplug state machine
cpu/hotplug: Prevent overwriting of callbacks
x86/msr: Remove bogus cleanup from the error path
bus: arm-ccn: Prevent hotplug callback leak
perf/x86/intel/cstate: Prevent hotplug callback leak
ARM/imx/mmcd: Fix broken cpu hotplug handling
scsi: qedi: Convert to hotplug state machine
Linus Torvalds [Sun, 25 Dec 2016 22:01:28 +0000 (14:01 -0800)]
Merge branch 'turbostat' of git://git./linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: remove obsolete -M, -m, -C, -c options
tools/power turbostat: Make extensible via the --add parameter
tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
tools/power turbostat: line up headers when -M is used
tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
tools/power turbostat: Support Knights Mill (KNM)
tools/power turbostat: Display HWP OOB status
tools/power turbostat: fix Denverton BCLK
tools/power turbostat: use intel-family.h model strings
tools/power/turbostat: Add Denverton RAPL support
tools/power/turbostat: Add Denverton support
tools/power/turbostat: split core MSR support into status + limit
tools/power turbostat: fix error case overflow read of slm_freq_table[]
tools/power turbostat: Allocate correct amount of fd and irq entries
tools/power turbostat: switch to tab delimited output
tools/power turbostat: Gracefully handle ACPI S3
tools/power turbostat: tidy up output on Joule counter overflow
Nicholas Piggin [Sun, 25 Dec 2016 03:00:30 +0000 (13:00 +1000)]
mm: add PageWaiters indicating tasks are waiting for a page bit
Add a new page flag, PageWaiters, to indicate the page waitqueue has
tasks waiting. This can be tested rather than testing waitqueue_active
which requires another cacheline load.
This bit is always set when the page has tasks on page_waitqueue(page),
and is set and cleared under the waitqueue lock. It may be set when
there are no tasks on the waitqueue, which will cause a harmless extra
wakeup check that will clears the bit.
The generic bit-waitqueue infrastructure is no longer used for pages.
Instead, waitqueues are used directly with a custom key type. The
generic code was not flexible enough to have PageWaiters manipulation
under the waitqueue lock (which simplifies concurrency).
This improves the performance of page lock intensive microbenchmarks by
2-3%.
Putting two bits in the same word opens the opportunity to remove the
memory barrier between clearing the lock bit and testing the waiters
bit, after some work on the arch primitives (e.g., ensuring memory
operand widths match and cover both bits).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicholas Piggin [Sun, 25 Dec 2016 03:00:29 +0000 (13:00 +1000)]
mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked
A page is not added to the swap cache without being swap backed,
so PageSwapBacked mappings can use PG_owner_priv_1 for PageSwapCache.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Thomas Gleixner [Sun, 25 Dec 2016 11:43:07 +0000 (12:43 +0100)]
ktime: Get rid of ktime_equal()
No point in going through loops and hoops instead of just comparing the
values.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Thomas Gleixner [Sun, 25 Dec 2016 11:30:41 +0000 (12:30 +0100)]
ktime: Cleanup ktime_set() usage
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Thomas Gleixner [Sun, 25 Dec 2016 10:38:40 +0000 (11:38 +0100)]
ktime: Get rid of the union
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
variant for 32bit machines. The Y2038 cleanup removed the timespec variant
and switched everything to scalar nanoseconds. The union remained, but
become completely pointless.
Get rid of the union and just keep ktime_t as simple typedef of type s64.
The conversion was done with coccinelle and some manual mopping up.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Thomas Gleixner [Wed, 21 Dec 2016 19:32:01 +0000 (20:32 +0100)]
clocksource: Use a plain u64 instead of cycle_t
There is no point in having an extra type for extra confusion. u64 is
unambiguous.
Conversion was done with the following coccinelle script:
@rem@
@@
-typedef u64 cycle_t;
@fix@
typedef cycle_t;
@@
-cycle_t
+u64
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:57 +0000 (20:19 +0100)]
irqchip/armada-xp: Consolidate hotplug state space
The mpic is either the main interrupt controller or is cascaded behind a
GIC. The mpic is single instance and the modes are mutually exclusive, so
there is no reason to have seperate cpu hotplug states.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/20161221192112.333161745@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:56 +0000 (20:19 +0100)]
irqchip/gic: Consolidate hotplug state space
Even if both drivers are compiled in only one instance can run on a given
system depending on the available GIC version.
So having seperate hotplug states for them is pointless.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.252416267@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:55 +0000 (20:19 +0100)]
coresight/etm3/4x: Consolidate hotplug state space
Even if both drivers are compiled in only one instance can run on a given
system depending on the available tracer cell.
So having seperate hotplug states for them is pointless.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lkml.kernel.org/r/20161221192112.162765484@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:54 +0000 (20:19 +0100)]
cpu/hotplug: Cleanup state names
When the state names got added a script was used to add the extra argument
to the calls. The script basically converted the state constant to a
string, but the cleanup to convert these strings into meaningful ones did
not happen.
Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
are used in all the other places already.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:53 +0000 (20:19 +0100)]
cpu/hotplug: Remove obsolete cpu hotplug register/unregister functions
hotcpu_notifier(), cpu_notifier(), __hotcpu_notifier(), __cpu_notifier(),
register_hotcpu_notifier(), register_cpu_notifier(),
__register_hotcpu_notifier(), __register_cpu_notifier(),
unregister_hotcpu_notifier(), unregister_cpu_notifier(),
__unregister_hotcpu_notifier(), __unregister_cpu_notifier()
are unused now. Remove them and all related code.
Remove also the now pointless cpu notifier error injection mechanism. The
states can be executed step by step and error rollback is the same as cpu
down, so any state transition can be tested w/o requiring the notifier
error injection.
Some CPU hotplug states are kept as they are (ab)used for hotplug state
tracking.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161221192112.005642358@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Anna-Maria Gleixner [Wed, 21 Dec 2016 19:19:52 +0000 (20:19 +0100)]
staging/lustre/libcfs: Convert to hotplug state machine
Install the callbacks via the state machine. No functional change.
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: devel@driverdev.osuosl.org
Cc: Andreas Dilger <andreas.dilger@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: rt@linutronix.de
Cc: lustre-devel@lists.lustre.org
Link: http://lkml.kernel.org/r/20161202110027.htzzeervzkoc4muv@linutronix.de
Link: http://lkml.kernel.org/r/20161221192111.922872524@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sebastian Andrzej Siewior [Wed, 21 Dec 2016 19:19:51 +0000 (20:19 +0100)]
scsi/bnx2i: Convert to hotplug state machine
Install the callbacks via the state machine. No functional change.
This is the minimal fixup so we can remove the hotplug notifier mess
completely.
The real rework of this driver to use work queues is still stuck in
review/testing on the SCSI mailing list.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chad Dupuis <chad.dupuis@qlogic.com>
Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20161221192111.836895753@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Sebastian Andrzej Siewior [Wed, 21 Dec 2016 19:19:50 +0000 (20:19 +0100)]
scsi/bnx2fc: Convert to hotplug state machine
Install the callbacks via the state machine. No functional change.
This is the minimal fixup so we can remove the hotplug notifier mess
completely.
The real rework of this driver to use work queues is still stuck in
review/testing on the SCSI mailing list.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Chad Dupuis <chad.dupuis@qlogic.com>
Cc: QLogic-Storage-Upstream@qlogic.com
Cc: Johannes Thumshirn <jth@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: http://lkml.kernel.org/r/20161221192111.757309869@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Wed, 21 Dec 2016 19:19:49 +0000 (20:19 +0100)]
cpu/hotplug: Prevent overwriting of callbacks
Developers manage to overwrite states blindly without thought. That's fatal
and hard to debug. Add sanity checks to make it fail.
This requries to restructure the code so that the dynamic state allocation
happens in the same lock protected section as the actual store. Otherwise
the previous assignment of 'Reserved' to the name field would trigger the
overwrite check.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: http://lkml.kernel.org/r/20161221192111.675234535@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 22 Dec 2016 09:32:38 +0000 (10:32 +0100)]
x86/msr: Remove bogus cleanup from the error path
The error cleanup which is invoked when the hotplug state setup failed
tries to remove the failed state, which is broken.
Fixes: 8fba38c937cd ("x86/msr: Convert to hotplug state machine")
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Thomas Gleixner [Thu, 22 Dec 2016 10:14:06 +0000 (11:14 +0100)]
bus: arm-ccn: Prevent hotplug callback leak
In case the driver registration fails, the hotplug callback is leaked.
Not fatal, because it's never invoked as there are no instances registered,
but wrong nevertheless.
Fixes: fdc15a36d84e ("bus/arm-ccn: Convert to hotplug statemachine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Thomas Gleixner [Thu, 22 Dec 2016 10:02:08 +0000 (11:02 +0100)]
perf/x86/intel/cstate: Prevent hotplug callback leak
If the pmu registration fails the registered hotplug callbacks are not
removed. Wrong in any case, but fatal in case of a modular driver.
Replace the nonsensical state names with proper ones while at it.
Fixes: 77c34ef1c319 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Thomas Gleixner [Wed, 21 Dec 2016 19:19:48 +0000 (20:19 +0100)]
ARM/imx/mmcd: Fix broken cpu hotplug handling
The cpu hotplug support of this perf driver is broken in several ways:
1) It adds a instance before setting up the state.
2) The state for the instance is different from the state of the
callback. It's just a randomly chosen state.
3) The instance registration is not error checked so nobody noticed that
the call can never succeed.
4) The state for the multi install callbacks is chosen randomly and
overwrites existing state. This is now prevented by the core code so the
call is guaranteed to fail.
5) The error exit path in the init function leaves the instance registered
and then frees the memory which contains the enqueued hlist node.
6) The remove function is removing the state and not the instance.
Fix it by:
- Setting up the state before adding instances. Use a dynamically allocated
state for it.
- Installing instances after the state has been set up
- Removing the instance in the error path before freeing memory
- Removing the instance not the state in the driver remove callback
While at is use raw_cpu_processor_id(), because cpu_processor_id() cannot
be used in preemptible context, and set the driver data after successful
registration of the pmu.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Frank Li <frank.li@nxp.com>
Cc: Zhengyu Shen <zhengyu.shen@nxp.com>
Link: http://lkml.kernel.org/r/20161221192111.596204211@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Sat, 24 Dec 2016 11:34:02 +0000 (12:34 +0100)]
scsi: qedi: Convert to hotplug state machine
The CPU hotplug code is a trainwreck. It leaks a notifier in case of driver
registration error and the per cpu loop is racy against cpu hotplug. Aside
of that the driver should have been written and merged with the new state
machine interfaces in the first place.
Mop up the mess and Convert it to the hotplug state machine.
Signed-off-by: Thomas Grumpy Gleixner <tglx@linutronix.de>
Cc: Nilesh Javali <nilesh.javali@cavium.com>
Cc: Adheer Chandravanshi <adheer.chandravanshi@qlogic.com>
Cc: Chad Dupuis <chad.dupuis@cavium.com>
Cc: Saurav Kashyap <saurav.kashyap@cavium.com>
Cc: Arun Easi <arun.easi@cavium.com>
Cc: Manish Rangankar <manish.rangankar@cavium.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>