openwrt/staging/blogic.git
7 years agocpufreq: arm_big_little: make cpufreq_arm_bL_ops structures const
Bhumika Goyal [Thu, 19 Oct 2017 10:59:15 +0000 (12:59 +0200)]
cpufreq: arm_big_little: make cpufreq_arm_bL_ops structures const

Make these const as they are only getting passed to the functions
bL_cpufreq_{register/unregister} having the arguments as const.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: arm_big_little: make function arguments and structure pointer const
Bhumika Goyal [Thu, 19 Oct 2017 10:59:14 +0000 (12:59 +0200)]
cpufreq: arm_big_little: make function arguments and structure pointer const

Make the arguments of functions bL_cpufreq_{register/unregister} as
const as the ops pointer does not modify the fields of the
cpufreq_arm_bL_ops structure it points to. The pointer arm_bL_ops is
also getting initialized with ops but the pointer does not modify the
fields. So, make the function argument and the structure pointer const.
Add const to function prototypes too.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: pxa: convert to clock API
Robert Jarzmik [Sat, 14 Oct 2017 21:51:02 +0000 (23:51 +0200)]
cpufreq: pxa: convert to clock API

As the clock settings have been introduced into the clock pxa drivers,
which are now available to change the CPU clock by themselves, remove
the clock handling from this driver, and rely on pxa clock drivers.

Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: speedstep-lib: mark expected switch fall-through
Gustavo A. R. Silva [Thu, 12 Oct 2017 22:41:03 +0000 (17:41 -0500)]
cpufreq: speedstep-lib: mark expected switch fall-through

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: ti-cpufreq: add missing of_node_put()
Zumeng Chen [Tue, 10 Oct 2017 13:27:20 +0000 (21:27 +0800)]
cpufreq: ti-cpufreq: add missing of_node_put()

call of_node_put to release the refcount of np.

Signed-off-by: Zumeng Chen <zumeng.chen@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: dt: Remove support for Exynos4212 SoCs
Marek Szyprowski [Wed, 4 Oct 2017 06:38:28 +0000 (08:38 +0200)]
cpufreq: dt: Remove support for Exynos4212 SoCs

Support for Exynos4212 SoCs has been removed by commit bca9085e0ae9
"ARM: dts: exynos: remove Exynos4212 support (dead code)", so there
is no need to keep remaining dead code related to this SoC version.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: imx6q: Move speed grading check to cpufreq driver
Fabio Estevam [Sat, 30 Sep 2017 15:16:46 +0000 (12:16 -0300)]
cpufreq: imx6q: Move speed grading check to cpufreq driver

On some i.MX6 SoCs (like i.MX6SL, i.MX6SX and i.MX6UL) that do not have
speed grading check, opp table will not be created in platform code,
so cpufreq driver prints the following error message:

cpu cpu0: dev_pm_opp_get_opp_count: OPP table not found (-19)

However, this is not really an error in this case because the
imx6q-cpufreq driver first calls dev_pm_opp_get_opp_count()
and if it fails, it means that platform code does not provide
OPP and then dev_pm_opp_of_add_table() will be called.

In order to avoid such confusing error message, move the speed grading
check from platform code to the imx6q-cpufreq driver.

This way the imx6q-cpufreq no longer has to check whether OPP table
is supplied by platform code.

Tested on a i.MX6Q and i.MX6UL based boards.

Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: ti-cpufreq: kfree opp_data when failure
Zumeng Chen [Wed, 27 Sep 2017 07:08:17 +0000 (15:08 +0800)]
cpufreq: ti-cpufreq: kfree opp_data when failure

memory leakage was found by kmemleak. opp_data needs to be freed
when failure, including fail_put_node.

unreferenced object 0xccdd4c40 (size 64):
  comm "swapper", pid 1, jiffies 4294938465 (age 888.520s)
  hex dump (first 32 bytes):
    00 7c 00 c1 98 69 d8 ce 00 24 03 ce 00 24 03 ce  .|...i...$...$..
    20 35 23 c1 00 00 00 00 00 00 00 00 00 00 00 00   5#.............
  backtrace:
    [<c028fb64>] kmem_cache_alloc_trace+0x2c4/0x3cc
    [<c076d5f0>] ti_cpufreq_probe+0x6c/0x334
    [<c068d6e4>] platform_drv_probe+0x60/0xc0
    [<c068b384>] driver_probe_device+0x218/0x2c4
    [<c068b5a4>] __device_attach_driver+0xa8/0xdc
    [<c0689340>] bus_for_each_drv+0x70/0xa4
    [<c068b020>] __device_attach+0xc0/0x124
    [<c068b634>] device_initial_probe+0x1c/0x20
    [<c068a3b8>] bus_probe_device+0x94/0x9c
    [<c0688300>] device_add+0x404/0x590
    [<c068d408>] platform_device_add+0x11c/0x230
    [<c068df40>] platform_device_register_full+0x10c/0x128
    [<c076d578>] ti_cpufreq_init+0x44/0x50
    [<c01017c4>] do_one_initcall+0x54/0x180
    [<c0e00fe0>] kernel_init_freeable+0x270/0x33c
    [<c093f2bc>] kernel_init+0x18/0x124

Signed-off-by: Zumeng Chen <zumeng.chen@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: SPEAr: pr_err() strings should end with newlines
Arvind Yadav [Mon, 25 Sep 2017 10:13:49 +0000 (15:43 +0530)]
cpufreq: SPEAr: pr_err() strings should end with newlines

pr_err() messages should terminated with a new-line to avoid
other messages being concatenated onto the end.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: powernow-k8: pr_err() strings should end with newlines
Arvind Yadav [Mon, 25 Sep 2017 09:40:11 +0000 (15:10 +0530)]
cpufreq: powernow-k8: pr_err() strings should end with newlines

pr_err() messages should terminated with a new-line to avoid
other messages being concatenated onto the end.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: dt-platdev: drop socionext,uniphier-ld6b from whitelist
Masahiro Yamada [Tue, 29 Aug 2017 15:37:03 +0000 (00:37 +0900)]
cpufreq: dt-platdev: drop socionext,uniphier-ld6b from whitelist

As you see arch/arm/boot/dts/uniphier-ld6b.dtsi, it includes
uniphier-pxs2.dtsi, which uses "operating-points-v2" property
and whose cpufreq device is automatically created.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoarm64: wire cpu-invariant accounting support up to the task scheduler
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:15 +0000 (17:41 +0100)]
arm64: wire cpu-invariant accounting support up to the task scheduler

Commit 8cd5601c5060 ("sched/fair: Convert arch_scale_cpu_capacity() from
weak function to #define") changed the wiring which now has to be done
by associating arch_scale_cpu_capacity with the actual implementation
provided by the architecture.

Define arch_scale_cpu_capacity to use the arch_topology "driver"
function topology_get_cpu_scale() for the task scheduler's cpu-invariant
accounting instead of the default arch_scale_cpu_capacity() in
kernel/sched/sched.h.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoarm64: wire frequency-invariant accounting support up to the task scheduler
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:14 +0000 (17:41 +0100)]
arm64: wire frequency-invariant accounting support up to the task scheduler

Commit dfbca41f3479 ("sched: Optimize freq invariant accounting")
changed the wiring which now has to be done by associating
arch_scale_freq_capacity with the actual implementation provided
by the architecture.

Define arch_scale_freq_capacity to use the arch_topology "driver"
function topology_get_freq_scale() for the task scheduler's
frequency-invariant accounting instead of the default
arch_scale_freq_capacity() in kernel/sched/sched.h.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoarm: wire cpu-invariant accounting support up to the task scheduler
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:13 +0000 (17:41 +0100)]
arm: wire cpu-invariant accounting support up to the task scheduler

Commit 8cd5601c5060 ("sched/fair: Convert arch_scale_cpu_capacity() from
weak function to #define") changed the wiring which now has to be done
by associating arch_scale_cpu_capacity with the actual implementation
provided by the architecture.

Define arch_scale_cpu_capacity to use the arch_topology "driver"
function topology_get_cpu_scale() for the task scheduler's cpu-invariant
accounting instead of the default arch_scale_cpu_capacity() in
kernel/sched/sched.h.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoarm: wire frequency-invariant accounting support up to the task scheduler
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:12 +0000 (17:41 +0100)]
arm: wire frequency-invariant accounting support up to the task scheduler

Commit dfbca41f3479 ("sched: Optimize freq invariant accounting")
changed the wiring which now has to be done by associating
arch_scale_freq_capacity with the actual implementation provided
by the architecture.

Define arch_scale_freq_capacity to use the arch_topology "driver"
function topology_get_freq_scale() for the task scheduler's
frequency-invariant accounting instead of the default
arch_scale_freq_capacity() in kernel/sched/sched.h.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agodrivers base/arch_topology: allow inlining cpu-invariant accounting support
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:11 +0000 (17:41 +0100)]
drivers base/arch_topology: allow inlining cpu-invariant accounting support

Allow inlining of topology_get_cpu_scale() into the task
scheduler fast path (e.g. __update_load_avg_se()) by coding it as a
static inline function in the arch topology header file.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agodrivers base/arch_topology: provide frequency-invariant accounting support
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:10 +0000 (17:41 +0100)]
drivers base/arch_topology: provide frequency-invariant accounting support

Implements the arch-specific (arm and arm64) frequency-invariance setter
function arch_set_freq_scale() which provides the following frequency
scaling factor:

  current_freq(cpu) << SCHED_CAPACITY_SHIFT / max_supported_freq(cpu)

One possible consumer of the frequency-invariance getter function
topology_get_freq_scale() is the Per-Entity Load Tracking (PELT)
mechanism of the task scheduler.

Allow inlining of topology_get_freq_scale() into the task scheduler
fast path (e.g. __update_load_avg_se()) by coding it as a static inline
function in the arch topology header file.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: dt: invoke frequency-invariance setter function
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:09 +0000 (17:41 +0100)]
cpufreq: dt: invoke frequency-invariance setter function

Call the frequency-invariance setter function arch_set_freq_scale()
if the new frequency has been successfully set which is indicated by
dev_pm_opp_set_rate() returning 0.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: arm_big_little: invoke frequency-invariance setter function
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:08 +0000 (17:41 +0100)]
cpufreq: arm_big_little: invoke frequency-invariance setter function

Call the frequency-invariance setter function arch_set_freq_scale()
if the new frequency has been successfully set which is indicated by
bL_cpufreq_set_rate() returning 0.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agocpufreq: provide default frequency-invariance setter function
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:07 +0000 (17:41 +0100)]
cpufreq: provide default frequency-invariance setter function

Frequency-invariant accounting support based on the ratio of current
frequency and maximum supported frequency is an optional feature an arch
can implement.

Since there are cpufreq drivers (e.g. cpufreq-dt) which can be build for
different arch's a default implementation of the frequency-invariance
setter function arch_set_freq_scale() is needed.

This default implementation is an empty weak function which will be
overwritten by a strong function in case the arch provides one.

The setter function passes the cpumask of related (to the frequency
change) cpus (online and offline cpus), the (new) current frequency and
the maximum supported frequency.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agodrivers base/arch_topology: free cpumask cpus_to_visit
Dietmar Eggemann [Tue, 26 Sep 2017 16:41:06 +0000 (17:41 +0100)]
drivers base/arch_topology: free cpumask cpus_to_visit

Free cpumask cpus_to_visit in case registering
init_cpu_capacity_notifier has failed or the parsing of the cpu
capacity-dmips-mhz property is done. The cpumask cpus_to_visit is
only used inside the notifier call init_cpu_capacity_callback.

Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Juri Lelli <juri.lelli@arm.com>
Reviewed-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoLinux 4.14-rc3
Linus Torvalds [Sun, 1 Oct 2017 21:54:54 +0000 (14:54 -0700)]
Linux 4.14-rc3

7 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 20:55:32 +0000 (13:55 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "This contains the following fixes and improvements:

   - Avoid dereferencing an unprotected VMA pointer in the fault signal
     generation code

   - Fix inline asm call constraints for GCC 4.4

   - Use existing register variable to retrieve the stack pointer
     instead of forcing the compiler to create another indirect access
     which results in excessive extra 'mov %rsp, %<dst>' instructions

   - Disable branch profiling for the memory encryption code to prevent
     an early boot crash

   - Fix a sparse warning caused by casting the __user annotation in
     __get_user_asm_u64() away

   - Fix an off by one error in the loop termination of the error patch
     in the x86 sysfs init code

   - Add missing CPU IDs to various Intel specific drivers to enable the
     functionality on recent hardware

   - More (init) constification in the numachip code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Use register variable to get stack pointer value
  x86/mm: Disable branch profiling in mem_encrypt.c
  x86/asm: Fix inline asm call constraints for GCC 4.4
  perf/x86/intel/uncore: Correct num_boxes for IIO and IRP
  perf/x86/intel/rapl: Add missing CPU IDs
  perf/x86/msr: Add missing CPU IDs
  perf/x86/intel/cstate: Add missing CPU IDs
  x86: Don't cast away the __user in __get_user_asm_u64()
  x86/sysfs: Fix off-by-one error in loop termination
  x86/mm: Fix fault error path using unsafe vma pointer
  x86/numachip: Add const and __initconst to numachip2_clockevent

7 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 20:03:16 +0000 (13:03 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "This adds a new timer wheel function which is required for the
  conversion of the timer callback function from the 'unsigned long
  data' argument to 'struct timer_list *timer'. This conversion has two
  benefits:

   1) It makes struct timer_list smaller

   2) Many callers hand in a pointer to the timer or to the structure
      containing the timer, which happens via type casting both at setup
      and in the callback. This change gets rid of the typecasts.

  Once the conversion is complete, which is planned for 4.15, the old
  setup function and the intermediate typecast in the new setup function
  go away along with the data field in struct timer_list.

  Merging this now into mainline allows a smooth queueing of the actual
  conversion in the affected maintainer trees without creating
  dependencies"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  um/time: Fixup namespace collision
  timer: Prepare to change timer callback argument type

7 years agoMerge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 19:34:42 +0000 (12:34 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull smp/hotplug fixes from Thomas Gleixner:
 "This addresses the fallout of the new lockdep mechanism which covers
  completions in the CPU hotplug code.

  The lockdep splats are false positives, but there is no way to
  annotate that reliably. The solution is to split the completions for
  CPU up and down, which requires some reshuffling of the failure
  rollback handling as well"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  smp/hotplug: Hotplug state fail injection
  smp/hotplug: Differentiate the AP completion between up and down
  smp/hotplug: Differentiate the AP-work lockdep class between up and down
  smp/hotplug: Callback vs state-machine consistency
  smp/hotplug: Rewrite AP state machine core
  smp/hotplug: Allow external multi-instance rollback
  smp/hotplug: Add state diagram

7 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 19:10:02 +0000 (12:10 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
 "The scheduler pull request comes with the following updates:

   - Prevent a divide by zero issue by validating the input value of
     sysctl_sched_time_avg

   - Make task state printing consistent all over the place and have
     explicit state characters for IDLE and PARKED so they wont be
     displayed as 'D' state which confuses tools"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/sysctl: Check user input value of sysctl_sched_time_avg
  sched/debug: Add explicit TASK_PARKED printing
  sched/debug: Ignore TASK_IDLE for SysRq-W
  sched/debug: Add explicit TASK_IDLE printing
  sched/tracing: Use common task-state helpers
  sched/tracing: Fix trace_sched_switch task-state printing
  sched/debug: Remove unused variable
  sched/debug: Convert TASK_state to hex
  sched/debug: Implement consistent task-state printing

7 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 19:06:31 +0000 (12:06 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:

 - Prevent a division by zero in the perf aux buffer handling

 - Sync kernel headers with perf tool headers

 - Fix a build failure in the syscalltbl code

 - Make the debug messages of perf report --call-graph work correctly

 - Make sure that all required perf files are in the MANIFEST for
   container builds

 - Fix the atrr.exclude kernel handling so it respects the
   perf_event_paranoid and the user permissions

 - Make perf test on s390x work correctly

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/aux: Only update ->aux_wakeup in non-overwrite mode
  perf test: Fix vmlinux failure on s390x part 2
  perf test: Fix vmlinux failure on s390x
  perf tools: Fix syscalltbl build failure
  perf report: Fix debug messages with --call-graph option
  perf evsel: Fix attr.exclude_kernel setting for default cycles:p
  tools include: Sync kernel ABI headers with tooling headers
  perf tools: Get all of tools/{arch,include}/ in the MANIFEST

7 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 19:02:47 +0000 (12:02 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull  locking fixes from Thomas Gleixner:
 "Two fixes for locking:

   - Plug a hole the pi_stat->owner serialization which was changed
     recently and failed to fixup two usage sites.

   - Prevent reordering of the rwsem_has_spinner() check vs the
     decrement of rwsem count in up_write() which causes a missed
     wakeup"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem-xadd: Fix missed wakeup due to reordering of load
  futex: Fix pi_state->owner serialization

7 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 19:00:56 +0000 (12:00 -0700)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:

 - Add a missing NULL pointer check in free_irq()

 - Fix a memory leak/memory corruption in the generic irq chip

 - Add missing rcu annotations for radix tree access

 - Use ffs instead of fls when extracting data from a chip register in
   the MIPS GIC irq driver

 - Fix the unmasking of IPI interrupts in the MIPS GIC driver so they
   end up at the target CPU and not at CPU0

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irq/generic-chip: Don't replace domain's name
  irqdomain: Add __rcu annotations to radix tree accessors
  irqchip/mips-gic: Use effective affinity to unmask
  irqchip/mips-gic: Fix shifts to extract register fields
  genirq: Check __free_irq() return value for NULL

7 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 1 Oct 2017 18:12:29 +0000 (11:12 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull objtool fixes from Thomas Gleixner:
 "Two small fixes for objtool:

   - Support frame pointer setup via 'lea (%rsp), %rbp' which was not
     yet supported and caused build warnings

   - Disable unreacahble warnings for GCC4.4 and older to avoid false
     positives caused by the compiler itself"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Support unoptimized frame pointer setup
  objtool: Skip unreachable warnings for GCC 4.4 and older

7 years agoMerge tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd
Linus Torvalds [Sat, 30 Sep 2017 19:52:32 +0000 (12:52 -0700)]
Merge tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd

Pull mtd fixes from Boris Brezillon:

 - Fix partition alignment check in mtdcore.c

 - Fix a buffer overflow in the Atmel NAND driver

* tag 'mtd/fixes-for-4.14-rc3' of git://git.infradead.org/linux-mtd:
  mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user
  mtd: Fix partition alignment check on multi-erasesize devices

7 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 30 Sep 2017 19:50:56 +0000 (12:50 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Eight mostly minor fixes for recently discovered issues in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: ILLEGAL REQUEST + ASC==27 => target failure
  scsi: aacraid: Add a small delay after IOP reset
  scsi: scsi_transport_fc: Also check for NOTPRESENT in fc_remote_port_add()
  scsi: scsi_transport_fc: set scsi_target_id upon rescan
  scsi: scsi_transport_iscsi: fix the issue that iscsi_if_rx doesn't parse nlmsg properly
  scsi: aacraid: error: testing array offset 'bus' after use
  scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
  scsi: aacraid: Fix 2T+ drives on SmartIOC-2000

7 years agoMerge tag 'platform-drivers-x86-v4.14-2' of git://git.infradead.org/linux-platform...
Linus Torvalds [Sat, 30 Sep 2017 02:35:41 +0000 (19:35 -0700)]
Merge tag 'platform-drivers-x86-v4.14-2' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform drivers fix from Darren Hart:
 "Newly discovered species of fujitsu laptops break some assumptions
  about ACPI device pairings.

  fujitsu-laptop: Don't oops when FUJ02E3 is not present"

* tag 'platform-drivers-x86-v4.14-2' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: fujitsu-laptop: Don't oops when FUJ02E3 is not presnt

7 years agoMerge tag 'led_fixes-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j...
Linus Torvalds [Sat, 30 Sep 2017 02:33:32 +0000 (19:33 -0700)]
Merge tag 'led_fixes-4.14-rc3' of git://git./linux/kernel/git/j.anaszewski/linux-leds

Pull LED fixes from Jacek Anaszewski:
 "Four fixes for the as3645a LED flash controller and one update to
  MAINTAINERS"

* tag 'led_fixes-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
  MAINTAINERS: Add entry for MediaTek PMIC LED driver
  as3645a: Unregister indicator LED on device unbind
  as3645a: Use integer numbers for parsing LEDs
  dt: bindings: as3645a: Use LED number to refer to LEDs
  as3645a: Use ams,input-max-microamp as documented in DT bindings

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Fri, 29 Sep 2017 19:59:59 +0000 (12:59 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull waitid fix from Al Viro:
 "Fix infoleak in waitid()"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fix infoleak in waitid(2)

7 years agoMerge branch 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 29 Sep 2017 19:57:35 +0000 (12:57 -0700)]
Merge branch 'for-4.14-rc3' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "We've collected a bunch of isolated fixes, for crashes, user-visible
  behaviour or missing bits from other subsystem cleanups from the past.

  The overall number is not small but I was not able to make it
  significantly smaller. Most of the patches are supposed to go to
  stable"

* 'for-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: log csums for all modified extents
  Btrfs: fix unexpected result when dio reading corrupted blocks
  btrfs: Report error on removing qgroup if del_qgroup_item fails
  Btrfs: skip checksum when reading compressed data if some IO have failed
  Btrfs: fix kernel oops while reading compressed data
  Btrfs: use btrfs_op instead of bio_op in __btrfs_map_block
  Btrfs: do not backup tree roots when fsync
  btrfs: remove BTRFS_FS_QUOTA_DISABLING flag
  btrfs: propagate error to btrfs_cmp_data_prepare caller
  btrfs: prevent to set invalid default subvolid
  Btrfs: send: fix error number for unknown inode types
  btrfs: fix NULL pointer dereference from free_reloc_roots()
  btrfs: finish ordered extent cleaning if no progress is found
  btrfs: clear ordered flag on cleaning up ordered extents
  Btrfs: fix incorrect {node,sector}size endianness from BTRFS_IOC_FS_INFO
  Btrfs: do not reset bio->bi_ops while writing bio
  Btrfs: use the new helper wbc_to_write_flags

7 years agoMerge tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Linus Torvalds [Fri, 29 Sep 2017 19:55:33 +0000 (12:55 -0700)]
Merge tag 'md/4.14-rc3' of git://git./linux/kernel/git/shli/md

Pull MD fixes from Shaohua Li:
 "A few fixes for MD. Mainly fix a problem introduced in 4.13, which we
  retry bio for some code paths but not all in some situations"

* tag 'md/4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid5: cap worker count
  dm-raid: fix a race condition in request handling
  md: fix a race condition for flush request handling
  md: separate request handling

7 years agoMerge tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Fri, 29 Sep 2017 19:46:13 +0000 (12:46 -0700)]
Merge tag 'pci-v4.14-fixes-3' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - fix CONFIG_PCI=n build error (introduced in v4.14-rc1) (Geert
   Uytterhoeven)

 - fix a race in sysfs driver_override store/show (Nicolai Stange)

* tag 'pci-v4.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Fix race condition with driver_override
  PCI: Add dummy pci_acs_enabled() for CONFIG_PCI=n build

7 years agoMerge tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 29 Sep 2017 19:43:36 +0000 (12:43 -0700)]
Merge tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Regular fixes pull, some amdkfd, amdgpu, etnaviv, sun4i, qxl, tegra
  fixes.

  I've got an outstanding pull for i915 but it wasn't on an rc2 base so
  I wanted to ship these out first, I might get to it before rc3 or I
  might not"

* tag 'drm-fixes-for-v4.14-rc3' of git://people.freedesktop.org/~airlied/linux:
  drm/tegra: trace: Fix path to include
  qxl: fix framebuffer unpinning
  drm/sun4i: cec: Enable back CEC-pin framework
  drm/amdkfd: Print event limit messages only once per process
  drm/amdkfd: Fix kernel-queue wrapping bugs
  drm/amdkfd: Fix incorrect destroy_mqd parameter
  drm/radeon: disable hard reset in hibernate for APUs
  drm/amdgpu: revert tile table update for oland
  etnaviv: fix gem object list corruption
  etnaviv: fix submit error path
  qxl: fix primary surface handling
  drm/amdkfd: check for null dev to avoid a null pointer dereference

7 years agoMerge tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 29 Sep 2017 19:37:07 +0000 (12:37 -0700)]
Merge tag 'iommu-fixes-v4.14-rc2' of git://git./linux/kernel/git/joro/iommu

Pull IOMMU fixes from Joerg Roedel:

 - A comment fix for 'struct iommu_ops'

 - Format string fixes for AMD IOMMU, unfortunatly I missed that during
   review.

 - Limit mediatek physical addresses to 32 bit for v7s to fix a warning
   triggered in io-page-table code.

 - Fix dma-sync in io-pgtable-arm-v7s code

* tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Fix comment for iommu_ops.map_sg
  iommu/amd: pr_err() strings should end with newlines
  iommu/mediatek: Limit the physical address in 32bit for v7s
  iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA

7 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 29 Sep 2017 19:31:35 +0000 (12:31 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

 - SPsel register initialisation on reset as the architecture defines
   its state as unknown

 - Use READ_ONCE when dereferencing pmd_t pointers to avoid race
   conditions in page_vma_mapped_walk() (or fast GUP) with concurrent
   modifications of the page table

 - Avoid invoking the mm fault handling code for kernel addresses (check
   against TASK_SIZE) which would otherwise result in calling
   might_sleep() in atomic context

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: fault: Route pte translation faults via do_translation_fault
  arm64: mm: Use READ_ONCE when dereferencing pointer to pte table
  arm64: Make sure SPsel is always set

7 years agoMerge tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 29 Sep 2017 19:24:28 +0000 (12:24 -0700)]
Merge tag 'for-linus-4.14c-rc3-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - avoid a warning when compiling with clang

 - consider read-only bits in xen-pciback when writing to a BAR

 - fix a boot crash of pv-domains

* tag 'for-linus-4.14c-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
  xen-pciback: relax BAR sizing write value check
  x86/xen: clean up clang build warning

7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 29 Sep 2017 19:18:55 +0000 (12:18 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Mixed bugfixes. Perhaps the most interesting one is a latent bug that
  was finally triggered by PCID support"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm/x86: Handle async PF in RCU read-side critical sections
  KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
  KVM: VMX: use cmpxchg64
  KVM: VMX: simplify and fix vmx_vcpu_pi_load
  KVM: VMX: avoid double list add with VT-d posted interrupts
  KVM: VMX: extract __pi_post_block
  KVM: PPC: Book3S HV: Check for updated HDSISR on P9 HDSI exception
  KVM: nVMX: fix HOST_CR3/HOST_CR4 cache

7 years agofix infoleak in waitid(2)
Al Viro [Fri, 29 Sep 2017 17:43:15 +0000 (13:43 -0400)]
fix infoleak in waitid(2)

kernel_waitid() can return a PID, an error or 0.  rusage is filled in the first
case and waitid(2) rusage should've been copied out exactly in that case, *not*
whenever kernel_waitid() has not returned an error.  Compat variant shares that
braino; none of kernel_wait4() callers do, so the below ought to fix it.

Reported-and-tested-by: Alexander Potapenko <glider@google.com>
Fixes: ce72a16fa705 ("wait4(2)/waitid(2): separate copying rusage to userland")
Cc: stable@vger.kernel.org # v4.13
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7 years agox86/asm: Use register variable to get stack pointer value
Andrey Ryabinin [Fri, 29 Sep 2017 14:15:36 +0000 (17:15 +0300)]
x86/asm: Use register variable to get stack pointer value

Currently we use current_stack_pointer() function to get the value
of the stack pointer register. Since commit:

  f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")

... we have a stack register variable declared. It can be used instead of
current_stack_pointer() function which allows to optimize away some
excessive "mov %rsp, %<dst>" instructions:

 -mov    %rsp,%rdx
 -sub    %rdx,%rax
 -cmp    $0x3fff,%rax
 -ja     ffffffff810722fd <ist_begin_non_atomic+0x2d>

 +sub    %rsp,%rax
 +cmp    $0x3fff,%rax
 +ja     ffffffff810722fa <ist_begin_non_atomic+0x2a>

Remove current_stack_pointer(), rename __asm_call_sp to current_stack_pointer
and use it instead of the removed function.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929141537.29167-1-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agox86/mm: Disable branch profiling in mem_encrypt.c
Tom Lendacky [Fri, 29 Sep 2017 16:24:19 +0000 (11:24 -0500)]
x86/mm: Disable branch profiling in mem_encrypt.c

Some routines in mem_encrypt.c are called very early in the boot process,
e.g. sme_encrypt_kernel(). When CONFIG_TRACE_BRANCH_PROFILING=y is defined
the resulting branch profiling associated with the check to see if SME is
active results in a kernel crash. Disable branch profiling for
mem_encrypt.c by defining DISABLE_BRANCH_PROFILING before including any
header files.

Reported-by: kernel test robot <lkp@01.org>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170929162419.6016.53390.stgit@tlendack-t1.amdoffice.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoMerge tag 'perf-urgent-for-mingo-4.14-20170928' of git://git.kernel.org/pub/scm/linux...
Ingo Molnar [Fri, 29 Sep 2017 17:31:46 +0000 (19:31 +0200)]
Merge tag 'perf-urgent-for-mingo-4.14-20170928' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Fix syscalltbl build failure (Akemi Yagi)

- Fix attr.exclude_kernel setting for default cycles:p, this time for
  !root with kernel.perf_event_paranoid = -1 (Arnaldo Carvalho de Melo)

- Sync kernel ABI headers with tooling headers (Ingo Molnar)

- Remove misleading debug messages with --call-graph option (Mengting Zhang)

- Revert vmlinux symbol resolution patches for s390x (Thomas Richter)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoMerge branch 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorr...
Linus Torvalds [Fri, 29 Sep 2017 17:26:35 +0000 (10:26 -0700)]
Merge branch 'fixes-v4.14-rc3' of git://git./linux/kernel/git/jmorris/linux-security

Pull keys fixes from James Morris:
 "Notable here is a rewrite of big_key crypto by Jason Donenfeld to
  address some issues in the original code.

  From Jason's commit log:
   "This started out as just replacing the use of crypto/rng with
    get_random_bytes_wait, so that we wouldn't use bad randomness at
    boot time. But, upon looking further, it appears that there were
    even deeper underlying cryptographic problems, and that this seems
    to have been committed with very little crypto review. So, I rewrote
    the whole thing, trying to keep to the conventions introduced by the
    previous author, to fix these cryptographic flaws."

  There has been positive review of the new code by Eric Biggers and
  Herbert Xu, and it passes basic testing via the keyutils test suite.
  Eric also manually tested it.

  Generally speaking, we likely need to improve the amount of crypto
  review for kernel crypto users including keys (I'll post a note
  separately to ksummit-discuss)"

* 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  security/keys: rewrite all of big_key crypto
  security/keys: properly zero out sensitive key material in big_key
  KEYS: use kmemdup() in request_key_auth_new()
  KEYS: restrict /proc/keys by credentials at open time
  KEYS: reset parent each time before searching key_user_tree
  KEYS: prevent KEYCTL_READ on negative key
  KEYS: prevent creating a different user's keyrings
  KEYS: fix writing past end of user-supplied buffer in keyring_read()
  KEYS: fix key refcount leak in keyctl_read_key()
  KEYS: fix key refcount leak in keyctl_assume_authority()
  KEYS: don't revoke uninstantiated key in request_key_auth_new()
  KEYS: fix cred refcount leak in request_key_auth_new()

7 years agoarm64: fault: Route pte translation faults via do_translation_fault
Will Deacon [Fri, 29 Sep 2017 11:27:41 +0000 (12:27 +0100)]
arm64: fault: Route pte translation faults via do_translation_fault

We currently route pte translation faults via do_page_fault, which elides
the address check against TASK_SIZE before invoking the mm fault handling
code. However, this can cause issues with the path walking code in
conjunction with our word-at-a-time implementation because
load_unaligned_zeropad can end up faulting in kernel space if it reads
across a page boundary and runs into a page fault (e.g. by attempting to
read from a guard region).

In the case of such a fault, load_unaligned_zeropad has registered a
fixup to shift the valid data and pad with zeroes, however the abort is
reported as a level 3 translation fault and we dispatch it straight to
do_page_fault, despite it being a kernel address. This results in calling
a sleeping function from atomic context:

  BUG: sleeping function called from invalid context at arch/arm64/mm/fault.c:313
  in_atomic(): 0, irqs_disabled(): 0, pid: 10290
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  [...]
  [<ffffff8e016cd0cc>] ___might_sleep+0x134/0x144
  [<ffffff8e016cd158>] __might_sleep+0x7c/0x8c
  [<ffffff8e016977f0>] do_page_fault+0x140/0x330
  [<ffffff8e01681328>] do_mem_abort+0x54/0xb0
  Exception stack(0xfffffffb20247a70 to 0xfffffffb20247ba0)
  [...]
  [<ffffff8e016844fc>] el1_da+0x18/0x78
  [<ffffff8e017f399c>] path_parentat+0x44/0x88
  [<ffffff8e017f4c9c>] filename_parentat+0x5c/0xd8
  [<ffffff8e017f5044>] filename_create+0x4c/0x128
  [<ffffff8e017f59e4>] SyS_mkdirat+0x50/0xc8
  [<ffffff8e01684e30>] el0_svc_naked+0x24/0x28
  Code: 36380080 d5384100 f9400800 9402566d (d4210000)
  ---[ end trace 2d01889f2bca9b9f ]---

Fix this by dispatching all translation faults to do_translation_faults,
which avoids invoking the page fault logic for faults on kernel addresses.

Cc: <stable@vger.kernel.org>
Reported-by: Ankit Jain <ankijain@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
7 years agoarm64: mm: Use READ_ONCE when dereferencing pointer to pte table
Will Deacon [Fri, 29 Sep 2017 10:29:55 +0000 (11:29 +0100)]
arm64: mm: Use READ_ONCE when dereferencing pointer to pte table

On kernels built with support for transparent huge pages, different CPUs
can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
and they must take care to use READ_ONCE to avoid value tearing or caching
of stale values by the compiler. Unfortunately, these functions call into
our pgtable macros, which don't use READ_ONCE, and compiler caching has
been observed to cause the following crash during ext4 writeback:

PC is at check_pte+0x20/0x170
LR is at page_vma_mapped_walk+0x2e0/0x540
[...]
Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
Call trace:
[<ffff000008233328>] check_pte+0x20/0x170
[<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
[<ffff000008234adc>] page_mkclean_one+0xac/0x278
[<ffff000008234d98>] rmap_walk_file+0xf0/0x238
[<ffff000008236e74>] rmap_walk+0x64/0xa0
[<ffff0000082370c8>] page_mkclean+0x90/0xa8
[<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
[<ffff00000832f984>] mpage_submit_page+0x34/0x98
[<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
[<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
[<ffff00000833530c>] ext4_writepages+0x484/0xe30
[<ffff0000081f6ab4>] do_writepages+0x44/0xe8
[<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
[<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
[<ffff000008324310>] ext4_sync_file+0x80/0x4b8
[<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
[<ffff0000082332b4>] SyS_msync+0x194/0x1e8

This is because page_vma_mapped_walk loads the PMD twice before calling
pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
(where it sees a valid table pointer due to a concurrent pmd_populate).
However, the compiler inlines everything and caches the first value in
a register, which is subsequently used in pte_offset_phys which returns
a junk pointer that is later dereferenced when attempting to access the
relevant pte.

This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
that a stale value is not used. Whilst this is a point fix for a known
failure (and simple to backport), a full fix moving all of our page table
accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
page_vma_mapped_walk is in the works for a future kernel release.

Cc: Jon Masters <jcm@redhat.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
7 years agokvm/x86: Handle async PF in RCU read-side critical sections
Boqun Feng [Fri, 29 Sep 2017 11:01:45 +0000 (19:01 +0800)]
kvm/x86: Handle async PF in RCU read-side critical sections

Sasha Levin reported a WARNING:

| WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
| rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
| WARNING: CPU: 0 PID: 6974 at kernel/rcu/tree_plugin.h:329
| rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
...
| CPU: 0 PID: 6974 Comm: syz-fuzzer Not tainted 4.13.0-next-20170908+ #246
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
| 1.10.1-1ubuntu1 04/01/2014
| Call Trace:
...
| RIP: 0010:rcu_preempt_note_context_switch kernel/rcu/tree_plugin.h:329 [inline]
| RIP: 0010:rcu_note_context_switch+0x16c/0x2210 kernel/rcu/tree.c:458
| RSP: 0018:ffff88003b2debc8 EFLAGS: 00010002
| RAX: 0000000000000001 RBX: 1ffff1000765bd85 RCX: 0000000000000000
| RDX: 1ffff100075d7882 RSI: ffffffffb5c7da20 RDI: ffff88003aebc410
| RBP: ffff88003b2def30 R08: dffffc0000000000 R09: 0000000000000001
| R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003b2def08
| R13: 0000000000000000 R14: ffff88003aebc040 R15: ffff88003aebc040
| __schedule+0x201/0x2240 kernel/sched/core.c:3292
| schedule+0x113/0x460 kernel/sched/core.c:3421
| kvm_async_pf_task_wait+0x43f/0x940 arch/x86/kernel/kvm.c:158
| do_async_page_fault+0x72/0x90 arch/x86/kernel/kvm.c:271
| async_page_fault+0x22/0x30 arch/x86/entry/entry_64.S:1069
| RIP: 0010:format_decode+0x240/0x830 lib/vsprintf.c:1996
| RSP: 0018:ffff88003b2df520 EFLAGS: 00010283
| RAX: 000000000000003f RBX: ffffffffb5d1e141 RCX: ffff88003b2df670
| RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffb5d1e140
| RBP: ffff88003b2df560 R08: dffffc0000000000 R09: 0000000000000000
| R10: ffff88003b2df718 R11: 0000000000000000 R12: ffff88003b2df5d8
| R13: 0000000000000064 R14: ffffffffb5d1e140 R15: 0000000000000000
| vsnprintf+0x173/0x1700 lib/vsprintf.c:2136
| sprintf+0xbe/0xf0 lib/vsprintf.c:2386
| proc_self_get_link+0xfb/0x1c0 fs/proc/self.c:23
| get_link fs/namei.c:1047 [inline]
| link_path_walk+0x1041/0x1490 fs/namei.c:2127
...

This happened when the host hit a page fault, and delivered it as in an
async page fault, while the guest was in an RCU read-side critical
section.  The guest then tries to reschedule in kvm_async_pf_task_wait(),
but rcu_preempt_note_context_switch() would treat the reschedule as a
sleep in RCU read-side critical section, which is not allowed (even in
preemptible RCU).  Thus the WARN.

To cure this, make kvm_async_pf_task_wait() go to the halt path if the
PF happens in a RCU read-side critical section.

Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agoKVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume
Wanpeng Li [Fri, 29 Sep 2017 01:16:44 +0000 (18:16 -0700)]
KVM: nVMX: Fix nested #PF intends to break L1's vmlauch/vmresume

------------[ cut here ]------------
 WARNING: CPU: 4 PID: 5280 at /home/kernel/linux/arch/x86/kvm//vmx.c:11394 nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 CPU: 4 PID: 5280 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0+ #17
 RIP: 0010:nested_vmx_vmexit+0xc2b/0xd70 [kvm_intel]
 Call Trace:
  ? emulator_read_emulated+0x15/0x20 [kvm]
  ? segmented_read+0xae/0xf0 [kvm]
  vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  ? vmx_inject_page_fault_nested+0x60/0x70 [kvm_intel]
  x86_emulate_instruction+0x733/0x810 [kvm]
  vmx_handle_exit+0x2f4/0xda0 [kvm_intel]
  ? kvm_arch_vcpu_ioctl_run+0xd2f/0x1c60 [kvm]
  kvm_arch_vcpu_ioctl_run+0xdab/0x1c60 [kvm]
  ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
  kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
  ? __fget+0xfc/0x210
  do_vfs_ioctl+0xa4/0x6a0
  ? __fget+0x11d/0x210
  SyS_ioctl+0x79/0x90
  entry_SYSCALL_64_fastpath+0x23/0xc2

A nested #PF is triggered during L0 emulating instruction for L2. However, it
doesn't consider we should not break L1's vmlauch/vmresme. This patch fixes
it by queuing the #PF exception instead ,requesting an immediate VM exit from
L2 and keeping the exception for L1 pending for a subsequent nested VM exit.

This should actually work all the time, making vmx_inject_page_fault_nested
totally unnecessary.  However, that's not working yet, so this patch can work
around the issue in the meanwhile.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agosched/sysctl: Check user input value of sysctl_sched_time_avg
Ethan Zhao [Mon, 4 Sep 2017 05:59:34 +0000 (13:59 +0800)]
sched/sysctl: Check user input value of sysctl_sched_time_avg

System will hang if user set sysctl_sched_time_avg to 0:

  [root@XXX ~]# sysctl kernel.sched_time_avg_ms=0

  Stack traceback for pid 0
  0xffff883f6406c600 0 0 1 3 R 0xffff883f6406cf50 *swapper/3
  ffff883f7ccc3ae8 0000000000000018 ffffffff810c4dd0 0000000000000000
  0000000000017800 ffff883f7ccc3d78 0000000000000003 ffff883f7ccc3bf8
  ffffffff810c4fc9 ffff883f7ccc3c08 00000000810c5043 ffff883f7ccc3c08
  Call Trace:
  <IRQ> [<ffffffff810c4dd0>] ? update_group_capacity+0x110/0x200
  [<ffffffff810c4fc9>] ? update_sd_lb_stats+0x109/0x600
  [<ffffffff810c5507>] ? find_busiest_group+0x47/0x530
  [<ffffffff810c5b84>] ? load_balance+0x194/0x900
  [<ffffffff810ad5ca>] ? update_rq_clock.part.83+0x1a/0xe0
  [<ffffffff810c6d42>] ? rebalance_domains+0x152/0x290
  [<ffffffff810c6f5c>] ? run_rebalance_domains+0xdc/0x1d0
  [<ffffffff8108a75b>] ? __do_softirq+0xfb/0x320
  [<ffffffff8108ac85>] ? irq_exit+0x125/0x130
  [<ffffffff810b3a17>] ? scheduler_ipi+0x97/0x160
  [<ffffffff81052709>] ? smp_reschedule_interrupt+0x29/0x30
  [<ffffffff8173a1be>] ? reschedule_interrupt+0x6e/0x80
   <EOI> [<ffffffff815bc83c>] ? cpuidle_enter_state+0xcc/0x230
  [<ffffffff815bc80c>] ? cpuidle_enter_state+0x9c/0x230
  [<ffffffff815bc9d7>] ? cpuidle_enter+0x17/0x20
  [<ffffffff810cd6dc>] ? cpu_startup_entry+0x38c/0x420
  [<ffffffff81053373>] ? start_secondary+0x173/0x1e0

Because divide-by-zero error happens in function:

update_group_capacity()
  update_cpu_capacity()
    scale_rt_capacity()
     {
          ...
          total = sched_avg_period() + delta;
          used = div_u64(avg, total);
          ...
     }

To fix this issue, check user input value of sysctl_sched_time_avg, keep
it unchanged when hitting invalid input, and set the minimum limit of
sysctl_sched_time_avg to 1 ms.

Reported-by: James Puthukattukaran <james.puthukattukaran@oracle.com>
Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: efault@gmx.de
Cc: ethan.kernel@gmail.com
Cc: keescook@chromium.org
Cc: mcgrof@kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1504504774-18253-1-git-send-email-ethan.zhao@oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agox86/asm: Fix inline asm call constraints for GCC 4.4
Josh Poimboeuf [Thu, 28 Sep 2017 21:58:26 +0000 (16:58 -0500)]
x86/asm: Fix inline asm call constraints for GCC 4.4

The kernel test bot (run by Xiaolong Ye) reported that the following commit:

  f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")

is causing double faults in a kernel compiled with GCC 4.4.

Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:

  register unsigned int __asm_call_sp asm("esp");
  #define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)

Even on a 64-bit kernel, it's using ESP instead of RSP.  That causes GCC
to produce the following bogus code:

  ffffffff8147461d:       89 e0                   mov    %esp,%eax
  ffffffff8147461f:       4c 89 f7                mov    %r14,%rdi
  ffffffff81474622:       4c 89 fe                mov    %r15,%rsi
  ffffffff81474625:       ba 20 00 00 00          mov    $0x20,%edx
  ffffffff8147462a:       89 c4                   mov    %eax,%esp
  ffffffff8147462c:       e8 bf 52 05 00          callq  ffffffff814c98f0 <copy_user_generic_unrolled>

Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer.  The upper 32 bits
are getting cleared out, corrupting the stack pointer.

So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.

This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.

Reported-and-Bisected-and-Tested-by: kernel test robot <xiaolong.ye@intel.com>
Diagnosed-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: LKP <lkp@01.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/debug: Add explicit TASK_PARKED printing
Peter Zijlstra [Fri, 22 Sep 2017 16:37:28 +0000 (18:37 +0200)]
sched/debug: Add explicit TASK_PARKED printing

Currently TASK_PARKED is masqueraded as TASK_INTERRUPTIBLE, give it
its own print state because it will not in fact get woken by regular
wakeups and is a long-term state.

This requires moving TASK_PARKED into the TASK_REPORT mask, and since
that latter needs to be a contiguous bitmask, we need to shuffle the
bits around a bit.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/debug: Ignore TASK_IDLE for SysRq-W
Peter Zijlstra [Fri, 22 Sep 2017 16:32:41 +0000 (18:32 +0200)]
sched/debug: Ignore TASK_IDLE for SysRq-W

Markus reported that tasks in TASK_IDLE state are reported by SysRq-W,
which results in undesirable clutter.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/debug: Add explicit TASK_IDLE printing
Peter Zijlstra [Fri, 22 Sep 2017 16:30:40 +0000 (18:30 +0200)]
sched/debug: Add explicit TASK_IDLE printing

Markus reported that kthreads that idle using TASK_IDLE instead of
TASK_INTERRUPTIBLE are reported in as TASK_UNINTERRUPTIBLE and things
like htop mark those red.

This is undesirable, so add an explicit state for TASK_IDLE.

Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/tracing: Use common task-state helpers
Peter Zijlstra [Fri, 22 Sep 2017 16:23:31 +0000 (18:23 +0200)]
sched/tracing: Use common task-state helpers

Remove yet another task-state char instance.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agolocking/rwsem-xadd: Fix missed wakeup due to reordering of load
Prateek Sood [Thu, 7 Sep 2017 14:30:58 +0000 (20:00 +0530)]
locking/rwsem-xadd: Fix missed wakeup due to reordering of load

If a spinner is present, there is a chance that the load of
rwsem_has_spinner() in rwsem_wake() can be reordered with
respect to decrement of rwsem count in __up_write() leading
to wakeup being missed:

 spinning writer                  up_write caller
 ---------------                  -----------------------
 [S] osq_unlock()                 [L] osq
  spin_lock(wait_lock)
  sem->count=0xFFFFFFFF00000001
            +0xFFFFFFFF00000000
  count=sem->count
  MB
                                   sem->count=0xFFFFFFFE00000001
                                             -0xFFFFFFFF00000001
                                   spin_trylock(wait_lock)
                                   return
 rwsem_try_write_lock(count)
 spin_unlock(wait_lock)
 schedule()

Reordering of atomic_long_sub_return_release() in __up_write()
and rwsem_has_spinner() in rwsem_wake() can cause missing of
wakeup in up_write() context. In spinning writer, sem->count
and local variable count is 0XFFFFFFFE00000001. It would result
in rwsem_try_write_lock() failing to acquire rwsem and spinning
writer going to sleep in rwsem_down_write_failed().

The smp_rmb() will make sure that the spinner state is
consulted after sem->count is updated in up_write context.

Signed-off-by: Prateek Sood <prsood@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: longman@redhat.com
Cc: parri.andrea@gmail.com
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/tracing: Fix trace_sched_switch task-state printing
Peter Zijlstra [Fri, 22 Sep 2017 16:19:53 +0000 (18:19 +0200)]
sched/tracing: Fix trace_sched_switch task-state printing

Convert trace_sched_switch to use the common task-state helpers and
fix the "X" and "Z" order, possibly they ended up in the wrong order
because TASK_REPORT has them in the wrong order too.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/debug: Remove unused variable
Peter Zijlstra [Fri, 22 Sep 2017 16:14:08 +0000 (18:14 +0200)]
sched/debug: Remove unused variable

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/debug: Convert TASK_state to hex
Peter Zijlstra [Fri, 22 Sep 2017 16:13:36 +0000 (18:13 +0200)]
sched/debug: Convert TASK_state to hex

Bit patterns are easier in hex.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agosched/debug: Implement consistent task-state printing
Peter Zijlstra [Fri, 22 Sep 2017 16:09:26 +0000 (18:09 +0200)]
sched/debug: Implement consistent task-state printing

Currently get_task_state() and task_state_to_char() report different
states, create a number of common helpers and unify the reported state
space.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoum/time: Fixup namespace collision
Thomas Gleixner [Fri, 29 Sep 2017 08:07:44 +0000 (10:07 +0200)]
um/time: Fixup namespace collision

The new timer_setup() function for struct timer_list collides with a
private um function. Rename it.

Fixes: 686fef928bba ("timer: Prepare to change timer callback argument type")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: Kees Cook <keescook@chromium.org>
7 years agoperf/aux: Only update ->aux_wakeup in non-overwrite mode
Alexander Shishkin [Wed, 6 Sep 2017 16:08:11 +0000 (19:08 +0300)]
perf/aux: Only update ->aux_wakeup in non-overwrite mode

The following commit:

  d9a50b0256 ("perf/aux: Ensure aux_wakeup represents most recent wakeup index")

changed the AUX wakeup position calculation to rounddown(), which causes
a division-by-zero in AUX overwrite mode (aka "snapshot mode").

The zero denominator results from the fact that perf record doesn't set
aux_watermark to anything, in which case the kernel will set it to half
the AUX buffer size, but only for non-overwrite mode. In the overwrite
mode aux_watermark stays zero.

The good news is that, AUX overwrite mode, wakeups don't happen and
related bookkeeping is not relevant, so we can simply forego the whole
wakeup updates.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20170906160811.16510-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoMerge tag 'drm-misc-fixes-2017-09-28-1' of git://anongit.freedesktop.org/git/drm...
Dave Airlie [Fri, 29 Sep 2017 07:11:04 +0000 (17:11 +1000)]
Merge tag 'drm-misc-fixes-2017-09-28-1' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes

Driver Changes:
- qxl: fix primary surface and fb unpinning (Gerd)
- sun41: fix CEC_PIN config gate now that media has been merged (Hans)
- tegra: fix TRACE_INCLUDE_PATH (Thierry)

Cc: Thierry Reding <treding@nvidia.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Gerd Hoffmann <kraxel@redhat.com>
* tag 'drm-misc-fixes-2017-09-28-1' of git://anongit.freedesktop.org/git/drm-misc:
  drm/tegra: trace: Fix path to include
  qxl: fix framebuffer unpinning
  drm/sun4i: cec: Enable back CEC-pin framework
  qxl: fix primary surface handling

7 years agoMerge tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 28 Sep 2017 21:38:30 +0000 (14:38 -0700)]
Merge tag 'acpi-4.14-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "This fixes an APEI problem that may cause a reported error to be
  missed due to a race condition"

* tag 'acpi-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / APEI: clear error status before acknowledging the error

7 years agoMerge tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 28 Sep 2017 21:35:51 +0000 (14:35 -0700)]
Merge tag 'pm-4.14-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a deadlock in the operating performance points (OPP)
  framework introduced during the 4.11 cycle, more issues with duplicate
  device objects for cpufreq-dt and cpufreq documentation.

  Specifics:

   - Fix a deadlock in the operating performance points (OPP) framework
     caused by a notifier callback taking a lock that's already held by
     its caller (Viresh Kumar).

   - Prevent the ti-cpufreq and cpufreq-dt-platdev drivers from
     attempting to register conflicting device objects which triggers a
     warning from sysfs (Suniel Mahesh).

   - Drop a stale reference to a piece of intel_pstate documentation
     that's not in the tree any more (Rafael Wysocki)"

* tag 'pm-4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: docs: Drop intel-pstate.txt from index.txt
  cpufreq: dt: Fix sysfs duplicate filename creation for platform-device
  PM / OPP: Call notifier without holding opp_table->lock

7 years agoMerge tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Thu, 28 Sep 2017 20:27:23 +0000 (13:27 -0700)]
Merge tag 'xfs-4.14-fixes-2' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:

 - fix various problems with the copy-on-write extent maps getting freed
   at the wrong time

 - fix printk format specifier problems

 - report zeroing operation outcomes instead of dropping them on the
   floor

 - fix some crashes when dio operations partially fail

 - fix a race condition between unwritten extent conversion & dio read

 - fix some incorrect tests in the inode log item processing

 - correct the delayed allocation space reservations on rmap filesystems

 - fix some problems checking for dax support

* tag 'xfs-4.14-fixes-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: revert "xfs: factor rmap btree size into the indlen calculations"
  xfs: Capture state of the right inode in xfs_iflush_done
  xfs: perag initialization should only touch m_ag_max_usable for AG 0
  xfs: update i_size after unwritten conversion in dio completion
  iomap_dio_rw: Allocate AIO completion queue before submitting dio
  xfs: validate bdev support for DAX inode flag
  xfs: remove redundant re-initialization of total_nr_pages
  xfs: Output warning message when discard option was enabled even though the device does not support discard
  xfs: report zeroed or not correctly in xfs_zero_range()
  xfs: kill meaningless variable 'zero'
  fs/xfs: Use %pS printk format for direct addresses
  xfs: evict CoW fork extents when performing finsert/fcollapse
  xfs: don't unconditionally clear the reflink flag on zero-block files

7 years agoRevert "Bluetooth: Add option for disabling legacy ioctl interfaces"
Linus Torvalds [Thu, 28 Sep 2017 20:20:32 +0000 (13:20 -0700)]
Revert "Bluetooth: Add option for disabling legacy ioctl interfaces"

This reverts commit dbbccdc4ced015cdd4051299bd87fbe0254ad351.

It turns out that the "legacy" users aren't so legacy at all, and that
turning off the legacy ioctl will break the current Qt bluetooth stack
for bluetooth LE devices that were released just a couple of months ago.

So it's simply not true that this was a legacy interface that hasn't
been needed and is only limited to old legacy BT devices.  Because I
actually read Kconfig help messages, and actively try to turn off
features that I don't need, I turned the option off.

Then I spent _way_ too much time debugging BLE issues until I realized
that it wasn't the Qt and subsurface development that had broken one of
my dive computer BLE downloads, but simply my broken kernel config.

Maybe in a decade it will be true that this is a legacy interface.  And
maybe with a better help-text and correct dependencies, this kind of
legacy removal might be acceptable.  But as things are right now both
the commit message and the Kconfig help text were misleading, and the
Kconfig option had the wrong dependenencies.

There's no reason to keep that broken Kconfig option in the tree.

Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge branch 'acpi-apei'
Rafael J. Wysocki [Thu, 28 Sep 2017 20:18:15 +0000 (22:18 +0200)]
Merge branch 'acpi-apei'

* acpi-apei:
  ACPI / APEI: clear error status before acknowledging the error

7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Thu, 28 Sep 2017 19:12:51 +0000 (12:12 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 "Second -rc update for 4.14.

  Both Mellanox and Intel had a series of -rc fixes that landed this
  week. The Mellanox bunch is spread throughout the stack and not just
  in their driver, where as the Intel bunch was mostly in the hfi1
  driver. And, several of the fixes in the hfi1 driver were more than
  just simple 5 line fixes. As a result, the hfi1 driver fixes has a
  sizable LOC count.

  Everything else is as one would expect in an RC cycle in terms of LOC
  count. One item that might jump out and make you think "That's not an
  rc item" is the fix that corrects a typo. But, that change fixes a
  typo in a user visible API that was just added in this merge window,
  so if we fix it now, we can fix it. If we don't, the typo is in the
  API forever. Another that might not appear to be a fix at first glance
  is the Simplify mlx5_ib_cont_pages patch, but the simplification
  allows them to fix a bug in the existing function whenever the length
  of an SGE exceeded page size. We also had to revert one patch from the
  merge window that was wrong.

  Summary:

   - a few core fixes
   - a few ipoib fixes
   - a few mlx5 fixes
   - a 7-patch hfi1 related series"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
  IB/hfi1: On error, fix use after free during user context setup
  Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
  IB/hfi1: Return correct value in general interrupt handler
  IB/hfi1: Check eeprom config partition validity
  IB/hfi1: Only reset QSFP after link up and turn off AOC TX
  IB/hfi1: Turn off AOC TX after offline substates
  IB/mlx5: Fix NULL deference on mlx5_ib_update_xlt failure
  IB/mlx5: Simplify mlx5_ib_cont_pages
  IB/ipoib: Fix inconsistency with free_netdev and free_rdma_netdev
  IB/ipoib: Fix sysfs Pkey create<->remove possible deadlock
  IB: Correct MR length field to be 64-bit
  IB/core: Fix qp_sec use after free access
  IB/core: Fix typo in the name of the tag-matching cap struct

7 years agoMerge branches 'pm-opp' and 'pm-cpufreq'
Rafael J. Wysocki [Thu, 28 Sep 2017 19:08:07 +0000 (21:08 +0200)]
Merge branches 'pm-opp' and 'pm-cpufreq'

* pm-opp:
  PM / OPP: Call notifier without holding opp_table->lock

* pm-cpufreq:
  cpufreq: docs: Drop intel-pstate.txt from index.txt
  cpufreq: dt: Fix sysfs duplicate filename creation for platform-device

7 years agoMerge tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Thu, 28 Sep 2017 18:20:52 +0000 (11:20 -0700)]
Merge tag 'seccomp-v4.14-rc3' of git://git./linux/kernel/git/kees/linux

Pull seccomp fix from Kees Cook:
 "Fix refcounting bug in CRIU interface, noticed by Chris Salls (Oleg &
  Tycho)"

* tag 'seccomp-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()

7 years agoperf test: Fix vmlinux failure on s390x part 2
Thomas Richter [Fri, 15 Sep 2017 07:14:04 +0000 (09:14 +0200)]
perf test: Fix vmlinux failure on s390x part 2

On s390x perf test 1 failed. It turned out that commit cf6383f73cf2
("perf report: Fix kernel symbol adjustment for s390x") was incorrect.

The previous implementation in dso__load_sym() is also suitable for
s390x.

Therefore this patch undoes commit cf6383f73cf2

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Fixes: cf6383f73cf2 ("perf report: Fix kernel symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-v101o8k25vuja2ogosgf15yy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf test: Fix vmlinux failure on s390x
Thomas Richter [Fri, 15 Sep 2017 07:14:03 +0000 (09:14 +0200)]
perf test: Fix vmlinux failure on s390x

On s390x perf test 1 failed. It turned out that commit 4a084ecfc821
("perf report: Fix module symbol adjustment for s390x") was incorrect.
The previous implementation in dso__load_sym() is also suitable for
s390x.

Therefore this patch undoes commit 4a084ecfc821.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Fixes: 4a084ecfc821 ("perf report: Fix module symbol adjustment for s390x")
LPU-Reference: 20170915071404.58398-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-5ani7ly57zji7s0hmzkx416l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoKVM: VMX: use cmpxchg64
Paolo Bonzini [Thu, 28 Sep 2017 15:58:41 +0000 (17:58 +0200)]
KVM: VMX: use cmpxchg64

This fixes a compilation failure on 32-bit systems.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
7 years agotimer: Prepare to change timer callback argument type
Kees Cook [Thu, 28 Sep 2017 13:38:17 +0000 (06:38 -0700)]
timer: Prepare to change timer callback argument type

Modern kernel callback systems pass the structure associated with a
given callback to the callback function. The timer callback remains one
of the legacy cases where an arbitrary unsigned long argument continues
to be passed as the callback argument. This has several problems:

- This bloats the timer_list structure with a normally redundant
  .data field.

- No type checking is being performed, forcing callbacks to do
  explicit type casts of the unsigned long argument into the object
  that was passed, rather than using container_of(), as done in most
  of the other callback infrastructure.

- Neighboring buffer overflows can overwrite both the .function and
  the .data field, providing attackers with a way to elevate from a buffer
  overflow into a simplistic ROP-like mechanism that allows calling
  arbitrary functions with a controlled first argument.

- For future Control Flow Integrity work, this creates a unique function
  prototype for timer callbacks, instead of allowing them to continue to
  be clustered with other void functions that take a single unsigned long
  argument.

This adds a new timer initialization API, which will ultimately replace
the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
named timer_setup() (to mirror hrtimer_setup(), making instances of its
use much easier to grep for).

In order to support the migration of existing timers into the new
callback arguments, timer_setup() casts its arguments to the existing
legacy types, and explicitly passes the timer pointer as the legacy
data argument. Once all setup_*timer() callers have been replaced with
timer_setup(), the casts can be removed, and the data argument can be
dropped with the timer expiration code changed to just pass the timer
to the callback directly.

Since the regular pattern of using container_of() during local variable
declaration repeats the need for the variable type declaration
to be included, this adds a helper modeled after other from_*()
helpers that wrap container_of(), named from_timer(). This helper uses
typeof(*variable), removing the type redundancy and minimizing the need
for line wraps in forthcoming conversions from "unsigned data long" to
"struct timer_list *" in the timer callbacks:

-void callback(unsigned long data)
+void callback(struct timer_list *t)
{
-   struct some_data_structure *local = (struct some_data_structure *)data;
+   struct some_data_structure *local = from_timer(local, t, timer);

Finally, in order to support the handful of timer users that perform
open-coded assignments of the .function (and .data) fields, provide
cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
temporarily. Once conversion has been completed, these can be globally
trivially removed.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
7 years agoxen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping
Zhenzhong Duan [Wed, 27 Sep 2017 09:41:25 +0000 (02:41 -0700)]
xen/mmu: Call xen_cleanhighmap() with 4MB aligned for page tables mapping

When bootup a PVM guest with large memory(Ex.240GB), XEN provided initial
mapping overlaps with kernel module virtual space. When mapping in this space
is cleared by xen_cleanhighmap(), in certain case there could be an 2MB mapping
left. This is due to XEN initialize 4MB aligned mapping but xen_cleanhighmap()
finish at 2MB boundary.

When module loading is just on top of the 2MB space, got below warning:

WARNING: at mm/vmalloc.c:106 vmap_pte_range+0x14e/0x190()
Call Trace:
 [<ffffffff81117083>] warn_alloc_failed+0xf3/0x160
 [<ffffffff81146022>] __vmalloc_area_node+0x182/0x1c0
 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
 [<ffffffff81145df7>] __vmalloc_node_range+0xa7/0x110
 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
 [<ffffffff8103ca54>] module_alloc+0x64/0x70
 [<ffffffff810ac91e>] ? module_alloc_update_bounds+0x1e/0x80
 [<ffffffff810ac91e>] module_alloc_update_bounds+0x1e/0x80
 [<ffffffff810ac9a7>] move_module+0x27/0x150
 [<ffffffff810aefa0>] layout_and_allocate+0x120/0x1b0
 [<ffffffff810af0a8>] load_module+0x78/0x640
 [<ffffffff811ff90b>] ? security_file_permission+0x8b/0x90
 [<ffffffff810af6d2>] sys_init_module+0x62/0x1e0
 [<ffffffff815154c2>] system_call_fastpath+0x16/0x1b

Then the mapping of 2MB is cleared, finally oops when the page in that space is
accessed.

BUG: unable to handle kernel paging request at ffff880022600000
IP: [<ffffffff81260877>] clear_page_c_e+0x7/0x10
PGD 1788067 PUD 178c067 PMD 22434067 PTE 0
Oops: 0002 [#1] SMP
Call Trace:
 [<ffffffff81116ef7>] ? prep_new_page+0x127/0x1c0
 [<ffffffff81117d42>] get_page_from_freelist+0x1e2/0x550
 [<ffffffff81133010>] ? ii_iovec_copy_to_user+0x90/0x140
 [<ffffffff81119c9d>] __alloc_pages_nodemask+0x12d/0x230
 [<ffffffff81155516>] alloc_pages_vma+0xc6/0x1a0
 [<ffffffff81006ffd>] ? pte_mfn_to_pfn+0x7d/0x100
 [<ffffffff81134cfb>] do_anonymous_page+0x16b/0x350
 [<ffffffff81139c34>] handle_pte_fault+0x1e4/0x200
 [<ffffffff8100712e>] ? xen_pmd_val+0xe/0x10
 [<ffffffff810052c9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
 [<ffffffff81139dab>] handle_mm_fault+0x15b/0x270
 [<ffffffff81510c10>] do_page_fault+0x140/0x470
 [<ffffffff8150d7d5>] page_fault+0x25/0x30

Call xen_cleanhighmap() with 4MB aligned for page tables mapping to fix it.
The unnecessory call of xen_cleanhighmap() in DEBUG mode is also removed.

-v2: add comment about XEN alignment from Juergen.

References: https://lists.xen.org/archives/html/xen-devel/2012-07/msg01562.html
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
[boris: added 'xen/mmu' tag to commit subject]
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years agoxen-pciback: relax BAR sizing write value check
Jan Beulich [Mon, 25 Sep 2017 08:01:01 +0000 (02:01 -0600)]
xen-pciback: relax BAR sizing write value check

Just like done in d2bd05d88d ("xen-pciback: return proper values during
BAR sizing") for the ROM BAR, ordinary ones also shouldn't compare the
written value directly against ~0, but consider the r/o bits at the
bottom (if any).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years agoirq/generic-chip: Don't replace domain's name
Jeffy Chen [Thu, 28 Sep 2017 04:37:31 +0000 (12:37 +0800)]
irq/generic-chip: Don't replace domain's name

When generic irq chips are allocated for an irq domain the domain name is
set to the irq chip name. That was done to have named domains before the
recent changes which enforce domain naming were done.

Since then the overwrite causes a memory leak when the domain name is
dynamically allocated and even worse it would cause the domain free code to
free the wrong name pointer, which might point to a constant.

Remove the name assignment to prevent this.

Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only")
Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20170928043731.4764-1-jeffy.chen@rock-chips.com
7 years agoseccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()
Oleg Nesterov [Wed, 27 Sep 2017 15:25:30 +0000 (09:25 -0600)]
seccomp: fix the usage of get/put_seccomp_filter() in seccomp_get_filter()

As Chris explains, get_seccomp_filter() and put_seccomp_filter() can end
up using different filters. Once we drop ->siglock it is possible for
task->seccomp.filter to have been replaced by SECCOMP_FILTER_FLAG_TSYNC.

Fixes: f8e529ed941b ("seccomp, ptrace: add support for dumping seccomp filters")
Reported-by: Chris Salls <chrissalls5@gmail.com>
Cc: stable@vger.kernel.org # needs s/refcount_/atomic_/ for v4.12 and earlier
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
[tycho: add __get_seccomp_filter vs. open coding refcount_inc()]
Signed-off-by: Tycho Andersen <tycho@docker.com>
[kees: tweak commit log]
Signed-off-by: Kees Cook <keescook@chromium.org>
7 years agoobjtool: Support unoptimized frame pointer setup
Josh Poimboeuf [Wed, 27 Sep 2017 15:36:38 +0000 (10:36 -0500)]
objtool: Support unoptimized frame pointer setup

Arnd Bergmann reported a bunch of warnings like:

  crypto/jitterentropy.o: warning: objtool: jent_fold_time()+0x3b: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_stuck()+0x1d: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_unbiased_bit()+0x15: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_read_entropy()+0x32: call without frame pointer save/setup
  crypto/jitterentropy.o: warning: objtool: jent_entropy_collector_free()+0x19: call without frame pointer save/setup

and

  arch/x86/events/core.o: warning: objtool: collect_events uses BP as a scratch register
  arch/x86/events/core.o: warning: objtool: events_ht_sysfs_show()+0x22: call without frame pointer save/setup

With certain rare configurations, GCC sometimes sets up the frame
pointer with:

  lea    (%rsp),%rbp

instead of:

  mov    %rsp,%rbp

The instructions are equivalent, so treat the former like the latter.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a468af8b28a69b83fffc6d7668be9b6fcc873699.1506526584.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoobjtool: Skip unreachable warnings for GCC 4.4 and older
Josh Poimboeuf [Wed, 27 Sep 2017 22:34:23 +0000 (17:34 -0500)]
objtool: Skip unreachable warnings for GCC 4.4 and older

The kbuild bot occasionally reports warnings like:

  drivers/scsi/pcmcia/aha152x_core.o: warning: objtool: seldo_run()+0x130: unreachable instruction

These warnings are always with GCC 4.4.  That version of GCC sometimes
places unreachable instructions after calls to noreturn functions.

The unreachable warnings aren't very important anyway.  Just ignore them
for old versions of GCC.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bc89b807d965b98ec18a0bb94f96a594bd58f2f2.1506551639.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agomd/raid5: cap worker count
Shaohua Li [Thu, 21 Sep 2017 18:03:52 +0000 (11:03 -0700)]
md/raid5: cap worker count

static checker reports a potential integer overflow. Cap the worker count to
avoid the overflow.

Reported:-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shaohua Li <shli@fb.com>
7 years agodm-raid: fix a race condition in request handling
Shaohua Li [Thu, 21 Sep 2017 17:29:22 +0000 (10:29 -0700)]
dm-raid: fix a race condition in request handling

raid_map calls pers->make_request, which missed the suspend check. Fix it with
the new md_handle_request API.

Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: Heinz Mauelshagen <heinzm@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
7 years agomd: fix a race condition for flush request handling
Shaohua Li [Thu, 21 Sep 2017 16:55:28 +0000 (09:55 -0700)]
md: fix a race condition for flush request handling

md_submit_flush_data calls pers->make_request, which missed the suspend check.
Fix it with the new md_handle_request API.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Tested-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
7 years agomd: separate request handling
Shaohua Li [Thu, 21 Sep 2017 17:23:35 +0000 (10:23 -0700)]
md: separate request handling

With commit cc27b0c78c79, pers->make_request could bail out without handling
the bio. If that happens, we should retry.  The commit fixes md_make_request
but not other call sites. Separate the request handling part, so other call
sites can use it.

Reported-by: Nate Dailey <nate.dailey@stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable@vger.kernel.org
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
7 years agoscsi: ILLEGAL REQUEST + ASC==27 => target failure
Martin Wilck [Wed, 27 Sep 2017 12:44:19 +0000 (14:44 +0200)]
scsi: ILLEGAL REQUEST + ASC==27 => target failure

ASC 0x27 is "WRITE PROTECTED". This error code is returned e.g.  by
Fujitsu ETERNUS systems under certain conditions for WRITE SAME 16
commands with UNMAP bit set. It should not be treated as a path
error. In general, it makes sense to assume that being write protected
is a target rather than a path property.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
7 years agoscsi: aacraid: Add a small delay after IOP reset
Guilherme G. Piccoli [Tue, 19 Sep 2017 15:11:55 +0000 (12:11 -0300)]
scsi: aacraid: Add a small delay after IOP reset

Commit 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset
status") changed the way driver checks if a reset succeeded. Now, after an
IOP reset, aacraid immediately start polling a register to verify the reset
is complete.

This behavior cause regressions on the reset path in PowerPC (at least).
Since the delay after the IOP reset was removed by the aforementioned patch,
the fact driver just starts to read a register instantly after the reset
was issued (by writing in another register) "corrupts" the reset procedure,
which ends up failing all the time.

The issue highly impacted kdump on PowerPC, since on kdump path we
proactively issue a reset in adapter (through the reset_devices kernel
parameter).

This patch (re-)adds a delay right after IOP reset is issued. Empirically
we measured that 3 seconds is enough, but for safety reasons we delay
for 5s (and since it was 30s before, 5s is still a small amount).

For reference, without this patch we observe the following messages
on kdump kernel boot process:

  [ 76.294] aacraid 0003:01:00.0: IOP reset failed
  [ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed
  [ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff.
  [ 86.524] aacraid 0003:01:00.0: Controller reset type is 3
  [ 86.524] aacraid 0003:01:00.0: Issuing IOP reset
  [146.534] aacraid 0003:01:00.0: IOP reset failed
  [146.534] aacraid 0003:01:00.0: ARC Reset attempt failed

Fixes: 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
7 years agocpufreq: docs: Drop intel-pstate.txt from index.txt
Rafael J. Wysocki [Thu, 28 Sep 2017 00:08:43 +0000 (02:08 +0200)]
cpufreq: docs: Drop intel-pstate.txt from index.txt

Commit 33fc30b47098 (cpufreq: intel_pstate: Document the current
behavior and user interface) dropped the intel-pstate.txt file
from Documentation/cpu-freq/, but it did not update the index.txt
file in there accordingly, so do that now.

Fixes: 33fc30b47098 (cpufreq: intel_pstate: Document the current behavior and user interface)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoMerge commit 'keys-fixes-20170927' into fixes-v4.14-rc3
James Morris [Wed, 27 Sep 2017 23:11:28 +0000 (09:11 +1000)]
Merge commit 'keys-fixes-20170927' into fixes-v4.14-rc3

From David Howells:

"There are two sets of patches here:
 (1) A bunch of core keyrings bug fixes from Eric Biggers.

 (2) Fixing big_key to use safe crypto from Jason A. Donenfeld."

7 years agoACPI / APEI: clear error status before acknowledging the error
Tyler Baicar [Mon, 28 Aug 2017 16:53:41 +0000 (10:53 -0600)]
ACPI / APEI: clear error status before acknowledging the error

Currently we acknowledge errors before clearing the error status.
This could cause a new error to be populated by firmware in-between
the error acknowledgment and the error status clearing which would
cause the second error's status to be cleared without being handled.
So, clear the error status before acknowledging the errors.

Also, make sure to acknowledge the error if the error status read
fails.

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoMerge branch 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Wed, 27 Sep 2017 19:49:38 +0000 (05:49 +1000)]
Merge branch 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

A few fixes for 4.14.  Nothing too major.

* 'drm-fixes-4.14' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon: disable hard reset in hibernate for APUs
  drm/amdgpu: revert tile table update for oland

7 years agoMerge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm...
Dave Airlie [Wed, 27 Sep 2017 19:48:53 +0000 (05:48 +1000)]
Merge branch 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux into drm-fixes

Just two small etnaviv fixes, one fixing a list corruption, the other
fixing a NULL ptr deref in an error path.

* 'etnaviv/fixes' of https://git.pengutronix.de/git/lst/linux:
  etnaviv: fix gem object list corruption
  etnaviv: fix submit error path

7 years agoMerge tag 'drm-amdkfd-fixes-2017-09-24' of git://people.freedesktop.org/~gabbayo...
Dave Airlie [Wed, 27 Sep 2017 19:47:39 +0000 (05:47 +1000)]
Merge tag 'drm-amdkfd-fixes-2017-09-24' of git://people.freedesktop.org/~gabbayo/linux into drm-fixes

It contains the following fixes:
- correct checking of return value
- send correct parameter to function (According to the parameter type)
- avoid spamming of dmesg log
- fix queue wrapping calculations

* tag 'drm-amdkfd-fixes-2017-09-24' of git://people.freedesktop.org/~gabbayo/linux:
  drm/amdkfd: Print event limit messages only once per process
  drm/amdkfd: Fix kernel-queue wrapping bugs
  drm/amdkfd: Fix incorrect destroy_mqd parameter
  drm/amdkfd: check for null dev to avoid a null pointer dereference

7 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Wed, 27 Sep 2017 19:22:12 +0000 (12:22 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull quota and isofs fixes from Jan Kara:
 "Two quota fixes (fallout of the quota locking changes) and an isofs
  build fix"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  quota: Fix quota corruption with generic/232 test
  isofs: fix build regression
  quota: add missing lock into __dquot_transfer()

7 years agoMerge tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Wed, 27 Sep 2017 17:51:08 +0000 (10:51 -0700)]
Merge tag 'linux-kselftest-4.14-rc3-fixes' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "This update consists of:

   - fixes to several existing tests

   - a test for regression introduced by b9470c27607b ("inet: kill
     smallest_size and smallest_port")

   - seccomp support for glibc 2.26 siginfo_t.h

   - fixes to kselftest framework and tests to run make O=dir use-case

   - fixes to silence unnecessary test output to de-clutter test results"

* tag 'linux-kselftest-4.14-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (28 commits)
  selftests: timers: set-timer-lat: Fix hang when testing unsupported alarms
  selftests: timers: set-timer-lat: fix hang when std out/err are redirected
  selftests/memfd: correct run_tests.sh permission
  selftests/seccomp: Support glibc 2.26 siginfo_t.h
  selftests: futex: Makefile: fix for loops in targets to run silently
  selftests: Makefile: fix for loops in targets to run silently
  selftests: mqueue: Use full path to run tests from Makefile
  selftests: futex: copy sub-dir test scripts for make O=dir run
  selftests: lib.mk: copy test scripts and test files for make O=dir run
  selftests: sync: kselftest and kselftest-clean fail for make O=dir case
  selftests: sync: use TEST_CUSTOM_PROGS instead of TEST_PROGS
  selftests: lib.mk: add TEST_CUSTOM_PROGS to allow custom test run/install
  selftests: watchdog: fix to use TEST_GEN_PROGS and remove clean
  selftests: lib.mk: fix test executable status check to use full path
  selftests: Makefile: clear LDFLAGS for make O=dir use-case
  selftests: lib.mk: kselftest and kselftest-clean fail for make O=dir case
  Makefile: kselftest and kselftest-clean fail for make O=dir case
  selftests/net: msg_zerocopy enable build with older kernel headers
  selftests: actually run the various net selftests
  selftest: add a reuseaddr test
  ...

7 years agomtd: nand: atmel: fix buffer overflow in atmel_pmecc_user
Richard Genoud [Wed, 27 Sep 2017 12:49:17 +0000 (14:49 +0200)]
mtd: nand: atmel: fix buffer overflow in atmel_pmecc_user

When calculating the size needed by struct atmel_pmecc_user *user,
the dmu and delta buffer sizes were forgotten.
This lead to a memory corruption (especially with a large ecc_strength).

Link: http://lkml.kernel.org/r/1506503157.3016.5.camel@gmail.com
Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: stable@vger.kernel.org
Reported-by: Richard Genoud <richard.genoud@gmail.com>
Pointed-at-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
7 years agoMerge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 27 Sep 2017 15:33:26 +0000 (08:33 -0700)]
Merge branch 'x86-fpu-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fpu fixes and cleanups from Ingo Molnar:
 "This is _way_ more cleanups than fixes, but the bugs were subtle and
  hard to hit, and the primary reason for them existing was the
  unnecessary historical complexity of some of the x86/fpu interfaces.

  The first bunch of commits clean up and simplify the xstate user copy
  handling functions, in reaction to the collective head-scratching
  about the xstate user-copy handling code that leads up to the fix for
  this SkyLake xstate handling bug:

     0852b374173b: x86/fpu: Add FPU state copying quirk to handle XRSTOR failure on Intel Skylake CPUs

  The cleanups don't change any functionality, they just (hopefully)
  make it all clearer, more consistent, more debuggable and more robust.

  Note that most of the linecount increase comes from these commits,
  where we better split the user/kernel copy logic by having more
  variants, instead repeated fragile patterns of:

               if (kbuf) {
                       memcpy(kbuf + pos, data, copy);
               } else {
                       if (__copy_to_user(ubuf + pos, data, copy))
                               return -EFAULT;
               }

  The next bunch of commits simplify the FPU state-machine to get rid of
  old lazy-FPU idiosyncrasies - a defensive simplification to make all
  the code easier to review and fix. No change in functionality.

  Then there's a couple of additional debugging tweaks: static checker
  warning fix and move an FPU related warning to under WARN_ON_FPU(),
  followed by another bunch of commits that represent a finegrained
  split-up of the fixes from Eric Biggers to handle weird xstate bits
  properly.

  I did this finegrained split-up because some of these fixes also
  impact the ABI for weird xstate handling, for which we'd like to have
  good bisection results, should they cause any problems. (We also had
  one regression with the more monolithic fixes, so splitting it all up
  sounded prudent for robustness reasons as well.)

  About the whole series: the commits up to 03eaec81ac09 have been in
  -next for months - but I've recently rebased them to remove a state
  machine clean-up commit that was objected to, and to make it more
  bisectable - so technically it's a new, rebased tree.

  Robustness history: this series had some regressions along the way,
  and all reported regressions have been fixed. All but one of the
  regressions manifested itself as easy to report warnings. The previous
  version of this latest series was also in linux-next, with one
  (warning-only) regression reported which is fixed in the latest
  version.

  Barring last minute brown paper bag bugs (and the commits are now
  older by a day which I'd hope helps paperbag reduction), I'm
  reasonably confident about its general robustness.

  Famous last words ..."

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/fpu: Use using_compacted_format() instead of open coded X86_FEATURE_XSAVES
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_user_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_user_to_xstate()
  x86/fpu: Copy the full header in copy_user_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in copy_kernel_to_xstate()
  x86/fpu: Eliminate the 'xfeatures' local variable in copy_kernel_to_xstate()
  x86/fpu: Copy the full state_header in copy_kernel_to_xstate()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in __fpu__restore_sig()
  x86/fpu: Use validate_xstate_header() to validate the xstate_header in xstateregs_set()
  x86/fpu: Introduce validate_xstate_header()
  x86/fpu: Rename fpu__activate_fpstate_read/write() to fpu__prepare_[read|write]()
  x86/fpu: Rename fpu__activate_curr() to fpu__initialize()
  x86/fpu: Simplify and speed up fpu__copy()
  x86/fpu: Fix stale comments about lazy FPU logic
  x86/fpu: Rename fpu::fpstate_active to fpu::initialized
  x86/fpu: Remove fpu__current_fpstate_write_begin/end()
  x86/fpu: Fix fpu__activate_fpstate_read() and update comments
  x86/fpu: Reinitialize FPU registers if restoring FPU state fails
  x86/fpu: Don't let userspace set bogus xcomp_bv
  x86/fpu: Turn WARN_ON() in context switch into WARN_ON_FPU()
  ...