Linus Torvalds [Mon, 2 Apr 2018 18:49:41 +0000 (11:49 -0700)]
Merge branch 'sched-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"The main scheduler changes in this cycle were:
- NUMA balancing improvements (Mel Gorman)
- Further load tracking improvements (Patrick Bellasi)
- Various NOHZ balancing cleanups and optimizations (Peter Zijlstra)
- Improve blocked load handling, in particular we can now reduce and
eventually stop periodic load updates on 'very idle' CPUs. (Vincent
Guittot)
- On isolated CPUs offload the final 1Hz scheduler tick as well, plus
related cleanups and reorganization. (Frederic Weisbecker)
- Core scheduler code cleanups (Ingo Molnar)"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
sched/core: Update preempt_notifier_key to modern API
sched/cpufreq: Rate limits for SCHED_DEADLINE
sched/fair: Update util_est only on util_avg updates
sched/cpufreq/schedutil: Use util_est for OPP selection
sched/fair: Use util_est in LB and WU paths
sched/fair: Add util_est on top of PELT
sched/core: Remove TASK_ALL
sched/completions: Use bool in try_wait_for_completion()
sched/fair: Update blocked load when newly idle
sched/fair: Move idle_balance()
sched/nohz: Merge CONFIG_NO_HZ_COMMON blocks
sched/fair: Move rebalance_domains()
sched/nohz: Optimize nohz_idle_balance()
sched/fair: Reduce the periodic update duration
sched/nohz: Stop NOHZ stats when decayed
sched/cpufreq: Provide migration hint
sched/nohz: Clean up nohz enter/exit
sched/fair: Update blocked load from NEWIDLE
sched/fair: Add NOHZ stats balancing
sched/fair: Restructure nohz_balance_kick()
...
Linus Torvalds [Mon, 2 Apr 2018 18:47:07 +0000 (11:47 -0700)]
Merge branch 'ras-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 RAS updates from Ingo Molnar:
"The main changes in this cycle were:
- AMD MCE support/decoding improvements (Yazen Ghannam)
- general MCE header cleanups and reorganization (Borislav Petkov)"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "x86/mce/AMD: Collect error info even if valid bits are not set"
x86/MCE: Cleanup and complete struct mce fields definitions
x86/mce/AMD: Carve out SMCA get_block_address() code
x86/mce/AMD: Get address from already initialized block
x86/mce/AMD, EDAC/mce_amd: Enumerate Reserved SMCA bank type
x86/mce/AMD: Pass the bank number to smca_get_bank_type()
x86/mce/AMD: Collect error info even if valid bits are not set
x86/mce: Issue the 'mcelog --ascii' message only on !AMD
x86/mce: Convert 'struct mca_config' bools to a bitfield
x86/mce: Put private structures and definitions into the internal header
Linus Torvalds [Mon, 2 Apr 2018 18:06:34 +0000 (11:06 -0700)]
Merge branch 'perf-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The main kernel side changes were:
- Modernize the kprobe and uprobe creation/destruction tooling ABIs:
The existing text based APIs (kprobe_events and uprobe_events in
tracefs), are naive, limited ABIs in that they require user-space
to clean up after themselves, which is both difficult and fragile
if the tool is buggy or exits unexpectedly. In other words they are
not really suited for modern, robust tooling.
So introduce a modern, file descriptor based ABI that does not have
these limitations: introduce the 'perf_kprobe' and 'perf_uprobe'
PMUs and extend the perf_event_open() syscall to create events with
a kprobe/uprobe attached to them. These [k,u]probe are associated
with this file descriptor, so they are not available in tracefs.
(Song Liu)
- Intel Cannon Lake CPU support (Harry Pan)
- Intel PT cleanups (Alexander Shishkin)
- Improve the performance of pinned/flexible event groups by using RB
trees (Alexey Budankov)
- Add PERF_EVENT_IOC_MODIFY_ATTRIBUTES which allows the modification
of hardware breakpoints, which new ABI variant massively speeds up
existing tooling that uses hardware breakpoints to instrument (and
debug) memory usage.
(Milind Chabbi, Jiri Olsa)
- Various Intel PEBS handling fixes and improvements, and other Intel
PMU improvements (Kan Liang)
- Various perf core improvements and optimizations (Peter Zijlstra)
- ... misc cleanups, fixes and updates.
There's over 200 tooling commits, here's an (imperfect) list of
highlights:
- 'perf annotate' improvements:
* Recognize and handle jumps to other functions as calls, which
improves the navigation along jumps and back. (Arnaldo Carvalho
de Melo)
* Add the 'P' hotkey in TUI annotation to dump annotation output
into a file, to ease e-mail reporting of annotation details.
(Arnaldo Carvalho de Melo)
* Add an IPC/cycles column to the TUI (Jin Yao)
* Improve s390 assembly annotation (Thomas Richter)
* Refactor the output formatting logic to better separate it into
interactive and non-interactive features and add the --stdio2
output variant to demonstrate this. (Arnaldo Carvalho de Melo)
- 'perf script' improvements:
* Add Python 3 support (Jaroslav Škarvada)
* Add --show-round-event (Jiri Olsa)
- 'perf c2c' improvements:
* Add NUMA analysis support (Jiri Olsa)
- 'perf trace' improvements:
* Improve PowerPC support (Ravi Bangoria)
- 'perf inject' improvements:
* Integrate ARM CoreSight traces (Robert Walker)
- 'perf stat' improvements:
* Add the --interval-count option (yuzhoujian)
* Add the --timeout option (yuzhoujian)
- 'perf sched' improvements (Changbin Du)
- Vendor events improvements :
* Add IBM s390 vendor events (Thomas Richter)
* Add and improve arm64 vendor events (John Garry, Ganapatrao
Kulkarni)
* Update POWER9 vendor events (Sukadev Bhattiprolu)
- Intel PT tooling improvements (Adrian Hunter)
- PMU handling improvements (Agustin Vega-Frias)
- Record machine topology in perf.data (Jiri Olsa)
- Various overwrite related cleanups (Kan Liang)
- Add arm64 dwarf post unwind support (Kim Phillips, Jean Pihet)
- ... and lots of other changes, cleanups and fixes, see the shortlog
and Git history for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf/x86/intel: Enable C-state residency events for Cannon Lake
perf/x86/intel: Add Cannon Lake support for RAPL profiling
perf/x86/pt, coresight: Clean up address filter structure
perf vendor events s390: Add JSON files for IBM z14
perf vendor events s390: Add JSON files for IBM z13
perf vendor events s390: Add JSON files for IBM zEC12 zBC12
perf vendor events s390: Add JSON files for IBM z196
perf vendor events s390: Add JSON files for IBM z10EC z10BC
perf mmap: Be consistent when checking for an unmaped ring buffer
perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done()
perf build: Fix check-headers.sh opts assignment
perf/x86: Update rdpmc_always_available static key to the modern API
perf annotate: Use absolute addresses to calculate jump target offsets
perf annotate: Defer searching for comma in raw line till it is needed
perf annotate: Support jumping from one function to another
perf annotate: Add "_local" to jump/offset validation routines
perf python: Reference Py_None before returning it
perf annotate: Mark jumps to outher functions with the call arrow
perf annotate: Pass function descriptor to its instruction parsing routines
perf annotate: No need to calculate notes->start twice
...
Linus Torvalds [Mon, 2 Apr 2018 17:27:16 +0000 (10:27 -0700)]
Merge branch 'locking-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
"The main changes in the locking subsystem in this cycle were:
- Add the Linux Kernel Memory Consistency Model (LKMM) subsystem,
which is an an array of tools in tools/memory-model/ that formally
describe the Linux memory coherency model (a.k.a.
Documentation/memory-barriers.txt), and also produce 'litmus tests'
in form of kernel code which can be directly executed and tested.
Here's a high level background article about an earlier version of
this work on LWN.net:
https://lwn.net/Articles/718628/
The design principles:
"There is reason to believe that Documentation/memory-barriers.txt
could use some help, and a major purpose of this patch is to
provide that help in the form of a design-time tool that can
produce all valid executions of a small fragment of concurrent
Linux-kernel code, which is called a "litmus test". This tool's
functionality is roughly similar to a full state-space search.
Please note that this is a design-time tool, not useful for
regression testing. However, we hope that the underlying
Linux-kernel memory model will be incorporated into other tools
capable of analyzing large bodies of code for regression-testing
purposes."
[...]
"A second tool is klitmus7, which converts litmus tests to
loadable kernel modules for direct testing. As with herd7, the
klitmus7 code is freely available from
http://diy.inria.fr/sources/index.html
(and via "git" at https://github.com/herd/herdtools7)"
[...]
Credits go to:
"This patch was the result of a most excellent collaboration
founded by Jade Alglave and also including Alan Stern, Andrea
Parri, and Luc Maranget."
... and to the gents listed in the MAINTAINERS entry:
LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM)
M: Alan Stern <stern@rowland.harvard.edu>
M: Andrea Parri <parri.andrea@gmail.com>
M: Will Deacon <will.deacon@arm.com>
M: Peter Zijlstra <peterz@infradead.org>
M: Boqun Feng <boqun.feng@gmail.com>
M: Nicholas Piggin <npiggin@gmail.com>
M: David Howells <dhowells@redhat.com>
M: Jade Alglave <j.alglave@ucl.ac.uk>
M: Luc Maranget <luc.maranget@inria.fr>
M: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
The LKMM project already found several bugs in Linux locking
primitives and improved the understanding and the documentation of
the Linux memory model all around.
- Add KASAN instrumentation to atomic APIs (Dmitry Vyukov)
- Add RWSEM API debugging and reorganize the lock debugging Kconfig
(Waiman Long)
- ... misc cleanups and other smaller changes"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
locking/Kconfig: Restructure the lock debugging menu
locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable
locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches
lockdep: Make the lock debug output more useful
locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
locking/atomic, asm-generic, x86: Add comments for atomic instrumentation
locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations
locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h
locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h
locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants
tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference()
tools/memory-model: Add documentation of new litmus test
tools/memory-model: Remove mention of docker/gentoo image
locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more
locking/lockdep: Show unadorned pointers
mutex: Drop linkage.h from mutex.h
tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference
tools/memory-model: Convert underscores to hyphens
tools/memory-model: Add a S lock-based external-view litmus test
tools/memory-model: Add required herd7 version to README file
...
Linus Torvalds [Mon, 2 Apr 2018 16:59:09 +0000 (09:59 -0700)]
Merge branch 'core-rcu-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The main RCU subsystem changes in this cycle were:
- Miscellaneous fixes, perhaps most notably removing obsolete code
whose only purpose in life was to gather information for the
now-removed RCU debugfs facility. Other notable changes include
removing NO_HZ_FULL_ALL in favor of the nohz_full kernel boot
parameter, minor optimizations for expedited grace periods, some
added tracing, creating an RCU-specific workqueue using Tejun's new
WQ_MEM_RECLAIM flag, and several cleanups to code and comments.
- SRCU cleanups and optimizations.
- Torture-test updates, perhaps most notably the adding of ARMv8
support, but also including numerous cleanups and usability fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
rcu: Create RCU-specific workqueues with rescuers
torture: Provide more sensible nreader/nwriter defaults for rcuperf
torture: Grace periods do not piggyback off of themselves
torture: Adjust rcuperf trace processing to allow for workqueues
torture: Default jitter off when running rcuperf
torture: Specify qemu memory size with --memory argument
rcutorture: Add basic ARM64 support to run scripts
rcutorture: Update kvm.sh header comment
rcutorture: Record which grace-period primitives are tested
rcutorture: Re-enable testing of dynamic expediting
rcutorture: Avoid fake-writer use of undefined primitives
rcutorture: Abstract function and module names
rcutorture: Replace multi-instance kzalloc() with kcalloc()
rcu: Remove SRCU throttling
srcu: Remove dead code in srcu_gp_end()
srcu: Reduce scans of srcu_data in counter wrap check
srcu: Prevent sdp->srcu_gp_seq_needed_exp counter wrap
srcu: Abstract function name
rcu: Make expedited RCU CPU selection avoid unnecessary stores
rcu: Trace expedited GP delays due to transitioning CPUs
...
Linus Torvalds [Mon, 2 Apr 2018 16:27:29 +0000 (09:27 -0700)]
Merge branch 'core-headers-for-linus' of git://git./linux/kernel/git/tip/tip
Pull header file cleanup from Ingo Molnar:
"Reduce <linux/interrupt.h> dependencies: a single change that drops
two #includes from this frequently used kernel header"
* 'core-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
headers: Drop two #included headers from <linux/interrupt.h>
Linus Torvalds [Mon, 2 Apr 2018 16:14:41 +0000 (09:14 -0700)]
Merge branch 'core-debugobjects-for-linus' of git://git./linux/kernel/git/tip/tip
Pull debugobjects updates from Ingo Molnar:
"Misc improvements:
- add better instrumentation/debugging
- optimize the freeing logic improve performance"
* 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobjects: Avoid another unused variable warning
debugobjects: Fix debug_objects_freed accounting
debugobjects: Use global free list in __debug_check_no_obj_freed()
debugobjects: Use global free list in free_object()
debugobjects: Add global free list and the counter
debugobjects: Export max loops counter
Linus Torvalds [Mon, 2 Apr 2018 16:08:26 +0000 (09:08 -0700)]
Merge branch 'core-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull misc core updates from Ingo Molnar:
"Two changes:
- add membarriers to Documentation/features/
- fix a minor nit in panic printk formatting"
* 'core-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
panic: Add closing panic marker parenthesis
Documentation/features, membarriers: Document membarrier-sync-core architecture support
Documentation/features: Allow comments in arch features files
Linus Torvalds [Mon, 2 Apr 2018 14:59:23 +0000 (07:59 -0700)]
Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux
Pull drm updates from Dave Airlie:
"Cannonlake and Vega12 support are probably the two major things. This
pull lacks nouveau, Ben had some unforseen leave and a few other
blockers so we'll see how things look or maybe leave it for this merge
window.
core:
- Device links to handle sound/gpu pm dependency
- Color encoding/range properties
- Plane clipping into plane check helper
- Backlight helpers
- DP TP4 + HBR3 helper support
amdgpu:
- Vega12 support
- Enable DC by default on all supported GPUs
- Powerplay restructuring and cleanup
- DC bandwidth calc updates
- DC backlight on pre-DCE11
- TTM backing store dropping support
- SR-IOV fixes
- Adding "wattman" like functionality
- DC crc support
- Improved DC dual-link handling
amdkfd:
- GPUVM support for dGPU
- KFD events for dGPU
- Enable PCIe atomics for dGPUs
- HSA process eviction support
- Live-lock fixes for process eviction
- VM page table allocation fix for large-bar systems
panel:
- Raydium RM68200
- AUO G104SN02 V2
- KEO TX31D200VM0BAA
- ARM Versatile panels
i915:
- Cannonlake support enabled
- AUX-F port support added
- Icelake base enabling until internal milestone of forcewake support
- Query uAPI interface (used for GPU topology information currently)
- Compressed framebuffer support for sprites
- kmem cache shrinking when GPU is idle
- Avoid boosting GPU when waited item is being processed already
- Avoid retraining LSPCON link unnecessarily
- Decrease request signaling latency
- Deprecation of I915_SET_COLORKEY_NONE
- Kerneldoc and compiler warning cleanup for upcoming CI enforcements
- Full range ycbcr toggling
- HDCP support
i915/gvt:
- Big refactor for shadow ppgtt
- KBL context save/restore via LRI cmd (Weinan)
- Properly unmap dma for guest page (Changbin)
vmwgfx:
- Lots of various improvements
etnaviv:
- Use the drm gpu scheduler
- prep work for GC7000L support
vc4:
- fix alpha blending
- Expose perf counters to userspace
pl111:
- Bandwidth checking/limiting
- Versatile panel support
sun4i:
- A83T HDMI support
- A80 support
- YUV plane support
- H3/H5 HDMI support
omapdrm:
- HPD support for DVI connector
- remove lots of static variables
msm:
- DSI updates from 10nm / SDM845
- fix for race condition with a3xx/a4xx fence completion irq
- some refactoring/prep work for eventual a6xx support (ie. when we
have a userspace)
- a5xx debugfs enhancements
- some mdp5 fixes/cleanups to prepare for eventually merging
writeback
- support (ie. when we have a userspace)
tegra:
- mmap() fixes for fbdev devices
- Overlay plane for hw cursor fix
- dma-buf cache maintenance support
mali-dp:
- YUV->RGB conversion support
rockchip:
- rk3399/chromebook fixes and improvements
rcar-du:
- LVDS support move to drm bridge
- DT bindings for R8A77995
- Driver/DT support for R8A77970
tilcdc:
- DRM panel support"
* tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux: (1646 commits)
drm/i915: Fix hibernation with ACPI S0 target state
drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
drm/i915: Specify which engines to reset following semaphore/event lockups
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
drm/amdkfd: Use ordered workqueue to restore processes
drm/amdgpu: Fix acquiring VM on large-BAR systems
drm/amd/pp: clean header file hwmgr.h
drm/amd/pp: use mlck_table.count for array loop index limit
drm: Fix uabi regression by allowing garbage mode->type from userspace
drm/amdgpu: Add an ATPX quirk for hybrid laptop
drm/amdgpu: fix spelling mistake: "asssert" -> "assert"
drm/amd/pp: Add new asic support in pp_psm.c
drm/amd/pp: Clean up powerplay code on Vega12
drm/amd/pp: Add smu irq handlers for legacy asics
drm/amd/pp: Fix set wrong temperature range on smu7
drm/amdgpu: Don't change preferred domian when fallback GTT v5
drm/vmwgfx: Bump version patchlevel and date
drm/vmwgfx: use monotonic event timestamps
drm/vmwgfx: Unpin the screen object backup buffer when not used
drm/vmwgfx: Stricter count of legacy surface device resources
...
Linus Torvalds [Sun, 1 Apr 2018 21:20:27 +0000 (14:20 -0700)]
Linux 4.16
Linus Torvalds [Sat, 31 Mar 2018 17:59:00 +0000 (07:59 -1000)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two fixlets"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/hwbp: Simplify the perf-hwbp code, fix documentation
perf/x86/intel: Fix linear IP of PEBS real_ip on Haswell and later CPUs
Linus Torvalds [Sat, 31 Mar 2018 17:50:30 +0000 (07:50 -1000)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Two UV platform fixes, and a kbuild fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/UV: Fix critical UV MMR address error
x86/platform/uv/BAU: Add APIC idt entry
x86/purgatory: Avoid creating stray .<pid>.d files, remove -MD from KBUILD_CFLAGS
Linus Torvalds [Sat, 31 Mar 2018 17:26:48 +0000 (07:26 -1000)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 PTI fixes from Ingo Molnar:
"Two fixes: a relatively simple objtool fix that makes Clang built
kernels work with ORC debug info, plus an alternatives macro fix"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives: Fixup alternative_call_2
objtool: Add Clang support
Harry Pan [Fri, 9 Mar 2018 12:15:48 +0000 (20:15 +0800)]
perf/x86/intel: Enable C-state residency events for Cannon Lake
Cannon Lake supports C1/C3/C6/C7, PC2/PC3/PC6/PC7/PC8/PC9/PC10
state residency counters, this patch enables those counters.
( The MSR information is based on Intel Software Developers' Manual,
Vol. 4, Order No. 335592. )
Tested-by: Puthikorn Voravootivat <puthik@chromium.org>
Signed-off-by: Harry Pan <harry.pan@intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan.liang@intel.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: gs0622@gmail.com
Link: http://lkml.kernel.org/r/20180309121549.630-3-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Harry Pan [Fri, 9 Mar 2018 12:15:47 +0000 (20:15 +0800)]
perf/x86/intel: Add Cannon Lake support for RAPL profiling
This patch enables RAPL counters (energy consumption counters)
support for Cannon Lake processors.
( ESU and power domains refer to Intel Software Developers' Manual,
Vol. 4, Order No. 335592. )
Usage example:
$ perf list
$ perf stat -a -e power/energy-cores/,power/energy-pkg/ sleep 10
Tested-by: Puthikorn Voravootivat <puthik@chromium.org>
Signed-off-by: Harry Pan <harry.pan@intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: colin.king@canonical.com
Cc: gs0622@gmail.com
Cc: kan.liang@linux.intel.com
Link: http://lkml.kernel.org/r/20180309121549.630-2-harry.pan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Waiman Long [Fri, 30 Mar 2018 21:28:00 +0000 (17:28 -0400)]
locking/Kconfig: Restructure the lock debugging menu
Two config options in the lock debugging menu that are probably the most
frequently used, as far as I am concerned, is the PROVE_LOCKING and
LOCK_STAT. From a UI perspective, they should be front and center. So
these two options are now moved to the top of the lock debugging menu.
The DEBUG_WW_MUTEX_SLOWPATH option is also added to the PROVE_LOCKING
umbrella.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-4-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Waiman Long [Fri, 30 Mar 2018 21:27:59 +0000 (17:27 -0400)]
locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable
There are a couples of lock debugging Kconfig options that depends on
the following support options:
- TRACE_IRQFLAGS_SUPPORT
- STACKTRACE_SUPPORT
- LOCKDEP_SUPPORT
That makes those lock debugging options harder to read and understand.
So a new LOCK_DEBUGGING_SUPPORT option is added that is equivalent to
the above three options together. That makes the Kconfig.debug file
more readable.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-3-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Waiman Long [Fri, 30 Mar 2018 21:27:58 +0000 (17:27 -0400)]
locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches
For a rwsem, locking can either be exclusive or shared. The corresponding
exclusive or shared unlock must be used. Otherwise, the protected data
structures may get corrupted or the lock may be in an inconsistent state.
In order to detect such anomaly, a new configuration option DEBUG_RWSEMS
is added which can be enabled to look for such mismatches and print
warnings that that happens.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1522445280-7767-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Sat, 31 Mar 2018 05:30:17 +0000 (07:30 +0200)]
Merge branch 'linus' into locking/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sat, 31 Mar 2018 04:53:57 +0000 (18:53 -1000)]
Merge tag 'kbuild-fixes-v4.16-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- fix missed rebuild of TRIM_UNUSED_KSYMS
- fix rpm-pkg for GNU tar >= 1.29
- include scripts/dtc/include-prefixes/* to kernel header deb-pkg
- add -no-integrated-as option ealier to fix building with Clang
- fix netfilter Makefile for parallel building
* tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
netfilter: nf_nat_snmp_basic: add correct dependency to Makefile
kbuild: rpm-pkg: Support GNU tar >= 1.29
builddeb: Fix header package regarding dtc source links
kbuild: set no-integrated-as before incl. arch Makefile
kbuild: make scripts/adjust_autoksyms.sh robust against timestamp races
Linus Torvalds [Sat, 31 Mar 2018 04:47:28 +0000 (18:47 -1000)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix RCU locking in xfrm_local_error(), from Taehee Yoo.
2) Fix return value assignments and thus error checking in
iwl_mvm_start_ap_ibss(), from Johannes Berg.
3) Don't count header length twice in vti4, from Stefano Brivio.
4) Fix deadlock in rt6_age_examine_exception, from Eric Dumazet.
5) Fix out-of-bounds access in nf_sk_lookup_slow{v4,v6}() from Subash
Abhinov.
6) Check nladdr size in netlink_connect(), from Alexander Potapenko.
7) VF representor SQ numbers are 32 not 16 bits, in mlx5 driver, from
Or Gerlitz.
8) Out of bounds read in skb_network_protocol(), from Eric Dumazet.
9) r8169 driver sets driver data pointer after register_netdev() which
is too late. Fix from Heiner Kallweit.
10) Fix memory leak in mlx4 driver, from Moshe Shemesh.
11) The multi-VLAN decap fix added a regression when dealing with device
that lack a MAC header, such as tun. Fix from Toshiaki Makita.
12) Fix integer overflow in dynamic interrupt coalescing code. From Tal
Gilboa.
13) Use after free in vrf code, from David Ahern.
14) IPV6 route leak between VRFs fix, also from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
net: mvneta: fix enable of all initialized RXQs
net/ipv6: Fix route leaking between VRFs
vrf: Fix use after free and double free in vrf_finish_output
ipv6: sr: fix seg6 encap performances with TSO enabled
net/dim: Fix int overflow
vlan: Fix vlan insertion for packets without ethernet header
net: Fix untag for vlan packets without ethernet header
atm: iphase: fix spelling mistake: "Receiverd" -> "Received"
vhost: validate log when IOTLB is enabled
qede: Do not drop rx-checksum invalidated packets.
hv_netvsc: enable multicast if necessary
ip_tunnel: Resolve ipsec merge conflict properly.
lan78xx: Crash in lan78xx_writ_reg (Workqueue: events lan78xx_deferred_multicast_write)
qede: Fix barrier usage after tx doorbell write.
vhost: correctly remove wait queue during poll failure
net/mlx4_core: Fix memory leak while delete slave's resources
net/mlx4_en: Fix mixed PFC and Global pause user control requests
net/smc: use announced length in sock_recvmsg()
llc: properly handle dev_queue_xmit() return value
strparser: Fix sign of err codes
...
Yelena Krivosheev [Fri, 30 Mar 2018 10:05:31 +0000 (12:05 +0200)]
net: mvneta: fix enable of all initialized RXQs
In mvneta_port_up() we enable relevant RX and TX port queues by write
queues bit map to an appropriate register.
q_map must be ZERO in the beginning of this process.
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 30 Mar 2018 00:44:57 +0000 (17:44 -0700)]
net/ipv6: Fix route leaking between VRFs
Donald reported that IPv6 route leaking between VRFs is not working.
The root cause is the strict argument in the call to rt6_lookup when
validating the nexthop spec.
ip6_route_check_nh validates the gateway and device (if given) of a
route spec. It in turn could call rt6_lookup (e.g., lookup in a given
table did not succeed so it falls back to a full lookup) and if so
sets the strict argument to 1. That means if the egress device is given,
the route lookup needs to return a result with the same device. This
strict requirement does not work with VRFs (IPv4 or IPv6) because the
oif in the flow struct is overridden with the index of the VRF device
to trigger a match on the l3mdev rule and force the lookup to its table.
The right long term solution is to add an l3mdev index to the flow
struct such that the oif is not overridden. That solution will not
backport well, so this patch aims for a simpler solution to relax the
strict argument if the route spec device is an l3mdev slave. As done
in other places, use the FLOWI_FLAG_SKIP_NH_OIF to know that the
RT6_LOOKUP_F_IFACE flag needs to be removed.
Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack")
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 29 Mar 2018 19:49:52 +0000 (12:49 -0700)]
vrf: Fix use after free and double free in vrf_finish_output
Miguel reported an skb use after free / double free in vrf_finish_output
when neigh_output returns an error. The vrf driver should return after
the call to neigh_output as it takes over the skb on error path as well.
Patch is a simplified version of Miguel's patch which was written for 4.9,
and updated to top of tree.
Fixes: 8f58336d3f78a ("net: Add ethernet header for pass through VRF device")
Signed-off-by: Miguel Fadon Perlines <mfadon@teldat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Lebrun [Thu, 29 Mar 2018 16:59:36 +0000 (17:59 +0100)]
ipv6: sr: fix seg6 encap performances with TSO enabled
Enabling TSO can lead to abysmal performances when using seg6 in
encap mode, such as with the ixgbe driver. This patch adds a call to
iptunnel_handle_offloads() to remove the encapsulation bit if needed.
Before:
root@comp4-seg6bpf:~# iperf3 -c fc00::55
Connecting to host fc00::55, port 5201
[ 4] local fc45::4 port 36592 connected to fc00::55 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 196 KBytes 1.60 Mbits/sec 47 6.66 KBytes
[ 4] 1.00-2.00 sec 304 KBytes 2.49 Mbits/sec 100 5.33 KBytes
[ 4] 2.00-3.00 sec 284 KBytes 2.32 Mbits/sec 92 5.33 KBytes
After:
root@comp4-seg6bpf:~# iperf3 -c fc00::55
Connecting to host fc00::55, port 5201
[ 4] local fc45::4 port 43062 connected to fc00::55 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 1.03 GBytes 8.89 Gbits/sec 0 743 KBytes
[ 4] 1.00-2.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes
[ 4] 2.00-3.00 sec 1.03 GBytes 8.87 Gbits/sec 0 743 KBytes
Reported-by: Tom Herbert <tom@quantonium.net>
Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Signed-off-by: David Lebrun <dlebrun@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 30 Mar 2018 17:29:47 +0000 (07:29 -1000)]
Merge tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"A fix for a dio-enabled loop on ceph deadlock from Zheng, marked for
stable"
* tag 'ceph-for-4.16-rc8' of git://github.com/ceph/ceph-client:
ceph: only dirty ITER_IOVEC pages for direct read
Linus Torvalds [Fri, 30 Mar 2018 17:24:14 +0000 (07:24 -1000)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Radim Krčmář:
"PPC:
- Fix a bug causing occasional machine check exceptions on POWER8
hosts (introduced in 4.16-rc1)
x86:
- Fix a guest crashing regression with nested VMX and restricted
guest (introduced in 4.16-rc1)
- Fix dependency check for pv tlb flush (the wrong dependency that
effectively disabled the feature was added in 4.16-rc4, the
original feature in 4.16-rc1, so it got decent testing)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix pv tlb flush dependencies
KVM: nVMX: sync vmcs02 segment regs prior to vmx_set_cr0
KVM: PPC: Book3S HV: Fix duplication of host SLB entries
Linus Torvalds [Fri, 30 Mar 2018 17:14:35 +0000 (07:14 -1000)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"A simple but worthwhile I2C driver fix for 4.16"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-stm32f7: fix no check on returned setup
Linus Torvalds [Fri, 30 Mar 2018 17:11:14 +0000 (07:11 -1000)]
Merge tag 'sound-4.16' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Very small fixes (all one-liners) at this time.
One fix is for a PCM core stuff to correct the mmap behavior on
non-x86. It doesn't show on most machines but mostly only for exotic
non-interleaved formats"
* tag 'sound-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: potential uninitialized return values
ALSA: pcm: Use dma_bytes as size parameter in dma_mmap_coherent()
ALSA: usb-audio: Add native DSD support for TEAC UD-301
Tal Gilboa [Thu, 29 Mar 2018 10:53:52 +0000 (13:53 +0300)]
net/dim: Fix int overflow
When calculating difference between samples, the values
are multiplied by 100. Large values may cause int overflow
when multiplied (usually on first iteration).
Fixed by forcing 100 to be of type unsigned long.
Fixes: 4c4dbb4a7363 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 30 Mar 2018 16:36:28 +0000 (12:36 -0400)]
Merge branch 'vlan-fix'
Toshiaki Makita says:
====================
Fix vlan tag handling for vlan packets without ethernet headers
Eric Dumazet reported syzbot found a new bug which leads to underflow of
size argument of memmove(), causing crash[1]. This can be triggered by tun
devices.
The underflow happened because skb_vlan_untag() did not expect vlan packets
without ethernet headers, and tun can produce such packets.
I also checked vlan_insert_inner_tag() and found a similar bug.
This series fixes these problems.
[1] https://marc.info/?l=linux-netdev&m=
152221753920510&w=2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Thu, 29 Mar 2018 10:05:30 +0000 (19:05 +0900)]
vlan: Fix vlan insertion for packets without ethernet header
In some situation vlan packets do not have ethernet headers. One example
is packets from tun devices. Users can specify vlan protocol in tun_pi
field instead of IP protocol. When we have a vlan device with reorder_hdr
disabled on top of the tun device, such packets from tun devices are
untagged in skb_vlan_untag() and vlan headers will be inserted back in
vlan_insert_inner_tag().
vlan_insert_inner_tag() however did not expect packets without ethernet
headers, so in such a case size argument for memmove() underflowed.
We don't need to copy headers for packets which do not have preceding
headers of vlan headers, so skip memmove() in that case.
Also don't write vlan protocol in skb->data when it does not have enough
room for it.
Fixes: cbe7128c4b92 ("vlan: Fix out of order vlan headers with reorder header off")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Toshiaki Makita [Thu, 29 Mar 2018 10:05:29 +0000 (19:05 +0900)]
net: Fix untag for vlan packets without ethernet header
In some situation vlan packets do not have ethernet headers. One example
is packets from tun devices. Users can specify vlan protocol in tun_pi
field instead of IP protocol, and skb_vlan_untag() attempts to untag such
packets.
skb_vlan_untag() (more precisely, skb_reorder_vlan_header() called by it)
however did not expect packets without ethernet headers, so in such a case
size argument for memmove() underflowed and triggered crash.
====
BUG: unable to handle kernel paging request at
ffff8801cccb8000
IP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
PGD
9cee067 P4D
9cee067 PUD
1d9401063 PMD
1cccb7063 PTE
2810100028101
Oops: 000b [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 17663 Comm: syz-executor2 Not tainted 4.16.0-rc7+ #368
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43
RSP: 0018:
ffff8801cc046e28 EFLAGS:
00010287
RAX:
ffff8801ccc244c4 RBX:
fffffffffffffffe RCX:
fffffffffff6c4c2
RDX:
fffffffffffffffe RSI:
ffff8801cccb7ffc RDI:
ffff8801cccb8000
RBP:
ffff8801cc046e48 R08:
ffff8801ccc244be R09:
ffffed0039984899
R10:
0000000000000001 R11:
ffffed0039984898 R12:
ffff8801ccc244c4
R13:
ffff8801ccc244c0 R14:
ffff8801d96b7c06 R15:
ffff8801d96b7b40
FS:
00007febd562d700(0000) GS:
ffff8801db300000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
ffff8801cccb8000 CR3:
00000001ccb2f006 CR4:
00000000001606e0
DR0:
0000000020000000 DR1:
0000000020000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000600
Call Trace:
memmove include/linux/string.h:360 [inline]
skb_reorder_vlan_header net/core/skbuff.c:5031 [inline]
skb_vlan_untag+0x470/0xc40 net/core/skbuff.c:5061
__netif_receive_skb_core+0x119c/0x3460 net/core/dev.c:4460
__netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4627
netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4701
netif_receive_skb+0xae/0x390 net/core/dev.c:4725
tun_rx_batched.isra.50+0x5ee/0x870 drivers/net/tun.c:1555
tun_get_user+0x299e/0x3c20 drivers/net/tun.c:1962
tun_chr_write_iter+0xb9/0x160 drivers/net/tun.c:1990
call_write_iter include/linux/fs.h:1782 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x684/0x970 fs/read_write.c:482
vfs_write+0x189/0x510 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0xef/0x220 fs/read_write.c:581
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x454879
RSP: 002b:
00007febd562cc68 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
00007febd562d6d4 RCX:
0000000000454879
RDX:
0000000000000157 RSI:
0000000020000180 RDI:
0000000000000014
RBP:
000000000072bea0 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
00000000ffffffff
R13:
00000000000006b0 R14:
00000000006fc120 R15:
0000000000000000
Code: 90 90 90 90 90 90 90 48 89 f8 48 83 fa 20 0f 82 03 01 00 00 48 39 fe 7d 0f 49 89 f0 49 01 d0 49 39 f8 0f 8f 9f 00 00 00 48 89 d1 <f3> a4 c3 48 81 fa a8 02 00 00 72 05 40 38 fe 74 3b 48 83 ea 20
RIP: __memmove+0x24/0x1a0 arch/x86/lib/memmove_64.S:43 RSP:
ffff8801cc046e28
CR2:
ffff8801cccb8000
====
We don't need to copy headers for packets which do not have preceding
headers of vlan headers, so skip memmove() in that case.
Fixes: 4bbb3e0e8239 ("net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 28 Mar 2018 23:18:53 +0000 (00:18 +0100)]
atm: iphase: fix spelling mistake: "Receiverd" -> "Received"
Trivial fix to spelling mistake in message text
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yan, Zheng [Fri, 16 Mar 2018 03:22:29 +0000 (11:22 +0800)]
ceph: only dirty ITER_IOVEC pages for direct read
If a page is already locked, attempting to dirty it leads to a deadlock
in lock_page(). This is what currently happens to ITER_BVEC pages when
a dio-enabled loop device is backed by ceph:
$ losetup --direct-io /dev/loop0 /mnt/cephfs/img
$ xfs_io -c 'pread 0 4k' /dev/loop0
Follow other file systems and only dirty ITER_IOVEC pages.
Cc: stable@kernel.org
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Linus Torvalds [Fri, 30 Mar 2018 05:27:12 +0000 (19:27 -1000)]
Merge tag 'for-4.16/dm-fixes-4' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix a DM multipath regression introduced in a v4.16-rc6 commit:
restore support for loading, and attaching, scsi_dh modules during
multipath table load. Otherwise some users may find themselves unable
to boot, as was reported today:
https://marc.info/?l=linux-scsi&m=
152231276114962&w=2
- Fix a DM core ioctl permission check regression introduced in a
v4.16-rc5 commit.
* tag 'for-4.16/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix dropped return code from dm_get_bdev_for_ioctl
dm mpath: fix support for loading scsi_dh modules during table load
Linus Torvalds [Fri, 30 Mar 2018 05:23:24 +0000 (19:23 -1000)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"It has been fairly silent lately on our -rc front. Big queue of
patches on the mailing list going to for-next though.
Bug fixes:
- qedr driver bugfixes causing application hangs, wrong uapi errnos,
and a race condition
- three syzkaller found bugfixes in the ucma uapi
Regression fixes for things introduced in 4.16:
- Crash on error introduced in mlx5 UMR flow
- Crash on module unload/etc introduced by bad interaction of
restrack and mlx5 patches this cycle
- Typo in a two line syzkaller bugfix causing a bad regression
- Coverity report of nonsense code in hns driver"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/ucma: Introduce safer rdma_addr_size() variants
RDMA/hns: ensure for-loop actually iterates and free's buffers
RDMA/ucma: Check that device exists prior to accessing it
RDMA/ucma: Check that device is connected prior to access it
RDMA/rdma_cm: Fix use after free race with process_one_req
RDMA/qedr: Fix QP state initialization race
RDMA/qedr: Fix rc initialization on CNQ allocation failure
RDMA/qedr: fix QP's ack timeout configuration
RDMA/ucma: Correct option size check using optlen
RDMA/restrack: Move restrack_clean to be symmetrical to restrack_init
IB/mlx5: Don't clean uninitialized UMR resources
Linus Torvalds [Fri, 30 Mar 2018 05:21:29 +0000 (19:21 -1000)]
Merge tag 'mtd/fixes-for-4.16' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Boris Brezillon:
"Two fixes, one in the atmel NAND driver and another one in the
CFI/JEDEC code.
Summary:
- Fix a bug in Atmel ECC engine driver
- Fix a bug in the CFI/JEDEC driver"
* tag 'mtd/fixes-for-4.16' of git://git.infradead.org/linux-mtd:
mtd: jedec_probe: Fix crash in jedec_read_mfr()
mtd: nand: atmel: Fix get_sectorsize() function
Mike Snitzer [Fri, 30 Mar 2018 03:31:32 +0000 (23:31 -0400)]
dm: fix dropped return code from dm_get_bdev_for_ioctl
dm_get_bdev_for_ioctl()'s return of 0 or 1 must be the result from
prepare_ioctl (1 means the ioctl was issued to a partition, 0 means it
wasn't). Unfortunately commit
519049afea ("dm: use blkdev_get rather
than bdgrab when issuing pass-through ioctl") reused the variable 'r'
to store the return from blkdev_get() that follows prepare_ioctl()
-- whereby dropping prepare_ioctl()'s result on the floor.
This can lead to an ioctl or persistent reservation being issued to a
partition going unnoticed, which implies the extra permission check for
CAP_SYS_RAWIO is skipped.
Fix this by using a different variable to store blkdev_get()'s return.
Fixes: 519049afea ("dm: use blkdev_get rather than bdgrab when issuing pass-through ioctl")
Reported-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
David S. Miller [Fri, 30 Mar 2018 01:49:19 +0000 (21:49 -0400)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkman says:
====================
pull-request: bpf 2018-03-29
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix nfp to properly check max insn count while emitting
instructions in the JIT which was wrongly comparing bytes
against number of instructions before, from Jakub.
2) Fix for bpftool to avoid usage of hex numbers in JSON
output since JSON doesn't accept hex numbers with 0x
prefix, also from Jakub.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mike Snitzer [Thu, 29 Mar 2018 15:50:10 +0000 (11:50 -0400)]
dm mpath: fix support for loading scsi_dh modules during table load
The ability to have multipath dynamically attach a scsi_dh, that the user
specified in the multipath table, was broken by commit
e8f74a0f00 ("dm
mpath: eliminate need to use scsi_device_from_queue").
Restore the ability to load, and attach, a particular scsi_dh module if
one is specified (as noticed by checking m->hw_handler_name).
Fixes: e8f74a0f00 ("dm mpath: eliminate need to use scsi_device_from_queue")
Reported-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Jason Wang [Thu, 29 Mar 2018 08:00:04 +0000 (16:00 +0800)]
vhost: validate log when IOTLB is enabled
Vq log_base is the userspace address of bitmap which has nothing to do
with IOTLB. So it needs to be validated unconditionally otherwise we
may try use 0 as log_base which may lead to pin pages that will lead
unexpected result (e.g trigger BUG_ON() in set_bit_to_user()).
Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API")
Reported-by: syzbot+6304bf97ef436580fede@syzkaller.appspotmail.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Wed, 28 Mar 2018 10:35:52 +0000 (03:35 -0700)]
qede: Do not drop rx-checksum invalidated packets.
Today, driver drops received packets which are indicated as
invalid checksum by the device. Instead of dropping such packets,
pass them to the stack with CHECKSUM_NONE indication in skb.
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Walleij [Sat, 3 Mar 2018 22:29:03 +0000 (23:29 +0100)]
mtd: jedec_probe: Fix crash in jedec_read_mfr()
It turns out that the loop where we read manufacturer
jedec_read_mfd() can under some circumstances get a
CFI_MFR_CONTINUATION repeatedly, making the loop go
over all banks and eventually hit the end of the
map and crash because of an access violation:
Unable to handle kernel paging request at virtual address
c4980000
pgd = (ptrval)
[
c4980000] *pgd=
03808811, *pte=
00000000, *ppte=
00000000
Internal error: Oops: 7 [#1] PREEMPT ARM
CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc1+ #150
Hardware name: Gemini (Device Tree)
PC is at jedec_probe_chip+0x6ec/0xcd0
LR is at 0x4
pc : [<
c03a2bf4>] lr : [<
00000004>] psr:
60000013
sp :
c382dd18 ip :
0000ffff fp :
00000000
r10:
c0626388 r9 :
00020000 r8 :
c0626340
r7 :
00000000 r6 :
00000001 r5 :
c3a71afc r4 :
c382dd70
r3 :
00000001 r2 :
c4900000 r1 :
00000002 r0 :
00080000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control:
0000397f Table:
00004000 DAC:
00000053
Process swapper (pid: 1, stack limit = 0x(ptrval))
Fix this by breaking the loop with a return 0 if
the offset exceeds the map size.
Fixes: 5c9c11e1c47c ("[MTD] [NOR] Add support for flash chips with ID in bank other than 0")
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Boris Brezillon [Tue, 27 Mar 2018 17:01:58 +0000 (19:01 +0200)]
mtd: nand: atmel: Fix get_sectorsize() function
get_sectorsize() was not using the appropriate macro to extract the
ECC sector size from the config cache, which led to buggy ECC when
using 1024 byte sectors.
Fixes: f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: <stable@vger.kernel.org>
Reported-by: Olivier Schonken <olivier.schonken@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Olivier Schonken <olivier.schonken@gmail.com>
Stephen Hemminger [Tue, 27 Mar 2018 18:28:48 +0000 (11:28 -0700)]
hv_netvsc: enable multicast if necessary
My recent change to netvsc drive in how receive flags are handled
broke multicast. The Hyper-v/Azure virtual interface there is not a
multicast filter list, filtering is only all or none. The driver must
enable all multicast if any multicast address is present.
Fixes: 009f766ca238 ("hv_netvsc: filter multicast/broadcast")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 29 Mar 2018 15:42:14 +0000 (11:42 -0400)]
ip_tunnel: Resolve ipsec merge conflict properly.
We want to use dev_set_mtu() regardless of how we calculate
the mtu value.
Signed-off-by: David S. Miller <davem@davemloft.net>
Raghuram Chary J [Tue, 27 Mar 2018 09:21:16 +0000 (14:51 +0530)]
lan78xx: Crash in lan78xx_writ_reg (Workqueue: events lan78xx_deferred_multicast_write)
Description:
Crash was reported with syzkaller pointing to lan78xx_write_reg routine.
Root-cause:
Proper cleanup of workqueues and init/setup routines was not happening
in failure conditions.
Fix:
Handled the error conditions by cleaning up the queues and init/setup
routines.
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Raghuram Chary J <raghuramchary.jallipalli@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 29 Mar 2018 14:12:47 +0000 (10:12 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2018-03-29
1) Fix a rcu_read_lock/rcu_read_unlock imbalance
in the error path of xfrm_local_error().
From Taehee Yoo.
2) Some VTI MTU fixes. From Stefano Brivio.
3) Fix a too early overwritten skb control buffer
on xfrm transport mode.
Please note that this pull request has a merge conflict
in net/ipv4/ip_tunnel.c.
The conflict is between
commit
f6cc9c054e77 ("ip_tunnel: Emit events for post-register MTU changes")
from the net tree and
commit
24fc79798b8d ("ip_tunnel: Clamp MTU to bounds on new link")
from the ipsec tree.
It can be solved as it is currently done in linux-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Shishkin [Thu, 29 Mar 2018 12:06:48 +0000 (15:06 +0300)]
perf/x86/pt, coresight: Clean up address filter structure
This is a cosmetic patch that deals with the address filter structure's
ambiguous fields 'filter' and 'range'. The former stands to mean that the
filter's *action* should be to filter the traces to its address range if
it's set or stop tracing if it's unset. This is confusing and hard on the
eyes, so this patch replaces it with 'action' enum. The 'range' field is
completely redundant (meaning that the filter is an address range as
opposed to a single address trigger), as we can use zero size to mean the
same thing.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20180329120648.11902-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Thu, 29 Mar 2018 14:03:48 +0000 (16:03 +0200)]
Merge branch 'perf/urgent' into perf/core
Conflicts:
kernel/events/hw_breakpoint.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tetsuo Handa [Tue, 27 Mar 2018 10:41:41 +0000 (19:41 +0900)]
lockdep: Make the lock debug output more useful
The lock debug output in print_lock() has a few shortcomings:
- It prints the hlock->acquire_ip field in %px and %pS format. That's
redundant information.
- It lacks information about the lock object itself. The lock class is
not helpful to identify a particular instance of a lock.
Change the output so it prints:
- hlock->instance to allow identification of a particular lock instance.
- only the %pS format of hlock->ip_acquire which is sufficient to decode
the actual code line with faddr2line.
The resulting output is:
3 locks held by a.out/31106:
#0:
00000000b0f753ba (&mm->mmap_sem){++++}, at: copy_process.part.41+0x10d5/0x1fe0
#1:
00000000ef64d539 (&mm->mmap_sem/1){+.+.}, at: copy_process.part.41+0x10fe/0x1fe0
#2:
00000000b41a282e (&mapping->i_mmap_rwsem){++++}, at: copy_process.part.41+0x12f2/0x1fe0
[ tglx: Massaged changelog ]
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: linux-mm@kvack.org
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/201803271941.GBE57310.tVSOJLQOFFOHFM@I-love.SAKURA.ne.jp
Ingo Molnar [Thu, 29 Mar 2018 07:21:21 +0000 (09:21 +0200)]
Merge tag 'perf-core-for-mingo-4.17-
20180328' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Be consistent when checking if a perf_mmap instance had
its ring buffer unmmaped, fixing segfaults noticed in
'perf trace' (Kan Liang, Arnaldo Carvalho de Melo)
- Avoid adding the same option multiple times to the 'diff'
command in check-headers.sh (Jiri Olsa)
- Add vendor event files (JSON format) to various IBM
s390 models (z10EC, z10BC, z196, zEC12, zBC12, z13
and z14) (Thomas Richter)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Thu, 29 Mar 2018 01:07:23 +0000 (15:07 -1000)]
Merge tag 'drm-fixes-for-v4.16-rc8' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Nothing serious, two amdkfd and two tegra fixes"
* tag 'drm-fixes-for-v4.16-rc8' of git://people.freedesktop.org/~airlied/linux:
drm/tegra: dc: Using NULL instead of plain integer
drm/amdkfd: Deallocate SDMA queues correctly
drm/amdkfd: Fix scratch memory with HWS enabled
drm/tegra: dc: Use correct format array for Tegra124
Masahiro Yamada [Thu, 29 Mar 2018 00:24:28 +0000 (09:24 +0900)]
netfilter: nf_nat_snmp_basic: add correct dependency to Makefile
nf_nat_snmp_basic_main.c includes a generated header, but the
necessary dependency is missing in Makefile. This could cause
build error in parallel building.
Remove a weird line, and add a correct one.
Fixes: cc2d58634e0f ("netfilter: nf_nat_snmp_basic: use asn1 decoder library")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Linus Torvalds [Thu, 29 Mar 2018 00:34:55 +0000 (14:34 -1000)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"8 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: demote ARM port to "odd fixes"
MAINTAINERS: correct rmk's email address
mm/kmemleak.c: wait for scan completion before disabling free
mm/memcontrol.c: fix parameter description mismatch
mm/vmstat.c: fix vmstat_update() preemption BUG
mm/page_owner: fix recursion bug after changing skip entries
ipc/shm.c: add split function to shm_vm_ops
mm, slab: memcg_link the SLAB's kmem_cache
Dave Airlie [Wed, 28 Mar 2018 23:57:09 +0000 (09:57 +1000)]
Merge tag 'drm/tegra/for-4.16-fixes' of git://anongit.freedesktop.org/tegra/linux into drm-fixes
drm/tegra: Fixes for v4.16
This contains two small fixes, one which fixes a typo that causes a
crash with the new framebuffer modifier query support and another that
fixes a build warning.
* tag 'drm/tegra/for-4.16-fixes' of git://anongit.freedesktop.org/tegra/linux:
drm/tegra: dc: Using NULL instead of plain integer
drm/tegra: dc: Use correct format array for Tegra124
Linus Torvalds [Wed, 28 Mar 2018 23:54:03 +0000 (13:54 -1000)]
Merge tag 'powerpc-4.16-6' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 4.16. Apologies if this is a bit big at
rc7, but they're all reasonably important fixes. None are actually for
new code, so they aren't indicative of 4.16 being in bad shape from
our point of view.
- Fix missing AT_BASE_PLATFORM (in auxv) when we're using a new
firmware interface for describing CPU features.
- Fix lost pending interrupts due to a race in our interrupt
soft-masking code.
- A workaround for a nest MMU bug with TLB invalidations on Power9.
- A workaround for broadcast TLB invalidations on Power9.
- Fix a bug in our instruction SLB miss handler, when handling bad
addresses (eg. >= TASK_SIZE), which could corrupt non-volatile user
GPRs.
Thanks to: Aneesh Kumar K.V, Balbir Singh, Benjamin Herrenschmidt,
Nicholas Piggin"
* tag 'powerpc-4.16-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs
powerpc/mm: Fixup tlbie vs store ordering issue on POWER9
powerpc/mm/radix: Move the functions that does the actual tlbie closer
powerpc/mm/radix: Remove unused code
powerpc/mm: Workaround Nest MMU bug with TLB invalidations
powerpc/mm: Add tracking of the number of coprocessors using a context
powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features
Linus Torvalds [Wed, 28 Mar 2018 23:52:13 +0000 (13:52 -1000)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"Here are are a couple of last-minute fixes for 4.16, mostly for
regressions. As usual, the majory are device tree changes:
- USB 3 support on rk3399 didn't work and is being reverted for now
- One fix for an old suspend/resume bug on rk3399
- A few regulator related fixes on Banana Pi M2, and on imx7d-sdb
- A boot regression fix for all Aspeed SoCs failing to find their
memory
- One more dtc warning fix
The other changes are:
- A few updates to the MAINTAINERS file
- A revert for an incorrect orion5x cleanup
- Two power management fixes for OMAP"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: OMAP: Fix SRAM W+X mapping
ARM: dts: aspeed: Add default memory node
mailmap: Update email address for Gregory CLEMENT
ARM: davinci: fix the GPIO lookup for omapl138-hawk
MAINTAINERS: Update Tegra IOMMU maintainer
ARM: dts: imx7d-sdb: Fix regulator-usb-otg2-vbus node name
ARM: ux500: Fix PMU IRQ regression
ARM: dts: rockchip: Add missing #sound-dai-cells on rk3288
Revert "arm64: dts: rockchip: add usb3-phy otg-port support for rk3399"
arm64: dts: rockchip: Fix rk3399-gru-* s2r (pinctrl hogs, wifi reset)
ARM: OMAP: Fix dmtimer init for omap1
MAINTAINERS: update email address for Maxime Ripard
ARM: dts: sun6i: a31s: bpi-m2: add missing regulators
ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
Russell King [Wed, 28 Mar 2018 23:01:22 +0000 (16:01 -0700)]
MAINTAINERS: demote ARM port to "odd fixes"
As of the start of 2018, I am no longer paid to support the core 32-bit
ARM architecture code. This means that this code is no longer
commercially supported, and is now only supported through voluntary
effort.
I will continue to merge patches as and when able, but this will be at a
lower priority than before (which means a longer latency.) I have also
be scaled back the amount of time spent reading email, so email that is
intended for my attention needs to make itself plainly obvious, or I
will miss it.
In an attempt to reduce the amount of email Cc'd to me, exclude
arch/arm/boot/dts from the maintainers patterns, but add entries for the
SolidRun platforms I look after.
Link: http://lkml.kernel.org/r/E1ezkgn-0002fO-52@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Russell King [Wed, 28 Mar 2018 23:01:19 +0000 (16:01 -0700)]
MAINTAINERS: correct rmk's email address
Correct my email address in the MAINTAINTERS file.
Link: http://lkml.kernel.org/r/E1ezkgi-0002fH-01@rmk-PC.armlinux.org.uk
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vinayak Menon [Wed, 28 Mar 2018 23:01:16 +0000 (16:01 -0700)]
mm/kmemleak.c: wait for scan completion before disabling free
A crash is observed when kmemleak_scan accesses the object->pointer,
likely due to the following race.
TASK A TASK B TASK C
kmemleak_write
(with "scan" and
NOT "scan=on")
kmemleak_scan()
create_object
kmem_cache_alloc fails
kmemleak_disable
kmemleak_do_cleanup
kmemleak_free_enabled = 0
kfree
kmemleak_free bails out
(kmemleak_free_enabled is 0)
slub frees object->pointer
update_checksum
crash - object->pointer
freed (DEBUG_PAGEALLOC)
kmemleak_do_cleanup waits for the scan thread to complete, but not for
direct call to kmemleak_scan via kmemleak_write. So add a wait for
kmemleak_scan completion before disabling kmemleak_free, and while at it
fix the comment on stop_scan_thread.
[vinmenon@codeaurora.org: fix stop_scan_thread comment]
Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
Signed-off-by: Vinayak Menon <vinmenon@codeaurora.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Honglei Wang [Wed, 28 Mar 2018 23:01:12 +0000 (16:01 -0700)]
mm/memcontrol.c: fix parameter description mismatch
There are a couple of places where parameter description and function
name do not match the actual code. Fix it.
Link: http://lkml.kernel.org/r/1520843448-17347-1-git-send-email-honglei.wang@oracle.com
Signed-off-by: Honglei Wang <honglei.wang@oracle.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Steven J. Hill [Wed, 28 Mar 2018 23:01:09 +0000 (16:01 -0700)]
mm/vmstat.c: fix vmstat_update() preemption BUG
Attempting to hotplug CPUs with CONFIG_VM_EVENT_COUNTERS enabled can
cause vmstat_update() to report a BUG due to preemption not being
disabled around smp_processor_id().
Discovered on Ubiquiti EdgeRouter Pro with Cavium Octeon II processor.
BUG: using smp_processor_id() in preemptible [
00000000] code:
kworker/1:1/269
caller is vmstat_update+0x50/0xa0
CPU: 0 PID: 269 Comm: kworker/1:1 Not tainted
4.16.0-rc4-Cavium-Octeon-00009-gf83bbd5-dirty #1
Workqueue: mm_percpu_wq vmstat_update
Call Trace:
show_stack+0x94/0x128
dump_stack+0xa4/0xe0
check_preemption_disabled+0x118/0x120
vmstat_update+0x50/0xa0
process_one_work+0x144/0x348
worker_thread+0x150/0x4b8
kthread+0x110/0x140
ret_from_kernel_thread+0x14/0x1c
Link: http://lkml.kernel.org/r/1520881552-25659-1-git-send-email-steven.hill@cavium.com
Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <htejun@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Maninder Singh [Wed, 28 Mar 2018 23:01:05 +0000 (16:01 -0700)]
mm/page_owner: fix recursion bug after changing skip entries
This patch fixes commit
5f48f0bd4e36 ("mm, page_owner: skip unnecessary
stack_trace entries").
Because if we skip first two entries then logic of checking count value
as 2 for recursion is broken and code will go in one depth recursion.
so we need to check only one call of _RET_IP(__set_page_owner) while
checking for recursion.
Current Backtrace while checking for recursion:-
(save_stack) from (__set_page_owner) // (But recursion returns true here)
(__set_page_owner) from (get_page_from_freelist)
(get_page_from_freelist) from (__alloc_pages_nodemask)
(__alloc_pages_nodemask) from (depot_save_stack)
(depot_save_stack) from (save_stack) // recursion should return true here
(save_stack) from (__set_page_owner)
(__set_page_owner) from (get_page_from_freelist)
(get_page_from_freelist) from (__alloc_pages_nodemask+)
(__alloc_pages_nodemask) from (depot_save_stack)
(depot_save_stack) from (save_stack)
(save_stack) from (__set_page_owner)
(__set_page_owner) from (get_page_from_freelist)
Correct Backtrace with fix:
(save_stack) from (__set_page_owner) // recursion returned true here
(__set_page_owner) from (get_page_from_freelist)
(get_page_from_freelist) from (__alloc_pages_nodemask+)
(__alloc_pages_nodemask) from (depot_save_stack)
(depot_save_stack) from (save_stack)
(save_stack) from (__set_page_owner)
(__set_page_owner) from (get_page_from_freelist)
Link: http://lkml.kernel.org/r/1521607043-34670-1-git-send-email-maninder1.s@samsung.com
Fixes: 5f48f0bd4e36 ("mm, page_owner: skip unnecessary stack_trace entries")
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Signed-off-by: Vaneet Narang <v.narang@samsung.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@techadventures.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ayush Mittal <ayush.m@samsung.com>
Cc: Prakash Gupta <guptap@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Cc: Vasyl Gomonovych <gomonovych@gmail.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: <pankaj.m@samsung.com>
Cc: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Kravetz [Wed, 28 Mar 2018 23:01:01 +0000 (16:01 -0700)]
ipc/shm.c: add split function to shm_vm_ops
If System V shmget/shmat operations are used to create a hugetlbfs
backed mapping, it is possible to munmap part of the mapping and split
the underlying vma such that it is not huge page aligned. This will
untimately result in the following BUG:
kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/mm/hugetlb.c:3310!
Oops: Exception in kernel mode, sig: 5 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: kcm nfc af_alg caif_socket caif phonet fcrypt
CPU: 18 PID: 43243 Comm: trinity-subchil Tainted: G C E 4.15.0-10-generic #11-Ubuntu
NIP:
c00000000036e764 LR:
c00000000036ee48 CTR:
0000000000000009
REGS:
c000003fbcdcf810 TRAP: 0700 Tainted: G C E (4.15.0-10-generic)
MSR:
9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR:
24002222 XER:
20040000
CFAR:
c00000000036ee44 SOFTE: 1
NIP __unmap_hugepage_range+0xa4/0x760
LR __unmap_hugepage_range_final+0x28/0x50
Call Trace:
0x7115e4e00000 (unreliable)
__unmap_hugepage_range_final+0x28/0x50
unmap_single_vma+0x11c/0x190
unmap_vmas+0x94/0x140
exit_mmap+0x9c/0x1d0
mmput+0xa8/0x1d0
do_exit+0x360/0xc80
do_group_exit+0x60/0x100
SyS_exit_group+0x24/0x30
system_call+0x58/0x6c
---[ end trace
ee88f958a1c62605 ]---
This bug was introduced by commit
31383c6865a5 ("mm, hugetlbfs:
introduce ->split() to vm_operations_struct"). A split function was
added to vm_operations_struct to determine if a mapping can be split.
This was mostly for device-dax and hugetlbfs mappings which have
specific alignment constraints.
Mappings initiated via shmget/shmat have their original vm_ops
overwritten with shm_vm_ops. shm_vm_ops functions will call back to the
original vm_ops if needed. Add such a split function to shm_vm_ops.
Link: http://lkml.kernel.org/r/20180321161314.7711-1-mike.kravetz@oracle.com
Fixes: 31383c6865a5 ("mm, hugetlbfs: introduce ->split() to vm_operations_struct")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shakeel Butt [Wed, 28 Mar 2018 23:00:57 +0000 (16:00 -0700)]
mm, slab: memcg_link the SLAB's kmem_cache
All the root caches are linked into slab_root_caches which was
introduced by the commit
510ded33e075 ("slab: implement slab_root_caches
list") but it missed to add the SLAB's kmem_cache.
While experimenting with opt-in/opt-out kmem accounting, I noticed
system crashes due to NULL dereference inside cache_from_memcg_idx()
while deferencing kmem_cache.memcg_params.memcg_caches. The upstream
clean kernel will not see these crashes but SLAB should be consistent
with SLUB which does linked its boot caches (kmem_cache_node and
kmem_cache) into slab_root_caches.
Link: http://lkml.kernel.org/r/20180319210020.60289-1-shakeelb@google.com
Fixes: 510ded33e075c ("slab: implement slab_root_caches list")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Airlie [Wed, 28 Mar 2018 23:25:13 +0000 (09:25 +1000)]
Merge branch 'drm-misc-next-fixes' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
- Mask mode type garbage from userspace (Ville)
Something went wrong on the misc tree side, but I'll pull the patch directly.
* 'drm-misc-next-fixes' of git://anongit.freedesktop.org/drm/drm-misc:
drm: Fix uabi regression by allowing garbage mode->type from userspace
Roland Dreier [Wed, 28 Mar 2018 18:27:22 +0000 (11:27 -0700)]
RDMA/ucma: Introduce safer rdma_addr_size() variants
There are several places in the ucma ABI where userspace can pass in a
sockaddr but set the address family to AF_IB. When that happens,
rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6,
and the ucma kernel code might end up copying past the end of a buffer
not sized for a struct sockaddr_ib.
Fix this by introducing new variants
int rdma_addr_size_in6(struct sockaddr_in6 *addr);
int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);
that are type-safe for the types used in the ucma ABI and return 0 if the
size computed is bigger than the size of the type passed in. We can use
these new variants to check what size userspace has passed in before
copying any addresses.
Reported-by: <syzbot+6800425d54ed3ed8135d@syzkaller.appspotmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Peter Zijlstra [Tue, 27 Mar 2018 12:14:38 +0000 (14:14 +0200)]
locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
(->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
waiter. In such a case remove_waiter() must not be called because without a
waiter it will trigger the BUG_ON() statement.
This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
with an explicit check for waiters before calling remove_waiter().
Instead of an explicit NULL check before calling rt_mutex_top_waiter() make
the function return NULL if there are no waiters. With that fixed the now
pointless NULL check is removed from rt_mutex_slowlock().
Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
Link: https://lkml.kernel.org/r/20180327121438.sss7hxg3crqy4ecd@linutronix.de
Yazen Ghannam [Mon, 26 Mar 2018 19:15:25 +0000 (14:15 -0500)]
Revert "x86/mce/AMD: Collect error info even if valid bits are not set"
This reverts commit
4b1e84276a6172980c5bf39aa091ba13e90d6dad.
Software uses the valid bits to decide if the values can be used for
further processing or other actions. So setting the valid bits will have
software act on values that it shouldn't be acting on.
The recommendation to save all the register values does not mean that
the values are always valid.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: tony.luck@intel.com
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: bp@suse.de
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/20180326191526.64314-1-Yazen.Ghannam@amd.com
mike.travis@hpe.com [Wed, 28 Mar 2018 17:40:11 +0000 (12:40 -0500)]
x86/platform/UV: Fix critical UV MMR address error
A critical error was found testing the fixed UV4 HUB in that an MMR address
was found to be incorrect. This causes the virtual address space for
accessing the MMIOH1 region to be allocated with the incorrect size.
Fixes: 673aa20c55a1 ("x86/platform/UV: Update uv_mmrs.h to prepare for UV4A fixes")
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Dimitri Sivanich <dimitri.sivanich@hpe.com>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Andrew Banman <andrew.banman@hpe.com>
Link: https://lkml.kernel.org/r/20180328174011.041801248@stormcage.americas.sgi.com
Linus Torvalds [Tue, 27 Mar 2018 01:39:07 +0000 (15:39 -1000)]
perf/hwbp: Simplify the perf-hwbp code, fix documentation
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
modification of a breakpoint - simplify it and remove the pointless
local variables.
Also update the stale Docbook while at it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Wei Yongjun [Wed, 28 Mar 2018 12:52:10 +0000 (12:52 +0000)]
drm/tegra: dc: Using NULL instead of plain integer
Fixes the following sparse warnings:
drivers/gpu/drm/tegra/dc.c:2181:69: warning:
Using plain integer as NULL pointer
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Wanpeng Li [Sun, 25 Mar 2018 04:18:35 +0000 (21:18 -0700)]
KVM: x86: Fix pv tlb flush dependencies
PV TLB FLUSH can only be turned on when steal time is enabled.
The condition got reversed during conflict resolution.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Fixes: 4f2f61fc5071 ("KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is disabled")
[Rebased on top of kvm/master and reworded the commit message. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Banman [Tue, 27 Mar 2018 22:09:06 +0000 (17:09 -0500)]
x86/platform/uv/BAU: Add APIC idt entry
BAU uses the old alloc_initr_gate90 method to setup its interrupt. This
fails silently as the BAU vector is in the range of APIC vectors that are
registered to the spurious interrupt handler. As a consequence BAU
broadcasts are not handled, and the broadcast source CPU hangs.
Update BAU to use new idt structure.
Fixes: dc20b2d52653 ("x86/idt: Move interrupt gate initialization to IDT code")
Signed-off-by: Andrew Banman <abanman@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1522188546-196177-1-git-send-email-abanman@hpe.com
Dave Airlie [Wed, 28 Mar 2018 04:49:19 +0000 (14:49 +1000)]
Merge tag 'drm-amdkfd-next-2018-03-27' of git://people.freedesktop.org/~gabbayo/linux into drm-next
- GPUVM support for dGPUs
- KFD events support for dGPUs
- Fix live-lock situation when restoring multiple evicted processes
- Fix VM page table allocation on large-bar systems
- Fix for build failure on frv architecture
* tag 'drm-amdkfd-next-2018-03-27' of git://people.freedesktop.org/~gabbayo/linux:
drm/amdkfd: Use ordered workqueue to restore processes
drm/amdgpu: Fix acquiring VM on large-BAR systems
drm/amdkfd: Add module option for testing large-BAR functionality
drm/amdkfd: Kmap event page for dGPUs
drm/amdkfd: Add ioctls for GPUVM memory management
drm/amdkfd: Add TC flush on VMID deallocation for Hawaii
drm/amdkfd: Allocate CWSR trap handler memory for dGPUs
drm/amdkfd: Add per-process IDR for buffer handles
drm/amdkfd: Aperture setup for dGPUs
drm/amdkfd: Remove limit on number of GPUs
drm/amdkfd: Populate DRM render device minor
drm/amdkfd: Create KFD VMs on demand
drm/amdgpu: Add kfd2kgd interface to acquire an existing VM
drm/amdgpu: Add helper to turn an existing VM into a compute VM
drm/amdgpu: Fix initial validation of PD BO for KFD VMs
drm/amdgpu: Move KFD-specific fields into struct amdgpu_vm
drm/amdkfd: fix uninitialized variable use
drm/amdkfd: add missing include of mm.h
Dave Airlie [Wed, 28 Mar 2018 04:47:26 +0000 (14:47 +1000)]
Merge tag 'drm-intel-next-fixes-2018-03-27' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Display fixes for booting with MST hub lid closed and display
freezing after hibernation (fd.o bugs 105470 & 105196)
- Fix for a very rare interrupt handling race resulting in GPU hang
* tag 'drm-intel-next-fixes-2018-03-27' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Fix hibernation with ACPI S0 target state
drm/i915/execlists: Use a locked clear_bit() for synchronisation with interrupt
drm/i915: Specify which engines to reset following semaphore/event lockups
drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.
Dave Airlie [Wed, 28 Mar 2018 04:30:41 +0000 (14:30 +1000)]
Backmerge tag 'v4.16-rc7' into drm-next
Linux 4.16-rc7
This was requested by Daniel, and things were getting
a bit hard to reconcile, most of the conflicts were
trivial though.
Linus Torvalds [Wed, 28 Mar 2018 00:28:40 +0000 (14:28 -1000)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A small number of small fixes for ARM, mostly for some build issues.
One fix for a regression caused by the cpu hotplug conversion from a
few kernel versions ago"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8750/1: deflate_xip_data.sh: minor fixes
ARM: 8748/1: mm: Define vdso_start, vdso_end as array
ARM: 8747/1: make CONFIG_DEBUG_WX depend on MMU
ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
Linus Torvalds [Wed, 28 Mar 2018 00:11:46 +0000 (14:11 -1000)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two driver fixes (ibmvfc, iscsi_tcp) and a USB fix for devices that
give the wrong return to Read Capacity and cause a huge log spew.
The remaining five patches all try to fix commit
84676c1f21e8
("genirq/affinity: assign vectors to all possible CPUs") which broke
the non-mq I/O path"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: iscsi_tcp: set BDI_CAP_STABLE_WRITES when data digest enabled
scsi: sd: Remember that READ CAPACITY(16) succeeded
scsi: ibmvfc: Avoid unnecessary port relogin
scsi: virtio_scsi: unify scsi_host_template
scsi: virtio_scsi: fix IO hang caused by automatic irq vector affinity
scsi: core: introduce force_blk_mq
scsi: megaraid_sas: fix selection of reply queue
scsi: hpsa: fix selection of reply queue
Colin Ian King [Mon, 26 Mar 2018 15:10:18 +0000 (16:10 +0100)]
RDMA/hns: ensure for-loop actually iterates and free's buffers
The current for-loop zeros variable i and only loops once, hence
not all the buffers are free'd. Fix this by setting i correctly.
Detected by CoverityScan, CID#
1463415 ("Operands don't affect result")
Fixes: a5073d6054f7 ("RDMA/hns: Add eq support of hip08")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Leon Romanovsky [Sun, 25 Mar 2018 08:39:05 +0000 (11:39 +0300)]
RDMA/ucma: Check that device exists prior to accessing it
Ensure that device exists prior to accessing its properties.
Reported-by: <syzbot+71655d44855ac3e76366@syzkaller.appspotmail.com>
Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Leon Romanovsky [Sun, 25 Mar 2018 08:23:55 +0000 (11:23 +0300)]
RDMA/ucma: Check that device is connected prior to access it
Add missing check that device is connected prior to access it.
[ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0
[ 55.359389] Read of size 8 at addr
00000000000000b0 by task qp/618
[ 55.360255]
[ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted
4.16.0-rc1-00071-gcaf61b1b8b88 #91
[ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 55.363264] Call Trace:
[ 55.363833] dump_stack+0x5c/0x77
[ 55.364215] kasan_report+0x163/0x380
[ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0
[ 55.365238] rdma_init_qp_attr+0x4a/0x2c0
[ 55.366410] ucma_init_qp_attr+0x111/0x200
[ 55.366846] ? ucma_notify+0xf0/0xf0
[ 55.367405] ? _get_random_bytes+0xea/0x1b0
[ 55.367846] ? urandom_read+0x2f0/0x2f0
[ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0
[ 55.369104] ? refcount_inc_not_zero+0x9/0x60
[ 55.369583] ? refcount_inc+0x5/0x30
[ 55.370155] ? rdma_create_id+0x215/0x240
[ 55.370937] ? _copy_to_user+0x4f/0x60
[ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290
[ 55.372127] ? _copy_from_user+0x5e/0x90
[ 55.372720] ucma_write+0x174/0x1f0
[ 55.373090] ? ucma_close_id+0x40/0x40
[ 55.373805] ? __lru_cache_add+0xa8/0xd0
[ 55.374403] __vfs_write+0xc4/0x350
[ 55.374774] ? kernel_read+0xa0/0xa0
[ 55.375173] ? fsnotify+0x899/0x8f0
[ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170
[ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30
[ 55.377522] ? handle_mm_fault+0x174/0x320
[ 55.378169] vfs_write+0xf7/0x280
[ 55.378864] SyS_write+0xa1/0x120
[ 55.379270] ? SyS_read+0x120/0x120
[ 55.379643] ? mm_fault_error+0x180/0x180
[ 55.380071] ? task_work_run+0x7d/0xd0
[ 55.380910] ? __task_pid_nr_ns+0x120/0x140
[ 55.381366] ? SyS_read+0x120/0x120
[ 55.381739] do_syscall_64+0xeb/0x250
[ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 55.382841] RIP: 0033:0x7fc2ef803e99
[ 55.383227] RSP: 002b:
00007fffcc5f3be8 EFLAGS:
00000217 ORIG_RAX:
0000000000000001
[ 55.384173] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007fc2ef803e99
[ 55.386145] RDX:
0000000000000057 RSI:
0000000020000080 RDI:
0000000000000003
[ 55.388418] RBP:
00007fffcc5f3c00 R08:
0000000000000000 R09:
0000000000000000
[ 55.390542] R10:
0000000000000000 R11:
0000000000000217 R12:
0000000000400480
[ 55.392916] R13:
00007fffcc5f3cf0 R14:
0000000000000000 R15:
0000000000000000
[ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49
8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 <49> 03 85 b0 00 00 00 48 8d 78 08
48 89 04 24 e8 3a 4f 1e ff 48
[ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP:
ffff8801e2c2f9d8
[ 55.532648] CR2:
00000000000000b0
[ 55.534396] ---[ end trace
70cee64090251c0b ]---
Fixes: 75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Fixes: d541e45500bd ("IB/core: Convert ah_attr from OPA to IB when copying to user")
Reported-by: <syzbot+7b62c837c2516f8f38c8@syzkaller.appspotmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Jason Gunthorpe [Thu, 22 Mar 2018 20:04:23 +0000 (14:04 -0600)]
RDMA/rdma_cm: Fix use after free race with process_one_req
process_one_req() can race with rdma_addr_cancel():
CPU0 CPU1
==== ====
process_one_work()
debug_work_deactivate(work);
process_one_req()
rdma_addr_cancel()
mutex_lock(&lock);
set_timeout(&req->work,..);
__queue_work()
debug_work_activate(work);
mutex_unlock(&lock);
mutex_lock(&lock);
[..]
list_del(&req->list);
mutex_unlock(&lock);
[..]
// ODEBUG explodes since the work is still queued.
kfree(req);
Causing ODEBUG to detect the use after free:
ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165
WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288
kvm: emulating exchange as write
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ib_addr process_one_req
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x1f4/0x2b0 lib/bug.c:186
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
RSP: 0000:
ffff8801d966f210 EFLAGS:
00010086
RAX:
dffffc0000000008 RBX:
0000000000000003 RCX:
ffffffff815acd6e
RDX:
0000000000000000 RSI:
1ffff1003b2cddf2 RDI:
0000000000000000
RBP:
ffff8801d966f250 R08:
0000000000000000 R09:
1ffff1003b2cddc8
R10:
ffffed003b2cde71 R11:
ffffffff86f39a98 R12:
0000000000000001
R13:
ffffffff86f15540 R14:
ffffffff86408700 R15:
ffffffff8147c0a0
__debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
kfree+0xc7/0x260 mm/slab.c:3799
process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592
process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Fixes: 5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC")
Reported-by: <syzbot+3b4acab09b6463472d0a@syzkaller.appspotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Manish Chopra [Tue, 27 Mar 2018 13:34:41 +0000 (06:34 -0700)]
qede: Fix barrier usage after tx doorbell write.
Since commit
c5ad119fb6c09b0297446be05bd66602fa564758
("net: sched: pfifo_fast use skb_array") driver is exposed
to an issue where it is hitting NULL skbs while handling TX
completions. Driver uses mmiowb() to flush the writes to the
doorbell bar which is a write-combined bar, however on x86
mmiowb() does not flush the write combined buffer.
This patch fixes this problem by replacing mmiowb() with wmb()
after the write combined doorbell write so that writes are
flushed and synchronized from more than one processor.
V1->V2:
-------
This patch was marked as "superseded" in patchwork.
(Not really sure for what reason).Resending it as v2.
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Tue, 27 Mar 2018 12:50:52 +0000 (20:50 +0800)]
vhost: correctly remove wait queue during poll failure
We tried to remove vq poll from wait queue, but do not check whether
or not it was in a list before. This will lead double free. Fixing
this by switching to use vhost_poll_stop() which zeros poll->wqh after
removing poll from waitqueue to make sure it won't be freed twice.
Cc: Darren Kenny <darren.kenny@oracle.com>
Reported-by: syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com
Fixes: 2b8b328b61c79 ("vhost_net: handle polling errors when setting backend")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Gunthorpe [Fri, 23 Mar 2018 17:59:36 +0000 (11:59 -0600)]
kbuild: rpm-pkg: Support GNU tar >= 1.29
There is a change in how command line parsing is done in this version.
Excludes and includes are now ordered with the file list. Since
the spec file puts the file list before the exclude list it means newer
tar ignores the excludes and packs all the build output into the
kernel-devel RPM resulting in a huge package.
Simple argument re-ordering fixes the problem.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Jan Kiszka [Wed, 21 Mar 2018 05:15:28 +0000 (13:15 +0800)]
builddeb: Fix header package regarding dtc source links
Since
d5d332d3f7e8, a couple of links in scripts/dtc/include-prefixes
are additionally required in order to build device trees with the header
package.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Thomas Richter [Mon, 26 Mar 2018 08:25:38 +0000 (10:25 +0200)]
perf vendor events s390: Add JSON files for IBM z14
Add CPU measurement counter facility event description files (json
files) for IBM z14.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-5-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Richter [Mon, 26 Mar 2018 08:25:37 +0000 (10:25 +0200)]
perf vendor events s390: Add JSON files for IBM z13
Add CPU measurement counter facility event description files (json
files) for IBM z13.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-4-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Richter [Mon, 26 Mar 2018 08:25:36 +0000 (10:25 +0200)]
perf vendor events s390: Add JSON files for IBM zEC12 zBC12
Add CPU measurement counter facility event description files (json
files) for IBM zEC12 and zBC12.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-3-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Richter [Mon, 26 Mar 2018 08:25:35 +0000 (10:25 +0200)]
perf vendor events s390: Add JSON files for IBM z196
Add CPU measurement counter facility event description files (json
files) for IBM z196.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-2-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Richter [Mon, 26 Mar 2018 08:25:34 +0000 (10:25 +0200)]
perf vendor events s390: Add JSON files for IBM z10EC z10BC
Add CPU measurement counter facility event description files (JSON
files) for IBM z10EC and z10BC.
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180326082538.2258-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 26 Mar 2018 14:42:15 +0000 (11:42 -0300)]
perf mmap: Be consistent when checking for an unmaped ring buffer
The previous patch is insufficient to cure the reported 'perf trace'
segfault, as it only cures the perf_mmap__read_done() case, moving the
segfault to perf_mmap__read_init() functio, fix it by doing the same
refcount check.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8872481bd048 ("perf mmap: Introduce perf_mmap__read_init()")
Link: https://lkml.kernel.org/r/20180326144127.GF18897@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kan Liang [Mon, 26 Mar 2018 13:42:09 +0000 (09:42 -0400)]
perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done()
There is a segmentation fault when running 'perf trace'. For example:
[root@jouet e]# perf trace -e *chdir -o /tmp/bla perf report --ignore-vmlinux -i ../perf.data
The perf_mmap__consume() could unmap the mmap. It needs to check the
refcnt in perf_mmap__read_done().
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ee023de05f35 ("perf mmap: Introduce perf_mmap__read_done()")
Link: http://lkml.kernel.org/r/1522071729-16776-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Wed, 21 Mar 2018 14:05:15 +0000 (15:05 +0100)]
perf build: Fix check-headers.sh opts assignment
Currently the "opts" variable is not zero-ed and we keep on adding to
it, ending up with:
$ check-headers.sh 2>&1
+ opts=' "-B"'
+ opts=' "-B" "-B"'
+ opts=' "-B" "-B" "-B"'
+ opts=' "-B" "-B" "-B" "-B"'
+ opts=' "-B" "-B" "-B" "-B" "-B"'
+ opts=' "-B" "-B" "-B" "-B" "-B" "-B"'
Fix this by initializing it in the check() function, right before
starting the loop.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180321140515.2252-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David S. Miller [Tue, 27 Mar 2018 16:02:30 +0000 (12:02 -0400)]
Merge branch 'mlx4-misc-fixes-for-4.16'
Tariq Toukan says:
====================
mlx4 misc fixes for 4.16
This patchset contains misc bug fixes from the team
to the mlx4 Core and Eth drivers.
Patch 1 by Eran fixes a control mix of PFC and Global pauses, please queue it
to -stable for >= v4.8.
Patch 2 by Moshe fixes a resource leak in slave's delete flow, please queue it
to -stable for >= v4.5.
Series generated against net commit:
3c82b372a9f4 net: dsa: mt7530: fix module autoloading for OF platform drivers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Moshe Shemesh [Tue, 27 Mar 2018 11:41:19 +0000 (14:41 +0300)]
net/mlx4_core: Fix memory leak while delete slave's resources
mlx4_delete_all_resources_for_slave in resource tracker should free all
memory allocated for a slave.
While releasing memory of fs_rule, it misses releasing memory of
fs_rule->mirr_mbox.
Fixes: 78efed275117 ('net/mlx4_core: Support mirroring VF DMFS rules on both ports')
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eran Ben Elisha [Tue, 27 Mar 2018 11:41:18 +0000 (14:41 +0300)]
net/mlx4_en: Fix mixed PFC and Global pause user control requests
Global pause and PFC configuration should be mutually exclusive (i.e. only
one of them at most can be set). However, once PFC was turned off,
driver automatically turned Global pause on. This is a bug.
Fix the driver behaviour to turn off PFC/Global once the user turned the
other on.
This also fixed a weird behaviour that at a current time, the profile
had both PFC and global pause configuration turned on, which is
Hardware-wise impossible and caused returning false positive indication
to query tools.
In addition, fix error code when setting global pause or PFC to change
metadata only upon successful change.
Also, removed useless debug print.
Fixes: af7d51852631 ("net/mlx4_en: Add DCB PFC support through CEE netlink commands")
Fixes: c27a02cd94d6 ("mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>