Michal Hocko [Fri, 3 Feb 2017 21:13:29 +0000 (13:13 -0800)]
mm, fs: check for fatal signals in do_generic_file_read()
do_generic_file_read() can be told to perform a large request from
userspace. If the system is under OOM and the reading task is the OOM
victim then it has an access to memory reserves and finishing the full
request can lead to the full memory depletion which is dangerous. Make
sure we rather go with a short read and allow the killed task to
terminate.
Link: http://lkml.kernel.org/r/20170201092706.9966-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michal Hocko [Fri, 3 Feb 2017 21:13:26 +0000 (13:13 -0800)]
fs: break out of iomap_file_buffered_write on fatal signals
Tetsuo has noticed that an OOM stress test which performs large write
requests can cause the full memory reserves depletion. He has tracked
this down to the following path
__alloc_pages_nodemask+0x436/0x4d0
alloc_pages_current+0x97/0x1b0
__page_cache_alloc+0x15d/0x1a0 mm/filemap.c:728
pagecache_get_page+0x5a/0x2b0 mm/filemap.c:1331
grab_cache_page_write_begin+0x23/0x40 mm/filemap.c:2773
iomap_write_begin+0x50/0xd0 fs/iomap.c:118
iomap_write_actor+0xb5/0x1a0 fs/iomap.c:190
? iomap_write_end+0x80/0x80 fs/iomap.c:150
iomap_apply+0xb3/0x130 fs/iomap.c:79
iomap_file_buffered_write+0x68/0xa0 fs/iomap.c:243
? iomap_write_end+0x80/0x80
xfs_file_buffered_aio_write+0x132/0x390 [xfs]
? remove_wait_queue+0x59/0x60
xfs_file_write_iter+0x90/0x130 [xfs]
__vfs_write+0xe5/0x140
vfs_write+0xc7/0x1f0
? syscall_trace_enter+0x1d0/0x380
SyS_write+0x58/0xc0
do_syscall_64+0x6c/0x200
entry_SYSCALL64_slow_path+0x25/0x25
the oom victim has access to all memory reserves to make a forward
progress to exit easier. But iomap_file_buffered_write and other
callers of iomap_apply loop to complete the full request. We need to
check for fatal signals and back off with a short write instead.
As the iomap_apply delegates all the work down to the actor we have to
hook into those. All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there. dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly. Other callers like iomap_page_mkwrite work on a
single page or iomap_fiemap_actor do not allocate memory based on the
given len.
Fixes: 68a9f5e7007c ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Toshi Kani [Fri, 3 Feb 2017 21:13:23 +0000 (13:13 -0800)]
base/memory, hotplug: fix a kernel oops in show_valid_zones()
Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
BUG: unable to handle kernel paging request at
ffffea017a000000
IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such
systems is desribed below. 0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.
BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.
[1] 'Commit
bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Toshi Kani [Fri, 3 Feb 2017 21:13:20 +0000 (13:13 -0800)]
mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
Patch series "fix a kernel oops when reading sysfs valid_zones", v2.
A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory. [1] When the start address of a
memory block is not backed by struct page, i.e. a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops. This issue was observed on multiple x86-64 systems with
more than 64GiB of memory. This patch-set fixes this issue.
Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.
Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).
Note for stable kernels: The memory block size change was made by commit
bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9. However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887f4de ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.
So, I recommend that we backport it up to 4.4.
[1] 'Commit
bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
This patch (of 2):
test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'. Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.
Fix it by properly setting 'sec_end_pfn' to the next section pfn.
Also make sure that this function returns 1 only when the range belongs
to a zone.
Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Lin [Fri, 3 Feb 2017 21:13:18 +0000 (13:13 -0800)]
jump label: pass kbuild_cflags when checking for asm goto support
Some versions of ARM GCC compiler such as Android toolchain throws in a
'-fpic' flag by default. This causes the gcc-goto check script to fail
although some config would have '-fno-pic' flag in the KBUILD_CFLAGS.
This patch passes the KBUILD_CFLAGS to the check script so that the
script does not rely on the default config from different compilers.
Link: http://lkml.kernel.org/r/20170120234329.78868-1-dtwlin@google.com
Signed-off-by: David Lin <dtwlin@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Michal Marek <mmarek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kirill A. Shutemov [Fri, 3 Feb 2017 21:13:15 +0000 (13:13 -0800)]
shmem: fix sleeping from atomic context
Syzkaller fuzzer managed to trigger this:
BUG: sleeping function called from invalid context at mm/shmem.c:852
in_atomic(): 1, irqs_disabled(): 0, pid: 529, name: khugepaged
3 locks held by khugepaged/529:
#0: (shrinker_rwsem){++++..}, at: [<
ffffffff818d7ef1>] shrink_slab.part.59+0x121/0xd30 mm/vmscan.c:451
#1: (&type->s_umount_key#29){++++..}, at: [<
ffffffff81a63630>] trylock_super+0x20/0x100 fs/super.c:392
#2: (&(&sbinfo->shrinklist_lock)->rlock){+.+.-.}, at: [<
ffffffff818fd83e>] spin_lock include/linux/spinlock.h:302 [inline]
#2: (&(&sbinfo->shrinklist_lock)->rlock){+.+.-.}, at: [<
ffffffff818fd83e>] shmem_unused_huge_shrink+0x28e/0x1490 mm/shmem.c:427
CPU: 2 PID: 529 Comm: khugepaged Not tainted 4.10.0-rc5+ #201
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
shmem_undo_range+0xb20/0x2710 mm/shmem.c:852
shmem_truncate_range+0x27/0xa0 mm/shmem.c:939
shmem_evict_inode+0x35f/0xca0 mm/shmem.c:1030
evict+0x46e/0x980 fs/inode.c:553
iput_final fs/inode.c:1515 [inline]
iput+0x589/0xb20 fs/inode.c:1542
shmem_unused_huge_shrink+0xbad/0x1490 mm/shmem.c:446
shmem_unused_huge_scan+0x10c/0x170 mm/shmem.c:512
super_cache_scan+0x376/0x450 fs/super.c:106
do_shrink_slab mm/vmscan.c:378 [inline]
shrink_slab.part.59+0x543/0xd30 mm/vmscan.c:481
shrink_slab mm/vmscan.c:2592 [inline]
shrink_node+0x2c7/0x870 mm/vmscan.c:2592
shrink_zones mm/vmscan.c:2734 [inline]
do_try_to_free_pages+0x369/0xc80 mm/vmscan.c:2776
try_to_free_pages+0x3c6/0x900 mm/vmscan.c:2982
__perform_reclaim mm/page_alloc.c:3301 [inline]
__alloc_pages_direct_reclaim mm/page_alloc.c:3322 [inline]
__alloc_pages_slowpath+0xa24/0x1c30 mm/page_alloc.c:3683
__alloc_pages_nodemask+0x544/0xae0 mm/page_alloc.c:3848
__alloc_pages include/linux/gfp.h:426 [inline]
__alloc_pages_node include/linux/gfp.h:439 [inline]
khugepaged_alloc_page+0xc2/0x1b0 mm/khugepaged.c:750
collapse_huge_page+0x182/0x1fe0 mm/khugepaged.c:955
khugepaged_scan_pmd+0xfdf/0x12a0 mm/khugepaged.c:1208
khugepaged_scan_mm_slot mm/khugepaged.c:1727 [inline]
khugepaged_do_scan mm/khugepaged.c:1808 [inline]
khugepaged+0xe9b/0x1590 mm/khugepaged.c:1853
kthread+0x326/0x3f0 kernel/kthread.c:227
ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
The iput() from atomic context was a bad idea: if after igrab() somebody
else calls iput() and we left with the last inode reference, our iput()
would lead to inode eviction and therefore sleeping.
This patch should fix the situation.
Link: http://lkml.kernel.org/r/20170131093141.GA15899@node.shutemov.name
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Fri, 3 Feb 2017 21:13:12 +0000 (13:13 -0800)]
kasan: respect /proc/sys/kernel/traceoff_on_warning
After much waiting I finally reproduced a KASAN issue, only to find my
trace-buffer empty of useful information because it got spooled out :/
Make kasan_report honour the /proc/sys/kernel/traceoff_on_warning
interface.
Link: http://lkml.kernel.org/r/20170125164106.3514-1-aryabinin@virtuozzo.com
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Streetman [Fri, 3 Feb 2017 21:13:09 +0000 (13:13 -0800)]
zswap: disable changing params if init fails
Add zswap_init_failed bool that prevents changing any of the module
params, if init_zswap() fails, and set zswap_enabled to false. Change
'enabled' param to a callback, and check zswap_init_failed before
allowing any change to 'enabled', 'zpool', or 'compressor' params.
Any driver that is built-in to the kernel will not be unloaded if its
init function returns error, and its module params remain accessible for
users to change via sysfs. Since zswap uses param callbacks, which
assume that zswap has been initialized, changing the zswap params after
a failed initialization will result in WARNING due to the param
callbacks expecting a pool to already exist. This prevents that by
immediately exiting any of the param callbacks if initialization failed.
This was reported here:
https://marc.info/?l=linux-mm&m=
147004228125528&w=4
And fixes this WARNING:
[ 429.723476] WARNING: CPU: 0 PID: 5140 at mm/zswap.c:503 __zswap_pool_current+0x56/0x60
The warning is just noise, and not serious. However, when init fails,
zswap frees all its percpu dstmem pages and its kmem cache. The kmem
cache might be serious, if kmem_cache_alloc(NULL, gfp) has problems; but
the percpu dstmem pages are definitely a problem, as they're used as
temporary buffer for compressed pages before copying into place in the
zpool.
If the user does get zswap enabled after an init failure, then zswap
will likely Oops on the first page it tries to compress (or worse, start
corrupting memory).
Fixes: 90b0fc26d5db ("zswap: change zpool/compressor at runtime")
Link: http://lkml.kernel.org/r/20170124200259.16191-2-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reported-by: Marcin Miroslaw <marcin@mejor.pl>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 3 Feb 2017 19:32:25 +0000 (11:32 -0800)]
Merge tag 'drm-fixes-for-v4.10-rc7' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Another fixes pull for v4.10, it's a bit big due to the backport of
the VMA fixes for i915 that should fix the oops on shutdown problems
that you've worked around.
There are also two drm core connector registration fixes, a bunch of
nouveau regression fixes and two AMD fixes"
* tag 'drm-fixes-for-v4.10-rc7' of git://people.freedesktop.org/~airlied/linux:
drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctl
drm/amdgpu/si: fix crash on headless asics
drm/i915: Track pinned vma in intel_plane_state
drm/atomic: Unconditionally call prepare_fb.
drm/atomic: Fix double free in drm_atomic_state_default_clear
drm/nouveau/kms/nv50: request vblank events for commits that send completion events
drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215
drm/nouveau/nouveau/led: prevent compiling the led-code if nouveau=y and leds=m
drm/nouveau/disp/mcp7x: disable dptmds workaround
drm/nouveau: prevent userspace from deleting client object
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
drm: Don't race connector registration
drm: prevent double-(un)registration for connectors
Linus Torvalds [Fri, 3 Feb 2017 19:10:06 +0000 (11:10 -0800)]
Merge tag 'powerpc-4.10-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"The main change is we're reverting the initial stack protector support
we merged this cycle. It turns out to not work on toolchains built
with libc support, and fixing it will be need to wait for another
release.
And the rest are all fairly minor:
- Some pasemi machines were not booting due to a missing error check
in prom_find_boot_cpu()
- In EEH we were checking a pointer rather than the bool it pointed
to
- The clang build was broken by a BUILD_BUG_ON() we added.
- The radix (Power9 only) version of map_kernel_page() was broken if
our memory size was a multiple of 2MB, which it generally isn't
Thanks to: Darren Stevens, Gavin Shan, Reza Arbab"
* tag 'powerpc-4.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Use the correct pointer when setting a 2MB pte
powerpc: Fix build failure with clang due to BUILD_BUG_ON()
powerpc: Revert the initial stack protector support
powerpc/eeh: Fix wrong flag passed to eeh_unfreeze_pe()
powerpc: Add missing error check to prom_find_boot_cpu()
Linus Torvalds [Fri, 3 Feb 2017 19:06:59 +0000 (11:06 -0800)]
Merge tag 'trace-v4.10-rc2-2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Simple fix of s/static struct __init/static __init struct/"
* tag 'trace-v4.10-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/kprobes: Fix __init annotation
Linus Torvalds [Fri, 3 Feb 2017 18:30:27 +0000 (10:30 -0800)]
Merge branch 'modversions' (modversions fixes for powerpc from Ard)
Merge kcrctab entry fixes from Ard Biesheuvel:
"This is a followup to [0] 'modversions: redefine kcrctab entries as
relative CRC pointers', but since relative CRC pointers do not work in
modules, and are actually only needed by powerpc with
CONFIG_RELOCATABLE=y, I have made it a Kconfig selectable feature
instead.
First it introduces the MODULE_REL_CRCS Kconfig symbol, and adds the
kbuild handling of it, i.e., modpost, genksyms and kallsyms.
Then it switches all architectures to 32-bit CRC entries in kcrctab,
where all architectures except powerpc with CONFIG_RELOCATABLE=y use
absolute ELF symbol references as before"
[0] http://marc.info/?l=linux-arch&m=
148493613415294&w=2
* emailed patches from Ard Biesheuvel:
module: unify absolute krctab definitions for 32-bit and 64-bit
modversions: treat symbol CRCs as 32 bit quantities
kbuild: modversions: add infrastructure for emitting relative CRCs
Ard Biesheuvel [Thu, 2 Feb 2017 18:05:26 +0000 (18:05 +0000)]
log2: make order_base_2() behave correctly on const input value zero
The function order_base_2() is defined (according to the comment block)
as returning zero on input zero, but subsequently passes the input into
roundup_pow_of_two(), which is explicitly undefined for input zero.
This has gone unnoticed until now, but optimization passes in GCC 7 may
produce constant folded function instances where a constant value of
zero is passed into order_base_2(), resulting in link errors against the
deliberately undefined '____ilog2_NaN'.
So update order_base_2() to adhere to its own documented interface.
[ See
http://marc.info/?l=linux-kernel&m=
147672952517795&w=2
and follow-up discussion for more background. The gcc "optimization
pass" is really just broken, but now the GCC trunk problem seems to
have escaped out of just specially built daily images, so we need to
work around it in mainline. - Linus ]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ard Biesheuvel [Fri, 3 Feb 2017 09:54:07 +0000 (09:54 +0000)]
module: unify absolute krctab definitions for 32-bit and 64-bit
The previous patch introduced a separate inline asm version of the
krcrctab declaration template for use with 64-bit architectures, which
cannot refer to ELF symbols using 32-bit quantities.
This declaration should be equivalent to the C one for 32-bit
architectures, but just in case - unify them in a separate patch, which
can simply be dropped if it turns out to break anything.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ard Biesheuvel [Fri, 3 Feb 2017 09:54:06 +0000 (09:54 +0000)]
modversions: treat symbol CRCs as 32 bit quantities
The modversion symbol CRCs are emitted as ELF symbols, which allows us
to easily populate the kcrctab sections by relying on the linker to
associate each kcrctab slot with the correct value.
This has a couple of downsides:
- Given that the CRCs are treated as memory addresses, we waste 4 bytes
for each CRC on 64 bit architectures,
- On architectures that support runtime relocation, a R_<arch>_RELATIVE
relocation entry is emitted for each CRC value, which identifies it
as a quantity that requires fixing up based on the actual runtime
load offset of the kernel. This results in corrupted CRCs unless we
explicitly undo the fixup (and this is currently being handled in the
core module code)
- Such runtime relocation entries take up 24 bytes of __init space
each, resulting in a x8 overhead in [uncompressed] kernel size for
CRCs.
Switching to explicit 32 bit values on 64 bit architectures fixes most
of these issues, given that 32 bit values are not treated as quantities
that require fixing up based on the actual runtime load offset. Note
that on some ELF64 architectures [such as PPC64], these 32-bit values
are still emitted as [absolute] runtime relocatable quantities, even if
the value resolves to a build time constant. Since relative relocations
are always resolved at build time, this patch enables MODULE_REL_CRCS on
powerpc when CONFIG_RELOCATABLE=y, which turns the absolute CRC
references into relative references into .rodata where the actual CRC
value is stored.
So redefine all CRC fields and variables as u32, and redefine the
__CRC_SYMBOL() macro for 64 bit builds to emit the CRC reference using
inline assembler (which is necessary since 64-bit C code cannot use
32-bit types to hold memory addresses, even if they are ultimately
resolved using values that do not exceed 0xffffffff). To avoid
potential problems with legacy 32-bit architectures using legacy
toolchains, the equivalent C definition of the kcrctab entry is retained
for 32-bit architectures.
Note that this mostly reverts commit
d4703aefdbc8 ("module: handle ppc64
relocating kcrctabs when CONFIG_RELOCATABLE=y")
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ard Biesheuvel [Fri, 3 Feb 2017 09:54:05 +0000 (09:54 +0000)]
kbuild: modversions: add infrastructure for emitting relative CRCs
This add the kbuild infrastructure that will allow architectures to emit
vmlinux symbol CRCs as 32-bit offsets to another location in the kernel
where the actual value is stored. This works around problems with CRCs
being mistaken for relocatable symbols on kernels that self relocate at
runtime (i.e., powerpc with CONFIG_RELOCATABLE=y)
For the kbuild side of things, this comes down to the following:
- introducing a Kconfig symbol MODULE_REL_CRCS
- adding a -R switch to genksyms to instruct it to emit the CRC symbols
as references into the .rodata section
- making modpost distinguish such references from absolute CRC symbols
by the section index (SHN_ABS)
- making kallsyms disregard non-absolute symbols with a __crc_ prefix
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Airlie [Thu, 2 Feb 2017 23:10:08 +0000 (09:10 +1000)]
Merge branch 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
two amd fixes.
* 'drm-fixes-4.10' of git://people.freedesktop.org/~agd5f/linux:
drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctl
drm/amdgpu/si: fix crash on headless asics
Dave Airlie [Thu, 2 Feb 2017 23:09:36 +0000 (09:09 +1000)]
Merge tag 'topic/vma-fix-for-4.10-2017-02-02' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
here's Maarten's backport of the vma fixes for v4.10.
* tag 'topic/vma-fix-for-4.10-2017-02-02' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Track pinned vma in intel_plane_state
drm/atomic: Unconditionally call prepare_fb.
Linus Torvalds [Thu, 2 Feb 2017 22:08:58 +0000 (14:08 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- two microcode loader fixes
- two FPU xstate handling fixes
- an MCE timer handling related crash fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Make timer handling more robust
x86/microcode: Do not access the initrd after it has been freed
x86/fpu/xstate: Fix xcomp_bv in XSAVES header
x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
x86/microcode/intel: Drop stashed AP patch pointer optimization
Linus Torvalds [Thu, 2 Feb 2017 21:30:19 +0000 (13:30 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Five kernel fixes:
- an mmap tracing ABI fix for certain mappings
- a use-after-free fix, found via KASAN
- three CPU hotplug related x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Make package handling more robust
perf/x86/intel/uncore: Clean up hotplug conversion fallout
perf/x86/intel/rapl: Make package handling more robust
perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory
perf/core: Fix use-after-free bug
Linus Torvalds [Thu, 2 Feb 2017 21:20:23 +0000 (13:20 -0800)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Two EFI boot fixes, one for arm64 and one for x86 systems with certain
firmware versions"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/fdt: Avoid FDT manipulation after ExitBootServices()
x86/efi: Always map the first physical page into the EFI pagetables
Linus Torvalds [Thu, 2 Feb 2017 20:54:45 +0000 (12:54 -0800)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull objtool fix from Ingo Molnar:
"A fix for a bad opcode in objtool's instruction decoder"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix IRET's opcode
Linus Torvalds [Thu, 2 Feb 2017 20:49:58 +0000 (12:49 -0800)]
Merge tag 'nfsd-4.10-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Three more miscellaneous nfsd bugfixes"
* tag 'nfsd-4.10-2' of git://linux-nfs.org/~bfields/linux:
svcrpc: fix oops in absence of krb5 module
nfsd: special case truncates some more
NFSD: Fix a null reference case in find_or_create_lock_stateid()
Linus Torvalds [Thu, 2 Feb 2017 20:39:10 +0000 (12:39 -0800)]
Merge tag 'xtensa-
20170202' of git://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa fix from Max Filippov:
"A for an Xtensa build error introduced in reset code refactoring
series in v4.9:
- fix noMMU build on cores with MMU"
* tag 'xtensa-
20170202' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix noMMU build on cores with MMU
Linus Torvalds [Thu, 2 Feb 2017 20:34:27 +0000 (12:34 -0800)]
Merge tag 'pci-v4.10-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Configure ASPM on the link from a PCI-to-PCIe bridge (avoids a NULL
pointer dereference on topologies including these bridges)"
* tag 'pci-v4.10-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/ASPM: Handle PCI-to-PCIe bridges as roots of PCIe hierarchies
Michel Dänzer [Mon, 30 Jan 2017 03:06:35 +0000 (12:06 +0900)]
drm/radeon: Fix vram_size/visible values in DRM_RADEON_GEM_INFO ioctl
vram_size is supposed to be the total amount of VRAM that can be used by
userspace, which corresponds to the TTM VRAM manager size (which is
normally the full amount of VRAM, but can be just the visible VRAM when
DMA can't be used for BO migration for some reason).
The above was incorrectly used for vram_visible before, resulting in
generally too large values being reported.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Fri, 27 Jan 2017 15:31:52 +0000 (10:31 -0500)]
drm/amdgpu/si: fix crash on headless asics
Missing check for crtcs present.
Fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=193341
https://bugs.freedesktop.org/show_bug.cgi?id=99387
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Arnd Bergmann [Wed, 1 Feb 2017 16:57:56 +0000 (17:57 +0100)]
tracing/kprobes: Fix __init annotation
clang complains about "__init" being attached to a struct name:
kernel/trace/trace_kprobe.c:1375:15: error: '__section__' attribute only applies to functions and global variables
The intention must have been to mark the function as __init instead of
the type, so move the attribute there.
Link: http://lkml.kernel.org/r/20170201165826.2625888-1-arnd@arndb.de
Fixes: f18f97ac43d7 ("tracing/kprobes: Add a helper method to return number of probe hits")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Ard Biesheuvel [Wed, 1 Feb 2017 17:45:02 +0000 (17:45 +0000)]
efi/fdt: Avoid FDT manipulation after ExitBootServices()
Some AArch64 UEFI implementations disable the MMU in ExitBootServices(),
after which unaligned accesses to RAM are no longer supported.
Commit:
abfb7b686a3e ("efi/libstub/arm*: Pass latest memory map to the kernel")
fixed an issue in the memory map handling of the stub FDT code, but
inadvertently created an issue with such firmware, by moving some
of the FDT manipulation to after the invocation of ExitBootServices().
Given that the stub's libfdt implementation uses the ordinary, accelerated
string functions, which rely on hardware handling of unaligned accesses,
manipulating the FDT with the MMU off may result in alignment faults.
So fix the situation by moving the update_fdt_memmap() call into the
callback function invoked by efi_exit_boot_services() right before it
calls the ExitBootServices() UEFI service (which is arguably a better
place for it anyway)
Note that disabling the MMU in ExitBootServices() is not compliant with
the UEFI spec, and carries great risk due to the fact that switching from
cached to uncached memory accesses halfway through compiler generated code
(i.e., involving a stack) can never be done in a way that is architecturally
safe.
Fixes: abfb7b686a3e ("efi/libstub/arm*: Pass latest memory map to the kernel")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: linux-efi@vger.kernel.org
Cc: matt@codeblueprint.co.uk
Cc: leif.lindholm@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1485971102-23330-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 1 Feb 2017 19:52:27 +0000 (11:52 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix handling of interrupt status in stmmac driver. Just because we
have masked the event from generating interrupts, doesn't mean the
bit won't still be set in the interrupt status register. From Alexey
Brodkin.
2) Fix DMA API debugging splats in gianfar driver, from Arseny Solokha.
3) Fix off-by-one error in __ip6_append_data(), from Vlad Yasevich.
4) cls_flow does not match on icmpv6 codes properly, from Simon Horman.
5) Initial MAC address can be set incorrectly in some scenerios, from
Ivan Vecera.
6) Packet header pointer arithmetic fix in ip6_tnl_parse_tlv_end_lim(),
from Dan Carpenter.
7) Fix divide by zero in __tcp_select_window(), from Eric Dumazet.
8) Fix crash in iwlwifi when unregistering thermal zone, from Jens
Axboe.
9) Check for DMA mapping errors in starfire driver, from Alexey
Khoroshilov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
tcp: fix 0 divide in __tcp_select_window()
ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()
net: fix ndo_features_check/ndo_fix_features comment ordering
net/sched: matchall: Fix configuration race
be2net: fix initial MAC setting
ipv6: fix flow labels when the traffic class is non-0
net: thunderx: avoid dereferencing xcv when NULL
net/sched: cls_flower: Correct matching on ICMPv6 code
ipv6: Paritially checksum full MTU frames
net/mlx4_core: Avoid command timeouts during VF driver device shutdown
gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
net: ethtool: add support for 2500BaseT and 5000BaseT link modes
can: bcm: fix hrtimer/tasklet termination in bcm op removal
net: adaptec: starfire: add checks for dma mapping errors
net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
can: Fix kernel panic at security_sock_rcv_skb
net: macb: Fix 64 bit addressing support for GEM
stmmac: Discard masked flags in interrupt status register
net/mlx5e: Check ets capability before ets query FW command
net/mlx5e: Fix update of hash function/key via ethtool
...
Linus Torvalds [Wed, 1 Feb 2017 18:30:56 +0000 (10:30 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull fscache fixes from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fscache: Fix dead object requeue
fscache: Clear outstanding writes when disabling a cookie
FS-Cache: Initialise stores_lock in netfs cookie
Eric Dumazet [Wed, 1 Feb 2017 16:33:53 +0000 (08:33 -0800)]
tcp: fix 0 divide in __tcp_select_window()
syszkaller fuzzer was able to trigger a divide by zero, when
TCP window scaling is not enabled.
SO_RCVBUF can be used not only to increase sk_rcvbuf, also
to decrease it below current receive buffers utilization.
If mss is negative or 0, just return a zero TCP window.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 1 Feb 2017 08:46:32 +0000 (11:46 +0300)]
ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()
Casting is a high precedence operation but "off" and "i" are in terms of
bytes so we need to have some parenthesis here.
Fixes: fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Feb 2017 17:24:00 +0000 (09:24 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a bug in CBC/CTR on ARM64 that breaks chaining as well as a
bug in the core API that causes registration failures when a driver
unloads and then reloads an algorithm"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: arm64/aes-blk - honour iv_out requirement in CBC and CTR modes
crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg
Linus Torvalds [Wed, 1 Feb 2017 17:22:08 +0000 (09:22 -0800)]
Merge tag 'dmaengine-fix-4.10-rc7' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"A couple of fixes showed up late in the cycle so sending them up and
sending early in the week and not on Friday :).
They fix a double lock in pl330 driver and runtime pm fixes for cppi
driver"
* tag 'dmaengine-fix-4.10-rc7' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: pl330: fix double lock
dmaengine: cppi41: Clean up pointless warnings
dmaengine: cppi41: Fix oops in cppi41_runtime_resume
dmaengine: cppi41: Fix runtime PM timeouts with USB mass storage
Dimitris Michailidis [Wed, 1 Feb 2017 00:03:13 +0000 (16:03 -0800)]
net: fix ndo_features_check/ndo_fix_features comment ordering
Commit
cdba756f5803a2 ("net: move ndo_features_check() close to
ndo_start_xmit()") inadvertently moved the doc comment for
.ndo_fix_features instead of .ndo_features_check. Fix the comment
ordering.
Fixes: cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()")
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yotam Gigi [Tue, 31 Jan 2017 13:14:29 +0000 (15:14 +0200)]
net/sched: matchall: Fix configuration race
In the current version, the matchall internal state is split into two
structs: cls_matchall_head and cls_matchall_filter. This makes little
sense, as matchall instance supports only one filter, and there is no
situation where one exists and the other does not. In addition, that led
to some races when filter was deleted while packet was processed.
Unify that two structs into one, thus simplifying the process of matchall
creation and deletion. As a result, the new, delete and get callbacks have
a dummy implementation where all the work is done in destroy and change
callbacks, as was done in cls_cgroup.
Fixes: bf3994d2ed31 ("net/sched: introduce Match-all classifier")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Feb 2017 16:34:13 +0000 (08:34 -0800)]
Merge tag 'pinctrl-v4.10-4' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Another week, another set of pin control fixes. The subsystem has seen
high patch-spot activity recently.
The majority of the patches are for Intel, I vaguely think it mostly
concern phones, tablets and maybe chromebooks and even laptops with
this Intel Atom family chips.
Driver fixes only:
- one fix to the Berlin driver making the SD card work fully again.
- one fix to the Allwinner/sunxi bias function: one premature change
needs to be partially reverted.
- the remaining four patches are to Intel embedded SoCs: baytrail
(three patches) and merrifield (one patch): register access
debounce fixes and a missing spinlock"
* tag 'pinctrl-v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: baytrail: Add missing spinlock usage in byt_gpio_irq_handler
pinctrl: baytrail: Debounce register is one per community
pinctrl: baytrail: Rectify debounce support (part 2)
pinctrl: intel: merrifield: Add missed check in mrfld_config_set()
pinctrl: sunxi: Don't enforce bias disable (for now)
pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
Ivan Vecera [Tue, 31 Jan 2017 19:01:31 +0000 (20:01 +0100)]
be2net: fix initial MAC setting
Recent commit
34393529163a ("be2net: fix MAC addr setting on privileged
BE3 VFs") allows privileged BE3 VFs to set its MAC address during
initialization. Although the initial MAC for such VFs is already
programmed by parent PF the subsequent setting performed by VF is OK,
but in certain cases (after fresh boot) this command in VF can fail.
The MAC should be initialized only when:
1) no MAC is programmed (always except BE3 VFs during first init)
2) programmed MAC is different from requested (e.g. MAC is set when
interface is down). In this case the initial MAC programmed by PF
needs to be deleted.
The adapter->dev_mac contains MAC address currently programmed in HW so
it should be zeroed when the MAC is deleted from HW and should not be
filled when MAC is set when interface is down in be_mac_addr_set() as
no programming is performed in this case.
Example of failure without the fix (immediately after fresh boot):
# ip link set eth0 up <- eth0 is BE3 PF
be2net 0000:01:00.0 eth0: Link is Up
# echo 1 > /sys/class/net/eth0/device/sriov_numvfs <- Create 1 VF
...
be2net 0000:01:04.0: Emulex OneConnect(be3): VF port 0
# ip link set eth8 up <- eth8 is created privileged VF
be2net 0000:01:04.0: opcode 59-1 failed:status 1-76
RTNETLINK answers: Input/output error
# echo 0 > /sys/class/net/eth0/device/sriov_numvfs <- Delete VF
iommu: Removing device 0000:01:04.0 from group 33
...
# echo 1 > /sys/class/net/eth0/device/sriov_numvfs <- Create it again
iommu: Removing device 0000:01:04.0 from group 33
...
# ip link set eth8 up
be2net 0000:01:04.0 eth8: Link is Up
Initialization is now OK.
v2 - Corrected the comment and condition check suggested by Suresh & Harsha
Fixes: 34393529163a ("be2net: fix MAC addr setting on privileged BE3 VFs")
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Wilson [Tue, 31 Jan 2017 09:21:31 +0000 (10:21 +0100)]
drm/i915: Track pinned vma in intel_plane_state
With atomic plane states we are able to track an allocation right from
preparation, during use and through to the final free after being
swapped out for a new plane. We can couple the VMA we pin for the
framebuffer (and its rotation) to this lifetime and avoid all the clumsy
lookups in between.
v2: Remove residual vma on plane cleanup (Chris)
v3: Add a description for the vma destruction in
intel_plane_destroy_state (Maarten)
References: https://bugs.freedesktop.org/show_bug.cgi?id=98829
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170116152131.18089-1-chris@chris-wilson.co.uk
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
(cherry picked from commit
be1e341513ca23b0668b7b0f26fa6e2ffc46ba20)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485854491-27389-3-git-send-email-maarten.lankhorst@linux.intel.com
Maarten Lankhorst [Tue, 31 Jan 2017 09:21:30 +0000 (10:21 +0100)]
drm/atomic: Unconditionally call prepare_fb.
Atomic drivers may set properties like rotation on the same fb, which
may require a call to prepare_fb even when framebuffer stays identical.
Instead of handling all the special cases in the core, let the driver
decide when prepare_fb and cleanup_fb are noops.
This is a revert of:
commit
fcc60b413d14dd06ddbd79ec50e83c4fb2a097ba
Author: Keith Packard <keithp@keithp.com>
Date: Sat Jun 4 01:16:22 2016 -0700
drm: Don't prepare or cleanup unchanging frame buffers [v3]
The original commit mentions that this prevents waiting in i915 on all
previous rendering during cursor updates, but there are better ways to
fix this.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/6d82f9b6-9d16-91d1-d176-4a37b09afc44@linux.intel.com
(cherry picked from commit
0532be078a207d7dd6ad26ebd0834e258acc4ee7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485854491-27389-2-git-send-email-maarten.lankhorst@linux.intel.com
Thomas Gleixner [Tue, 31 Jan 2017 22:58:40 +0000 (23:58 +0100)]
perf/x86/intel/uncore: Make package handling more robust
The package management code in uncore relies on package mapping being
available before a CPU is started. This changed with:
9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
because the ACPI/BIOS information turned out to be unreliable, but that
left uncore in broken state. This was not noticed because on a regular boot
all CPUs are online before uncore is initialized.
Move the allocation to the CPU online callback and simplify the hotplug
handling. At this point the package mapping is established and correct.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 31 Jan 2017 22:58:39 +0000 (23:58 +0100)]
perf/x86/intel/uncore: Clean up hotplug conversion fallout
The recent conversion to the hotplug state machine kept two mechanisms from
the original code:
1) The first_init logic which adds the number of online CPUs in a package
to the refcount. That's wrong because the callbacks are executed for
all online CPUs.
Remove it so the refcounting is correct.
2) The on_each_cpu() call to undo box->init() in the error handling
path. That's bogus because when the prepare callback fails no box has
been initialized yet.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Fixes: 1a246b9f58c6 ("perf/x86/intel/uncore: Convert to hotplug state machine")
Link: http://lkml.kernel.org/r/20170131230141.298032324@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 31 Jan 2017 22:58:38 +0000 (23:58 +0100)]
perf/x86/intel/rapl: Make package handling more robust
The package management code in RAPL relies on package mapping being
available before a CPU is started. This changed with:
9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
because the ACPI/BIOS information turned out to be unreliable, but that
left RAPL in broken state. This was not noticed because on a regular boot
all CPUs are online before RAPL is initialized.
A possible fix would be to reintroduce the mess which allocates a package
data structure in CPU prepare and when it turns out to already exist in
starting throw it away later in the CPU online callback. But that's a
horrible hack and not required at all because RAPL becomes functional for
perf only in the CPU online callback. That's correct because user space is
not yet informed about the CPU being onlined, so nothing caan rely on RAPL
being available on that particular CPU.
Move the allocation to the CPU online callback and simplify the hotplug
handling. At this point the package mapping is established and correct.
This also adds a missing check for available package data in the
event_init() function.
Reported-by: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
Link: http://lkml.kernel.org/r/20170131230141.212593966@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Max Filippov [Wed, 1 Feb 2017 02:35:37 +0000 (18:35 -0800)]
xtensa: fix noMMU build on cores with MMU
Commit
bf15f86b343ed8 ("xtensa: initialize MMU before jumping to reset
vector") calls MMU management functions even when CONFIG_MMU is not
selected. That breaks noMMU build on cores with MMU.
Don't manage MMU when CONFIG_MMU is not selected.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Linus Torvalds [Wed, 1 Feb 2017 00:32:40 +0000 (16:32 -0800)]
Merge tag 'trace-4.10-rc2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"It was reported to me that the thread created by the hwlat tracer does
not migrate after the first instance. I found that there was as small
bug in the logic, and fixed it. It's minor, but should be fixed
regardless. There's not much impact outside the hwlat tracer"
* tag 'trace-4.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix hwlat kthread migration
Dave Airlie [Tue, 31 Jan 2017 22:45:27 +0000 (08:45 +1000)]
Merge tag 'drm-misc-fixes-2017-01-31' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
2 patches to fix the oops Dave Hanse reported, plus a double kfree fix
Maarten discovered while backporting the fix for Linus.
For Linus' vma tracking oops the plan is to send you a dedicated pull with
the 2 patches we need, but since it's tricky we're letting CI beat on it a
bit more.
* tag 'drm-misc-fixes-2017-01-31' of git://anongit.freedesktop.org/git/drm-misc:
drm/atomic: Fix double free in drm_atomic_state_default_clear
drm: Don't race connector registration
drm: prevent double-(un)registration for connectors
Linus Torvalds [Tue, 31 Jan 2017 21:59:10 +0000 (13:59 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"A fix for a crash in the wm97xx driver and synaptics-rmi4 will stop
throwing erroneous warnings."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - fix reversed conditions in enable/disable_irq_wake
Input: wm97xx - make missing platform data non-fatal
Linus Torvalds [Tue, 31 Jan 2017 21:54:41 +0000 (13:54 -0800)]
Merge branch 'for-4.10-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"The cgroup creation path was getting the order of operations wrong and
exposing cgroups which don't have their names set yet to controllers
which can lead to NULL derefs.
This contains the fix for the bug"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: don't online subsystems before cgroup_name/path() are operational
Linus Torvalds [Tue, 31 Jan 2017 21:10:59 +0000 (13:10 -0800)]
Merge branch 'for-4.10-fixes' of git://git./linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
"Douglas found and fixed a ref leak bug in percpu_ref_tryget[_live]().
The bug is caused by storing the return value of atomic_long_inc_not_zero()
into an int temp variable before returning it as a bool. The interim
cast to int loses the upper bits and can lead to false negatives. As
percpu_ref uses a high bit to mark a draining counter, this can happen
relatively easily.
Fixed by using bool for the temp variable"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-refcount: fix reference leak during percpu-atomic transition
Linus Torvalds [Tue, 31 Jan 2017 21:07:04 +0000 (13:07 -0800)]
Merge branch 'for-4.10-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Three libata fixes: an error handling fix, blacklist addition for
another fallout from upping the default max sectors, and fix for a
sense data reporting bug which affects new harddrives which can report
sense data"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: sata_mv:- Handle return value of devm_ioremap.
libata: Fix ATA request sense
libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices
Linus Torvalds [Tue, 31 Jan 2017 21:05:15 +0000 (13:05 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- regression fix (sleeping while atomic) for cp2112, from Johan Hovold
- regression fix for proximity handling under certain circumstances in
Wacom driver, from Jason Gerecke
- functional fix for Logitech Rumblepad 2, from Ardinartsev Nikita
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: cp2112: fix gpio-callback error handling
HID: cp2112: fix sleep-while-atomic
HID: hid-lg: Fix immediate disconnection of Logitech Rumblepad 2
HID: usbhid: Quirk a AMI virtual mouse and keyboard with ALWAYS_POLL
HID: wacom: Fix poor prox handling in 'wacom_pl_irq'
Thomas Gleixner [Tue, 31 Jan 2017 08:37:34 +0000 (09:37 +0100)]
x86/mce: Make timer handling more robust
Erik reported that on a preproduction hardware a CMCI storm triggers the
BUG_ON in add_timer_on(). The reason is that the per CPU MCE timer is
started by the CMCI logic before the MCE CPU hotplug callback starts the
timer with add_timer_on(). So the timer is already queued which triggers
the BUG.
Using add_timer_on() is pretty pointless in this code because the timer is
strictlty per CPU, initialized as pinned and all operations which arm the
timer happen on the CPU to which the timer belongs.
Simplify the whole machinery by using mod_timer() instead of add_timer_on()
which avoids the problem because mod_timer() can handle already queued
timers. Use __start_timer() everywhere so the earliest armed expiry time is
preserved.
Reported-by: Erik Veijola <erik.veijola@intel.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701310936080.3457@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 31 Jan 2017 20:36:39 +0000 (12:36 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French:
"A small cifs fix for stable"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: initialize file_info_lock
Dave Airlie [Tue, 31 Jan 2017 19:28:14 +0000 (05:28 +1000)]
Merge branch 'linux-4.10' of git://github.com/skeggsb/linux into drm-fixes
Just a couple of minor race/regression fixes, nothing exciting, but
somewhat important
* 'linux-4.10' of git://github.com/skeggsb/linux:
drm/nouveau/kms/nv50: request vblank events for commits that send completion events
drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215
drm/nouveau/nouveau/led: prevent compiling the led-code if nouveau=y and leds=m
drm/nouveau/disp/mcp7x: disable dptmds workaround
drm/nouveau: prevent userspace from deleting client object
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
David Howells [Tue, 31 Jan 2017 09:45:28 +0000 (09:45 +0000)]
fscache: Fix dead object requeue
Under some circumstances, an fscache object can become queued such that it
fscache_object_work_func() can be called once the object is in the
OBJECT_DEAD state. This results in the kernel oopsing when it tries to
invoke the handler for the state (which is hard coded to 0x2).
The way this comes about is something like the following:
(1) The object dispatcher is processing a work state for an object. This
is done in workqueue context.
(2) An out-of-band event comes in that isn't masked, causing the object to
be queued, say EV_KILL.
(3) The object dispatcher finishes processing the current work state on
that object and then sees there's another event to process, so,
without returning to the workqueue core, it processes that event too.
It then follows the chain of events that initiates until we reach
OBJECT_DEAD without going through a wait state (such as
WAIT_FOR_CLEARANCE).
At this point, object->events may be 0, object->event_mask will be 0
and oob_event_mask will be 0.
(4) The object dispatcher returns to the workqueue processor, and in due
course, this sees that the object's work item is still queued and
invokes it again.
(5) The current state is a work state (OBJECT_DEAD), so the dispatcher
jumps to it - resulting in an OOPS.
When I'm seeing this, the work state in (1) appears to have been either
LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
fscache_osm_lookup_oob).
The window for (2) is very small:
(A) object->event_mask is cleared whilst the event dispatch process is
underway - though there's no memory barrier to force this to the top
of the function.
The window, therefore is from the time the object was selected by the
workqueue processor and made requeueable to the time the mask was
cleared.
(B) fscache_raise_event() will only queue the object if it manages to set
the event bit and the corresponding event_mask bit was set.
The enqueuement is then deferred slightly whilst we get a ref on the
object and get the per-CPU variable for workqueue congestion. This
slight deferral slightly increases the probability by allowing extra
time for the workqueue to make the item requeueable.
Handle this by giving the dead state a processor function and checking the
for the dead state address rather than seeing if the processor function is
address 0x2. The dead state processor function can then set a flag to
indicate that it's occurred and give a warning if it occurs more than once
per object.
If this race occurs, an oops similar to the following is seen (note the RIP
value):
BUG: unable to handle kernel NULL pointer dereference at
0000000000000002
IP: [<
0000000000000002>] 0x1
PGD 0
Oops: 0010 [#1] SMP
Modules linked in: ...
CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
Workqueue: fscache_object fscache_object_work_func [fscache]
task:
ffff880302b63980 ti:
ffff880717544000 task.ti:
ffff880717544000
RIP: 0010:[<
0000000000000002>] [<
0000000000000002>] 0x1
RSP: 0018:
ffff880717547df8 EFLAGS:
00010202
RAX:
ffffffffa0368640 RBX:
ffff880edf7a4480 RCX:
dead000000200200
RDX:
0000000000000002 RSI:
00000000ffffffff RDI:
ffff880edf7a4480
RBP:
ffff880717547e18 R08:
0000000000000000 R09:
dfc40a25cb3a4510
R10:
dfc40a25cb3a4510 R11:
0000000000000400 R12:
0000000000000000
R13:
ffff880edf7a4510 R14:
ffff8817f6153400 R15:
0000000000000600
FS:
0000000000000000(0000) GS:
ffff88181f420000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000002 CR3:
000000000194a000 CR4:
00000000001407e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Stack:
ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
Call Trace:
[<
ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
[<
ffffffff8109d5db>] process_one_work+0x17b/0x470
[<
ffffffff8109e4ac>] worker_thread+0x21c/0x400
[<
ffffffff8109e290>] ? rescuer_thread+0x400/0x400
[<
ffffffff810a5acf>] kthread+0xcf/0xe0
[<
ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[<
ffffffff816460d8>] ret_from_fork+0x58/0x90
[<
ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeremy McNicoll <jeremymc@redhat.com>
Tested-by: Frank Sorenson <sorenson@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Wed, 18 Jan 2017 14:29:25 +0000 (14:29 +0000)]
fscache: Clear outstanding writes when disabling a cookie
fscache_disable_cookie() needs to clear the outstanding writes on the
cookie it's disabling because they cannot be completed after.
Without this, fscache_nfs_open_file() gets stuck because it disables the
cookie when the file is opened for writing but can't uncache the pages till
afterwards - otherwise there's a race between the open routine and anyone
who already has it open R/O and is still reading from it.
Looking in /proc/pid/stack of the offending process shows:
[<
ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache]
[<
ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache]
[<
ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs]
[<
ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4]
[<
ffffffff8117350e>] do_dentry_open+0x16d/0x2b7
[<
ffffffff811743ac>] vfs_open+0x5c/0x65
[<
ffffffff81184185>] path_openat+0x785/0x8fb
[<
ffffffff81184343>] do_filp_open+0x48/0x9e
[<
ffffffff81174710>] do_sys_open+0x13b/0x1cb
[<
ffffffff811747b9>] SyS_open+0x19/0x1b
[<
ffffffff81001c44>] do_syscall_64+0x80/0x17a
[<
ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a
[<
ffffffffffffffff>] 0xffffffffffffffff
Reported-by: Jianhong Yin <jiyin@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Wed, 18 Jan 2017 14:29:17 +0000 (14:29 +0000)]
FS-Cache: Initialise stores_lock in netfs cookie
Initialise the stores_lock in fscache netfs cookies. Technically, it
shouldn't be necessary, since the netfs cookie is an index and stores no
data, but initialising it anyway adds insignificant overhead.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dimitris Michailidis [Mon, 30 Jan 2017 22:09:42 +0000 (14:09 -0800)]
ipv6: fix flow labels when the traffic class is non-0
ip6_make_flowlabel() determines the flow label for IPv6 packets. It's
supposed to be passed a flow label, which it returns as is if non-0 and
in some other cases, otherwise it calculates a new value.
The problem is callers often pass a flowi6.flowlabel, which may also
contain traffic class bits. If the traffic class is non-0
ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and
returns the whole thing. Thus it can return a 'flow label' longer than
20b and the low 20b of that is typically 0 resulting in packets with 0
label. Moreover, different packets of a flow may be labeled differently.
For a TCP flow with ECN non-payload and payload packets get different
labels as exemplified by this pair of consecutive packets:
(pure ACK)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
.... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49
Payload Length: 32
Next Header: TCP (6)
(payload)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
.... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000
Payload Length: 688
Next Header: TCP (6)
This patch allows ip6_make_flowlabel() to be passed more than just a
flow label and has it extract the part it really wants. This was simpler
than modifying the callers. With this patch packets like the above become
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
.... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
Payload Length: 32
Next Header: TCP (6)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
.... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
Payload Length: 688
Next Header: TCP (6)
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent [Mon, 30 Jan 2017 14:06:43 +0000 (15:06 +0100)]
net: thunderx: avoid dereferencing xcv when NULL
This fixes the following smatch and coccinelle warnings:
drivers/net/ethernet/cavium/thunder/thunder_xcv.c:119 xcv_setup_link() error: we previously assumed 'xcv' could be null (see line 118) [smatch]
drivers/net/ethernet/cavium/thunder/thunder_xcv.c:119:16-20: ERROR: xcv is NULL but dereferenced. [coccinelle]
Fixes: 6465859aba1e66a5 ("net: thunderx: Add RGMII interface type support")
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Cc: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
J. Bruce Fields [Tue, 31 Jan 2017 16:37:50 +0000 (11:37 -0500)]
svcrpc: fix oops in absence of krb5 module
Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."
The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize. Fix that.
[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task:
ffff8800776f95c0 task.stack:
ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946] ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901] gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017] svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257] ? __enqueue_entity+0x6c/0x70
[120408.589101] svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212] ? try_to_wake_up+0x4a/0x360
[120408.591036] ? wake_up_process+0x15/0x20
[120408.592093] ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177] svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168] svc_process_common+0x203/0x710 [sunrpc]
[120408.595220] svc_process+0x105/0x1c0 [sunrpc]
[120408.596278] nfsd+0xe9/0x160 [nfsd]
[120408.597060] kthread+0x101/0x140
[120408.597734] ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626] ? kthread_park+0x90/0x90
[120408.599448] ret_from_fork+0x22/0x30
Fixes: 1d658336b05f "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: stable@vger.kernel.org
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Christoph Hellwig [Tue, 24 Jan 2017 08:22:41 +0000 (09:22 +0100)]
nfsd: special case truncates some more
Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributs to set to set various file attributes including the
file size and the uid/gid.
The Linux syscalls never mixes size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates might not update random other attributes, and many other
file systems handle the case but do not update the different attributes
in the same transaction. NFSD on the other hand passes the attributes
it gets on the wire more or less directly through to the VFS, leading to
updates the file systems don't expect. XFS at least has an assert on
the allowed attributes, which caught an unusual NFS client setting the
size and group at the same time.
To handle this issue properly this switches nfsd to call vfs_truncate
for size changes, and then handle all other attributes through
notify_change. As a side effect this also means less boilerplace code
around the size change as we can now reuse the VFS code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Kinglong Mee [Wed, 18 Jan 2017 11:04:42 +0000 (19:04 +0800)]
NFSD: Fix a null reference case in find_or_create_lock_stateid()
nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().
If nfsd doesn't go through init_lock_stateid() and put stateid at end,
there is a NULL reference to .sc_free when calling nfs4_put_stid(ns).
This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid().
Cc: stable@vger.kernel.org
Fixes: 356a95ece7aa "nfsd: clean up races in lock stateid searching..."
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Steven Rostedt (VMware) [Tue, 31 Jan 2017 00:27:10 +0000 (19:27 -0500)]
tracing: Fix hwlat kthread migration
The hwlat tracer creates a kernel thread at start of the tracer. It is
pinned to a single CPU and will move to the next CPU after each period of
running. If the user modifies the migration thread's affinity, it will not
change after that happens.
The original code created the thread at the first instance it was called,
but later was changed to destroy the thread after the tracer was finished,
and would not be created until the next instance of the tracer was
established. The code that initialized the affinity was only called on the
initial instantiation of the tracer. After that, it was not initialized, and
the previous affinity did not match the current newly created one, making
it appear that the user modified the thread's affinity when it did not, and
the thread failed to migrate again.
Cc: stable@vger.kernel.org
Fixes: 0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Maarten Lankhorst [Tue, 31 Jan 2017 09:25:25 +0000 (10:25 +0100)]
drm/atomic: Fix double free in drm_atomic_state_default_clear
drm_atomic_helper_page_flip and drm_atomic_ioctl set their own events
in crtc_state->event. But when it's set the event is freed in 2 places.
Solve this by only freeing the event in the atomic ioctl when it
allocated its own event.
This has been broken twice. The first time when the code was introduced,
but only in the corner case when an event is allocated, but more crtc's
were included by atomic check and then failing. This can mostly
happen when you do an atomic modeset in i915 and the display clock is
changed, which forces all crtc's to be included to the state.
This has been broken worse by adding in-fences support, which caused
the double free to be done unconditionally.
[IGT] kms_rotation_crc: starting subtest primary-rotation-180
=============================================================================
BUG kmalloc-128 (Tainted: G U ): Object already free
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in drm_atomic_helper_setup_commit+0x285/0x2f0 [drm_kms_helper] age=0 cpu=3 pid=1529
___slab_alloc+0x308/0x3b0
__slab_alloc+0xd/0x20
kmem_cache_alloc_trace+0x92/0x1c0
drm_atomic_helper_setup_commit+0x285/0x2f0 [drm_kms_helper]
intel_atomic_commit+0x35/0x4f0 [i915]
drm_atomic_commit+0x46/0x50 [drm]
drm_mode_atomic_ioctl+0x7d4/0xab0 [drm]
drm_ioctl+0x2b3/0x490 [drm]
do_vfs_ioctl+0x69c/0x700
SyS_ioctl+0x4e/0x80
entry_SYSCALL_64_fastpath+0x13/0x94
INFO: Freed in drm_event_cancel_free+0xa3/0xb0 [drm] age=0 cpu=3 pid=1529
__slab_free+0x48/0x2e0
kfree+0x159/0x1a0
drm_event_cancel_free+0xa3/0xb0 [drm]
drm_mode_atomic_ioctl+0x86d/0xab0 [drm]
drm_ioctl+0x2b3/0x490 [drm]
do_vfs_ioctl+0x69c/0x700
SyS_ioctl+0x4e/0x80
entry_SYSCALL_64_fastpath+0x13/0x94
INFO: Slab 0xffffde1f0997b080 objects=17 used=2 fp=0xffff92fb65ec2578 flags=0x200000000008101
INFO: Object 0xffff92fb65ec2578 @offset=1400 fp=0xffff92fb65ec2ae8
Redzone
ffff92fb65ec2570: bb bb bb bb bb bb bb bb ........
Object
ffff92fb65ec2578: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec2588: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec2598: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec25a8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec25b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec25c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec25d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object
ffff92fb65ec25e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
Redzone
ffff92fb65ec25f8: bb bb bb bb bb bb bb bb ........
Padding
ffff92fb65ec2738: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
CPU: 3 PID: 180 Comm: kworker/3:2 Tainted: G BU 4.10.0-rc6-patser+ #5039
Hardware name: /NUC5PPYB, BIOS PYBSWCEL.86A.0031.2015.0601.1712 06/01/2015
Workqueue: events intel_atomic_helper_free_state [i915]
Call Trace:
dump_stack+0x4d/0x6d
print_trailer+0x20c/0x220
free_debug_processing+0x1c6/0x330
? drm_atomic_state_default_clear+0xf7/0x1c0 [drm]
__slab_free+0x48/0x2e0
? drm_atomic_state_default_clear+0xf7/0x1c0 [drm]
kfree+0x159/0x1a0
drm_atomic_state_default_clear+0xf7/0x1c0 [drm]
? drm_atomic_state_clear+0x30/0x30 [drm]
intel_atomic_state_clear+0xd/0x20 [i915]
drm_atomic_state_clear+0x1a/0x30 [drm]
__drm_atomic_state_free+0x13/0x60 [drm]
intel_atomic_helper_free_state+0x5d/0x70 [i915]
process_one_work+0x260/0x4a0
worker_thread+0x2d1/0x4f0
kthread+0x127/0x130
? process_one_work+0x4a0/0x4a0
? kthread_stop+0x120/0x120
ret_from_fork+0x29/0x40
FIX kmalloc-128: Object at 0xffff92fb65ec2578 not freed
Fixes: 3b24f7d67581 ("drm/atomic: Add struct drm_crtc_commit to track async updates")
Fixes: 9626014258a5 ("drm/fence: add in-fences support")
Cc: <stable@vger.kernel.org> # v4.8+
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1485854725-27640-1-git-send-email-maarten.lankhorst@linux.intel.com
Johan Hovold [Mon, 30 Jan 2017 10:26:39 +0000 (11:26 +0100)]
HID: cp2112: fix gpio-callback error handling
In case of a zero-length report, the gpio direction_input callback would
currently return success instead of an errno.
Fixes: 1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable")
Cc: stable <stable@vger.kernel.org> # 4.9
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Johan Hovold [Mon, 30 Jan 2017 10:26:38 +0000 (11:26 +0100)]
HID: cp2112: fix sleep-while-atomic
A recent commit fixing DMA-buffers on stack added a shared transfer
buffer protected by a spinlock. This is broken as the USB HID request
callbacks can sleep. Fix this up by replacing the spinlock with a mutex.
Fixes: 1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable")
Cc: stable <stable@vger.kernel.org> # 4.9
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Ben Skeggs [Mon, 23 Jan 2017 23:32:26 +0000 (09:32 +1000)]
drm/nouveau/kms/nv50: request vblank events for commits that send completion events
This somehow fixes an issue where sync-to-vblank longer works correctly
after resume from suspend.
From a HW perspective, we don't need the IRQs turned on to be able to
detect flip completion, so it's assumed that this is required for the
voodoo in the core DRM vblank core.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ilia Mirkin [Fri, 20 Jan 2017 03:56:30 +0000 (22:56 -0500)]
drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval
Based on the xf86-video-nv code, NFORCE (NV1A) and NFORCE2 (NV1F) have a
different way of retrieving clocks. See the
nv_hw.c:nForceUpdateArbitrationSettings function in the original code
for how these clocks were accessed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54587
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Alastair Bridgewater [Wed, 11 Jan 2017 20:47:18 +0000 (15:47 -0500)]
drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215
Store the ELD correctly, not just enough copies of the first byte
to pad out the given ELD size.
Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com>
Fixes: 120b0c39c756 ("drm/nv50-/disp: audit and version SOR_HDA_ELD method")
Cc: stable@vger.kernel.org # v3.17+
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Martin Peres [Wed, 7 Dec 2016 04:30:15 +0000 (06:30 +0200)]
drm/nouveau/nouveau/led: prevent compiling the led-code if nouveau=y and leds=m
The proper fix would have been to select LEDS_CLASS but this can lead
to a circular dependency, as found out by Arnd.
This patch implements Arnd's suggestion instead, at the cost of some
auto-magic for a fringe feature.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Intel's 0-DAY
Fixes: 8d021d71b324 ("drm/nouveau/drm/nouveau: add a LED driver for the NVIDIA logo")
Signed-off-by: Martin Peres <martin.peres@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 9 Jan 2017 00:22:15 +0000 (10:22 +1000)]
drm/nouveau/disp/mcp7x: disable dptmds workaround
The workaround appears to cause regressions on these boards, and from
inspection of RM traces, NVIDIA don't appear to do it on them either.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Roy Spliet <nouveau@spliet.org>
Ben Skeggs [Wed, 25 May 2016 07:11:40 +0000 (17:11 +1000)]
drm/nouveau: prevent userspace from deleting client object
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Tue, 13 Dec 2016 23:52:39 +0000 (09:52 +1000)]
drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffers
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Christophe JAILLET [Tue, 31 Jan 2017 08:47:30 +0000 (00:47 -0800)]
Input: synaptics-rmi4 - fix reversed conditions in enable/disable_irq_wake
These tests are reversed. A warning should be displayed if an error is
returned, not on success.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Linus Torvalds [Mon, 30 Jan 2017 23:47:19 +0000 (15:47 -0800)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"Several small bug fixes and tidies, along with a fix for non-resumable
memory errors triggered by userspace"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Handle PIO & MEM non-resumable errors.
sparc64: Zero pages on allocation for mondo and error queues.
sparc: Fixed typo in sstate.c. Replaced panicing with panicking
sparc: use symbolic names for tsb indexing
David S. Miller [Mon, 30 Jan 2017 22:28:22 +0000 (14:28 -0800)]
Merge branch 'sparc64-non-resumable-user-error-recovery'
Liam R. Howlett says:
====================
sparc64: Recover from userspace non-resumable PIO & MEM errors
A non-resumable error from userspace is able to cause a kernel panic or trap
loop due to the setup and handling of the queued traps once in the kernel.
This patch series addresses both of these issues.
The queues are fixed by simply zeroing the memory before use.
PIO errors from userspace will result in a SIGBUS being sent to the user
process.
The MEM errors form userspace will result in a SIGKILL and also cause the
offending pages to be claimed so they are no longer used in future tasks.
SIGKILL is used to ensure that the process does not try to coredump and result
in an attempt to read the memory again from within kernel space. Although
there is a HV call to scrub the memory (mem_scrub), there is no easy way to
guarantee that the real memory address(es) are not used by other tasks.
Clearing the error with mem_scrub would zero the memory and cause the other
processes to proceed with bad data.
The handling of other non-resumable errors remain unchanged and will cause a
panic.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Liam R. Howlett [Tue, 17 Jan 2017 15:59:03 +0000 (10:59 -0500)]
sparc64: Handle PIO & MEM non-resumable errors.
User processes trying to access an invalid memory address via PIO will
receive a SIGBUS signal instead of causing a panic. Memory errors will
receive a SIGKILL since a SIGBUS may result in a coredump which may
attempt to repeat the faulting access.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Liam R. Howlett [Tue, 17 Jan 2017 15:59:02 +0000 (10:59 -0500)]
sparc64: Zero pages on allocation for mondo and error queues.
Error queues use a non-zero first word to detect if the queues are full.
Using pages that have not been zeroed may result in false positive
overflow events. These queues are set up once during boot so zeroing
all mondo and error queue pages is safe.
Note that the false positive overflow does not always occur because the
page allocation for these queues is so early in the boot cycle that
higher number CPUs get fresh pages. It is only when traps are serviced
with lower number CPUs who were given already used pages that this issue
is exposed.
Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Mon, 30 Jan 2017 15:19:02 +0000 (16:19 +0100)]
net/sched: cls_flower: Correct matching on ICMPv6 code
When matching on the ICMPv6 code ICMPV6_CODE rather than
ICMPV4_CODE attributes should be used.
This corrects what appears to be a typo.
Sample usage:
tc qdisc add dev eth0 ingress
tc filter add dev eth0 protocol ipv6 parent ffff: flower \
indev eth0 ip_proto icmpv6 type 128 code 0 action drop
Without this change the code parameter above is effectively ignored.
Fixes: 7b684884fbfa ("net/sched: cls_flower: Support matching on ICMP type and code")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Jan 2017 21:38:39 +0000 (16:38 -0500)]
Merge tag 'linux-can-fixes-for-4.10-
20170130' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2017-01-30
this is a pull request of one patch.
The patch is by Oliver Hartkopp and fixes the hrtimer/tasklet termination in
bcm op removal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 30 Jan 2017 21:18:12 +0000 (13:18 -0800)]
Merge tag 'rtc-4.10-2' of git://git./linux/kernel/git/abelloni/linux
Pull RTC fix from Alexandre Belloni:
"A single fix for this cycle. It is worth taking it for 4.10 so that
distributions will not have CONFIG_RTC_DRV_JZ4740 switching from m to
y in their config.
Summary:
- Allow jz4740 to build as a module again by using kernel_halt()"
* tag 'rtc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: jz4740: make the driver buildable as a module again
Vlad Yasevich [Mon, 30 Jan 2017 03:52:53 +0000 (22:52 -0500)]
ipv6: Paritially checksum full MTU frames
IPv6 will mark data that is smaller that mtu - headersize as
CHECKSUM_PARTIAL, but if the data will completely fill the mtu,
the packet checksum will be computed in software instead.
Extend the conditional to include the data that fills the mtu
as well.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Mon, 30 Jan 2017 13:11:45 +0000 (15:11 +0200)]
net/mlx4_core: Avoid command timeouts during VF driver device shutdown
Some Hypervisors detach VFs from VMs by instantly causing an FLR event
to be generated for a VF.
In the mlx4 case, this will cause that VF's comm channel to be disabled
before the VM has an opportunity to invoke the VF device's "shutdown"
method.
The result is that the VF driver on the VM will experience a command
timeout during the shutdown process when the Hypervisor does not deliver
a command-completion event to the VM.
To avoid FW command timeouts on the VM when the driver's shutdown method
is invoked, we detect the absence of the VF's comm channel at the very
start of the shutdown process. If the comm-channel has already been
disabled, we cause all FW commands during the device shutdown process to
immediately return success (and thus avoid all command timeouts).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Jan 2017 20:44:05 +0000 (15:44 -0500)]
Merge tag 'mlx5-fixes-2017-01-27' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-fixes-2017-01-27
A couple of mlx5 core and ethernet driver fixes.
From Or, A couple of error return values and error handling fixes.
From Hadar, Support TC encapsulation offloads even when the mlx5e uplink
device is stacked under an upper device.
From Gal, Two patches to fix RSS hash modifications via ethtool.
From Moshe, Added a needed ets capability check.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 30 Jan 2017 20:19:23 +0000 (15:19 -0500)]
Merge tag 'wireless-drivers-for-davem-2017-01-29' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.10
Most important here are fixes to two iwlwifi crashes, but there's also
a firmware naming fix for iwlwifi and a revert of an older bcma patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arseny Solokha [Sun, 29 Jan 2017 12:52:20 +0000 (19:52 +0700)]
gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
In spite of switching to paged allocation of Rx buffers, the driver still
called dma_unmap_single() in the Rx queues tear-down path.
The DMA region unmapping code in free_skb_rx_queue() basically predates
the introduction of paged allocation to the driver. While being refactored,
it apparently hasn't reflected the change in the DMA API usage by its
counterpart gfar_new_page().
As a result, setting an interface to the DOWN state now yields the following:
# ip link set eth2 down
fsl-gianfar
ffe24000.ethernet: DMA-API: device driver frees DMA memory with wrong function [device address=0x000000001ecd0000] [size=40]
------------[ cut here ]------------
WARNING: CPU: 1 PID: 189 at lib/dma-debug.c:1123 check_unmap+0x8e0/0xa28
CPU: 1 PID: 189 Comm: ip Tainted: G O 4.9.5 #1
task:
dee73400 task.stack:
dede2000
NIP:
c02101e8 LR:
c02101e8 CTR:
c0260d74
REGS:
dede3bb0 TRAP: 0700 Tainted: G O (4.9.5)
MSR:
00021000 <CE,ME> CR:
28002222 XER:
00000000
GPR00:
c02101e8 dede3c60 dee73400 000000b6 dfbd033c dfbd36c4 1f622000 dede2000
GPR08:
00000007 c05b1634 1f622000 00000000 22002484 100a9904 00000000 00000000
GPR16:
00000000 db4c849c 00000002 db4c8480 00000001 df142240 db4c84bc 00000000
GPR24:
c0706148 c0700000 00029000 c07552e8 c07323b4 dede3cb8 c07605e0 db535540
NIP [
c02101e8] check_unmap+0x8e0/0xa28
LR [
c02101e8] check_unmap+0x8e0/0xa28
Call Trace:
[
dede3c60] [
c02101e8] check_unmap+0x8e0/0xa28 (unreliable)
[
dede3cb0] [
c02103b8] debug_dma_unmap_page+0x88/0x9c
[
dede3d30] [
c02dffbc] free_skb_resources+0x2c4/0x404
[
dede3d80] [
c02e39b4] gfar_close+0x24/0xc8
[
dede3da0] [
c0361550] __dev_close_many+0xa0/0xf8
[
dede3dd0] [
c03616f0] __dev_close+0x2c/0x4c
[
dede3df0] [
c036b1b8] __dev_change_flags+0xa0/0x174
[
dede3e10] [
c036b2ac] dev_change_flags+0x20/0x60
[
dede3e30] [
c03e130c] devinet_ioctl+0x540/0x824
[
dede3e90] [
c0347dcc] sock_ioctl+0x134/0x298
[
dede3eb0] [
c0111814] do_vfs_ioctl+0xac/0x854
[
dede3f20] [
c0111ffc] SyS_ioctl+0x40/0x74
[
dede3f40] [
c000f290] ret_from_syscall+0x0/0x3c
--- interrupt: c01 at 0xff45da0
LR = 0xff45cd0
Instruction dump:
811d001c 7c66482e 813d0020 9061000c 807f000c 5463103a 7cc6182e 3c60c052
386309ac 90c10008 4cc63182 4826b845 <
0fe00000>
4bfffa60 3c80c052 388402c4
---[ end trace
695ae6d7ac1d0c47 ]---
Mapped at:
[<
c02e22a8>] gfar_alloc_rx_buffs+0x178/0x248
[<
c02e3ef0>] startup_gfar+0x368/0x570
[<
c036aeb4>] __dev_open+0xdc/0x150
[<
c036b1b8>] __dev_change_flags+0xa0/0x174
[<
c036b2ac>] dev_change_flags+0x20/0x60
Even though the issue was discovered in 4.9 kernel, the code in question
is identical in the current net and net-next trees.
Fixes: 75354148ce69 ("gianfar: Add paged allocation and Rx S/G")
Signed-off-by: Arseny Solokha <asolokha@kb.kras.ru>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Belous [Sat, 28 Jan 2017 19:53:28 +0000 (22:53 +0300)]
net: ethtool: add support for 2500BaseT and 5000BaseT link modes
This patch introduce support for 2500BaseT and 5000BaseT link modes.
These modes are included in the new IEEE 802.3bz standard.
Signed-off-by: Pavel Belous <pavel.s.belous@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Stein [Mon, 30 Jan 2017 11:35:28 +0000 (12:35 +0100)]
pinctrl: baytrail: Add missing spinlock usage in byt_gpio_irq_handler
According to VLI64 Intel Atom E3800 Specification Update (#329901)
concurrent read accesses may result in returning 0xffffffff and write
accesses may be dropped silently.
To workaround all accesses must be protected by locks.
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andy Shevchenko [Thu, 26 Jan 2017 17:24:08 +0000 (19:24 +0200)]
pinctrl: baytrail: Debounce register is one per community
Debounce value is set globally per community. Otherwise user will easily
get a kernel crash when they start using the feature:
BUG: unable to handle kernel paging request at
ffffc900003be000
IP: byt_gpio_dbg_show+0xa9/0x430
Make it clear in byt_gpio_reg().
Note that this fix just prevents kernel to crash, but doesn't make any
difference to the existing logic. It means the last caller will win the
trade and debounce value will be configured accordingly. The actual
logic fix needs to be thought about and it's not as important as crash
fix. That's why the latter goes separately and right now.
Fixes: 658b476c742f ("pinctrl: baytrail: Add debounce configuration")
Cc: Cristina Ciocan <cristina.ciocan@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andy Shevchenko [Thu, 26 Jan 2017 17:24:07 +0000 (19:24 +0200)]
pinctrl: baytrail: Rectify debounce support (part 2)
The commit
04ff5a095d66 ("pinctrl: baytrail: Rectify debounce support")
almost fixes the logic of debuonce but missed couple of things, i.e.
typo in mask when disabling debounce and lack of enabling it back.
This patch addresses above issues.
Reported-by: Jean Delvare <jdelvare@suse.de>
Fixes: 04ff5a095d66 ("pinctrl: baytrail: Rectify debounce support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Peter Zijlstra [Thu, 26 Jan 2017 22:15:08 +0000 (23:15 +0100)]
perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory
Andres reported that MMAP2 records for anonymous memory always have
their protection field 0.
Turns out, someone daft put the prot/flags generation code in the file
branch, leaving them unset for anonymous memory.
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Don Zickus <dzickus@redhat.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: anton@ozlabs.org
Cc: namhyung@kernel.org
Cc: stable@vger.kernel.org # v3.16+
Fixes: f972eb63b100 ("perf: Pass protection and flags bits through mmap2 interface")
Link: http://lkml.kernel.org/r/20170126221508.GF6536@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 26 Jan 2017 15:39:55 +0000 (16:39 +0100)]
perf/core: Fix use-after-free bug
Dmitry reported a KASAN use-after-free on event->group_leader.
It turns out there's a hole in perf_remove_from_context() due to
event_function_call() not calling its function when the task
associated with the event is already dead.
In this case the event will have been detached from the task, but the
grouping will have been retained, such that group operations might
still work properly while there are live child events etc.
This does however mean that we can miss a perf_group_detach() call
when the group decomposes, this in turn can then lead to
use-after-free.
Fix it by explicitly doing the group detach if its still required.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org # v4.5+
Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: 63b6da39bb38 ("perf: Fix perf_event_exit_task() race")
Link: http://lkml.kernel.org/r/20170126153955.GD6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Oliver Hartkopp [Wed, 18 Jan 2017 20:30:51 +0000 (21:30 +0100)]
can: bcm: fix hrtimer/tasklet termination in bcm op removal
When removing a bcm tx operation either a hrtimer or a tasklet might run.
As the hrtimer triggers its associated tasklet and vice versa we need to
take care to mutually terminate both handlers.
Reported-by: Michael Josenhans <michael.josenhans@web.de>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Michael Josenhans <michael.josenhans@web.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Daniel Vetter [Thu, 12 Jan 2017 16:15:56 +0000 (17:15 +0100)]
drm: Don't race connector registration
I was under the misconception that the sysfs dev stuff can be fully
set up, and then registered all in one step with device_add. That's
true for properties and property groups, but not for parents and child
devices. Those must be fully registered before you can register a
child.
Add a bit of tracking to make sure that asynchronous mst connector
hotplugging gets this right. For consistency we rely upon the implicit
barriers of the connector->mutex, which is taken anyway, to ensure
that at least either the connector or device registration call will
work out.
Mildly tested since I can't reliably reproduce this on my mst box
here.
Reported-by: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484237756-2720-1-git-send-email-daniel.vetter@ffwll.ch
Daniel Vetter [Sun, 18 Dec 2016 13:35:45 +0000 (14:35 +0100)]
drm: prevent double-(un)registration for connectors
If we're unlucky then the registration from a hotplugged connector
might race with the final registration step on driver load. And since
MST topology discover is asynchronous that's even somewhat likely.
v2: Also update the kerneldoc for @registered!
v3: Review from Chris:
- Improve kerneldoc for late_register/early_unregister callbacks.
- Use mutex_destroy.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161218133545.2106-1-daniel.vetter@ffwll.ch
(cherry picked from commit
e73ab00e9a0f1731f34d0620a9c55f5c30c4ad4e)
Borislav Petkov [Wed, 25 Jan 2017 20:00:48 +0000 (21:00 +0100)]
x86/microcode: Do not access the initrd after it has been freed
When we look for microcode blobs, we first try builtin and if that
doesn't succeed, we fallback to the initrd supplied to the kernel.
However, at some point doing boot, that initrd gets jettisoned and we
shouldn't access it anymore. But we do, as the below KASAN report shows.
That's because find_microcode_in_initrd() doesn't check whether the
initrd is still valid or not.
So do that.
==================================================================
BUG: KASAN: use-after-free in find_cpio_data
Read of size 1 by task swapper/1/0
page:
ffffea0000db9d40 count:0 mapcount:0 mapping: (null) index:0x1
flags: 0x100000000000000()
raw:
0100000000000000 0000000000000000 0000000000000001 00000000ffffffff
raw:
dead000000000100 dead000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G W
4.10.0-rc5-debug-00075-g2dbde22 #3
Hardware name: Dell Inc. XPS 13 9360/0839Y6, BIOS 1.2.3 12/01/2016
Call Trace:
dump_stack
? _atomic_dec_and_lock
? __dump_page
kasan_report_error
? pointer
? find_cpio_data
__asan_report_load1_noabort
? find_cpio_data
find_cpio_data
? vsprintf
? dump_stack
? get_ucode_user
? print_usage_bug
find_microcode_in_initrd
__load_ucode_intel
? collect_cpu_info_early
? debug_check_no_locks_freed
load_ucode_intel_ap
? collect_cpu_info
? trace_hardirqs_on
? flat_send_IPI_mask_allbutself
load_ucode_ap
? get_builtin_firmware
? flush_tlb_func
? do_raw_spin_trylock
? cpumask_weight
cpu_init
? trace_hardirqs_off
? play_dead_common
? native_play_dead
? hlt_play_dead
? syscall_init
? arch_cpu_idle_dead
? do_idle
start_secondary
start_cpu
Memory state around the buggy address:
ffff880036e74f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff880036e74f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>
ffff880036e75000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff880036e75080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff880036e75100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Tested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170126165833.evjemhbqzaepirxo@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andy Shevchenko [Tue, 24 Jan 2017 15:28:22 +0000 (17:28 +0200)]
pinctrl: intel: merrifield: Add missed check in mrfld_config_set()
Not every pin can be configured. Add missed check to prevent access
violation.
Fixes: 4e80c8f50574 ("pinctrl: intel: Add Intel Merrifield pin controller support")
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Maxime Ripard [Mon, 23 Jan 2017 08:21:30 +0000 (09:21 +0100)]
pinctrl: sunxi: Don't enforce bias disable (for now)
Commit
07fe64ba213f ("pinctrl: sunxi: Handle bias disable") actually
enforced enforced the disabling of the pull up/down resistors instead of
ignoring it like it was done before.
This was part of a wider rework to switch to the generic pinconf bindings,
and was meant to be merged together with DT patches that were switching to
it, and removing what was considered default values by both the binding and
the boards. This included no bias on a pin.
However, those DT patches were delayed to 4.11, which would be fine only
for a significant number boards having the bias setup wrong, which in turns
break the MMC on those boards (and possibly other devices too).
In order to avoid conflicts as much as possible, bring back the old
behaviour for 4.10, and we'll revert that commit once all the DT bits will
have landed.
Tested-by: Priit Laes <plaes@plaes.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Jisheng Zhang [Mon, 23 Jan 2017 07:15:32 +0000 (15:15 +0800)]
pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
This should be a typo.
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>