Nikolay Borisov [Tue, 10 Jan 2017 18:35:39 +0000 (20:35 +0200)]
btrfs: Make btrfs_remove_delayed_node take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:38 +0000 (20:35 +0200)]
btrfs: Make btrfs_kill_delayed_inode_items take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:37 +0000 (20:35 +0200)]
btrfs: Make btrfs_delayed_delete_inode_ref take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:36 +0000 (20:35 +0200)]
btrfs: Make btrfs_delete_delayed_dir_index take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:35 +0000 (20:35 +0200)]
btrfs: Make btrfs_insert_delayed_dir_index take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:34 +0000 (20:35 +0200)]
btrfs: Make btrfs_delayed_inode_reserve_metadata take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:33 +0000 (20:35 +0200)]
btrfs: Make btrfs_get_or_create_delayed_node take btrfs_inode
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:32 +0000 (20:35 +0200)]
btrfs: Make btrfs_get_delayed_node take btrfs_inode
This function is internal to btrfs and doesn't really deal with any
VFS members, as such it needn't take a struct inode refrence but
btrfs_inode.
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Nikolay Borisov [Tue, 10 Jan 2017 18:35:31 +0000 (20:35 +0200)]
btrfs: Make btrfs_ino take a struct btrfs_inode
Currently btrfs_ino takes a struct inode and this causes a lot of
internal btrfs functions which consume this ino to take a VFS inode,
rather than btrfs' own struct btrfs_inode. In order to fix this "leak"
of VFS structs into the internals of btrfs first it's necessary to
eliminate all uses of struct inode for the purpose of inode. This patch
does that by using BTRFS_I to convert an inode to btrfs_inode. With
this problem eliminated subsequent patches will start eliminating the
passing of struct inode altogether, eventually resulting in a lot cleaner
code.
Signed-off-by: Nikolay Borisov <n.borisov.lkml@gmail.com>
[ fix btrfs_get_extent tracepoint prototype ]
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Wed, 4 Jan 2017 10:09:51 +0000 (11:09 +0100)]
btrfs: add wrapper for counting BTRFS_MAX_EXTENT_SIZE
The expression is open-coded in several places, this asks for a wrapper.
As we know the MAX_EXTENT fits to u32, we can use the appropirate
division helper. This cascades to the result type updates.
Compiler is clever enough to use shift instead of integer division, so
there's no change in the generated assembly.
Signed-off-by: David Sterba <dsterba@suse.com>
David Sterba [Fri, 6 Jan 2017 12:27:00 +0000 (13:27 +0100)]
btrfs: remove unused logic of limiting async delalloc pages
A proposed patch in https://marc.info/?l=linux-btrfs&m=
147859791003837
pointed out bad limit threshold in cow_file_range_async, but it turned
out that the whole logic is not necessary and is done by writeback. We
agreed to remove it.
Signed-off-by: David Sterba <dsterba@suse.com>
Anand Jain [Mon, 19 Dec 2016 11:09:06 +0000 (19:09 +0800)]
btrfs: consolidate auto defrag kick off policies
As of now writes smaller than 64k for non compressed extents and 16k
for compressed extents inside eof are considered as candidate
for auto defrag, put them together at a place.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Anand Jain [Wed, 21 Dec 2016 07:42:08 +0000 (15:42 +0800)]
btrfs: btrfs_defrag_root() doesn't defrag extent root tree
Since btrfs_defrag_leaves() does not support extent_root, remove its
corresponding call. The user can use the file based defrag to defrag
extents as of now.
No change in behaviour as extent_root is explicitly skipped in
btrfs_defrag_leaves and this has never worked as expected.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ ehnance changelong ]
Signed-off-by: David Sterba <dsterba@suse.com>
Jeff Mahoney [Tue, 13 Dec 2016 19:39:34 +0000 (14:39 -0500)]
btrfs: drop unused extent_op arg from btrfs_add_delayed_data_ref
btrfs_add_delayed_data_ref is always called with a NULL extent_op,
so let's drop the argument.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Colin Ian King [Tue, 20 Dec 2016 16:18:37 +0000 (16:18 +0000)]
btrfs: remove redundant inode null check
The check for a null inode is redundant since the function
is a callback for exportfs, which will itself crash if
dentry->d_inode or parent->d_inode is NULL. Removing the
null check makes this consistent with other file systems.
Also remove the redundant null dir check too.
Found with static analysis by CoverityScan, CID
1389472
Kudos to Jeff Mahoney for reviewing and explaining the error in
my original patch (most of this explanation went into the above
commit message) and David Sterba for pointing out that the dir
check is also redundant.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Seraphime Kirkovski [Thu, 15 Dec 2016 13:38:16 +0000 (14:38 +0100)]
Btrfs: ACCESS_ONCE cleanup
This replaces ACCESS_ONCE macro with the corresponding
READ|WRITE macros
Signed-off-by: Seraphime Kirkovski <kirkseraph@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Seraphime Kirkovski [Thu, 15 Dec 2016 13:38:28 +0000 (14:38 +0100)]
Btrfs: code cleanup min/max -> min_t/max_t
This cleans up the cases where the min/max macros were used with a cast
rather than using directly min_t/max_t.
Signed-off-by: Seraphime Kirkovski <kirkseraph@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Geliang Tang [Mon, 19 Dec 2016 14:53:41 +0000 (22:53 +0800)]
btrfs: use rb_entry() instead of container_of
To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Anand Jain [Tue, 6 Dec 2016 04:43:07 +0000 (12:43 +0800)]
btrfs: use BTRFS_COMPRESS_NONE to specify no compression
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Michal Hocko [Mon, 9 Jan 2017 14:39:03 +0000 (15:39 +0100)]
btrfs: drop gfp mask tweaking in try_release_extent_state
try_release_extent_state reduces the gfp mask to GFP_NOFS if it is
compatible. This is true for GFP_KERNEL as well. There is no real
reason to do that though. There is no new lock taken down the
the only consumer of the gfp mask which is
try_release_extent_state
clear_extent_bit
__clear_extent_bit
alloc_extent_state
So this seems just unnecessary and confusing.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Michal Hocko [Mon, 9 Jan 2017 14:39:02 +0000 (15:39 +0100)]
btrfs: fix up misleading GFP_NOFS usage in btrfs_releasepage
b335b0034e25 ("Btrfs: Avoid using __GFP_HIGHMEM with slab allocator")
has reduced the allocation mask in btrfs_releasepage to GFP_NOFS just
to prevent from giving an unappropriate gfp mask to the slab allocator
deeper down the callchain (in alloc_extent_state). This is wrong for
two reasons a) GFP_NOFS might be just too restrictive for the calling
context b) it is better to tweak the gfp mask down when it needs that.
So just remove the mask tweaking from btrfs_releasepage and move it
down to alloc_extent_state where it is needed.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Qu Wenruo [Thu, 20 Oct 2016 02:28:41 +0000 (10:28 +0800)]
btrfs: Add WARN_ON for qgroup reserved underflow
Goldwyn Rodrigues has exposed and fixed a bug which underflows btrfs
qgroup reserved space, and leads to non-writable fs.
This reminds us that we don't have enough underflow check for qgroup
reserved space.
For underflow case, we should not really underflow the numbers but warn
and keeps qgroup still work.
So add more check on qgroup reserved space and add WARN_ON() and
btrfs_warn() for any underflow case.
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Linus Torvalds [Sun, 12 Feb 2017 21:03:20 +0000 (13:03 -0800)]
Linux 4.10-rc8
Linus Torvalds [Sat, 11 Feb 2017 18:31:46 +0000 (10:31 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Last minute x86 fixes:
- Fix a softlockup detector warning and long delays if using ptdump
with KASAN enabled.
- Two more TSC-adjust fixes for interesting firmware interactions.
- Two commits to fix an AMD CPU topology enumeration bug that caused
a measurable gaming performance regression"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm/ptdump: Fix soft lockup in page table walker
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST
x86/CPU/AMD: Fix Zen SMT topology
x86/CPU/AMD: Bring back Compute Unit ID
Linus Torvalds [Sat, 11 Feb 2017 18:24:16 +0000 (10:24 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix a sporadic missed timer hw reprogramming bug that can result in
random delays"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/nohz: Fix possible missing clock reprog after tick soft restart
Linus Torvalds [Sat, 11 Feb 2017 18:20:06 +0000 (10:20 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A kernel crash fix plus three tooling fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/core: Fix crash in perf_event_read()
perf callchain: Reference count maps
perf diff: Fix -o/--order option behavior (again)
perf diff: Fix segfault on 'perf diff -o N' option
Linus Torvalds [Sat, 11 Feb 2017 18:16:05 +0000 (10:16 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull lockdep fix from Ingo Molnar:
"This fixes an ugly lockdep stack trace output regression. (But also
affects other stacktrace users such as kmemleak, KASAN, etc)"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stacktrace, lockdep: Fix address, newline ugliness
Linus Torvalds [Sat, 11 Feb 2017 18:14:24 +0000 (10:14 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Two last minute ARM irqchip driver fixes"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
irqchip/keystone: Fix "scheduling while atomic" on rt
Linus Torvalds [Sat, 11 Feb 2017 17:15:58 +0000 (09:15 -0800)]
Merge branch 'for-linus-4.10' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This has two last minute fixes. The highest priority here is a
regression fix for the decompression code, but we also fixed up a
problem with the 32-bit compat ioctls.
The decompression bug could hand back the wrong data on big reads when
zlib was used. I have a larger cleanup to make the math here less
error prone, but at this stage in the release Omar's patch is the best
choice"
* 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix btrfs_decompress_buf2page()
btrfs: fix btrfs_compat_ioctl failures on non-compat ioctls
Linus Torvalds [Sat, 11 Feb 2017 17:01:03 +0000 (09:01 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Six fairly small fixes. None is a real show stopper, two automation
detected problems: one memory leak, one use after free and four others
each of which fixes something that has been a significant source of
annoyance to someone"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: zfcp: fix use-after-free by not tracing WKA port open/close on failed send
scsi: aacraid: Fix INTx/MSI-x issue with older controllers
scsi: mpt3sas: disable ASPM for MPI2 controllers
scsi: mpt3sas: Force request partial completion alignment
scsi: qla2xxx: Avoid that issuing a LIP triggers a kernel crash
scsi: qla2xxx: Fix a recently introduced memory leak
Omar Sandoval [Fri, 10 Feb 2017 23:03:35 +0000 (15:03 -0800)]
Btrfs: fix btrfs_decompress_buf2page()
If btrfs_decompress_buf2page() is handed a bio with its page in the
middle of the working buffer, then we adjust the offset into the working
buffer. After we copy into the bio, we advance the iterator by the
number of bytes we copied. Then, we have some logic to handle the case
of discontiguous pages and adjust the offset into the working buffer
again. However, if we didn't advance the bio to a new page, we may enter
this case in error, essentially repeating the adjustment that we already
made when we entered the function. The end result is bogus data in the
bio.
Previously, we only checked for this case when we advanced to a new
page, but the conversion to bio iterators changed that. This restores
the old, correct behavior.
A case I saw when testing with zlib was:
buf_start = 42769
total_out = 46865
working_bytes = total_out - buf_start = 4096
start_byte = 45056
The condition (total_out > start_byte && buf_start < start_byte) is
true, so we adjust the offset:
buf_offset = start_byte - buf_start = 2287
working_bytes -= buf_offset = 1809
current_buf_start = buf_start = 42769
Then, we copy
bytes = min(bvec.bv_len, PAGE_SIZE - buf_offset, working_bytes) = 1809
buf_offset += bytes = 4096
working_bytes -= bytes = 0
current_buf_start += bytes = 44578
After bio_advance(), we are still in the same page, so start_byte is the
same. Then, we check (total_out > start_byte && current_buf_start < start_byte),
which is true! So, we adjust the values again:
buf_offset = start_byte - buf_start = 2287
working_bytes = total_out - start_byte = 1809
current_buf_start = buf_start + buf_offset = 45056
But note that working_bytes was already zero before this, so we should
have stopped copying.
Fixes: 974b1adc3b10 ("btrfs: use bio iterators for the decompression handlers")
Reported-by: Pat Erley <pat-lkml@erley.org>
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Tested-by: Liu Bo <bo.li.liu@oracle.com>
Linus Torvalds [Fri, 10 Feb 2017 22:44:49 +0000 (14:44 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) If the timing is wrong we can indefinitely stop generating new ipv6
temporary addresses, from Marcus Huewe.
2) Don't double free per-cpu stats in ipv6 SIT tunnel driver, from Cong
Wang.
3) Put protections in place so that AF_PACKET is not able to submit
packets which don't even have a link level header to drivers. From
Willem de Bruijn.
4) Fix memory leaks in ipv4 and ipv6 multicast code, from Hangbin Liu.
5) Don't use udp_ioctl() in l2tp code, UDP version expects a UDP socket
and that doesn't go over very well when it is passed an L2TP one.
Fix from Eric Dumazet.
6) Don't crash on NULL pointer in phy_attach_direct(), from Florian
Fainelli.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
l2tp: do not use udp_ioctl()
xen-netfront: Delete rx_refill_timer in xennet_disconnect_backend()
NET: mkiss: Fix panic
net: hns: Fix the device being used for dma mapping during TX
net: phy: Initialize mdio clock at probe function
igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()
xen-netfront: Improve error handling during initialization
sierra_net: Skip validating irrelevant fields for IDLE LSIs
sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
kcm: fix 0-length case for kcm_sendmsg()
xen-netfront: Rework the fix for Rx stall during OOM and network stress
net: phy: Fix PHY module checks and NULL deref in phy_attach_direct()
net: thunderx: Fix PHY autoneg for SGMII QLM mode
net: dsa: Do not destroy invalid network devices
ping: fix a null pointer dereference
packet: round up linear to header len
net: introduce device min_header_len
sit: fix a double free on error path
lwtunnel: valid encap attr check should return 0 when lwtunnel is disabled
ipv6: addrconf: fix generation of new temporary addresses
Linus Torvalds [Fri, 10 Feb 2017 22:41:16 +0000 (14:41 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Third round of -rc fixes for 4.10 kernel:
- two security related issues in the rxe driver
- one compile issue in the RDMA uapi header"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA: Don't reference kernel private header from UAPI header
IB/rxe: Fix mem_check_range integer overflow
IB/rxe: Fix resid update
Linus Torvalds [Fri, 10 Feb 2017 22:39:08 +0000 (14:39 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c bugfixes from Wolfram Sang:
"Two bugfixes (proper IO mapping and use of mutex) for a driver feature
we introduced in this cycle"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: piix4: Request the SMBUS semaphore inside the mutex
i2c: piix4: Fix request_region size
Linus Torvalds [Fri, 10 Feb 2017 22:35:22 +0000 (14:35 -0800)]
Merge tag 'mmc-v4.10-rc7' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fix from Ulf Hansson:
"mmci: Fix hang while waiting for busy-end interrupt"
* tag 'mmc-v4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mmci: avoid clearing ST Micro busy end interrupt mistakenly
Linus Torvalds [Fri, 10 Feb 2017 22:29:30 +0000 (14:29 -0800)]
Merge tag 'sound-4.10' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Here are some last-minute fixes: two fixes for races in ALSA sequencer
queue spotted by syzkaller, a revert for a regression of LINE6 driver
(since 4.9), and a trivial new codec ID addition for Nvidia HDMI"
* tag 'sound-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - adding a new NV HDMI/DP codec ID in the driver
ALSA: seq: Fix race at creating a queue
Revert "ALSA: line6: Only determine control port properties if needed"
ALSA: seq: Don't handle loop timeout at snd_seq_pool_done()
Linus Torvalds [Fri, 10 Feb 2017 22:23:45 +0000 (14:23 -0800)]
Merge tag 'nfsd-4.10-3' of git://linux-nfs.org/~bfields/linux
Pull nfsd revert from Bruce Fields:
"This patch turned out to have a couple problems. The problems are
fixable, but at least one of the fixes is a little ugly. The original
bug has always been there, so we can wait another week or two to get
this right"
* tag 'nfsd-4.10-3' of git://linux-nfs.org/~bfields/linux:
nfsd: Revert "nfsd: special case truncates some more"
Linus Torvalds [Fri, 10 Feb 2017 22:10:35 +0000 (14:10 -0800)]
Merge tag 'powerpc-4.10-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes friom Michael Ellerman:
"Apologies for the late pull request, but Ben has been busy finding bugs.
- Userspace was semi-randomly segfaulting on radix due to us
incorrectly handling a fault triggered by autonuma, caused by a
patch we merged earlier in v4.10 to prevent the kernel executing
userspace.
- We weren't marking host IPIs properly for KVM in the OPAL ICP
backend.
- The ERAT flushing on radix was missing an isync and was incorrectly
marked as DD1 only.
- The powernv CPU hotplug code was missing a wakeup type and failing
to flush the interrupt correctly when using OPAL ICP
Thanks to Benjamin Herrenschmidt"
* tag 'powerpc-4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: Properly set "host-ipi" on IPIs
powerpc/powernv: Fix CPU hotplug to handle waking on HVI
powerpc/mm/radix: Update ERAT flushes when invalidating TLB
powerpc/mm: Fix spurrious segfaults on radix with autonuma
Eric Dumazet [Fri, 10 Feb 2017 00:15:52 +0000 (16:15 -0800)]
l2tp: do not use udp_ioctl()
udp_ioctl(), as its name suggests, is used by UDP protocols,
but is also used by L2TP :(
L2TP should use its own handler, because it really does not
look the same.
SIOCINQ for instance should not assume UDP checksum or headers.
Thanks to Andrey and syzkaller team for providing the report
and a nice reproducer.
While crashes only happen on recent kernels (after commit
7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")), this
probably needs to be backported to older kernels.
Fixes: 7c13f97ffde6 ("udp: do fwd memory scheduling on dequeue")
Fixes: 85584672012e ("udp: Fix udp_poll() and ioctl()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Mason [Fri, 10 Feb 2017 20:53:18 +0000 (12:53 -0800)]
Merge branch 'for-chris' of git://git./linux/kernel/git/kdave/linux into for-linus-4.10
Boris Ostrovsky [Mon, 30 Jan 2017 17:45:46 +0000 (12:45 -0500)]
xen-netfront: Delete rx_refill_timer in xennet_disconnect_backend()
rx_refill_timer should be deleted as soon as we disconnect from the
backend since otherwise it is possible for the timer to go off before
we get to xennet_destroy_queues(). If this happens we may dereference
queue->rx.sring which is set to NULL in xennet_disconnect_backend().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ralf Baechle [Thu, 9 Feb 2017 13:12:11 +0000 (14:12 +0100)]
NET: mkiss: Fix panic
If a USB-to-serial adapter is unplugged, the driver re-initializes, with
dev->hard_header_len and dev->addr_len set to zero, instead of the correct
values. If then a packet is sent through the half-dead interface, the
kernel will panic due to running out of headroom in the skb when pushing
for the AX.25 headers resulting in this panic:
[<
c0595468>] (skb_panic) from [<
c0401f70>] (skb_push+0x4c/0x50)
[<
c0401f70>] (skb_push) from [<
bf0bdad4>] (ax25_hard_header+0x34/0xf4 [ax25])
[<
bf0bdad4>] (ax25_hard_header [ax25]) from [<
bf0d05d4>] (ax_header+0x38/0x40 [mkiss])
[<
bf0d05d4>] (ax_header [mkiss]) from [<
c041b584>] (neigh_compat_output+0x8c/0xd8)
[<
c041b584>] (neigh_compat_output) from [<
c043e7a8>] (ip_finish_output+0x2a0/0x914)
[<
c043e7a8>] (ip_finish_output) from [<
c043f948>] (ip_output+0xd8/0xf0)
[<
c043f948>] (ip_output) from [<
c043f04c>] (ip_local_out_sk+0x44/0x48)
This patch makes mkiss behave like the 6pack driver. 6pack does not
panic. In 6pack.c sp_setup() (same function name here) the values for
dev->hard_header_len and dev->addr_len are set to the same values as in
my mkiss patch.
[ralf@linux-mips.org: Massages original submission to conform to the usual
standards for patch submissions.]
Signed-off-by: Thomas Osterried <thomas@osterried.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kejian Yan [Thu, 9 Feb 2017 11:46:15 +0000 (11:46 +0000)]
net: hns: Fix the device being used for dma mapping during TX
This patch fixes the device being used to DMA map skb->data.
Erroneous device assignment causes the crash when SMMU is enabled.
This happens during TX since buffer gets DMA mapped with device
correspondign to net_device and gets unmapped using the device
related to DSAF.
Signed-off-by: Kejian Yan <yankejian@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Gleixner [Fri, 10 Feb 2017 13:44:01 +0000 (14:44 +0100)]
Merge tag 'irqchip-fixes-4.10' of git://git.infradead.org/users/jcooper/linux into irq/urgent
Pull irqchip fixes for v4.10 from Jason Cooper
- keystone: Fix scheduling while atomic for realtime
- mxs: Enable SKIP_SET_WAKE and MASK_ON_SUSPEND
Andrey Ryabinin [Fri, 10 Feb 2017 09:54:05 +0000 (12:54 +0300)]
x86/mm/ptdump: Fix soft lockup in page table walker
CONFIG_KASAN=y needs a lot of virtual memory mapped for its shadow.
In that case ptdump_walk_pgd_level_core() takes a lot of time to
walk across all page tables and doing this without
a rescheduling causes soft lockups:
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/0:1]
...
Call Trace:
ptdump_walk_pgd_level_core+0x40c/0x550
ptdump_walk_pgd_level_checkwx+0x17/0x20
mark_rodata_ro+0x13b/0x150
kernel_init+0x2f/0x120
ret_from_fork+0x2c/0x40
I guess that this issue might arise even without KASAN on huge machines
with several terabytes of RAM.
Stick cond_resched() in pgd loop to fix this.
Reported-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170210095405.31802-1-aryabinin@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 9 Feb 2017 15:08:42 +0000 (16:08 +0100)]
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
When the TSC is marked reliable then the synchronization check is skipped,
but that also skips the TSC ADJUST sanitizing code. So on a machine with a
wreckaged BIOS the TSC deviation between CPUs might go unnoticed.
Let the TSC adjust sanitizing code run unconditionally and just skip the
expensive synchronization checks when TSC is marked reliable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Olof Johansson <olof@lixom.net>
Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Thu, 9 Feb 2017 15:08:41 +0000 (16:08 +0100)]
x86/tsc: Avoid the large time jump when sanitizing TSC ADJUST
Olof reported that on a machine which has a BIOS wreckaged TSC the
timestamps in dmesg are making a large jump because the TSC value is
jumping forward after resetting the TSC ADJUST register to a sane value.
This can be avoided by calling the TSC ADJUST saniziting function before
initializing the per cpu sched clock machinery. That takes the offset into
account and avoid the time jump.
What cannot be avoided is that the 'Firmware Bug' warnings on the secondary
CPUs are printed with the large time offsets because it would be too much
effort and ugly hackery to print those warnings into a buffer and emit them
after the adjustemt on the starting CPUs. It's a firmware bug and should be
fixed in firmware. The weird timestamps are collateral damage and just
illustrate the sillyness of the BIOS folks:
[ 0.397445] smp: Bringing up secondary CPUs ...
[ 0.402100] x86: Booting SMP configuration:
[ 0.406343] .... node #0, CPUs: #1
[
1265776479.930667] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -
2978888639075328 CPU1: -
2978888639183101
[
1265776479.944664] TSC ADJUST synchronize: Reference CPU0: 0 CPU1: -
2978888639183101
[ 0.508119] #2
[
1265776480.032346] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -
2978888639075328 CPU2: -
2978888639183677
[
1265776480.044192] TSC ADJUST synchronize: Reference CPU0: 0 CPU2: -
2978888639183677
[ 0.607643] #3
[
1265776480.131874] [Firmware Bug]: TSC ADJUST differs: Reference CPU0: -
2978888639075328 CPU3: -
2978888639184530
[
1265776480.143720] TSC ADJUST synchronize: Reference CPU0: 0 CPU3: -
2978888639184530
[ 0.707108] smp: Brought up 1 node, 4 CPUs
[ 0.711271] smpboot: Total of 4 processors activated (21698.88 BogoMIPS)
Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170209151231.411460506@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Frederic Weisbecker [Tue, 7 Feb 2017 16:44:54 +0000 (17:44 +0100)]
tick/nohz: Fix possible missing clock reprog after tick soft restart
ts->next_tick keeps track of the next tick deadline in order to optimize
clock programmation on irq exit and avoid redundant clock device writes.
Now if ts->next_tick missed an update, we may spuriously miss a clock
reprog later as the nohz code is fooled by an obsolete next_tick value.
This is what happens here on a specific path: when we observe an
expired timer from the nohz update code on irq exit, we perform a soft
tick restart which simply fires the closest possible tick without
actually exiting the nohz mode and restoring a periodic state. But we
forget to update ts->next_tick accordingly.
As a result, after the next tick resulting from such soft tick restart,
the nohz code sees a stale value on ts->next_tick which doesn't match
the clock deadline that just expired. If that obsolete ts->next_tick
value happens to collide with the actual next tick deadline to be
scheduled, we may spuriously bypass the clock reprogramming. In the
worst case, the tick may never fire again.
Fix this with a ts->next_tick reset on soft tick restart.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1486485894-29173-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Peter Zijlstra [Tue, 31 Jan 2017 10:27:10 +0000 (11:27 +0100)]
perf/core: Fix crash in perf_event_read()
Alexei had his box explode because doing read() on a package
(rapl/uncore) event that isn't currently scheduled in ends up doing an
out-of-bounds load.
Rework the code to more explicitly deal with event->oncpu being -1.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Tested-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: David Carrillo-Cisneros <davidcc@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: eranian@google.com
Fixes: d6a2f9035bfc ("perf/core: Introduce PMU_EV_CAP_READ_ACTIVE_PKG")
Link: http://lkml.kernel.org/r/20170131102710.GL6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
James Bottomley [Fri, 10 Feb 2017 05:00:46 +0000 (21:00 -0800)]
Merge remote-tracking branch 'mkp-scsi/4.10/scsi-fixes' into fixes
J. Bruce Fields [Thu, 9 Feb 2017 19:20:42 +0000 (14:20 -0500)]
nfsd: Revert "nfsd: special case truncates some more"
This patch incorrectly attempted nested mnt_want_write, and incorrectly
disabled nfsd's owner override for truncate. We'll fix those problems
and make another attempt soon, for the moment I think the safest is to
revert.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Linus Torvalds [Fri, 10 Feb 2017 01:46:30 +0000 (17:46 -0800)]
Merge tag 'drm-fixes-for-v4.10-rc8' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"This should be the final set of drm fixes for 4.10: one vmwgfx boot
fix, one vc4 fix, and a few i915 fixes:
* tag 'drm-fixes-for-v4.10-rc8' of git://people.freedesktop.org/~airlied/linux:
drm: vc4: adapt to new behaviour of drm_crtc.c
drm/i915: Always convert incoming exec offsets to non-canonical
drm/i915: Remove overzealous fence warn on runtime suspend
drm/i915/bxt: Add MST support when do DPLL calculation
drm/i915: don't warn about Skylake CPU - KabyPoint PCH combo
drm/i915: fix i915 running as dom0 under Xen
drm/i915: Flush untouched framebuffers before display on !llc
drm/i915: fix use-after-free in page_flip_completed()
drm/vmwgfx: Fix depth input into drm_mode_legacy_fb_format
Steffen Maier [Wed, 8 Feb 2017 14:34:22 +0000 (15:34 +0100)]
scsi: zfcp: fix use-after-free by not tracing WKA port open/close on failed send
Dan Carpenter kindly reported:
<quote>
The patch
d27a7cb91960: "zfcp: trace on request for open and close of
WKA port" from Aug 10, 2016, leads to the following static checker
warning:
drivers/s390/scsi/zfcp_fsf.c:1615 zfcp_fsf_open_wka_port()
warn: 'req' was already freed.
drivers/s390/scsi/zfcp_fsf.c
1609 zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
1610 retval = zfcp_fsf_req_send(req);
1611 if (retval)
1612 zfcp_fsf_req_free(req);
^^^
Freed.
1613 out:
1614 spin_unlock_irq(&qdio->req_q_lock);
1615 if (req && !IS_ERR(req))
1616 zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id);
^^^^^^^^^^^
Use after free.
1617 return retval;
1618 }
Same thing for zfcp_fsf_close_wka_port() as well.
</quote>
Rather than relying on req being NULL (or ERR_PTR) for all cases where
we don't want to trace or should not trace,
simply check retval which is unconditionally initialized with -EIO != 0
and it can only become 0 on successful retval = zfcp_fsf_req_send(req).
With that we can also remove the then again unnecessary unconditional
initialization of req which was introduced with that earlier commit.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: d27a7cb91960 ("zfcp: trace on request for open and close of WKA port")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dave Carroll [Thu, 9 Feb 2017 18:04:47 +0000 (11:04 -0700)]
scsi: aacraid: Fix INTx/MSI-x issue with older controllers
commit
78cbccd3bd68 ("aacraid: Fix for KDUMP driver hang")
caused a problem on older controllers which do not support MSI-x (namely
ASR3405,ASR3805). This patch conditionalizes the previous patch to
controllers which support MSI-x
Cc: <stable@vger.kernel.org> # v4.7+
Fixes: 78cbccd3bd68 ("aacraid: Fix for KDUMP driver hang")
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dave Airlie [Fri, 10 Feb 2017 00:14:24 +0000 (10:14 +1000)]
Merge tag 'drm-intel-fixes-2017-02-09' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes
Hopefully final fixes for v4.10, about half of them stable material.
* tag 'drm-intel-fixes-2017-02-09' of git://anongit.freedesktop.org/git/drm-intel:
drm/i915: Always convert incoming exec offsets to non-canonical
drm/i915: Remove overzealous fence warn on runtime suspend
drm/i915/bxt: Add MST support when do DPLL calculation
drm/i915: don't warn about Skylake CPU - KabyPoint PCH combo
drm/i915: fix i915 running as dom0 under Xen
drm/i915: Flush untouched framebuffers before display on !llc
drm/i915: fix use-after-free in page_flip_completed()
Dave Airlie [Fri, 10 Feb 2017 00:14:01 +0000 (10:14 +1000)]
Merge tag 'drm-misc-fixes-2017-02-09' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
Last-minute vc4 fix for 4.10.
* tag 'drm-misc-fixes-2017-02-09' of git://anongit.freedesktop.org/git/drm-misc:
drm: vc4: adapt to new behaviour of drm_crtc.c
ojab [Wed, 28 Dec 2016 11:05:24 +0000 (11:05 +0000)]
scsi: mpt3sas: disable ASPM for MPI2 controllers
MPI2 controllers sometimes got lost (i.e. disappear from
/sys/bus/pci/devices) if ASMP is enabled.
Signed-off-by: Slava Kardakov <ojab@ojab.ru>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60644
Cc: <stable@vger.kernel.org>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Yendapally Reddy Dhananjaya Reddy [Wed, 8 Feb 2017 22:14:26 +0000 (17:14 -0500)]
net: phy: Initialize mdio clock at probe function
USB PHYs need the MDIO clock divisor enabled earlier to work.
Initialize mdio clock divisor in probe function. The ext bus
bit available in the same register will be used by mdio mux
to enable external mdio.
Signed-off-by: Yendapally Reddy Dhananjaya Reddy <yendapally.reddy@broadcom.com>
Fixes: ddc24ae1 ("net: phy: Broadcom iProc MDIO bus driver")
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Wed, 8 Feb 2017 13:16:45 +0000 (21:16 +0800)]
igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()
In function igmpv3/mld_add_delrec() we allocate pmc and put it in
idev->mc_tomb, so we should free it when we don't need it in del_delrec().
But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.
Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info when ...")
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when ...")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ross Lagerwall [Wed, 8 Feb 2017 10:57:37 +0000 (10:57 +0000)]
xen-netfront: Improve error handling during initialization
This fixes a crash when running out of grant refs when creating many
queues across many netdevs.
* If creating queues fails (i.e. there are no grant refs available),
call xenbus_dev_fatal() to ensure that the xenbus device is set to the
closed state.
* If no queues are created, don't call xennet_disconnect_backend as
netdev->real_num_tx_queues will not have been set correctly.
* If setup_netfront() fails, ensure that all the queues created are
cleaned up, not just those that have been set up.
* If any queues were set up and an error occurs, call
xennet_destroy_queues() to clean up the napi context.
* If any fatal error occurs, unregister and destroy the netdev to avoid
leaving around a half setup network device.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 Feb 2017 21:41:43 +0000 (16:41 -0500)]
Merge branch 'sierra_net-fixes'
Stefan Brüns says:
====================
Fixes for sierra_net driver
When trying to initiate a dual-stack (ipv4v6) connection, a MC7710, FW
version SWI9200X_03.05.24.00ap answers with an unsupported LSI. Add support
for this LSI.
Also the link_type should be ignored when going idle, otherwise the modem
is stuck in a bad link state.
Tested on MC7710, T-Mobile DE, APN internet.telekom, IPv4v6 PDP type. Both
IPv4 and IPv6 connections work.
v2: Do not overwrite protocol field in rx_fixup
v3: Remove leftover struct ethhdr *eth declaration
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Brüns [Wed, 8 Feb 2017 01:46:33 +0000 (02:46 +0100)]
sierra_net: Skip validating irrelevant fields for IDLE LSIs
When the context is deactivated, the link_type is set to 0xff, which
triggers a warning message, and results in a wrong link status, as
the LSI is ignored.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Brüns [Wed, 8 Feb 2017 01:46:32 +0000 (02:46 +0100)]
sierra_net: Add support for IPv6 and Dual-Stack Link Sense Indications
If a context is configured as dualstack ("IPv4v6"), the modem indicates
the context activation with a slightly different indication message.
The dual-stack indication omits the link_type (IPv4/v6) and adds
additional address fields.
IPv6 LSIs are identical to IPv4 LSIs, but have a different link type.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Tue, 7 Feb 2017 20:59:47 +0000 (12:59 -0800)]
kcm: fix 0-length case for kcm_sendmsg()
Dmitry reported a kernel warning:
WARNING: CPU: 3 PID: 2936 at net/kcm/kcmsock.c:627
kcm_write_msgs+0x12e3/0x1b90 net/kcm/kcmsock.c:627
CPU: 3 PID: 2936 Comm: a.out Not tainted 4.10.0-rc6+ #209
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
panic+0x1fb/0x412 kernel/panic.c:179
__warn+0x1c4/0x1e0 kernel/panic.c:539
warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
kcm_write_msgs+0x12e3/0x1b90 net/kcm/kcmsock.c:627
kcm_sendmsg+0x163a/0x2200 net/kcm/kcmsock.c:1029
sock_sendmsg_nosec net/socket.c:635 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:645
sock_write_iter+0x326/0x600 net/socket.c:848
new_sync_write fs/read_write.c:499 [inline]
__vfs_write+0x483/0x740 fs/read_write.c:512
vfs_write+0x187/0x530 fs/read_write.c:560
SYSC_write fs/read_write.c:607 [inline]
SyS_write+0xfb/0x230 fs/read_write.c:599
entry_SYSCALL_64_fastpath+0x1f/0xc2
when calling syscall(__NR_write, sock2, 0x208aaf27ul, 0x0ul) on a KCM
seqpacket socket. It appears that kcm_sendmsg() does not handle len==0
case correctly, which causes an empty skb is allocated and queued.
Fix this by skipping the skb allocation for len==0 case.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vineeth Remanan Pillai [Tue, 7 Feb 2017 18:59:01 +0000 (18:59 +0000)]
xen-netfront: Rework the fix for Rx stall during OOM and network stress
The commit
90c311b0eeea ("xen-netfront: Fix Rx stall during network
stress and OOM") caused the refill timer to be triggerred almost on
all invocations of xennet_alloc_rx_buffers for certain workloads.
This reworks the fix by reverting to the old behaviour and taking into
consideration the skb allocation failure. Refill timer is now triggered
on insufficient requests or skb allocation failure.
Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com>
Fixes: 90c311b0eeea (xen-netfront: Fix Rx stall during network stress and OOM)
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 9 Feb 2017 21:22:54 +0000 (13:22 -0800)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull SCSI target fixes from Nicholas Bellinger:
"This target series for v4.10 contains fixes which address a few
long-standing bugs that DATERA's QA + automation teams have uncovered
while putting v4.1.y target code into production usage.
We've been running the top three in our nightly automated regression
runs for the last two months, and the COMPARE_AND_WRITE fix Mr. Gary
Guo has been manually verifying against a four node ESX cluster this
past week.
Note all of them have CC' stable tags.
Summary:
- Fix a bug with ESX EXTENDED_COPY + SAM_STAT_RESERVATION_CONFLICT
status, where target_core_xcopy.c logic was incorrectly returning
SAM_STAT_CHECK_CONDITION for all non SAM_STAT_GOOD cases (Nixon
Vincent)
- Fix a TMR LUN_RESET hung task bug while other in-flight TMRs are
being aborted, before the new one had been dispatched into tmr_wq
(Rob Millner)
- Fix a long standing double free OOPs, where a dynamically generated
'demo-mode' NodeACL has multiple sessions associated with it, and
the /sys/kernel/config/target/$FABRIC/$WWN/ subsequently disables
demo-mode, but never converts the dynamic ACL into a explicit ACL
(Rob Millner)
- Fix a long standing reference leak with ESX VAAI COMPARE_AND_WRITE
when the second phase WRITE COMMIT command fails, resulting in
CHECK_CONDITION response never being sent and se_cmd->cmd_kref
never reaching zero (Gary Guo)
Beyond these items on v4.1.y we've reproduced, fixed, and run through
our regression test suite using iscsi-target exports, there are two
additional outstanding list items:
- Remove a >= v4.2 RCU conversion BUG_ON that would trigger when
dynamic node NodeACLs where being converted to explicit NodeACLs.
The patch drops the BUG_ON to follow how pre RCU conversion worked
for this special case (Benjamin Estrabaud)
- Add ibmvscsis target_core_fabric_ops->max_data_sg_nent assignment
to match what IBM's Virtual SCSI hypervisor is already enforcing at
transport layer. (Bryant Ly + Steven Royer)"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
ibmvscsis: Add SGL limit
target: Fix COMPARE_AND_WRITE ref leak for non GOOD status
target: Fix multi-session dynamic se_node_acl double free OOPs
target: Fix early transport_generic_handle_tmr abort scenario
target: Use correct SCSI status during EXTENDED_COPY exception
target: Don't BUG_ON during NodeACL dynamic -> explicit conversion
Florian Fainelli [Thu, 9 Feb 2017 03:05:26 +0000 (19:05 -0800)]
net: phy: Fix PHY module checks and NULL deref in phy_attach_direct()
The Generic PHY drivers gets assigned after we checked that the current
PHY driver is NULL, so we need to check a few things before we can
safely dereference d->driver. This would be causing a NULL deference to
occur when a system binds to the Generic PHY driver. Update
phy_attach_direct() to do the following:
- grab the driver module reference after we have assigned the Generic
PHY drivers accordingly, and remember we came from the generic PHY
path
- update the error path to clean up the module reference in case the
Generic PHY probe function fails
- split the error path involving phy_detacht() to avoid double free/put
since phy_detach() does all the clean up
- finally, have phy_detach() drop the module reference count before we
call device_release_driver() for the Generic PHY driver case
Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 9 Feb 2017 20:25:42 +0000 (12:25 -0800)]
Merge tag 'pstore-v4.10-rc8' of git://git./linux/kernel/git/kees/linux
Pull pstore fix from Kees Cook:
"Fix pstore regression (boot Oops) when ftrace disabled, from Brian
Norris"
* tag 'pstore-v4.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: don't OOPS when there are no ftrace zones
Linus Torvalds [Thu, 9 Feb 2017 19:58:05 +0000 (11:58 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"A fix for a crash in uinput, and a fix for build errors when HID-RMI
is built-in but SERIO is a module"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - select 'SERIO' when needed
Input: uinput - fix crash when mixing old and new init style
Brian Norris [Thu, 9 Feb 2017 06:44:44 +0000 (22:44 -0800)]
pstore: don't OOPS when there are no ftrace zones
We'll OOPS in ramoops_get_next_prz() if the platform didn't ask for any
ftrace zones (i.e., cxt->fprzs will be NULL). Let's just skip this
entire FTRACE section if there's no 'fprzs'.
Regression seen on a coreboot/depthcharge-based Chromebook.
Fixes: 2fbea82bbb89 ("pstore: Merge per-CPU ftrace records into one")
Cc: Joel Fernandes <joelaf@google.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Linus Torvalds [Thu, 9 Feb 2017 19:34:15 +0000 (11:34 -0800)]
Merge tag 'vfio-v4.10-final' of git://github.com/awilliam/linux-vfio
Pull VFIO fix from Alex Williamson:
"Fix regression in attaching groups to existing container for SPAPR
IOMMU backend (Alexey Kardashevskiy)"
* tag 'vfio-v4.10-final' of git://github.com/awilliam/linux-vfio:
vfio/spapr_tce: Set window when adding additional groups to container
Linus Torvalds [Thu, 9 Feb 2017 19:30:56 +0000 (11:30 -0800)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"A couple more fixes for 4.10:
- fix addressing the short regset write issue (Dave Martin)
- fix for LPAE systems which leave a pending imprecise data abort
before entering the kernel (Alexander Sverdlin)"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8643/3: arm/ptrace: Preserve previous registers for short regset write
ARM: 8642/1: LPAE: catch pending imprecise abort on unmask
Ricardo Ribalda [Thu, 2 Feb 2017 19:15:16 +0000 (20:15 +0100)]
i2c: piix4: Request the SMBUS semaphore inside the mutex
SMBSLVCNT must be protected with the piix4_mutex_sb800 in order to avoid
multiple buses accessing to the semaphore at the same time.
Fixes: 701dc207bf55 ("i2c: piix4: Avoid race conditions with IMC")
Reported-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Ricardo Ribalda [Fri, 27 Jan 2017 14:59:30 +0000 (15:59 +0100)]
i2c: piix4: Fix request_region size
Since '
701dc207bf55 ("i2c: piix4: Avoid race conditions with IMC")' we
are using the SMBSLVCNT register at offset 0x8. We need to request it.
Fixes: 701dc207bf55 ("i2c: piix4: Avoid race conditions with IMC")
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Hui Wang [Thu, 9 Feb 2017 01:20:54 +0000 (09:20 +0800)]
ALSA: hda - adding a new NV HDMI/DP codec ID in the driver
Without this change, the HDMI/DP codec will be recognised as a
generic codec, and there is no sound when playing through this codec.
As suggested by NVidia side, after adding the new ID in the driver,
the sound playing works well.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Benjamin Herrenschmidt [Tue, 7 Feb 2017 00:35:36 +0000 (11:35 +1100)]
powerpc/powernv: Properly set "host-ipi" on IPIs
Otherwise KVM will fail to pass them through to the host
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Tue, 7 Feb 2017 00:35:31 +0000 (11:35 +1100)]
powerpc/powernv: Fix CPU hotplug to handle waking on HVI
The IPIs come in as HVI not EE, so we need to test the appropriate
SRR1 bits. The encoding is such that it won't have false positives
on P7 and P8 so we can just test it like that. We also need to handle
the icp-opal variant of the flush.
Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Benjamin Herrenschmidt [Mon, 6 Feb 2017 02:05:16 +0000 (13:05 +1100)]
powerpc/mm/radix: Update ERAT flushes when invalidating TLB
Three tiny changes to the ERAT flushing logic: First don't make
it depend on DD1. It hasn't been decided yet but we might run
DD2 in a mode that also requires explicit flushes for performance
reasons so make it unconditional. We also add a missing isync, and
finally remove the flush from _tlbiel_va as it is only necessary
for congruence-class invalidations (PID, LPID and full TLB), not
targetted invalidations.
Fixes: 96ed1fe511a8 ("powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Linus Torvalds [Thu, 9 Feb 2017 02:08:29 +0000 (18:08 -0800)]
Revert "x86/ioapic: Restore IO-APIC irq_chip retrigger callback"
This reverts commit
020eb3daaba2857b32c4cf4c82f503d6a00a67de.
Gabriel C reports that it causes his machine to not boot, and we haven't
tracked down the reason for it yet. Since the bug it fixes has been
around for a longish time, we're better off reverting the fix for now.
Gabriel says:
"It hangs early and freezes with a lot RCU warnings.
I bisected it down to :
> Ruslan Ruslichenko (1):
> x86/ioapic: Restore IO-APIC irq_chip retrigger callback
Reverting this one fixes the problem for me..
The box is a PRIMERGY TX200 S5 , 2 socket , 2 x E5520 CPU(s) installed"
and Ruslan and Thomas are currently stumped.
Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org # for the backport of the original commit
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Daney [Wed, 8 Feb 2017 00:23:31 +0000 (16:23 -0800)]
Revert "hwrng: core - zeroize buffers with random data"
This reverts commit
2cc751545854d7bd7eedf4d7e377bb52e176cd07.
With this commit in place I get on a Cavium ThunderX (arm64) system:
$ if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v > rng-bad.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 9.1171e-05 s, 2.8 MB/s
$ dd if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v >> rng-bad.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 9.6141e-05 s, 2.7 MB/s
$ cat rng-bad.txt
000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000050 00 00 00 00 37 20 46 ae d0 fc 1c 55 25 6e b0 b8
000060 7c 7e d7 d4 00 0f 6f b2 91 1e 30 a8 fa 3e 52 0e
000070 06 2d 53 30 be a1 20 0f aa 56 6e 0e 44 6e f4 35
000080 b7 6a fe d2 52 70 7e 58 56 02 41 ea d1 9c 6a 6a
000090 d1 bd d8 4c da 35 45 ef 89 55 fc 59 d5 cd 57 ba
0000a0 4e 3e 02 1c 12 76 43 37 23 e1 9f 7a 9f 9e 99 24
0000b0 47 b2 de e3 79 85 f6 55 7e ad 76 13 4f a0 b5 41
0000c0 c6 92 42 01 d9 12 de 8f b4 7b 6e ae d7 24 fc 65
0000d0 4d af 0a aa 36 d9 17 8d 0e 8b 7a 3b b6 5f 96 47
0000e0 46 f7 d8 ce 0b e8 3e c6 13 a6 2c b6 d6 cc 17 26
0000f0 e3 c3 17 8e 9e 45 56 1e 41 ef 29 1a a8 65 c8 3a
000100
000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
000050 00 00 00 00 f4 90 65 aa 8b f2 5e 31 01 53 b4 d4
000060 06 c0 23 a2 99 3d 01 e4 b0 c1 b1 55 0f 80 63 cf
000070 33 24 d8 3a 1d 5e cd 2c ba c0 d0 18 6f bc 97 46
000080 1e 19 51 b1 90 15 af 80 5e d1 08 0d eb b0 6c ab
000090 6a b4 fe 62 37 c5 e1 ee 93 c3 58 78 91 2a d5 23
0000a0 63 50 eb 1f 3b 84 35 18 cf b2 a4 b8 46 69 9e cf
0000b0 0c 95 af 03 51 45 a8 42 f1 64 c9 55 fc 69 76 63
0000c0 98 9d 82 fa 76 85 24 da 80 07 29 fe 4e 76 0c 61
0000d0 ff 23 94 4f c8 5c ce 0b 50 e8 31 bc 9d ce f4 ca
0000e0 be ca 28 da e6 fa cc 64 1c ec a8 41 db fe 42 bd
0000f0 a0 e2 4b 32 b4 52 ba 03 70 8e c1 8e d0 50 3a c6
000100
To my untrained mental entropy detector, the first several bytes of
each read from /dev/hwrng seem to not be very random (i.e. all zero).
When I revert the patch (apply this patch), I get back to what we have
in v4.9, which looks like (much more random appearing):
$ dd if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v > rng-good.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 0.
000252233 s, 1.0 MB/s
$ dd if=/dev/hwrng bs=256 count=1 | od -t x1 -A x -v >> rng-good.txt
1+0 records in
1+0 records out
256 bytes (256 B) copied, 0.
000113571 s, 2.3 MB/s
$ cat rng-good.txt
000000 75 d1 2d 19 68 1f d2 26 a1 49 22 61 66 e8 09 e5
000010 e0 4e 10 d0 1a 2c 45 5d 59 04 79 8e e2 b7 2c 2e
000020 e8 ad da 34 d5 56 51 3d 58 29 c7 7a 8e ed 22 67
000030 f9 25 b9 fb c6 b7 9c 35 1f 84 21 35 c1 1d 48 34
000040 45 7c f6 f1 57 63 1a 88 38 e8 81 f0 a9 63 ad 0e
000050 be 5d 3e 74 2e 4e cb 36 c2 01 a8 14 e1 38 e1 bb
000060 23 79 09 56 77 19 ff 98 e8 44 f3 27 eb 6e 0a cb
000070 c9 36 e3 2a 96 13 07 a0 90 3f 3b bd 1d 04 1d 67
000080 be 33 14 f8 02 c2 a4 02 ab 8b 5b 74 86 17 f0 5e
000090 a1 d7 aa ef a6 21 7b 93 d1 85 86 eb 4e 8c d0 4c
0000a0 56 ac e4 45 27 44 84 9f 71 db 36 b9 f7 47 d7 b3
0000b0 f2 9c 62 41 a3 46 2b 5b e3 80 63 a4 35 b5 3c f4
0000c0 bc 1e 3a ad e4 59 4a 98 6c e8 8d ff 1b 16 f8 52
0000d0 05 5c 2f 52 2a 0f 45 5b 51 fb 93 97 a4 49 4f 06
0000e0 f3 a0 d1 1e ba 3d ed a7 60 8f bb 84 2c 21 94 2d
0000f0 b3 66 a6 61 1e 58 30 24 85 f8 c8 18 c3 77 00 22
000100
000000 73 ca cc a1 d9 bb 21 8d c3 5c f3 ab 43 6d a7 a4
000010 4a fd c5 f4 9c ba 4a 0f b1 2e 19 15 4e 84 26 e0
000020 67 c9 f2 52 4d 65 1f 81 b7 8b 6d 2b 56 7b 99 75
000030 2e cd d0 db 08 0c 4b df f3 83 c6 83 00 2e 2b b8
000040 0f af 61 1d f2 02 35 74 b5 a4 6f 28 f3 a1 09 12
000050 f2 53 b5 d2 da 45 01 e5 12 d6 46 f8 0b db ed 51
000060 7b f4 0d 54 e0 63 ea 22 e2 1d d0 d6 d0 e7 7e e0
000070 93 91 fb 87 95 43 41 28 de 3d 8b a3 a8 8f c4 9e
000080 30 95 12 7a b2 27 28 ff 37 04 2e 09 7c dd 7c 12
000090 e1 50 60 fb 6d 5f a8 65 14 40 89 e3 4c d2 87 8f
0000a0 34 76 7e 66 7a 8e 6b a3 fc cf 38 52 2e f9 26 f0
0000b0 98 63 15 06 34 99 b2 88 4f aa d8 14 88 71 f1 81
0000c0 be 51 11 2b f4 7e a0 1e 12 b2 44 2e f6 8d 84 ea
0000d0 63 82 2b 66 b3 9a fd 08 73 5a c2 cc ab 5a af b1
0000e0 88 e3 a6 80 4b fc db ed 71 e0 ae c0 0a a4 8c 35
0000f0 eb 89 f9 8a 4b 52 59 6f 09 7c 01 3f 56 e7 c7 bf
000100
Signed-off-by: David Daney <david.daney@cavium.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 9 Feb 2017 00:06:10 +0000 (16:06 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"4 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/slub.c: fix random_seq offset destruction
cpumask: use nr_cpumask_bits for parsing functions
mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
kernel/ucount.c: mark user_header with kmemleak_ignore()
Sean Rees [Wed, 8 Feb 2017 22:30:59 +0000 (14:30 -0800)]
mm/slub.c: fix random_seq offset destruction
Commit
210e7a43fa90 ("mm: SLUB freelist randomization") broke USB hub
initialisation as described in
https://bugzilla.kernel.org/show_bug.cgi?id=177551.
Bail out early from init_cache_random_seq if s->random_seq is already
initialised. This prevents destroying the previously computed
random_seq offsets later in the function.
If the offsets are destroyed, then shuffle_freelist will truncate
page->freelist to just the first object (orphaning the rest).
Fixes: 210e7a43fa90 ("mm: SLUB freelist randomization")
Link: http://lkml.kernel.org/r/20170207140707.20824-1-sean@erifax.org
Signed-off-by: Sean Rees <sean@erifax.org>
Reported-by: <userwithuid@gmail.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tejun Heo [Wed, 8 Feb 2017 22:30:56 +0000 (14:30 -0800)]
cpumask: use nr_cpumask_bits for parsing functions
Commit
513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and
parsing functions") converted both cpumask printing and parsing
functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was
okay for the printing functions as it just picked one of the two output
formats that we were alternating between depending on a kernel config,
doing the same for parsing wasn't okay.
nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use
nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can
be more efficient to use NR_CPUS when we can get away with it.
Converting the printing functions to nr_cpu_ids makes sense because it
affects how the masks get presented to userspace and doesn't break
anything; however, using nr_cpu_ids for parsing functions can
incorrectly leave the higher bits uninitialized while reading in these
masks from userland. As all testing and comparison functions use
nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks
can erroneously yield false negative results.
This made the taskstats interface incorrectly return -EINVAL even when
the inputs were correct.
Fix it by restoring the parse functions to use nr_cpumask_bits instead
of nr_cpu_ids.
Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org
Fixes: 513e3d2d11c9 ("cpumask: always use nr_cpu_ids in formatting and parsing functions")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de>
Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: <stable@vger.kernel.org> [4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Wed, 8 Feb 2017 22:30:53 +0000 (14:30 -0800)]
mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted.
Fix Lustre and block_page_mkwrite_ret() used by other filesystems
(notably GFS2) to return VM_FAULT_NOPAGE instead which results in
bailing out from the fault code, the CPU then retries the access, and we
fault again effectively doing what the handler wanted.
Link: http://lkml.kernel.org/r/20170203150729.15863-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Luis R. Rodriguez [Wed, 8 Feb 2017 22:30:50 +0000 (14:30 -0800)]
kernel/ucount.c: mark user_header with kmemleak_ignore()
The user_header gets caught by kmemleak with the following splat as
missing a free:
unreferenced object 0xffff99667a733d80 (size 96):
comm "swapper/0", pid 1, jiffies
4294892317 (age 62191.468s)
hex dump (first 32 bytes):
a0 b6 92 b4 ff ff ff ff 00 00 00 00 01 00 00 00 ................
01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
kmemleak_alloc+0x4a/0xa0
__kmalloc+0x144/0x260
__register_sysctl_table+0x54/0x5e0
register_sysctl+0x1b/0x20
user_namespace_sysctl_init+0x17/0x34
do_one_initcall+0x52/0x1a0
kernel_init_freeable+0x173/0x200
kernel_init+0xe/0x100
ret_from_fork+0x2c/0x40
The BUG_ON()s are intended to crash so no need to clean up after
ourselves on error there. This is also a kernel/ subsys_init() we don't
need a respective exit call here as this is never modular, so just white
list it.
Link: http://lkml.kernel.org/r/20170203211404.31458-1-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nikolay Borisov <n.borisov.lkml@gmail.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrzej Pietrasiewicz [Wed, 1 Feb 2017 09:35:08 +0000 (10:35 +0100)]
drm: vc4: adapt to new behaviour of drm_crtc.c
When drm_crtc_init_with_planes() was orignally added
(in drm_crtc.c,
e13161af80c185ecd8dc4641d0f5df58f9e3e0af
drm: Add drm_crtc_init_with_planes() (v2)), it only checked for "primary"
being non-null. If that was the case, it modified primary->possible_crtcs.
Then, when support for cursor planes was added
(
fc1d3e44ef7c1db93384150fdbf8948dcf949f15 drm: Allow drivers to register
cursor planes with crtc), the same behaviour was implemented for cursor
planes.
vc4_plane_init() since its inception has passed 0xff as "possible_crtcs"
parameter to drm_universal_plane_init(). With a change in drm_crtc.c
(
7abc7d47510c75dd984380ebf819616e574c9604 drm: don't override
possible_crtcs for primary/cursor planes) passing 0xff results in primary's
possible_crtcs set to 0xff (cursor was updated manually by vc4_crtc.c).
Consequently, it would be allowed to use the primary plane from CRTC 1 (for
example) on CRTC 0, which would result in the overlay and cursors being
buried.
Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: http://patchwork.freedesktop.org/patch/msgid/1485941708-27892-1-git-send-email-andrzej.p@samsung.com
Fixes: 7abc7d47510c ("drm: don't override possible_crtcs for primary/cursor planes")
Thanneeru Srinivasulu [Wed, 8 Feb 2017 12:39:00 +0000 (18:09 +0530)]
net: thunderx: Fix PHY autoneg for SGMII QLM mode
This patch fixes the case where there is no phydev attached
to a LMAC in DT due to non-existance of a PHY driver or due
to usage of non-stanadard PHY which doesn't support autoneg.
Changes dependeds on firmware to send correct info w.r.t
PHY and autoneg capability.
This patch also covers a case where a 10G/40G interface is used
as a 1G with convertors with Cortina PHY in between.
Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com>
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 8 Feb 2017 20:23:49 +0000 (12:23 -0800)]
Merge tag 'pci-v4.10-fixes-3' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- check MSI affinity vs. number of vectors to avoid memory corruption
- drop runtime power management for PCIe hotplug ports for now to avoid
regressing hotplug via sysfs
* tag 'pci-v4.10-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI: pciehp: Add runtime PM support for PCIe hotplug ports"
PCI/MSI: Don't apply affinity if there aren't enough vectors left
Florian Fainelli [Wed, 8 Feb 2017 07:10:13 +0000 (23:10 -0800)]
net: dsa: Do not destroy invalid network devices
dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
for the network device not being NULL before attempting to destroy it. We were
not setting the slave network device as NULL if dsa_slave_create() failed, so
we would later on be calling dsa_slave_destroy() on a now free'd and
unitialized network device, causing crashes in dsa_slave_destroy().
Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Tue, 7 Feb 2017 20:59:46 +0000 (12:59 -0800)]
ping: fix a null pointer dereference
Andrey reported a kernel crash:
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task:
ffff880060048040 task.stack:
ffff880069be8000
RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
RSP: 0018:
ffff880069bef8b8 EFLAGS:
00010206
RAX:
dffffc0000000000 RBX:
ffff880069befb90 RCX:
0000000000000000
RDX:
0000000000000018 RSI:
ffff880069befa30 RDI:
00000000000000c2
RBP:
ffff880069befbb8 R08:
0000000000000008 R09:
0000000000000000
R10:
0000000000000002 R11:
0000000000000000 R12:
ffff880069befab0
R13:
ffff88006c624a80 R14:
ffff880069befa70 R15:
0000000000000000
FS:
00007f6f7c716700(0000) GS:
ffff88006de00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00000000004a6f28 CR3:
000000003a134000 CR4:
00000000000006e0
Call Trace:
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
sock_sendmsg_nosec net/socket.c:635 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:645
SYSC_sendto+0x660/0x810 net/socket.c:1687
SyS_sendto+0x40/0x50 net/socket.c:1655
entry_SYSCALL_64_fastpath+0x1f/0xc2
This is because we miss a check for NULL pointer for skb_peek() when
the queue is empty. Other places already have the same check.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 Feb 2017 18:56:38 +0000 (13:56 -0500)]
Merge branch 'net-header-length-truncation'
Willem de Bruijn says:
====================
net: Fixes for header length truncation
Packets should not enter the stack with truncated link layer headers
and link layer headers should always be stored in the skb linear
segment.
Patch 1 ensures the first for PF_PACKET sockets
Patch 2 ensures the second for PF_PACKET GSO sockets without tx_ring
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Tue, 7 Feb 2017 20:57:21 +0000 (15:57 -0500)]
packet: round up linear to header len
Link layer protocols may unconditionally pull headers, as Ethernet
does in eth_type_trans. Ensure that the entire link layer header
always lies in the skb linear segment. tpacket_snd has such a check.
Extend this to packet_snd.
Variable length link layer headers complicate the computation
somewhat. Here skb->len may be smaller than dev->hard_header_len.
Round up the linear length to be at least as long as the smallest of
the two.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Tue, 7 Feb 2017 20:57:20 +0000 (15:57 -0500)]
net: introduce device min_header_len
The stack must not pass packets to device drivers that are shorter
than the minimum link layer header length.
Previously, packet sockets would drop packets smaller than or equal
to dev->hard_header_len, but this has false positives. Zero length
payload is used over Ethernet. Other link layer protocols support
variable length headers. Support for validation of these protocols
removed the min length check for all protocols.
Introduce an explicit dev->min_header_len parameter and drop all
packets below this value. Initially, set it to non-zero only for
Ethernet and loopback. Other protocols can follow in a patch to
net-next.
Fixes: 9ed988cd5915 ("packet: validate variable length ll headers")
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bryant G. Ly [Mon, 6 Feb 2017 16:04:28 +0000 (10:04 -0600)]
ibmvscsis: Add SGL limit
This patch adds internal LIO sgl limit since the driver already
sets a max transfer limit on transport layer of 1MB to the client.
Cc: stable@vger.kernel.org
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
WANG Cong [Wed, 8 Feb 2017 18:02:13 +0000 (10:02 -0800)]
sit: fix a double free on error path
Dmitry reported a double free in sit_init_net():
kernel BUG at mm/percpu.c:689!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-
20170206 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
task:
ffff8801c9cc27c0 task.stack:
ffff88017d1d8000
RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
RSP: 0018:
ffff88017d1df488 EFLAGS:
00010046
RAX:
0000000000010000 RBX:
00000000000007c0 RCX:
ffffc90002829000
RDX:
0000000000010000 RSI:
ffffffff81940efb RDI:
ffff8801db841d94
RBP:
ffff88017d1df590 R08:
dffffc0000000000 R09:
1ffffffff0bb3bdd
R10:
dffffc0000000000 R11:
00000000000135dd R12:
ffff8801db841d80
R13:
0000000000038e40 R14:
00000000000007c0 R15:
00000000000007c0
FS:
00007f6ea608f700(0000) GS:
ffff8801dbe00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
000000002000aff8 CR3:
00000001c8d44000 CR4:
00000000001426f0
DR0:
0000000020000000 DR1:
0000000020000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000600
Call Trace:
free_percpu+0x212/0x520 mm/percpu.c:1264
ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
ops_init+0x10a/0x530 net/core/net_namespace.c:115
setup_net+0x2ed/0x690 net/core/net_namespace.c:291
copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
SYSC_unshare kernel/fork.c:2281 [inline]
SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
entry_SYSCALL_64_fastpath+0x1f/0xc2
This is because when tunnel->dst_cache init fails, we free dev->tstats
once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
redundant but its ndo_uinit() does not seem enough to clean up everything
here. So avoid this by setting dev->tstats to NULL after the first free,
at least for -net.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 8 Feb 2017 18:01:39 +0000 (10:01 -0800)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
- A relatively large patch restores booting on i.MX platforms that
failed to boot after a cleanup was merged for v4.10.
- A quirk for USB needs to be enabled on the STi platform
- On the Meson platform, we saw memory corruption with part of the
memory used by the secure monitor, so we have to stay out of that
area.
- The same platform also has a problem with ethernet under load, which
is fixed by disabling EEE negotiation.
- imx6dl has an incorrect pin configuration, which prevents SPI from
working.
- Two maintainers have lost their access to their email addresses, so
we should update the MAINTAINERS file before the release
- Renaming one of the orion5x linkstation models to help simplify the
debian install.
- A couple of fixes for build warnings that were introduced during
v4.10-rc.
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: defconfigs: make NF_CT_PROTO_SCTP and NF_CT_PROTO_UDPLITE built-in
MAINTAINERS: socfpga: update email for Dinh Nguyen
ARM: orion5x: fix Makefile for linkstation-lschl.dtb
ARM: dts: orion5x-lschl: More consistent naming on linkstation series
ARM: dts: orion5x-lschl: Fix model name
MAINTAINERS: change email address from atmel to microchip
MAINTAINERS: at91: change email address
ARM64: dts: meson-gx: Add firmware reserved memory zones
ARM64: dts: meson-gxbb-odroidc2: fix GbE tx link breakage
ARM: dts: STiH407-family: set snps,dis_u3_susphy_quirk
ARM: dts: imx: Pass 'chosen' and 'memory' nodes
ARM: dts: imx6dl: fix GPIO4 range
ARM: imx: hide unused variable in #ifdef
Linus Torvalds [Wed, 8 Feb 2017 17:59:45 +0000 (09:59 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security
Pull selinux fix from James Morris:
"Fix off-by-one in setprocattr"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
selinux: fix off-by-one in setprocattr
Linus Torvalds [Wed, 8 Feb 2017 17:56:15 +0000 (09:56 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fix from Jens Axboe:
"A single fix that should go into 4.10, fixing a regression on some
devices with the WRITE_SAME command"
* 'for-linus' of git://git.kernel.dk/linux-block:
block: don't try Write Same from __blkdev_issue_zeroout
David Ahern [Wed, 8 Feb 2017 17:29:00 +0000 (09:29 -0800)]
lwtunnel: valid encap attr check should return 0 when lwtunnel is disabled
An error was reported upgrading to 4.9.8:
root@Typhoon:~# ip route add default table 210 nexthop dev eth0 via 10.68.64.1
weight 1 nexthop dev eth0 via 10.68.64.2 weight 1
RTNETLINK answers: Operation not supported
The problem occurs when CONFIG_LWTUNNEL is not enabled and a multipath
route is submitted.
The point of lwtunnel_valid_encap_type_attr is catch modules that
need to be loaded before any references are taken with rntl held. With
CONFIG_LWTUNNEL disabled, there will be no modules to load so the
lwtunnel_valid_encap_type_attr stub should just return 0.
Fixes: 9ed59592e3e3 ("lwtunnel: fix autoload of lwt modules")
Reported-by: pupilla@libero.it
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Leon Romanovsky [Wed, 8 Feb 2017 15:04:09 +0000 (17:04 +0200)]
RDMA: Don't reference kernel private header from UAPI header
Remove references to private kernel header and defines from exported
ib_user_verb.h file.
The code snippet below is used to reproduce the issue:
#include <stdio.h>
#include <rdma/ib_user_verb.h>
int main(void)
{
printf("IB_USER_VERBS_ABI_VERSION = %d\n", IB_USER_VERBS_ABI_VERSION);
return 0;
}
It fails during compilation phase with an error:
➜ /tmp gcc main.c
main.c:2:31: fatal error: rdma/ib_user_verb.h: No such file or directory
#include <rdma/ib_user_verb.h>
^
compilation terminated.
Fixes: 189aba99e700 ("IB/uverbs: Extend modify_qp and support packet pacing")
CC: Bodong Wang <bodong@mellanox.com>
CC: Matan Barak <matanb@mellanox.com>
CC: Christoph Hellwig <hch@infradead.org>
Tested-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>