Jakub Kicinski [Thu, 2 Nov 2017 08:31:33 +0000 (01:31 -0700)]
nfp: bpf: fall back to core NIC app if BPF not selected
If kernel config does not include BPF just replace the BPF
app handler with the handler for basic NIC. The BPF app
will now be built only if BPF infrastructure is selected
in kernel config.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 2 Nov 2017 08:31:32 +0000 (01:31 -0700)]
nfp: reorganize the app table
The app table is an unordered array right now. We have to search
apps by ID. It also makes it harder to fall back to core NIC if
advanced functions are not compiled into the kernel (e.g. eBPF).
Make the table keyed by app id.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 2 Nov 2017 08:31:31 +0000 (01:31 -0700)]
nfp: bpf: reject TC offload if XDP loaded
Recent TC changes dropped the check protecting us from trying
to offload a TC program if XDP programs are already loaded.
Fixes: 90d97315b3e7 ("nfp: bpf: Convert ndo_setup_tc offloads to block callbacks")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Thu, 2 Nov 2017 08:31:30 +0000 (01:31 -0700)]
nfp: flower: vxlan - ensure no sleep in atomic context
Functions called by the netevent notifier must be in atomic context.
Change the mutex to spinlock and ensure mem allocations are done with the
atomic flag.
Also, remove unnecessary locking after notifiers are unregistered.
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Thu, 2 Nov 2017 08:31:29 +0000 (01:31 -0700)]
nfp: flower: app should use struct nfp_repr
Ensure priv netdev data in flower app is cast to nfp_repr and not nfp_net
as in other apps.
Fixes: 363fc53b8b58 ("nfp: flower: Convert ndo_setup_tc offloads to block callbacks")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Bhole [Thu, 2 Nov 2017 08:09:45 +0000 (17:09 +0900)]
tools: bpf: handle long path in jit disasm
Use PATH_MAX instead of hardcoded array size 256
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vijaya Mohan Guvva [Wed, 1 Nov 2017 23:19:49 +0000 (16:19 -0700)]
liquidio: synchronize VF representor names with NIC firmware
LiquidIO firmware supports a vswitch that needs to know the names of the
VF representors in the host to maintain compatibility for direct
programming using external Openflow agents. So, for each VF representor,
send its name to the firmware when it gets registered and when its name
changes.
Signed-off-by: Vijaya Mohan Guvva <vijaya.guvva@cavium.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 08:01:39 +0000 (17:01 +0900)]
Merge branch 'BPF-range-marking-improvements-for-meta-data'
Daniel Borkmann says:
====================
BPF range marking improvements for meta data
The set contains improvements for direct packet access range
markings related to data_meta pointer and test cases for all
such access patterns that the verifier matches on.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 1 Nov 2017 22:58:11 +0000 (23:58 +0100)]
bpf: add test cases to bpf selftests to cover all meta tests
Lets also add test cases to cover all possible data_meta access tests
for good/bad access cases so we keep tracking them.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 1 Nov 2017 22:58:10 +0000 (23:58 +0100)]
bpf: also improve pattern matches for meta access
Follow-up to
0fd4759c5515 ("bpf: fix pattern matches for direct
packet access") to cover also the remaining data_meta/data matches
in the verifier. The matches are also refactored a bit to simplify
handling of all the cases.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 1 Nov 2017 22:58:09 +0000 (23:58 +0100)]
bpf: minor cleanups after merge
Two minor cleanups after Dave's recent merge in
f8ddadc4db6c
("Merge git://git.kernel.org...") of net into net-next in
order to get the code in line with what was done originally
in the net tree: i) use max() instead of max_t() since both
ranges are u16, ii) don't split the direct access test cases
in the middle with bpf_exit test cases from
390ee7e29fc
("bpf: enforce return code for cgroup-bpf programs").
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 1 Nov 2017 18:48:00 +0000 (11:48 -0700)]
security: bpf: replace include of linux/bpf.h with forward declarations
Touching linux/bpf.h makes us rebuild a surprisingly large
portion of the kernel. Remove the unnecessary dependency
from security.h, it only needs forward declarations.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 1 Nov 2017 18:29:47 +0000 (11:29 -0700)]
net: systemport: Only inspect valid switch port & queues
Hesoteric board configurations where port 0 is not available would still
make SYSTEMPORT inspect the switch port 0, queue 0, which, not being
enabled, would cause transmit timeouts over time. Just ignore those
unconfigured rings instead.
Fixes: 84ff33eeb23d ("net: systemport: Establish DSA network device queue mapping")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 07:47:30 +0000 (16:47 +0900)]
Merge branch 'nfp-bpf-rename-ALU_OP_NEG-and-support-BPF_NEG'
Jakub Kicinski says:
====================
nfp: bpf: rename ALU_OP_NEG and support BPF_NEG
Jiong says:
Compilers are starting to use BPF_NEG, for example LLVM. However, NFP
does not support JITing it. This patch set adds this. Unit test is added
as well.
Meanwhile, the current NFP_ALU_NEG is actually doing bitwise NOT (one's
complement) operation, so the name is misleading. This patch set corrects
this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiong Wang [Wed, 1 Nov 2017 17:38:25 +0000 (10:38 -0700)]
nfp: bpf: support [BPF_ALU | BPF_ALU64] | BPF_NEG
This patch supports BPF_NEG under both BPF_ALU64 and BPF_ALU. LLVM recently
starts to generate it.
NOTE: BPF_NEG takes single operand which is an register and serve as both
input and output.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiong Wang [Wed, 1 Nov 2017 17:38:24 +0000 (10:38 -0700)]
nfp: bpf: rename ALU_OP_NEG to ALU_OP_NOT
The current ALU_OP_NEG is Op encoding 0x4 for NPF ALU instruction. It is
actually performing "~B" operation which is bitwise NOT.
The using naming ALU_OP_NEG is misleading as NEG is -B which is not the
same as ~B.
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 07:15:26 +0000 (16:15 +0900)]
Merge branch 'dpaa-cleanups'
yuan linyu says:
====================
net: dpaa: two minor cleanup
original i try to remove duplicate code which clean allocated per-cpu area,
thanks to David S. Miller, there are two build warning as errors.
path 1: fix old code maybe-uninitialized warning.
path 2: remove duplicate code and fix unused var warning.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
yuan linyu [Wed, 1 Nov 2017 13:11:11 +0000 (21:11 +0800)]
net: dpaa: remove init which already done in per-cpu allocation
Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
yuan linyu [Wed, 1 Nov 2017 13:10:32 +0000 (21:10 +0800)]
net: dpaa: fix maybe uninitialized var in dpaa_open()
Signed-off-by: yuan linyu <Linyu.Yuan@alcatel-sbell.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer [Wed, 1 Nov 2017 11:44:45 +0000 (12:44 +0100)]
bpf: cpumap micro-optimization in cpu_map_enqueue
Discovered that the compiler laid-out asm code in suboptimal way
when studying perf report during benchmarking of cpumap. Help
the compiler by the marking unlikely code paths.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 07:10:40 +0000 (16:10 +0900)]
Merge branch 'net-sched-block-callbacks-follow-up'
Jiri Pirko says:
====================
net: sched: block callbacks follow-up
This patchset does a bit of cleanup of leftovers after block callbacks
patchset. The main part is patch 2, which restores the original handling
of tc offload feature flag.
---
v1->v2:
- rebased on top of current net-next (bnxt changes)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 1 Nov 2017 10:47:41 +0000 (11:47 +0100)]
net: sched: remove ndo_setup_tc check from tc_can_offload
Since tc_can_offload is always called from block callback or egdev
callback, no need to check if ndo_setup_tc exists.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 1 Nov 2017 10:47:40 +0000 (11:47 +0100)]
net: sched: remove tc_can_offload check from egdev call
Since the only user, mlx5 driver does the check in
mlx5e_setup_tc_block_cb, no need to check here.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 1 Nov 2017 10:47:39 +0000 (11:47 +0100)]
net: sched: move the can_offload check from binding phase to rule insertion phase
This restores the original behaviour before the block callbacks were
introduced. Allow the drivers to do binding of block always, no matter
if the NETIF_F_HW_TC feature is on or off. Move the check to the block
callback which is called for rule insertion.
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 1 Nov 2017 10:47:38 +0000 (11:47 +0100)]
net: sched: remove unused tc_should_offload helper
tc_should_offload is no longer used, remove it.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Wed, 1 Nov 2017 10:18:13 +0000 (12:18 +0200)]
net: bridge: add notifications for the bridge dev on vlan change
Currently the bridge device doesn't generate any notifications upon vlan
modifications on itself because it doesn't use the generic bridge
notifications.
With the recent changes we know if anything was modified in the vlan config
thus we can generate a notification when necessary for the bridge device
so add support to br_ifinfo_notify() similar to how other combined
functions are done - if port is present it takes precedence, otherwise
notify about the bridge. I've explicitly marked the locations where the
notification should be always for the port by setting bridge to NULL.
I've also taken the liberty to rearrange each modified function's local
variables in reverse xmas tree as well.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 10:17:15 +0000 (10:17 +0000)]
net: hns3: remove a couple of redundant assignments
The assignment to kinfo is redundant as this is a duplicate of
the initialiation of kinfo a few lines earlier, so it can be
removed. The assignment to v_tc_info is never read, so this
variable is redundant and can be removed completely. Cleans
up two clang warnings:
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c:433:34:
warning: Value stored to 'kinfo' during its initialization is never read
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c:775:3:
warning: Value stored to 'v_tc_info' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 09:09:13 +0000 (09:09 +0000)]
liquidio: remove redundant setting of inst_processed to zero
The zero value assigned to inst_processed at the end of each
iteration of the do-while loop is overwritten on the next iteration
and hence it is a redundant assignment and can be removed. Cleans
up clang warning:
drivers/net/ethernet/cavium/liquidio/request_manager.c:480:3:
warning: Value stored to 'inst_processed' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 08:57:37 +0000 (08:57 +0000)]
net: dl2k: remove redundant re-assignment to np
The pointer np is initialized and then re-assigned the same value
a few lines later. Remove the redundant duplicated assignment. Cleans
up clang warning:
drivers/net/ethernet/dlink/dl2k.c:314:25: warning: Value stored to
'np' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 1 Nov 2017 08:49:45 +0000 (08:49 +0000)]
wan: wanxl: remove redundant assignment to stat
stat set to zero and the value is never read, instead stat is
set again in the do-loop. Hence the setting to zero is redundant
and can be removed. Cleans up clang warning:
drivers/net/wan/wanxl.c:737:2: warning: Value stored to 'stat'
is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 05:59:52 +0000 (14:59 +0900)]
Merge git://git./linux/kernel/git/davem/net
Smooth Cong Wang's bug fix into 'net-next'. Basically put
the bulk of the tcf_block_put() logic from 'net' into
tcf_block_put_ext(), but after the offload unbind.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 05:19:53 +0000 (14:19 +0900)]
Merge branch 'samples-pktgen-updates'
Jesper Dangaard Brouer says:
====================
Updates for samples/pktgen
This patchset updates samples/pktgen and synchronize with changes
maintained in https://github.com/netoptimizer/network-testing/
Features wise Robert Hoo <robert.hu@intel.com> added support for
detecting and determining dev NUMA node IRQs, and added a new script
named pktgen_sample06_numa_awared_queue_irq_affinity.sh that use these
features.
Cleanup remove last of the old sample files, as IPv6 is covered by
existing sample code.
====================
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer [Wed, 1 Nov 2017 10:41:24 +0000 (11:41 +0100)]
samples/pktgen: remove remaining old pktgen sample scripts
Since commit
0f06a6787e05 ("samples: Add an IPv6 '-6' option to the
pktgen scripts") the newer pktgen_sampleXX script does show howto use
IPv6 with pktgen.
Thus, there is no longer a reason to keep the older sample scripts around.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer [Wed, 1 Nov 2017 10:41:19 +0000 (11:41 +0100)]
samples/pktgen: update sample03, no need for clones when bursting
Like sample05, don't use pktgen clone_skb feature when using 'burst' feature,
it is not really needed. This brings the burst users in sync.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Hoo [Wed, 1 Nov 2017 10:41:14 +0000 (11:41 +0100)]
samples/pktgen: add script pktgen_sample06_numa_awared_queue_irq_affinity.sh
This script simply does:
* Detect $DEV's NUMA node belonging.
* Bind each thread (processor of NUMA locality) with each $DEV queue's
irq affinity, 1:1 mapping.
* How many '-t' threads input determines how many queues will be utilized.
If '-f' designates first cpu id, then offset in the NUMA node's cpu list.
(Changes by Jesper: allow changing count from cmdline via '-n')
Signed-off-by: Robert Hoo <robert.hu@intel.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Hoo [Wed, 1 Nov 2017 10:41:09 +0000 (11:41 +0100)]
samples/pktgen: Add some helper functions
1. given a device, get its NUMA belongings
2. given a device, get its queues' irq numbers.
3. given a NUMA node, get its cpu id list.
Signed-off-by: Robert Hoo <robert.hu@intel.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Nov 2017 05:17:12 +0000 (14:17 +0900)]
Merge branch 'enic-Additional-ethtool-support'
Parvi Kaustubhi says:
====================
enic: Additional ethtool support
This patch set allows the user to show or modify rx/tx ring sizes using
ethtool.
v2:
- remove unused variable to fix build warning.
- update list of maintainers for cisco vic ethernet nic driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Parvi Kaustubhi [Wed, 1 Nov 2017 15:44:48 +0000 (08:44 -0700)]
MAINTAINERS: update MAINTAINERS for cisco vic
Add myself to list of maintainers for cisco vic ethernet nic driver. Remove
Neel as he left.
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parvi Kaustubhi [Wed, 1 Nov 2017 15:44:47 +0000 (08:44 -0700)]
enic: Add support for 'ethtool -g/-G'
Add support for displaying and modifying rx and tx ring sizes using
ethtool.
Also, increasing version to 2.3.0.45
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parvi Kaustubhi [Wed, 1 Nov 2017 15:44:46 +0000 (08:44 -0700)]
enic: reset fetch index
Since we are allowing rx ring size modification, reset fetch index
everytime. Otherwise it could have a stale value that can lead to a null
pointer dereference.
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Nov 2017 23:04:27 +0000 (16:04 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull signal bugfix from Eric Biederman:
"When making the generic support for SIGEMT conditional on the presence
of SIGEMT I made a typo that causes it to fail to activate. It was
noticed comparatively quickly but the bug report just made it to me
today"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Fix name of SIGEMT in #if defined() check
Andrew Clayton [Wed, 1 Nov 2017 15:49:59 +0000 (15:49 +0000)]
signal: Fix name of SIGEMT in #if defined() check
Commit
cc731525f26a ("signal: Remove kernel interal si_code magic")
added a check for SIGMET and NSIGEMT being defined. That SIGMET should
in fact be SIGEMT, with SIGEMT being defined in
arch/{alpha,mips,sparc}/include/uapi/asm/signal.h
This was actually pointed out by BenHutchings in a lwn.net comment
here https://lwn.net/Comments/734608/
Fixes: cc731525f26a ("signal: Remove kernel interal si_code magic")
Signed-off-by: Andrew Clayton <andrew@digital-domain.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Linus Torvalds [Wed, 1 Nov 2017 21:46:38 +0000 (14:46 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few fixes that should go into this series:
- Regression fix for ide-cd, ensuring that a request is fully
initialized. From Hongxu.
- Ditto fix for virtio_blk, from Bart.
- NVMe fix from Keith, ensuring that we set the right block size on
revalidation. If the block size changed, we'd be in trouble without
it.
- NVMe rdma fix from Sagi, fixing a potential hang while the
controller is being removed"
* 'for-linus' of git://git.kernel.dk/linux-block:
ide:ide-cd: fix kernel panic resulting from missing scsi_req_init
nvme: Fix setting logical block format when revalidating
virtio_blk: Fix an SG_IO regression
nvme-rdma: fix possible hang when issuing commands during ctrl removal
Linus Torvalds [Wed, 1 Nov 2017 15:29:01 +0000 (08:29 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix refcounting in xfrm_bundle_lookup() when using a dummy bundle,
from Steffen Klassert.
2) Fix crypto header handling in rx data frames in ath10k driver, from
Vasanthakumar Thiagarajan.
3) Fix use after free of qdisc when we defer tcp_chain_flush() to a
workqueue. From Cong Wang.
4) Fix double free in lapbether driver, from Pan Bian.
5) Sanitize TUNSETSNDBUF values, from Craig Gallek.
6) Fix refcounting when addrconf_permanent_addr() calls
ipv6_del_addr(). From Eric Dumazet.
7) Fix MTU probing bug in TCP that goes back to 2007, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
tcp: fix tcp_mtu_probe() vs highest_sack
ipv6: addrconf: increment ifp refcount before ipv6_del_addr()
tun/tap: sanitize TUNSETSNDBUF input
mlxsw: i2c: Fix buffer increment counter for write transaction
mlxsw: reg: Add high and low temperature thresholds
MAINTAINERS: Remove Yotam from mlxfw
MAINTAINERS: Update Yotam's E-mail
net: hns: set correct return value
net: lapbether: fix double free
bpf: remove SK_REDIRECT from UAPI
net: phy: marvell: Only configure RGMII delays when using RGMII
xfrm: Fix GSO for IPsec with GRE tunnel.
tc-testing: fix arg to ip command: -s -> -n
net_sched: remove tcf_block_put_deferred()
l2tp: hold tunnel in pppol2tp_connect()
Revert "ath10k: fix napi_poll budget overflow"
ath10k: rebuild crypto header in rx data frames
wcn36xx: Remove unnecessary rcu_read_unlock in wcn36xx_bss_info_changed
xfrm: Clear sk_dst_cache when applying per-socket policy.
xfrm: Fix xfrm_dst_cache memleak
Vlastimil Babka [Wed, 1 Nov 2017 07:21:25 +0000 (08:21 +0100)]
x86/mm: fix use-after-free of vma during userfaultfd fault
Syzkaller with KASAN has reported a use-after-free of vma->vm_flags in
__do_page_fault() with the following reproducer:
mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
mmap(&(0x7f0000011000/0x3000)=nil, 0x3000, 0x1, 0x32, 0xffffffffffffffff, 0x0)
r0 = userfaultfd(0x0)
ioctl$UFFDIO_API(r0, 0xc018aa3f, &(0x7f0000002000-0x18)={0xaa, 0x0, 0x0})
ioctl$UFFDIO_REGISTER(r0, 0xc020aa00, &(0x7f0000019000)={{&(0x7f0000012000/0x2000)=nil, 0x2000}, 0x1, 0x0})
r1 = gettid()
syz_open_dev$evdev(&(0x7f0000013000-0x12)="
2f6465762f696e7075742f6576656e742300", 0x0, 0x0)
tkill(r1, 0x7)
The vma should be pinned by mmap_sem, but handle_userfault() might (in a
return to userspace scenario) release it and then acquire again, so when
we return to __do_page_fault() (with other result than VM_FAULT_RETRY),
the vma might be gone.
Specifically, per Andrea the scenario is
"A return to userland to repeat the page fault later with a
VM_FAULT_NOPAGE retval (potentially after handling any pending signal
during the return to userland). The return to userland is identified
whenever FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in
vmf->flags"
However, since commit
a3c4fb7c9c2e ("x86/mm: Fix fault error path using
unsafe vma pointer") there is a vma_pkey() read of vma->vm_flags after
that point, which can thus become use-after-free. Fix this by moving
the read before calling handle_mm_fault().
Reported-by: syzbot <bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Fixes: 3c4fb7c9c2e ("x86/mm: Fix fault error path using unsafe vma pointer")
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 1 Nov 2017 14:59:39 +0000 (07:59 -0700)]
Merge tag 'smb3-file-name-too-long-fix' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French:
"smb3 file name too long fix"
* tag 'smb3-file-name-too-long-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: check MaxPathNameComponentLength != 0 before using it
Hongxu Jia [Tue, 31 Oct 2017 07:39:40 +0000 (15:39 +0800)]
ide:ide-cd: fix kernel panic resulting from missing scsi_req_init
Since we split the scsi_request out of struct request, while the
standard prep_rq_fn builds 10 byte cmds, it missed to invoke
scsi_req_init() to initialize certain fields of a scsi_request
structure (.__cmd[], .cmd, .cmd_len and .sense_len but no other
members of struct scsi_request).
An example panic on virtual machines (qemu/virtualbox) to boot
from IDE cdrom:
...
[ 8.754381] Call Trace:
[ 8.755419] blk_peek_request+0x182/0x2e0
[ 8.755863] blk_fetch_request+0x1c/0x40
[ 8.756148] ? ktime_get+0x40/0xa0
[ 8.756385] do_ide_request+0x37d/0x660
[ 8.756704] ? cfq_group_service_tree_add+0x98/0xc0
[ 8.757011] ? cfq_service_tree_add+0x1e5/0x2c0
[ 8.757313] ? ktime_get+0x40/0xa0
[ 8.757544] __blk_run_queue+0x3d/0x60
[ 8.757837] queue_unplugged+0x2f/0xc0
[ 8.758088] blk_flush_plug_list+0x1f4/0x240
[ 8.758362] blk_finish_plug+0x2c/0x40
...
[ 8.770906] RIP: ide_cdrom_prep_fn+0x63/0x180 RSP:
ffff92aec018bae8
[ 8.772329] ---[ end trace
6408481e551a85c9 ]---
...
Fixes: 82ed4db499b8 ("block: split scsi_request out of struct request")
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
David S. Miller [Wed, 1 Nov 2017 13:08:48 +0000 (22:08 +0900)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
pull request: bluetooth-next 2017-11-01
Here's one more bluetooth-next pull request for the 4.15 kernel.
- New NFA344A device entry for btusb drvier
- Fix race conditions in hci_ldisc
- Fix for isochronous interface assignments in btusb driver
- A few other smaller fixes & improvements
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Wed, 1 Nov 2017 07:08:04 +0000 (00:08 -0700)]
bpf: fix verifier memory leaks
fix verifier memory leaks
Fixes: 638f5b90d460 ("bpf: reduce verifier memory consumption")
Signed-off-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 13:06:04 +0000 (22:06 +0900)]
Merge branch 'cxgb4-add-hash-filter-support-to-tc-flower-offload'
Rahul Lakkireddy says:
====================
cxgb4: add hash-filter support to tc-flower offload
This series of patches add support to create hash-filters; a.k.a
exact-match filters, to tc-flower offload. T6 supports creating
~500K hash-filters in hw and can theoretically be expanded up to
~1 million.
Patch 1 fetches and saves the configured hw filter tuple field shifts
and filter mask.
Patch 2 initializes the driver to use hash-filter configuration.
Patch 3 adds support to create hash filters in hw.
Patch 4 adds support to delete hash filters in hw.
Patch 5 adds support to retrieve filter stats for hash filters.
Patch 6 converts the flower table to use rhashtable instead of
static hlist.
Patch 7 finally adds support to create hash filters via tc-flower
offload.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:05 +0000 (08:53 +0530)]
cxgb4: add support to create hash-filters via tc-flower offload
Determine whether the flow classifies as exact-match with respect to
4-tuple and configured tuple mask in hw. If successfully classified
as exact-match, offload the flow as hash-filter in hw.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:04 +0000 (08:53 +0530)]
cxgb4: convert flower table to use rhashtable
T6 supports ~500K hash filters and can theoretically climb up to
~1 million hash filters. Preallocated hash table is not efficient
in terms of memory usage. So, use rhashtable instead which gives
the flexibility to grow based on usage.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:03 +0000 (08:53 +0530)]
cxgb4: add support to retrieve stats for hash filters
Add support to retrieve packet-count and byte-count for hash-filters
by retrieving filter-entry appropriately based on whether the
request is for hash-filter or not.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:02 +0000 (08:53 +0530)]
cxgb4: add support to delete hash filter
Use a combined ulptx work-request to send hash filter deletion
request to hw. Hash filter deletion reply is processed on
getting cpl_abort_rpl_rss.
Release any L2T/SMT/CLIP entries on filter deletion.
Also, free up the corresponding filter entry.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:01 +0000 (08:53 +0530)]
cxgb4: add support to create hash filters
Add support to create hash (exact-match) filters based on the value
of 'hash' field in ch_filter_specification.
Allocate SMT/L2T entries if DMAC-rewrite/SMAC-rewrite is requested.
Allocate CLIP entry in case of IPv6 filter.
Use cpl_act_open_req[6] to send hash filter create request to hw.
Also, the filter tuple is calculated as part of sending this request.
Hash-filter reply is processed on getting cpl_act_open_rpl.
In case of success, various bits/fields in filter-tcb are set per
filter requirement, such as enabling filter hitcnts, and/or various
header rewrite operations, such as VLAN-rewrite, NAT or
(L3/L4)-rewrite, and SMAC/DMAC-rewrite. In case of failure, clear the
filter entry and release any hw resources occupied by it.
The patch also moves the functions set_tcb_field, set_tcb_tflag and
configure_filter_smac towards beginning of file.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:23:00 +0000 (08:53 +0530)]
cxgb4: initialize hash-filter configuration
Add support for hash-filter configuration on T6. Also, do basic
checks for the related initialization.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 1 Nov 2017 03:22:59 +0000 (08:52 +0530)]
cxgb4: save additional filter tuple field shifts in tp_params
Save additional filter tuple field shifts in tp_params based on
configured filter tuple fields.
Also, save the combined filter tuple mask based on configured
filter tuple fields.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 12:30:24 +0000 (21:30 +0900)]
Merge branch 'lan9303-Fix-STP-and-flooding-issues'
Egil Hjelmeland says:
====================
net: dsa: lan9303: Fix STP and flooding issues
This patch set finishes the STP support, and fixes flooding issues.
Patch 1 fixes a flooding issue in the previous patch set.
Patch 2 finishes STP support by adding a ALR entry.
Patch 3 prevent duplicate flooding in HW and SW bridge.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 31 Oct 2017 14:48:02 +0000 (15:48 +0100)]
net: dsa: lan9303: lan9303_rcv set skb->offload_fwd_mark
The chip flood broadcast and unknown multicast frames.
On receive set skb->offload_fwd_mark to prevent the SW from flooding to the
same ports.
One exception: Because the ALR is set up to forward STP BPDUs only to CPU,
the SW bridge should flood STP BPDUs if local STP is not enabled.
This is archived by not setting skb->offload_fwd_mark on STP BPDUs.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 31 Oct 2017 14:48:01 +0000 (15:48 +0100)]
net: dsa: lan9303: Add STP ALR entry on port 0
STP BPDUs arriving on user ports must sent to CPU port only,
for processing by the SW bridge.
Add an ALR entry with STP state override to fix that.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Egil Hjelmeland [Tue, 31 Oct 2017 14:48:00 +0000 (15:48 +0100)]
net: dsa: lan9303: Transmit using ALR when unicast
lan9303_xmit_use_arl() introduced in previous patch set is wrong.
The chip flood broadcast and unknown multicast frames. The effect is that
broadcasts and multicasts are duplicated on egress. It is not possible to
configure the chip to direct unknown multicasts to CPU port only.
This means that only unicast frames can be transmitted using ALR lookup.
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 14:37:55 +0000 (14:37 +0000)]
net: thunderx: remove a couple of redundant assignments
The assignment to pointer msg is redundant as it is never read, so
remove msg. Also remove the first assignment to qset as this is not
read before the next re-assignment of a new value to qset in the
for-loop. Cleans up two clang warnings:
drivers/net/ethernet/cavium/thunder/nic_main.c:589:2: warning: Value
stored to 'msg' is never read
drivers/net/ethernet/cavium/thunder/nic_main.c:611:2: warning: Value
stored to 'qset' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Tue, 31 Oct 2017 14:37:18 +0000 (16:37 +0200)]
MAINTAINERS: Add lib/net_utils.c to NETWORKING (general)
It looks like the best place in MAINTAINERS data base to cover this
orphaned module.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 31 Oct 2017 14:29:47 +0000 (14:29 +0000)]
sfc: support rx-fcs and rx-all
Ethernet FCS inclusion (rx-fcs) is supported on EF10 NICs, conditional on
a firmware capability bit (MC_CMD_GET_CAPABILITIES_OUT_RX_INCLUDE_FCS).
To receive frames with bad FCS (rx-all) we just don't return the discard
flag EFX_RX_PKT_DISCARD from efx_ef10_handle_rx_event_errors() or
efx_farch_handle_rx_not_ok().
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 14:23:24 +0000 (14:23 +0000)]
net: macb: remove redundant assignment to variable work_done
Variable work_done is set to zero and this value is never read, instead
it is set to another value a few statements later. Remove the redundant
assignment. Cleans up clang warning:
drivers/net/ethernet/cadence/macb_main.c:1221:2: warning: Value stored
to 'work_done' is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Alexander Dahl <ada@thorsis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Tue, 31 Oct 2017 13:32:38 +0000 (14:32 +0100)]
ipv4: fix validate_source for VRF setup
David reported breakages of VRF scenarios due to the
commit
6e617de84e87 ("net: avoid a full fib lookup when rp_filter is
disabled."): the local addresses based test is too strict when VRFs
are in place.
With this change we fall-back to a full lookup when custom fib rules
are in place; so that we address the VRF use case and possibly other
similar issues in non trivial setups.
v1 -> v2:
- fix build breakage when CONFIG_IP_MULTIPLE_TABLES is not defined,
reported by the kbuild test robot
Reported-by: David Ahern <dsahern@gmail.com>
Fixes: 6e617de84e87 ("net: avoid a full fib lookup when rp_filter is disabled.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Tue, 31 Oct 2017 13:28:16 +0000 (13:28 +0000)]
sctp: fix error return code in sctp_send_add_streams()
Fix to returnerror code -ENOMEM from the sctp_make_strreset_addstrm()
error handling case instead of 0. 'retval' can be overwritten to 0 after
call sctp_stream_alloc_out().
Fixes: e090abd0d81c ("sctp: factor out stream->out allocation")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 12:01:47 +0000 (12:01 +0000)]
net: hso: remove redundant unused variable dev
The pointer dev is being assigned but is never used, hence it is
redundant and can be removed. Cleans up clang warning:
drivers/net/usb/hso.c:2280:2: warning: Value stored to 'dev' is
never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gao Feng [Tue, 31 Oct 2017 10:25:37 +0000 (18:25 +0800)]
ppp: Destroy the mutex when cleanup
The mutex_destroy only makes sense when enable DEBUG_MUTEX. For the
good readbility, it's better to invoke it in exit func when the init
func invokes mutex_init.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 31 Oct 2017 10:08:23 +0000 (10:08 +0000)]
net: ethernet: slicoss: remove redundant initialization of idx
Variable idx is being initialized and later on over-written by
a new value in a do-loop without the initial value ever being
read. Hence the initializion is redundant and can be removed.
Cleans up clang warning:
drivers/net/ethernet/alacritech/slicoss.c:358:15: warning: Value
stored to 'idx' during its initialization is never read
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 Oct 2017 06:08:20 +0000 (23:08 -0700)]
tcp: fix tcp_mtu_probe() vs highest_sack
Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.
If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.
Bad things would then happen, including infinite loops.
This patch renames tcp_highest_sack_combine() and uses it
from tcp_mtu_probe() to fix the bug.
Note that I also removed one test against tp->sacked_out,
since we want to replace tp->highest_sack regardless of whatever
condition, since keeping a stale pointer to freed skb is a recipe
for disaster.
Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Roman Gushchin <guro@fb.com>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 Oct 2017 05:47:09 +0000 (22:47 -0700)]
ipv6: addrconf: increment ifp refcount before ipv6_del_addr()
In the (unlikely) event fixup_permanent_addr() returns a failure,
addrconf_permanent_addr() calls ipv6_del_addr() without the
mandatory call to in6_ifa_hold(), leading to a refcount error,
spotted by syzkaller :
WARNING: CPU: 1 PID: 3142 at lib/refcount.c:227 refcount_dec+0x4c/0x50
lib/refcount.c:227
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 3142 Comm: ip Not tainted 4.14.0-rc4-next-
20171009+ #33
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
panic+0x1e4/0x41c kernel/panic.c:181
__warn+0x1c4/0x1e0 kernel/panic.c:544
report_bug+0x211/0x2d0 lib/bug.c:183
fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_dec+0x4c/0x50 lib/refcount.c:227
RSP: 0018:
ffff8801ca49e680 EFLAGS:
00010286
RAX:
000000000000002c RBX:
ffff8801d07cfcdc RCX:
0000000000000000
RDX:
000000000000002c RSI:
1ffff10039493c90 RDI:
ffffed0039493cc4
RBP:
ffff8801ca49e688 R08:
ffff8801ca49dd70 R09:
0000000000000000
R10:
ffff8801ca49df58 R11:
0000000000000000 R12:
1ffff10039493cd9
R13:
ffff8801ca49e6e8 R14:
ffff8801ca49e7e8 R15:
ffff8801d07cfcdc
__in6_ifa_put include/net/addrconf.h:369 [inline]
ipv6_del_addr+0x42b/0xb60 net/ipv6/addrconf.c:1208
addrconf_permanent_addr net/ipv6/addrconf.c:3327 [inline]
addrconf_notify+0x1c66/0x2190 net/ipv6/addrconf.c:3393
notifier_call_chain+0x136/0x2c0 kernel/notifier.c:93
__raw_notifier_call_chain kernel/notifier.c:394 [inline]
raw_notifier_call_chain+0x2d/0x40 kernel/notifier.c:401
call_netdevice_notifiers_info+0x32/0x60 net/core/dev.c:1697
call_netdevice_notifiers net/core/dev.c:1715 [inline]
__dev_notify_flags+0x15d/0x430 net/core/dev.c:6843
dev_change_flags+0xf5/0x140 net/core/dev.c:6879
do_setlink+0xa1b/0x38e0 net/core/rtnetlink.c:2113
rtnl_newlink+0xf0d/0x1a40 net/core/rtnetlink.c:2661
rtnetlink_rcv_msg+0x733/0x1090 net/core/rtnetlink.c:4301
netlink_rcv_skb+0x216/0x440 net/netlink/af_netlink.c:2408
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4313
netlink_unicast_kernel net/netlink/af_netlink.c:1273 [inline]
netlink_unicast+0x4e8/0x6f0 net/netlink/af_netlink.c:1299
netlink_sendmsg+0xa4a/0xe70 net/netlink/af_netlink.c:1862
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
___sys_sendmsg+0x75b/0x8a0 net/socket.c:2049
__sys_sendmsg+0xe5/0x210 net/socket.c:2083
SYSC_sendmsg net/socket.c:2094 [inline]
SyS_sendmsg+0x2d/0x50 net/socket.c:2090
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7fa9174d3320
RSP: 002b:
00007ffe302ae9e8 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
00007ffe302b2ae0 RCX:
00007fa9174d3320
RDX:
0000000000000000 RSI:
00007ffe302aea20 RDI:
0000000000000016
RBP:
0000000000000082 R08:
0000000000000000 R09:
000000000000000f
R10:
0000000000000000 R11:
0000000000000246 R12:
00007ffe302b32a0
R13:
0000000000000000 R14:
00007ffe302b2ab8 R15:
00007ffe302b32b8
Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ahern <dsahern@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 12:15:09 +0000 (21:15 +0900)]
Merge branch 'PHYLINK-cosmetic-and-build-fixes'
Florian Fainelli says:
====================
PHYLINK cosmetic and build fixes
Please find two small "fixes" one that corrects some stylistic changes and
another one that fixes an actual build failure in sfp.c. Since PHYLINK is
not directly visible to user, and there are no in-tree users yet (coming)
this is not targeted at "net" but "net-next" instead.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 31 Oct 2017 04:42:58 +0000 (21:42 -0700)]
net: phy: Fix sfp.c build against GPIO definitions
include/gpio.h does not contain the references we want, we should be including
linux/gpio/consumer.h instead.
Fixes: 73970055450e ("sfp: add SFP module support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 31 Oct 2017 04:42:57 +0000 (21:42 -0700)]
net: phy: Cosmetic fixes to phylink/sfp/sfp-bus.c
Perform a number of stylistic changes to phylink.c, sfp.c and sfp-bus.c:
- align with netdev-style comments
- align function arguments to the opening parenthesis
- remove blank lines
- fixup a few lines over 80 columns
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Tue, 31 Oct 2017 02:39:56 +0000 (19:39 -0700)]
bpf: document answers to common questions about BPF
to address common misconceptions about what BPF is and what it's not
add short BPF Q&A that clarifies core BPF design principles and
answers some common questions.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vishwanath Pai [Mon, 30 Oct 2017 23:38:52 +0000 (19:38 -0400)]
net: display hw address of source machine during ipv6 DAD failure
This patch updates the error messages displayed in kernel log to include
hwaddress of the source machine that caused ipv6 duplicate address
detection failures.
Examples:
a) When we receive a NA packet from another machine advertising our
address:
ICMPv6: NA: 34:ab:cd:56:11:e8 advertised our address 2001:db8:: on eth0!
b) When we detect DAD failure during address assignment to an interface:
IPv6: eth0: IPv6 duplicate address 2001:db8:: used by 34:ab:cd:56:11:e8
detected!
v2:
Changed %pI6 to %pI6c in ndisc_recv_na()
Chaged the v6 address in the commit message to 2001:db8::
Suggested-by: Igor Lubashev <ilubashe@akamai.com>
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Craig Gallek [Mon, 30 Oct 2017 22:50:11 +0000 (18:50 -0400)]
tun/tap: sanitize TUNSETSNDBUF input
Syzkaller found several variants of the lockup below by setting negative
values with the TUNSETSNDBUF ioctl. This patch adds a sanity check
to both the tun and tap versions of this ioctl.
watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:2389]
Modules linked in:
irq event stamp:
329692056
hardirqs last enabled at (
329692055): [<
ffffffff824b8381>] _raw_spin_unlock_irqrestore+0x31/0x75
hardirqs last disabled at (
329692056): [<
ffffffff824b9e58>] apic_timer_interrupt+0x98/0xb0
softirqs last enabled at (
35659740): [<
ffffffff824bc958>] __do_softirq+0x328/0x48c
softirqs last disabled at (
35659731): [<
ffffffff811c796c>] irq_exit+0xbc/0xd0
CPU: 0 PID: 2389 Comm: repro Not tainted 4.14.0-rc7 #23
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task:
ffff880009452140 task.stack:
ffff880006a20000
RIP: 0010:_raw_spin_lock_irqsave+0x11/0x80
RSP: 0018:
ffff880006a27c50 EFLAGS:
00000282 ORIG_RAX:
ffffffffffffff10
RAX:
ffff880009ac68d0 RBX:
ffff880006a27ce0 RCX:
0000000000000000
RDX:
0000000000000001 RSI:
ffff880006a27ce0 RDI:
ffff880009ac6900
RBP:
ffff880006a27c60 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000001 R11:
000000000063ff00 R12:
ffff880009ac6900
R13:
ffff880006a27cf8 R14:
0000000000000001 R15:
ffff880006a27cf8
FS:
00007f4be4838700(0000) GS:
ffff88000cc00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000020101000 CR3:
0000000009616000 CR4:
00000000000006f0
Call Trace:
prepare_to_wait+0x26/0xc0
sock_alloc_send_pskb+0x14e/0x270
? remove_wait_queue+0x60/0x60
tun_get_user+0x2cc/0x19d0
? __tun_get+0x60/0x1b0
tun_chr_write_iter+0x57/0x86
__vfs_write+0x156/0x1e0
vfs_write+0xf7/0x230
SyS_write+0x57/0xd0
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x7f4be4356df9
RSP: 002b:
00007ffc18101c08 EFLAGS:
00000293 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f4be4356df9
RDX:
0000000000000046 RSI:
0000000020101000 RDI:
0000000000000005
RBP:
00007ffc18101c40 R08:
0000000000000001 R09:
0000000000000001
R10:
0000000000000001 R11:
0000000000000293 R12:
0000559c75f64780
R13:
00007ffc18101d30 R14:
0000000000000000 R15:
0000000000000000
Fixes: 33dccbb050bb ("tun: Limit amount of queued packets per device")
Fixes: 20d29d7a916a ("net: macvtap driver")
Signed-off-by: Craig Gallek <kraig@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 30 Oct 2017 22:17:20 +0000 (15:17 -0700)]
net: dsa: b53: Have b53_hdr_setup() enable/disable tagging
Have b53_hdr_setup() check what kind of tagging protocol is configured
(Broadcom or none) and apply the correct settings in both cases.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 11:46:41 +0000 (20:46 +0900)]
Merge branch 'netrom-cleanups'
Gustavo A. R. Silva says:
====================
netrom: refactor code and mark expected switch fall-throughs
The aim of this patchset is firstly to refactor code in nr_route.c in order to make it
easier to read and maintain and, secondly, to mark some expected switch fall-throughs
in preparation to enabling -Wimplicit-fallthrough.
I have to mention that I did not implement any unit test.
If someone has any suggestions on how I could test this piece of code
it'd be greatly appreciated.
Changes in v2:
- Make use of the swap macro and remove inline keyword as suggested by
Walter Harms and Kevin Dawson.
Changes in v3:
- Update subject for both patches.
- Add this cover letter as suggested by David Miller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Fri, 27 Oct 2017 05:51:08 +0000 (00:51 -0500)]
net: netrom: nr_route: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Fri, 27 Oct 2017 05:51:04 +0000 (00:51 -0500)]
net: netrom: nr_route: refactor code in nr_add_node
Code refactoring in order to make the code easier to read and maintain.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 1 Nov 2017 11:10:42 +0000 (12:10 +0100)]
mlxsw: i2c: Fix buffer increment counter for write transaction
It fixes a problem for the last chunk where 'chunk_size' is smaller than
MLXSW_I2C_BLK_MAX and data is copied to the wrong offset, overriding
previous data.
Fixes: 6882b0aee180 ("mlxsw: Introduce support for I2C bus")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 10:27:46 +0000 (19:27 +0900)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2017-11-01
1) Fix a memleak when a packet matches a policy
without a matching state.
2) Reset the socket cached dst_entry when inserting
a socket policy, otherwise the policy might be
ignored. From Jonathan Basseri.
3) Fix GSO for a IPsec, GRE tunnel combination.
We reset the encapsulation field at the skb
too erly, as a result GRE does not segment
GSO packets. Fix this by resetting the the
encapsulation field right before the
transformation where the inner headers get
invalid.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 30 Oct 2017 21:06:45 +0000 (14:06 -0700)]
net: tipc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 30 Oct 2017 21:05:41 +0000 (14:05 -0700)]
drivers/net: tundra: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Philippe Reynes <tremyfr@gmail.com>
Cc: "yuval.shaia@oracle.com" <yuval.shaia@oracle.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Mon, 30 Oct 2017 21:05:12 +0000 (14:05 -0700)]
drivers/net: ntb_netdev: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <Allen.Hubbe@emc.com>
Cc: linux-ntb@googlegroups.com
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonghong Song [Mon, 30 Oct 2017 20:50:22 +0000 (13:50 -0700)]
bpf: avoid rcu_dereference inside bpf_event_mutex lock region
During perf event attaching/detaching bpf programs,
the tp_event->prog_array change is protected by the
bpf_event_mutex lock in both attaching and deteching
functions. Although tp_event->prog_array is a rcu
pointer, rcu_derefrence is not needed to access it
since mutex lock will guarantee ordering.
Verified through "make C=2" that sparse
locking check still happy with the new change.
Also change the label name in perf_event_{attach,detach}_bpf_prog
from "out" to "unlock" to reflect the code action after the label.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 30 Oct 2017 17:07:17 +0000 (10:07 -0700)]
net: sit: Update lookup to handle links set to L3 slave
Using SIT tunnels with VRFs works fine if the underlay device is in a
VRF and the link parameter is set to the VRF device. e.g.,
ip tunnel add jtun mode sit remote <addr> local <addr> dev myvrf
Update the device check to allow the link to be the enslaved device as
well. e.g.,
ip tunnel add jtun mode sit remote <addr> local <addr> dev eth4
where eth4 is enslaved to myvrf.
Reported-by: Jeff Barnhill <0xeffeff@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arvind Yadav [Mon, 30 Oct 2017 15:52:03 +0000 (21:22 +0530)]
atm: iphase: Fix space before '[' error.
Fix checkpatch.pl error:
ERROR: space prohibited before open square bracket '['.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Mon, 30 Oct 2017 10:56:33 +0000 (12:56 +0200)]
net: bridge: add neigh_suppress to bridge port policies
Add an entry for IFLA_BRPORT_NEIGH_SUPPRESS to bridge port policies.
Fixes: 821f1b21cabb ("bridge: add new BR_NEIGH_SUPPRESS port flag to suppress arp and nd flood")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 1 Nov 2017 03:28:33 +0000 (12:28 +0900)]
Merge branch 'mvpp2-various-improvements'
Antoine Tenart says:
====================
net: mvpp2: various improvements
This series includes various patches improving the Marvell PPv2 driver.
I send them as a series to avoid any possible merge conflict.
- Patches 1 and 2 improve the initializing of the Tx and Rx FIFO.
- Patch 3 initialize the RSS table to evenly distribute the ingress
packets across multiple Rx queues based on their hashes.
- Patch 4 limits the number of TSO segments sent to the driver, to avoid
having more segments to handle than the corresponding number of
available descriptors.
- Patch 5 and 6 are cosmetic improvements.
This applies on today's net-next branch, The patches were tested
extensively (I ran iperf and http downloads in parallel, transferring
TBs of data).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:33 +0000 (11:23 +0100)]
net: mvpp2: simplify the Tx desc set DMA logic
Two functions were always used to set the DMA addresses in Tx
descriptors, because this address is split into a base+offset in the
descriptors. A mask was used to come up with the base and offset
addresses and two functions were called, mvpp2_txdesc_dma_addr_set() and
mvpp2_txdesc_offset_set().
This patch moves the base+offset calculation logic to
mvpp2_txdesc_dma_addr_set(), and removes mvpp2_txdesc_offset_set() to
simplify things.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:32 +0000 (11:23 +0100)]
net: mvpp2: use the aggr txq size define everywhere
Cosmetic patch using the MVPP2_AGGR_TXQ_SIZE everywhere instead of the
size field of aggr_txq, as the size never change and is always equal to
the MVPP2_AGGR_TXQ_SIZE define.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:31 +0000 (11:23 +0100)]
net: mvpp2: limit TSO segments and use stop/wake thresholds
Too many TSO descriptors can be required for the default queue size,
when using small MSS values for example. Prevent this by adding a
maximum number of allowed TSO segments (300). In addition set a stop and
a wake thresholds to stop the queue when there's no room for a 1 "worst
case scenario skb". Wake up the queue when the number of descriptors is
low enough.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:30 +0000 (11:23 +0100)]
net: mvpp2: initialize the RSS tables
This patch initialize the RSS tables to evenly (depending on the packets
RSS hashes) distribute the packets across port Rx queues. This helps to
handle packets on different CPUs to improve performances, as more queues
will be used in parallel.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:29 +0000 (11:23 +0100)]
net: mvpp2: initialize the Tx FIFO size
So far only the Rx FIFO size was initialized. For PPv2.2 the Tx FIFO
size can be set as well. This patch initializes the Tx FIFO size for
PPv2.2 controllers to 3K.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Mon, 30 Oct 2017 10:23:28 +0000 (11:23 +0100)]
net: mvpp2: set the Rx FIFO size depending on the port speeds for PPv2.2
The Rx FIFO size was set to the same value for all ports. This patch
sets it depending on the maximum speed a given port can handle. This is
only working for PPv2.2.
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 30 Oct 2017 09:51:18 +0000 (10:51 +0100)]
mlxsw: reg: Add high and low temperature thresholds
The ASIC has the ability to generate events whenever a sensor indicates
the temperature goes above or below its high or low thresholds,
respectively.
In new firmware versions the firmware enforces a minimum of 5
degrees Celsius difference between both thresholds. Make the driver
conform to this requirement.
Note that this is required even when the events are disabled, as in
certain systems interrupts are generated via GPIO based on these
thresholds.
Fixes: 85926f877040 ("mlxsw: reg: Add definition of temperature management registers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 30 Oct 2017 09:41:37 +0000 (11:41 +0200)]
MAINTAINERS: Remove Yotam from mlxfw
Provide a mailing list for maintenance of the module instead.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>