David S. Miller [Fri, 5 Oct 2018 21:36:45 +0000 (14:36 -0700)]
Merge branch 'mscc-ocelot-add-support-for-SerDes-muxing-configuration'
Quentin Schulz says:
====================
mscc: ocelot: add support for SerDes muxing configuration
The Ocelot switch has currently an hardcoded SerDes muxing that suits only
a particular use case. Any other board setup will fail to work.
To prepare for upcoming boards' support that do not have the same muxing,
create a PHY driver that will handle all possible cases.
A SerDes can work in SGMII, QSGMII or PCIe and is also muxed to use a
given port depending on the selected mode or board design.
The SerDes configuration is in the middle of an address space (HSIO) that
is used to configure some parts in the MAC controller driver, that is why
we need to use a syscon so that we can write to the same address space from
different drivers safely using regmap.
This breaks backward compatibility but it's fine because there's only one
board at the moment that is using what's modified in this patch series.
This will break git bisect.
Even though this patch series is about SerDes __muxing__ configuration, the
DT node is named serdes for the simple reason that I couldn't find any
mention to SerDes anywhere else from the address space handled by this
driver.
v4:
- add reviewed-by,
- format the patch series with -M for identifying renamed files,
- add parent info in DT binding of the SerDes IP,
- move to macros SERDES[16]G(X) instead of multiple SERDES[16]G_[012345]
constants,
- move to SERDES[16]G_MAX being the last VALID macro of a type, so
migrate to <= conditions instead of < when iterating,
- create a SERDES_MUX_SGMII and SERDES_MUX_QSGMII macro so the muxing
configurations are a tad more readable,
- use a bunch of unsigned int instead of int,
- return -EOPNOTSUPP for SERDES6G/PCIe until it's supported,
- simplify condition when there is an error code returned by
devm_of_phy_get,
v3:
- add Paul Burton's Acked-By on MIPS patches so that the patch series can
be merged in the net tree in its entirety,
v2:
- use a switch case for setting the phy_mode in the SerDes driver as
suggested by Andrew,
- stop replacing the value of the error pointer in the SerDes driver,
- use a dev_dbg for the deferring of the probe in the SerDes driver,
- use constants in the Device Tree to select the SerDes macro in use with
a port,
- adapt the SerDes driver to use those constants,
- add a header file in include/dt-bindings for the constants,
- fix space/tab issue,
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:08 +0000 (14:22 +0200)]
net: mscc: ocelot: make use of SerDes PHYs for handling their configuration
Previously, the SerDes muxing was hardcoded to a given mode in the MAC
controller driver. Now, the SerDes muxing is configured within the
Device Tree and is enforced in the MAC controller driver so we can have
a lot of different SerDes configurations.
Make use of the SerDes PHYs in the MAC controller to set up the SerDes
according to the SerDes<->switch port mapping and the communication mode
with the Ethernet PHY.
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:07 +0000 (14:22 +0200)]
phy: add driver for Microsemi Ocelot SerDes muxing
The Microsemi Ocelot can mux SerDes lanes (aka macros) to different
switch ports or even make it act as a PCIe interface.
This adds support for the muxing of the SerDes.
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:06 +0000 (14:22 +0200)]
dt-bindings: add constants for Microsemi Ocelot SerDes driver
The Microsemi Ocelot has multiple SerDes and requires that the SerDes be
muxed accordingly to the hardware representation.
Let's add a constant for each SerDes available in the Microsemi Ocelot.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:05 +0000 (14:22 +0200)]
MIPS: mscc: ocelot: add SerDes mux DT node
The Microsemi Ocelot has a set of register for SerDes/switch port muxing
as well as PCIe muxing for a specific SerDes, so let's add the device
and all SerDes in the Device Tree.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:04 +0000 (14:22 +0200)]
dt-bindings: phy: add DT binding for Microsemi Ocelot SerDes muxing
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:03 +0000 (14:22 +0200)]
phy: add QSGMII and PCIE modes
Prepare for upcoming phys that'll handle QSGMII or PCIe.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:02 +0000 (14:22 +0200)]
net: mscc: ocelot: simplify register access for PLL5 configuration
Since HSIO address space can be accessed by different drivers, let's
simplify the register address definitions so that it can be easily used
by all drivers and put the register address definition in the
include/soc/mscc/ocelot_hsio.h header file.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:01 +0000 (14:22 +0200)]
net: mscc: ocelot: move the HSIO header to include/soc
Since HSIO address space can be used by different drivers (PLL, SerDes
muxing, temperature sensor), let's move it somewhere it can be included
by all drivers.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:22:00 +0000 (14:22 +0200)]
net: mscc: ocelot: get HSIO regmap from syscon
HSIO address space was moved to a syscon, hence we need to get the
regmap of this address space from there and no more from the device
node.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:21:59 +0000 (14:21 +0200)]
dt-bindings: net: ocelot: remove hsio from the list of register address spaces
HSIO register address space should be handled outside of the MAC
controller as there are some registers for PLL5 configuring,
SerDes/switch port muxing and a thermal sensor IP, so let's remove it.
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Schulz [Thu, 4 Oct 2018 12:21:58 +0000 (14:21 +0200)]
MIPS: mscc: ocelot: make HSIO registers address range a syscon
HSIO contains registers for PLL5 configuration, SerDes/switch port
muxing and a thermal sensor, hence we can't keep it in the switch DT
node.
Acked-by: Paul Burton <paul.burton@mips.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Sitnicki [Thu, 4 Oct 2018 09:09:40 +0000 (11:09 +0200)]
socket: Tighten no-error check in bind()
move_addr_to_kernel() returns only negative values on error, or zero on
success. Rewrite the error check to an idiomatic form to avoid confusing
the reader.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lance Roy [Thu, 4 Oct 2018 07:46:57 +0000 (00:46 -0700)]
atm: nicstar: Replace spin_is_locked() with spin_trylock()
ns_poll() used spin_is_locked() + spin_lock() to get achieve the same
thing as a spin_trylock(), so simplify it by using that instead. This is
also a step towards possibly removing spin_is_locked().
Signed-off-by: Lance Roy <ldr709@gmail.com>
Cc: Chas Williams <3chas3@gmail.com>
Cc: <linux-atm-general@lists.sourceforge.net>
Cc: <netdev@vger.kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 5 Oct 2018 19:01:55 +0000 (12:01 -0700)]
Merge branch 'hns3-Unicast-MAC-VLAN-table'
Salil Mehta says:
====================
Fixes, minor changes & cleanups for the Unicast MAC VLAN table
This patch-set presents necessary modifications, fixes & cleanups related
to Unicast MAC Vlan Table to support revision 0x21 hardware.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Fri, 5 Oct 2018 17:03:29 +0000 (18:03 +0100)]
net: hns3: Fix for rx vlan id handle to support Rev 0x21 hardware
In revision 0x20, we use vlan id != 0 to check whether a vlan tag
has been offloaded, so vlan id 0 is not supported.
In revision 0x21, rx buffer descriptor adds two bits to indicate
whether one or more vlan tags have been offloaded, so vlan id 0
is valid now.
This patch seperates the handle for vlan id 0, add vlan id 0 support
for revision 0x21.
Fixes: 5b5455a9ed5a ("net: hns3: Add STRP_TAGP field support for hardware revision 0x21")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhongzhu Liu [Fri, 5 Oct 2018 17:03:28 +0000 (18:03 +0100)]
net: hns3: Add egress/ingress vlan filter for revision 0x21
In revision 0x21, hw supports both ingress and egress vlan filter.
This patch adds support for it.
Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Fri, 5 Oct 2018 17:03:27 +0000 (18:03 +0100)]
net: hns3: Drop depricated mta table support
For mta table support has been dropped, remove the code for mta table.
Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Fri, 5 Oct 2018 17:03:26 +0000 (18:03 +0100)]
net: hns3: Optimize for unicast mac vlan table
In previously implement for unicast mac vlan table, the space is
shared by all the functions, driver does nothing when the space is
exhausted. This patch preallocates the space of unicast mac vlan
table for each function by software. Each function can only use its
private space and available shared space, avoiding single function
exhausts too much space, and other functions are unable to add
unicast mac address.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Fri, 5 Oct 2018 17:03:25 +0000 (18:03 +0100)]
net: hns3: Clear mac vlan table entries when unload driver or function reset
In original codes, the mac vlan table entries are not cleared when
unload hns3 driver. The dirty mac vlan table entries will make the
result of looking up mac vlan table being unexpected.
When doing core reset or global reset, the firmware will clear all
the tables for driver, and driver shouldn't send any commands to
firmware during reset. But when doing function reset, the driver
needs to clear the tables itself.
This patch clears the mac vlan table entries for each client when
unload driver or reset.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Fri, 5 Oct 2018 17:03:24 +0000 (18:03 +0100)]
net: hns3: Remove the default mask configuration for mac vlan table
The default mask configuration has been done by firmware, so the driver
doesn't need to do it any more.
Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 5 Oct 2018 17:01:19 +0000 (10:01 -0700)]
fib_tests: Add tests for invalid metric on route
Add ipv4 and ipv6 test cases with an invalid metrics option causing
ip_metrics_convert to fail. Tests clean up path during route add.
Also, add nodad to to ipv6 address add. When running ipv6_route_metrics
directly seeing an occasional failure on the "Using route with mtu metric"
test case.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 5 Oct 2018 16:17:50 +0000 (09:17 -0700)]
ipv6: do not leave garbage in rt->fib6_metrics
In case ip_fib_metrics_init() returns an error, we better
rewrite rt->fib6_metrics with &dst_default_metrics so that
we do not crash later in ip_fib_metrics_put()
Fixes: 767a2217533f ("net: common metrics init helper for FIB entries")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 5 Oct 2018 15:31:40 +0000 (11:31 -0400)]
udp: gro behind static key
Avoid the socket lookup cost in udp_gro_receive if no socket has a
udp tunnel callback configured.
udp_sk(sk)->gro_receive requires a registration with
setup_udp_tunnel_sock, which enables the static key.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Fri, 5 Oct 2018 13:29:09 +0000 (15:29 +0200)]
gigaset: asyncdata: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in this particular case, I replaced the
" --v-- fall through --v-- " comment with a proper
"fall through", which is what GCC is expecting to find.
Addresses-Coverity-ID:
1364476 ("Missing break in switch")
Addresses-Coverity-ID:
1364477 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Fri, 5 Oct 2018 09:34:45 +0000 (15:04 +0530)]
cxgb4: use FW_PORT_ACTION_L1_CFG32 for 32 bit capability
when 32 bit port capability is in use, use FW_PORT_ACTION_L1_CFG32
rather than FW_PORT_ACTION_L1_CFG.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 2 Oct 2018 19:50:19 +0000 (12:50 -0700)]
net_sched: convert idrinfo->lock from spinlock to a mutex
In commit
ec3ed293e766 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
we move fl_hw_destroy_tmplt() to a workqueue to avoid blocking
with the spinlock held. Unfortunately, this causes a lot of
troubles here:
1. tcf_chain_destroy() could be called right after we queue the work
but before the work runs. This is a use-after-free.
2. The chain refcnt is already 0, we can't even just hold it again.
We can check refcnt==1 but it is ugly.
3. The chain with refcnt 0 is still visible in its block, which means
it could be still found and used!
4. The block has a refcnt too, we can't hold it without introducing a
proper API either.
We can make it working but the end result is ugly. Instead of wasting
time on reviewing it, let's just convert the troubling spinlock to
a mutex, which allows us to use non-atomic allocations too.
Fixes: ec3ed293e766 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
Reported-by: Ido Schimmel <idosch@idosch.org>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 3 Oct 2018 22:33:12 +0000 (15:33 -0700)]
net/neigh: Extend dump filter to proxy neighbor dumps
Move the attribute parsing from neigh_dump_table to neigh_dump_info, and
pass the filter arguments down to neigh_dump_table in a new struct. Add
the filter option to proxy neigh dumps as well to make them consistent.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 5 Oct 2018 04:54:45 +0000 (21:54 -0700)]
Merge branch 'net-metrics-consolidate'
David Ahern says:
====================
net: Consolidate metrics handling for ipv4 and ipv6
As part of the IPv6 fib info refactoring, the intent was to make metrics
handling for ipv6 identical to ipv4. One oversight in ip6_dst_destroy
led to confusion and a couple of incomplete attempts at finding and
fixing the resulting memory leak which was ultimately resolved by
ce7ea4af0838 ("ipv6: fix memory leak on dst->_metrics").
Refactor metrics hanlding make the code really identical for v4 and v6,
and add a few test cases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 5 Oct 2018 03:07:55 +0000 (20:07 -0700)]
fib_tests: Add tests for metrics on routes
Add ipv4 and ipv6 test cases for metrics (mtu) when fib entries are
created. Can be used with kmemleak to see leaks with both fib entries
and dst_entry.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 5 Oct 2018 03:07:54 +0000 (20:07 -0700)]
net: Move free of dst_metrics to helper
Move the refcounting and potential free of dst metrics associated
for ipv4 and ipv6 to a common helper.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 5 Oct 2018 03:07:53 +0000 (20:07 -0700)]
net: common metrics init helper for dst_entry
ipv4 and ipv6 both use refcounted metrics if FIB entries have metrics set.
Move the common initialization code to a helper and use for both protocols.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 5 Oct 2018 03:07:52 +0000 (20:07 -0700)]
net: Move free of fib_metrics to helper
Move the refcounting and potential free of dst metrics associated
with a fib entry to a helper and use it in both ipv4 and ipv6.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 5 Oct 2018 03:07:51 +0000 (20:07 -0700)]
net: common metrics init helper for FIB entries
Consolidate initialization of ipv4 and ipv6 metrics when fib entries
are created into a single helper, ip_fib_metrics_init, that handles
the call to ip_metrics_convert.
If no metrics are defined for the fib entry, then the metrics is set
to dst_default_metrics.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 5 Oct 2018 00:07:51 +0000 (17:07 -0700)]
net: sched: remove unused helpers
tcf_block_dev() doesn't seem to be used anywhere in the tree.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sat, 29 Sep 2018 15:06:29 +0000 (23:06 +0800)]
geneve: allow to clear ttl inherit
As Michal remaind, we should allow to clear ttl inherit. Then we will
have three states:
1. set the flag, and do ttl inherit.
2. do not set the flag, use configured ttl value, or default ttl (0) if
not set.
3. disable ttl inherit, use previous configured ttl value, or default ttl (0).
Fixes: 52d0d404d39dd ("geneve: add ttl inherit support")
CC: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vinicius Costa Gomes [Sat, 29 Sep 2018 00:59:43 +0000 (17:59 -0700)]
tc: Add support for configuring the taprio scheduler
This traffic scheduler allows traffic classes states (transmission
allowed/not allowed, in the simplest case) to be scheduled, according
to a pre-generated time sequence. This is the basis of the IEEE
802.1Qbv specification.
Example configuration:
tc qdisc replace dev enp3s0 parent root handle 100 taprio \
num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 2@2 \
base-time
1528743495910289987 \
sched-entry S 01 300000 \
sched-entry S 02 300000 \
sched-entry S 04 300000 \
clockid CLOCK_TAI
The configuration format is similar to mqprio. The main difference is
the presence of a schedule, built by multiple "sched-entry"
definitions, each entry has the following format:
sched-entry <CMD> <GATE MASK> <INTERVAL>
The only supported <CMD> is "S", which means "SetGateStates",
following the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK>
is a bitmask where each bit is a associated with a traffic class, so
bit 0 (the least significant bit) being "on" means that traffic class
0 is "active" for that schedule entry. <INTERVAL> is a time duration
in nanoseconds that specifies for how long that state defined by <CMD>
and <GATE MASK> should be held before moving to the next entry.
This schedule is circular, that is, after the last entry is executed
it starts from the first one, indefinitely.
The other parameters can be defined as follows:
- base-time: specifies the instant when the schedule starts, if
'base-time' is a time in the past, the schedule will start at
base-time + (N * cycle-time)
where N is the smallest integer so the resulting time is greater
than "now", and "cycle-time" is the sum of all the intervals of the
entries in the schedule;
- clockid: specifies the reference clock to be used;
The parameters should be similar to what the IEEE 802.1Q family of
specification defines.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 4 Oct 2018 20:49:43 +0000 (13:49 -0700)]
Merge branch 'bnxt_en-devlink-param-updates'
Vasundhara Volam says:
====================
bnxt_en: devlink param updates
This patchset adds support for 3 generic and 1 driver-specific devlink
parameters. Add documentation for these configuration parameters.
Also, this patchset adds support to return proper error code if
HWRM_NVM_GET/SET_VARIABLE commands return error code
HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED.
v3->v4:
-Remove extra definition of NVM_OFF_HW_TC_OFFLOAD from bnxt_devlink.h
-Remove type information for generic parameters from
devlink-params-bnxt.txt
v2->v3:
-Remove description of generic parameters from devlink-params-bnxt.txt
v1->v2:
-Remove hw_tc_offload parameter.
-Update all patches with Cc of MAINTAINERS.
-Add more description in commit message for device specific parameter.
-Add a new Documentation/networking/devlink-params.txt with some
generic devlink parameters information.
-Add a new Documentation/networking/devlink-params-bnxt.txt with devlink
parameters information that are supported by bnxt_en driver.
====================
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:52 +0000 (11:13 +0530)]
devlink: Add Documentation/networking/devlink-params-bnxt.txt
This patch adds a new file to add information about configuration
parameters that are supported by bnxt_en driver via devlink.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:51 +0000 (11:13 +0530)]
devlink: Add Documentation/networking/devlink-params.txt
This patch adds a new file to add information about some of the
generic configuration parameters set via devlink.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:50 +0000 (11:13 +0530)]
bnxt_en: Add a driver specific gre_ver_check devlink parameter.
This patch adds following driver-specific permanent mode boolean
parameter.
gre_ver_check - Generic Routing Encapsulation(GRE) version check
will be enabled in the device. If disabled, device skips version
checking for GRE packets.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:49 +0000 (11:13 +0530)]
bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink params.
This patch adds support for following generic permanent mode
devlink parameters. They can be modified using devlink param
commands.
msix_vec_per_pf_max - This param sets the number of MSIX vectors
that the device requests from the host on driver initialization.
This value is set in the device which limits MSIX vectors per PF.
msix_vec_per_pf_min - This param sets the number of minimal MSIX
vectors required for the device initialization. Value 0 indicates
a default value is selected. This value is set in the device which
limits MSIX vectors per PF.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:48 +0000 (11:13 +0530)]
bnxt_en: return proper error when FW returns HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED
Return proper error code when Firmware returns
HWRM_ERR_CODE_RESOURCE_ACCESS_DENIED for HWRM_NVM_GET/SET_VARIABLE
commands.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:47 +0000 (11:13 +0530)]
bnxt_en: Use ignore_ari devlink parameter
This patch adds support for ignore_ari generic permanent mode
devlink parameter. This parameter is disabled by default. It can be
enabled using devlink param commands.
ignore_ari - If enabled, device ignores ARI(Alternate Routing ID)
capability, even when platforms has the support and creates same number
of partitions when platform does not support ARI capability.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:46 +0000 (11:13 +0530)]
devlink: Add generic parameter msix_vec_per_pf_min
msix_vec_per_pf_min - This param sets the number of minimal MSIX
vectors required for the device initialization. This value is set
in the device which limits MSIX vectors per PF.
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:45 +0000 (11:13 +0530)]
devlink: Add generic parameter msix_vec_per_pf_max
msix_vec_per_pf_max - This param sets the number of MSIX vectors
that the device requests from the host on driver initialization.
This value is set in the device which is applicable per PF.
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 4 Oct 2018 05:43:44 +0000 (11:13 +0530)]
devlink: Add generic parameter ignore_ari
ignore_ari - Device ignores ARI(Alternate Routing ID) capability,
even when platforms has the support and creates same number of
partitions when platform does not support ARI capability.
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Chancellor [Thu, 4 Oct 2018 16:39:20 +0000 (09:39 -0700)]
qed: Avoid implicit enum conversion in qed_ooo_submit_tx_buffers
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_ll2.c:799:32: warning: implicit
conversion from enumeration type 'enum core_tx_dest' to different
enumeration type 'enum qed_ll2_tx_dest' [-Wenum-conversion]
tx_pkt.tx_dest = p_ll2_conn->tx_dest;
~ ~~~~~~~~~~~~^~~~~~~
1 warning generated.
Fix this by using a switch statement to convert between the enumerated
values since they are not 1 to 1, which matches how the rest of the
driver handles this conversion.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Suggested-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 4 Oct 2018 16:48:37 +0000 (09:48 -0700)]
Merge tag 'mlx5-updates-2018-10-03' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2018-10-03
mlx5 core driver and ethernet netdev updates, please note there is a small
devlink releated update to allow extack argument to eswitch operations.
From Eli Britstein,
1) devlink: Add extack argument to the eswitch related operations
2) net/mlx5e: E-Switch, return extack messages for failures in the e-switch devlink callbacks
3) net/mlx5e: Add extack messages for TC offload failures
From Eran Ben Elisha,
4) mlx5e: Add counter for aRFS rule insertion failures
From Feras Daoud
5) Fast teardown support for mlx5 device
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the FW pages.
Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 4 Oct 2018 16:44:15 +0000 (09:44 -0700)]
Merge tag 'rxrpc-next-
20181004' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Development
Here are some development patches for AF_RXRPC. The most significant points
are:
(1) Change the tracepoint that indicates a packet has been transmitted
into one that indicates a packet is about to be transmitted. Without
this, the response tracepoint may occur first if the round trip is
fast enough.
(2) Sort out AFS address list handling to better enforce maximum capacity
to use helper functions to fill them and to do an insertion sort to
order them. This is here to make (3) easier.
(3) Keep AF_INET addresses as AF_INET addresses rather than converting
them to AF_INET6 in both AF_RXRPC and kAFS. I hadn't realised that a
UDP6 socket would just call down into UDP4 if given an AF_INET
address.
(4) Allow the timestamp on the first DATA packet of a reply to be
retrieved by a kernel service. This will give the kAFS a more
accurate base from which to calculate the callback promise expiration.
(5) Allow the rxrpc protocol epoch value to be retrieved from an incoming
call. This will allow kAFS to determine if the fileserver restarted
and if two addresses apparently assigned to the same fileserver
actually are different boxes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Thu, 4 Oct 2018 13:27:55 +0000 (14:27 +0100)]
dns: Allow the dns resolver to retrieve a server set
Allow the DNS resolver to retrieve a set of servers and their associated
addresses, ports, preference and weight ratings.
In terms of communication with userspace, "srv=1" is added to the callout
string (the '1' indicating the maximum data version supported by the
kernel) to ask the userspace side for this.
If the userspace side doesn't recognise it, it will ignore the option and
return the usual text address list.
If the userspace side does recognise it, it will return some binary data
that begins with a zero byte that would cause the string parsers to give an
error. The second byte contains the version of the data in the blob (this
may be between 1 and the version specified in the callout data). The
remainder of the payload is version-specific.
In version 1, the payload looks like (note that this is packed):
u8 Non-string marker (ie. 0)
u8 Content (0 => Server list)
u8 Version (ie. 1)
u8 Source (eg. DNS_RECORD_FROM_DNS_SRV)
u8 Status (eg. DNS_LOOKUP_GOOD)
u8 Number of servers
foreach-server {
u16 Name length (LE)
u16 Priority (as per SRV record) (LE)
u16 Weight (as per SRV record) (LE)
u16 Port (LE)
u8 Source (eg. DNS_RECORD_FROM_NSS)
u8 Status (eg. DNS_LOOKUP_GOT_NOT_FOUND)
u8 Protocol (eg. DNS_SERVER_PROTOCOL_UDP)
u8 Number of addresses
char[] Name (not NUL-terminated)
foreach-address {
u8 Family (AF_INET{,6})
union {
u8[4] ipv4_addr
u8[16] ipv6_addr
}
}
}
This can then be used to fetch a whole cell's VL-server configuration for
AFS, for example.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 4 Oct 2018 12:11:18 +0000 (13:11 +0100)]
liquidio: fix a couple of spelling mistakes
Trivial fix to spelling mistakes in dev_dbg warning messages
"Reloade" -> "Reload"
"chang" -> "change"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 4 Oct 2018 16:32:35 +0000 (09:32 -0700)]
Merge branch 'ieee802154-for-davem-2018-10-04' of git://git./linux/kernel/git/sschmidt/wpan-next
Stefan Schmidt says:
====================
pull-request: ieee802154-next 2018-10-04
An update from ieee802154 for *net-next*
A very quite cycle in the ieee802154 subsystem. We only have two cleanup
patches for this pull request.
Xue removed the platform_data struct handling from the mcr20a driver and
Alexander cleaned up some left overs in the hwsim driver.
Please pull, or let me know if there are any problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Thu, 4 Oct 2018 08:54:29 +0000 (09:54 +0100)]
rxrpc: Allow the reply time to be obtained on a client call
Allow the epoch value to be queried on a server connection. This is in the
rxrpc header of every packet for use in routing and is derived from the
client's state. It's also not supposed to change unless the client gets
restarted.
AFS can make use of this information to deduce whether a fileserver has
been restarted because the fileserver makes client calls to the filesystem
driver's cache manager to send notifications (ie. callback breaks) about
conflicting changes from other clients. These convey the fileserver's own
epoch value back to the filesystem.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:42:29 +0000 (09:42 +0100)]
rxrpc: Allow the reply time to be obtained on a client call
Allow the timestamp on the sk_buff holding the first DATA packet of a reply
to be queried. This can then be used as a base for the expiry time
calculation on the callback promise duration indicated by an operation
result.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:28 +0000 (09:32 +0100)]
rxrpc: Drop the local endpoint arg from rxrpc_extract_addr_from_skb()
rxrpc_extract_addr_from_skb() doesn't use the argument that points to the
local endpoint, so remove the argument.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:28 +0000 (09:32 +0100)]
rxrpc: Use IPv4 addresses throught the IPv6
AF_RXRPC opens an IPv6 socket through which to send and receive network
packets, both IPv6 and IPv4. It currently turns AF_INET addresses into
AF_INET-as-AF_INET6 addresses based on an assumption that this was
necessary; on further inspection of the code, however, it turns out that
the IPv6 code just farms packets aimed at AF_INET addresses out to the IPv4
code.
Fix AF_RXRPC to use AF_INET addresses directly when given them.
Fixes: 7b674e390e51 ("rxrpc: Fix IPv6 support")
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:28 +0000 (09:32 +0100)]
afs: Sort address lists so that they are in logical ascending order
Sort address lists so that they are in logical ascending order rather than
being partially in ascending order of the BE representations of those
values.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:27 +0000 (09:32 +0100)]
afs: Always build address lists using the helper functions
Make the address list string parser use the helper functions for adding
addresses to an address list so that they end up appropriately sorted.
This will better handles overruns and make them easier to compare.
It also reduces the number of places that addresses are handled, making it
easier to fix the handling.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:27 +0000 (09:32 +0100)]
afs: Do better max capacity handling on address lists
Note the maximum allocated capacity in an afs_addr_list struct and discard
addresses that would exceed it in afs_merge_fs_addr{4,6}().
Also, since the current maximum capacity is less than 255, reduce the
relevant members to bytes.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:27 +0000 (09:32 +0100)]
rxrpc: Emit the data Tx trace line before transmitting
Print the data Tx trace line before transmitting so that it appears before
the trace lines indicating success or failure of the transmission. This
makes the trace log less confusing.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 4 Oct 2018 08:32:27 +0000 (09:32 +0100)]
rxrpc: Use rxrpc_free_skb() rather than rxrpc_lose_skb()
rxrpc_lose_skb() is now exactly the same as rxrpc_free_skb(), so remove it
and use the latter instead.
Signed-off-by: David Howells <dhowells@redhat.com>
David S. Miller [Thu, 4 Oct 2018 04:00:17 +0000 (21:00 -0700)]
Merge git://git./linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Feras Daoud [Thu, 9 Aug 2018 06:55:21 +0000 (09:55 +0300)]
net/mlx5: Add Fast teardown support
Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown
This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.
Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been scatter
to memory
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eran Ben Elisha [Sun, 8 Jul 2018 11:43:19 +0000 (14:43 +0300)]
net/mlx5e: Add new counter for aRFS rule insertion failures
Count aRFS rules insertion failure for ethtool output. In addition, move
the error print into debug prints mechanism, as it could flood the dmesg
and reduce system BW dramatically.
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eli Britstein [Wed, 15 Aug 2018 13:10:05 +0000 (16:10 +0300)]
net/mlx5e: Add extack messages for TC offload failures
Return tc extack messages for failures to user space.
Messages provide reasons for not being able to offload rules to HW.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eli Britstein [Sun, 5 Aug 2018 11:32:33 +0000 (14:32 +0300)]
net/mlx5e: E-Switch, Add extack messages to devlink callbacks
Return extack messages for failures in the e-switch devlink callbacks.
Messages provide reasons for not being able to issue the operation.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eli Britstein [Wed, 15 Aug 2018 13:02:18 +0000 (16:02 +0300)]
devlink: Add extack for eswitch operations
Add extack argument to the eswitch related operations.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Greg Kroah-Hartman [Wed, 3 Oct 2018 23:09:11 +0000 (16:09 -0700)]
Merge gitolite./pub/scm/linux/kernel/git/davem/net
David writes:
"Networking fixes:
1) Prefix length validation in xfrm layer, from Steffen Klassert.
2) TX status reporting fix in mac80211, from Andrei Otcheretianski.
3) Fix hangs due to TX_DROP in mac80211, from Bob Copeland.
4) Fix DMA error regression in b43, from Larry Finger.
5) Add input validation to xenvif_set_hash_mapping(), from Jan Beulich.
6) SMMU unmapping fix in hns driver, from Yunsheng Lin.
7) Bluetooh crash in unpairing on SMP, from Matias Karhumaa.
8) WoL handling fixes in the phy layer, from Heiner Kallweit.
9) Fix deadlock in bonding, from Mahesh Bandewar.
10) Fill ttl inherit infor in vxlan driver, from Hangbin Liu.
11) Fix TX timeouts during netpoll, from Michael Chan.
12) RXRPC layer fixes from David Howells.
13) Another batch of ndo_poll_controller() removals to deal with
excessive resource consumption during load. From Eric Dumazet.
14) Fix a specific TIPC failure secnario, from LUU Duc Canh.
15) Really disable clocks in r8169 during suspend so that low
power states can actually be reached.
16) Fix SYN backlog lockdep issue in tcp and dccp, from Eric Dumazet.
17) Fix RCU locking in netpoll SKB send, which shows up in bonding,
from Dave Jones.
18) Fix TX stalls in r8169, from Heiner Kallweit.
19) Fix locksup in nfp due to control message storms, from Jakub
Kicinski.
20) Various rmnet bug fixes from Subash Abhinov Kasiviswanathan and
Sean Tranchetti.
21) Fix use after free in ip_cmsg_recv_dstaddr(), from Eric Dumazet."
* gitolite.kernel.org:/pub/scm/linux/kernel/git/davem/net: (122 commits)
ixgbe: check return value of napi_complete_done()
sctp: fix fall-through annotation
r8169: always autoneg on resume
ipv4: fix use-after-free in ip_cmsg_recv_dstaddr()
net: qualcomm: rmnet: Fix incorrect allocation flag in receive path
net: qualcomm: rmnet: Fix incorrect allocation flag in transmit
net: qualcomm: rmnet: Skip processing loopback packets
net: systemport: Fix wake-up interrupt race during resume
rtnl: limit IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES to 4096
bonding: fix warning message
inet: make sure to grab rcu_read_lock before using ireq->ireq_opt
nfp: avoid soft lockups under control message storm
declance: Fix continuation with the adapter identification message
net: fec: fix rare tx timeout
r8169: fix network stalls due to missing bit TXCFG_AUTO_FIFO
tun: napi flags belong to tfile
tun: initialize napi_mutex unconditionally
tun: remove unused parameters
bond: take rcu lock in netpoll_send_skb_on_dev
rtnetlink: Fail dump if target netnsid is invalid
...
David S. Miller [Wed, 3 Oct 2018 21:44:22 +0000 (14:44 -0700)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
10GbE Intel Wired LAN Driver Updates 2018-10-03
This series contains updates to ixgbe/ixgbevf and few fixes for i40e & iavf.
Shannon Nelson fixes the message length for IPsec mailbox messages.
Radoslaw fixes a transmit hang that occurs when XDP_TX exceeds the queue
limit. Fixes a crash when we restor flow director filters after a reset.
YueHaibing cleans up dead code, which did not have any callers.
Dan Carpenter fixes an "off by one" error in IPsec for ixgbe.
Nathan Chancellor fixes the i40e driver to use the correct enum for link
speed. Also remove a debug statement since it was not producing useful
information and equated to always "TRUE".
Most notably, Björn introduces zero-copy AF_XDP support for the ixgbe
driver. The ixgbe zero-copy code is located in its own file ixgbe_xsk.[ch],
analogous to the i40e ZC support. Again, as in i40e, code paths have
been copied from the XDP path to the zero-copy path. Going forward we
will try to generalize more code between the AF_XDP ZC drivers, and
also reduce the heavy C&P.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Song Liu [Wed, 3 Oct 2018 18:30:35 +0000 (11:30 -0700)]
ixgbe: check return value of napi_complete_done()
The NIC driver should only enable interrupts when napi_complete_done()
returns true. This patch adds the check for ixgbe.
Cc: stable@vger.kernel.org # 4.10+
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rami Rosen [Tue, 25 Sep 2018 15:42:18 +0000 (08:42 -0700)]
iavf: fix a typo
This trivial patch fixes a typo in iavf.h.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Björn Töpel [Tue, 2 Oct 2018 08:00:34 +0000 (10:00 +0200)]
ixgbe: add AF_XDP zero-copy Tx support
This patch adds zero-copy Tx support for AF_XDP sockets. It implements
the ndo_xsk_async_xmit netdev ndo and performs all the Tx logic from a
NAPI context. This means pulling egress packets from the Tx ring,
placing the frames on the NIC HW descriptor ring and completing sent
frames back to the application via the completion ring.
The regular XDP Tx ring is used for AF_XDP as well. This rationale for
this is as follows: XDP_REDIRECT guarantees mutual exclusion between
different NAPI contexts based on CPU id. In other words, a netdev can
XDP_REDIRECT to another netdev with a different NAPI context, since
the operation is bound to a specific core and each core has its own
hardware ring.
As the AF_XDP Tx action is running in the same NAPI context and using
the same ring, it will also be protected from XDP_REDIRECT actions
with the exact same mechanism.
As with AF_XDP Rx, all AF_XDP Tx specific functions are added to
ixgbe_xsk.c.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Björn Töpel [Tue, 2 Oct 2018 08:00:33 +0000 (10:00 +0200)]
ixgbe: move common Tx functions to ixgbe_txrx_common.h
This patch prepares for the upcoming zero-copy Tx functionality by
moving common functions used both by the regular path and zero-copy
path.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Björn Töpel [Tue, 2 Oct 2018 08:00:32 +0000 (10:00 +0200)]
ixgbe: add AF_XDP zero-copy Rx support
This patch adds zero-copy Rx support for AF_XDP sockets. Instead of
allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are
allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain
queue.
All AF_XDP specific functions are added to a new file, ixgbe_xsk.c.
Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS
will allocate a new buffer and copy the zero-copy frame prior passing
it to the kernel stack.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Björn Töpel [Tue, 2 Oct 2018 08:00:31 +0000 (10:00 +0200)]
ixgbe: move common Rx functions to ixgbe_txrx_common.h
This patch prepares for the upcoming zero-copy Rx functionality, by
moving/changing linkage of common functions, used both by the regular
path and zero-copy path.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Björn Töpel [Tue, 2 Oct 2018 08:00:30 +0000 (10:00 +0200)]
ixgbe: added Rx/Tx ring disable/enable functions
Add functions for Rx/Tx ring enable/disable. Instead of resetting the
whole device, only the affected ring is disabled or enabled.
This plumbing is used in later commits, when zero-copy AF_XDP support
is introduced.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: William Tu <u9012063@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Radoslaw Tyl [Mon, 24 Sep 2018 07:24:20 +0000 (09:24 +0200)]
ixgbe: Fix crash with VFs and flow director on interface flap
This patch fix crash when we have restore flow director filters after reset
adapter. In ixgbe_fdir_filter_restore() filter->action is outside of the
rx_ring array, as it has a VF identifier in the upper 32 bits.
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Nathan Chancellor [Fri, 21 Sep 2018 19:39:07 +0000 (12:39 -0700)]
i40e: Remove unnecessary print statement
Clang warns that the address of a pointer will always evaluated as true
in a boolean context.
drivers/net/ethernet/intel/i40e/i40e_debugfs.c:136:9: warning: address
of array 'vsi->active_vlans' will always evaluate to 'true'
[-Wpointer-bool-conversion]
vsi->active_vlans ? "<valid>" : "<null>");
~~~~~^~~~~~~~~~~~ ~
./include/linux/device.h:1431:33: note: expanded from macro 'dev_info'
_dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~
1 warning generated.
Given that the statement shows that active_vlans is always valid, just
remove the statement since it's not giving any useful information.
Link: https://github.com/ClangBuiltLinux/linux/issues/82
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Nathan Chancellor [Fri, 21 Sep 2018 10:13:59 +0000 (03:13 -0700)]
i40e: Use proper enum in i40e_ndo_set_vf_link_state
Clang warns when one enumerated type is converted implicitly to another.
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c:4214:42: warning:
implicit conversion from enumeration type 'enum i40e_aq_link_speed' to
different enumeration type 'enum virtchnl_link_speed'
[-Wenum-conversion]
pfe.event_data.link_event.link_speed = I40E_LINK_SPEED_40GB;
~ ^~~~~~~~~~~~~~~~~~~~
1 warning generated.
Use the proper enum from virtchnl_link_speed, which has the same value
as I40E_LINK_SPEED_40GB, VIRTCHNL_LINK_SPEED_40GB. This appears to be
missed by commit
ff3f4cc267f6 ("virtchnl: finish conversion to virtchnl
interface").
Link: https://github.com/ClangBuiltLinux/linux/issues/81
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dan Carpenter [Wed, 19 Sep 2018 10:35:29 +0000 (13:35 +0300)]
ixgbevf: off by one in ixgbevf_ipsec_tx()
The ipsec->tx_tbl[] array has IXGBE_IPSEC_MAX_SA_COUNT elements so the >
should be a >=.
Fixes: 0062e7cc955e ("ixgbevf: add VF IPsec offload code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
YueHaibing [Fri, 7 Sep 2018 03:38:44 +0000 (11:38 +0800)]
ixgbe: remove redundant function ixgbe_fw_recovery_mode()
There are no in-tree callers.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Radoslaw Tyl [Wed, 5 Sep 2018 07:00:51 +0000 (09:00 +0200)]
ixgbe: Fix ixgbe TX hangs with XDP_TX beyond queue limit
We have Tx hang when number Tx and XDP queues are more than 64.
In XDP always is MTQC == 0x0 (64TxQs). We need more space for Tx queues.
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Tue, 4 Sep 2018 19:33:29 +0000 (12:33 -0700)]
ixgbevf: fix msglen for ipsec mbx messages
Don't be fancy with message lengths, just set lengths to
number of dwords, not bytes.
Fixes: 0062e7cc955e ("ixgbevf: add VF IPsec offload code")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Greg Kroah-Hartman [Wed, 3 Oct 2018 18:06:49 +0000 (11:06 -0700)]
Merge tag 'linux-kselftest-4.19-rc7' of git://git./linux/kernel/git/shuah/linux-kselftest
Shuah writes:
"kselftest fixes for 4.19-rc7
This fixes update for 4.19-rc7 consists one fix to rseq test to
prevent it from seg-faulting when compiled with -fpie."
* tag 'linux-kselftest-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
rseq/selftests: fix parametrized test with -fpie
David S. Miller [Wed, 3 Oct 2018 16:39:42 +0000 (09:39 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2018-10-03
This series contains updates to ice and virtchnl.
Yashaswini Raghuram adds a new virtchnl capability flag to support the
exchange of additional supported speeds.
Anirudh adds support for SR-IOV for the ice driver. Added code to
initialize, configure and use mailbox queues for PF and VF
communication. Updated the VSI and queue management to handle both PF
and VF VSI type. Added "Adaptive Virtual Function (AVF)" support for
the ice PF driver by implementing virtchnl commands. Extended the
malicious driver detection logic to include the VF driver as well.
Fixed the queue region size which needs to be log base 2 of the number
of queues in region.
Brett fixes an issue which was causing switch rules to be lost, by
making a call to ice_update_pkt_fwd_rule() with the necessary changes.
Fixed how the PF and VF assigned the ITR index by adding a struct member
itr_idx to be used to dynamically program the correct ITR index.
Dave fixed a potential NULL pointer dereference by adding checks in the
filter handling.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Wed, 3 Oct 2018 12:56:32 +0000 (18:26 +0530)]
cxgb4: remove the unneeded locks
cxgb_set_tx_maxrate will be called holding rtnl lock,
hence remove all unneeded locks.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Wed, 3 Oct 2018 10:45:56 +0000 (12:45 +0200)]
sctp: fix fall-through annotation
Replace "fallthru" with a proper "fall through" annotation.
This fix is part of the ongoing efforts to enabling
-Wimplicit-fallthrough
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:43:08 +0000 (17:43 -0700)]
ice: Update version string
Update version string to 0.7.2-k
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dave Ertman [Thu, 20 Sep 2018 00:43:06 +0000 (17:43 -0700)]
ice: Use the right function to enable/disable VSI
The ice_ena/dis_vsi should have a single differentiating
factor to determine if the netdev_ops call is used or a
direct call to ice_vsi_open/close. This is if the netif is
running or not. If netif is running, use ndo_open/ndo_close.
Else, use ice_vsi_open/ice_vsi_close.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 20 Sep 2018 00:43:05 +0000 (17:43 -0700)]
ice: Add more flexibility on how we assign an ITR index
This issue came about when looking at the VF function
ice_vc_cfg_irq_map_msg. Currently we are assigning the itr_setting value
to the itr_idx received from the AVF driver, which is not correct and is
not used for the VF flow anyway. Currently the only way we set the ITR
index for both the PF and VF driver is by hard coding ICE_TX_ITR or
ICE_RX_ITR for the ITR index on each q_vector.
To fix this, add the member itr_idx in struct ice_ring_container. This
can then be used to dynamically program the correct ITR index. This change
also affected the PF driver so make the necessary changes there as well.
Also, removed the itr_setting member in struct ice_ring because it is not
being used meaningfully and is going to be removed in a future patch that
includes dynamic ITR.
On another note, this will be useful moving forward if we decide to split
Rx/Tx rings on different q_vectors instead of sharing them as queue pairs.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Dave Ertman [Thu, 20 Sep 2018 00:43:04 +0000 (17:43 -0700)]
ice: Fix potential null pointer issues
Add checks in the filter handling flow to avoid dereferencing
NULL pointers.
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 20 Sep 2018 00:43:03 +0000 (17:43 -0700)]
ice: Add code to go from ICE_FWD_TO_VSI_LIST to ICE_FWD_TO_VSI
When a switch rule is initially created we set the filter action to
ICE_FWD_TO_VSI. The filter action changes to ICE_FWD_TO_VSI_LIST
whenever more than one VSI is subscribed to the same switch rule. When
the switch rule goes from 2 VSIs in the list to 1 VSI we remove and
delete the VSI list rule, but we currently don't update the switch rule
in hardware. This is causing switch rules to be lost, so fix that by
making a call to ice_update_pkt_fwd_rule() with the necessary changes.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:43:02 +0000 (17:43 -0700)]
ice: Fix forward to queue group logic
When adding a rule, queue region size needs to be provided as log base 2
of the number of queues in region. Fix that.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:43:01 +0000 (17:43 -0700)]
ice: Extend malicious operations detection logic
This patch extends the existing malicious driver operation detection
logic to cover malicious operations by the VF driver as well.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:43:00 +0000 (17:43 -0700)]
ice: Notify VF of link status change
When PF gets a link status change event, notify the VFs of the same.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:42:59 +0000 (17:42 -0700)]
ice: Implement virtchnl commands for AVF support
virtchnl is a protocol/interface specification that allows the Intel
"Adaptive Virtual Function (AVF)" driver (iavf.ko) to work with more than
one physical function driver. The AVF driver sends "virtchnl commands"
(control plane only) to the PF driver over mailbox queues and the PF driver
executes these commands and returns a result to the VF, again over mailbox.
This patch adds AVF support for the ice PF driver by implementing the
following virtchnl commands:
VIRTCHNL_OP_VERSION
VIRTCHNL_OP_GET_VF_RESOURCES
VIRTCHNL_OP_RESET_VF
VIRTCHNL_OP_ADD_ETH_ADDR
VIRTCHNL_OP_DEL_ETH_ADDR
VIRTCHNL_OP_CONFIG_VSI_QUEUES
VIRTCHNL_OP_ENABLE_QUEUES
VIRTCHNL_OP_DISABLE_QUEUES
VIRTCHNL_OP_ADD_ETH_ADDR
VIRTCHNL_OP_DEL_ETH_ADDR
VIRTCHNL_OP_CONFIG_VSI_QUEUES
VIRTCHNL_OP_ENABLE_QUEUES
VIRTCHNL_OP_DISABLE_QUEUES
VIRTCHNL_OP_REQUEST_QUEUES
VIRTCHNL_OP_CONFIG_IRQ_MAP
VIRTCHNL_OP_CONFIG_RSS_KEY
VIRTCHNL_OP_CONFIG_RSS_LUT
VIRTCHNL_OP_GET_STATS
VIRTCHNL_OP_ADD_VLAN
VIRTCHNL_OP_DEL_VLAN
VIRTCHNL_OP_ENABLE_VLAN_STRIPPING
VIRTCHNL_OP_DISABLE_VLAN_STRIPPING
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:42:58 +0000 (17:42 -0700)]
ice: Add handlers for VF netdevice operations
This patch implements handlers for the following NDO operations:
.ndo_set_vf_spoofchk
.ndo_set_vf_mac
.ndo_get_vf_config
.ndo_set_vf_trust
.ndo_set_vf_vlan
.ndo_set_vf_link_state
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:42:57 +0000 (17:42 -0700)]
ice: Add support for VF reset events
Post VF initialization, there are a couple of different ways in which a
VF reset can be triggered. One is when the underlying PF itself goes
through a reset and other is via a VFLR interrupt. ice_reset_vf introduced
in this patch handles both these cases.
Also introduced in this patch is a helper function ice_aq_send_msg_to_vf
to send messages to VF over the mailbox queue. The PF uses this to send
reset notifications to VFs.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anirudh Venkataramanan [Thu, 20 Sep 2018 00:42:56 +0000 (17:42 -0700)]
ice: Update VSI and queue management code to handle VF VSI
Until now, all the VSI and queue management code supported only the PF
VSI type (ICE_VSI_PF). Update these flows to handle the VF VSI type
(ICE_VSI_VF) type as well.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>