Jesse Brandeburg [Fri, 28 Apr 2017 23:53:15 +0000 (16:53 -0700)]
i40evf: fix duplicate lines
This removes two duplicate lines that snuck into the code somehow.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Michael Chan [Wed, 31 May 2017 00:03:00 +0000 (20:03 -0400)]
bnxt_en: Fix xmit_more with BQL.
We need to write the doorbell if BQL has stopped the queue and
skb->xmit_more is set. Otherwise it is possible for the tx queue to
rot and cause tx timeout.
Fixes: 4d172f21cefe ("bnxt_en: Implement xmit_more.")
Suggested-by: Yuval Mintz <yuval.mintz@cavium.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 May 2017 22:14:13 +0000 (18:14 -0400)]
Merge branch 'bnxt_en-Misc-updates-for-net-next'
Michael Chan says:
====================
bnxt_en: Misc. updates for net-next.
The 1st 2 patches add short firmware message support for new VF devices.
The 3rd patch adds a pci shutdown callback for the RDMA driver for proper
shutdown. The next 3 patches improve the doorbell operations by
elimiating the double doorbell workaround on newer chips, and by adding
xmit_more support. The last patch adds a parameter to bnxt_set_dflt_rings().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 May 2017 23:06:10 +0000 (19:06 -0400)]
bnxt_en: Pass in sh parameter to bnxt_set_dflt_rings().
In the existing code, the local variable sh is hardcoded to true to
calculate default rings for shared ring configuration. It is better
to have the caller determine the value of sh.
Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 May 2017 23:06:09 +0000 (19:06 -0400)]
bnxt_en: Implement xmit_more.
Do not write the TX doorbell if skb->xmit_more is set unless the TX
queue is full.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 May 2017 23:06:08 +0000 (19:06 -0400)]
bnxt_en: Optimize doorbell write operations for newer chips.
Older chips require the doorbells to be written twice, but newer chips
do not. Add a new common function bnxt_db_write() to write all
doorbells appropriately depending on the chip. Eliminating the extra
doorbell on newer chips has a significant performance improvement
on pktgen.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 May 2017 23:06:07 +0000 (19:06 -0400)]
bnxt_en: Add additional chip ID definitions.
Add additional chip definitions and macros for all supported chips.
Add a new macro BNXT_CHIP_P4_PLUS for the newer generation of chips and
use the macro to properly determine the features supported by these
newer chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 May 2017 23:06:06 +0000 (19:06 -0400)]
bnxt_en: Add a callback to inform RDMA driver during PCI shutdown.
When bnxt_en gets a PCI shutdown call, we need to have a new callback
to inform the RDMA driver to do proper shutdown and removal.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Deepak Khungar [Mon, 29 May 2017 23:06:05 +0000 (19:06 -0400)]
bnxt_en: Add PCI IDs for BCM57454 VF devices.
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Deepak Khungar [Mon, 29 May 2017 23:06:04 +0000 (19:06 -0400)]
bnxt_en: Support for Short Firmware Message
The new short message format is used on the new BCM57454 VFs. Each
firmware message is a fixed 16-byte message sent using the standard
firmware communication channel. The short message has a DMA address
pointing to the legacy long firmware message.
Signed-off-by: Deepak Khungar <deepak.khungar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 26 May 2017 22:07:37 +0000 (18:07 -0400)]
net: dsa: b53: remove unused dev argument
The port net device passed to b53_fdb_copy is not used. Remove it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 26 May 2017 22:12:42 +0000 (18:12 -0400)]
net: dsa: remove dsa_port_is_bridged
The helper is only used once and makes the code more complicated that it
should. Remove it and reorganize the variables so that it fits on 80
columns.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Tue, 30 May 2017 08:00:24 +0000 (13:30 +0530)]
cxgb4: Fix netdev_features flag
GRO is not supported by Chelsio HW when rx_csum is disabled.
Update the netdev features flag when rx_csum is modified.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Tue, 30 May 2017 12:36:06 +0000 (18:06 +0530)]
cxgb4: FW upgrade fixes
Disable FW_OK flag while flashing Firmware. This will help to fix any
potential mailbox timeouts during Firmware flash.
Grab new devlog parameters after Firmware restart. When we FLASH new
Firmware onto an adapter, the new Firmware may have the Firmware Device Log
located at a different memory address or have a different size for it.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Tue, 30 May 2017 06:20:40 +0000 (11:50 +0530)]
cxgb4: add new T5 pci device id
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Surendra Mobiya [Tue, 30 May 2017 06:02:06 +0000 (11:32 +0530)]
cxgb4: keep carrier off before registering netdev
Mark carrier off before registering netdev to ensure that vlan device
picks up the correct state of the carrier
Signed-off-by: Surendra Mobiya <surendra@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 May 2017 17:57:33 +0000 (13:57 -0400)]
Merge branch 'net-qualcomm-add-QCA7000-UART-driver'
Stefan Wahren says:
====================
net: qualcomm: add QCA7000 UART driver
The Qualcomm QCA7000 HomePlug GreenPHY supports two interfaces:
UART and SPI. This patch series adds the missing support for UART.
This driver based on the Qualcomm code [1], but contains some changes:
* use random MAC address per default
* use net_device_stats from device
* share frame decoding between SPI and UART driver
* improve error handling
* reimplement tty_wakeup with work queue (based on slcan)
* use new serial device bus instead of ldisc
The patches 1 - 3 are just for clean up and are not related to
the UART support. Patch 4 adds SET_NETDEV_DEV() to qca_spi.
Patches 5 - 16 prepare the existing QCA7000 code for UART support.
The last patch contains the new driver.
The code itself has been tested on a Freescale i.MX28 board and
a Raspberry Pi Zero.
Changes in v8:
* add necessary header includes to qca_7k.c in order to reflect
dependencies
Changes in v7:
* fix race between tx workqueue and device deregistration (reported by Lino)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:25 +0000 (13:57 +0200)]
net: qualcomm: add QCA7000 UART driver
This patch adds the Ethernet over UART driver for the
Qualcomm QCA7000 HomePlug GreenPHY.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:24 +0000 (13:57 +0200)]
dt-bindings: qca7000: append UART interface to binding
This merges the serdev binding for the QCA7000 UART driver (Ethernet over
UART) into the existing document.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:23 +0000 (13:57 +0200)]
dt-bindings: slave-device: add current-speed property
This adds a new DT property to define the current baud rate of the
slave device.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:22 +0000 (13:57 +0200)]
dt-bindings: qca7000: rename binding
Before we can merge the QCA7000 UART binding the document needs to be
renamed.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:21 +0000 (13:57 +0200)]
dt-bindings: qca7000-spi: Rework binding
In preparation for the QCA7000 UART binding rework the binding document.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:20 +0000 (13:57 +0200)]
net: qualcomm: make qca_7k_common a separate kernel module
In order to share common functions between QCA7000 SPI and UART protocol
driver the qca_7k_common needs to be a separate kernel module.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:19 +0000 (13:57 +0200)]
net: qualcomm: prepare frame decoding for UART driver
Unfortunately the frame format is not exactly identical between SPI
and UART. In case of SPI there is an additional HW length at the
beginning. So store the initial state to make the decoding state machine
more flexible and easy to extend for UART support.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:18 +0000 (13:57 +0200)]
net: qualcomm: rename qca_framing.c to qca_7k_common.c
As preparation for the upcoming UART driver we need a module
which contains common functions for both interfaces. The module
qca_framing is a good candidate but renaming to qca_7k_common would
make it clear.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:17 +0000 (13:57 +0200)]
net: qca_spi: Clarify MODULE_DESCRIPTION
Since this driver is specific to the QCA7000, we should make the module
description more precisely.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:16 +0000 (13:57 +0200)]
net: qualcomm: move qcaspi_tx_cmd to qca_spi.c
The function qcaspi_tx_cmd() is only called from qca_spi.c. So we better
move it there.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:15 +0000 (13:57 +0200)]
net: qca_spi: remove QCASPI_MTU
There is no need for an additional MTU define.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:14 +0000 (13:57 +0200)]
net: qualcomm: Improve readability of length defines
In order to avoid mixing things up, make the MTU and frame length
defines easier to read.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:13 +0000 (13:57 +0200)]
net: qualcomm: use net_device_ops instead of direct call
There is no need to export qcaspi_netdev_open and qcaspi_netdev_close
because they are also accessible via the net_device_ops.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:12 +0000 (13:57 +0200)]
net: qca_spi: Use SET_NETDEV_DEV()
Use SET_NETDEV_DEV() in qca_spi to create the "/sys/class/net/<if>/device"
symlink.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:11 +0000 (13:57 +0200)]
net: qca_7k: Use BIT macro
Use the BIT macro for the CONFIG and INT register values.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:10 +0000 (13:57 +0200)]
net: qca_framing: use u16 for frame offset
It doesn't make sense to use a signed variable for offset here, so
fix it up.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Wahren [Mon, 29 May 2017 11:57:09 +0000 (13:57 +0200)]
net: qualcomm: qca_7k: clean up header includes
Currently the includes doesn't reflect the dependencies. So
fix this up by removing all unnecessary entries and add the
necessary ones explicit.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 May 2017 16:42:28 +0000 (12:42 -0400)]
Merge branch 'net-phy-Support-managed-Cortina-phys'
Bogdan Purcareata says:
====================
net: phy: Support managed Cortina phys
So far, the Cortina family phys (CS4340 in this particular case) are only
supported in fixed link mode (via fixed_phy_register). The generic 10G
phy driver does not work well with the phylib state machine, when the phy
is registered via of_phy_connect. This prohibits the user from describing the
phy nodes in the device tree.
In order to support this scenario, and to properly describe the board
device tree, add a minimal Cortina driver that reads the status from the
right register. With the generic 10G C45 driver, the kernel will print
messages like:
[ 0.226521] mdio_bus
8b96000: Error while reading PHY16 reg at 1.6
[ 0.232780] mdio_bus
8b96000: Error while reading PHY16 reg at 1.5
v3 -> v4:
- Add trademark info.
- Minor documentation entry consistency nit.
v2 -> v3:
- Add documentation entry.
v1 -> v2:
- Change approach for getting the phy_id from hacking get_phy_c45_ids to
describing the device in the device tree via ethernet-phy-id.
More patch version changes per individual patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Bogdan Purcareata [Mon, 29 May 2017 09:11:31 +0000 (09:11 +0000)]
dt-bindings: net: Add Cortina device tree bindings
Add device tree description info for Cortina 10G phy devices.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bogdan Purcareata [Mon, 29 May 2017 09:11:30 +0000 (09:11 +0000)]
net: phy: Add Cortina CS4340 driver
Add basic support for Cortina PHY drivers. Support only CS4340 for now.
The phys are not compatible with IEEE 802.3 clause 22/45 registers.
Implement proper read_status support. The generic 10G phy driver causes
bus register access errors.
The driver should be described using the "ethernet-phy-id" device tree
compatible.
Signed-off-by: Bogdan Purcareata <bogdan.purcareata@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 May 2017 16:07:04 +0000 (12:07 -0400)]
Merge branch 'qed-DCBx-and-Attentions-series'
Yuval Mintz says:
====================
qed: DCBx and Attentions series
The series contains 2 major components [& some odd bits]:
- The first 3 patches are DCBx-related, containg missing bits in the
implementation, correcting existing API and removing code no longer
necessary.
- Most of the remaining patches are interrupt/hw-attention related,
adding some differeneces relating to QL41xxx and QL45xxx differences.
While at it, they also remove a large chunk of unnecessary structure
definitions.
The series also contain a patch [#10] that was accidently missing
from a previous series.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:14 +0000 (09:53 +0300)]
qed: Cache alignemnt padding to match host
Improve PCI performance by adjusting padding sizes to match those of the
host machine's cacheline.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:13 +0000 (09:53 +0300)]
qed: Mask parities after occurance
Parities might exhibit a flood behavior since we re-enable the
attention line without preventing the parity from re-triggering the
assertion.
Mask the source in AEU until the parity would be handled.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:12 +0000 (09:53 +0300)]
qed: Print multi-bit attentions properly
In strucuture reflecting the AEU hw block some entries
represent multiple HW bits, and the associated name is in fact
a pattern.
Today, whenever such an attention would be asserted the resulted
prints would show the pattern string instead of indicating which
of the possible bits was set.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:11 +0000 (09:53 +0300)]
qed: Diffrentiate adapter-specific attentions
There are 4 attention bits in AEU that have different meaning
for QL45xxx and QL41xxx adapters.
Instead of doing a massive infrastructure change in favor of these
bits, we implement a point fix where only those four would change
meaning dependent on the adapter involved.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:10 +0000 (09:53 +0300)]
qed: Get rid of the attention-arrays
We have almost all the necessary information regarding attentions
in the logic employed for taking register dumps.
Add some more and get rid of the seperate implementation we have today
for identifying & printing various attention sources.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:09 +0000 (09:53 +0300)]
qed: Support dynamic s-tag change
In case management firmware indicates a change in the used S-tag,
propagate the configuration to HW and FW.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Mon, 29 May 2017 06:53:08 +0000 (09:53 +0300)]
qed: QL41xxx VF MSI-x table
The QL41xxx adapters' PCI allows a single configuration for the
MSI-x table size of all child VFs of a given PF.
The existing code wouldn't cause the management firmware to set
that value, meaning the VFs would retain the default MSI-x table
size.
Introduce a new scheme so that whenever a VF is enabled, driver
would set the number of MSI-x to be the maximum over the various
VFs' needs.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Mon, 29 May 2017 06:53:07 +0000 (09:53 +0300)]
qed: Don't inherit RoCE DCBx for V2
Older firmware used by device didn't distinguish between RoCE and RoCE
V2 from DCBx configuration perspective, and as a result we've used to
take a the RoCE-related configuration and apply to it for both.
Since we now support configuring each its own values, there's no reason
to reflect [& configure] that both are using the same.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Mon, 29 May 2017 06:53:06 +0000 (09:53 +0300)]
qed: Correct DCBx update scheme
Instead of using a boolean value that propagates to FW configuration,
use the proper firmware HSI values.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Mon, 29 May 2017 06:53:05 +0000 (09:53 +0300)]
qed: Add missing static/local dcbx info
Some getters are not getting filled with the correct information
regarding local DCBx.
Fixes: 49632b5822ea ("qed: Add support for static dcbx.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 May 2017 15:55:34 +0000 (11:55 -0400)]
Merge branch 'net-more-extack'
David Ahern says:
====================
net: another round of extack handling for routing
This set focuses on passing extack through lwtunnel and MPLS with
additional catches for IPv4 route add and minor cleanups in MPLS
encountered passing the extack arg around.
v2
- mindful of bloat adding duplicate messages
+ refactored prefix and prefix length checks in ipv4's fib_table_insert
and fib_table_del
+ refactored label check in mpls
- split mpls cleanups into 2 patches
+ move nla_get_via up in af_mpls to avoid forward declaration
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:33 +0000 (16:19 -0600)]
net: mpls: remove unnecessary initialization of err
err is initialized to EINVAL and not used before it is set again.
Remove the unnecessary initialization.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:32 +0000 (16:19 -0600)]
net: mpls: Make nla_get_via in af_mpls.c
nla_get_via is only used in af_mpls.c. Remove declaration from internal.h
and move up in af_mpls.c before first use. Code move only; no
functional change intended.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:31 +0000 (16:19 -0600)]
net: mpls: Add extack messages for route add and delete failures
Add error messages for failures in adding and deleting mpls routes.
This covers most of the annoying EINVAL errors.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:30 +0000 (16:19 -0600)]
net: mpls: Pull common label check into helper
mpls_route_add and mpls_route_del have the same checks on the label.
Move to a helper. Avoid duplicate extack messages in the next patch.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:29 +0000 (16:19 -0600)]
net: Fill in extack for mpls lwt encap
Fill in extack for errors in build_state for mpls lwt encap including
passing extack to nla_get_labels and adding error messages for failures
in it.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:28 +0000 (16:19 -0600)]
net: add extack arg to lwtunnel build state
Pass extack arg down to lwtunnel_build_state and the build_state callbacks.
Add messages for failures in lwtunnel_build_state, and add the extarg to
nla_parse where possible in the build_state callbacks.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:27 +0000 (16:19 -0600)]
net: lwtunnel: Add extack to encap attr validation
Pass extack down to lwtunnel_valid_encap_type and
lwtunnel_valid_encap_type_attr. Add messages for unknown
or unsupported encap types.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:26 +0000 (16:19 -0600)]
net: ipv4: Add extack message for invalid prefix or length
Add extack error message for invalid prefix length and invalid prefix.
Example of the latter is a route spec containing 172.16.100.1/24, where
the /24 mask means the lower 8-bits should be 0. Amazing how easy that
one is to overlook when an EINVAL is returned.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 27 May 2017 22:19:25 +0000 (16:19 -0600)]
net: ipv4: refactor key and length checks
fib_table_insert and fib_table_delete have the same checks on the prefix
and length. Refactor into a helper. Avoids duplicate extack messages in
the next patch.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 May 2017 15:27:10 +0000 (11:27 -0400)]
Merge branch 'nfp-pci-core-hwmon-live-mac-addr-change'
Jakub Kicinski says:
====================
nfp: pci core, hwmon, live mac addr change
This series brings updates to core PCI code, SR-IOV, exposes
firmware's capability to change MAC address at runtime and HWMON
interfaces.
The PCI code updates include resiliency improvement in conditions
which are quite unusual, but still shouldn't make the driver oops.
We also handle very large device memory operation more gracefully.
A timeout is added to acquiring mutexes in device memory.
Pablo provides a patch to expose to the stack the ability to change
MAC addresses under traffic while David adds HWMON interface for
reading device temperature and power consumption.
Last three patches are minor improvements to the netdev code.
v2:
- add patch 1 - fix for devlink build;
- fix build issue with the hwmon patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:53:04 +0000 (17:53 -0700)]
nfp: don't keep count for free buffers delayed kick
We only kick RX free buffer queue controller every NFP_NET_FL_BATCH
(currently 16) entries. This means that we will always kick the QC
when write ring index is divisable by NFP_NET_FL_BATCH. There is
no need to keep counts.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:53:03 +0000 (17:53 -0700)]
nfp: don't add ring size to index calculations
Adding ring size to index calculation is pointless, since index
will be masked with ring size - 1.
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:53:02 +0000 (17:53 -0700)]
nfp: fix print format for ring pointers in ring dumps
Ring pointers are unsigned. Fix the print formats to avoid
showing users negative values.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:53:01 +0000 (17:53 -0700)]
nfp: don't wait for resources indefinitely
There is currently no timeout to the resource and lock acquiring
loops. We printed warnings and depended on user sending a signal
to the waiting process to stop the waiting. This doesn't work
very well when wait happens out of a work queue. The simplest
example of that is PCI probe. When user loads the module and card
is in a broken state modprobe will wait forever and signals sent
to it will not actually reach the probing thread.
Make sure all wait loops have a time out. Set the upper wait time
to 60 seconds to stay on the safe side.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Brunecz [Mon, 29 May 2017 00:53:00 +0000 (17:53 -0700)]
nfp: add hwmon support
Add support for retrieving temperature and power sensor and limits via NSP.
Signed-off-by: David Brunecz <david.brunecz@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:59 +0000 (17:52 -0700)]
nfp: support variable NSP response lengths
We want to support extendable commands, where newer versions
of the management FW may provide more information. Zero out
the communication buffer before passing control to NSP. This
way if management FW is old and only fills in first N bytes,
the remaining ones will be zeros which extended ABI fields
should reserve as not supported/not available.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:58 +0000 (17:52 -0700)]
nfp: shorten CPP core probe logs
We currently print reserved BAR mappings info as we create them.
This makes the probe logs longer than necessary. Print into a
buffer instead and log all the info as a single line.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:57 +0000 (17:52 -0700)]
nfp: support long reads and writes with the cpp helpers
nfp_cpp_{read,write}() helpers perform device memory mapping (setting
the PCIe -> NOC translation BARs) and accessing it. They, however,
currently implicitly expect that the length of entire operation will
fit in one BAR translation window. There is a number of 16MB windows
available, and we don't really need to access such large areas today.
If the user, however, manages to trick the driver into making a big
mapping (e.g. by providing a huge fake FW file), the driver will
print a warning saying "No suitable BAR found for request" and a
stack trace - which most users find concerning.
To be future-proof and not scare users with warnings, make the
nfp_cpp_{read,write}() helpers do accesses chunk by chunk if the area
size is large. Set the notion of "large" to 2MB, which is the size
of the smallest BAR window.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:56 +0000 (17:52 -0700)]
nfp: only try to get to PCIe ctrl memory if BARs are wide enough
For accessing PCIe ctrl memory we depend on the BAR aperture being
large enough to reach all registers. Since the BAR aperture can
be set in the flash make sure the driver won't oops the kernel
when the PCIe configuration is unusual.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:55 +0000 (17:52 -0700)]
nfp: don't set aux pointers if ioremap failed
If ioremap of PCIe ctrl memory failed we can still get to it through
PCI config space, therefore we allow ioremap() to fail. When if fails,
however, we must leave all the IOMEM pointers as NULL. Currently we
would calculate csr and em pointers, adding offsets to the potential
NULL value and therefore making the NULL-checks throughout the code
ineffective.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:54 +0000 (17:52 -0700)]
nfp: set driver VF limit
PCI subsystem has support for drivers limiting the number of VFs
available below what the IOV capability claims. Make use of it.
While at it remove the #ifdef/#endif on CONFIG_PCI_IOV, it was
there to avoid unnecessary warnings in case device read failed
but kernel doesn't have SR-IOV support anyway. Device reads
should not fail.
Note that we still need the driver-internal check for the case
where max VFs is 0 since PCI subsystem treats 0 as limit not set.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Cascón [Mon, 29 May 2017 00:52:53 +0000 (17:52 -0700)]
nfp: add set_mac_address support while the interface is up
Expose FW app ability to change MAC address at runtime. Make sure
we only depend on it if FW app advertised the right capability.
Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 29 May 2017 00:52:52 +0000 (17:52 -0700)]
nfp: add MAY_USE_DEVLINK dependency
Fix build with DEVLINK=m and NFP=y.
Fixes: 1851f93fd2ee ("nfp: add devlink support")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sat, 27 May 2017 17:42:25 +0000 (10:42 -0700)]
net: phy: Relax error checking on sysfs_create_link()
Some Ethernet drivers will attach/connect to a PHY device before calling
register_netdevice() which is responsible for calling netdev_register_kobject()
which would do the network device's kobject initialization. In such a case,
sysfs_create_link() would return -ENOENT because the network device's kobject
is not ready yet, and we would fail to connect to the PHY device.
In order to keep things simple and symetrical, we just take the success path as
indicative of the ability to access the network device's kobject, and create
the second link if that's the case.
Fixes: 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev")
Reported-by: Woojung Hung <Woojung.Huh@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 26 May 2017 22:02:42 +0000 (18:02 -0400)]
net: dsa: mv88e6xxx: handle SERDES error appropriately
mv88e6xxx_serdes_power returns an error, so no need to print an error
message inside of it. Rather print it in its caller when the error is
ignored, which is in the mv88e6xxx_port_disable void function.
Catch and return its error in the counterpart mv88e6xxx_port_enable.
Fixes: 04aca9938255 ("dsa: mv88e6xxx: Enable/Disable SERDES on port enable/disable")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 27 May 2017 22:51:42 +0000 (18:51 -0400)]
Merge branch 'rtnetlink-Updates-to-rtnetlink_event'
Vladislav Yasevich says:
====================
rtnetlink: Updates to rtnetlink_event()
First is the patch to add IFLA_EVENT attribute to the netlink message. It
supports only currently white-listed events.
Like before, this is just an attribute that gets added to the rtnetlink
message only when the messaged was generated as a result of a netdev event.
In my case, this is necessary since I want to trap NETDEV_NOTIFY_PEERS
event (also possibly NETDEV_RESEND_IGMP event) and perform certain actions
in user space. This is not possible since the messages generated as
a result of netdev events do not usually contain any changed data. They
are just notifications. This patch exposes this notification type to
userspace.
Second, I remove duplicate messages that a result of a change to bonding
options. If netlink is used to configure bonding options, 2 messages
are generated, one as a result NETDEV_CHANGEINFODATA event triggered by
bonding code and one a result of device state changes triggered by
netdev_state_change (called from do_setlink).
V6: Updated names and refactored to make it less tied to netdev events.
(From David Ahern)
V5: Rebased. Added iproute2 patch to the series.
V4:
* Removed the patch the removed NETDEV_CHANGENAME from event whitelist.
It doesn't trigger duplicate messages since name changes can only be
done while device is down and netdev_state_change() doesn't report
changes while device is down.
* Added a patch to clean-up duplicate messages on bonding option changes.
V3: Rebased. Cleaned-up duplicate event.
V2: Added missed events (from David Ahern)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 27 May 2017 14:14:35 +0000 (10:14 -0400)]
bonding: Prevent duplicate userspace notification
Whenever a user changes bonding options, a NETDEV_CHANGEINFODATA
notificatin is generated which results in a rtnelink message to
be sent. While runnig 'ip monitor', we can actually see 2 messages,
one a result of the event, and the other a result of state change
that is generated bo netdev_state_change(). However, this is not
always the case. If bonding changes were done via sysfs or ifenslave
(old ioctl interface), then only 1 message is seen.
This patch removes duplicate messages in the case of using netlink
to configure bonding. It introduceds a separte function that
triggers a netdev event and uses that function in the syfs and ioctl
cases.
This was discovered while auditing all the different envents and
continues the effort of cleaning up duplicated netlink messages.
CC: David Ahern <dsa@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 27 May 2017 14:14:34 +0000 (10:14 -0400)]
rtnl: Add support for netdev event to link messages
When netdev events happen, a rtnetlink_event() handler will send
messages for every event in it's white list. These messages contain
current information about a particular device, but they do not include
the iformation about which event just happened. So, it is impossible
to tell what just happend for these events.
This patch adds a new extension to RTM_NEWLINK message called IFLA_EVENT
that would have an encoding of event that triggered this
message. This would allow the the message consumer to easily determine
if it needs to perform certain actions.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 27 May 2017 00:46:35 +0000 (20:46 -0400)]
Merge git://git./linux/kernel/git/davem/net
Overlapping changes in drivers/net/phy/marvell.c, bug fix in 'net'
restricting a HW workaround alongside cleanups in 'net-next'.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 26 May 2017 21:02:30 +0000 (14:02 -0700)]
Merge tag 'led_fixes_for_4-12-rc3' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED fix from Jacek Anaszewski:
"A single LED fix for 4.12-rc3.
leds-pca955x driver uses only i2c_smbus API and thus it should pass
I2C_FUNC_SMBUS_BYTE_DATA flag to i2c_check_functionality"
* tag 'led_fixes_for_4-12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: pca955x: Correct I2C Functionality
Linus Torvalds [Fri, 26 May 2017 20:51:01 +0000 (13:51 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix state pruning in bpf verifier wrt. alignment, from Daniel
Borkmann.
2) Handle non-linear SKBs properly in SCTP ICMP parsing, from Davide
Caratti.
3) Fix bit field definitions for rss_hash_type of descriptors in mlx5
driver, from Jesper Brouer.
4) Defer slave->link updates until bonding is ready to do a full commit
to the new settings, from Nithin Sujir.
5) Properly reference count ipv4 FIB metrics to avoid use after free
situations, from Eric Dumazet and several others including Cong Wang
and Julian Anastasov.
6) Fix races in llc_ui_bind(), from Lin Zhang.
7) Fix regression of ESP UDP encapsulation for TCP packets, from
Steffen Klassert.
8) Fix mdio-octeon driver Kconfig deps, from Randy Dunlap.
9) Fix regression in setting DSCP on ipv6/GRE encapsulation, from Peter
Dawson.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add reference counting to metrics
net: ethernet: ax88796: don't call free_irq without request_irq first
ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets
sctp: fix ICMP processing if skb is non-linear
net: llc: add lock_sock in llc_ui_bind to avoid a race condition
bonding: Don't update slave->link until ready to commit
test_bpf: Add a couple of tests for BPF_JSGE.
bpf: add various verifier test cases
bpf: fix wrong exposure of map_flags into fdinfo for lpm
bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data
bpf: properly reset caller saved regs after helper call and ld_abs/ind
bpf: fix incorrect pruning decision when alignment must be tracked
arp: fixed -Wuninitialized compiler warning
tcp: avoid fastopen API to be used on AF_UNSPEC
net: move somaxconn init from sysctl code
net: fix potential null pointer dereference
geneve: fix fill_info when using collect_metadata
virtio-net: enable TSO/checksum offloads for Q-in-Q vlans
be2net: Fix offload features for Q-in-Q packets
vlan: Fix tcp checksum offloads in Q-in-Q vlans
...
David S. Miller [Fri, 26 May 2017 19:32:47 +0000 (15:32 -0400)]
Merge branch 'ibmvnic-Driver-updates'
Nathan Fontenot says:
====================
ibmvnic: Driver updates
This set of patches implements several updates to the ibmvnic driver
to fix issues that have been found in testing. Most of the updates
invovle updating queue handling during driver close and reset
operations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Fontenot [Fri, 26 May 2017 14:31:12 +0000 (10:31 -0400)]
ibmvnic: Reset sub-crqs during driver reset
When the ibmvnic driver is resetting, we can just reset the sub crqs
instead of releasing all of their resources and re-allocting them.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Fontenot [Fri, 26 May 2017 14:31:06 +0000 (10:31 -0400)]
ibmvnic: Reset tx/rx pools on driver reset
When resetting the ibmvnic driver there is not a need to release
and re-allocate the resources for the tx and rx pools. These
resources can just be reset to avoid the re-allocations.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Fontenot [Fri, 26 May 2017 14:31:00 +0000 (10:31 -0400)]
ibmvnic: Reset the CRQ queue during driver reset
When a driver reset operation occurs there is not a need to release
the CRQ resources and re-allocate them. Instead a reset of the CRQ
will suffice.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nathan Fontenot [Fri, 26 May 2017 14:30:54 +0000 (10:30 -0400)]
ibmvnic: Check adapter state during ibmvnic_poll
We do not want to process any receive frames if the ibmvnic_poll
routine is invoked while a reset is in process. Also, before
replenishing the rx pools in the ibmvnic_poll, we want to
make sure the adapter is not in the process of closing.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Fri, 26 May 2017 14:30:48 +0000 (10:30 -0400)]
ibmvnic: Deactivate RX pool buffer replenishment on H_CLOSED
If H_CLOSED is returned, halt RX buffer replenishment activity
until firmware sends a notification that the driver can reset.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Fri, 26 May 2017 14:30:42 +0000 (10:30 -0400)]
ibmvnic: Halt TX and report carrier off on H_CLOSED return code
This patch disables transmissions and reports carrier off if xmit
function returns that the hardware TX queue is closed. The driver can
then await a signal from firmware to determine the correct reset method.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Allen [Fri, 26 May 2017 14:30:37 +0000 (10:30 -0400)]
ibmvnic: Non-fatal error handling
Handle non-fatal error conditions. The process to do this when
resetting the driver is to just do __ibmvnic_close followed by
__ibmvnic_open.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Fri, 26 May 2017 14:30:31 +0000 (10:30 -0400)]
ibmvnic: Fix cleanup of SKB's on driver close
A race condition occurs when closing the driver. Free'ing of skb's
can race between the close routine and ibmvnic_tx_interrupt. To fix
this we move the claenup of tx pools during close to after the
sub-CRQ interrupts are disabled.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Allen [Fri, 26 May 2017 14:30:25 +0000 (10:30 -0400)]
ibmvnic: Send gratuitous arp on reset
Send gratuitous arp after any reset.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Allen [Fri, 26 May 2017 14:30:19 +0000 (10:30 -0400)]
ibmvnic: Handle failover after failed init crq
Handle case where phyp sends a failover after failing to send the
init crq.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Allen [Fri, 26 May 2017 14:30:13 +0000 (10:30 -0400)]
ibmvnic: Track state of adapter napis
Track the state of ibmvnic napis. The driver can get into states where it
can be reset when napis are already disabled and attempting to disable them
again will cause the driver to hang.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 May 2017 19:18:50 +0000 (15:18 -0400)]
Merge branch 'mlxsw-Improve-extensibility'
Jiri Pirko says:
====================
mlxsw: Improve extensibility
Ido says:
Since the initial introduction of the bridge offload in commit
56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
the per-port struct was used to store both physical properties of the
port as well as logical bridge properties such as learning and active
VLANs in the VLAN-aware bridge.
The above resulted in a bloated struct and code that is getting
increasingly difficult to extend when stacked devices are taken into
account as well as more advanced use cases such as IGMP snooping.
Due to the incremental development nature of this driver as well as the
complexity of the underlying hardware, subsequent design decisions failed
to generalize the FID and RIF resources, which could've benefited from
a more generic design, resulting in consolidated code paths and better
extensibility with regards to future ASICs and use cases.
This patchset tries to solve both of these design problems, as they're
tightly coupled. To ease the code review, the changes are done in a
bottom-up manner, in which the port struct is the first to be patched,
then the FIDs the ports are mapped to and finally the RIFs configured on
top.
The first half of the patchset gradually moves away from the previous
design to a design that is more in sync with the underlying hardware and
which clearly separates between hardware-specific structs and logical
ones such as a bridge port.
All the bridge-specific information is removed from the port struct, as
well as the list of VLAN devices ("vPorts") configured on top of it.
Instead, a linked list of VLANs is introduced, which allows each VLAN
to hold a state, such as mapping to a particular FID and membership in
a bridge. The data structures are depicted in the following figure:
mlxsw_sp_bridge_device
+----------+
| |
+----+ |
| | |
| +----------+
|
mlxsw_sp_bridge_port |
+----------+ |
| | |
+--> +-----+--> ..
| | |
| +----+-----+
| |
| v
| mlxsw_sp_bridge_vlan
| +----------+
| | vid X |
| | +--> ..
| | |
| +----+-----+
| |
+--+----v-----+
| vid X |
+--+ +--> ..
| | |
mlxsw_sp_port | +----------+
+----------+ | mlxsw_sp_port_vlan
| | |
| +--+
| |
+----------+
This model allows us to consolidate many of the code paths relating to
VLAN-aware and VLAN-unaware bridges, as the latter is simply represented
using a bridge port with a VLAN list size of one. Another advantage of
the model is that it's easy to extend it with future per-VLAN
attributes - such as mrouter indication - by merely pushing these down
from the bridge port struct to the bridge VLAN one.
The second half of the patchset builds on top of previous work and
prepares the driver for the common FID and RIF cores, which are finally
implemented in the last two patches. These exploit the fact that despite
the different kinds of FIDs and RIFs, they do share a common object on
which the core operations can operate on.
By hiding both objects from the rest of the driver and modeling their
operations using a VFT, it'll be easier to extend the driver for future
use cases such as VXLAN.
Tested using following LNST recipes:
https://github.com/jpirko/lnst/tree/master/recipes/switchdev
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:40 +0000 (08:37 +0200)]
mlxsw: spectrum_router: Implement common RIF core
The mlxsw driver currently implements three types of RIFs. VLAN and FID
RIFs for L3 interfaces on top of VLAN-aware and VLAN-unaware bridges
(respectively) and Subport RIFs for all other L3 interfaces.
All the RIF types follow a common configuration procedure, which only
differs in the type-specific bits. The patch exploits this fact and
consolidates the common code paths, thereby simplifying the code and
making it more extensible.
This work also prepares the driver for use with future ASICs, where the
range of the Subport RIFs will be extended and their configuration
modified accordingly. By merely implementing a new RIF operations and
selecting it during initialization, the same driver could be re-used.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:39 +0000 (08:37 +0200)]
mlxsw: spectrum: Implement common FID core
The device supports three types of FIDs. 802.1Q and 802.1D FIDs for
VLAN-aware and VLAN-unaware bridges (respectively) and rFIDs to
transport packets to the router block.
The different users (e.g., bridge, router, ACLs) of the FIDs
infrastructure need not know about the internal FIDs implementation and
can therefore interact with it using a restricted set of exported
functions.
By encapsulating the entire FID logic and hiding it from the rest of the
driver we get a code base that it much simpler and easier to work with
and extend.
For example, in the current Spectrum ASIC only 802.1D FIDs can be
assigned a VNI, but future ASICs will also support 802.1Q FIDs. With
this patch in place, support for future ASICs can be easily added by
implementing a new FID operations according to their capabilities.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:38 +0000 (08:37 +0200)]
mlxsw: spectrum_router: Determine VR first when creating RIF
All RIF types are associated with a virtual router (VR), so determine VR
first when creating a RIF.
That way, we can more easily integrate the common RIF core in the
following patches.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:37 +0000 (08:37 +0200)]
mlxsw: spectrum_router: Flood packets to router after RIF creation
If a packet ingress the router but can't be assigned an ingress RIF,
it's dropped.
Therefore, in the case of RIF configured on top of a bridge, it makes
sense to start flooding broadcast packets to the router only after the
RIF was created.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:36 +0000 (08:37 +0200)]
mlxsw: spectrum_router: Destroy RIF only based on its struct
Now that all the information to create a RIF is contained within the RIF
struct itself, we can also simplify the destruction logic.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:35 +0000 (08:37 +0200)]
mlxsw: spectrum_router: Configure RIFs based on RIF struct
All the information necessary for the configuration of RIFs can now be
found in the RIF struct itself, so reduce the arguments list.
This gets us one step closer to the common RIF core.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 26 May 2017 06:37:34 +0000 (08:37 +0200)]
mlxsw: spectrum_router: Extend the RIF struct
Currently, when a Subport RIF is configured, the LAG status and VLAN of
the underlying port are read from the port itself. This is problematic,
as we would like to have common code to configure all types of RIFs,
which aren't necessarily bound to a port.
Instead, embed the RIF in a struct specific to the Subport type, which
contains all the necessary information.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>