openwrt/staging/blogic.git
Karsten Graul [Thu, 22 Nov 2018 09:26:42 +0000 (10:26 +0100)]
net/smc: add infrastructure to send delete rkey messages

Add the infrastructure to send LLC messages of type DELETE RKEY to
unregister a shared memory region at the peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Thu, 22 Nov 2018 09:26:41 +0000 (10:26 +0100)]
net/smc: avoid a delay by waiting for nothing

When a send fails, don't start waiting for a response in
smc_llc_do_confirm_rkey().

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
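
The control flow described in "net/smc: avoid a delay by waiting for nothing" can be sketched roughly as below. This is a Python model under stated assumptions, not the kernel code: `send` and `wait_for_response` are hypothetical callables standing in for the LLC send and the response wait, and a non-zero return code is taken to mean failure.

```python
def do_confirm_rkey(send, wait_for_response):
    """Model of 'send, then wait only if the send succeeded'."""
    rc = send()
    if rc != 0:
        # The send failed, so no response can arrive: return at once
        # instead of sitting through the response timeout.
        return rc
    return wait_for_response()
```

With a failing `send`, the wait callable is never invoked, which is the delay the commit message says is avoided.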
Ursula Braun [Thu, 22 Nov 2018 09:26:40 +0000 (10:26 +0100)]
net/smc: cleanup listen worker mutex unlocking

For easier reading, move the unlock of mutex smc_create_lgr_pending into
smc_listen_work(), i.e. into the function in which the mutex was locked.
No functional change.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 22 Nov 2018 09:26:39 +0000 (10:26 +0100)]
net/smc: short wait for late smc_clc_wait_msg

After sending one of the initial LLC messages CONFIRM LINK or
ADD LINK, there is already a wait for the LLC response. It does
not make sense to wait another long time for a CLC DECLINE. Thus
this patch introduces a shorter wait time for these cases.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 22 Nov 2018 09:26:38 +0000 (10:26 +0100)]
net/smc: no link delete for a never active link

If a link is terminated that has never reached the active state,
there is no need to trigger an LLC DELETE LINK.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 22 Nov 2018 09:26:37 +0000 (10:26 +0100)]
net/smc: allow fallback after clc timeouts

If connection initialization fails for the LLC CONFIRM LINK or the
LLC ADD LINK step, fallback to TCP should be enabled. Thus
the negative return code -EAGAIN should switch to a positive timeout
reason code in these cases, and the internal CLC socket should
not have a set sk_err.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 22 Nov 2018 09:26:36 +0000 (10:26 +0100)]
net/smc: remove sock_error detour in clc-functions

There is no need to store the return value in sk_err, if it is
afterwards cleared again with sock_error(). This patch sets the
return value directly. Just cleanup, no functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 22 Nov 2018 09:26:35 +0000 (10:26 +0100)]
net/smc: make smc_lgr_free() static

smc_lgr_free() is just called inside smc_core.c. Make it static.
Just cleanup, no functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 22 Nov 2018 09:26:34 +0000 (10:26 +0100)]
net/smc: cleanup tcp_listen_worker initialization

The tcp_listen_worker is already initialized when socket is
created (in smc_sock_alloc()). Get rid of the duplicate
initialization in smc_listen(). No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Thu, 22 Nov 2018 06:42:00 +0000 (14:42 +0800)]
net: mvneta: remove redundant check for eee->tx_lpi_timer < 0

Fix the smatch warning:

drivers/net/ethernet/marvell/mvneta.c:4252 mvneta_ethtool_set_eee() warn:
 unsigned 'eee->tx_lpi_timer' is never less than zero.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Nov 2018 19:33:54 +0000 (11:33 -0800)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2018-11-21

This series contains updates to all of the Intel LAN drivers and
documentation.

Shannon Nelson updates the ixgbe kernel documentation to include IPsec
hardware offload.

Joe Perches cleans up whitespace issues in the igb driver.

Jesse updates the netdev kernel documentation for NETIF_F_GSO_UDP_L4 to
align with the actual code.  He also aligns the NAPI code of all the
Intel drivers with Eric Dumazet's recommendation to check the return
code of napi_complete_done() to determine whether or not to enable
interrupts or exit poll.

Paul E. McKenney replaces synchronize_sched() with synchronize_rcu() for
ixgbe.

Sasha implements suggestions made by Joe Perches to remove obsolete code
and to use the dev_err() method.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 21 Nov 2018 19:39:28 +0000 (11:39 -0800)]
net-gro: use ffs() to speedup napi_gro_flush()

We very often have few flows/chains to look at, and we
might increase GRO_HASH_BUCKETS to 32 or 64 in the future.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
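
The idea behind using ffs() in "net-gro: use ffs() to speedup napi_gro_flush()" can be illustrated with a small Python model. It is a sketch under assumptions, not the kernel implementation: the bitmask of non-empty buckets and the helper names are introduced here for illustration only. Instead of visiting every hash bucket, the loop jumps directly to the lowest set bit, so the cost follows the number of non-empty buckets rather than the bucket count.

```python
def ffs(x: int) -> int:
    """1-based index of the least significant set bit, 0 if x == 0 (as C ffs())."""
    return (x & -x).bit_length()


def nonempty_buckets(bitmask: int) -> list:
    """Return the indices of set bits, visiting only the set bits."""
    visited = []
    while bitmask:
        visited.append(ffs(bitmask) - 1)  # bucket index is 0-based
        bitmask &= bitmask - 1            # clear the lowest set bit
    return visited
```

With only a few active flows the loop body runs a few times, even if the number of buckets were raised to 32 or 64.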
David S. Miller [Fri, 23 Nov 2018 19:17:07 +0000 (11:17 -0800)]
Merge branch 'dpaa-coalesce'

Madalin Bucur says:

====================
dpaa_eth: add ethtool coalesce control

Add control of the DPAA portal interrupt coalescing settings from
ethtool.

changes from v2: read ithresh from HW, set previous values on failure
changes from v1: added range checking for the QMan APIs
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Wed, 21 Nov 2018 11:41:09 +0000 (13:41 +0200)]
dpaa_eth: add ethtool coalesce control

Allow ethtool control of the DPAA QMan portal interrupt coalescing
settings.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Wed, 21 Nov 2018 11:41:08 +0000 (13:41 +0200)]
soc/qman: add return value to interrupt coalesce changing APIs

Check that the values received by the portal interrupt coalesce
change APIs are in range.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Wed, 21 Nov 2018 11:41:07 +0000 (13:41 +0200)]
soc: fsl: qbman: read ithresh from HW

Read the DQRR interrupt threshold directly from the hardware.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 23 Nov 2018 19:14:48 +0000 (11:14 -0800)]
Merge branch 'ravb-Duplex-handling-update-V3'

Magnus Damm says:

====================
ravb: Duplex handling update V3

[PATCH v3 01/02] ravb: Do not announce HDX as supported
[PATCH v3 02/02] ravb: Clean up duplex handling

This series is V3 of duplex handling improvements for the Ethernet-AVB driver.

Previous versions of this series have been posted to linux-renesas-soc as RFC
so V3 is the first actual version to make it to netdev.

Based on the latest data sheet for R-Car Gen3 [1] and R-Car Gen2 [2]
the following information is part of the EthernetAVB-IF overview page:

Transfer speed: Supports transfer at 100 and 1000 Mbps
Mode: Full-duplex mode

It seems that the driver implementation is not matching the information
provided in the friendly data sheet, and on top of this during run-time
when changing PHY configuration of the link partner the Ethernet-AVB PHY
seems to want to announce unsupported modes.

[1] R-Car Series, 3rd Generation Rev.1.00 (Apr 2018)
[2] R-Car Series, 2nd Generation Rev.2.00 (Feb 2016)

Changes since V2:
- Updated patch 1/2 to make use of phy_remove_link_mode()
- Added Reviewed-by from Sergei - thanks!

Changes since V1:
- Updated patches to reflect input from Sergei and Geert - thanks!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Magnus Damm [Wed, 21 Nov 2018 11:21:26 +0000 (20:21 +0900)]
ravb: Clean up duplex handling

Since only full-duplex operation is supported by the
hardware, remove duplex handling code and keep the
register setting of ECMR.DM fixed at 1.

This updates the driver implementation to follow the
data sheet text "This bit should always be set to 1."

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Magnus Damm [Wed, 21 Nov 2018 11:21:17 +0000 (20:21 +0900)]
ravb: Do not announce HDX as supported

According to the data sheet, the Ethernet-AVB hardware in R-Car Gen3
and R-Car Gen2 SoCs does not support half-duplex operation. So update
the driver to mark 100Mbit and 1Gbps HDX as unsupported.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Santosh Rastapur [Wed, 21 Nov 2018 08:10:24 +0000 (13:40 +0530)]
cxgb4: use new fw interface to get the VIN and smt index

If the fw supports returning VIN/VIVLD in FW_VI_CMD, save them
in the port_info structure; otherwise retrieve them from the viid and
save them in the port_info structure. Do the same for smt_idx from
FW_VI_MAC_CMD.

Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger [Tue, 20 Nov 2018 23:17:01 +0000 (15:17 -0800)]
net: bcmgenet: remove HFB_CTRL access

Commit c5a54bbcecec ("net: bcmgenet: abort suspend on error")
mistakenly introduced register accesses that should not occur
in bcmgenet_wol_power_up_cfg().

Fixes: c5a54bbcecec ("net: bcmgenet: abort suspend on error")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Jurgens [Wed, 21 Nov 2018 15:12:05 +0000 (17:12 +0200)]
{net, IB}/mlx4: Initialize CQ buffers in the driver when possible

Perform CQ initialization in the driver when the capability is supported
by the FW.  When passing the CQ to HW indicate that the CQ buffer has
been pre-initialized.

Doing so decreases CQ creation time.  Testing on P8 showed a single 2048
entry CQ creation time was reduced from ~395us to ~170us, which is
2.3x faster.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Wed, 21 Nov 2018 13:31:15 +0000 (14:31 +0100)]
selftests: explicitly require kernel features needed by udpgro tests

Commit 3327a9c46352f1 ("selftests: add functionals test for UDP GRO")
makes use of IPv6 NAT, but such a feature is not currently implied by
selftests. Since the 'ip[6]tables' commands may actually create nft rules,
depending on the specific user-space version, let's pull both NF and
NFT nat modules plus the needed deps.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 3327a9c46352f1 ("selftests: add functionals test for UDP GRO")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 22 Nov 2018 01:10:32 +0000 (17:10 -0800)]
Merge branch 'mlxsw-Add-VxLAN-learning-support'

Ido Schimmel says:

====================
mlxsw: Add VxLAN learning support

This patchset adds VxLAN learning support in the mlxsw driver.

The first five patches from Petr add the required switchdev APIs which
allow device drivers to notify the VxLAN driver about learned / aged-out
FDB entries.

First in patch #1, an unnecessary argument is dropped from
__vxlan_fdb_delete().

In patches #2-#4, the VxLAN FDB handling code is extended to make
sending the switchdev events configurable; to mark user-added entries as
such; and to make sure HW-learned FDB entries do not take over
user-added ones.

Finally in patch #5, the necessary switchdev notifications are added and
handled by VxLAN, similarly to how this is handled in the bridge driver.

Patch #6 allows changing of the VxLAN's device ageing time since it is
useful for the selftest in the last patch.

Patch #7 adds support for querying bridge port flags of a given
netdevice, as a new entry should not be learned and notified to the
bridge driver in case learning is disabled on the bridge port.

Next patches gradually add learning support in mlxsw.

The last patch adds a new test case for VxLAN learning.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:52 +0000 (08:02 +0000)]
selftests: forwarding: vxlan_bridge_1d: Add learning test

Add a test which checks that the VxLAN driver can learn FDB entries and
that these entries are correctly deleted and aged-out.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:51 +0000 (08:02 +0000)]
selftests: mlxsw: Consider VxLAN learning enabled as valid

The test currently expects a configuration that includes a VxLAN
device with learning enabled to fail.

Previous patches enabled VxLAN learning in mlxsw, so change the test
accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:50 +0000 (08:02 +0000)]
mlxsw: spectrum_nve: Allow VxLAN learning

Up until now the driver returned an error when learning was enabled on a
VxLAN device enslaved to an offloaded bridge.

Previous patches added VxLAN learning support, so remove the check.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:48 +0000 (08:02 +0000)]
mlxsw: spectrum_switchdev: Allow deletion of learned FDB entries

Allow users to delete learned FDB entries from the bridge's FDB before
enabling VxLAN learning.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:47 +0000 (08:02 +0000)]
mlxsw: spectrum_switchdev: Process learned VxLAN FDB entries

Start processing two new entry types in addition to current ones:
* Learned unicast tunnel entry
* Aged-out unicast tunnel entry

In both cases the device reports on a new {MAC, FID, IP address} tuple
that was learned / aged-out. Based on this notification, the driver
instructs the device to add / delete the entry to / from its database.

The driver also makes sure to notify the bridge and VxLAN drivers about
the new entry.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:46 +0000 (08:02 +0000)]
mlxsw: spectrum_nve: Add API to resolve learned IP addresses

FDB notifications for entries learned from an NVE tunnel contain the IP
address of the remote VTEP. In the case of IPv4 underlay, the IP address
is specified as-is. IPv6 addresses on the other hand, are specified as
handles which then need to be used to query the actual address from the
device.

Only IPv4 underlay is currently supported, so we cannot receive
notifications for IPv6 addresses and therefore an error is returned when
one tries to resolve such an address.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:45 +0000 (08:02 +0000)]
mlxsw: spectrum_fid: Allow FID lookup by its index

When processing a notification about a new FDB entry learned from a
VxLAN tunnel, the driver is provided with the FID index among other
parameters.

The driver potentially needs to update the bridge and VxLAN drivers
about the new entry using a pointer to the VxLAN device and the
corresponding VNI.

These two parameters are stored in the FID, so add a new function that
allows looking up a FID based on its index.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:44 +0000 (08:02 +0000)]
mlxsw: spectrum_fid: Store ifindex of NVE device in FID

The driver periodically polls for new FDB entries learned by the device.
In the case of an FDB entry learned from a VxLAN tunnel, the
notification includes the IP of the remote VTEP, the filtering
identifier (FID) and the source MAC address of the overlay packet.

Assuming learning is enabled in the VxLAN and bridge drivers, the driver
needs to generate a notification and update them about the new FDB
entry.

Store the ifindex of the NVE device in the FID so that the driver will
be able to update the VxLAN and bridge drivers using it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:42 +0000 (08:02 +0000)]
mlxsw: reg: Add definition of unicast tunnel record for SFN register

Will be used to process learned FDB records from an NVE tunnel.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:41 +0000 (08:02 +0000)]
bridge: Allow querying bridge port flags

Allow querying bridge port flags so that drivers capable of performing
VxLAN learning will update the bridge driver only if learning is enabled
on its bridge port corresponding to the VxLAN device.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Nov 2018 08:02:40 +0000 (08:02 +0000)]
vxlan: Allow changing ageing time

In a similar fashion to the bridge device, allow changing the ageing
time of the VxLAN device by scheduling its timer to fire if the ageing
time changed.

One use case is selftests where learning / ageing of VxLAN FDB entries
is tested. The default ageing time is 5 minutes, which is too long for a
simple selftest.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Wed, 21 Nov 2018 08:02:39 +0000 (08:02 +0000)]
vxlan: Add hardware FDB learning

In order to allow devices to signal learning events to VXLAN, introduce
two new switchdev messages: SWITCHDEV_VXLAN_FDB_ADD_TO_BRIDGE and
SWITCHDEV_VXLAN_FDB_DEL_TO_BRIDGE.

Listen to these notifications in the vxlan driver. The FDB entries
learned this way have an NTF_EXT_LEARNED flag, and only entries marked
as such can be unlearned by the _DEL_ event. They are also immediately
marked as offloaded. This is the same behavior that the bridge driver
observes.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Wed, 21 Nov 2018 08:02:37 +0000 (08:02 +0000)]
vxlan: Don't override user-added entries with ext-learned ones

When an external learning event collides with a user-added entry, the
user-added entry shouldn't be taken over. Otherwise on an unlearn event
the entry would be completely lost, even though the user added it by
hand.

Therefore skip update of FDB flags and state for these cases. This is in
accordance with the bridge behavior.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
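
The interaction between user-added and externally learned entries described in the vxlan commits above can be modelled in a few lines of Python. This is only a sketch: the dictionary-based FDB, the function names and the flag values (`FLAG_EXT_LEARNED`, `FLAG_USER_ADDED`) are assumptions made for illustration and are not the driver's data structures or constants.

```python
FLAG_EXT_LEARNED = 0x10   # illustrative value: entry came from a hardware learning event
FLAG_USER_ADDED = 0x100   # illustrative value: entry was added by the user


def ext_learn(fdb: dict, mac: str, remote: str) -> bool:
    """Handle an external 'learned' event; never take over a user-added entry."""
    entry = fdb.get(mac)
    if entry is not None and entry["flags"] & FLAG_USER_ADDED:
        return False  # skip the update of flags and state
    fdb[mac] = {"remote": remote, "flags": FLAG_EXT_LEARNED}
    return True


def ext_unlearn(fdb: dict, mac: str) -> bool:
    """Handle an external 'unlearned' event; only ext-learned entries are removed."""
    entry = fdb.get(mac)
    if entry is not None and entry["flags"] & FLAG_EXT_LEARNED:
        del fdb[mac]
        return True
    return False
```

Because a colliding learn event leaves the user-added entry untouched, a later unlearn event cannot delete it, which is the loss the commit message warns about.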
Petr Machata [Wed, 21 Nov 2018 08:02:36 +0000 (08:02 +0000)]
vxlan: Mark user-added FDB entries

The VXLAN driver needs to differentiate between FDB entries learned by
the VXLAN driver, and those added by the user. The latter ones shouldn't
be taken over by external learning events. This is in accordance with
bridge behavior.

Therefore, extend the flags bitfield to 16 bits and add a new private
NTF flag to mark the user-added entries.

This seems preferable to adding a dedicated boolean, because passing the
flag, unlike passing e.g. a true, makes it clear what the meaning of the
bit is.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Wed, 21 Nov 2018 08:02:35 +0000 (08:02 +0000)]
vxlan: vxlan_fdb_notify(): Make switchdev notification configurable

In a following patch, vxlan is extended to allow hardware FDB learning.
For FDB entries learned this way, switchdev notifications should not be
sent again, because the driver already knows about these entries.

To that end, add an argument to vxlan_fdb_notify() to determine whether
the switchdev notifications should be sent. Propagate the argument to
all call sites transitively, eventually passing true in all root calls.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Wed, 21 Nov 2018 08:02:34 +0000 (08:02 +0000)]
vxlan: __vxlan_fdb_delete(): Drop unused argument vid

This argument is necessary for vxlan_fdb_delete(), the API of which is
prescribed by ndo_fdb_del, but __vxlan_fdb_delete() doesn't need it.
Therefore drop it.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrea Claudi [Tue, 20 Nov 2018 17:30:30 +0000 (18:30 +0100)]
net: lpc_eth: fix trivial comment typo

Fix comment typo rxfliterctrl -> rxfilterctrl

Signed-off-by: Andrea Claudi <aclaudi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Nov 2018 23:41:31 +0000 (15:41 -0800)]
Merge branch 'VLAN-tag-handling-cleanup'

Michał Mirosław says:

====================
VLAN tag handling cleanup

This is a cleanup set after VLAN_TAG_PRESENT removal. The CFI bit
handling is made similar to how other tag fields are used.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Tue, 20 Nov 2018 12:20:33 +0000 (13:20 +0100)]
mlx5: use skb_vlan_tag_get_prio()

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Tue, 20 Nov 2018 12:20:32 +0000 (13:20 +0100)]
benet: use skb_vlan_tag_get_prio()

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Tue, 20 Nov 2018 12:20:32 +0000 (13:20 +0100)]
net/hyperv: use skb_vlan_tag_*() helpers

Replace open-coded bitfield manipulation with skb_vlan_tag_*() helpers.
This also enables correct passing of the VLAN.CFI bit.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław [Tue, 20 Nov 2018 12:20:31 +0000 (13:20 +0100)]
net/vlan: introduce skb_vlan_tag_get_cfi() helper

Abstract CFI/DEI bit access consistently with other VLAN tag fields.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
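
For readers unfamiliar with the tag fields these helpers abstract, the following Python sketch decodes a 16-bit VLAN TCI value. The masks follow the IEEE 802.1Q layout (3-bit priority, 1-bit CFI/DEI, 12-bit VLAN ID); the Python function names only echo the kernel helper names and operate on a plain integer rather than an skb, so they are illustrative stand-ins, not the kernel implementation.

```python
VLAN_PRIO_MASK = 0xE000   # bits 15..13: priority code point
VLAN_PRIO_SHIFT = 13
VLAN_CFI_MASK = 0x1000    # bit 12: CFI/DEI
VLAN_VID_MASK = 0x0FFF    # bits 11..0: VLAN identifier


def tag_get_prio(tci: int) -> int:
    """Extract the 3-bit priority field."""
    return (tci & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT


def tag_get_cfi(tci: int) -> bool:
    """Report whether the CFI/DEI bit is set."""
    return bool(tci & VLAN_CFI_MASK)


def tag_get_id(tci: int) -> int:
    """Extract the 12-bit VLAN ID."""
    return tci & VLAN_VID_MASK
```

Accessing all three fields through helpers of the same shape is the consistency this cleanup series aims for, as opposed to open-coded masking at each call site.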
Sasha Neftin [Mon, 12 Nov 2018 09:05:20 +0000 (11:05 +0200)]
igc: Remove obsolete IGC_ERR define

Address community comment.
Remove obsolete IGC_ERR define and use dev_err method.
Suggested by Joe Perches.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Paul E. McKenney [Sun, 11 Nov 2018 19:43:42 +0000 (11:43 -0800)]
ixgbe: Replace synchronize_sched() with synchronize_rcu()

Now that synchronize_rcu() waits for preempt-disable regions of code
as well as RCU read-side critical sections, synchronize_sched() can be
replaced by synchronize_rcu().  This commit therefore makes this change.

Signed-off-by: "Paul E. McKenney" <paulmck@linux.ibm.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jesse Brandeburg [Thu, 8 Nov 2018 22:55:32 +0000 (14:55 -0800)]
ethernet/intel: consolidate NAPI and NAPI exit

While reviewing code, I noticed that Eric Dumazet recommends that
drivers check the return code of napi_complete_done, and use that
to decide to enable interrupts or not when exiting poll.  One of
the Intel drivers was already fixed (ixgbe).

Upon looking at the Intel drivers as a whole, we are handling our
polling and NAPI exit in a few different ways based on whether we
have multiqueue and whether we have Tx cleanup included. Several
drivers had the bug of exiting NAPI with return 0, which appears
to mess up the accounting in the stack.

Consolidate all the NAPI routines to use the best known way of exiting
and to mostly look like each other:
1) check the return code of napi_complete_done to control interrupt enable
2) return the actual amount of work done
3) return budget immediately if NAPI needs to poll again

Tested the changes on e1000e with a high interrupt rate set, and
it shows about an 8% reduction in the CPU utilization when busy
polling because we aren't re-enabling interrupts when we're about
to be polled.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
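
The three rules listed in "ethernet/intel: consolidate NAPI and NAPI exit" can be sketched as a small Python model of a poll routine. It is a simplified illustration under assumptions: `pending` stands for the packets waiting in the ring, and `complete_done` is a hypothetical callable standing in for napi_complete_done(), returning True when it is safe to re-enable interrupts.

```python
def poll(pending: int, budget: int, complete_done):
    """Return (work_done, interrupts_reenabled) for one poll invocation."""
    work_done = min(pending, budget)

    if work_done == budget:
        # Rule 3: the budget is exhausted, so return it immediately;
        # the stack will poll again and interrupts stay disabled.
        return budget, False

    # Rule 1: only re-enable interrupts if completion says polling is over.
    reenable = bool(complete_done(work_done))

    # Rule 2: report the actual amount of work done (not a hard-coded 0).
    return work_done, reenable
```

When the completion call returns False, for example because busy polling is about to run, interrupts are left disabled, which is the case the commit message credits for the reduced CPU utilization.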
Jesse Brandeburg [Thu, 8 Nov 2018 05:40:17 +0000 (21:40 -0800)]
docs-networking: fix typo in define

The #define for NETIF_F_GSO_UDP_L4 was incorrect in the
documentation, fix it by making it match the actual code.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Joe Perches [Thu, 1 Nov 2018 07:03:31 +0000 (00:03 -0700)]
igb: Fix format with line continuation whitespace

The line continuation unintentionally adds whitespace so
instead use a coalesced format to remove the whitespace.

Miscellanea:

o Use a more typical style for ternaries and arguments
  for this logging message

Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Shannon Nelson [Mon, 29 Oct 2018 22:54:12 +0000 (15:54 -0700)]
ixgbe: add ipsec hw offload note to ixgbe Documentation

Add a short note about using IPsec Hardware Offload with
the ixgbe driver.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Wed, 21 Nov 2018 04:59:27 +0000 (20:59 -0800)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2018-11-20

This series contains updates to the ice driver only.

Akeem updates the driver to determine whether or not to do
auto-negotiation based on the VSI state.

Bruce cleans up the control queue code to remove duplicate code, takes
advantage of some compiler optimizations by making some structures
constant (which also notes that they cannot be modified), cleans up
formatting issues and a code comment that needed clarification, and
fixes a potential NULL pointer dereference by adding a check.

Jaroslaw adds a check to verify if memory was allocated or not.

Yashaswini Raghuram fixes the driver to ensure we are not enabling the
LAN_EN flag if the MAC in the MAC-VLAN is a unicast MAC, so that the
unicast packets are not forwarded to the wire.

Dave fixes the return value of ice_napi_poll() to be more useful in
returning the work that was done and should only return 0 when no work
was done.

Anirudh cleans up code comments to make them more consistent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Nov 2018 04:57:12 +0000 (20:57 -0800)]
Merge branch 'dsa-microchip-Modify-KSZ9477-DSA-driver-in-preparation-to-add-other-KSZ-switch-drivers'

Tristram Ha says:

====================
net: dsa: microchip: Modify KSZ9477 DSA driver in preparation to add other KSZ switch drivers

This series of patches is to modify the original KSZ9477 DSA driver so
that other KSZ switch drivers can be added and use the common code.

There are several steps to accomplish this.  First is to rename some
functions with a prefix to indicate chip-specific code.  Second is to
move common code into a header that can be shared.  Last is to modify
tag_ksz.c so that it can handle the many tail tag formats used by
different KSZ switch drivers.

ksz_common.c will contain the common code used by all KSZ switch drivers.
ksz9477.c will contain KSZ9477 code from the original ksz_common.c.
ksz9477_spi.c is renamed from ksz_spi.c.
ksz9477_reg.h is renamed from ksz_9477_reg.h.
ksz_common.h is added to provide common code access to KSZ switch
drivers.
ksz_spi.h is added to provide common SPI access functions to KSZ SPI
drivers.

v4
- Patches were removed to concentrate on changing driver structure without
adding new code.

v3
- The phy_device structure is used to hold port link information
- A structure is passed in ksz_xmit and ksz_rcv instead of function pointer
- Switch offload forwarding is supported

v2
- Initialize reg_mutex before use
- The alu_mutex is only used inside chip specific functions

v1
- Each patch in the set is self-contained
- Use ksz9477 prefix to indicate KSZ9477 specific code
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: microchip: rename ksz_9477_reg.h to ksz9477_reg.h
Tristram Ha [Tue, 20 Nov 2018 23:55:10 +0000 (15:55 -0800)]
net: dsa: microchip: rename ksz_9477_reg.h to ksz9477_reg.h

Rename ksz_9477_reg.h to ksz9477_reg.h for consistency as the product
name is always KSZ####.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: microchip: break KSZ9477 DSA driver into two files
Tristram Ha [Tue, 20 Nov 2018 23:55:09 +0000 (15:55 -0800)]
net: dsa: microchip: break KSZ9477 DSA driver into two files

Break KSZ9477 DSA driver into two files in preparation to add more KSZ
switch drivers.
Add common functions in ksz_common.h so that other KSZ switch drivers
can access code in ksz_common.c.
Add ksz_spi.h for common functions used by KSZ switch SPI drivers.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: microchip: rename ksz_spi.c to ksz9477_spi.c
Tristram Ha [Tue, 20 Nov 2018 23:55:08 +0000 (15:55 -0800)]
net: dsa: microchip: rename ksz_spi.c to ksz9477_spi.c

Rename ksz_spi.c to ksz9477_spi.c and update Kconfig in preparation to add
more KSZ switch drivers.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: microchip: rename some functions with ksz9477 prefix
Tristram Ha [Tue, 20 Nov 2018 23:55:07 +0000 (15:55 -0800)]
net: dsa: microchip: rename some functions with ksz9477 prefix

Rename some functions with ksz9477 prefix to separate chip specific code
from common code.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: microchip: clean up code
Tristram Ha [Tue, 20 Nov 2018 23:55:06 +0000 (15:55 -0800)]
net: dsa: microchip: clean up code

Clean up code according to patch check suggestions.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: microchip: replace license with GPL
Tristram Ha [Tue, 20 Nov 2018 23:55:05 +0000 (15:55 -0800)]
net: dsa: microchip: replace license with GPL

Replace license with GPL.

Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoice: Fix possible NULL pointer de-reference
Bruce Allan [Wed, 7 Nov 2018 18:19:35 +0000 (10:19 -0800)]
ice: Fix possible NULL pointer de-reference

A recent update to smatch is causing it to report the error "we previously
assumed 'm_entry->vsi_list_info' could be null". Fix that.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Use Tx|Rx in comments
Anirudh Venkataramanan [Fri, 26 Oct 2018 18:44:47 +0000 (11:44 -0700)]
ice: Use Tx|Rx in comments

In code comments, use Tx|Rx instead of tx|rx

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Cosmetic formatting changes
Anirudh Venkataramanan [Fri, 26 Oct 2018 18:44:46 +0000 (11:44 -0700)]
ice: Cosmetic formatting changes

1. Fix several cases of double spacing
2. Fix typos
3. Capitalize abbreviations

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Cleanup short function signatures
Bruce Allan [Fri, 26 Oct 2018 18:44:45 +0000 (11:44 -0700)]
ice: Cleanup short function signatures

Function signatures that do not exceed 80 characters should be on a single
line.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Cleanup ice_tx_timeout()
Bruce Allan [Fri, 26 Oct 2018 18:44:44 +0000 (11:44 -0700)]
ice: Cleanup ice_tx_timeout()

Clean up a number of formatting issues and a comment that could use
clarification.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Fix return value from NAPI poll
Dave Ertman [Fri, 26 Oct 2018 18:44:43 +0000 (11:44 -0700)]
ice: Fix return value from NAPI poll

ice_napi_poll is hard-coded to return zero when it's done. It should
instead return the work done (if any work was done). The only time it
should return zero is if an interrupt or poll is handled and no work
is performed. So change the return value to be the minimum of work
done or budget-1.
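
The return-value rule described above can be sketched as follows. This
is a minimal model of the arithmetic only, not the driver's code: the
function name napi_poll_return and its parameters are hypothetical.

```python
def napi_poll_return(work_done: int, budget: int) -> int:
    """Value reported back by a poll routine under the rule in this commit.

    Hypothetical helper: returns the work done, capped at budget - 1,
    so zero is returned only when no work was performed.
    """
    return min(work_done, budget - 1)


# No work done: the only case that yields zero.
print(napi_poll_return(0, 64))
# Some work done: reported as-is.
print(napi_poll_return(10, 64))
# Full budget consumed: capped at budget - 1.
print(napi_poll_return(64, 64))
```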

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Constify global structures that can/should be
Bruce Allan [Fri, 26 Oct 2018 18:44:42 +0000 (11:44 -0700)]
ice: Constify global structures that can/should be

Indicate these structs should not be modified and take advantage of some
compiler optimizations by making these structs const.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Do not set LAN_EN for MAC-VLAN filters
Yashaswini Raghuram Prathivadi Bhayankaram [Fri, 26 Oct 2018 18:44:41 +0000 (11:44 -0700)]
ice: Do not set LAN_EN for MAC-VLAN filters

In the action fields for a MAC-VLAN filter, do not set the LAN_EN flag
if the MAC in the MAC-VLAN is a unicast MAC. The unicast packets that
match should not be forwarded to the wire.
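
As a rough illustration of the decision described above: a MAC address
is unicast when the least significant bit of its first octet is clear.
The helper names below are hypothetical, and setting the flag for
non-unicast addresses is an assumption of this sketch, not something
the commit message states.

```python
def is_unicast_mac(mac: bytes) -> bool:
    """True if the group bit (LSB of the first octet) is clear."""
    if len(mac) != 6:
        raise ValueError("a MAC address has exactly 6 octets")
    return (mac[0] & 0x01) == 0


def want_lan_en(mac: bytes) -> bool:
    """Hypothetical policy: leave LAN_EN unset for unicast MAC-VLAN filters.

    Assumption: the flag is set for multicast/broadcast addresses.
    """
    return not is_unicast_mac(mac)


unicast = bytes.fromhex("021122334455")
multicast = bytes.fromhex("01005e000001")
print(want_lan_en(unicast), want_lan_en(multicast))
```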

Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Pass the return value of ice_init_def_sw_recp()
Jaroslaw Ilgiewicz [Fri, 26 Oct 2018 18:44:40 +0000 (11:44 -0700)]
ice: Pass the return value of ice_init_def_sw_recp()

Added check of return value for ice_init_def_sw_recp().
Now we know if memory was correctly allocated.

Signed-off-by: Jaroslaw Ilgiewicz <jaroslaw.ilgiewicz@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Cleanup duplicate control queue code
Bruce Allan [Fri, 26 Oct 2018 18:44:39 +0000 (11:44 -0700)]
ice: Cleanup duplicate control queue code

1. Assigning the register offset and mask values contains duplicate code
   that can easily be replaced with a macro.

2. Separate functions for freeing send queue and receive queue rings are
   not needed; replace with a single function that uses a pointer to the
   struct ice_ctl_q_ring structure as a parameter instead of a pointer to
   the struct ice_ctl_q_info structure.

3. Initializing register settings for both send queue and receive queue
   contains duplicate code that can easily be replaced with a helper
   function.

4. Separate functions for freeing send queue and receive queue buffers are
   not needed; duplicate code can easily be replaced with a macro.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoice: Do autoneg based on VSI state
Akeem G Abodunrin [Fri, 26 Oct 2018 18:44:38 +0000 (11:44 -0700)]
ice: Do autoneg based on VSI state

If VSI state is up, we should do autoneg with link up, otherwise
with link down.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agonet-next/hinic: fix a bug in rx data flow
Xue Chaojing [Tue, 20 Nov 2018 05:47:34 +0000 (05:47 +0000)]
net-next/hinic: fix a bug in rx data flow

In rx_alloc_pkts(), there is a looping call of the tasklet, which causes
100% CPU utilization even when no packets are being received. This patch
fixes this bug.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet-next/hinic:fix a bug in set mac address
Xue Chaojing [Tue, 20 Nov 2018 05:47:33 +0000 (05:47 +0000)]
net-next/hinic:fix a bug in set mac address

In add_mac_addr(), if the MAC address is a multicast address,
it will not be set, which causes the network card to fail to receive
multicast packets. This patch fixes this bug.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet-next/hinic:add rx checksum offload for HiNIC
Xue Chaojing [Tue, 20 Nov 2018 05:47:32 +0000 (05:47 +0000)]
net-next/hinic:add rx checksum offload for HiNIC

In order to improve performance, this patch adds rx checksum offload
for the HiNIC driver. A performance test (iperf) shows more than 80%
improvement in TCP streams.

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet-next/hinic:replace multiply and division operators
Xue Chaojing [Tue, 20 Nov 2018 05:47:31 +0000 (05:47 +0000)]
net-next/hinic:replace multiply and division operators

To improve performance, this patch uses bit operations to replace
multiply and division operators.
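
The commit message does not say which expressions were changed, so the
following is only a generic sketch of the technique: when a size is a
power of two, multiplication, division and remainder on non-negative
values can be written as shifts and masks. UNIT_SHIFT and the function
names are hypothetical.

```python
UNIT_SHIFT = 6                 # hypothetical: one unit is 2**6 = 64 bytes
UNIT_SIZE = 1 << UNIT_SHIFT


def units_to_bytes(units: int) -> int:
    """units * UNIT_SIZE, written as a left shift."""
    return units << UNIT_SHIFT


def bytes_to_units(nbytes: int) -> int:
    """nbytes // UNIT_SIZE, written as a right shift (non-negative input)."""
    return nbytes >> UNIT_SHIFT


def offset_in_unit(nbytes: int) -> int:
    """nbytes % UNIT_SIZE, written as a mask (non-negative input)."""
    return nbytes & (UNIT_SIZE - 1)


print(units_to_bytes(5), bytes_to_units(200), offset_in_unit(200))
```

The equivalence only holds for power-of-two sizes and non-negative
operands, which is why the sketch fixes the unit size as a shift count.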

Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: core: Extend cooling device with cooling levels
Vadim Pasternak [Tue, 20 Nov 2018 06:52:03 +0000 (06:52 +0000)]
mlxsw: core: Extend cooling device with cooling levels

Extend cooling device with cooling levels vector to allow more
flexibility of PWM setting.

Thermal zone algorithm operates with the numerical states for PWM
setting. Each state is the index, defined in range from 0 to 10 and it's
mapped to the relevant duty cycle value, which is written to PWM
controller. With the current definition fan speed is set to 0% for state
0, 10% for state 1, and so on up to 100% for the maximum state 10.

Some systems have a limitation on the minimum PWM speed. On such
systems, setting PWM speed to 0% disables the ability to increase the
speed again, and the device stalls at zero speed.  Cooling levels allow
the state vector to be configured according to the particular system
requirements. For example, if PWM speed is not allowed to be below 30%,
cooling levels could be configured as 30%, 30%, 30%, 30%, 40%, 50% and
so on.
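
The state-to-duty-cycle mapping described above can be sketched as a
small table builder. This is an illustration under stated assumptions,
not the driver's code: cooling_levels and min_duty_percent are
hypothetical names, and clamping with max() is one way to obtain the
example vector from the text.

```python
MAX_STATE = 10  # states run from 0 to 10, per the description above


def cooling_levels(min_duty_percent: int = 0) -> list:
    """Duty cycle (in percent) for each cooling state.

    Default mapping is state * 10%; states that would fall below the
    minimum are raised to it.
    """
    return [max(state * 10, min_duty_percent) for state in range(MAX_STATE + 1)]


# Default definition: 0%, 10%, ..., 100%.
print(cooling_levels())
# System that must not run below 30%: 30, 30, 30, 30, 40, 50, ...
print(cooling_levels(30))
```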

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4/cxgb4vf: Fix mac_hlist initialization and free
Arjun Vynipadath [Tue, 20 Nov 2018 06:41:39 +0000 (12:11 +0530)]
cxgb4/cxgb4vf: Fix mac_hlist initialization and free

A null pointer dereference is seen when the cxgb4vf driver is unloaded
without bringing up any interfaces. Move mac_hlist initialization
to driver probe and free the mac_hlist in remove to fix the issue.

Fixes: 24357e06ba51 ("cxgb4vf: fix memleak in mac_hlist initialization")
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotcp: drop dst in tcp_add_backlog()
Eric Dumazet [Tue, 20 Nov 2018 01:45:55 +0000 (17:45 -0800)]
tcp: drop dst in tcp_add_backlog()

Under stress, softirq rx handler often hits a socket owned by the user,
and has to queue the packet into socket backlog.

When this happens, skb dst refcount is taken before we escape rcu
protected region. This is done from __sk_add_backlog() calling
skb_dst_force().

Consumer will have to perform the opposite costly operation.

AFAIK nothing in tcp stack requests the dst after skb was stored
in the backlog. If this was the case, we would have had failures
already since skb_dst_force() can end up clearing skb dst anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoipv4: Don't try to print ASCII of link level header in martian dumps.
David S. Miller [Tue, 20 Nov 2018 18:15:36 +0000 (10:15 -0800)]
ipv4: Don't try to print ASCII of link level header in martian dumps.

This has no value whatsoever.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet_sched: sch_fq: avoid calling ktime_get_ns() if not needed
Eric Dumazet [Tue, 20 Nov 2018 01:30:19 +0000 (17:30 -0800)]
net_sched: sch_fq: avoid calling ktime_get_ns() if not needed

There are two cases where we can avoid calling ktime_get_ns():

1) Queue is empty.
2) Internal queue is not empty.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'gred-add-offload-support'
David S. Miller [Tue, 20 Nov 2018 02:53:46 +0000 (18:53 -0800)]
Merge branch 'gred-add-offload-support'

Jakub Kicinski says:

====================
gred: add offload support

This series adds support for GRED offload in the nfp driver.  So
far we have only supported the RED Qdisc offload, but we need a
way to differentiate traffic types e.g. based on DSCP marking.

It may seem like PRIO+RED is a good match for this job, however,
(a) we don't need strict priority behaviour of PRIO, and (b) PRIO
uses the legacy way of mapping ToS fields to bands, which is quite
awkward and limiting.

The less commonly used GRED Qdisc is a better match for the scenario,
it allows multiple sets of RED parameters and queue lengths to be
maintained with a single FIFO queue.  This is exactly how nfp offload
behaves.  We use a trivial u32 classifier to assign packets to virtual
queues.

There is also the minor advantage that GRED can't have its child
changed, therefore limiting ways in which the configuration of SW
path can diverge from HW offload.

Last patch of the series adds support for (G)RED in non-ECN mode,
where packets are dropped instead of marked.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: add support for more threshold actions
Jakub Kicinski [Mon, 19 Nov 2018 23:21:50 +0000 (15:21 -0800)]
nfp: abm: add support for more threshold actions

Original FW only allowed us to perform ECN marking.  Newer releases
also support plain old drop.  Add the ability to configure drop
policy.  This is particularly useful in combination with GRED,
because different bands can have different ECN marking settings.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: add cls_u32 offload for simple band classification
Jakub Kicinski [Mon, 19 Nov 2018 23:21:49 +0000 (15:21 -0800)]
nfp: abm: add cls_u32 offload for simple band classification

Use offload of very simple u32 filters to direct packets to GRED
bands based on the DSCP marking.  No u32 hashing is supported,
just plain simple filters matching on ToS or Priority with
an appropriate mask that the device can support.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: add functions to update DSCP -> virtual queue map
Jakub Kicinski [Mon, 19 Nov 2018 23:21:48 +0000 (15:21 -0800)]
nfp: abm: add functions to update DSCP -> virtual queue map

Learn how to set the DSCP map.  FW uses a packed array whose
geometry depends on the number of supported priorities and
virtual queues.  Write code to assemble this map and to communicate
the setting to the FW via mailbox.
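
The exact layout used by the FW is not given here, so the following is
only a sketch of what a packed priority-to-queue array could look like.
Everything in it is an assumption for illustration: the power-of-two
entry width, the 32-bit word size, the low-bits-first ordering, and the
names bits_per_entry and pack_map.

```python
WORD_BITS = 32  # assumption: the map is assembled from 32-bit words


def bits_per_entry(num_vqs: int) -> int:
    """Smallest power-of-two bit width able to hold a queue index."""
    need = max(1, (num_vqs - 1).bit_length())
    width = 1
    while width < need:
        width *= 2
    return width


def pack_map(prio_to_vq: list, num_vqs: int) -> list:
    """Pack one queue index per priority into a list of 32-bit words."""
    width = bits_per_entry(num_vqs)
    per_word = WORD_BITS // width
    words = [0] * ((len(prio_to_vq) + per_word - 1) // per_word)
    for prio, vq in enumerate(prio_to_vq):
        if not 0 <= vq < num_vqs:
            raise ValueError("queue index out of range")
        # Entry for this priority lives in word prio // per_word,
        # at bit offset (prio % per_word) * width.
        words[prio // per_word] |= vq << ((prio % per_word) * width)
    return words


# Four priorities mapped onto four virtual queues, 2 bits per entry.
print([hex(w) for w in pack_map([0, 1, 2, 3], num_vqs=4)])
```

The point of the sketch is that both the entry width and the number of
words follow from the two counts, which is why the map length has to be
computed rather than hard-coded.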

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: calculate PRIO map len and check mailbox size
Jakub Kicinski [Mon, 19 Nov 2018 23:21:47 +0000 (15:21 -0800)]
nfp: abm: calculate PRIO map len and check mailbox size

In preparation for PRIO offload calculate how long the prio map
for FW will be and make sure the configuration can be performed
via the vNIC mailbox.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: cls_u32: add res to offload information
Jakub Kicinski [Mon, 19 Nov 2018 23:21:46 +0000 (15:21 -0800)]
net: sched: cls_u32: add res to offload information

In case of egress offloads the class/flowid assigned by the filter
may be very important for offloaded Qdisc selection.  Provide this
info to drivers.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: add GRED offload
Jakub Kicinski [Mon, 19 Nov 2018 23:21:45 +0000 (15:21 -0800)]
nfp: abm: add GRED offload

Add support for GRED offload.  It behaves much like RED, but
can apply different parameters to different bands.  GRED operates
pretty much exactly like our HW/FW with a single FIFO and different
RED state instances.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: wrap RED parameters in bands
Jakub Kicinski [Mon, 19 Nov 2018 23:21:44 +0000 (15:21 -0800)]
nfp: abm: wrap RED parameters in bands

Wrap RED parameters and stats into a structure, and a 1-element
array.  Upcoming GRED offload will add the support for more bands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: gred: support reporting stats from offloads
Jakub Kicinski [Mon, 19 Nov 2018 23:21:43 +0000 (15:21 -0800)]
net: sched: gred: support reporting stats from offloads

Allow drivers which offload GRED to report back statistics.  Since
a lot of GRED stats are fairly ad hoc in nature, pass to drivers the
standard struct gnet_stats_basic/gnet_stats_queue pairs, and
untangle the values in the core.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: gred: add basic Qdisc offload
Jakub Kicinski [Mon, 19 Nov 2018 23:21:42 +0000 (15:21 -0800)]
net: sched: gred: add basic Qdisc offload

Add basic offload for the GRED Qdisc.  Inform the drivers any
time Qdisc or virtual queue configuration changes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: add up bands for sto/non-sto stats
Jakub Kicinski [Mon, 19 Nov 2018 23:21:41 +0000 (15:21 -0800)]
nfp: abm: add up bands for sto/non-sto stats

Add up stats for all bands for the extra ethtool statistics.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: switch to extended stats for reading packet/byte counts
Jakub Kicinski [Mon, 19 Nov 2018 23:21:40 +0000 (15:21 -0800)]
nfp: abm: switch to extended stats for reading packet/byte counts

In PRIO-enabled FW read the statistics from per-band symbol, rather
than from the standard per-PCIe-queue counters.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: size threshold table to account for bands
Jakub Kicinski [Mon, 19 Nov 2018 23:21:39 +0000 (15:21 -0800)]
nfp: abm: size threshold table to account for bands

Make sure the threshold table is large enough to hold information
for all bands.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: pass band parameter to functions
Jakub Kicinski [Mon, 19 Nov 2018 23:21:38 +0000 (15:21 -0800)]
nfp: abm: pass band parameter to functions

In preparation for per-band RED offload, pass the band parameter to
functions.  For now it will always be 0.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonfp: abm: map per-band symbols
Jakub Kicinski [Mon, 19 Nov 2018 23:21:37 +0000 (15:21 -0800)]
nfp: abm: map per-band symbols

In preparation for multi-band RED offload, if FW is capable, map
the extended symbols which will allow us to set per-band parameters
and read stats.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hns3: add common validation in hclge_dcb
Yunsheng Lin [Mon, 19 Nov 2018 13:02:15 +0000 (21:02 +0800)]
net: hns3: add common validation in hclge_dcb

Before setting tm related configuration to hardware, driver
needs to check the configuration provided by user is valid.
Currently hclge_ieee_setets and hclge_setup_tc both implement
their own checking, which has a lot in common.

This patch adds hclge_dcb_common_validate to do the common
checking. The checking in hclge_tm_prio_tc_info_update
and hclge_tm_schd_info_update is unnecessary now, so change
the return type to void, which removes the need to do error
handling when one of the checks fails.

Also, ets->prio_tc is indexed by user prio and ets->tc_tsa is
indexed by tc num, so this patch changes them to use different
indexes.
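
The indexing distinction can be illustrated with a small model. This is
not the driver's validation: the function name validate_ets, the set of
accepted TSA values and the specific checks are assumptions chosen only
to show that prio_tc is walked by user priority while tc_tsa is walked
by TC number.

```python
KNOWN_TSA = {"strict", "ets"}  # illustrative values only


def validate_ets(prio_tc: list, tc_tsa: list, num_tc: int) -> bool:
    """Check a priority-to-TC map and a per-TC algorithm table."""
    # prio_tc has one entry per user priority: index it by priority.
    for prio in range(len(prio_tc)):
        if prio_tc[prio] >= num_tc:
            return False
    # tc_tsa has one entry per traffic class: index it by TC number.
    for tc in range(num_tc):
        if tc_tsa[tc] not in KNOWN_TSA:
            return False
    return True


prio_tc = [0, 0, 1, 1, 2, 2, 3, 3]           # eight user priorities
tc_tsa = ["strict", "ets", "ets", "ets"] + ["strict"] * 4
print(validate_ets(prio_tc, tc_tsa, num_tc=4))
```

Using one loop variable for both tables would read tc_tsa with a
priority number, which is the kind of mix-up the separate indexes avoid.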

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'selftests-Add-tests-for-VXLAN-at-an-802-1d-bridge'
David S. Miller [Tue, 20 Nov 2018 01:59:45 +0000 (17:59 -0800)]
Merge branch 'selftests-Add-tests-for-VXLAN-at-an-802-1d-bridge'

Ido Schimmel says:

====================
selftests: Add tests for VXLAN at an 802.1d bridge

Petr says:

This patchset adds several tests for VXLAN attached to an 802.1d bridge
and fixes a related bug.

Patch #1 fixes a bug in propagating SKB already-forwarded marks
over veth to bridges, where they are irrelevant. This bug causes the
vxlan_bridge_1d test suite from this patchset to fail as the packets
aren't forwarded by br2.

In patches #2 and #3, lib.sh is extended to support network namespaces.
The use of namespaces is necessitated by VXLAN, which allows only one
VXLAN device with a given VNI per namespace. Thus to host full topology
on a single box for selftests, the "remote" endpoints need to be in
namespaces.

In patches #4-#6, lib.sh is extended in other ways to facilitate the
following patches.

In patches #7-#15, first the skeleton, and later the generic tests
themselves are added.

Patch #16 then adds another test that serves as a wrapper around the
previous one, and runs it with a non-default port number.

Patches #17 and #18 add mlxsw-specific tests. About those, Ido writes:

The first test creates various configurations with regards to the VxLAN
and bridge devices and makes sure the driver correctly forbids
unsupported configuration and permits supported ones. It also verifies
that the driver correctly sets the offload indication on FDB entries and
the local route used for VxLAN decapsulation.

The second test verifies that the driver correctly configures the singly
linked list used to flood BUM traffic and that traffic is flooded as
expected.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add a test for VxLAN flooding
Ido Schimmel [Mon, 19 Nov 2018 16:11:27 +0000 (16:11 +0000)]
selftests: mlxsw: Add a test for VxLAN flooding

The device stores flood records in a singly linked list where each
record stores up to three IPv4 addresses of remote VTEPs. The test
verifies that packets are correctly flooded in various cases such as
deletion of a record in the middle of the list.
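
The data structure described above can be modelled roughly as follows.
This is a sketch, not the device's or the driver's implementation: the
names FloodRecord, add_vtep, del_vtep and flood_targets are
hypothetical, and unlinking a record once it becomes empty is an
assumption about how deletion is handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

RECORD_CAPACITY = 3  # up to three IPv4 addresses per record


@dataclass
class FloodRecord:
    addrs: List[str] = field(default_factory=list)
    next: Optional["FloodRecord"] = None


def add_vtep(head: Optional[FloodRecord], addr: str) -> FloodRecord:
    """Store addr in the first record with room, else link a new tail record."""
    if head is None:
        return FloodRecord([addr])
    rec = head
    while True:
        if len(rec.addrs) < RECORD_CAPACITY:
            rec.addrs.append(addr)
            return head
        if rec.next is None:
            rec.next = FloodRecord([addr])
            return head
        rec = rec.next


def del_vtep(head: Optional[FloodRecord], addr: str) -> Optional[FloodRecord]:
    """Remove addr; unlink its record if that leaves the record empty."""
    prev, rec = None, head
    while rec is not None:
        if addr in rec.addrs:
            rec.addrs.remove(addr)
            if not rec.addrs:
                if prev is None:
                    return rec.next
                prev.next = rec.next
            return head
        prev, rec = rec, rec.next
    return head


def flood_targets(head: Optional[FloodRecord]) -> List[str]:
    """All remote VTEP addresses, in list order."""
    out = []
    rec = head
    while rec is not None:
        out.extend(rec.addrs)
        rec = rec.next
    return out


# Seven VTEPs fill three records; emptying the middle record unlinks it.
head = None
for i in range(1, 8):
    head = add_vtep(head, f"198.51.100.{i}")
for i in (4, 5, 6):
    head = del_vtep(head, f"198.51.100.{i}")
print(flood_targets(head))
```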

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add a test for VxLAN configuration
Ido Schimmel [Mon, 19 Nov 2018 16:11:26 +0000 (16:11 +0000)]
selftests: mlxsw: Add a test for VxLAN configuration

Test various aspects of VxLAN offloading which are specific to mlxsw,
such as sanitization of invalid configurations and offload indication.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: forwarding: vxlan_bridge_1d_port_8472: New test
Petr Machata [Mon, 19 Nov 2018 16:11:25 +0000 (16:11 +0000)]
selftests: forwarding: vxlan_bridge_1d_port_8472: New test

This simple wrapper reruns the VXLAN ping test with a port number of
8472.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>