openwrt/staging/blogic.git
Maor Gottlieb [Wed, 28 Aug 2019 12:10:54 +0000 (15:10 +0300)]
net/mlx5: Add devlink flow_steering_mode parameter

Add new parameter (flow_steering_mode) to control the flow steering
mode of the driver.
Two modes are supported:
1. DMFS - Device managed flow steering
2. SMFS - Software/Driver managed flow steering.

In the DMFS mode, the HW steering entities are created through the
FW. In the SMFS mode these entities are created through the driver
directly.

The driver will use the devlink steering mode only if the steering
domain supports it. For now, SMFS manages only the switchdev eswitch
steering domain.
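
As a rough illustration (a Python model, not driver code), the mode
selection described above can be sketched as follows. The domain names
and the effective_mode() helper are hypothetical; only the rule that
SMFS applies to the switchdev eswitch domain comes from this commit.

```python
from enum import Enum


class SteeringMode(Enum):
    DMFS = "dmfs"  # device (firmware) managed flow steering
    SMFS = "smfs"  # software/driver managed flow steering


# Hypothetical domain names. Per the commit message, only the switchdev
# eswitch steering domain can be managed by SMFS for now.
SMFS_CAPABLE_DOMAINS = {"switchdev_eswitch"}


def effective_mode(requested: SteeringMode, domain: str) -> SteeringMode:
    """Return the mode a steering domain actually ends up using."""
    if requested is SteeringMode.SMFS and domain in SMFS_CAPABLE_DOMAINS:
        return SteeringMode.SMFS
    # Every other domain keeps creating its entities through the firmware.
    return SteeringMode.DMFS
```

In this model, requesting "smfs" changes nothing for a domain outside
the capable set; it silently stays on DMFS.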

User command examples:
- Set SMFS flow steering mode::

    $ devlink dev param set pci/0000:06:00.0 name flow_steering_mode value "smfs" cmode runtime

- Read device flow steering mode::

    $ devlink dev param show pci/0000:06:00.0 name flow_steering_mode
      pci/0000:06:00.0:
      name flow_steering_mode type driver-specific
      values:
         cmode runtime value smfs

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Sun, 18 Aug 2019 16:18:11 +0000 (19:18 +0300)]
net/mlx5: Add support to use SMFS in switchdev mode

If the flow steering mode of the driver is SMFS (Software Managed
Flow Steering), use the DR (SW steering) API to create the steering
objects.

In addition, add a call to set the peer namespace when switchdev gets
a devcom pair event. This is required to support VF LAG in SMFS.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Sun, 18 Aug 2019 16:15:22 +0000 (19:15 +0300)]
net/mlx5: Add API to set the namespace steering mode

Add an API to set the flow steering root namespace mode.
The new mode must be set before any steering operation
is executed on the namespace.
This API is going to be used by steering users such as switchdev.
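
The ordering requirement can be illustrated with a small Python model.
This is only a sketch: the RootNamespace class is hypothetical, and
rejecting a late mode change with an exception is an assumption of the
model, since the commit message states the requirement but not how a
violation is handled.

```python
class RootNamespace:
    """Toy stand-in for a flow steering root namespace (not the kernel struct)."""

    def __init__(self, mode="dmfs"):
        self.mode = mode
        self.tables = []  # steering objects created so far

    def set_mode(self, mode):
        # Assumption: the model refuses a switch once the namespace is in use.
        if self.tables:
            raise RuntimeError("steering mode must be set before first use")
        self.mode = mode

    def create_table(self, name):
        # Every object is created with whatever mode is current at that time.
        entry = (name, self.mode)
        self.tables.append(entry)
        return entry
```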

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Tue, 20 Aug 2019 07:06:48 +0000 (10:06 +0300)]
net/mlx5: Add direct rule fs_cmd implementation

Add support to create flow steering objects
via the direct rule API (SW steering).
A new layer, fs_dr, is added. This layer translates the commands that
fs_core sends to the FW into direct rule API calls. If direct
rule does not support a feature, -EOPNOTSUPP is
returned.
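
The idea of a command layer with interchangeable backends can be
sketched in Python as below. The class and method names are invented
for illustration and do not mirror the kernel's fs_cmd structures; the
only behaviour taken from this commit is that an unsupported feature
yields -EOPNOTSUPP.

```python
import errno


class FwCommands:
    """Stand-in for the firmware-backed command set (DMFS)."""

    def create_flow_table(self, level):
        return ("fw_table", level)

    def create_flow_group(self, table):
        return ("fw_group", table)


class DrCommands:
    """Stand-in for a direct rule backend exposing the same entry points."""

    def create_flow_table(self, level):
        return ("dr_table", level)

    # create_flow_group is deliberately absent to model an unsupported feature.


def run_command(cmds, name, *args):
    """Dispatch a core command to whichever backend the namespace selected."""
    handler = getattr(cmds, name, None)
    if handler is None:
        return -errno.EOPNOTSUPP
    return handler(*args)
```

The caller does not need to know which backend is active; it only has
to handle the error code.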

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Tue, 20 Aug 2019 09:28:03 +0000 (12:28 +0300)]
net/mlx5: DR, Add CONFIG_MLX5_SW_STEERING for software steering support

Add new mlx5 Kconfig flag to allow selecting software steering
support and compile all the steering files only if the flag is
selected.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Tue, 20 Aug 2019 08:33:40 +0000 (11:33 +0300)]
net/mlx5: DR, Expose APIs for direct rule managing

Expose APIs for direct rule managing to increase insertion rate by
bypassing the firmware.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Tue, 20 Aug 2019 06:41:25 +0000 (09:41 +0300)]
net/mlx5: DR, Add required FW steering functionality

SW steering is capable of doing many steering functionalities
but there are still some functionalities which are not exposed
to upper layers and therefore performed by the FW.

This is the support for recalculating checksum using a hairpin QP.
The recalculation is required after a modify TTL action which skips
the needed CS calculation in HW.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 10:28:35 +0000 (13:28 +0300)]
net/mlx5: DR, Expose steering rule functionality

Rules are the actual objects that tie matchers, header values and
actions. Each rule belongs to a matcher, which can hold multiple rules
sharing the same mask. Each rule is a specific set of values and
actions.
When a packet reaches a matcher it is matched against the
matcher's rules. If the packet matches a rule, that rule's actions are
executed. Each rule object contains a set of STEs, where each STE is a
definition of match values and actions defined by the rule.
This file handles the rule operations and processing.
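
A minimal Python model of the matcher/rule relationship, assuming
header fields are plain integers and a mask is a per-field bit mask.
The Matcher class and the field names are illustrative only; STEs are
not modelled here.

```python
class Matcher:
    """Holds one mask that is shared by all of its rules."""

    def __init__(self, mask):
        self.mask = mask   # field name -> bit mask
        self.rules = []    # list of (masked values, actions)

    def add_rule(self, values, actions):
        # Store the values already masked so lookups compare like with like.
        masked = {f: values.get(f, 0) & m for f, m in self.mask.items()}
        self.rules.append((masked, actions))

    def lookup(self, packet):
        """Return the actions of the first matching rule, or None on a miss."""
        key = {f: packet.get(f, 0) & m for f, m in self.mask.items()}
        for values, actions in self.rules:
            if values == key:
                return actions
        return None
```

For example, with a mask covering a /24 destination prefix and the IP
protocol, two rules under the same matcher differ only in their values:

```python
m = Matcher({"dst_ip": 0xFFFFFF00, "ip_proto": 0xFF})
m.add_rule({"dst_ip": 0x0A000100, "ip_proto": 6}, ["count", "forward"])
m.add_rule({"dst_ip": 0x0A000200, "ip_proto": 17}, ["drop"])
```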

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 16:16:52 +0000 (19:16 +0300)]
net/mlx5: DR, Expose steering action functionality

On rule creation a set of actions can be provided. The actions describe
what to do with the packet in case of a match. When several actions are
provided, they are executed in order.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 10:21:19 +0000 (13:21 +0300)]
net/mlx5: DR, Expose steering matcher functionality

A matcher defines which packet fields are matched when a packet arrives.
A matcher is part of a table and can contain one or more rules, where
each rule defines specific values for the matcher's mask definition.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 08:31:25 +0000 (11:31 +0300)]
net/mlx5: DR, Expose steering table functionality

Tables are objects which are used for storing matchers. Each table
belongs to a domain and is defined by the domain type. When a packet
reaches the table it is processed by each of its matchers until a
successful match. Tables can hold multiple matchers ordered by matcher
priority. Each table has a level.
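
The lookup order within a table can be sketched as follows. This is a
Python model under two assumptions that the commit message does not
state: a lower number means a higher priority, and a matcher is
represented by a callable that returns None on a miss.

```python
import bisect


class Table:
    """Toy steering table that keeps its matchers sorted by priority."""

    def __init__(self, level):
        self.level = level
        self._matchers = []  # list of (priority, seq, matcher)
        self._seq = 0        # tie-breaker that preserves insertion order

    def add_matcher(self, priority, matcher):
        # seq is unique, so tuple comparison never reaches the callable.
        bisect.insort(self._matchers, (priority, self._seq, matcher))
        self._seq += 1

    def process(self, packet):
        """Try each matcher in priority order and stop at the first match."""
        for _prio, _seq, matcher in self._matchers:
            result = matcher(packet)
            if result is not None:
                return result
        return None  # miss: no matcher in this table matched
```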

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 08:30:56 +0000 (11:30 +0300)]
net/mlx5: DR, Expose steering domain functionality

Domain is the frame for all of the dr (direct rule) objects.
There are different domain types which also affect the objects under that
domain. Each domain can hold multiple tables which can hold multiple
matchers and so on. This means that all of the dr (direct rule) objects
exist under a specific domain. The domain object also holds the
resources needed for other objects such as memory management and
communication with the device.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 11:14:55 +0000 (14:14 +0300)]
net/mlx5: DR, Add Steering entry (STE) utilities

Steering Entry (STE) object is the basic building block of the steering
map. There are several types of STEs. Each rule can be constructed of
multiple STEs. Each STE dictates which fields of the packet's header are
being matched as well as the information about the next step in the map (hit
and miss pointers). The hardware gets a packet and tries to match it
against the STEs, going to either the hit pointer or the miss pointer.
This file handles the STE operations.
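
The hit/miss traversal can be pictured with a small Python model. The
Ste class below is hypothetical and far simpler than a hardware STE: it
compares one masked field and carries an optional verdict that ends the
walk.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ste:
    """Toy steering entry: one masked field compare plus two pointers."""
    field: str
    mask: int
    value: int
    hit: Optional["Ste"] = None     # next entry when the compare matches
    miss: Optional["Ste"] = None    # next entry when it does not
    verdict: Optional[str] = None   # set on entries that end the walk


def walk(ste, packet):
    """Follow hit/miss pointers until an entry carrying a verdict is reached."""
    while ste is not None:
        if ste.verdict is not None:
            return ste.verdict
        matched = (packet.get(ste.field, 0) & ste.mask) == ste.value
        ste = ste.hit if matched else ste.miss
    return "default-miss"
```

A rule matching on two header layers would then be built from two
chained entries, which is one way to read "each rule can be constructed
of multiple STEs".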

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Tue, 20 Aug 2019 06:39:34 +0000 (09:39 +0300)]
net/mlx5: DR, Expose an internal API to issue RDMA operations

Inserting or deleting a rule is done by RDMA read/write operations to SW
ICM device memory. This file provides the support for executing these
operations. It includes allocating the needed resources and providing an
API for writing steering entries to the memory.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Mon, 19 Aug 2019 10:46:40 +0000 (13:46 +0300)]
net/mlx5: DR, ICM pool memory allocator

ICM device memory is used for writing steering rules (STEs) to the NIC.
An ICM memory pool allocator was implemented to manage the required
memory. The pool consists of buckets, a bucket per chunk size.
Once a bucket is empty we cut a row of memory from the latest
allocated MR; if the MR size is not sufficient we allocate a new MR.
The HW design requires each chunk's memory address to be aligned to the
chunk size. This is the reason for managing the MR with a row size that
ensures memory alignment.
The current design is greedy in memory but provides quick allocation times
in steady state.
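
A simplified Python model of such a pool is sketched below. Sizes,
names and the address numbering are assumptions made for the sketch;
the points it tries to capture are one bucket per chunk size, rows cut
from the latest MR, and chunk addresses aligned to the chunk size.

```python
class IcmPool:
    """Toy ICM pool: one free list (bucket) per chunk size, rows cut from MRs."""

    def __init__(self, row_size=1 << 20, mr_rows=4):
        self.row_size = row_size          # power of two, >= largest chunk
        self.mr_size = row_size * mr_rows
        self.next_mr_base = 0             # pretend device addresses start at 0
        self.mr_cursor = self.mr_end = 0  # no MR allocated yet
        self.mr_count = 0
        self.buckets = {}                 # chunk size -> free chunk addresses

    def _new_mr(self):
        # MR bases are multiples of mr_size, hence also of row_size.
        self.mr_cursor = self.next_mr_base
        self.mr_end = self.mr_cursor + self.mr_size
        self.next_mr_base = self.mr_end
        self.mr_count += 1

    def _refill(self, chunk_size):
        if self.mr_cursor + self.row_size > self.mr_end:
            self._new_mr()                # latest MR has no full row left
        row = self.mr_cursor
        self.mr_cursor += self.row_size
        # Splitting a row-aligned row into equal power-of-two chunks keeps
        # every chunk address aligned to the chunk size.
        self.buckets[chunk_size] = list(range(row, row + self.row_size, chunk_size))

    def alloc(self, chunk_size):
        if chunk_size <= 0 or chunk_size & (chunk_size - 1) or chunk_size > self.row_size:
            raise ValueError("chunk size must be a power of two <= row size")
        if not self.buckets.get(chunk_size):
            self._refill(chunk_size)
        return self.buckets[chunk_size].pop(0)

    def free(self, chunk_size, addr):
        self.buckets[chunk_size].append(addr)
```

The model is greedy in the same sense as the text: a whole row is
dedicated to a single chunk size even if only one chunk is requested,
but once a bucket is filled an allocation is a single list pop.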

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Tue, 20 Aug 2019 08:19:02 +0000 (11:19 +0300)]
net/mlx5: DR, Add direct rule command utilities

Add direct rule command utilities, which consist of all the FW
commands that are executed to provide the SW steering functionality.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Tue, 20 Aug 2019 08:57:28 +0000 (11:57 +0300)]
net/mlx5: DR, Add the internal direct rule types definitions

Add the internal header file that contains various type
definitions that will be used in coming patches as well as
the internal function declarations.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Thu, 15 Aug 2019 10:54:17 +0000 (13:54 +0300)]
net/mlx5: Add flow steering actions to fs_cmd shim layer

Add flow steering actions: modify header and packet reformat
to the fs_cmd shim layer. This allows each namespace to define
possibly different functionality for alloc/dealloc action commands.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Saeed Mahameed [Mon, 2 Sep 2019 06:47:09 +0000 (23:47 -0700)]
Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

Merge mlx5-next patches needed for upcoming mlx5 software steering.

1) Alex adds HW bits and definitions required for SW steering
2) Ariel moves device memory management to mlx5_core (From mlx5_ib)
3) Maor, Cleanups and fixups for eswitch mode and RoCE
4) Mark, Set only stag for match untagged packets

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Mark Bloch [Thu, 29 Aug 2019 23:42:38 +0000 (23:42 +0000)]
net/mlx5: Set only stag for match untagged packets

cvlan_tag enabled in match criteria and disabled in
match value means both S & C tags don't exist (untagged of both).

Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Thu, 29 Aug 2019 23:42:36 +0000 (23:42 +0000)]
net/mlx5: Add stub for mlx5_eswitch_mode

Return MLX5_ESWITCH_NONE when CONFIG_MLX5_ESWITCH
is not selected.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Maor Gottlieb [Thu, 29 Aug 2019 23:42:34 +0000 (23:42 +0000)]
net/mlx5: Avoid disabling RoCE when uninitialized

Move the check of whether RoCE steering is initialized into the
disable RoCE function. This ensures that we disable
RoCE only if we succeeded in enabling it before.

Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Alex Vesker [Thu, 29 Aug 2019 23:42:32 +0000 (23:42 +0000)]
net/mlx5: Add HW bits and definitions required for SW steering

Add the required Software Steering hardware definitions and
bits to mlx5_ifc.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Yevgeny Klitenik <kliten@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Ariel Levkovich [Thu, 29 Aug 2019 23:42:30 +0000 (23:42 +0000)]
net/mlx5: Move device memory management to mlx5_core

Move the device memory allocation and deallocation commands
for SW ICM memory to mlx5_core to expose this API to all
mlx5_core users.

This comes as preparation for supporting SW steering in kernel
where it will be required to allocate and register device
memory for direct rule insertion.

In addition, an API to register this device memory for future
remote access operations is introduced using the create_mkey
commands.

Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
David S. Miller [Sun, 1 Sep 2019 19:16:38 +0000 (12:16 -0700)]
Merge branch 'net-dsa-mv88e6xxx-centralize-SERDES-IRQ-handling'

Vivien Didelot says:

====================
net: dsa: mv88e6xxx: centralize SERDES IRQ handling

Following Marek's work on the abstraction of the SERDES lanes mapping, this
series trades the .serdes_irq_setup and .serdes_irq_free callbacks for new
.serdes_irq_mapping, .serdes_irq_enable and .serdes_irq_status operations.

This has the benefit of limiting the various SERDES implementations to simple
register accesses only, centralizing the IRQ handling and mutex locking logic,
and reducing boilerplate in the driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:36 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: centralize SERDES IRQ handling

The .serdes_irq_setup implementations all follow the same steps: get the
SERDES lane, get the IRQ mapping, request the IRQ, then enable it. So do
the .serdes_irq_free implementations: get the SERDES lane, disable
the IRQ, then free it.

This patch removes these operations in favor of generic functions.
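
The shape of such generic functions can be sketched in Python as below.
This is a model, not the driver code: the operation names follow the
series, but their signatures, the FakeChipOps class and the dict that
stands in for requested IRQs are assumptions. Following the convention
described elsewhere in this series, 0 means "no lane" or "no IRQ".

```python
class FakeChipOps:
    """Hypothetical per-chip operations; real ones would only touch registers."""

    def __init__(self, lanes, irqs):
        self.lanes = lanes    # port -> lane, 0 meaning "no lane"
        self.irqs = irqs      # port -> IRQ number, 0 meaning "no IRQ"
        self.enabled = set()  # (port, lane) pairs with the IRQ enabled

    def serdes_get_lane(self, port):
        return self.lanes.get(port, 0)

    def serdes_irq_mapping(self, port):
        return self.irqs.get(port, 0)

    def serdes_irq_enable(self, port, lane, enable):
        if enable:
            self.enabled.add((port, lane))
        else:
            self.enabled.discard((port, lane))


def serdes_irq_request(ops, port, requested):
    """Generic setup: get lane, get mapping, request the IRQ, enable it."""
    lane = ops.serdes_get_lane(port)
    if not lane:
        return 0
    irq = ops.serdes_irq_mapping(port)
    if not irq:
        return 0
    requested[port] = irq  # stands in for requesting the IRQ
    ops.serdes_irq_enable(port, lane, True)
    return irq


def serdes_irq_free(ops, port, requested):
    """Generic teardown: get lane, disable the IRQ, then free it."""
    lane = ops.serdes_get_lane(port)
    if not lane or port not in requested:
        return
    ops.serdes_irq_enable(port, lane, False)
    del requested[port]
```

The sequencing lives in one place, and each chip only has to supply the
three small operations.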

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:35 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: introduce .serdes_irq_status

Introduce a new .serdes_irq_status operation to prepare the abstraction
of IRQ thread from the SERDES IRQ setup code.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:34 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: introduce .serdes_irq_enable

Introduce a new .serdes_irq_enable operation to prepare the abstraction
of IRQ enabling from the SERDES IRQ setup code.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:33 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: pass lane to .serdes_power

Now the first step of all .serdes_power implementations is getting
the lane mapping. Since we have an operation for that, call it in
the wrapper and pass the lane down to the .serdes_power operation.

This also avoids querying the SERDES lane twice in
mv88e6xxx_port_set_cmode.

At the same time provide mv88e6xxx_serdes_power_{up,down} helpers
and prefer up/down instead of on/off as in the documentation.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:32 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: merge mv88e6352_serdes_power_set

The mv88e6352_serdes_power_set helper is only used in one place, in
mv88e6352_serdes_power. Keep it simple and merge the two functions
together.

Use mv88e6xxx_serdes_get_lane instead of mv88e6352_port_has_serdes
to avoid moving code. No functional changes.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:31 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: implement mv88e6352_serdes_get_lane

Even though 88E6352 has no dedicated lane for SERDES interfaces, it
uses code similar to the other .serdes_get_lane implementations to
check the port's CMODE and ensure that SERDES operations are doable.

For consistency, implement mv88e6352_serdes_get_lane for the 88E6352
and similar switches which simply returns an unused 0xff lane address.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:30 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: simplify .serdes_get_lane

Because the mapping between a SERDES interface and its lane is static,
we don't actually need to stick with negative error codes and can
simply return 0 if there is no lane, just like the IRQ mapping.

This way we can keep a simple and intuitive API using unsigned lane
numbers while simplifying the implementations with single return
statements. Last but not least, fix the reverse christmas tree in
mv88e6390x_serdes_get_lane.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:29 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: introduce .serdes_irq_mapping

Introduce a new .serdes_irq_mapping operation to prepare the
abstraction of IRQ mapping from the SERDES IRQ setup code.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:28 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: fix SERDES IRQ mapping

The current mv88e6xxx SERDES code checks for negative error code from
irq_find_mapping, while this function returns an unsigned integer. This
patch removes this dead code and simply returns 0 if no IRQ is found.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 31 Aug 2019 20:18:27 +0000 (16:18 -0400)]
net: dsa: mv88e6xxx: check errors in mv88e6352_serdes_irq_link

The mv88e6352_serdes_irq_link helper is not checking for any error that
may occur during hardware accesses. Worse, the "up" boolean is set from
the potentially unused "status" variable, if read operations failed.

As done in mv88e6390_serdes_irq_link_sgmii, return right away and do
not call dsa_port_phylink_mac_change if an error occurred.

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Sat, 31 Aug 2019 12:29:11 +0000 (12:29 +0000)]
net: hns3: remove set but not used variable 'qos'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: In function 'hclge_restore_vlan_table':
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:8016:18: warning:
 variable 'qos' set but not used [-Wunused-but-set-variable]

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 70a214903da9 ("net: hns3: reduce the parameters of some functions")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Sat, 31 Aug 2019 07:29:49 +0000 (08:29 +0100)]
net: hns3: remove redundant assignment to pointer reg_info

Pointer reg_info is being initialized with a value that is never read and
is being re-assigned a little later on. The assignment is redundant
and hence can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Sun, 1 Sep 2019 15:52:05 +0000 (16:52 +0100)]
netlabel: remove redundant assignment to pointer iter

Pointer iter is being initialized with a value that is never read and
is being re-assigned a little later on. The assignment is redundant
and hence can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 1 Sep 2019 08:42:44 +0000 (10:42 +0200)]
r8169: don't set bit RxVlan on RTL8125

RTL8125 uses a different register for VLAN offloading config,
therefore don't set bit RxVlan.

Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Wei [Fri, 30 Aug 2019 20:50:51 +0000 (20:50 +0000)]
net/ncsi: add response handlers for PLDM over NC-SI

This patch adds handlers for PLDM over NC-SI command response.

This enables the NC-SI driver to recognize the packet type so the responses
don't get dropped as an unknown packet type.

PLDM over NC-SI packets are not handled in the kernel driver for now, but
can be passed back to user space via Netlink for further handling.

Signed-off-by: Ben Wei <benwei@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 1 Sep 2019 06:46:13 +0000 (23:46 -0700)]
Merge branch 'Minor-cleanup-in-devlink'

Parav Pandit says:

====================
Minor cleanup in devlink

Two minor cleanups in devlink.

Patch-1 Explicitly defines devlink port index as unsigned int
Patch-2 Uses switch-case to handle different port flavours attributes
====================

Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parav Pandit [Fri, 30 Aug 2019 10:39:45 +0000 (05:39 -0500)]
devlink: Use switch-case instead of if-else

Make the code more readable with switch-case for various port flavours.

Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parav Pandit [Fri, 30 Aug 2019 10:39:44 +0000 (05:39 -0500)]
devlink: Make port index data type as unsigned int

Devlink port index attribute is returned to users as u32 through
netlink response.
Change index data type from 'unsigned' to 'unsigned int' to avoid
the checkpatch.pl warning below.

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
81: FILE: include/net/devlink.h:81:
+       unsigned index;

Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 1 Sep 2019 06:44:28 +0000 (23:44 -0700)]
Merge branch 'net-tls-add-socket-diag'

Davide Caratti says:

====================
net: tls: add socket diag

The current kernel does not provide any diagnostic tool, except
getsockopt(TCP_ULP), to know more about TCP sockets that have an upper
layer protocol (ULP) on top of them. This series extends the set of
information exported by INET_DIAG_INFO, to include data that are
specific to the ULP (and that might be meaningful for debug/testing
purposes).

patch 1/3 ensures that the control plane reads/updates ULP specific data
using RCU.

patch 2/3 extends INET_DIAG_INFO and allows knowing the ULP name for
each TCP socket that has done setsockopt(TCP_ULP) successfully.

patch 3/3 extends kTLS to let programs like 'ss' know the protocol
version and the cipher in use.

Changes since v2:
- remove unneeded #ifdef and fix reverse christmas tree in
  tls_get_info(), thanks to Jakub Kicinski

Changes since v1:
- don't worry about grace period when accessing ulp_ops, thanks to
  Jakub Kicinski and Eric Dumazet
- use rcu_dereference() to access ULP data in tls get_info(), and
  test against NULL value, thanks to Jakub Kicinski
- move RCU protected section inside tls get_info(), thanks to Jakub
  Kicinski

Changes since RFC:
- some coding style fixes, thanks to Jakub Kicinski
- add X_UNSPEC as lowest value of uAPI enums, thanks to Jakub Kicinski
- fix assignment of struct nlattr *start, thanks to Jakub Kicinski
- let tls dump RXCONF and TXCONF, suggested by Jakub Kicinski
- don't dump anything if TLS version or cipher are 0 (but still return a
  constant size in get_aux_size()), thanks to Boris Pismenny
- constify first argument of get_info() and get_size()
- use RCU to access ulp_ops, like it's done for ca_ops
- add patch 1/3, from Jakub Kicinski
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Fri, 30 Aug 2019 10:25:49 +0000 (12:25 +0200)]
net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diag

When an application configures kernel TLS on top of a TCP socket, it's
now possible for inet_diag_handler() to collect information regarding the
protocol version, the cipher type and TX / RX configuration, in case
INET_DIAG_INFO is requested.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Fri, 30 Aug 2019 10:25:48 +0000 (12:25 +0200)]
tcp: ulp: add functions to dump ulp-specific information

Currently, only getsockopt(TCP_ULP) can be invoked to know if a ULP is on
top of a TCP socket. Extend idiag_get_aux() and idiag_get_aux_size(),
introduced by commit b37e88407c1d ("inet_diag: allow protocols to provide
additional data"), to report the ULP name and other information that can
be made available by the ULP through optional functions.

Users having CAP_NET_ADMIN privileges will then be able to retrieve this
information through inet_diag_handler, if they specify INET_DIAG_INFO in
the request.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 30 Aug 2019 10:25:47 +0000 (12:25 +0200)]
net/tls: use RCU protection on icsk->icsk_ulp_data

We need to make sure context does not get freed while diag
code is interrogating it. Free struct tls_context with
kfree_rcu().

We add the __rcu annotation directly in icsk, and cast it
away in the datapath accessor. Presumably all ULPs will
do a similar thing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 31 Aug 2019 20:32:30 +0000 (13:32 -0700)]
Merge branch 'qed-Enhancements'

Sudarsana Reddy Kalluru says:

====================
qed*: Enhancements.

The patch series adds a couple of enhancements to qed/qede drivers.
  - Support for dumping the config id attributes via ethtool -w/W.
  - Support for dumping the GRC data of required memory regions using
    ethtool -w/W interfaces.

Patch (1) adds driver APIs for reading the config id attributes.
Patch (2) adds ethtool support for dumping the config id attributes.
Patch (3) adds support for configuring the GRC dump config flags.
Patch (4) adds ethtool support for dumping the GRC data.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 30 Aug 2019 07:42:06 +0000 (00:42 -0700)]
qede: Add support for dumping the grc data.

This patch adds driver support for configuring grc dump config flags, and
dumping the grc data via ethtool get/set-dump interfaces.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 30 Aug 2019 07:42:05 +0000 (00:42 -0700)]
qed: Add APIs for configuring grc dump config flags.

The patch adds driver support for configuring the grc dump config flags.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudarsana Reddy Kalluru [Fri, 30 Aug 2019 07:42:04 +0000 (00:42 -0700)]
qede: Add support for reading the config id attributes.

Add driver support for dumping the config id attributes via ethtool dump
interfaces.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Add APIs for reading config id attributes.
Sudarsana Reddy Kalluru [Fri, 30 Aug 2019 07:42:03 +0000 (00:42 -0700)]
qed: Add APIs for reading config id attributes.

The patch adds driver support for reading the config id attributes from the
NVM flash partition.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Dynamic-toggling-of-vlan_filtering-for-SJA1105-DSA'
David S. Miller [Sat, 31 Aug 2019 20:21:19 +0000 (13:21 -0700)]
Merge branch 'Dynamic-toggling-of-vlan_filtering-for-SJA1105-DSA'

Vladimir Oltean says:

====================
Dynamic toggling of vlan_filtering for SJA1105 DSA

This patchset addresses a limitation in dsa_8021q where this sequence of
commands was causing the switch to stop forwarding traffic:

  ip link add name br0 type bridge vlan_filtering 0
  ip link set dev swp2 master br0
  echo 1 > /sys/class/net/br0/bridge/vlan_filtering
  echo 0 > /sys/class/net/br0/bridge/vlan_filtering

The issue has to do with the VLAN table manipulations that dsa_8021q
does without notifying the bridge layer. The solution is to always
restore the VLANs that the bridge knows about, when disabling tagging.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: tag_8021q: Restore bridge VLANs when enabling vlan_filtering
Vladimir Oltean [Fri, 30 Aug 2019 00:53:25 +0000 (03:53 +0300)]
net: dsa: tag_8021q: Restore bridge VLANs when enabling vlan_filtering

The bridge core assumes that enabling/disabling vlan_filtering will
translate into the simple toggling of a flag for switchdev drivers.

That is clearly not the case for sja1105, which alters the VLAN table
and the pvids in order to obtain port separation in standalone mode.

There are 2 parts to the issue.

First, tag_8021q changes the pvid to a unique per-port rx_vid for frame
identification. But we need to disable tag_8021q when vlan_filtering
kicks in, and at that point, the VLAN configured as pvid will have to be
removed from the filtering table of the ports. With an invalid pvid, the
ports will drop all traffic.  Since the bridge will not call any vlan
operation through switchdev after enabling vlan_filtering, we need to
ensure we're in a functional state ourselves. Hence read the pvid that
the bridge is aware of, and program that into our ports.

Secondly, tag_8021q uses the 1024-3071 range privately in
vlan_filtering=0 mode. Had the user installed one of these VLANs during
a previous vlan_filtering=1 session, then upon the next tag_8021q
cleanup for vlan_filtering to kick in again, VLANs in that range will
get deleted unconditionally, hence breaking user expectation. So when
deleting the VLANs, check if the bridge had knowledge about them, and if
it did, re-apply the settings. Wrap this logic inside a
dsa_8021q_vid_apply helper function to reduce code duplication.
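
The second part of the fix can be sketched roughly as follows. This is a
minimal userspace illustration, not the driver code: struct
demo_port_vlans, demo_cleanup_private_vids() and the two boolean tables
are hypothetical, and only the 1024-3071 range is taken from the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PRIVATE_VID_FIRST	1024	/* range quoted in the commit message */
#define DEMO_PRIVATE_VID_LAST	3071
#define DEMO_VID_COUNT		4096

/* Hypothetical per-port bookkeeping, for illustration only. */
struct demo_port_vlans {
	bool bridge_knows[DEMO_VID_COUNT];	/* VLANs installed through the bridge */
	bool hw_table[DEMO_VID_COUNT];		/* VLANs currently programmed in hardware */
};

/*
 * Tear down the private range, then re-apply every VLAN in that range
 * that the bridge still knows about. Returns the number of VLANs restored.
 */
static int demo_cleanup_private_vids(struct demo_port_vlans *p)
{
	int restored = 0;

	assert(p != NULL);
	for (int vid = DEMO_PRIVATE_VID_FIRST; vid <= DEMO_PRIVATE_VID_LAST; vid++) {
		p->hw_table[vid] = false;		/* delete the private entry */
		if (p->bridge_knows[vid]) {
			p->hw_table[vid] = true;	/* restore the user's VLAN */
			restored++;
		}
	}
	return restored;
}
```

With a user VLAN 2000 known to the bridge, the cleanup removes the
private entries but leaves VLAN 2000 programmed in the hardware table.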

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: bridge: Populate the pvid flag in br_vlan_get_info
Vladimir Oltean [Fri, 30 Aug 2019 00:53:24 +0000 (03:53 +0300)]
net: bridge: Populate the pvid flag in br_vlan_get_info

Currently this simplified code snippet fails:

br_vlan_get_pvid(netdev, &pvid);
br_vlan_get_info(netdev, pvid, &vinfo);
ASSERT(vinfo.flags & BRIDGE_VLAN_INFO_PVID);

It is intuitive that the pvid of a netdevice should have the
BRIDGE_VLAN_INFO_PVID flag set.

However I can't seem to pinpoint a commit where this behavior was
introduced. It seems like it's been like that since forever.

At a first glance it would make more sense to just handle the
BRIDGE_VLAN_INFO_PVID flag in __vlan_add_flags. However, as Nikolay
explains:

  There are a few reasons why we don't do it, most importantly because
  we need to have only one visible pvid at any single time, even if it's
  stale - it must be just one. Right now that rule will not be violated
  by this change, but people will try using this flag and could see two
  pvids simultaneously. You can see that the pvid code is even using
  memory barriers to propagate the new value faster and everywhere the
  pvid is read only once.  That is the reason the flag is set
  dynamically when dumping entries, too.  A second (weaker) argument
  against would be given the above we don't want another way to do the
  same thing, specifically if it can provide us with two pvids (e.g. if
  walking the vlan list) or if it can provide us with a pvid different
  from the one set in the vg. [Obviously, I'm talking about RCU
  pvid/vlan use cases similar to the dumps.  The locked cases are fine.
  I would like to avoid explaining why this shouldn't be relied upon
  without locking]

So instead of introducing the above change and making sure of the pvid
uniqueness under RCU, simply dynamically populate the pvid flag in
br_vlan_get_info().
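
The idea of computing the flag at read time can be sketched as below.
This is a simplified userspace illustration under stated assumptions:
struct demo_port, struct demo_vlan, demo_vlan_get_info() and
DEMO_VLAN_INFO_PVID are hypothetical stand-ins, not the bridge code. The
pvid is kept in exactly one place, and the stored per-VLAN flags never
carry the pvid bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VLAN_INFO_PVID (1u << 1)	/* stand-in for BRIDGE_VLAN_INFO_PVID */

struct demo_vlan {
	uint16_t vid;
	uint16_t flags;
};

struct demo_port {
	uint16_t pvid;			/* single source of truth for the pvid */
	struct demo_vlan vlans[4];	/* stored flags never carry the pvid bit */
	size_t n_vlans;
};

/* Look up a VLAN and populate the pvid flag dynamically in the copy. */
static int demo_vlan_get_info(const struct demo_port *port, uint16_t vid,
			      struct demo_vlan *out)
{
	assert(port != NULL && out != NULL);
	for (size_t i = 0; i < port->n_vlans; i++) {
		if (port->vlans[i].vid != vid)
			continue;
		*out = port->vlans[i];
		if (vid == port->pvid)
			out->flags |= DEMO_VLAN_INFO_PVID;
		return 0;
	}
	return -1;	/* VLAN not found */
}
```

Because the flag is derived from the one pvid field on every read, a
reader walking the entries can never observe two VLANs marked as pvid.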

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'batadv-next-for-davem-20190830' of git://git.open-mesh.org/linux-merge
David S. Miller [Sat, 31 Aug 2019 20:15:19 +0000 (13:15 -0700)]
Merge tag 'batadv-next-for-davem-20190830' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This maintenance patchset includes the following patches:

 - Add Sven to the MAINTAINERS file, by Simon Wunderlich
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoudp: Remove unlikely() from IS_ERR*() condition
Denis Efremov [Thu, 29 Aug 2019 16:50:24 +0000 (19:50 +0300)]
udp: Remove unlikely() from IS_ERR*() condition

"unlikely(IS_ERR_OR_NULL(x))" is excessive. IS_ERR_OR_NULL() already uses
unlikely() internally.
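
For illustration, here is a userspace sketch of why the outer hint adds
nothing. The macros below are simplified stand-ins modelled on the kernel
helpers, and err_ptr(), is_err_or_null() and check_handle() are
hypothetical names for this example. It assumes a GCC-compatible compiler
for __builtin_expect().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros (assumptions). */
#define MAX_ERRNO	4095
#define unlikely(x)	__builtin_expect(!!(x), 0)
#define IS_ERR_VALUE(x)	unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)

/* Encode a negative errno value as a pointer, in the style of ERR_PTR(). */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

/* The branch hint already lives inside the helper. */
static inline bool is_err_or_null(const void *ptr)
{
	return unlikely(!ptr) || IS_ERR_VALUE((unsigned long)ptr);
}

/* Hypothetical caller: no extra unlikely() is needed around the check. */
static int check_handle(const void *handle)
{
	if (is_err_or_null(handle))
		return -1;
	return 0;
}
```

Wrapping the call in a second unlikely() would only repeat a hint the
compiler has already been given.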

Signed-off-by: Denis Efremov <efremov@linux.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx5e: Remove unlikely() from WARN*() condition
Denis Efremov [Thu, 29 Aug 2019 16:50:17 +0000 (19:50 +0300)]
net/mlx5e: Remove unlikely() from WARN*() condition

"unlikely(WARN_ON_ONCE(x))" is excessive. WARN_ON_ONCE() already uses
unlikely() internally.

Signed-off-by: Denis Efremov <efremov@linux.com>
Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: netdev@vger.kernel.org
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Fix compile error regression with CONFIG_BNXT_SRIOV not set.
Michael Chan [Fri, 30 Aug 2019 23:10:38 +0000 (19:10 -0400)]
bnxt_en: Fix compile error regression with CONFIG_BNXT_SRIOV not set.

Add a new function bnxt_get_registered_vfs() to handle the work
of getting the number of registered VFs under #ifdef CONFIG_BNXT_SRIOV.
The main code will call this function and will always work correctly
whether CONFIG_BNXT_SRIOV is set or not.
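
The pattern can be sketched as below. This is an illustration only:
struct demo_dev, demo_get_registered_vfs(), demo_can_reset_now() and
DEMO_CONFIG_SRIOV are hypothetical stand-ins, not the driver's actual
code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device structure; the real driver's layout differs. */
struct demo_dev {
	int registered_vfs;
};

/*
 * All config-dependent work sits inside one helper, so callers need no
 * #ifdef of their own. DEMO_CONFIG_SRIOV is a made-up stand-in for
 * CONFIG_BNXT_SRIOV.
 */
static int demo_get_registered_vfs(const struct demo_dev *dev)
{
	assert(dev != NULL);
#ifdef DEMO_CONFIG_SRIOV
	return dev->registered_vfs;
#else
	return 0;	/* no SR-IOV support compiled in: report zero VFs */
#endif
}

/* The caller compiles and behaves sensibly with or without the option. */
static int demo_can_reset_now(const struct demo_dev *dev)
{
	return demo_get_registered_vfs(dev) == 0;
}
```

Keeping the #ifdef inside the helper means the main code path never
references symbols that exist only when the option is enabled.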

Fixes: 230d1f0de754 ("bnxt_en: Handle firmware reset.")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Fixes-for-unlocked-cls-hardware-offload-API-refactoring'
David S. Miller [Fri, 30 Aug 2019 22:12:05 +0000 (15:12 -0700)]
Merge branch 'Fixes-for-unlocked-cls-hardware-offload-API-refactoring'

Vlad Buslov says:

====================
Fixes for unlocked cls hardware offload API refactoring

Two fixes for my "Refactor cls hardware offload API to support
rtnl-independent drivers" series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx5e: Move local var definition into ifdef block
Vlad Buslov [Thu, 29 Aug 2019 16:15:17 +0000 (19:15 +0300)]
net/mlx5e: Move local var definition into ifdef block

New local variable "struct flow_block_offload *f" was added to
mlx5e_setup_tc() in recent rtnl lock removal patches. The variable is used
in code that is only compiled when CONFIG_MLX5_ESWITCH is enabled. This
results in a compilation warning about an unused variable when CONFIG_MLX5_ESWITCH
is not set. Move the variable definition into eswitch-specific code block
from the beginning of mlx5e_setup_tc() function.
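
A reduced sketch of the same idea follows. The names demo_setup_tc(),
struct demo_offload and DEMO_CONFIG_ESWITCH are hypothetical, and the
case value and return codes are chosen only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_offload {
	int command;
};

/* DEMO_CONFIG_ESWITCH is a made-up stand-in for CONFIG_MLX5_ESWITCH. */
static int demo_setup_tc(int type, void *type_data)
{
	assert(type >= 0);
	(void)type_data;	/* parameter is unused when the option is off */

	switch (type) {
#ifdef DEMO_CONFIG_ESWITCH
	case 1: {
		/*
		 * Declared only inside the conditional block that uses it,
		 * so nothing is left unused when the option is disabled.
		 */
		struct demo_offload *f = type_data;

		return f ? f->command : -EINVAL;
	}
#endif
	default:
		return -EOPNOTSUPP;
	}
}
```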

Fixes: c9f14470d048 ("net: sched: add API for registering unlocked offload block callbacks")
Reported-by: tanhuazhong <tanhuazhong@huawei.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: cls_matchall: cleanup flow_action before deallocating
Vlad Buslov [Thu, 29 Aug 2019 16:15:16 +0000 (19:15 +0300)]
net: sched: cls_matchall: cleanup flow_action before deallocating

Recent rtnl lock removal patch changed flow_action infra to require proper
cleanup besides simple memory deallocation. However, matchall classifier
was not updated to call tc_cleanup_flow_action(). Add proper cleanup to
mall_replace_hw_filter() and mall_reoffload().

Fixes: 5a6ff4b13d59 ("net: sched: take reference to action dev before calling offloads")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp_bbr: clarify that bbr_bdp() rounds up in comments
Luke Hsiao [Thu, 29 Aug 2019 14:02:44 +0000 (10:02 -0400)]
tcp_bbr: clarify that bbr_bdp() rounds up in comments

This explicitly clarifies in the comments that bbr_bdp() returns the
rounded-up value of the bandwidth-delay product, and explains why.
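
As a generic illustration of rounding up in integer arithmetic (this
message does not show the exact expression used in tcp_bbr.c, so the
helper below is a hypothetical sketch and not the kernel code):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Integer division that rounds up instead of truncating.
 * Assumes value + unit - 1 does not overflow 64 bits.
 */
static uint64_t demo_div_round_up(uint64_t value, uint64_t unit)
{
	assert(unit > 0);
	return (value + unit - 1) / unit;
}
```

Plain integer division would turn 10 / 4 into 2. The rounded-up form
yields 3, so a small nonzero product never collapses to zero.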

Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosched: act_vlan: implement stats_update callback
Jiri Pirko [Thu, 29 Aug 2019 13:38:42 +0000 (15:38 +0200)]
sched: act_vlan: implement stats_update callback

Implement this callback in order to get the offloaded stats added to the
kernel stats.

Reported-by: Pengfei Liu <pengfeil@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: depend on COMMON_CLK
Stephen Rothwell [Fri, 30 Aug 2019 21:34:05 +0000 (14:34 -0700)]
net: stmmac: depend on COMMON_CLK

Fixes: 190f73ab4c43 ("net: stmmac: setup higher frequency clk support for EHL & TGL")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoarcnet: capmode: remove redundant assignment to pointer pkt
Colin Ian King [Wed, 28 Aug 2019 23:14:50 +0000 (00:14 +0100)]
arcnet: capmode: remove redundant assignment to pointer pkt

Pointer pkt is being initialized with a value that is never read
and pkt is being re-assigned a little later on. The assignment is
redundant and hence can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'bnxt_en-health-and-error-recovery'
David S. Miller [Fri, 30 Aug 2019 21:02:19 +0000 (14:02 -0700)]
Merge branch 'bnxt_en-health-and-error-recovery'

Michael Chan says:

====================
bnxt_en: health and error recovery.

This patchset implements adapter health and error recovery.  The status
is reported through several devlink reporters and the driver will
initiate and complete the recovery process using the devlink infrastructure.

v2: Added 4 patches at the beginning of the patchset to clean up error code
    handling related to firmware messages and to convert to use standard
    error codes.

    Removed the dropping of rtnl_lock in bnxt_close().

    Broke up the patches some more for better patch organization and
    future bisection.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add FW fatal devlink_health_reporter.
Vasundhara Volam [Fri, 30 Aug 2019 03:55:05 +0000 (23:55 -0400)]
bnxt_en: Add FW fatal devlink_health_reporter.

Health show command example and output:

$ devlink health show pci/0000:af:00.0 reporter fw_fatal

pci/0000:af:00.0:
  name fw_fatal
    state healthy error 1 recover 1 grace_period 0 auto_recover true

Fatal events from firmware or missing periodic heartbeats will
be reported and recovery will be handled.

We also turn on the support flags when we register with the firmware to
enable this health and recovery feature in the firmware.

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add bnxt_fw_exception() to handle fatal firmware errors.
Michael Chan [Fri, 30 Aug 2019 03:55:04 +0000 (23:55 -0400)]
bnxt_en: Add bnxt_fw_exception() to handle fatal firmware errors.

This call will handle fatal firmware errors by forcing a reset on the
firmware.  The master function driver will carry out the forced reset.
The sequence will go through the same bnxt_fw_reset_task() workqueue.
This fatal reset differs from the non-fatal reset at the beginning
stages.  From the BNXT_FW_RESET_STATE_ENABLE_DEV state onwards where
the firmware is coming out of reset, it is practically identical to the
non-fatal reset.

The next patch will add the periodic heartbeat check and the devlink
reporter to report the fatal event and to initiate the bnxt_fw_exception()
call.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add RESET_FW state logic to bnxt_fw_reset_task().
Michael Chan [Fri, 30 Aug 2019 03:55:03 +0000 (23:55 -0400)]
bnxt_en: Add RESET_FW state logic to bnxt_fw_reset_task().

This state handles driver initiated chip reset during error recovery.
Only the master function will perform this step during error recovery.
The next patch will add code to initiate this reset from the master
function.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Do not send firmware messages if firmware is in error state.
Michael Chan [Fri, 30 Aug 2019 03:55:02 +0000 (23:55 -0400)]
bnxt_en: Do not send firmware messages if firmware is in error state.

Add a flag to mark that the firmware has encountered a fatal condition.
The driver will not send any more firmware messages and will return
an error to the caller.  Fix up some cleanup functions to continue
and not abort when the firmware message function returns an error.

This is preparation work to fully handle firmware error recovery
under fatal conditions.
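
The behaviour described above can be sketched as follows. This is a
hypothetical userspace illustration: struct demo_nic, demo_send_fw_msg()
and demo_free_rings() are invented names, and the -EBUSY return value is
an assumption made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_nic {
	bool fw_fatal;	/* set once firmware hits a fatal condition */
	int msgs_sent;
};

/* Refuse to talk to firmware that is known to be in a fatal state. */
static int demo_send_fw_msg(struct demo_nic *nic)
{
	assert(nic != NULL);
	if (nic->fw_fatal)
		return -EBUSY;	/* error value chosen for this sketch */
	nic->msgs_sent++;
	return 0;
}

/*
 * Cleanup keeps going even if the firmware message fails, so that all
 * local resources are still released. Returns the number of rings freed.
 */
static int demo_free_rings(struct demo_nic *nic, int n_rings)
{
	int freed = 0;

	for (int i = 0; i < n_rings; i++) {
		(void)demo_send_fw_msg(nic);	/* ignore the error and continue */
		freed++;			/* local state is always released */
	}
	return freed;
}
```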

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Retain user settings on a VF after RESET_NOTIFY event.
Vasundhara Volam [Fri, 30 Aug 2019 03:55:01 +0000 (23:55 -0400)]
bnxt_en: Retain user settings on a VF after RESET_NOTIFY event.

Retain the VF MAC address, default VLAN, TX rate control, trust settings
of VFs after firmware reset.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add devlink health reset reporter.
Vasundhara Volam [Fri, 30 Aug 2019 03:55:00 +0000 (23:55 -0400)]
bnxt_en: Add devlink health reset reporter.

Add devlink health reporter for the firmware reset event.  Once we get
the notification from firmware about the impending reset, the driver
will report this to devlink and the call to bnxt_fw_reset() will be
initiated to complete the reset sequence.

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Handle firmware reset.
Michael Chan [Fri, 30 Aug 2019 03:54:59 +0000 (23:54 -0400)]
bnxt_en: Handle firmware reset.

Add the bnxt_fw_reset() main function to handle firmware reset.  This
is triggered by firmware to initiate an orderly reset, for example
when a non-fatal exception condition has been detected.  bnxt_fw_reset()
will first wait for all VFs to shut down and then start the
bnxt_fw_reset_task() work queue to go through the sequence of reset,
re-probe, and re-initialization.

The next patch will add the devlink reporter to start the sequence and
call bnxt_fw_reset().

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Handle RESET_NOTIFY async event from firmware.
Michael Chan [Fri, 30 Aug 2019 03:54:58 +0000 (23:54 -0400)]
bnxt_en: Handle RESET_NOTIFY async event from firmware.

This event from firmware signals a coordinated reset initiated by the
firmware.  It may be triggered by some error conditions encountered
in the firmware or other orderly reset conditions.

We store the parameters from this event.  Subsequent patches will
add logic to handle reset itself using devlink reporters.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add new FW devlink_health_reporter
Vasundhara Volam [Fri, 30 Aug 2019 03:54:57 +0000 (23:54 -0400)]
bnxt_en: Add new FW devlink_health_reporter

Create a new FW devlink_health_reporter to report the current health
status of the FW.

Command example and output:
$ devlink health show pci/0000:af:00.0 reporter fw

pci/0000:af:00.0:
  name fw
    state healthy error 0 recover 0

 FW status: Healthy; Reset count: 1

Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Add BNXT_STATE_IN_FW_RESET state.
Michael Chan [Fri, 30 Aug 2019 03:54:56 +0000 (23:54 -0400)]
bnxt_en: Add BNXT_STATE_IN_FW_RESET state.

The new flag will be set in subsequent patches when firmware is
going through reset.  If bnxt_close() is called while the new flag
is set, the FW reset sequence will have to be aborted because the
NIC is prematurely closed before FW reset has completed.  We also
reject SRIOV configurations while FW reset is in progress.

v2: No longer drop rtnl_lock() in close and wait for FW reset to complete.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Enable health monitoring.
Michael Chan [Fri, 30 Aug 2019 03:54:55 +0000 (23:54 -0400)]
bnxt_en: Enable health monitoring.

Handle the async event from the firmware that enables firmware health
monitoring.  Store initial health metrics.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Pre-map the firmware health monitoring registers.
Michael Chan [Fri, 30 Aug 2019 03:54:54 +0000 (23:54 -0400)]
bnxt_en: Pre-map the firmware health monitoring registers.

Pre-map the GRC registers for periodic firmware health monitoring.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Discover firmware error recovery capabilities.
Michael Chan [Fri, 30 Aug 2019 03:54:53 +0000 (23:54 -0400)]
bnxt_en: Discover firmware error recovery capabilities.

Call the new firmware API HWRM_ERROR_RECOVERY_QCFG if it is supported
to discover the firmware health and recovery capabilities and settings.
This feature allows the driver to reset the chip if firmware crashes and
becomes unresponsive.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Handle firmware reset status during IF_UP.
Michael Chan [Fri, 30 Aug 2019 03:54:52 +0000 (23:54 -0400)]
bnxt_en: Handle firmware reset status during IF_UP.

During IF_UP, newer firmware has a new status flag that indicates that
firmware has reset.  Add new function bnxt_fw_init_one() to re-probe the
firmware and re-setup VF resources on the PF if necessary.  If the
re-probe fails, set a flag to prevent bnxt_open() from proceeding again.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Register buffers for VFs before reserving resources.
Vasundhara Volam [Fri, 30 Aug 2019 03:54:51 +0000 (23:54 -0400)]
bnxt_en: Register buffers for VFs before reserving resources.

When VFs need to be reconfigured dynamically after firmware reset, the
configuration sequence on the PF needs to be changed to register the VF
buffers first.  Otherwise, some VF firmware commands may not succeed as
there may not be PF buffers ready for the re-directed firmware commands.

This sequencing did not matter much before when we only supported
the normal bring-up of VFs.

Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Refactor bnxt_sriov_enable().
Michael Chan [Fri, 30 Aug 2019 03:54:50 +0000 (23:54 -0400)]
bnxt_en: Refactor bnxt_sriov_enable().

Refactor the hardware/firmware configuration portion in
bnxt_sriov_enable() into a new function bnxt_cfg_hw_sriov().  This
new function can be called after a firmware reset to reconfigure the
VFs previously enabled.

v2: straight refactor of the code.  Reordering done in the next patch.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Prepare bnxt_init_one() to be called multiple times.
Michael Chan [Fri, 30 Aug 2019 03:54:49 +0000 (23:54 -0400)]
bnxt_en: Prepare bnxt_init_one() to be called multiple times.

In preparation for the new firmware reset feature, some of the logic
in bnxt_init_one() and related functions will be called again after
firmware has reset.  Reset some of the flags and capabilities so that
everything that can change can be re-initialized.  Refactor some
functions to probe firmware versions and capabilities.  Check some
buffers before allocating as they may have been allocated previously.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Suppress all error messages in hwrm_do_send_msg() in silent mode.
Michael Chan [Fri, 30 Aug 2019 03:54:48 +0000 (23:54 -0400)]
bnxt_en: Suppress all error messages in hwrm_do_send_msg() in silent mode.

If the silent parameter is set, suppress all messages when there is
no response from firmware.  When polling for firmware to come out of
reset, no response may be normal and we want to suppress the error
messages.  Also, don't poll for the firmware DMA response if Bus Master
is disabled.  This is in preparation for error recovery when firmware
may be in error or reset state or Bus Master is disabled.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Simplify error checking in the SR-IOV message forwarding functions.
Michael Chan [Fri, 30 Aug 2019 03:54:47 +0000 (23:54 -0400)]
bnxt_en: Simplify error checking in the SR-IOV message forwarding functions.

There are 4 functions handling message forwarding for SR-IOV.  They
check for non-zero firmware response code and then return -1.  There
is no need to do this anymore.  The main messaging function will
now return a standard error code.  Since we don't need to examine the
response, we can use the hwrm_send_message() variant which will
take the mutex automatically.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Convert error code in firmware message response to standard code.
Michael Chan [Fri, 30 Aug 2019 03:54:46 +0000 (23:54 -0400)]
bnxt_en: Convert error code in firmware message response to standard code.

The main firmware messaging function returns the firmware-defined error
code, and many callers have to convert it to a standard error code for
proper propagation to userspace.  Convert bnxt_hwrm_do_send_msg() to
return a standard error code so we can do away with all the special
error code handling by the many callers.
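
A conversion of this kind might look like the sketch below. The firmware
status values and the particular errno choices are hypothetical and are
not taken from the driver. The point is only that the translation happens
once, in the messaging layer.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical firmware status codes, for illustration only. */
enum demo_fw_status {
	DEMO_FW_OK = 0,
	DEMO_FW_ACCESS_DENIED,
	DEMO_FW_NO_RESOURCES,
	DEMO_FW_INVALID_PARAMS,
};

/* Translate a firmware status into a standard negative errno value. */
static int demo_fw_status_to_errno(int status)
{
	assert(status >= 0);
	switch (status) {
	case DEMO_FW_OK:
		return 0;
	case DEMO_FW_ACCESS_DENIED:
		return -EACCES;
	case DEMO_FW_NO_RESOURCES:
		return -ENOSPC;
	case DEMO_FW_INVALID_PARAMS:
		return -EINVAL;
	default:
		return -EIO;	/* anything unrecognized becomes a generic I/O error */
	}
}
```

Callers can then propagate the return value unchanged instead of each
doing its own mapping.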

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Remove the -1 error return code from bnxt_hwrm_do_send_msg().
Michael Chan [Fri, 30 Aug 2019 03:54:45 +0000 (23:54 -0400)]
bnxt_en: Remove the -1 error return code from bnxt_hwrm_do_send_msg().

Replace the non-standard -1 code with -EBUSY when there is no firmware
response after waiting for the maximum timeout.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnxt_en: Use a common function to print the same ethtool -f error message.
Michael Chan [Fri, 30 Aug 2019 03:54:44 +0000 (23:54 -0400)]
bnxt_en: Use a common function to print the same ethtool -f error message.

The same message is printed 3 times in the code, so use a common function
to do that.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ioc3-eth-improvements'
David S. Miller [Fri, 30 Aug 2019 20:54:36 +0000 (13:54 -0700)]
Merge branch 'ioc3-eth-improvements'

Thomas Bogendoerfer says:

====================
ioc3-eth improvements

In my patch series for splitting out the serial code from ioc3-eth
by using a MFD device there was one big patch for ioc3-eth.c,
which wasn't really useful for reviews. This series contains the
ioc3-eth changes split into smaller steps and a few more cleanups.
Only the conversion to MFD will be done later in a different series.

Changes in v3:
- no need to check skb == NULL before passing it to dev_kfree_skb_any
- free memory allocated with get_page(s) with free_page(s)
- allocate rx ring with just GFP_KERNEL
- add required alignment for rings in comments

Changes in v2:
- use net_err_ratelimited for printing various ioc3 errors
- added missing clearing of rx buf valid flags into ioc3_alloc_rings
- use __func__ for printing out of memory messages
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: no need to stop queue set_multicast_list
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:38 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: no need to stop queue set_multicast_list

netif_stop_queue()/netif_wake_queue() aren't needed for changing
multicast filters.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: protect emcr in all cases
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:37 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: protect emcr in all cases

emcr in private struct wasn't always protected by spinlock.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: Fix IPG settings
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:36 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: Fix IPG settings

The half/full duplex settings for inter packet gap counters/timer were
reversed.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: use csum_fold
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:35 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: use csum_fold

Replace open-coded checksum folding with csum_fold.
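
For reference, folding a 32-bit one's-complement sum into a 16-bit
checksum can be sketched in userspace as below. demo_csum_fold() is a
hypothetical stand-in written for this illustration, using plain integer
types rather than the kernel's checksum types.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a 32-bit partial sum into 16 bits and complement it.
 * The second fold absorbs the carry the first one may produce.
 */
static uint16_t demo_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	assert(sum <= 0xffff);	/* after two folds the sum fits in 16 bits */
	return (uint16_t)~sum;
}
```

Using one shared helper avoids each driver carrying its own copy of
these two add-and-shift steps.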

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: use dma-direct for dma allocations
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:34 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: use dma-direct for dma allocations

Replace the homegrown DMA memory allocation, which only works on
SGI-IP27 machines, with the generic dma allocations.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: refactor rx buffer allocation
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:33 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: refactor rx buffer allocation

Move common code for rx buffer setup into ioc3_alloc_skb and deal
with allocation failures. Also clean up allocation size calculation.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: split ring cleaning/freeing and allocation
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:32 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: split ring cleaning/freeing and allocation

Do tx ring cleaning and freeing of rx buffers when the chip is shut down,
and allocate buffers before bringing the chip up.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: introduce chip start function
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:31 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: introduce chip start function

ioc3_init did everything from reset to init rings to starting the chip.
This change moves chip start out into a new function as preparation
for easier handling of receive buffer allocation failures.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: separate tx and rx ring handling
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:30 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: separate tx and rx ring handling

Now that descriptor memory is allocated once in probe, handling of the
tx ring is done entirely by ioc3_clean_tx_ring. So we remove the
remaining tx ring actions from ioc3_alloc_rings and ioc3_free_rings
and rename them to ioc3_[alloc|free]_rx_bufs to better describe what
they do.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sgi: ioc3-eth: get rid of ioc3_clean_rx_ring()
Thomas Bogendoerfer [Fri, 30 Aug 2019 09:25:29 +0000 (11:25 +0200)]
net: sgi: ioc3-eth: get rid of ioc3_clean_rx_ring()

Move clearing of the descriptor valid bit into ioc3_alloc_rings. This
makes ioc3_clean_rx_ring obsolete.

Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>