Bodong Wang [Fri, 14 Dec 2018 15:33:22 +0000 (09:33 -0600)]
net/mlx5: E-Switch, Assign a different position for uplink rep and vport
In offloads mode, the current implementation puts the uplink
representor at index zero of the vport reps array. It is not "natural"
to place it at index 0 since we want to put the representor for vport
0 at index 0 with the introduction of SmartNIC. A separate patch will
handle whether a rep is needed for vport 0 (PF vport).
So, we want a different placeholder for the uplink vport and
representor: place it at the end of the vport and rep arrays. Since
vport number can no longer act as an index into the vport or
representors arrays, use functions to map vport numbers to indices
when accessing the vports or representors arrays, and vice versa.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
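The vport-number-to-index mapping described in the commit above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the driver's code: the function names, the 0xFFFF uplink number and the "uplink takes the last slot" layout are illustrative choices that follow the commit text.

```python
# Illustrative model only; names and layout are assumptions, not driver code.
UPLINK_VPORT = 0xFFFF  # assumed number for the uplink vport


def vport_to_index(vport_num: int, total_vports: int) -> int:
    """Map a vport number to its slot in the vports/reps array."""
    if vport_num == UPLINK_VPORT:
        return total_vports - 1  # the uplink takes the last slot
    if not 0 <= vport_num < total_vports - 1:
        raise ValueError(f"invalid vport number {vport_num}")
    return vport_num  # PF/VF vports keep their number as the index


def index_to_vport(index: int, total_vports: int) -> int:
    """Inverse mapping: array slot back to vport number."""
    if not 0 <= index < total_vports:
        raise ValueError(f"invalid index {index}")
    return UPLINK_VPORT if index == total_vports - 1 else index
```

With five slots, vport 0 maps to index 0 and the uplink maps to index 4, so callers never index the arrays with a raw vport number.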
Bodong Wang [Thu, 31 Jan 2019 23:42:57 +0000 (17:42 -0600)]
net/mlx5: E-Switch, Centralize repersentor reg/unreg to eswitch driver
Eswitch has two users: IB and ETH. They both register representors
when mlx5 interface is added, and unregister the representors when
mlx5 interface is removed. Ideally, each driver should only deal with
the entities which are unique to itself. However, current IB and ETH
drivers have to perform the following eswitch operations:
1. When registering, specify how many vports to register. This number
is the same for both drivers: the total number of available
vports.
2. When unregistering, specify the number of registered vports to
unregister. Also, unload the representors which are already loaded.
It's unnecessary for the eswitch driver to hand out control of the above
operations to individual driver users, as they're not unique to each
driver. Instead, such operations should be centralized in the eswitch
driver. This consolidates the eswitch control flow and simplifies the
IB and ETH drivers.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 30 Jan 2019 04:57:21 +0000 (22:57 -0600)]
net/mlx5: E-Switch, Support load/unload reps of specific vport types
Currently the driver loads and unloads all reps as an indivisible
group. However, with ECPF, the reps of special vports such as uplink
and host PF should always be loaded in switchdev mode, whereas the reps
for VFs will be loaded on demand and unloaded when no longer needed.
This is a preparatory step for that change.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 30 Jan 2019 03:48:31 +0000 (21:48 -0600)]
net/mlx5: E-Switch, Add state to eswitch vport representors
Currently the eswitch vport reps have a valid indicator, which is
set on register and unset on unregister. However, a rep may or may not
be loaded when it is unregistered. The current driver checks whether the
vport of that rep is enabled to infer that the rep is loaded.
For ECPF, this is not valid, as the host PF will enable the
vports for its VFs instead.
Add three states: {unregistered, registered, loaded}, with the
following state changes across different operations:
create: (none) -> unregistered
reg: unregistered -> registered
load: registered -> loaded
unload: loaded -> registered
unreg: registered -> unregistered
Note that the state shall only be updated inside eswitch driver rather
than individual drivers such as ETH or IB.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Suggested-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
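The state transitions listed in the commit above can be modelled as a small table-driven state machine. The following Python sketch is illustrative only: the class and operation names are assumptions, and the real driver keeps this state inside the eswitch code rather than in a helper like this.

```python
from enum import Enum


class RepState(Enum):
    UNREGISTERED = "unregistered"
    REGISTERED = "registered"
    LOADED = "loaded"


# (operation, current state) -> next state, as listed in the commit message.
_TRANSITIONS = {
    ("reg", RepState.UNREGISTERED): RepState.REGISTERED,
    ("load", RepState.REGISTERED): RepState.LOADED,
    ("unload", RepState.LOADED): RepState.REGISTERED,
    ("unreg", RepState.REGISTERED): RepState.UNREGISTERED,
}


class Rep:
    def __init__(self):
        # create: (none) -> unregistered
        self.state = RepState.UNREGISTERED

    def apply(self, op: str) -> RepState:
        """Apply an operation, rejecting transitions the table does not list."""
        key = (op, self.state)
        if key not in _TRANSITIONS:
            raise ValueError(f"cannot {op} a rep in state {self.state.value}")
        self.state = _TRANSITIONS[key]
        return self.state
```

In this model a loaded rep has to be unloaded before it can be unregistered, which is the ordering the state field makes explicit without consulting the vport's enabled flag.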
Bodong Wang [Tue, 29 Jan 2019 04:12:45 +0000 (22:12 -0600)]
net/mlx5: E-Switch, Use getter and iterator to access vport/rep
With only PF and VF, it is sufficient to use the vport number as the
vport/rep array index, because PF and VF vport numbers
are consecutive. With the introduction of ECPF and UPLINK vports in
downstream patches, they are no longer consecutive.
Use a getter to get a specific vport/rep, and an iterator to traverse
a list of vports/reps. This hides the translation between array index
and vport number, and provides the flexibility to use a different
translation mechanism in the future.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Suggested-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
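As a rough illustration of the getter/iterator idea in the commit above, the Python sketch below hides the index translation behind a lookup table. The class, its method names and the 0xFFFF uplink number are hypothetical, and the VF iterator assumes VF vports are numbered 1..num_vfs with 0 being the PF.

```python
UPLINK = 0xFFFF  # assumed uplink vport number, for illustration only


class RepTable:
    """Callers use vport numbers; the array index stays private."""

    def __init__(self, vport_nums):
        self._reps = [{"vport": v, "loaded": False} for v in vport_nums]
        # The translation could be swapped for another scheme without
        # touching any caller.
        self._index = {v: i for i, v in enumerate(vport_nums)}

    def get_rep(self, vport_num):
        return self._reps[self._index[vport_num]]

    def for_each_rep(self):
        yield from self._reps

    def for_each_vf_rep(self, num_vfs):
        # Assumes VF vports are numbered 1..num_vfs (0 is the PF).
        for vport_num in range(1, num_vfs + 1):
            yield self.get_rep(vport_num)
```

Because callers only go through get_rep() and the iterators, non-consecutive vport numbers such as the uplink need no special handling at the call sites.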
Bodong Wang [Thu, 31 Jan 2019 20:40:53 +0000 (14:40 -0600)]
net/mlx5: E-Switch, Split VF and special vports for offloads mode
When driver is entering offloads mode, there are two major tasks to
do: initialize flow steering and create representors. Flow steering
should make sure enough flow table/group spaces are reserved for all
reps. Representors will be created in a group, all or none.
With the introduction of ECPF, flow steering should still reserve the
same spaces. But the representors are not always loaded/unloaded as a
single group. Once ECPF is in offloads mode, it will get an event from
the host PF when the number of VFs changes. In such a scenario, only
the VF reps should be loaded/unloaded, not the reps for special vports
(such as the uplink vport).
Thus, when entering offloads mode, driver should specify the total
number of reps, and the number of VF reps separately. When leaving
offloads mode, the cleanup should use the information self-contained
in eswitch such as number of VFs.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Thu, 7 Feb 2019 16:40:58 +0000 (10:40 -0600)]
net/mlx5: E-Switch, Refactor offloads flow steering init/cleanup
E-switch offloads mode initializes and cleans up multiple steering-related
entities (flow table/group). Refactor these operations into internal
helper functions for better structure.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Fri, 1 Feb 2019 23:34:55 +0000 (17:34 -0600)]
net/mlx5: E-Switch, Properly refer to host PF vport as other vport
Commands referring to vports use the following scheme:
1. When referring to my own vport, put 0 in vport and 0 in other_vport.
2. When referring to another vport, put the vport number of the
referred vport and put 1 in other_vport. It was assumed that the driver
is accessing another vport when the vport number is greater than 0.
With the above scheme, the case where the ECPF eswitch manager is trying
to access the host PF vport falls under scheme 1, as the vport
number is 0. This is clearly wrong, as the driver is trying to refer to
another vport.
As such usage can only happen in the eswitch context, change relevant
functions to provide other vport input properly.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
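The problem described in the commit above can be shown with a short Python model. Both helper names and the dictionary layout are hypothetical; they only stand in for the vport and other_vport fields of a command.

```python
def vport_cmd_fields_old(vport_num: int) -> dict:
    """Old assumption: only a nonzero vport number means 'other vport'."""
    return {"vport_number": vport_num,
            "other_vport": 1 if vport_num > 0 else 0}


def vport_cmd_fields(vport_num: int, other_vport: bool) -> dict:
    """The caller states explicitly whether it refers to another vport."""
    return {"vport_number": vport_num,
            "other_vport": 1 if other_vport else 0}
```

An ECPF eswitch manager addressing the host PF would call vport_cmd_fields(0, True), whereas the old helper would have treated vport 0 as the caller's own vport.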
Bodong Wang [Thu, 8 Nov 2018 20:37:04 +0000 (22:37 +0200)]
net/mlx5: E-Switch, Properly refer to the esw manager vport
In SmartNIC mode, the eswitch manager is not necessarily the PF
(vport 0). Use a helper function to get the correct eswitch manager
vport number and cache on the eswitch instance for fast reference.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 16:52:34 +0000 (10:52 -0600)]
net/mlx5: Correctly set LAG mode for ECPF
When bonding is added, the driver assumes that it's RoCE LAG if no VF is
enabled. This is not enough for ECPF, as the VFs are enabled on the host
PF side. LAG should only choose RoCE mode when both slave devices meet
the conditions below:
1. E-Switch offloads mode is NONE.
2. No VF is enabled.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
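The two conditions in the commit above amount to a simple predicate over both bonded devices. The Python sketch below is a model only; the Dev class, its field names and the mode labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dev:
    eswitch_mode: str  # "none", "legacy" or "offloads" (assumed labels)
    num_vfs: int


def lag_is_roce(dev0: Dev, dev1: Dev) -> bool:
    """RoCE LAG only if both devices have mode NONE and no VFs enabled."""
    return all(d.eswitch_mode == "none" and d.num_vfs == 0
               for d in (dev0, dev1))
```

An ECPF in offloads mode reports zero VFs of its own, so checking the VF count alone would wrongly select RoCE LAG; the mode check closes that gap.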
Saeed Mahameed [Fri, 15 Feb 2019 23:16:36 +0000 (15:16 -0800)]
Merge branch 'mlx5-next' of git://git./linux/kernel/git/mellanox/linux
Merge mlx5-next shared branch into net-next.
From Bodong Wang:
1) Introduction of ECPF (Embedded CPU Physical Function), and low level
bits for mlx5 SmartNic capabilities support.
2) Vport enumeration refactoring that affects mlx5_ib and mlx5_core
From Aya Levin:
3) Add support for 50Gbps per lane link modes in the Port Type and Speed
register (PTYS)
4) Refactor low level query functions for PTYS register
5) Add support for 50Gbps per lane link modes to mlx5_ib
Note: due to a change in API in mlx5/core and a later patch from net-next,
a fixup was squashed with this merge commit that replaces FDB_UPLINK_VPORT
with MLX5_VPORT_UPLINK which exists only in upstream net-next.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Wed, 13 Feb 2019 06:55:46 +0000 (22:55 -0800)]
IB/mlx5: Add support for 50Gbps per lane link modes
Driver now supports new link modes: 50Gbps per lane support for
50G/100G/200G. This patch reads the correct field (legacy vs. extended)
based on a FW indication bit, and adds a translation function (link
modes to IB width and speed) to the new link modes.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Wed, 13 Feb 2019 06:55:45 +0000 (22:55 -0800)]
net/mlx5: Add support to ext_* fields introduced in Port Type and Speed register
This patch exposes new link modes (including 50Gbps per lane), and ext_*
fields which describe the new link modes in the Port Type and Speed
register (PTYS).
Access functions, translation functions (speed <-> HW bits) and
link max speed function were modified.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Wed, 13 Feb 2019 06:55:44 +0000 (22:55 -0800)]
net/mlx5: Add new fields to Port Type and Speed register
The Port Type and Speed register (PTYS) introduces three new fields
extending the speeds/protocols that can be reported and configured.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Wed, 13 Feb 2019 06:55:43 +0000 (22:55 -0800)]
net/mlx5: Refactor queries to speed fields in Port Type and Speed register
This patch consolidates queries to speed-related fields in the Port Type
and Speed register (PTYS) into a single API. In addition, this patch
refactors functions which serve only the Ethernet driver: remove the
protocol type as an input parameter, move code from the 'core' directory
into the 'en' directory and add an 'eth' prefix to the function names. The
patch also encapsulates functions that are not used outside the Ethernet
driver and removes redundant include files.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:42 +0000 (22:55 -0800)]
net/mlx5: E-Switch, Avoid magic numbers when initializing offloads mode
When dealing with the offloads mode initialization, the driver refers to
the number of VFs and adds the magic number one (1) to account for the
uplink. This is not clear and will make the code less readable after
adding other vports (e.g. host PF). As these are special vports
compared to VF vports, add a helper macro to denote such special
vports and eliminate the use of the magic number.
Moreover, when creating offloads flow table and groups, the driver
reserves two more slots for UC and MC miss rules. Replace this magic
number with a helper macro as well.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:41 +0000 (22:55 -0800)]
net/mlx5: Relocate vport macros to the vport header file
There are two macros in the driver's general header which deal with the
total number of vports and whether a vport is the vport manager. Such
macros are vport entities, so it is better to place them in the vport
header file.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:40 +0000 (22:55 -0800)]
net/mlx5: E-Switch, Normalize the name of uplink vport number
The driver used to name the uplink vport FDB_UPLINK_VPORT, which makes it
hard to keep a consistent naming convention with the introduction of
other vports. Use MLX5_VPORT as the prefix for such vports and
relocate the uplink vport definition to the public header file for the
benefit of both net and IB drivers.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:39 +0000 (22:55 -0800)]
net/mlx5: Provide an alternative VF upper bound for ECPF
ECPF doesn't support SR-IOV, but an ECPF E-Switch manager shall know
the max VFs supported by its peer host PF in order to control those
VF vports.
The current driver implementation uses the total vfs quantity as
provided by the pci sub-system for an upper bound of the VF vports
the e-switch code needs to deal with. This obviously can't work as
is on ECPF e-switch manager. For now, we use a hard coded value of
128 on such systems.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
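The upper-bound selection described in the commit above reduces to a two-way choice. The Python sketch below is illustrative; the function and parameter names are hypothetical, and only the value 128 comes from the commit message.

```python
ECPF_MAX_VFS = 128  # hard-coded bound stated in the commit message


def max_vf_vports(is_ecpf_esw_manager: bool, pci_total_vfs: int) -> int:
    """Upper bound on VF vports the e-switch code has to handle."""
    if is_ecpf_esw_manager:
        # The ECPF has no SR-IOV of its own, so the PCI value is not usable.
        return ECPF_MAX_VFS
    return pci_total_vfs
```

On an ECPF the PCI sub-system value says nothing about the host PF's VFs, which is why the fixed bound is used there.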
Bodong Wang [Wed, 13 Feb 2019 06:55:38 +0000 (22:55 -0800)]
net/mlx5: Add host params change event
In Embedded CPU (EC) configurations, the EC driver needs to know when
the number of virtual functions changes on the corresponding PF at the
host side. This is required so the EC driver can create or destroy
representor net devices that represent the VF ports.
Whenever a change in the number of VFs occurs, firmware will generate an
event towards the EC which will trigger a work to complete the rest of
the handling. The specifics of the handling will be introduced in a
downstream patch.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:37 +0000 (22:55 -0800)]
net/mlx5: Add query host params command
The QUERY_HOST_PARAMS command is used by an Embedded CPU Physical
Function (ECPF) driver to identify and retrieve information about the
PF on the host side, e.g. the number of virtual functions and the PCI BDF.
The number of VFs can be changed on the fly, so a function is added to
query the current number of VFs; it will be used in downstream patches.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:36 +0000 (22:55 -0800)]
net/mlx5: Update enable HCA dependency
With the introduction of ECPF, we require that the ECPF driver will
always call enable/disable HCA for that PF in the same way a PF does
this for its VFs. The PF is still responsible for calling enable and
disable HCA for its VFs.
To distinguish between the ECPF executing enable/disable HCA for
itself or for the PF, it sets the embedded CPU function bit in the
input params struct of these commands. When the bit is cleared and
function ID is zero, it refers to the peer PF.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:35 +0000 (22:55 -0800)]
net/mlx5: Introduce Mellanox SmartNIC and modify page management logic
Mellanox's SmartNIC combines embedded CPU (e.g., ARM) processing power
with advanced network offloads to accelerate a multitude of security,
networking and storage applications.
With the introduction of the SmartNIC, there is a new PCI function
called Embedded CPU Physical Function (ECPF), and it's possible for a
PF to get its ICM pages from the ECPF PCI function. The driver shall
identify if it is running on such a function by reading a bit in
the initialization segment.
When firmware asks for pages, it issues a page request event
specifying how many pages it requests and for which function. The
driver responds with a manage_pages command providing the requested
pages along with an indication of which function it is providing these
pages for.
The encoding before this patch was as follows:
function_id == 0: pages are requested for the function receiving
the EQE.
function_id != 0: pages are requested for VF identified by the
function_id value
A new one bit field in the EQE identifies that pages are requested for
the ECPF.
The notion of page_supplier can be introduced here and to support that,
manage pages and query pages were modified so firmware can distinguish
the following cases:
1. Function provides pages for itself
2. PF provides pages for its VF
3. ECPF provides pages to itself
4. ECPF provides pages for another function
This distinction is possible through the introduction of the bit
"embedded_cpu_function" in query_pages, manage_pages and page request
EQE.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
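One way to picture why the extra bit in the commit above is needed: function_id alone cannot tell the ECPF apart from function 0, so page accounting has to be keyed by both values. The Python sketch below is a model under that assumption; the class and method names are hypothetical and do not reflect the driver's data structures.

```python
from collections import defaultdict


class PageTracker:
    """Tracks pages given to firmware per (function_id, ec_function)."""

    def __init__(self):
        self._pages = defaultdict(int)

    def give_pages(self, function_id: int, ec_function: bool, npages: int):
        # ec_function models the "embedded_cpu_function" bit.
        self._pages[(function_id, bool(ec_function))] += npages

    def pages_for(self, function_id: int, ec_function: bool) -> int:
        return self._pages.get((function_id, bool(ec_function)), 0)
```

With this keying, pages supplied for the ECPF itself and pages supplied for function 0 without the bit set are counted separately, even though both use function_id 0.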
Bodong Wang [Wed, 13 Feb 2019 06:55:34 +0000 (22:55 -0800)]
IB/mlx5: Use unified register/load function for uplink and VF vports
IB driver maintains different registration and load function calls
for uplink and VF vports. This is not necessary as they only differ
with each other on their profiles.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:33 +0000 (22:55 -0800)]
net/mlx5: Use consistent vport num argument type
Use u16 for vport number, which matches how hardware refers to this
argument throughout commands.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Bodong Wang [Wed, 13 Feb 2019 06:55:32 +0000 (22:55 -0800)]
net/mlx5: Use void pointer as the type in address_of macro
Better to use void * and avoid unnecessary casts.
This patch doesn't change any functionality.
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Robert Stonehouse [Thu, 14 Feb 2019 17:27:43 +0000 (17:27 +0000)]
sfc: ensure recovery after allocation failures
After failing to allocate a receive buffer the driver may fail to ever
request additional allocations. EF10 NICs require new receive buffers to
be pushed in batches of eight or more. The test for whether a slow fill
should be scheduled failed to take account of this. There is little
downside to *always* requesting a slow fill if we failed to allocate a
buffer, so the condition has been removed completely. The timer that
triggers the request for a refill has also been shortened.
Signed-off-by: Robert Stonehouse <rstonehouse@solarflare.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Thu, 14 Feb 2019 15:06:40 +0000 (23:06 +0800)]
net: adaptec: starfire: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in intr_handler() once the skb
has been transmitted, so that drop profilers (dropwatch, perf) do not
count it as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
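The reasoning behind this commit and the similar ones in this log can be modelled in a few lines of Python. This is a toy model, not kernel code: the DropMonitor class and tx_complete() helper are hypothetical, and they only mimic the fact that freeing with the kfree variant is reported as a drop while the consume variant is not.

```python
class DropMonitor:
    """Toy stand-in for a drop profiler watching skb free events."""

    def __init__(self):
        self.drops = 0
        self.consumed = 0

    def kfree_skb(self, skb):
        self.drops += 1      # reported as a dropped packet

    def consume_skb(self, skb):
        self.consumed += 1   # normal end of life, not a drop


def tx_complete(monitor: DropMonitor, skbs, use_consume: bool):
    """Free skbs after successful transmission."""
    for skb in skbs:
        if use_consume:
            monitor.consume_skb(skb)
        else:
            monitor.kfree_skb(skb)
```

Freeing three successfully transmitted packets with the kfree variant would show three drops in this model; with the consume variant the drop count stays at zero, which matches what actually happened on the wire.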
Yang Wei [Thu, 14 Feb 2019 14:55:14 +0000 (22:55 +0800)]
net: 3com: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers (dropwatch, perf) do not count it
as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Thu, 14 Feb 2019 14:53:30 +0000 (22:53 +0800)]
net: arc_emac: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in arc_emac_tx_clean() once the
skb has been transmitted, so that drop profilers (dropwatch, perf) do
not count it as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Thu, 14 Feb 2019 14:52:28 +0000 (22:52 +0800)]
net: packetengines: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers (dropwatch, perf) do not count it
as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Thu, 14 Feb 2019 14:50:58 +0000 (22:50 +0800)]
net: xilinx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers (dropwatch, perf) do not count it
as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Thu, 14 Feb 2019 14:45:38 +0000 (22:45 +0800)]
net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in i596_interrupt() once the skb
has been transmitted, so that drop profilers (dropwatch, perf) do not
count it as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 14 Feb 2019 14:39:07 +0000 (15:39 +0100)]
lib: objagg: fix handling of object with 0 users when assembling hints
It is possible that an object that was originally a parent with 0
direct users is no longer considered a parent in the hints. Then the
weight of this object is 0 and the current code ignores it. That's why
the total number of hint objects might be lower than for the original
objagg and WARN_ON is hit. Fix this by considering 0 weight valid.
Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Feb 2019 17:39:35 +0000 (12:39 -0500)]
Merge branch 'cxgb4-SGE-doorbell-queue-timer'
Vishal Kulkarni says:
====================
cxgb4/cxgb4vf: Support for SGE doorbell queue timer
This series of patches adds the SGE doorbell queue timer for faster DMA completions.
Patch 1 Implements SGE doorbell queue timer
Patch 2 Adds ethtool capability to set/get SGE doorbell queue timer tick
v2
- Reverse christmas tree formatting for local variables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vishal Kulkarni [Thu, 14 Feb 2019 12:49:16 +0000 (18:19 +0530)]
cxgb4: Add capability to get/set SGE Doorbell Queue Timer Tick
This patch gets/sets SGE Doorbell Queue timer ticks via ethtool
Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vishal Kulkarni [Thu, 14 Feb 2019 12:49:15 +0000 (18:19 +0530)]
cxgb4/cxgb4vf: Add support for SGE doorbell queue timer
T6 introduced a Timer Mechanism in SGE called the
SGE Doorbell Queue Timer. With this we can now configure
TX Queues to get CIDX Updates when:
Time(CIDX == PIDX) >= Timer
Previously we relied on TX Queue Status Page updates by hardware
for DMA completions. This will make Hardware/Firmware actually
deliver the CIDX Updates as Ingress Queue messages with
commensurate Interrupts.
So we now have a new RX Path component for processing CIDX Updates
and reclaiming TX Descriptors faster.
Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang Zijiang [Thu, 14 Feb 2019 06:42:13 +0000 (14:42 +0800)]
sfc: Replace dev_kfree_skb_any by dev_consume_skb_any
The skb should be freed by dev_consume_skb_any() in efx_tx_tso_fallback()
when the skb is still in use. The skb will be replaced by segments, so
the original skb should be consumed (not dropped).
Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Acked-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang Zijiang [Thu, 14 Feb 2019 06:41:18 +0000 (14:41 +0800)]
net:ethernet:cadence: Replace dev_kfree_skb_any by dev_consume_skb_any
The skb should be freed by dev_consume_skb_any() in macb_pad_and_fcs()
when *skb is still in use. The *skb is replaced by nskb, so the
original *skb should be consumed (not dropped).
Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang Zijiang [Thu, 14 Feb 2019 06:40:56 +0000 (14:40 +0800)]
net:dl2k: Replace dev_kfree_skb_irq by dev_consume_skb_irq
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers do not count it as a drop.
Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang Zijiang [Thu, 14 Feb 2019 06:40:31 +0000 (14:40 +0800)]
net:dl2k: Modify the code style escaping the warning
Modify the code style to remove the following warning
when executing the checkpatch.pl script:
WARNING: space prohibited between function name and open parenthesis '('
Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huang Zijiang [Thu, 14 Feb 2019 06:39:59 +0000 (14:39 +0800)]
isdn:hisax: Replace dev_kfree_skb_any by dev_consume_skb_any
The skb should be freed by dev_consume_skb_any() in hfcpci_fill_fifo()
when bcs->tx_skb is still in use. The bcs->tx_skb is replaced by
skb_dequeue(&bcs->squeue), so the original bcs->tx_skb should
be consumed (not dropped).
Signed-off-by: Huang Zijiang <huang.zijiang@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Wed, 13 Feb 2019 16:55:02 +0000 (08:55 -0800)]
net: ipvlan_l3s: fix kconfig dependency warning
Fix the kconfig warning in IPVLAN_L3S when neither INET nor IPV6
is enabled:
WARNING: unmet direct dependencies detected for NET_L3_MASTER_DEV
Depends on [n]: NET [=y] && (INET [=n] || IPV6 [=n])
Selected by [y]:
- IPVLAN_L3S [=y] && NETDEVICES [=y] && NET_CORE [=y] && NETFILTER [=y]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mahesh Bandewar <maheshb@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:21:02 +0000 (23:21 +0800)]
net: nuvoton: w90p910_ether: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in w90p910_ether_start_xmit()
once the skb has been transmitted, so that drop profilers (dropwatch,
perf) do not count it as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:19:14 +0000 (23:19 +0800)]
net: natsemi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers (dropwatch, perf) do not count it
as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:18:09 +0000 (23:18 +0800)]
net: micrel: ks8695net: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in ks8695_tx_irq() once the skb
has been transmitted, so that drop profilers (dropwatch, perf) do not
count it as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:17:06 +0000 (23:17 +0800)]
net: sgi: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers (dropwatch, perf) do not count it
as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:15:43 +0000 (23:15 +0800)]
net: myri10ge: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in myri10ge_tx_done() once the
skb has been transmitted, so that drop profilers (dropwatch, perf) do
not count it as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:14:54 +0000 (23:14 +0800)]
net: amd: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called once the skb has been
transmitted, so that drop profilers (dropwatch, perf) do not count it
as a drop.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Wed, 13 Feb 2019 15:12:02 +0000 (23:12 +0800)]
net: dlink: sundance: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in intr_handler() once the skb
has been transmitted, so that drop profilers (dropwatch, perf) do not
count it as a drop.
Remove a redundant blank line in intr_handler().
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Feb 2019 16:51:51 +0000 (11:51 -0500)]
Merge branch 'uapi-Add-a-new-header-for-time-types'
Deepa Dinamani says:
====================
uapi: Add a new header for time types
The series aims at adding a new time header: time_types.h. This header
is what will eventually hold all the uapi time types that we plan to
leave across the interfaces after the y2038 cleanup.
The series was discussed with Arnd Bergmann.
The second patch fixes the errqueue.h header, which has a dependency on
these types.
Note that there may be a trivial merge conflict with linux-next
c70a772fda11 ("y2038: remove struct definition redirects").
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Deepa Dinamani [Wed, 13 Feb 2019 03:26:04 +0000 (19:26 -0800)]
errqueue.h: Include time_types.h
Now that we have a separate header for struct __kernel_timespec,
include it directly without relying on userspace to do it.
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Deepa Dinamani [Wed, 13 Feb 2019 03:26:03 +0000 (19:26 -0800)]
time: Add time_types.h
sys/time.h is the mandated include for many time related
defines. However, linux/time.h overlaps sys/time.h
significantly and this makes including both from userspace
or one from the other impossible.
This also means that userspace can get away with including
sys/time.h whenever it needs linux/time.h, and this is what
usually happens in userspace.
However, the new data types that we plan to use in the uapi time
interfaces are also defined in linux/time.h, so we are unable
to use these types when sys/time.h is included.
Hence, move the new types to a new header, time_types.h.
We intend to eventually have all the uapi defines that the kernel
uses defined in this header.
Note that the plan is to replace uapi interfaces with timeval to
use __kernel_old_timeval, timespec to use __kernel_old_timespec etc.
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW")
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Feb 2019 16:45:39 +0000 (11:45 -0500)]
Merge branch 'devlink-region-read-fixes'
Parav Pandit says:
====================
devlink: 2 fixes for devlink region read
These 2 patches consist of fixes for devlink region read handling.
v0->v1:
- Fixed typo from user to use
v1->v2:
- Rebased
====================
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parav Pandit [Tue, 12 Feb 2019 20:24:08 +0000 (14:24 -0600)]
devlink: Fix list access without lock while reading region
While finding the devlink device during region reading,
devlink device list is accessed and devlink device is
returned without holding a lock. This could lead to use-after-free
accesses.
While at it, add lockdep assert to ensure that all future callers hold
the lock when calling devlink_get_from_attrs().
Fixes: 4e54795a27f5 ("devlink: Add support for region snapshot read command")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Parav Pandit [Tue, 12 Feb 2019 20:23:58 +0000 (14:23 -0600)]
devlink: Return right error code in case of errors for region read
devlink_nl_cmd_region_read_dumpit() fails to return the right error code on
most error conditions.
Return the right error code on such errors.
Fixes: 4e54795a27f5 ("devlink: Add support for region snapshot read command")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tonghao Zhang [Mon, 11 Feb 2019 18:49:48 +0000 (10:49 -0800)]
bonding: check slave set command firstly
This patch is a small improvement. If the user uses the
command shown below, we should print message [1]
instead of [2]. eth0 actually exists, so message [2] may
confuse the user.
$ echo "eth0" > /sys/class/net/bond4/bonding/slaves
[1] "bond4: no command found in slaves file - use +ifname or -ifname"
[2] "write error: No such device"
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
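The ordering described in the bonding commit above (validate the command
prefix before looking up the interface) can be sketched in user space as
follows. This is a minimal illustration, not the bonding driver's code:
parse_slaves_cmd() and the enum values are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical result codes for this sketch; not the kernel's values. */
enum slaves_cmd { CMD_ADD, CMD_REMOVE, CMD_NONE };

/*
 * Classify a string written to the "slaves" file by its first character,
 * before any attempt is made to look up the interface name. On success,
 * *ifname points just past the '+' or '-' prefix; otherwise it is NULL.
 */
enum slaves_cmd parse_slaves_cmd(const char *buf, const char **ifname)
{
	if (buf == NULL || (buf[0] != '+' && buf[0] != '-')) {
		*ifname = NULL;
		return CMD_NONE;	/* caller reports "no command found" */
	}
	*ifname = buf + 1;
	return buf[0] == '+' ? CMD_ADD : CMD_REMOVE;
}
```

With this ordering, writing "eth0" without a prefix is rejected as a
missing command, regardless of whether eth0 exists.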
David S. Miller [Thu, 14 Feb 2019 06:33:03 +0000 (22:33 -0800)]
Merge branch 'mlxsw-hwmon-and-thermal-extensions'
Ido Schimmel says:
====================
mlxsw: hwmon and thermal extensions
Vadim says:
This patchset contains various improvements to hwmon and thermal code in
mlxsw. The most significant improvement is the ability to read modules'
temperature attributes (input, fault, critical and emergency thresholds)
as well as fans' fault indication. These new attributes will improve the
ability to monitor the system.
Patches #1-#4 add the necessary device registers and APIs to read
modules' temperature attributes and fans' fault indication.
Patches #5-#8 perform small improvements in hwmon and thermal code such
as using a more indicative name for cooling devices.
Patch #9 exposes fans' fault indication via hwmon.
Patch #10 exposes modules' temperature attributes via hwmon.
Patch #11 adds an hwmon label to modules' temperature sensor. This helps
to parse the output of utilities such as "sensors".
Patch #12 allows binding an external cooling device ("mlxreg-fan") to
mlxsw thermal zone. This will allow the mlxsw thermal zone to change the
cooling level of cooling devices not programmed via switch registers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:56 +0000 (11:28 +0000)]
mlxsw: core: Allow thermal zone binding to an external cooling device
Allow thermal zone binding to an external cooling device from the
cooling devices white list.
It provides support for Mellanox next generation systems on which
cooling device logic is not controlled through the switch registers.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:55 +0000 (11:28 +0000)]
mlxsw: core: Add QSFP module temperature label attribute to hwmon
Add label attribute to hwmon object for exposing QSFP module's
temperature sensor name. Modules are labeled as "front panel xxx". The
label is used by utilities such as "sensors":
front panel 001: +0.0C (crit = +0.0C, emerg = +0.0C)
..
front panel 020: +31.0C (crit = +70.0C, emerg = +80.0C)
..
front panel 056: +41.0C (crit = +70.0C, emerg = +80.0C)
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:54 +0000 (11:28 +0000)]
mlxsw: core: Extend hwmon interface with QSFP module temperature attributes
Add new attributes to hwmon object for exposing QSFP module temperature
input, fault indication, critical and emergency thresholds. Temperature
input and fault indication are read from Management Temperature Bulk
Register. Temperature thresholds are read from Management Cable Info
Access Register.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:53 +0000 (11:28 +0000)]
mlxsw: core: Extend hwmon interface with fan fault attribute
Add new fan hwmon attribute for exposing fan faults (fault indication is
read from Fan Out of Range Event Register).
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:52 +0000 (11:28 +0000)]
mlxsw: core: Rename cooling device
Rename cooling device from "Fan" to "mlxsw_fan". The name "Fan" is too
common and can mislead users who interpret it. For example, the name
"Fan" could be used by ACPI.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:51 +0000 (11:28 +0000)]
mlxsw: core: Replace thermal temperature trips with defines
Replace thermal hardcoded temperature trip values with defines.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:50 +0000 (11:28 +0000)]
mlxsw: core: Modify thermal zone definition
Modify thermal zone trip points setting for better alignment with system
thermal requirement.
Add hysteresis thresholds for thermal trips in order to avoid throttling
around the thermal trip point. If hysteresis temperature is not considered,
PWM can flip up and down at the thermal trip point boundary.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
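The effect of hysteresis described in the thermal zone commit above can be
sketched as follows. This is a generic illustration, not the mlxsw code:
struct trip, trip_update() and any temperatures used with them are
assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical trip point; values used with it are illustrative only. */
struct trip {
	int temp;	/* trip temperature, millidegrees Celsius */
	int hyst;	/* hysteresis, millidegrees Celsius */
};

/*
 * Return the new "tripped" state. The trip activates at temp and only
 * deactivates once the reading falls below temp - hyst, so readings that
 * hover around temp do not toggle the state on every poll.
 */
bool trip_update(const struct trip *t, bool tripped, int reading)
{
	if (!tripped)
		return reading >= t->temp;
	return reading >= t->temp - t->hyst;
}
```

Without the hyst term, a reading oscillating by a fraction of a degree
around temp would switch the cooling level on each sample.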
Vadim Pasternak [Wed, 13 Feb 2019 11:28:48 +0000 (11:28 +0000)]
mlxsw: core: Set different thermal polling time based on bus frequency capability
Add low frequency bus capability in order to allow core functionality
separation based on bus type. The driver can run over PCIe, which is
considered a high frequency bus, or over I2C, which is considered a low
frequency bus. In the latter case, time settings such as the thermal
polling interval should be increased.
Use a different thermal polling interval based on bus type: 20 seconds
for I2C and 1 second for PCIe.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
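The bus-dependent polling interval from the commit above reduces to a
single selection. In this sketch only the 20 second and 1 second values
come from the commit message; the function name and the boolean
capability flag are hypothetical.

```c
#include <stdbool.h>

/*
 * Pick the thermal polling interval in milliseconds. A low frequency bus
 * (I2C) is polled every 20 seconds, a high frequency bus (PCIe) every
 * second, as stated in the commit message.
 */
unsigned int thermal_poll_interval_ms(bool low_frequency_bus)
{
	return low_frequency_bus ? 20000u : 1000u;
}
```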
Vadim Pasternak [Wed, 13 Feb 2019 11:28:47 +0000 (11:28 +0000)]
mlxsw: core: Add API for QSFP module temperature thresholds reading
Add new API to read QSFP module's temperature thresholds - warning and
critical.
New internal API reads the temperature thresholds from the modules,
which are equipped with the thermal sensor. These thresholds will be
exposed via hwmon subsystem.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:46 +0000 (11:28 +0000)]
mlxsw: reg: Add Fan Out of Range Event Register
Add FORE (Fan Out of Range Event Register), which is used for fan fault
reading.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:45 +0000 (11:28 +0000)]
mlxsw: reg: Add Management Temperature Bulk Register
Add MTBR (Management Temperature Bulk Register), which is used for port
temperature reading in a bulk mode.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Wed, 13 Feb 2019 11:28:44 +0000 (11:28 +0000)]
mlxsw: spectrum: Move QSFP EEPROM definitions to common location
Move QSFP EEPROM definitions to common location from the spectrum driver
in order to make them available for other mlxsw modules. They are common
for all kinds of chips and relate to SFF specifications 8024,
8436, 8472, 8636, rather than to chip type.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Feb 2019 06:28:11 +0000 (22:28 -0800)]
Merge tag 'batadv-next-for-davem-20190213' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- fix memory leak in batadv_dat_put_dhcp, by Martin Weinelt
- fix typo, by Sven Eckelmann
- netlink restructuring patch series (part 2), by Sven Eckelmann
(19 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 13 Feb 2019 08:59:31 +0000 (11:59 +0300)]
test_objagg: Uninitialized variable in error handling
We need to set the error message on this path otherwise some of the
callers, such as test_hints_case(), print from an uninitialized pointer.
We had a similar bug earlier and set "errmsg" to NULL in the caller,
test_delta_action_item(). That code is no longer required so I have
removed it.
Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 13 Feb 2019 08:58:20 +0000 (11:58 +0300)]
test_objagg: Test the correct variable
There is a typo here. We intended to check "objagg2" but we instead
test "objagg" which is not an error pointer.
Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 13 Feb 2019 08:56:50 +0000 (11:56 +0300)]
lib: objagg: Fix an error code in objagg_hints_get()
We need to set the error code on this path otherwise we return
ERR_PTR(0) which would result in a NULL dereference in the caller.
Fixes: 9069a3817d82 ("lib: objagg: implement optimization hints assembly and use hints for object creation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
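The ERR_PTR(0) problem fixed in objagg_hints_get() above can be reproduced
with simplified user-space stand-ins for the kernel helpers. The helper
bodies below are reduced versions written for this sketch, and lookup()
is a hypothetical function, not the objagg code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

#define MAX_ERRNO 4095

/* Simplified user-space stand-ins for the kernel helpers of the same name. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup whose failure path may forget to set err. */
void *lookup(bool fail, bool set_err)
{
	static int object;
	int err = 0;

	if (fail) {
		if (set_err)
			err = -ENOMEM;
		return ERR_PTR(err);	/* with err == 0 this is just NULL */
	}
	return &object;
}
```

A caller that checks only IS_ERR() treats the NULL from the buggy path as
a valid object and dereferences it.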
Vishal Kulkarni [Wed, 13 Feb 2019 05:18:52 +0000 (10:48 +0530)]
cxgb4vf: Few more link management changes.
CR4_QSFP 10G Speed technology should be 10000baseKR_Full.
Also report available FEC modes.
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Feb 2019 06:00:17 +0000 (22:00 -0800)]
Merge branch 'pagepool-api-and-dma-address-storage'
Jesper Dangaard Brouer says:
====================
Fix page_pool API and dma address storage
As pointed out by David Miller in [1] the current page_pool implementation
stores dma_addr_t in page->private. This won't work on 32-bit platforms with
64-bit DMA addresses since the page->private is an unsigned long and the
dma_addr_t a u64.
Since no driver is yet using the DMA mapping capabilities of the API let's
fix this by storing the information in 'struct page' and use that to store
and retrieve DMA addresses from network drivers.
As long as the addresses returned from dma_map_page() are aligned, the first
bit, used by the compound pages code, should not be set.
Ilias tested the first two patches on Espressobin driver mvneta, for which
we have patches for using the DMA API of page_pool.
[1]: https://lore.kernel.org/netdev/20181207.230655.1261252486319967024.davem@davemloft.net/
====================
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer [Wed, 13 Feb 2019 01:55:50 +0000 (02:55 +0100)]
page_pool: use DMA_ATTR_SKIP_CPU_SYNC for DMA mappings
As pointed out by Alexander Duyck, the DMA mapping done in page_pool needs
to use the DMA attribute DMA_ATTR_SKIP_CPU_SYNC.
The principle behind page_pool keeping the pages mapped is that the
driver takes over the DMA-sync steps.
Reported-by: Alexander Duyck <alexander.duyck@gmail.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilias Apalodimas [Wed, 13 Feb 2019 01:55:45 +0000 (02:55 +0100)]
net: page_pool: don't use page->private to store dma_addr_t
As pointed out by David Miller the current page_pool implementation
stores dma_addr_t in page->private.
This won't work on 32-bit platforms with 64-bit DMA addresses since the
page->private is an unsigned long and the dma_addr_t a u64.
A previous patch is adding dma_addr_t on struct page to accommodate this.
This patch adapts the page_pool related functions to use the newly added
struct for storing and retrieving DMA addresses from network drivers.
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
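The truncation that motivates the page_pool change above can be shown
with fixed-width types. The typedefs below are assumptions that model a
32-bit unsigned long and a 64-bit dma_addr_t; they are not the kernel's
definitions, and roundtrip_via_private() is a hypothetical helper.

```c
#include <stdint.h>

/* Stand-ins for a 32-bit platform with 64-bit DMA addresses. */
typedef uint32_t ulong32_t;	/* models unsigned long on 32-bit */
typedef uint64_t dma_addr64_t;	/* models dma_addr_t with 64-bit DMA */

/*
 * Store a DMA address in a field as wide as page->private on such a
 * platform and read it back. The upper 32 bits do not survive.
 */
dma_addr64_t roundtrip_via_private(dma_addr64_t addr)
{
	ulong32_t private_field = (ulong32_t)addr;

	return (dma_addr64_t)private_field;
}
```

Any address at or above 4 GiB comes back changed, which is why a field of
type dma_addr_t is needed instead.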
Jesper Dangaard Brouer [Wed, 13 Feb 2019 01:55:40 +0000 (02:55 +0100)]
mm: add dma_addr_t to struct page
The page_pool API is using page->private to store DMA addresses.
As pointed out by David Miller we can't use that on 32-bit architectures
with 64-bit DMA.
This patch adds a new dma_addr_t struct to allow storing DMA addresses.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Wed, 13 Feb 2019 01:42:00 +0000 (01:42 +0000)]
net: sched: remove duplicated include from cls_api.c
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Wed, 13 Feb 2019 00:23:52 +0000 (00:23 +0000)]
flow_offload: fix block stats
With the introduction of flow_stats_update(), drivers now update the stats
fields of the passed tc_cls_flower_offload struct, rather than call
tcf_exts_stats_update() directly to update the stats of offloaded TC
flower rules. However, if multiple qdiscs are registered to a TC shared
block and a flower rule is applied, then, when getting stats for the rule,
multiple callbacks may be made.
Take this into consideration by modifying flow_stats_update to gather the
stats from all callbacks. Currently, the values in tc_cls_flower_offload
only account for the last stats callback in the list.
Fixes: 3b1903ef97c0 ("flow_offload: add statistics retrieval infrastructure and use it")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
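The difference between overwriting and gathering stats across several
callbacks, as described in the flow_offload commit above, can be sketched
as follows. struct stats and both functions are hypothetical reductions
for illustration, not the kernel's flow_stats_update().

```c
#include <stdint.h>

/* Hypothetical, reduced stats record; field names are illustrative. */
struct stats {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t lastused;
};

/* Overwriting keeps only the values from the most recent callback. */
void stats_overwrite(struct stats *s, uint64_t bytes, uint64_t pkts,
		     uint64_t lastused)
{
	s->pkts = pkts;
	s->bytes = bytes;
	s->lastused = lastused;
}

/* Accumulating sums the counters and keeps the latest timestamp. */
void stats_accumulate(struct stats *s, uint64_t bytes, uint64_t pkts,
		      uint64_t lastused)
{
	s->pkts += pkts;
	s->bytes += bytes;
	if (lastused > s->lastused)
		s->lastused = lastused;
}
```

With two callbacks on a shared block, the first variant reports only the
second callback's counters, while the second reports their sum.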
Vlad Buslov [Tue, 12 Feb 2019 21:39:06 +0000 (23:39 +0200)]
net: sched: flower: only return error from hw offload if skip_sw
Recently introduced tc_setup_flow_action() can fail when parsing tcf_exts
on some unsupported action commands. However, this should not affect the
case when user did not explicitly request hw offload by setting skip_sw
flag. Modify tc_setup_flow_action() callers to only propagate the error if
skip_sw flag is set for filter that is being offloaded, and set extack
error message in that case.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Fixes: 3a7b68617de7 ("cls_api: add translator to flow_action representation")
Signed-off-by: David S. Miller <davem@davemloft.net>
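The rule from the flower commit above (propagate the offload setup error
only when skip_sw is set) can be reduced to the following sketch.
offload_result() is a hypothetical helper; the real callers also set an
extack message, which is omitted here.

```c
#include <stdbool.h>
#include <errno.h>

/*
 * Decide what to report after trying to set up hardware offload.
 * setup_err is 0 on success or a negative errno. Without skip_sw the
 * filter can still run in software, so the error is swallowed.
 */
int offload_result(int setup_err, bool skip_sw)
{
	if (setup_err && skip_sw)
		return setup_err;
	return 0;
}
```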
Colin Ian King [Tue, 12 Feb 2019 16:08:07 +0000 (16:08 +0000)]
qlge: fix some indentation issues
There are some statements that are indented incorrectly. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 12 Feb 2019 16:01:53 +0000 (16:01 +0000)]
qed: fix indentation issue with statements in an if-block
There are some statements in an if-block that are not correctly
indented. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 16:01:11 +0000 (00:01 +0800)]
net: ixp4xx_eth: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in eth_txdone_irq() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 16:00:02 +0000 (00:00 +0800)]
net: macb: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in at91ether_interrupt() when
skb xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:59:04 +0000 (23:59 +0800)]
net: sis: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called when skb xmit is done. It makes
drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:56:53 +0000 (23:56 +0800)]
net: fealnx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in intr_handler() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:56:00 +0000 (23:56 +0800)]
net: moxa: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in moxart_tx_finished() when
skb xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:52:53 +0000 (23:52 +0800)]
net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in mace_interrupt() when skb
xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:51:45 +0000 (23:51 +0800)]
net: atheros: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called when skb xmit is done. It makes
drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:49:57 +0000 (23:49 +0800)]
net: qualcomm: emac: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in emac_mac_tx_process() when
skb xmit is done. It makes drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 12 Feb 2019 15:47:31 +0000 (23:47 +0800)]
net: neterion: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called when skb xmit is done. It makes
drop profiles (dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 14 Feb 2019 00:17:53 +0000 (19:17 -0500)]
Merge branch 'phy-25g'
Maxime Chevallier says:
====================
net: phy: Add 2.5G/5GBASET PHYs support
The 802.3bz standard defines 2 modes based on the NBASET alliance work
that allow using 2.5Gbps and 5Gbps speeds on Cat 5e, 6 and 7 cables.
This series adds the necessary infrastructure to handle these modes with
C45 PHYs. This series was originally part of a bigger one, that has
seen 2 iterations [1] [2] that added support for these modes on Marvell
Alaska PHYs.
Following some discussions with Heiner and Andrew [3], we decided to
split-out the generic parts so that we can work together on the
following steps to get these modes fully working with Aquantia and
Marvell PHYs.
The first 3 patches are reworking some of the internal network phy
infrastructure to handle the new modes in a more generic way.
The 4th patch adds all the C45 register definitions and accesses that
follow the 802.3bz standard to support 2.5GBASET and 5GBASET.
[1] : https://lore.kernel.org/netdev/20190118152352.26417-1-maxime.chevallier@bootlin.com/
[2] : https://lore.kernel.org/netdev/20190207094939.27369-1-maxime.chevallier@bootlin.com/
[3] : https://lore.kernel.org/netdev/81c340ea-54b0-1abf-94af-b8dc4ee83e3a@gmail.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Mon, 11 Feb 2019 14:25:29 +0000 (15:25 +0100)]
net: phy: Add generic support for 2.5GBaseT and 5GBaseT
The 802.3bz specification, based on previous work by the NBASET alliance,
defines the 2.5GBaseT and 5GBaseT link modes for ethernet traffic on
cat5e, cat6 and cat7 cables.
These modes integrate with the already defined C45 MDIO PMA/PMD registers
set that added 10G support, by defining some previously reserved bits,
and adding a new register (2.5G/5G Extended abilities).
This commit adds the required definitions in include/uapi/linux/mdio.h
to support these modes, and detect when a link-partner advertises them.
It also adds support for these modes in the generic C45 PHY
infrastructure.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Mon, 11 Feb 2019 14:25:28 +0000 (15:25 +0100)]
net: phy: Extract genphy_c45_pma_read_abilities from marvell10g
Marvell 10G PHY driver has a generic way of initializing the supported
link modes by reading the PHY's C45 PMA abilities. This can be made
generic, since these registers are part of the 802.3 specifications.
This commit extracts the config_init link_mode initialization code from
marvell10g and uses it to introduce the genphy_c45_pma_read_abilities
function.
Only PMA modes are read; it's still up to the caller to set the Pause
parameters.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Mon, 11 Feb 2019 14:25:27 +0000 (15:25 +0100)]
net: phy: Move of_set_phy_eee_broken to phy-core.c
Since of_set_phy_supported was moved to phy-core.c, we can also move
of_set_phy_eee_broken to the same location, so that we have all OF
functions in the same place.
This patch doesn't intend to introduce any change in behaviour.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Mon, 11 Feb 2019 14:25:26 +0000 (15:25 +0100)]
net: phy: Mask-out non-compatible modes when setting the max-speed
When setting a PHY's max speed using either the max-speed DT property
or ethtool, we should mask-out all non-compatible modes according to the
settings table, instead of just the 10/100BASET modes.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
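Masking out every mode above the maximum speed by walking a settings
table, as described in the max-speed commit above, can be sketched as
follows. The table, the bit assignments and limit_supported() are
hypothetical; the kernel uses its own settings table and link mode
bitmaps.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical settings table: one bit per link mode, speed in Mb/s. */
struct mode_setting {
	unsigned int speed;
	uint32_t bit;
};

static const struct mode_setting settings[] = {
	{ 10,    1u << 0 },
	{ 100,   1u << 1 },
	{ 1000,  1u << 2 },
	{ 2500,  1u << 3 },
	{ 5000,  1u << 4 },
	{ 10000, 1u << 5 },
};

/* Clear every supported-mode bit whose speed exceeds max_speed. */
uint32_t limit_supported(uint32_t supported, unsigned int max_speed)
{
	size_t i;

	for (i = 0; i < sizeof(settings) / sizeof(settings[0]); i++)
		if (settings[i].speed > max_speed)
			supported &= ~settings[i].bit;
	return supported;
}
```

Because the loop is driven by the table, newly added modes such as 2.5G
and 5G are masked without special cases.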
Leon Romanovsky [Mon, 11 Feb 2019 11:56:07 +0000 (13:56 +0200)]
net/mlx5: Align ODP capability function with netdev coding style
Update newly introduced function to be aligned to netdev coding style.
Fixes: 46861e3e88be ("net/mlx5: Set ODP SRQ support in firmware")
Reported-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
David S. Miller [Wed, 13 Feb 2019 01:31:30 +0000 (17:31 -0800)]
Merge branch 'net-Remove-unused-variables'
Florian Fainelli says:
====================
Remove unused variables
This removes unused variables from mlxsw and ethsw after the recent
removal of SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS. Build scripts are now
fixed to take care of those warnings :).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>