David S. Miller [Wed, 21 Aug 2019 19:58:39 +0000 (12:58 -0700)]
Merge branch 'mlxsw-Add-devlink-trap-support'
Ido Schimmel says:
====================
mlxsw: Add devlink-trap support
This patchset adds devlink-trap support in mlxsw.
Patches #1-#4 add the necessary APIs and defines in mlxsw.
Patch #5 implements devlink-trap support for layer 2 drops. More drops
will be added in the future.
Patches #6-#7 add selftests to make sure that all the new code paths are
exercised and that the feature is working as expected.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:37 +0000 (10:19 +0300)]
selftests: mlxsw: Add a test case for devlink-trap
Test generic devlink-trap functionality over mlxsw. These tests are not
specific to a single trap, but do not check the devlink-trap common
infrastructure either.
Currently, the only test case is device deletion (by reloading the
driver) while packets are being trapped.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:36 +0000 (10:19 +0300)]
selftests: mlxsw: Add test cases for devlink-trap L2 drops
Test that each supported packet trap is triggered under the right
conditions and that packets are indeed dropped and not forwarded.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:35 +0000 (10:19 +0300)]
mlxsw: spectrum: Add devlink-trap support
Register supported packet traps (layer 2 drops only, currently) and
associated trap group with devlink during driver initialization.
The amount of traffic generated by these packet drop traps is capped at
10Kpps to ensure the CPU is not overwhelmed by incoming packets.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:34 +0000 (10:19 +0300)]
mlxsw: Add trap group for layer 2 discards
Discard trap groups are defined in a different enum so that they could
all share the same policer ID: MLXSW_REG_HTGT_TRAP_GROUP_MAX + 1.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:33 +0000 (10:19 +0300)]
mlxsw: Add layer 2 discard trap IDs
Add the trap IDs used to report layer 2 drops.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:32 +0000 (10:19 +0300)]
mlxsw: reg: Add new trap actions
Subsequent patches will add discard traps support in mlxsw. The driver
cannot configure such traps with a normal trap action, but needs to use
exception trap action, which also increments an error counter.
On the other hand, when these traps are initialized or set to drop
action, they should use the default drop action set by the firmware.
This guarantees that when the feature is disabled we get the exact same
behavior as before the feature was introduced.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 21 Aug 2019 07:19:31 +0000 (10:19 +0300)]
mlxsw: core: Add API to set trap action
Up until now the action of a trap was never changed during its lifetime.
This is going to change by subsequent patches that will allow devlink to
control the action of certain traps.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Aug 2019 05:59:45 +0000 (22:59 -0700)]
Merge tag 'mlx5-updates-2019-08-15' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2019-08-15
This patchset introduces changes in mlx5 devlink health reporters.
The highlight of these changes is adding a new reporter: RX reporter
mlx5 RX reporter: reports and recovers from timeouts and RX completion
error.
1) Perform TX reporter cleanup. In order to maintain the
code flow as similar as possible between RX and TX reporters, start the
set with cleanup.
2) Prepare for code sharing, generalize and move shared
functionality.
3) Refactor and extend TX reporter diagnostics information
to align the TX reporter diagnostics output with the RX reporter's
diagnostics output.
4) Add helper functions Patch 11: Add RX reporter, initially
supports only the diagnostics call back.
5) Change ICOSQ (Internal Operations Send Queue) open/close flow to
avoid race between interface down and completion error recovery.
6) Introduce recovery flows for RX ring population timeout on ICOSQ,
and for completion errors on ICOSQ and on RQ (Regular receive queues).
7) Include RX reporters in mlx5 documentation.
8) Last two patches of this series, are trivial fixes for previously
submitted patches on this release cycle.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 21 Aug 2019 00:02:44 +0000 (17:02 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2019-08-20
This series contains updates to ice driver only.
Brett fixes the detection of a hung transmit ring by checking the
software based tail (next_to_use) to determine if there is pending work.
Updates the driver to assume that using more than one receive queue per
receive ring container is a rare case, so use unlikely() in the case
were we actually need to divide our budget for multiple queues. Fixed
an issue where the write back on ITR bit was not being set when
interrupts are disabled, which was causing only write backs when polling
only when a cache line is filled. Cleans up unnecessary wait times
during VF bring up and reset paths. Increased the mailbox size for
receive queues that are used to communicate with VFs to accommodate the
large number of VFs that the driver can support.
Akeem restructures the initialization flows for VFs, including how VFs
are configured and resources allocated to improve flows so that when we
clean up resources, we do not try to free resources that were never
allocated. Organizes code to ensure that VF specific code is located in
the SR-IOV specific file.
Paul fixes an issue when setting the pause parameter which was
incorrectly blocking users from changing receive or transmit pause
settings. Ensure register access for MSIX vector index is only done in
the PF space and not absolute device space.
Usha fixes a potential kernel hang in the DCB rebuild path when in CEE
mode, where the ETS recommended DCB configuration is not being set or
set correctly.
Mitch updates the driver to process all receive descriptors, regardless
of the size of the associated data.
Tony fixes and issue during the reset/rebuild path of a PF VSI where we
were assuming that the PF VSI was always to be enabled, which can
attempt to bring up a PF VSI on a downed interface which can lead to
various crashes.
Pawel fixes up variable definitions to match the type of data being
stored.
v2: Dropped patch 1 of the series to add ethtool support to query/add
channels on a VSI, while we re-qork the functionality to match the
ethtool expected behavior to report combined (Tx and Rx) numbers.
v3: Updated patch 4 to use kzalloc() and kfree() instead devm_kzalloc()
and devm_kfree().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Brett Creeley [Thu, 25 Jul 2019 08:55:41 +0000 (01:55 -0700)]
ice: improve print for VF's when adding/deleting MAC filters
When we fail to add/delete MAC filters in the VF, the print doesn't
distinguish between the two. Fix that by printing whether or not we
failed to add/delete the MAC filter respectively.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Pawel Kaminski [Thu, 25 Jul 2019 08:55:40 +0000 (01:55 -0700)]
ice: Change type for queue counts
These queue variables are being assigned values that are type u16.
Change the local variables to match these types. Since these
represent queue counts, they should never be negative.
Signed-off-by: Pawel Kaminski <pawel.kaminski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Thu, 25 Jul 2019 08:55:39 +0000 (01:55 -0700)]
ice: Move VF resources definition to SR-IOV specific file
In order to use some of the VF resources definition in the SR-IOV specific
virtchnl header file, this patch moves applicable code to
ice_virtchnl_pf.h file accordingly... and they should have been defined in
the destination file originally.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 25 Jul 2019 08:55:38 +0000 (01:55 -0700)]
ice: Increase size of Mailbox receive queue for many VFs
Currently we use the ICE_MBXQ_LEN for both the Mailbox send and receive
queues that are used to communicate with VFs. This is fine for the send
queue because the PF driver will lock the queue for every single send,
but for the Mailbox receive queue every VF is posting to its Mailbox
send queue and the hardware is then handing the message to the PF on its
Mailbox receive queue. This becomes a problem with many VFs because it
seems to overburden the Mailbox receive queue on the PF. Fix this by
increasing the Mailbox receive queue for the PF to 512 entries.
The number 512 was determined based on the number of VFs supported by
the device. We can have a total of 256 VFs so in the worst case this
allows the VFs to put 2 messages in the PFs Mailbox receive queue at the
same time.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 25 Jul 2019 08:55:37 +0000 (01:55 -0700)]
ice: Reduce wait times during VF bringup/reset
Currently there are a couple places where the VF is waiting too long when
checking the status of registers. This is causing the AVF driver to
spin for longer than necessary in the __IAVF_STARTUP state. Sometimes
it causes the AVF to go into the __IAVF_COMM_FAILED, which may retrigger
the __IAVF_STARTUP state. Try to reduce the chance of this happening by
removing unnecessary wait times in VF bringup/resets.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Paul Greenwalt [Thu, 25 Jul 2019 08:55:36 +0000 (01:55 -0700)]
ice: update GLINT_DYN_CTL and GLINT_VECT2FUNC register access
Register access for GLINT_DYN_CTL and GLINT_VECT2FUNC should be within
the PF space and not the absolute device space.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Thu, 25 Jul 2019 08:55:35 +0000 (01:55 -0700)]
ice: Do not always bring up PF VSI in ice_ena_vsi()
During rebuild ice_ena_vsi() is called to recover the VSI state.
This function assumes the PF VSI is always to be enabled, however,
it's possible that during reset/rebuild the interface can be
brought down. If this occurs, we can attempt to bring up the PF
VSI on a downed interface which can lead to various crashes. If
the interface is not running, do not bring up the associated VSI.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mitch Williams [Thu, 25 Jul 2019 08:55:34 +0000 (01:55 -0700)]
ice: allow empty Rx descriptors
In some circumstances, the hardware will hand us a receive descriptor
which has no data attached, but is otherwise valid. The receive code was
improperly ignoring these descriptors, which result in an infinite loop.
To fix this, change the receive code to process all descriptors,
regardless of the size of the associated data. Add checks to the
memory-handling functions to allow for zero size.
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Usha Ketineni [Thu, 25 Jul 2019 08:55:33 +0000 (01:55 -0700)]
ice: Fix kernel hang with DCB reset in CEE mode
This patch fixes the set local MIB AQ call failures in the DCB rebuild path
by setting the defaults for the ETS recommended DCB configuration. Also,
willing bits for the DCB configuration needs to be set correctly. Resets
works fine in IEEE mode as the ETS recommended DCB configuration is
populated but not in CEE mode.
Without this patch, PFR causes the kernel hang in CEE mode.
Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 25 Jul 2019 08:55:32 +0000 (01:55 -0700)]
ice: Set WB_ON_ITR when we don't re-enable interrupts
Currently when busy polling is enabled we aren't setting/enabling
WB_ON_ITR in the driver. This doesn't break the driver, but it does
cause issues. If we don't enable WB_ON_ITR mode we will still get
write-backs from hardware during polling when a cache line has been
filled, but if a cache line is not filled we will not get the
write-back because WB_ON_ITR is not set. Fix this by enabling
WB_ON_ITR in the driver when interrupts are disabled.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 20 Aug 2019 21:01:56 +0000 (14:01 -0700)]
Merge tag 'linux-can-next-for-5.4-
20190820' of git://git./linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2019-08-20
this is a pull request for net-next/master consisting of 18 patches.
The first patch is by Geert Uytterhoeven, it removes the unused platform
data support from the rcar_can driver.
A patch by Nishka Dasgupta marks the structure peak_pciec_i2c_bit_ops in
the peak_pci driver as constant.
A patch by me removes the custom DMA support from the hi311x driver.
The next 4 patches target the tcan4x5x driver and are also by me, they
first clean up the driver a bit, and then add missing error handling and
fix a bug in the length calculation in the regmap callbacks.
The next 2 patches are by me for the m_can_platform driver, they also
remove unneeded casts and add missing error handling.
The remaining 9 patches all target the mcp251x driver. The first 5 are
clean up patches by me, the next relaxes the timing in the
mcp251x_hw_reset() function. Alexander Shiyan's patch improves the name
which is used while registering the interrupt handler. Phil Elwell's
patch improves the mcp251x_open() function to use the DT-supplied
interrupt flags instead of hard coding them. The final patch is again by
me, it removes the custom DMA support from the hi311x driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Greenwalt [Thu, 25 Jul 2019 08:55:31 +0000 (01:55 -0700)]
ice: fix set pause param autoneg check
When ETHTOOL_GLINKSETTINGS is defined get pause param pause->autoneg
reports SW configured setting, however when not defined get pause param
pause->autoneg reports the link status. Set pause param needs to compare
pause->autoneg with the same source as get pause param to block the user
from changing autoneg with the set pause param option, or the user
may be incorrectly blocked from changing Rx|Tx pause settings.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 20 Aug 2019 20:51:46 +0000 (13:51 -0700)]
Merge branch 's390-net-next'
Julian Wiedmann says:
====================
s390/net: updates 2019-08-20
please apply the following patches to net-next. This series brings a mix
of cleanups and small improvements for various parts of qeth's control
path. Also, a minor cleanup for ctcm and lcs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:43 +0000 (16:46 +0200)]
s390/lcs: don't use intparm for channel IO
lcs passes an intparm when calling ccw_device_*(), even though lcs_irq()
later makes no use of this.
To reduce the confusion, consistently pass 0 as intparm instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:42 +0000 (16:46 +0200)]
s390/ctcm: don't use intparm for channel IO
ctcm passes an intparm when calling ccw_device_*(), even though
ctcm_irq_handler() later makes no use of this.
To reduce the confusion, consistently pass 0 as intparm instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:41 +0000 (16:46 +0200)]
s390/qeth: streamline control code for promisc mode
We have logic to determine the desired promisc mode in _each_ code path.
Change things around so that there is a clean split between
(a) high-level code that selects the new mode, and (b) implementations
of the various mechanisms to program this mode.
This also keeps qeth_promisc_to_bridge() from polluting the debug logs
on each RX modeset.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:40 +0000 (16:46 +0200)]
s390/qeth: get vnicc sub-cmd type from reply data
When processing the reply for a vnicc cmd, there's no need to remember
which specific sub-cmd type we initially sent. The reply itself contains
all the needed information.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:39 +0000 (16:46 +0200)]
s390/qeth: merge qeth_reply struct into qeth_cmd_buffer
Except for card->read_cmd, every cmd we issue now passes through
qeth_send_control_data() and allocates a qeth_reply struct. The way we
use this struct requires additional refcounting, and pointer tracking.
Clean up things by moving most of qeth_reply's content into the main
cmd struct. This keeps things in one place, saves us the additional
refcounting and simplifies the overall code flow.
A nice little benefit is that we can now match incoming replies against
the pending requests themselves, without caching the requests' seqnos.
The qeth_reply struct stays around for a little bit longer in a shrunk
form, to avoid touching every single callback.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:38 +0000 (16:46 +0200)]
s390/qeth: keep cmd alive after IO completion
Current code releases the cmd struct after its initial IO has completed.
Any reply processing is done independently, using a separate qeth_reply
struct.
In preparation for merging the cmd and reply structs together, take an
additional reference on the cmd object so that it stays around all the
way until qeth_send_control_data() returns.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:37 +0000 (16:46 +0200)]
s390/qeth: use correct length field in SNMP cmd callback
qeth_snmp_command_cb() is the only cmd callback that pulls the reply's
data length from a low-level transport header field. This requires
additional complexity (ie. reply->offset) to make the header accessible
to what is supposed to be a pure IPA cmd callback.
Adapter cmds have a length field in their sub-cmd header, get the data
length from there instead.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:36 +0000 (16:46 +0200)]
s390/qeth: propagate length of processed cmd IO data to callback
When an cmd IO completes in qeth_irq(), calculate how much data was
processed by the device and pass this value to the cmd's callback.
This allows cmds that retrieve data from the device to check whether
sufficient data was received, so we do that in qeth_read_conf_data_cb().
Suggested-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Tue, 20 Aug 2019 14:46:35 +0000 (16:46 +0200)]
s390/qeth: use node_descriptor struct
Rather than fumbling with hard-coded offsets, use the proper struct to
access the retrieved RCD information.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
YueHaibing [Tue, 20 Aug 2019 14:14:46 +0000 (22:14 +0800)]
netdevsim: Fix build error without CONFIG_INET
If CONFIG_INET is not set, building fails:
drivers/net/netdevsim/dev.o: In function `nsim_dev_trap_report_work':
dev.c:(.text+0x67b): undefined reference to `ip_send_check'
Use ip_fast_csum instead of ip_send_check to avoid
dependencies on CONFIG_INET.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: da58f90f11f5 ("netdevsim: Add devlink-trap support")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gavi Teitz [Sun, 11 Aug 2019 06:21:20 +0000 (09:21 +0300)]
net/mlx5: Fix the order of fc_stats cleanup
Previously, mlx5_cleanup_fc_stats() would cleanup the flow counter
pool beofre releasing all the counters to it, which would result in
flow counter bulks not getting freed. Resolve this by changing the
order in which elements of fc_stats are cleaned up, so that the flow
counter pool is cleaned up after all the counters are released.
Also move cleanup actions for freeing the bulk query memory and
destroying the idr to the end of mlx5_cleanup_fc_stats().
Signed-off-by: Gavi Teitz <gavi@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Vlad Buslov [Mon, 12 Aug 2019 12:38:38 +0000 (15:38 +0300)]
net/mlx5e: Fix deallocation of non-fully init encap entries
Recent rtnl lock dependency refactoring changed encap entry attach code to
insert encap entry to hash table before it was fully initialized in order
to allow concurrent tc users to wait on completion for encap entry to
finish initialization. That change required all the users of encap entry to
obtain reference to it first and for caller that creates encap to put
reference to it on error, instead of freeing the entry memory directly.
However, releasing reference to such encap entry that wasn't fully
initialized causes NULL pointer dereference in
mlx5e_rep_encap_entry_detach() which expects e->out_dev to be set and encap
to be attached to nhe:
[ 1092.454517] BUG: unable to handle page fault for address:
00000000000420e8
[ 1092.454571] #PF: supervisor read access in kernel mode
[ 1092.454602] #PF: error_code(0x0000) - not-present page
[ 1092.454632] PGD
800000083032c067 P4D
800000083032c067 PUD
84107d067 PMD 0
[ 1092.454673] Oops: 0000 [#1] SMP PTI
[ 1092.454697] CPU: 20 PID: 22393 Comm: tc Not tainted 5.3.0-rc3+ #589
[ 1092.454733] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 1092.454806] RIP: 0010:mlx5e_rep_encap_entry_detach+0x1c/0x630 [mlx5_core]
[ 1092.454845] Code: be f4 ff ff ff e9 11 ff ff ff 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 89 f3 48 83 ec 30 <48> 8b 87 28 16 04 00 48 89 f7 48 05 d0 03 00 00 48 89
45 c8 e8 cb
[ 1092.454942] RSP: 0018:
ffffb6f08421f5a0 EFLAGS:
00010286
[ 1092.454974] RAX:
0000000000000000 RBX:
ffff8ab668644e00 RCX:
ffffb6f08421f56c
[ 1092.455013] RDX:
ffff8ab668644e40 RSI:
ffff8ab668644e00 RDI:
0000000000000ac0
[ 1092.455053] RBP:
ffffb6f08421f5f8 R08:
0000000000000001 R09:
0000000000000000
[ 1092.455092] R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000000ac0
[ 1092.455131] R13:
00000000ffffff9b R14:
ffff8ab63f200ac0 R15:
ffff8ab668644e40
[ 1092.455171] FS:
00007fa195bdc480(0000) GS:
ffff8ab66fa00000(0000) knlGS:
0000000000000000
[ 1092.455216] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 1092.455249] CR2:
00000000000420e8 CR3:
0000000867522001 CR4:
00000000001606e0
[ 1092.455288] Call Trace:
[ 1092.455315] ? __mutex_unlock_slowpath+0x4d/0x2a0
[ 1092.455365] mlx5e_encap_dealloc.isra.0+0x31/0x60 [mlx5_core]
[ 1092.455424] mlx5e_tc_add_fdb_flow+0x596/0x750 [mlx5_core]
[ 1092.455484] __mlx5e_add_fdb_flow+0x152/0x210 [mlx5_core]
[ 1092.455534] mlx5e_configure_flower+0x4d5/0xe30 [mlx5_core]
[ 1092.455574] tc_setup_cb_call+0x67/0xb0
[ 1092.455601] fl_hw_replace_filter+0x142/0x300 [cls_flower]
[ 1092.455639] fl_change+0xd24/0x1bdb [cls_flower]
[ 1092.455675] tc_new_tfilter+0x3e0/0x970
[ 1092.455709] ? tc_del_tfilter+0x720/0x720
[ 1092.455735] rtnetlink_rcv_msg+0x389/0x4b0
[ 1092.455763] ? netlink_deliver_tap+0x95/0x400
[ 1092.455791] ? rtnl_dellink+0x2d0/0x2d0
[ 1092.455817] netlink_rcv_skb+0x49/0x110
[ 1092.455844] netlink_unicast+0x171/0x200
[ 1092.455872] netlink_sendmsg+0x224/0x3f0
[ 1092.455901] sock_sendmsg+0x5e/0x60
[ 1092.455924] ___sys_sendmsg+0x2ae/0x330
[ 1092.455950] ? task_work_add+0x43/0x50
[ 1092.455976] ? fput_many+0x45/0x80
[ 1092.456004] ? __lock_acquire+0x248/0x18e0
[ 1092.456033] ? find_held_lock+0x2b/0x80
[ 1092.456058] ? task_work_run+0x7b/0xd0
[ 1092.456085] __sys_sendmsg+0x59/0xa0
[ 1092.457013] do_syscall_64+0x5c/0xb0
[ 1092.457924] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1092.458842] RIP: 0033:0x7fa195da27b8
[ 1092.459918] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 65 8f 0c 00 8b 00 85 c0 75 17 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83
ec 28 89 54
[ 1092.462634] RSP: 002b:
00007fff94409298 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
[ 1092.464011] RAX:
ffffffffffffffda RBX:
000000005d515b0e RCX:
00007fa195da27b8
[ 1092.465391] RDX:
0000000000000000 RSI:
00007fff94409300 RDI:
0000000000000003
[ 1092.466761] RBP:
0000000000000000 R08:
0000000000000001 R09:
0000000000000006
[ 1092.468121] R10:
0000000000404ec2 R11:
0000000000000246 R12:
0000000000000001
[ 1092.469456] R13:
0000000000480640 R14:
0000000000000016 R15:
0000000000000001
[ 1092.470766] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm
iw_cm ib_cm mlx5_ib ib_uverbs ib_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlx5_core kvm_intel kvm irqbypass crct10dif_pclmul mei_me crc32_pclmul crc32
c_intel igb iTCO_wdt ghash_clmulni_intel ses mlxfw intel_cstate iTCO_vendor_support ptp intel_uncore lpc_ich pps_core mei i2c_i801 joydev intel_rapl_perf ioatdma enclosure ipmi_ssif pcspkr dca wmi ipmi_
si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ast i2c_algo_bit drm_vram_helper ttm drm_kms_helper drm mpt3sas raid_class scsi_transport_sas
[ 1092.479618] CR2:
00000000000420e8
[ 1092.481214] ---[ end trace
ce2e0f4d9a67f604 ]---
To fix the issue, set e->compl_result to positive value after encap was
initialized successfully. Check e->compl_result value in
mlx5e_encap_dealloc() and only detach and dealloc encap if the value is
positive.
Fixes: d589e785baf5 ("net/mlx5e: Allow concurrent creation of encap entries")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Tue, 23 Jul 2019 08:46:02 +0000 (11:46 +0300)]
Documentation: net: mlx5: Devlink health documentation updates
Add documentation for devlink health rx reporter supported by mlx5.
Update tx reporter documentation.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Wed, 26 Jun 2019 20:21:40 +0000 (23:21 +0300)]
net/mlx5e: Report and recover from CQE with error on RQ
Add support for report and recovery from error on completion on RQ by
setting the queue back to ready state. Handle only errors with a
syndrome indicating the RQ might enter error state and could be
recovered.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Saeed Mahameed [Fri, 14 Jun 2019 22:21:15 +0000 (15:21 -0700)]
net/mlx5e: RX, Handle CQE with error at the earliest stage
Just to be aligned with the MPWQE handlers, handle RX WQE with error
for legacy RQs in the top RX handlers, just before calling skb_from_cqe().
CQE error handling will now be called at the same stage regardless of
the RQ type or netdev mode NIC, Representor, IPoIB, etc ..
This will be useful for down stream patch to improve error CQE
handling.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Tue, 25 Jun 2019 18:42:27 +0000 (21:42 +0300)]
net/mlx5e: Report and recover from rx timeout
Add support for report and recovery from rx timeout. On driver open we
post NOP work request on the rx channels to trigger napi in order to
fillup the rx rings. In case napi wasn't scheduled due to a lost
interrupt, perform EQ recovery.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Tue, 25 Jun 2019 14:44:28 +0000 (17:44 +0300)]
net/mlx5e: Report and recover from CQE error on ICOSQ
Add support for report and recovery from error on completion on ICOSQ.
Deactivate RQ and flush, then deactivate ICOSQ. Set the queue back to
ready state (firmware) and reset the ICOSQ and the RQ (software
resources). Finally, activate the ICOSQ and the RQ.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Tue, 2 Jul 2019 12:47:29 +0000 (15:47 +0300)]
net/mlx5e: Split open/close ICOSQ into stages
Align ICOSQ open/close behaviour with RQ and SQ. Split open flow into
open and activate where open handles creation and activate enables the
queue. Do a symmetric thing in close flow: split into close and
deactivate.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Tue, 25 Jun 2019 13:26:46 +0000 (16:26 +0300)]
net/mlx5e: Add support to rx reporter diagnose
Add rx reporter, which supports diagnose call-back. Diagnostics output
include: information common to all RQs: RQ type, RQ size, RQ stride
size, CQ size and CQ stride size. In addition advertise information per
RQ and its related icosq and attached CQ.
$ devlink health diagnose pci/0000:00:0b.0 reporter rx
Common config:
RQ:
type: 2 stride size: 2048 size: 8
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4308 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1032 HW status: 0
channel ix: 1 rqn: 4313 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1036 HW status: 0
channel ix: 2 rqn: 4318 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1040 HW status: 0
channel ix: 3 rqn: 4323 HW state: 1 SW state: 3 posted WQEs: 7 cc: 7 ICOSQ HW state: 1
CQ:
cqn: 1044 HW status: 0
$ devlink health diagnose pci/0000:00:0b.0 reporter rx -jp
{
"Common config": {
"RQ": {
"type": 2,
"stride size": 2048,
"size": 8
},
"CQ": {
"stride size": 64,
"size": 1024
}
},
"RQs": [ {
"channel ix": 0,
"rqn": 4308,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1032,
"HW status": 0
}
},{
"channel ix": 1,
"rqn": 4313,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1036,
"HW status": 0
}
},{
"channel ix": 2,
"rqn": 4318,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1040,
"HW status": 0
}
},{
"channel ix": 3,
"rqn": 4323,
"HW state": 1,
"SW state": 3,
"posted WQEs": 7,
"cc": 7,
"ICOSQ HW state": 1,
"CQ": {
"cqn": 1044,
"HW status": 0
}
} ]
}
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Thu, 11 Jul 2019 14:17:36 +0000 (17:17 +0300)]
net/mlx5e: Add helper functions for reporter's basics
Introduce helper functions for create and destroy reporters and update
channels. In the following patch, rx reporter is added and it will use
these helpers too.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Sun, 30 Jun 2019 12:08:00 +0000 (15:08 +0300)]
net/mlx5e: Add cq info to tx reporter diagnose
Add cq information to general diagnose output: CQ size and stride size.
Per SQ add information about the related CQ: cqn and CQ's HW status.
$ devlink health diagnose pci/0000:00:0b.0 reporter tx
Common Config:
SQ:
stride size: 64 size: 1024
CQ:
stride size: 64 size: 1024
SQs:
channel ix: 0 tc: 0 txq ix: 0 sqn: 4307 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1030 HW status: 0
channel ix: 1 tc: 0 txq ix: 1 sqn: 4312 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1034 HW status: 0
channel ix: 2 tc: 0 txq ix: 2 sqn: 4317 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1038 HW status: 0
channel ix: 3 tc: 0 txq ix: 3 sqn: 4322 HW state: 1 stopped: false cc: 0 pc: 0
CQ:
cqn: 1042 HW status: 0
$ devlink health diagnose pci/0000:00:0b.0 reporter tx -jp
{
"Common Config": {
"SQ": {
"stride size": 64,
"size": 1024
},
"CQ": {
"stride size": 64,
"size": 1024
}
},
"SQs": [ {
"channel ix": 0,
"tc": 0,
"txq ix": 0,
"sqn": 4307,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1030,
"HW status": 0
}
},{
"channel ix": 1,
"tc": 0,
"txq ix": 1,
"sqn": 4312,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1034,
"HW status": 0
}
},{
"channel ix": 2,
"tc": 0,
"txq ix": 2,
"sqn": 4317,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1038,
"HW status": 0
}
},{
"channel ix": 3,
"tc": 0,
"txq ix": 3,
"sqn": 4322,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0,
"CQ": {
"cqn": 1042,
"HW status": 0
} ]
}
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Sun, 30 Jun 2019 08:34:15 +0000 (11:34 +0300)]
net/mlx5e: Extend tx reporter diagnostics output
Enhance tx reporter's diagnostics output to include: information common
to all SQs: SQ size, SQ stride size.
In addition add channel ix, tc, txq ix, cc and pc.
$ devlink health diagnose pci/0000:00:0b.0 reporter tx
Common config:
SQ:
stride size: 64 size: 1024
SQs:
channel ix: 0 tc: 0 txq ix: 0 sqn: 4307 HW state: 1 stopped: false cc: 0 pc: 0
channel ix: 1 tc: 0 txq ix: 1 sqn: 4312 HW state: 1 stopped: false cc: 0 pc: 0
channel ix: 2 tc: 0 txq ix: 2 sqn: 4317 HW state: 1 stopped: false cc: 0 pc: 0
channel ix: 3 tc: 0 txq ix: 3 sqn: 4322 HW state: 1 stopped: false cc: 0 pc: 0
$ devlink health diagnose pci/0000:00:0b.0 reporter tx -jp
{
"Common config": {
"SQ": {
"stride size": 64,
"size": 1024
}
},
"SQs": [ {
"channel ix": 0,
"tc": 0,
"txq ix": 0,
"sqn": 4307,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
},{
"channel ix": 1,
"tc": 0,
"txq ix": 1,
"sqn": 4312,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
},{
"channel ix": 2,
"tc": 0,
"txq ix": 2,
"sqn": 4317,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
},{
"channel ix": 3,
"tc": 0,
"txq ix": 3,
"sqn": 4322,
"HW state": 1,
"stopped": false,
"cc": 0,
"pc": 0
} ]
}
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Mon, 24 Jun 2019 18:41:21 +0000 (21:41 +0300)]
net/mlx5e: Extend tx diagnose function
The following patches in the set enhance the diagnostics info of tx
reporter. Therefore, it is better to pass a pointer to the SQ for
further data extraction.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Mon, 1 Jul 2019 12:08:13 +0000 (15:08 +0300)]
net/mlx5e: Generalize tx reporter's functionality
Prepare for code sharing with rx reporter, which is added in the
following patches in the set. Introduce a generic error_ctx for
agnostic recovery despatch.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Mon, 1 Jul 2019 12:51:51 +0000 (15:51 +0300)]
net/mlx5e: Change naming convention for reporter's functions
Change from mlx5e_tx_reporter_* to mlx5e_reporter_tx_*. In the following
patches in the set rx reporter is added, the new naming convention is
more uniformed.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Aya Levin [Mon, 1 Jul 2019 11:53:34 +0000 (14:53 +0300)]
net/mlx5e: Rename reporter header file
Rename reporter.h -> health.h so patches in the set can use it for
health related functionality.
Signed-off-by: Aya Levin <ayal@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
David S. Miller [Tue, 20 Aug 2019 19:33:49 +0000 (12:33 -0700)]
Merge branch 'net-dsa-enable-and-disable-all-ports'
Vivien Didelot says:
====================
net: dsa: enable and disable all ports
The DSA stack currently calls the .port_enable and .port_disable switch
callbacks for slave ports only. However, it is useful to call them for all
port types. For example this allows some drivers to delay the optimization
of power consumption after the switch is setup. This can also help reducing
the setup code of drivers a bit.
The first DSA core patches enable and disable all ports of a switch, regardless
their type. The last mv88e6xxx patches remove redundant code from the driver
setup and the said callbacks, now that they handle SERDES power for all ports.
Changes in v2: do not guard .port_disable for broadcom switches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:53 +0000 (16:00 -0400)]
net: dsa: mv88e6xxx: wrap SERDES IRQ in power function
Now that mv88e6xxx_serdes_power is only called after driver setup,
we can wrap the SERDES IRQ code directly within it for clarity.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:52 +0000 (16:00 -0400)]
net: dsa: mv88e6xxx: enable SERDES after setup
SERDES is powered on for CPU and DSA ports and powered down for unused
ports at setup time. But now that DSA calls mv88e6xxx_port_enable
and mv88e6xxx_port_disable for all ports, the SERDES power can now
be handled after setup inconditionally for all ports.
Using the port enable and disable callbacks also have the benefit to
handle the SERDES IRQ for non user ports as well.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:51 +0000 (16:00 -0400)]
net: dsa: mv88e6xxx: do not change STP state on port disabling
When disabling a port, that is not for the driver to decide what to
do with the STP state. This is already handled by the DSA layer.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:50 +0000 (16:00 -0400)]
net: dsa: enable and disable all ports
Call the .port_enable and .port_disable functions for all ports,
not only the user ports, so that drivers may optimize the power
consumption of all ports after a successful setup.
Unused ports are now disabled on setup. CPU and DSA ports are now
enabled on setup and disabled on teardown. User ports were already
enabled at slave creation and disabled at slave destruction.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:49 +0000 (16:00 -0400)]
net: dsa: do not enable or disable non user ports
The .port_enable and .port_disable operations are currently only
called for user ports, hence assuming they have a slave device. In
preparation for using these operations for other port types as well,
simply guard all implementations against non user ports and return
directly in such case.
Note that bcm_sf2_sw_suspend() currently calls bcm_sf2_port_disable()
(and thus b53_disable_port()) against the user and CPU ports, so do
not guards those functions. They will be called for unused ports in
the future, but that was expected by those drivers anyway.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 19 Aug 2019 20:00:48 +0000 (16:00 -0400)]
net: dsa: use a single switch statement for port setup
It is currently difficult to read the different steps involved in the
setup and teardown of ports in the DSA code. Keep it simple with a
single switch statement for each port type: UNUSED, CPU, DSA, or USER.
Also no need to call devlink_port_unregister from within dsa_port_setup
as this step is inconditionally handled by dsa_port_teardown on error.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Akeem G Abodunrin [Thu, 25 Jul 2019 08:55:30 +0000 (01:55 -0700)]
ice: Restructure VFs initialization flows
This patch restructures how VFs are configured, and resources allocated.
Instead of freeing resources that were never allocated, and resetting
empty VFs that have never been created - the new flow will just allocate
resources for number of requested VFs based on the availability.
During VFs initialization process, global interrupt is disabled, and
rearmed after getting MSIX vectors for VFs. This allows immediate mailbox
communications, instead of delaying it till later and VFs.
PF communications resulted to using polling instead of actual interrupt.
The issue manifested when creating higher number of VFs (128 VFs) per PF.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 25 Jul 2019 08:55:29 +0000 (01:55 -0700)]
ice: Assume that more than one Rx queue is rare in ice_napi_poll
Currently we divide budget by the number of Rx queues per Rx ring
container in ice_napi_poll even if there is only 1. This is an
unnecessary divide for the normal case of 1 Rx ring per Rx ring
container. Fix this by using an unlikely() call in the case where we
actually need to divide.
Also, we will always set budget_per_ring even if there are no Rx rings
in the Rx ring container so we don't need to initialize it to 0.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Thu, 25 Jul 2019 08:55:28 +0000 (01:55 -0700)]
ice: Use the software based tail when checking for hung Tx ring
Currently in ice_get_tx_pending we try to read a Tx ring's tail. This is
then compared with the software based head (next_to_clean) to determine
if we have pending work. This will never work because reading of the Tx
ring's tail is no longer supported. Fix this by using the software based
tail (next_to_use) to determine if there is pending work.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Hayes Wang [Mon, 19 Aug 2019 06:40:36 +0000 (14:40 +0800)]
r8152: divide the tx and rx bottom functions
Move the tx bottom function from NAPI to a new tasklet. Then, for
multi-cores, the bottom functions of tx and rx may be run at same
time with different cores. This is used to improve performance.
On x86, Tx/Rx 943/943 Mbits/sec -> 945/944.
For arm platform, Tx/Rx: 917/917 Mbits/sec -> 933/933.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde [Mon, 19 Aug 2019 15:08:29 +0000 (17:08 +0200)]
can: mcp251x: remove custom DMA mapped buffer
There is no need to duplicate what SPI core already does, i.e. mapping buffers
for DMA capable transfers. This patch removes all related pices of code.
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Phil Elwell [Tue, 14 Nov 2017 11:03:22 +0000 (11:03 +0000)]
can: mcp251x: Use DT-supplied interrupt flags
The MCP2515 datasheet clearly describes a level-triggered interrupt pin.
Therefore the receiving interrupt controller must also be configured for
level-triggered operation otherwise there is a danger of a missed
interrupt condition blocking all subsequent interrupts. The ONESHOT
flag ensures that the interrupt is masked until the threaded interrupt
handler exits.
Rather than change the flags globally (they must have worked for at
least one user), keep the old behavior for for non DT devices. DT based
devices specify the flags in their corresonding DT node.
See: https://github.com/raspberrypi/linux/issues/2175
https://github.com/raspberrypi/linux/issues/2263
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Alexander Shiyan [Thu, 31 Jan 2019 08:46:11 +0000 (11:46 +0300)]
can: mcp251x: Use dev_name() during request_threaded_irq()
Passing driver name as name during request_threaded_irq() results in all
CAN IRQs have same name. This does not help much to easily identify which
IRQ belongs to which CAN instance. Therefore pass dev_name() during
request_threaded_irq() so that better identifiable name is listed for
CAN devices in cat /proc/interrupts output.
Output of cat /proc/interrupts
Before this patch:
253: 2 gpio-mxc 13 Edge mcp251x
259: 2 gpio-mxc 19 Edge mcp251x
After this patch:
253: 2 gpio-mxc 13 Edge spi1.1
259: 2 gpio-mxc 19 Edge spi1.2
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Tue, 13 Aug 2019 14:01:02 +0000 (16:01 +0200)]
can: mcp251x: mcp251x_hw_reset(): allow more time after a reset
Some boards take longer than 5ms to power up after a reset, so allow
some retries attempts before giving up.
Fixes: ff06d611a31c ("can: mcp251x: Improve mcp251x_hw_reset()")
Cc: linux-stable <stable@vger.kernel.org>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Tue, 13 Aug 2019 12:26:45 +0000 (14:26 +0200)]
can: mcp251x: use u8 instead of uint8_t
This patch changes all the uint8_t in the arguments in several function
to u8.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Tue, 13 Aug 2019 13:33:01 +0000 (15:33 +0200)]
can: mcp251x: fix print formating strings
This patch fixes the print format strings in the driver.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Wed, 24 Jul 2019 12:34:42 +0000 (14:34 +0200)]
can: mcp251x: avoid long lines
This patch fixes long lines in the driver.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Wed, 24 Jul 2019 12:28:21 +0000 (14:28 +0200)]
can: mcp251x: remove unnecessary blank lines
This patch removes unnecessary blank lines, so that checkpatch doesn't
complain anymore.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Wed, 24 Jul 2019 12:16:29 +0000 (14:16 +0200)]
can: mcp251x: convert block comments to network style comments
This patch converts all block comments to network subsystem style block
comments.
Acked-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 19 Aug 2019 17:34:28 +0000 (19:34 +0200)]
can: m_can_platform: m_can_plat_probe(): add missing error handling if mcan_class is NULL
This patch adds the missing error handling in m_can_plat_probe() if
mcan_class is NULL.
Fixes: f524f829b75a ("can: m_can: Create a m_can platform framework")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 19 Aug 2019 17:17:13 +0000 (19:17 +0200)]
can: m_can_platform: remove not needed casts to struct m_can_plat_priv *
The struct m_can_classdev::device_data is a void pointer, so there's no
need to cast it to struct m_can_plat_priv *, when assigning the struct
m_can_plat_priv pointer.
This patch removes the not needed casts from the m_can_platform driver.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Fri, 16 Aug 2019 08:44:49 +0000 (10:44 +0200)]
can: tcan4x5x: fix data length in regmap write path
In regmap_spi_gather_write() the "addr" is prepared. The chip expects
the number of 32 bit words to write in the lower 8 bits of addr. However
the number of byte to write in shifted left by 3 (== divided by 8).
The function tcan4x5x_regmap_write() is called with a data buffer, which
holds the register information in the first 32 bits, followed by the
actual data. tcan4x5x_regmap_write() calls regmap_spi_gather_write()
with the val pointer pointing to the actual data (i.e. the original
pointer is incremented by 4 bytes), but without decrementing the count.
If the regmap framework only calls tcan4x5x_regmap_write() to read
single 32 bit registers these two bugs cancel each other.
This patch fixes the code.
Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 19 Aug 2019 17:34:28 +0000 (19:34 +0200)]
can: tcan4x5x: tcan4x5x_can_probe(): add missing error handling if mcan_class is NULL
This patch adds the missing error handling in tcan4x5x_can_probe() if
mcan_class is NULL.
Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 19 Aug 2019 17:17:13 +0000 (19:17 +0200)]
can: tcan4x5x: remove not needed casts to struct tcan4x5x_priv *
The struct m_can_classdev::device_data is a void pointer, so there's no
need to cast it to struct tcan4x5x_priv *, when assigning the struct
tcan4x5x_priv pointer.
This patch removes the not needed casts from the tcan4x5x driver.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Fri, 16 Aug 2019 16:44:04 +0000 (18:44 +0200)]
can: tcan4x5x: remove unused struct tcan4x5x_priv::tcan4x5x_lock
The mutex struct tcan4x5x_priv::tcan4x5x_lock is unused in the driver,
so this patch removes the variable from the driver.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Mon, 19 Aug 2019 15:08:29 +0000 (17:08 +0200)]
can: hi311x: remove custom DMA mapped buffer
There is no need to duplicate what SPI core already does, i.e. mapping buffers
for DMA capable transfers. This patch removes all related pices of code.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Nishka Dasgupta [Mon, 19 Aug 2019 08:00:18 +0000 (13:30 +0530)]
can: peak_pci: Make structure peak_pciec_i2c_bit_ops constant
Static structure peak_pciec_i2c_bit_ops, of type i2c_algo_bit_data, is
not used except to be copied into another variable. Hence make it const
to protect it from modification.
Issue found with Coccinelle.
Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Geert Uytterhoeven [Wed, 14 Aug 2019 09:22:21 +0000 (11:22 +0200)]
can: rcar_can: Remove unused platform data support
All R-Car platforms use DT for describing CAN controllers. R-Car CAN
platform data support was never used in any upstream kernel.
Move the Clock Select Register settings enum into the driver, and remove
platform data support and the corresponding header file.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
David S. Miller [Tue, 20 Aug 2019 01:32:30 +0000 (18:32 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2019-08-19' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 5.4
First set of patches for 5.4.
Major changes:
brcmfmac
* enable 160 MHz channel support
rt2x00
* add support for PLANEX GW-USMicroN USB device
rtw88
* add Bluetooth coexistance support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Aug 2019 01:27:29 +0000 (18:27 -0700)]
Merge branch 'sctp-support-per-endpoint-auth-and-asconf-flags'
Xin Long says:
====================
sctp: support per endpoint auth and asconf flags
This patchset mostly does 3 things:
1. add per endpint asconf flag and use asconf flag properly
and add SCTP_ASCONF_SUPPORTED sockopt.
2. use auth flag properly and add SCTP_AUTH_SUPPORTED sockopt.
3. remove the 'global feature switch' to discard chunks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:50 +0000 (22:02 +0800)]
sctp: remove net sctp.x_enable working as a global switch
The netns sctp feature flags shouldn't work as a global switch,
which is mostly like a firewall/netfilter's job. Also, it will
break asoc as it discard or accept chunks incorrectly when net
sctp.x_enable is changed after the asoc is created.
Since each type of chunk's processing function will check the
corresp asoc's feature flag, this 'global switch' should be
removed, and net sctp.x_enable will only work as the default
feature flags for the future sctp sockets/endpoints.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:49 +0000 (22:02 +0800)]
sctp: add SCTP_AUTH_SUPPORTED sockopt
SCTP_AUTH_SUPPORTED sockopt is used to set enpoint's auth
flag. With this feature, each endpoint will have its own
flag for its future asoc's auth_capable, instead of netns
auth flag.
Note that when both ep's auth_enable is enabled, endpoint
auth related data should be initialized. If asconf_enable
is also set, SCTP_CID_ASCONF/SCTP_CID_ASCONF_ACK should
be added into auth_chunk_list.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:48 +0000 (22:02 +0800)]
sctp: add sctp_auth_init and sctp_auth_free
This patch is to factor out sctp_auth_init and sctp_auth_free
functions, and sctp_auth_init will also be used in the next
patch for SCTP_AUTH_SUPPORTED sockopt.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:47 +0000 (22:02 +0800)]
sctp: use ep and asoc auth_enable properly
sctp has per endpoint auth flag and per asoc auth flag, and
the asoc one should be checked when coming to asoc and the
endpoint one should be checked when coming to endpoint.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:46 +0000 (22:02 +0800)]
sctp: add SCTP_ASCONF_SUPPORTED sockopt
SCTP_ASCONF_SUPPORTED sockopt is used to set enpoint's asconf
flag. With this feature, each endpoint will have its own flag
for its future asoc's asconf_capable, instead of netns asconf
flag.
Note that when both ep's asconf_enable and auth_enable are
enabled, SCTP_CID_ASCONF and SCTP_CID_ASCONF_ACK should be
added into auth_chunk_list.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:45 +0000 (22:02 +0800)]
sctp: check asoc peer.asconf_capable before processing asconf
asconf chunks should be dropped when the asoc doesn't support
asconf feature.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:44 +0000 (22:02 +0800)]
sctp: not set peer.asconf_capable in sctp_association_init
asoc->peer.asconf_capable is to be set during handshake, and its
value should be initialized to 0. net->sctp.addip_noauth will be
checked in sctp_process_init when processing INIT_ACK on client
and COOKIE_ECHO on server.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 19 Aug 2019 14:02:43 +0000 (22:02 +0800)]
sctp: add asconf_enable in struct sctp_endpoint
This patch is to make addip/asconf flag per endpoint,
and its value is initialized by the per netns flag,
net->sctp.addip_enable.
It also replaces the checks of net->sctp.addip_enable
with ep->asconf_enable in some places.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Mon, 19 Aug 2019 12:05:15 +0000 (20:05 +0800)]
net: remove empty inet_exit_net
Pointer members of an object with static storage duration, if not
explicitly initialized, will be initialized to a NULL pointer. The
net namespace API checks if this pointer is not NULL before using it,
it are safe to remove the function.
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 20 Aug 2019 01:19:48 +0000 (18:19 -0700)]
Merge branch 'ns-plugin-fixes'
Vlad Buslov says:
====================
Fix problems with using ns plugin
Recent changes to plugin architecture broke some of the tests when running tdc
without specifying a test group. Fix tests incompatible with ns plugin and
modify tests to not reuse interface name of ns veth interface for dummy
interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 19 Aug 2019 07:52:08 +0000 (10:52 +0300)]
tc-testing: concurrency: wrap piped rule update commands
Concurrent tests use several commands to update rules in parallel: 'find'
prints names of batch files in tmp directory and pipes result to 'xargs'
which runs instance of tc per batch file in parallel. This breaks when used
with ns plugin that adds 'ip netns exec $NS' prefix to the command, which
causes only first command in pipe to be executed in namespace:
=====> Test e41d: Add 1M flower filters with 10 parallel tc instances
-----> prepare stage
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [/bin/mkdir tmp] list [['/bin/mkdir', 'tmp']]
adjust_command: return command [ip netns exec tcut /bin/mkdir tmp]
command "ip netns exec tcut /bin/mkdir tmp"
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [/sbin/tc qdisc add dev ens1f0 ingress] list [['/sbin/tc', 'qdisc', 'add', 'dev', 'ens1f0', 'ingress']]
adjust_command: return command [ip netns exec tcut /sbin/tc qdisc add dev ens1f0 ingress]
command "ip netns exec tcut /sbin/tc qdisc add dev ens1f0 ingress"
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [./tdc_multibatch.py ens1f0 tmp 100000 10 add] list [['./tdc_multibatch.py', 'ens1f0', 'tmp', '100000', '10', 'add']]
adjust_command: return command [ip netns exec tcut ./tdc_multibatch.py ens1f0 tmp 100000 10 add]
command "ip netns exec tcut ./tdc_multibatch.py ens1f0 tmp 100000 10 add"
-----> execute stage
ns/SubPlugin.adjust_command
adjust_command: stage is execute; inserting netns stuff in command [find tmp/add* -print | xargs -n 1 -P 10 /sbin/tc -b] list [['find', 'tmp/add*', '-print', '|', 'xargs', '-n', '1', '-P', '10', '/sbin/tc', '-b']
]
adjust_command: return command [ip netns exec tcut find tmp/add* -print | xargs -n 1 -P 10 /sbin/tc -b]
command "ip netns exec tcut find tmp/add* -print | xargs -n 1 -P 10 /sbin/tc -b"
exit: 123
exit: 0
Cannot find device "ens1f0"
Cannot find device "ens1f0"
Command failed tmp/add_0:1
Command failed tmp/add_1:1
Cannot find device "ens1f0"
Command failed tmp/add_2:1
Cannot find device "ens1f0"
Command failed tmp/add_4:1
Cannot find device "ens1f0"
Command failed tmp/add_3:1
Cannot find device "ens1f0"
Command failed tmp/add_5:1
Cannot find device "ens1f0"
Command failed tmp/add_6:1
Cannot find device "ens1f0"
Command failed tmp/add_8:1
Cannot find device "ens1f0"
Command failed tmp/add_7:1
Cannot find device "ens1f0"
Command failed tmp/add_9:1
Fix the issue by executing whole compound command in namespace by wrapping
it in 'bash -c' invocation.
Fixes: 489ce2f42514 ("tc-testing: Restore original behaviour for namespaces in tdc")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 19 Aug 2019 07:52:07 +0000 (10:52 +0300)]
tc-testing: use dedicated DUMMY interface name for dummy dev
A lot of tests reuse $DEV1 veth name for naming dummy device. This causes
problem when tdc is invoked without specifying a test group and tries to
execute all tests. In this case tdc instantiates ns plugin, which creates
veth pair once before running tests. However, if any of the tests that
reuse $DEV1 run before test that depend on ns plugin, it will delete $DEV1
as a part of teardown section:
=====> Test 3b88: Delete ingress qdisc twice [3770/41080]
-----> prepare stage
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [/sbin/ip link add dev v0p1 type dummy || /bin/true] list [['/sbin/ip', 'link', 'add', 'dev', 'v0p1', 'type', 'dummy', '||', '/bin/true']]
adjust_command: return command [ip netns exec tcut /sbin/ip link add dev v0p1 type dummy || /bin/true]
command "ip netns exec tcut /sbin/ip link add dev v0p1 type dummy || /bin/true"
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [/sbin/tc qdisc add dev v0p1 ingress] list [['/sbin/tc', 'qdisc', 'add', 'dev', 'v0p1', 'ingress']]
adjust_command: return command [ip netns exec tcut /sbin/tc qdisc add dev v0p1 ingress]
command "ip netns exec tcut /sbin/tc qdisc add dev v0p1 ingress"
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [/sbin/tc qdisc del dev v0p1 ingress] list [['/sbin/tc', 'qdisc', 'del', 'dev', 'v0p1', 'ingress']]
adjust_command: return command [ip netns exec tcut /sbin/tc qdisc del dev v0p1 ingress]
command "ip netns exec tcut /sbin/tc qdisc del dev v0p1 ingress"
-----> execute stage
ns/SubPlugin.adjust_command
adjust_command: stage is execute; inserting netns stuff in command [/sbin/tc qdisc del dev v0p1 ingress] list [['/sbin/tc', 'qdisc', 'del', 'dev', 'v0p1', 'ingress']]
adjust_command: return command [ip netns exec tcut /sbin/tc qdisc del dev v0p1 ingress]
command "ip netns exec tcut /sbin/tc qdisc del dev v0p1 ingress"
-----> verify stage
ns/SubPlugin.adjust_command
adjust_command: stage is verify; inserting netns stuff in command [/sbin/tc qdisc show dev v0p1] list [['/sbin/tc', 'qdisc', 'show', 'dev', 'v0p1']]
adjust_command: return command [ip netns exec tcut /sbin/tc qdisc show dev v0p1]
command "ip netns exec tcut /sbin/tc qdisc show dev v0p1"
-----> teardown stage
ns/SubPlugin.adjust_command
adjust_command: stage is teardown; inserting netns stuff in command [/sbin/ip link del dev v0p1 type dummy] list [['/sbin/ip', 'link', 'del', 'dev', 'v0p1', 'type', 'dummy']]
adjust_command: return command [ip netns exec tcut /sbin/ip link del dev v0p1 type dummy]
command "ip netns exec tcut /sbin/ip link del dev v0p1 type dummy"
After this ns-dependent tests will fail because dev doesn't exist:
=====> Test 901f: Add fw filter with prio at 32-bit maxixum
-----> prepare stage
ns/SubPlugin.adjust_command
adjust_command: stage is setup; inserting netns stuff in command [/sbin/tc qdisc add dev v0p1 ingress] list [['/sbin/tc', 'qdisc', 'add', 'dev', 'v0p1', 'ingress']]
adjust_command: return command [ip netns exec tcut /sbin/tc qdisc add dev v0p1 ingress]
command "ip netns exec tcut /sbin/tc qdisc add dev v0p1 ingress"
-----> prepare stage *** Could not execute: "$TC qdisc add dev $DEV1 ingress"
-----> prepare stage *** Error message: "Cannot find device "v0p1"
"
returncode 1; expected [0]
-----> prepare stage *** Aborting test run.
<_io.BufferedReader name=3> *** stdout ***
<_io.BufferedReader name=5> *** stderr ***
"-----> prepare stage" did not complete successfully
Exception <class '__main__.PluginMgrTestFail'> ('setup', None, '"-----> prepare stage" did not complete successfully') (caught in test_runner, running test 477 901f Add fw filter with prio at 32-bit maxixum stage
setup)
---------------
traceback
File "./tdc.py", line 371, in test_runner
res = run_one_test(pm, args, index, tidx)
File "./tdc.py", line 272, in run_one_test
prepare_env(args, pm, 'setup', "-----> prepare stage", tidx["setup"])
File "./tdc.py", line 247, in prepare_env
'"{}" did not complete successfully'.format(prefix))
---------------
Fix the issue by introducing standalone $DUMMY config variable and
substitute all usage of $DEV1 in tests that don't depend on ns plugin.
Fixes: 489ce2f42514 ("tc-testing: Restore original behaviour for namespaces in tdc")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang [Mon, 19 Aug 2019 03:15:19 +0000 (11:15 +0800)]
r8152: fix accessing skb after napi_gro_receive
Fix accessing skb after napi_gro_receive which is caused by
commit
47922fcde536 ("r8152: support skb_add_rx_frag").
Fixes: 47922fcde536 ("r8152: support skb_add_rx_frag")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Aug 2019 20:04:45 +0000 (13:04 -0700)]
Merge branch 'RTL8125-EEE'
Heiner Kallweit says:
====================
net: phy: realtek: support NBase-T MMD EEE registers on RTL8125
Add missing EEE-related constants, including the new MMD EEE registers
for NBase-T / 802.3bz. Based on that emulate the new 802.3bz MMD EEE
registers for 2.5Gbps EEE on RTL8125.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 16 Aug 2019 19:57:38 +0000 (21:57 +0200)]
net: phy: realtek: support NBase-T MMD EEE registers on RTL8125
Emulate the 802.3bz MMD EEE registers for 2.5Gbps EEE on RTL8125.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 16 Aug 2019 19:56:27 +0000 (21:56 +0200)]
net: phy: add EEE-related constants
Add EEE-related constants. This includes the new MMD EEE registers for
NBase-T / 802.3bz.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Fri, 16 Aug 2019 15:06:54 +0000 (18:06 +0300)]
net: flow_offload: convert block_ing_cb_list to regular list type
RCU list block_ing_cb_list is protected by rcu read lock in
flow_block_ing_cmd() and with flow_indr_block_ing_cb_lock mutex in all
functions that use it. However, flow_block_ing_cmd() needs to call blocking
functions while iterating block_ing_cb_list which leads to following
suspicious RCU usage warning:
[ 401.510948] =============================
[ 401.510952] WARNING: suspicious RCU usage
[ 401.510993] 5.3.0-rc3+ #589 Not tainted
[ 401.510996] -----------------------------
[ 401.511001] include/linux/rcupdate.h:265 Illegal context switch in RCU read-side critical section!
[ 401.511004]
other info that might help us debug this:
[ 401.511008]
rcu_scheduler_active = 2, debug_locks = 1
[ 401.511012] 7 locks held by test-ecmp-add-v/7576:
[ 401.511015] #0:
00000000081d71a5 (sb_writers#4){.+.+}, at: vfs_write+0x166/0x1d0
[ 401.511037] #1:
000000002bd338c3 (&of->mutex){+.+.}, at: kernfs_fop_write+0xef/0x1b0
[ 401.511051] #2:
00000000c921c634 (kn->count#317){.+.+}, at: kernfs_fop_write+0xf7/0x1b0
[ 401.511062] #3:
00000000a19cdd56 (&dev->mutex){....}, at: sriov_numvfs_store+0x6b/0x130
[ 401.511079] #4:
000000005425fa52 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x30/0x140
[ 401.511092] #5:
00000000c5822793 (rtnl_mutex){+.+.}, at: unregister_netdevice_notifier+0x35/0x140
[ 401.511101] #6:
00000000c2f3507e (rcu_read_lock){....}, at: flow_block_ing_cmd+0x5/0x130
[ 401.511115]
stack backtrace:
[ 401.511121] CPU: 21 PID: 7576 Comm: test-ecmp-add-v Not tainted 5.3.0-rc3+ #589
[ 401.511124] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 401.511127] Call Trace:
[ 401.511138] dump_stack+0x85/0xc0
[ 401.511146] ___might_sleep+0x100/0x180
[ 401.511154] __mutex_lock+0x5b/0x960
[ 401.511162] ? find_held_lock+0x2b/0x80
[ 401.511173] ? __tcf_get_next_chain+0x1d/0xb0
[ 401.511179] ? mark_held_locks+0x49/0x70
[ 401.511194] ? __tcf_get_next_chain+0x1d/0xb0
[ 401.511198] __tcf_get_next_chain+0x1d/0xb0
[ 401.511251] ? uplink_rep_async_event+0x70/0x70 [mlx5_core]
[ 401.511261] tcf_block_playback_offloads+0x39/0x160
[ 401.511276] tcf_block_setup+0x1b0/0x240
[ 401.511312] ? mlx5e_rep_indr_setup_tc_cb+0xca/0x290 [mlx5_core]
[ 401.511347] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core]
[ 401.511359] tc_indr_block_get_and_ing_cmd+0x11b/0x1e0
[ 401.511404] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core]
[ 401.511414] flow_block_ing_cmd+0x7e/0x130
[ 401.511453] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core]
[ 401.511462] __flow_indr_block_cb_unregister+0x7f/0xf0
[ 401.511502] mlx5e_nic_rep_netdevice_event+0x75/0xb0 [mlx5_core]
[ 401.511513] unregister_netdevice_notifier+0xe9/0x140
[ 401.511554] mlx5e_cleanup_rep_tx+0x6f/0xe0 [mlx5_core]
[ 401.511597] mlx5e_detach_netdev+0x4b/0x60 [mlx5_core]
[ 401.511637] mlx5e_vport_rep_unload+0x71/0xc0 [mlx5_core]
[ 401.511679] esw_offloads_disable+0x5b/0x90 [mlx5_core]
[ 401.511724] mlx5_eswitch_disable.cold+0xdf/0x176 [mlx5_core]
[ 401.511759] mlx5_device_disable_sriov+0xab/0xb0 [mlx5_core]
[ 401.511794] mlx5_core_sriov_configure+0xaf/0xd0 [mlx5_core]
[ 401.511805] sriov_numvfs_store+0xf8/0x130
[ 401.511817] kernfs_fop_write+0x122/0x1b0
[ 401.511826] vfs_write+0xdb/0x1d0
[ 401.511835] ksys_write+0x65/0xe0
[ 401.511847] do_syscall_64+0x5c/0xb0
[ 401.511857] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 401.511862] RIP: 0033:0x7fad892d30f8
[ 401.511868] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 96 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 60 c3 0f 1f 80 00 00 00 00 48 83
ec 28 48 89
[ 401.511871] RSP: 002b:
00007ffca2a9fad8 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
[ 401.511875] RAX:
ffffffffffffffda RBX:
0000000000000002 RCX:
00007fad892d30f8
[ 401.511878] RDX:
0000000000000002 RSI:
000055afeb072a90 RDI:
0000000000000001
[ 401.511881] RBP:
000055afeb072a90 R08:
00000000ffffffff R09:
000000000000000a
[ 401.511884] R10:
000055afeb058710 R11:
0000000000000246 R12:
0000000000000002
[ 401.511887] R13:
00007fad893a8780 R14:
0000000000000002 R15:
00007fad893a3740
To fix the described incorrect RCU usage, convert block_ing_cb_list from
RCU list to regular list and protect it with flow_indr_block_ing_cb_lock
mutex in flow_block_ing_cmd().
Fixes: 1150ab0f1b33 ("flow_offload: support get multi-subsystem block")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 19 Aug 2019 18:54:03 +0000 (11:54 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Merge conflict of mlx5 resolved using instructions in merge
commit
9566e650bf7fdf58384bb06df634f7531ca3a97e.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 19 Aug 2019 17:00:01 +0000 (10:00 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov.
2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan.
3) Fix severe performance regression due to lack of SKB coalescing of
fragments during local delivery, from Guillaume Nault.
4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk.
5) Fix batched events in skbedit packet action, from Roman Mashak.
6) Propagate VLAN TX offload to hw_enc_features in bond and team
drivers, from Yue Haibing.
7) RXRPC local endpoint refcounting fix and read after free in
rxrpc_queue_local(), from David Howells.
8) Fix endian bug in ibmveth multicast list handling, from Thomas
Falcon.
9) Oops, make nlmsg_parse() wrap around the correct function,
__nlmsg_parse not __nla_parse(). Fix from David Ahern.
10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin.
11) Fix memory leak in cxgb4, from Wenwen Wang.
12) Yet another race in AF_PACKET, from Eric Dumazet.
13) Fix false detection of retransmit failures in tipc, from Tuong
Lien.
14) Use after free in ravb_tstamp_skb, from Tho Vu.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
ravb: Fix use-after-free ravb_tstamp_skb
netfilter: nf_tables: map basechain priority to hardware priority
net: sched: use major priority number as hardware priority
wimax/i2400m: fix a memory leak bug
net: cavium: fix driver name
ibmvnic: Unmap DMA address of TX descriptor buffers after use
bnxt_en: Fix to include flow direction in L2 key
bnxt_en: Use correct src_fid to determine direction of the flow
bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
bnxt_en: Improve RX doorbell sequence.
bnxt_en: Fix VNIC clearing logic for 57500 chips.
net: kalmia: fix memory leaks
cx82310_eth: fix a memory leak bug
bnx2x: Fix VF's VLAN reconfiguration in reload.
Bluetooth: Add debug setting for changing minimum encryption key size
tipc: fix false detection of retransmit failures
lan78xx: Fix memory leaks
MAINTAINERS: r8169: Update path to the driver
MAINTAINERS: PHY LIBRARY: Update files in the record
...
David Howells [Mon, 19 Aug 2019 15:02:01 +0000 (16:02 +0100)]
keys: Fix description size
The maximum key description size is 4095. Commit
f771fde82051 ("keys:
Simplify key description management") inadvertantly reduced that to 255
and made sizes between 256 and 4095 work weirdly, and any size whereby
size & 255 == 0 would cause an assertion in __key_link_begin() at the
following line:
BUG_ON(index_key->desc_len == 0);
This can be fixed by simply increasing the size of desc_len in struct
keyring_index_key to a u16.
Note the argument length test in keyutils only checked empty
descriptions and descriptions with a size around the limit (ie. 4095)
and not for all the values in between, so it missed this. This has been
addressed and
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/commit/?id=
066bf56807c26cd3045a25f355b34c1d8a20a5aa
now exhaustively tests all possible lengths of type, description and
payload and then some.
The assertion failure looks something like:
kernel BUG at security/keys/keyring.c:1245!
...
RIP: 0010:__key_link_begin+0x88/0xa0
...
Call Trace:
key_create_or_update+0x211/0x4b0
__x64_sys_add_key+0x101/0x200
do_syscall_64+0x5b/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
It can be triggered by:
keyctl add user "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" a @s
Fixes: f771fde82051 ("keys: Simplify key description management")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>