David S. Miller [Wed, 30 Jan 2019 22:26:07 +0000 (14:26 -0800)]
Merge branch 'net-dsa-mt7530-support-MT7530-in-the-MT7621-SoC'
Greg Ungerer says:
====================
net: dsa: mt7530: support MT7530 in the MT7621 SoC
This is the fourth version of a patch series supporting the MT7530 switch
as used in the MediaTek MT7621 SoC. Unlike the MediaTek MT7623 the MT7621
is built around a dual core MIPS CPU architecture. But inside it uses
basically the same 7530 switch.
This series resolves all issues I had with previous versions, and I can
now reliably use the driver on a 7621 SoC platform. These patches were
generated against linux-5.0-rc4.
The first patch enables support for the existing kernel mediatek ethernet
driver on the MT7621 SoC. This support is from Bjørn Mork, with an update
and fix by me. Using this driver fixed a number of problems I had
(TX checksums, large RX packet drop) over the staging driver
(drivers/staging/mt7621-eth).
Patch 2 modifies the mt7530 DSA driver to support the 7530 switch as
implemented in the Mediatek MT7621 SoC. The last patch updates the
devicetree bindings to reflect the new support in the mt7530 driver.
There is no real dependencies between the patches, so they can be taken
independantly.
Creating a new binding for the MT7621 seems like the only viable approach
to distinguish between a stand alone 7530 switch, the silicon module
in the MT7623 SoC and the silicon in the MT7621. Certainly the 7530 ID
register in the MT7623 and MT7621 returns the same value, "0x7530001".
Looking at the mt7530.c DSA driver it might make some sense to convert
the existing "mediatek,mcm" binding to something like "mediatek,mt7623"
to be consistent with this new MT7621 support. As far as I can tell
this is the intention of this binding.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Ungerer [Wed, 30 Jan 2019 01:24:06 +0000 (11:24 +1000)]
dt-bindings: net: dsa: add new MT7530 binding to support MT7621
Add devicetree binding to support the compatible mt7530 switch as used
in the MediaTek MT7621 SoC.
Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Ungerer [Wed, 30 Jan 2019 01:24:05 +0000 (11:24 +1000)]
net: dsa: mt7530: support the 7530 switch on the Mediatek MT7621 SoC
The MediaTek MT7621 SoC device contains a 7530 switch, and the existing
linux kernel 7530 DSA switch driver can be used with it.
The bulk of the changes required stem from the 7621 having different
regulator and pad setup. The existing setup of these in the 7530
driver appears to be very specific to its implemtation in the Mediatek
7623 SoC. (Not entirely surprising given the 7623 is a quad core ARM
based SoC, and the 7621 is a dual core, dual thread MIPS based SoC).
Create a new devicetree type, "mediatek,mt7621", to support the 7530
switch in the 7621 SoC. There appears to be no usable ID register to
distinguish it from a 7530 in other hardware at runtime. This is used
to carry out the appropriate configuration and setup.
Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Wed, 30 Jan 2019 01:24:04 +0000 (11:24 +1000)]
net: ethernet: mediatek: support MT7621 SoC ethernet hardware
The Mediatek MT7621 SoC contains the same ethernet hardware module as
used on a number of other MediaTek SoC parts. There are some minor
differences to deal with but we can use the same driver to support
them all.
This patch is based on work by Bjørn Mork <bjorn@mork.no>, and his
original patch is at:
https://github.com/bmork/LEDE/commit/
3293bc63f5461ca1eb0bbc4ed90145335e7e3404
There is an additional compatible devicetree type added, and the primary
change to the code required is to support a single interrupt (for both
RX and TX interrupts).
Signed-off-by: Bjørn Mork <bjorn@mork.no>
[gerg@kernel.org: rebase to mainline and irq handler fix]
Signed-off-by: Greg Ungerer <gerg@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Sean Wang <sean.wang@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Jan 2019 18:00:40 +0000 (10:00 -0800)]
Merge branch 'mlxsw-spectrum_acl-Include-delta-bits-into-hashtable-key'
Ido Schimmel says:
====================
mlxsw: spectrum_acl: Include delta bits into hashtable key
The Spectrum-2 ASIC allows multiple rules to use the same mask provided
that the difference between their masks is small enough (up to 8
consecutive delta bits). A more detailed explanation is provided in
merge commit
756cd36626f7 ("Merge branch
'mlxsw-Introduce-algorithmic-TCAM-support'").
These delta bits are part of the rule's key and therefore rules that
only differ in their delta bits can be inserted with the same A-TCAM
mask. In case two rules share the same key and only differ in their
priority, then the second will spill to the C-TCAM.
Current code does not take the delta bits into account when checking for
duplicate rules, which leads to unnecessary spillage to the C-TCAM.
This may result in reduced scale and performance.
Patch #1 includes the delta bits in the rule's key to avoid the above
mentioned problem.
Patch #2 adds a tracepoint when a rule is inserted into the C-TCAM.
Patches #3-#5 add test cases to make sure unnecessary spillage into the
C-TCAM does not occur.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 30 Jan 2019 08:58:37 +0000 (08:58 +0000)]
selftests: spectrum-2: Add delta two masks one key test
Ensure that the bug is fixed and we no longer have C-TCAM spill for two
keys that differ only in delta.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 30 Jan 2019 08:58:36 +0000 (08:58 +0000)]
selftests: spectrum-2: Fix multiple_masks_test
With recent fix in C-TCAM spillage for delta masks, the test stops to be
falsely positive. So fix it not to use delta by adding src_ip bits to the
masks. Alongside with that, use C-TCAM spill trace to see when the
spillage actually happens.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 30 Jan 2019 08:58:35 +0000 (08:58 +0000)]
selftests: spectrum-2: Extend and move trace helpers
Allow to specify number of trace hits and move helpers
to the beginning of the file.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 30 Jan 2019 08:58:34 +0000 (08:58 +0000)]
mlxsw: spectrum_acl: Add C-TCAM spill tracepoint
Add some visibility to the rule addition process and trace whenever rule
spilled into C-TCAM.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 30 Jan 2019 08:58:33 +0000 (08:58 +0000)]
mlxsw: spectrum_acl: Include delta bits into hashtable key
Currently only ERP mask masked bits in key are considered for
the hashtable key. That leads to false negative collisions
and fallbacks to C-TCAM in case two keys differ only in delta bits.
Fix this by taking full encoded key as a hashtable key,
including delta bits.
Reported-by: Nir Dotan <nird@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Jan 2019 08:44:08 +0000 (00:44 -0800)]
Merge branch 'sctp-support-SCTP_FUTURE-CURRENT-ALL_ASSOC'
Xin Long says:
====================
sctp: support SCTP_FUTURE/CURRENT/ALL_ASSOC
This patchset adds the support for 3 assoc_id constants: SCTP_FUTURE_ASSOC
SCTP_CURRENT_ASSOC, SCTP_ALL_ASSOC, described in rfc6458#section-7.2:
All socket options set on a one-to-one style listening socket also
apply to all future accepted sockets. For one-to-many style sockets,
often a socket option will pass a structure that includes an assoc_id
field. This field can be filled with the association identifier of a
particular association and unless otherwise specified can be filled
with one of the following constants:
SCTP_FUTURE_ASSOC: Specifies that only future associations created
after this socket option will be affected by this call.
SCTP_CURRENT_ASSOC: Specifies that only currently existing
associations will be affected by this call, and future
associations will still receive the previous default value.
SCTP_ALL_ASSOC: Specifies that all current and future associations
will be affected by this call.
The functions for many other sockopts that use assoc_id also need to be
updated accordingly.
====================
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:46 +0000 (15:08 +0800)]
sctp: add SCTP_FUTURE_ASOC and SCTP_CURRENT_ASSOC for SCTP_STREAM_SCHEDULER sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_scheduler and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_scheduler,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_STREAM_SCHEDULER in this
patch. It also adds default_ss in sctp_sock to support
SCTP_FUTURE_ASSOC.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:45 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_EVENT sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_event and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_event,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_EVENT in this patch.
It also adds sctp_assoc_ulpevent_type_set() to make code more
readable.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:44 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_ENABLE_STREAM_RESET sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_enable_strreset and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_enable_strreset,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_ENABLE_STREAM_RESET in this patch.
It also adjusts some code to keep a same check form as other functions.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:43 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_PRINFO sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_default_prinfo and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_default_prinfo,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_DEFAULT_PRINFO in this patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:42 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_DEACTIVATE_KEY sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_deactivate_key.
SCTP_CURRENT_ASSOC is supported for SCTP_AUTH_DEACTIVATE_KEY in this
patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:41 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_DELETE_KEY sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_del_key.
SCTP_CURRENT_ASSOC is supported for SCTP_AUTH_DELETE_KEY in this patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:40 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_ACTIVE_KEY sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_auth_key.
SCTP_CURRENT_ASSOC is supported for SCTP_AUTH_ACTIVE_KEY in this patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:39 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_AUTH_KEY sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_auth_key.
SCTP_CURRENT_ASSOC is supported for SCTP_AUTH_KEY in this patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:38 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_MAX_BURST sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_maxburst and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_maxburst,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_CONTEXT in this patch.
It also adjusts some code to keep a same check form as other
functions.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:37 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_CONTEXT sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_context and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_context,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_CONTEXT in this patch.
It also adjusts some code to keep a same check form as other
functions.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:36 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_SNDINFO sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_default_sndinfo and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_default_sndinfo,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_DEFAULT_SNDINFO in this patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:35 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DEFAULT_SEND_PARAM sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_default_send_param and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_default_send_param,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_DEFAULT_SEND_PARAM in this patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:34 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_DELAYED_SACK sockopt
Check with SCTP_ALL_ASSOC instead in sctp_setsockopt_delayed_ack and
check with SCTP_FUTURE_ASSOC instead in sctp_getsockopt_delayed_ack,
it's compatible with 0.
SCTP_CURRENT_ASSOC is supported for SCTP_DELAYED_SACK in this patch.
It also adds sctp_apply_asoc_delayed_ack() to make code more readable.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:33 +0000 (15:08 +0800)]
sctp: add SCTP_CURRENT_ASSOC for SCTP_STREAM_SCHEDULER_VALUE sockopt
SCTP_STREAM_SCHEDULER_VALUE is a special one, as its value is not
save in sctp_sock, but only in asoc. So only SCTP_CURRENT_ASSOC
reserved assoc_id can be used in sctp_setsockopt_scheduler_value.
This patch adds SCTP_CURRENT_ASOC support for
SCTP_STREAM_SCHEDULER_VALUE.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:32 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_INTERLEAVING_SUPPORTED sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_reconfig_supported, it's compatible with 0.
It also adjusts some code to keep a same check form as other functions.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:31 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_RECONFIG_SUPPORTED sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_reconfig_supported, it's compatible with 0.
It also adjusts some code to keep a same check form as other functions.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:30 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_PR_SUPPORTED sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_pr_supported, it's compatible with 0.
It also adjusts some code to keep a same check form as other functions.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:29 +0000 (15:08 +0800)]
sctp: add SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_THLDS sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_paddr_thresholds, it's compatible with 0.
It also adds pf_retrans in sctp_sock to support SCTP_FUTURE_ASSOC.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:28 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_LOCAL_AUTH_CHUNKS sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_getsockopt_local_auth_chunks, it's compatible with 0.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:27 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_MAXSEG sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_maxseg, it's compatible with 0.
Also check asoc_id early as other sctp setsockopts does.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:26 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_ASSOCINFO sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_associnfo, it's compatible with 0.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:25 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_RTOINFO sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_set/getsockopt_rtoinfo, it's compatible with 0.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:24 +0000 (15:08 +0800)]
sctp: use SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_PARAMS sockopt
Check with SCTP_FUTURE_ASSOC instead in
sctp_/setgetsockopt_peer_addr_params, it's compatible with 0.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 28 Jan 2019 07:08:23 +0000 (15:08 +0800)]
sctp: introduce SCTP_FUTURE/CURRENT/ALL_ASSOC
This patch is to add 3 constants SCTP_FUTURE_ASSOC,
SCTP_CURRENT_ASSOC and SCTP_ALL_ASSOC for reserved
assoc_ids, as defined in rfc6458#section-7.2.
And add the process for them when doing lookup and
inserting in sctp_id2assoc and sctp_assoc_set_id.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Jan 2019 06:13:09 +0000 (22:13 -0800)]
Merge branch 'devlink-port'
Vasundhara Volam says:
====================
devlink: Add configuration parameters support for devlink_port
This patchset adds support for configuration parameters setting through
devlink_port. Each device registers supported configuration parameters
table.
The user can retrieve data on these parameters by
"devlink port param show" command and can set new value to a
parameter by "devlink port param set" command.
All configuration modes supported by devlink_dev are supported
by devlink_port also.
Command examples and output:
pci/0000:3b:00.0/0:
name wake-on-lan type generic
values:
cmode permanent value false
pci/0000:3b:00.1/1:
name wake-on-lan type generic
values:
cmode permanent value false
pci/0000:af:00.0/0:
name wake-on-lan type generic
values:
cmode permanent value true
pci/0000:3b:00.0/0:
name wake-on-lan type generic
values:
cmode permanent value false
There is difference of opinion on adding WOL parameter to devlink, between
Jakub Kicinski and Michael Chan.
Quote from Jakud Kicinski:
********
As explained previously I think it's a very bad idea to add existing
configuration options to devlink, just because devlink has the ability
to persist the setting in NVM. Especially that for WoL you have to get
the link up so you potentially have all link config stuff as well. And
that n-tuple filters are one of the WoL options, meaning we'd need the
ability to persist n-tuple filters via devlink.
The effort would be far better spent helping with migrating ethtool to
netlink, and allowing persisting there.
I have not heard any reason why devlink is a better fit. I can imagine
you're just doing it here because it's less effort for you since
ethtool is not yet migrated.
********
Quote from Michael Chan:
********
The devlink's WoL parameter is a persistent WoL parameter stored in the
NIC's NVRAM. It is different from ethtool's WoL parameter in a number of
ways. ethtool WoL is not persistent over AC power cycle and is considered
OS-present WoL. As such, ethtool WoL can use a more sophisticated pattern
including n-tuple with IP address in addition to the more basic types
(e.g. magic packet). Whereas OS-absent power up WoL should only include
magic packet and other simple types. The devlink WoL setting does not have
to match the ethtool WoL setting. The card will autoneg up to the speed
supported by Vaux so no special devlink link setting is needed.
********
Future expansion of WOL parameter to devlink:
********
Add an additional flag to support additional setting to address link settings.
This will allow attributes to support both runtime and persistent
configuration.
********
v7->v8:
* Re-ordered function definitions.
* Append with "Acked-by: Jiri Pirko <jiri@mellanox.com>" to first 3 patches.
* Add missing devlink_port_param_driverinit_value_get() declaration.
v6->v7:
* Remove RFC tag from the patch-set.
v5->v6:
* Replace '-' with '*' in cover letter to avoid cutoff by git.
v4->v5:
* Added quotes from Jakub Kicinski and Michael chan on devlink's WOL
parameter in the cover letter.
v3->v4:
* Update changes done from v2 to v3 version in individual patch
descriptions.
v2->v3:
Make following changes as per suggestions from Jiri Pirko and
Michal Kubecek.
* Add a helper __devlink_params_register() with common code used by
both devlink_params_register() and devlink_port_params_register().
* Define only WOL types used now and define them as bitfield, so that
mutliple WOL types can be enabled upon power on.
* Modify "wake-on-lan" name to "wake_on_lan" to be symmetric with
previous definitions.
* Rename DEVLINK_PARAM_WOL_XXX to DEVLINK_PARAM_WAKE_XXX to be
symmetrical with ethtool WOL definitions.
* Modify bnxt_dl_wol_validate(), to throw error message when user gives
value other than DEVLINK_PARAM_WAKE_MAGIC or to disable WOL.
* Use netdev_err() instead of netdev_warn(), when devlink_port_register()
and devlink_port_params_register() returns error. Also, don't log rc
in this message.
v1->v2:
Make following changes as per suggestions from Jiri Pirko.
* Remove separate enum devlink_port_param_generic_id for port params.
Instead club it with existing device params. Accordingly refactor
remaining patchset.
* Move INIT_LIST_HEAD of port param_list to devlink_port_register()
* Add a helper devlink_param_verify() to be used for both
devlink_params_register() and devlink_port_params_register().
* Add a helper __devlink_params_unregister() for common code in
devlink_params_unregister() and devlink_port_params_unregister().
* Move DEVLINK_CMD_PORT_PARAM_XXX definitions to the end of the enum.
* Split the patches for devlink_port_param_driverinit_value_get() and
devlink_port_param_driverinit_value_set() into separate patches.
* define DEVLINK_PARAM_GENERIC_ID_WOL type as u8 and define enum for
different types of WOL. Accordingly modify bnxt_en patch to validate
wol type.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:27 +0000 (18:00 +0530)]
bnxt_en: Add bnxt_en initial port params table and register it
Register devlink_port with devlink and create initial port params
table for bnxt_en. The table consists of a generic parameter:
wake_on_lan: Enables Wake on Lan for this port when magic packet
is received with this port's MAC address using ACPI pattern.
If enabled, the controller asserts a wake pin upon reception of
WoL packet. ACPI (Advanced Configuration and Power Interface) is
an industry specification for the efficient handling of power
consumption in desktop and mobile computers.
v2->v3:
- Modify bnxt_dl_wol_validate(), to throw error message when user gives
value other than DEVLINK_PARAM_WAKE_MAGIC ot to disable WOL.
- Use netdev_err() instead of netdev_warn(), when devlink_port_register()
and devlink_port_params_register() returns error. Also, don't log rc
in this message.
Cc: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:26 +0000 (18:00 +0530)]
devlink: Add a generic wake_on_lan port parameter
wake_on_lan - Enables Wake on Lan for this port. If enabled,
the controller asserts a wake pin based on the WOL type.
v2->v3:
- Define only WOL types used now and define them as bitfield, so that
mutliple WOL types can be enabled upon power on.
- Modify "wake-on-lan" name to "wake_on_lan" to be symmetric with
previous definitions.
- Rename DEVLINK_PARAM_WOL_XXX to DEVLINK_PARAM_WAKE_XXX to be
symmetrical with ethtool WOL definitions.
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:25 +0000 (18:00 +0530)]
devlink: Add devlink notifications support for port params
Add notification call for devlink port param set, register and unregister
functions.
Add devlink_port_param_value_changed() function to enable the driver notify
devlink on value change. Driver should use this function after value was
changed on any configuration mode part to driverinit.
v7->v8:
Order devlink_port_param_value_changed() definitions followed by
devlink_param_value_changed()
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:24 +0000 (18:00 +0530)]
devlink: Add support for driverinit set value for devlink_port
Add support for "driverinit" configuration mode value for devlink_port
configuration parameters. Add devlink_port_param_driverinit_value_set()
function to help the driver set the value to devlink_port.
Also, move the common code to __devlink_param_driverinit_value_set()
to be used by both device and port params.
v7->v8:
Re-order the definitions as follows:
__devlink_param_driverinit_value_get
__devlink_param_driverinit_value_set
devlink_param_driverinit_value_get
devlink_param_driverinit_value_set
devlink_port_param_driverinit_value_get
devlink_port_param_driverinit_value_set
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:23 +0000 (18:00 +0530)]
devlink: Add support for driverinit get value for devlink_port
Add support for "driverinit" configuration mode value for devlink_port
configuration parameters. Add devlink_port_param_driverinit_value_get()
function to help the driver get the value from devlink_port.
Also, move the common code to __devlink_param_driverinit_value_get()
to be used by both device and port params.
v7->v8:
-Add the missing devlink_port_param_driverinit_value_get() declaration.
-Also, order devlink_port_param_driverinit_value_get() after
devlink_param_driverinit_value_get/set() calls
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:22 +0000 (18:00 +0530)]
devlink: Add port param set command
Add port param set command to set the value for a parameter.
Value can be set to any of the supported configuration modes.
v7->v8: Append "Acked-by: Jiri Pirko <jiri@mellanox.com>"
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:21 +0000 (18:00 +0530)]
devlink: Add port param get command
Add port param get command which gets data per parameter.
It also has option to dump the parameters data per port.
v7->v8: Append "Acked-by: Jiri Pirko <jiri@mellanox.com>"
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Mon, 28 Jan 2019 12:30:20 +0000 (18:00 +0530)]
devlink: Add devlink_param for port register and unregister
Add functions to register and unregister for the driver supported
configuration parameters table per port.
v7->v8:
- Order the definitions following way as suggested by Jiri.
__devlink_params_register
__devlink_params_unregister
devlink_params_register
devlink_params_unregister
devlink_port_params_register
devlink_port_params_unregister
- Append with Acked-by: Jiri Pirko <jiri@mellanox.com>.
v2->v3:
- Add a helper __devlink_params_register() with common code used by
both devlink_params_register() and devlink_port_params_register().
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 30 Jan 2019 05:18:54 +0000 (21:18 -0800)]
Merge git://git./linux/kernel/git/davem/net
Linus Torvalds [Wed, 30 Jan 2019 01:11:47 +0000 (17:11 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Need to save away the IV across tls async operations, from Dave
Watson.
2) Upon successful packet processing, we should liberate the SKB with
dev_consume_skb{_irq}(). From Yang Wei.
3) Only apply RX hang workaround on effected macb chips, from Harini
Katakam.
4) Dummy netdev need a proper namespace assigned to them, from Josh
Elsasser.
5) Some paths of nft_compat run lockless now, and thus we need to use a
proper refcnt_t. From Florian Westphal.
6) Avoid deadlock in mlx5 by doing IRQ locking, from Moni Shoua.
7) netrom does not refcount sockets properly wrt. timers, fix that by
using the sock timer API. From Cong Wang.
8) Fix locking of inexact inserts of xfrm policies, from Florian
Westphal.
9) Missing xfrm hash generation bump, also from Florian.
10) Missing of_node_put() in hns driver, from Yonglong Liu.
11) Fix DN_IFREQ_SIZE, from Johannes Berg.
12) ip6mr notifier is invoked during traversal of wrong table, from Nir
Dotan.
13) TX promisc settings not performed correctly in qed, from Manish
Chopra.
14) Fix OOB access in vhost, from Jason Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
MAINTAINERS: Add entry for XDP (eXpress Data Path)
net: set default network namespace in init_dummy_netdev()
net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles
net: caif: call dev_consume_skb_any when skb xmit done
net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
net: macb: Apply RXUBR workaround only to versions with errata
net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq
net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq
net: tls: Fix deadlock in free_resources tx
net: tls: Save iv in tls_rec for async crypto requests
vhost: fix OOB in get_rx_bufs()
qed: Fix stack out of bounds bug
qed: Fix system crash in ll2 xmit
qed: Fix VF probe failure while FLR
qed: Fix LACP pdu drops for VFs
qed: Fix bug in tx promiscuous mode settings
net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
netfilter: ipt_CLUSTERIP: fix warning unused variable cn
...
Jesper Dangaard Brouer [Tue, 29 Jan 2019 13:22:33 +0000 (14:22 +0100)]
MAINTAINERS: Add entry for XDP (eXpress Data Path)
Add multiple people as maintainers for XDP, sorted alphabetically.
XDP is also tied to driver level support and code, but we cannot add all
drivers to the list. Instead K: and N: match on 'xdp' in hope to catch some
of those changes in drivers.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Josh Elsasser [Sat, 26 Jan 2019 22:38:33 +0000 (14:38 -0800)]
net: set default network namespace in init_dummy_netdev()
Assign a default net namespace to netdevs created by init_dummy_netdev().
Fixes a NULL pointer dereference caused by busy-polling a socket bound to
an iwlwifi wireless device, which bumps the per-net BUSYPOLLRXPACKETS stat
if napi_poll() received packets:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000190
IP: napi_busy_loop+0xd6/0x200
Call Trace:
sock_poll+0x5e/0x80
do_sys_poll+0x324/0x5a0
SyS_poll+0x6c/0xf0
do_syscall_64+0x6b/0x1f0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Fixes: 7db6b048da3b ("net: Commonize busy polling code to focus on napi_id instead of socket")
Signed-off-by: Josh Elsasser <jelsasser@appneta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 29 Jan 2019 17:05:23 +0000 (11:05 -0600)]
cxgb4: cxgb4_tc_u32: use struct_size() in kvzalloc()
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
instance = kvzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:
instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 29 Jan 2019 16:44:44 +0000 (10:44 -0600)]
cxgb4: clip_tbl: Use struct_size() in kvzalloc()
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with memory
for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
instance = kvzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:
instance = kvzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 29 Jan 2019 15:04:40 +0000 (23:04 +0800)]
net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles
The skb should be freed by dev_consume_skb_any() in b44_start_xmit()
when bounce_skb is used. The skb is be replaced by bounce_skb, so the
original skb should be consumed(not drop).
dev_consume_skb_irq() should be called in b44_tx() when skb xmit
done. It makes drop profiles(dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 29 Jan 2019 15:32:22 +0000 (23:32 +0800)]
net: caif: call dev_consume_skb_any when skb xmit done
The skb shouled be consumed when xmit done, it makes drop profiles
(dropwatch, perf) more friendly.
dev_kfree_skb_irq()/kfree_skb() shouled be replaced by
dev_consume_skb_any(), it makes code cleaner.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Tue, 29 Jan 2019 14:40:51 +0000 (22:40 +0800)]
net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in cp_tx() when skb xmit
done. It makes drop profiles(dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Tue, 29 Jan 2019 10:08:42 +0000 (15:38 +0530)]
MAINTAINERS: update cxgb4 and cxgb3 maintainer
Vishal Kulkarni will be the new maintainer for Chelsio cxgb3/cxgb4
drivers.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Tue, 29 Jan 2019 09:50:19 +0000 (15:20 +0530)]
cxgb4vf: Update port information in cxgb4vf_open()
It's possible that the basic port information could have
changed since we first read it.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Harini Katakam [Tue, 29 Jan 2019 09:50:03 +0000 (15:20 +0530)]
net: macb: Apply RXUBR workaround only to versions with errata
The interrupt handler contains a workaround for RX hang applicable
to Zynq and AT91RM9200 only. Subsequent versions do not need this
workaround. This workaround unnecessarily resets RX whenever RX used
bit read is observed, which can be often under heavy traffic. There
is no other action performed on RX UBR interrupt. Hence introduce a
CAPS mask; enable this interrupt and workaround only on affected
versions.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Veerasenareddy Burru [Mon, 28 Jan 2019 19:38:31 +0000 (19:38 +0000)]
liquidio: fix the validation of rx checksum status from NIC hardware
Fixed the code that was incorrectly interpreting the rx checksum validation
status from hardware, and updating kernel that the packet arrived with
correct checksum though the packet arrived with incorrect checksum and
hardware also indicated checksum is not correct.
Signed-off-by: Veerasenareddy Burru <vburru@marvell.com>
Acked-by: Derek Chickles <dchickles@marvell.com>
Signed-off-by: Felix Manlunas <fmanlunas@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 28 Jan 2019 23:40:10 +0000 (07:40 +0800)]
net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in cpmac_end_xmit() when
xmit done. It makes drop profiles more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 28 Jan 2019 23:39:13 +0000 (07:39 +0800)]
net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in bmac_txdma_intr() when
xmit done. It makes drop profiles more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Sun, 27 Jan 2019 15:58:25 +0000 (23:58 +0800)]
net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq
dev_consume_skb_irq() should be called in amd8111e_tx() when xmit
done. It makes drop profiles more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Sun, 27 Jan 2019 15:56:34 +0000 (23:56 +0800)]
net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq
dev_consume_skb_irq() should be called in ace_tx_int() when xmit
done. It makes drop profiles more friendly.
Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Watson [Sun, 27 Jan 2019 00:59:03 +0000 (00:59 +0000)]
net: tls: Fix deadlock in free_resources tx
If there are outstanding async tx requests (when crypto returns EINPROGRESS),
there is a potential deadlock: the tx work acquires the lock, while we
cancel_delayed_work_sync() while holding the lock. Drop the lock while waiting
for the work to complete.
Fixes: a42055e8d2c30 ("Add support for async encryption of records...")
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Watson [Sun, 27 Jan 2019 00:57:38 +0000 (00:57 +0000)]
net: tls: Save iv in tls_rec for async crypto requests
aead_request_set_crypt takes an iv pointer, and we change the iv
soon after setting it. Some async crypto algorithms don't save the iv,
so we need to save it in the tls_rec for async requests.
Found by hardcoding x64 aesni to use async crypto manager (to test the async
codepath), however I don't think this combination can happen in the wild.
Presumably other hardware offloads will need this fix, but there have been
no user reports.
Fixes: a42055e8d2c30 ("Add support for async encryption of records...")
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Mon, 28 Jan 2019 07:05:05 +0000 (15:05 +0800)]
vhost: fix OOB in get_rx_bufs()
After batched used ring updating was introduced in commit
e2b3b35eb989
("vhost_net: batch used ring update in rx"). We tend to batch heads in
vq->heads for more than one packet. But the quota passed to
get_rx_bufs() was not correctly limited, which can result a OOB write
in vq->heads.
headcount = get_rx_bufs(vq, vq->heads + nvq->done_idx,
vhost_len, &in, vq_log, &log,
likely(mergeable) ? UIO_MAXIOV : 1);
UIO_MAXIOV was still used which is wrong since we could have batched
used in vq->heads, this will cause OOB if the next buffer needs more
than 960 (1024 (UIO_MAXIOV) - 64 (VHOST_NET_BATCH)) heads after we've
batched 64 (VHOST_NET_BATCH) heads:
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
=============================================================================
BUG kmalloc-8k (Tainted: G B ): Redzone overwritten
-----------------------------------------------------------------------------
INFO: 0x00000000fd93b7a2-0x00000000f0713384. First byte 0xa9 instead of 0xcc
INFO: Allocated in alloc_pd+0x22/0x60 age=
3933677 cpu=2 pid=2674
kmem_cache_alloc_trace+0xbb/0x140
alloc_pd+0x22/0x60
gen8_ppgtt_create+0x11d/0x5f0
i915_ppgtt_create+0x16/0x80
i915_gem_create_context+0x248/0x390
i915_gem_context_create_ioctl+0x4b/0xe0
drm_ioctl_kernel+0xa5/0xf0
drm_ioctl+0x2ed/0x3a0
do_vfs_ioctl+0x9f/0x620
ksys_ioctl+0x6b/0x80
__x64_sys_ioctl+0x11/0x20
do_syscall_64+0x43/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
INFO: Slab 0x00000000d13e87af objects=3 used=3 fp=0x (null) flags=0x200000000010201
INFO: Object 0x0000000003278802 @offset=17064 fp=0x00000000e2e6652b
Fixing this by allocating UIO_MAXIOV + VHOST_NET_BATCH iovs for
vhost-net. This is done through set the limitation through
vhost_dev_init(), then set_owner can allocate the number of iov in a
per device manner.
This fixes CVE-2018-16880.
Fixes: e2b3b35eb989 ("vhost_net: batch used ring update in rx")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Rothwell [Tue, 29 Jan 2019 05:13:08 +0000 (16:13 +1100)]
enetc: include linux/vmalloc.h for vzalloc etc
Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 29 Jan 2019 03:38:33 +0000 (19:38 -0800)]
Merge git://git./linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-01-29
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Teach verifier dead code removal, this also allows for optimizing /
removing conditional branches around dead code and to shrink the
resulting image. Code store constrained architectures like nfp would
have hard time doing this at JIT level, from Jakub.
2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
code generation for 32-bit sub-registers. Evaluation shows that this
can result in code reduction of ~5-20% compared to 64 bit-only code
generation. Also add implementation for most JITs, from Jiong.
3) Add support for __int128 types in BTF which is also needed for
vmlinux's BTF conversion to work, from Yonghong.
4) Add a new command to bpftool in order to dump a list of BPF-related
parameters from the system or for a specific network device e.g. in
terms of available prog/map types or helper functions, from Quentin.
5) Add AF_XDP sock_diag interface for querying sockets from user
space which provides information about the RX/TX/fill/completion
rings, umem, memory usage etc, from Björn.
6) Add skb context access for skb_shared_info->gso_segs field, from Eric.
7) Add support for testing flow dissector BPF programs by extending
existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.
8) Split BPF kselftest's test_verifier into various subgroups of tests
in order better deal with merge conflicts in this area, from Jakub.
9) Add support for queue/stack manipulations in bpftool, from Stanislav.
10) Document BTF, from Yonghong.
11) Dump supported ELF section names in libbpf on program load
failure, from Taeung.
12) Silence a false positive compiler warning in verifier's BTF
handling, from Peter.
13) Fix help string in bpftool's feature probing, from Prashant.
14) Remove duplicate includes in BPF kselftests, from Yue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 29 Jan 2019 01:34:38 +0000 (17:34 -0800)]
Merge git://git./linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next tree:
1) Introduce a hashtable to speed up object lookups, from Florian Westphal.
2) Make direct calls to built-in extension, also from Florian.
3) Call helper before confirming the conntrack as it used to be originally,
from Florian.
4) Call request_module() to autoload br_netfilter when physdev is used
to relax the dependency, also from Florian.
5) Allow to insert rules at a given position ID that is internal to the
batch, from Phil Sutter.
6) Several patches to replace conntrack indirections by direct calls,
and to reduce modularization, from Florian. This also includes
several follow up patches to deal with minor fallout from this
rework.
7) Use RCU from conntrack gre helper, from Florian.
8) GRE conntrack module becomes built-in into nf_conntrack, from Florian.
9) Replace nf_ct_invert_tuplepr() by calls to nf_ct_invert_tuple(),
from Florian.
10) Unify sysctl handling at the core of nf_conntrack, from Florian.
11) Provide modparam to register conntrack hooks.
12) Allow to match on the interface kind string, from wenxu.
13) Remove several exported symbols, not required anymore now after
a bit of de-modulatization work has been done, from Florian.
14) Remove built-in map support in the hash extension, this can be
done with the existing userspace infrastructure, from laura.
15) Remove indirection to calculate checksums in IPVS, from Matteo Croce.
16) Use call wrappers for indirection in IPVS, also from Matteo.
17) Remove superfluous __percpu parameter in nft_counter, patch from
Luc Van Oostenryck.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 29 Jan 2019 00:08:30 +0000 (01:08 +0100)]
Merge branch 'bpf-flow-dissector-tests'
Stanislav Fomichev says:
====================
This patch series adds support for testing flow dissector BPF programs
by extending already existing BPF_PROG_TEST_RUN. The goal is to have
a packet as an input and `struct bpf_flow_key' as an output. That way
we can easily test flow dissector programs' behavior. I've also modified
existing test_progs.c test to do a simple flow dissector run as well.
* first patch introduces new __skb_flow_bpf_dissect to simplify
sharing between __skb_flow_bpf_dissect and BPF_PROG_TEST_RUN
* second patch adds actual BPF_PROG_TEST_RUN support
* third patch adds example usage to the selftests
v3:
* rebased on top of latest bpf-next
v2:
* loop over 'kattr->test.repeat' inside of
bpf_prog_test_run_flow_dissector, don't reuse
bpf_test_run/bpf_test_run_one
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Stanislav Fomichev [Mon, 28 Jan 2019 16:53:55 +0000 (08:53 -0800)]
selftests/bpf: add simple BPF_PROG_TEST_RUN examples for flow dissector
Use existing pkt_v4 and pkt_v6 to make sure flow_keys are what we want.
Also, add new bpf_flow_load routine (and flow_dissector_load.h header)
that loads bpf_flow.o program and does all required setup.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Stanislav Fomichev [Mon, 28 Jan 2019 16:53:54 +0000 (08:53 -0800)]
bpf: add BPF_PROG_TEST_RUN support for flow dissector
The input is packet data, the output is struct bpf_flow_key. This should
make it easy to test flow dissector programs without elaborate
setup.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Stanislav Fomichev [Mon, 28 Jan 2019 16:53:53 +0000 (08:53 -0800)]
net/flow_dissector: move bpf case into __skb_flow_bpf_dissect
This way, we can reuse it for flow dissector in BPF_PROG_TEST_RUN.
No functional changes.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Jakub Kicinski [Mon, 28 Jan 2019 18:29:15 +0000 (10:29 -0800)]
tools: bpftool: warn about risky prog array updates
When prog array is updated with bpftool users often refer
to the map via the ID. Unfortunately, that's likely
to lead to confusion because prog arrays get flushed when
the last user reference is gone. If there is no other
reference bpftool will create one, update successfully
just to close the map again and have it flushed.
Warn about this case in non-JSON mode.
If the problem continues causing confusion we can remove
the support for referring to a map by ID for prog array
update completely. For now it seems like the potential
inconvenience to users who know what they're doing outweighs
the benefit.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
YueHaibing [Fri, 25 Jan 2019 02:46:34 +0000 (10:46 +0800)]
selftests: bpf: remove duplicated include
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
David S. Miller [Mon, 28 Jan 2019 19:13:35 +0000 (11:13 -0800)]
Merge branch 'qed-Bug-fixes'
Manish Chopra says:
====================
qed: Bug fixes
This series have SR-IOV and some general fixes.
Please consider applying it to "net"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 28 Jan 2019 18:05:08 +0000 (10:05 -0800)]
qed: Fix stack out of bounds bug
KASAN reported following bug in qed_init_qm_get_idx_from_flags
due to inappropriate casting of "pq_flags". Fix the type of "pq_flags".
[ 196.624707] BUG: KASAN: stack-out-of-bounds in qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
[ 196.624712] Read of size 8 at addr
ffff809b00bc7360 by task kworker/0:9/1712
[ 196.624714]
[ 196.624720] CPU: 0 PID: 1712 Comm: kworker/0:9 Not tainted 4.18.0-60.el8.aarch64+debug #1
[ 196.624723] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL024 09/26/2018
[ 196.624733] Workqueue: events work_for_cpu_fn
[ 196.624738] Call trace:
[ 196.624742] dump_backtrace+0x0/0x2f8
[ 196.624745] show_stack+0x24/0x30
[ 196.624749] dump_stack+0xe0/0x11c
[ 196.624755] print_address_description+0x68/0x260
[ 196.624759] kasan_report+0x178/0x340
[ 196.624762] __asan_report_load_n_noabort+0x38/0x48
[ 196.624786] qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
[ 196.624808] qed_init_qm_info+0xec0/0x2200 [qed]
[ 196.624830] qed_resc_alloc+0x284/0x7e8 [qed]
[ 196.624853] qed_slowpath_start+0x6cc/0x1ae8 [qed]
[ 196.624864] __qede_probe.isra.10+0x1cc/0x12c0 [qede]
[ 196.624874] qede_probe+0x78/0xf0 [qede]
[ 196.624879] local_pci_probe+0xc4/0x180
[ 196.624882] work_for_cpu_fn+0x54/0x98
[ 196.624885] process_one_work+0x758/0x1900
[ 196.624888] worker_thread+0x4e0/0xd18
[ 196.624892] kthread+0x2c8/0x350
[ 196.624897] ret_from_fork+0x10/0x18
[ 196.624899]
[ 196.624902] Allocated by task 2:
[ 196.624906] kasan_kmalloc.part.1+0x40/0x108
[ 196.624909] kasan_kmalloc+0xb4/0xc8
[ 196.624913] kasan_slab_alloc+0x14/0x20
[ 196.624916] kmem_cache_alloc_node+0x1dc/0x480
[ 196.624921] copy_process.isra.1.part.2+0x1d8/0x4a98
[ 196.624924] _do_fork+0x150/0xfa0
[ 196.624926] kernel_thread+0x48/0x58
[ 196.624930] kthreadd+0x3a4/0x5a0
[ 196.624932] ret_from_fork+0x10/0x18
[ 196.624934]
[ 196.624937] Freed by task 0:
[ 196.624938] (stack is not available)
[ 196.624940]
[ 196.624943] The buggy address belongs to the object at
ffff809b00bc0000
[ 196.624943] which belongs to the cache thread_stack of size 32768
[ 196.624946] The buggy address is located 29536 bytes inside of
[ 196.624946] 32768-byte region [
ffff809b00bc0000,
ffff809b00bc8000)
[ 196.624948] The buggy address belongs to the page:
[ 196.624952] page:
ffff7fe026c02e00 count:1 mapcount:0 mapping:
ffff809b4001c000 index:0x0 compound_mapcount: 0
[ 196.624960] flags: 0xfffff8000008100(slab|head)
[ 196.624967] raw:
0fffff8000008100 dead000000000100 dead000000000200 ffff809b4001c000
[ 196.624970] raw:
0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
[ 196.624973] page dumped because: kasan: bad access detected
[ 196.624974]
[ 196.624976] Memory state around the buggy address:
[ 196.624980]
ffff809b00bc7200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 196.624983]
ffff809b00bc7280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 196.624985] >
ffff809b00bc7300: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
[ 196.624988] ^
[ 196.624990]
ffff809b00bc7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 196.624993]
ffff809b00bc7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 196.624995] ==================================================================
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 28 Jan 2019 18:05:07 +0000 (10:05 -0800)]
qed: Fix system crash in ll2 xmit
Cache number of fragments in the skb locally as in case
of linear skb (with zero fragments), tx completion
(or freeing of skb) may happen before driver tries
to get number of frgaments from the skb which could
lead to stale access to an already freed skb.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 28 Jan 2019 18:05:06 +0000 (10:05 -0800)]
qed: Fix VF probe failure while FLR
VFs may hit VF-PF channel timeout while probing, as in some
cases it was observed that VF FLR and VF "acquire" message
transaction (i.e first message from VF to PF in VF's probe flow)
could occur simultaneously which could lead VF to fail sending
"acquire" message to PF as VF is marked disabled from HW perspective
due to FLR, which will result into channel timeout and VF probe failure.
In such cases, try retrying VF "acquire" message so that in later
attempts it could be successful to pass message to PF after the VF
FLR is completed and can be probed successfully.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 28 Jan 2019 18:05:05 +0000 (10:05 -0800)]
qed: Fix LACP pdu drops for VFs
VF is always configured to drop control frames
(with reserved mac addresses) but to work LACP
on the VFs, it would require LACP control frames
to be forwarded or transmitted successfully.
This patch fixes this in such a way that trusted VFs
(marked through ndo_set_vf_trust) would be allowed to
pass the control frames such as LACP pdus.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 28 Jan 2019 18:05:04 +0000 (10:05 -0800)]
qed: Fix bug in tx promiscuous mode settings
When running tx switched traffic between VNICs
created via a bridge(to which VFs are added),
adapter drops the unicast packets in tx flow due to
VNIC's ucast mac being unknown to it. But VF interfaces
being in promiscuous mode should have caused adapter
to accept all the unknown ucast packets. Later, it
was found that driver doesn't really configure tx
promiscuous mode settings to accept all unknown unicast macs.
This patch fixes tx promiscuous mode settings to accept all
unknown/unmatched unicast macs and works out the scenario.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 28 Jan 2019 18:58:41 +0000 (10:58 -0800)]
Merge branch 'qed-Error-recovery-process'
Michal Kalderon says:
====================
qed*: Error recovery process
Parity errors might happen in the device's memories due to momentary bit
flips which are caused by radiation.
Errors that are not correctable initiate a process kill event, which blocks
the device access towards the host and the network, and a recovery process
is started in the management FW and in the driver.
This series adds the support of this process in the qed core module and in
the qede driver (patches 2 & 3).
Patch 1 in the series revises the load sequence, to avoid PCI errors that
might be observed during a recovery process.
Changes in v2:
- Addressed issue found in https://patchwork.ozlabs.org/patch/
1030545/
The change was done be removing the enum and passing a boolean to
the related functions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tomer Tayar [Mon, 28 Jan 2019 17:27:56 +0000 (19:27 +0200)]
qede: Error recovery process
This patch adds the error recovery process in the qede driver.
The process includes a partial/customized driver unload and load, which
allows it to look like a short suspend period to the kernel while
preserving the net devices' state.
Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tomer Tayar [Mon, 28 Jan 2019 17:27:55 +0000 (19:27 +0200)]
qed: Add infrastructure for error detection and recovery
This patch adds the detection and handling of a parity error ("process kill
event"), including the update of the protocol drivers, and the prevention
of any HW access that will lead to device access towards the host while
recovery is in progress.
It also provides the means for the protocol drivers to trigger a recovery
process on their decision.
Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tomer Tayar [Mon, 28 Jan 2019 17:27:54 +0000 (19:27 +0200)]
qed: Revise load sequence to avoid PCI errors
Initiating final cleanup after an ungraceful driver unload can lead to bad
PCI accesses towards the host.
This patch revises the load sequence so final cleanup is sent while the
internal master enable is cleared, to prevent the host accesses, and clears
the internal error indications just before enabling the internal master
enable.
Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lubomir Rintel [Mon, 28 Jan 2019 16:17:40 +0000 (17:17 +0100)]
benet: remove broken and unused macro
is_broadcast_packet() expands to compare_ether_addr() which doesn't
exist since commit
7367d0b573d1 ("drivers/net: Convert uses of
compare_ether_addr to ether_addr_equal"). It turns out it's actually not
used.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Wei [Mon, 28 Jan 2019 14:42:25 +0000 (22:42 +0800)]
net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
dev_consume_skb_irq() should be called in i596_interrupt() when skb
xmit done. It makes drop profiles(dropwatch, perf) more friendly.
Signed-off-by: Yang Wei <albin_yang@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 28 Jan 2019 18:51:51 +0000 (10:51 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree:
1) The nftnl mutex is now per-netns, therefore use reference counter
for matches and targets to deal with concurrent updates from netns.
Moreover, place extensions in a pernet list. Patches from Florian Westphal.
2) Bail out with EINVAL in case of negative timeouts via setsockopt()
through ip_vs_set_timeout(), from ZhangXiaoxu.
3) Spurious EINVAL on ebtables 32bit binary with 64bit kernel, also
from Florian.
4) Reset TCP option header parser in case of fingerprint mismatch,
otherwise follow up overlapping fingerprint definitions including
TCP options do not work, from Fernando Fernandez Mancera.
5) Compilation warning in ipt_CLUSTER with CONFIG_PROC_FS unset.
From Anders Roxell.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 28 Jan 2019 18:43:15 +0000 (10:43 -0800)]
Merge branch 'mlxsw-Misc-updates'
Ido Schimmel says:
====================
mlxsw: Misc updates
This patchset contains miscellaneous patches we gathered in our queue.
Some of them are dependencies of larger patchsets that I will submit
later this cycle.
Patches #1-#3 perform small non-functional changes in mlxsw.
Patch #4 adds more extended ack messages in mlxsw.
Patch #5 adds devlink parameters documentation for mlxsw. To be extended
with more parameters this cycle.
Patches #6-#7 perform small changes in forwarding selftests
infrastructure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 28 Jan 2019 12:02:13 +0000 (12:02 +0000)]
selftests: forwarding: Use OK instead of PASS in test output
It is easier to distinguish "[ OK ]" from "[FAIL]" than "[PASS]".
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 28 Jan 2019 12:02:12 +0000 (12:02 +0000)]
selftests: net: forwarding: change devlink resource support checking
As for the others, check help message output to find out if devlink
supports "resource" object.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 28 Jan 2019 12:02:11 +0000 (12:02 +0000)]
Documentation: add devlink param file for mlxsw driver
Add initial documentation file for devlink params of mlxsw driver. Only
"fw_load_policy" is now supported.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 28 Jan 2019 12:02:10 +0000 (12:02 +0000)]
mlxsw: spectrum_switchdev: Add more extack messages
Add more extack messages that let the user know why VXLAN offload
failed.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 28 Jan 2019 12:02:09 +0000 (12:02 +0000)]
mlxsw: spectrum_acl: Fix rul/rule typo
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 28 Jan 2019 12:02:08 +0000 (12:02 +0000)]
mlxsw: spectrum_acl: Move mr_ruleset and mr_rule structs
Move the struct to the place where they belong, alongside with the rest
of the MR code.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 28 Jan 2019 12:02:07 +0000 (12:02 +0000)]
mlxsw: spectrum_acl: Remove unnecessary arg on action_replace call path
No need to pass ruleset/group and chunk pointers on action_replace call
path, nobody uses them.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Hocko [Fri, 25 Jan 2019 18:08:58 +0000 (19:08 +0100)]
Revert "mm, memory_hotplug: initialize struct pages for the full memory section"
This reverts commit
2830bf6f05fb3e05bc4743274b806c821807a684.
The underlying assumption that one sparse section belongs into a single
numa node doesn't hold really. Robert Shteynfeld has reported a boot
failure. The boot log was not captured but his memory layout is as
follows:
Early memory node ranges
node 1: [mem 0x0000000000001000-0x0000000000090fff]
node 1: [mem 0x0000000000100000-0x00000000dbdf8fff]
node 1: [mem 0x0000000100000000-0x0000001423ffffff]
node 0: [mem 0x0000001424000000-0x0000002023ffffff]
This means that node0 starts in the middle of a memory section which is
also in node1. memmap_init_zone tries to initialize padding of a
section even when it is outside of the given pfn range because there are
code paths (e.g. memory hotplug) which assume that the full worth of
memory section is always initialized.
In this particular case, though, such a range is already intialized and
most likely already managed by the page allocator. Scribbling over
those pages corrupts the internal state and likely blows up when any of
those pages gets used.
Reported-by: Robert Shteynfeld <robert.shteynfeld@gmail.com>
Fixes: 2830bf6f05fb ("mm, memory_hotplug: initialize struct pages for the full memory section")
Cc: stable@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Florian Westphal [Sun, 27 Jan 2019 18:18:57 +0000 (19:18 +0100)]
netfilter: ipv4: remove useless export_symbol
Only one caller; place it where needed and get rid of the EXPORT_SYMBOL.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Cong Wang [Wed, 23 Jan 2019 20:58:57 +0000 (12:58 -0800)]
netfilter: conntrack: fix error path in nf_conntrack_pernet_init()
When nf_ct_netns_get() fails, it should clean up itself,
its caller doesn't need to call nf_conntrack_fini_net().
nf_conntrack_init_net() is called after registering sysctl
and proc, so its cleanup function should be called before
unregistering sysctl and proc.
Fixes: ba3fbe663635 ("netfilter: nf_conntrack: provide modparam to always register conntrack hooks")
Fixes: b884fa461776 ("netfilter: conntrack: unify sysctl handling")
Reported-and-tested-by: syzbot+fcee88b2d87f0539dfe9@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Luc Van Oostenryck [Sat, 19 Jan 2019 21:50:24 +0000 (22:50 +0100)]
netfilter: nft_counter: remove wrong __percpu of nft_counter_resest()'s arg
nft_counter_rest() has its first argument declared as
struct nft_counter_percpu_priv __percpu *priv
but this structure is not percpu (it only countains
a member 'counter' which is, correctly, a pointer to a
percpu struct nft_counter).
So, remove the '__percpu' from the argument's declaration.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Matteo Croce [Sat, 19 Jan 2019 14:25:35 +0000 (15:25 +0100)]
ipvs: use indirect call wrappers
Use the new indirect call wrappers in IPVS when calling the TCP or UDP
protocol specific functions.
This avoids an indirect calls in IPVS, and reduces the performance
impact of the Spectre mitigation.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Matteo Croce [Sat, 19 Jan 2019 14:22:38 +0000 (15:22 +0100)]
ipvs: avoid indirect calls when calculating checksums
The function pointer ip_vs_protocol->csum_check is only used in protocol
specific code, and never in the generic one.
Remove the function pointer from struct ip_vs_protocol and call the
checksum functions directly.
This reduces the performance impact of the Spectre mitigation, and
should give a small improvement even with RETPOLINES disabled.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>