openwrt/staging/blogic.git
6 years agoact_mirred: use TC_ACT_REINSERT when possible
Paolo Abeni [Mon, 30 Jul 2018 12:30:45 +0000 (14:30 +0200)]
act_mirred: use TC_ACT_REINSERT when possible

When mirred is invoked from the ingress path, and it wants to redirect
the processed packet, it can now use the TC_ACT_REINSERT action,
filling the tcf_result accordingly, and avoiding a per packet
skb_clone().

Overall this gives a ~10% improvement in forwarding performance for the
TC S/W data path and TC S/W performances are now comparable to the
kernel openvswitch datapath.

v1 -> v2: use ACT_MIRRED instead of ACT_REDIRECT
v2 -> v3: updated after action rename, fixed typo into the commit
message
v3 -> v4: updated again after action rename, added more comments to
the code (JiriP), skip the optimization if the control action
need to touch the tcf_result (Paolo)
v4 -> v5: fix sparse warning (kbuild bot)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/tc: introduce TC_ACT_REINSERT.
Paolo Abeni [Mon, 30 Jul 2018 12:30:44 +0000 (14:30 +0200)]
net/tc: introduce TC_ACT_REINSERT.

This is similar TC_ACT_REDIRECT, but with a slightly different
semantic:
- on ingress the mirred skbs are passed to the target device
network stack without any additional check not scrubbing.
- the rcu-protected stats provided via the tcf_result struct
  are updated on error conditions.

This new tcfa_action value is not exposed to the user-space
and can be used only internally by clsact.

v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce
 a new action type instead
v2 -> v3:
 - rename the new action value TC_ACT_REINJECT, update the
   helper accordingly
 - take care of uncloned reinjected packets in XDP generic
   hook
v3 -> v4:
 - renamed again the new action value (JiriP)
v4 -> v5:
 - fix build error with !NET_CLS_ACT (kbuild bot)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc/act: remove unneeded RCU lock in action callback
Paolo Abeni [Mon, 30 Jul 2018 12:30:43 +0000 (14:30 +0200)]
tc/act: remove unneeded RCU lock in action callback

Each lockless action currently does its own RCU locking in ->act().
This allows using plain RCU accessor, even if the context
is really RCU BH.

This change drops the per action RCU lock, replace the accessors
with the _bh variant, cleans up a bit the surrounding code and
documents the RCU status in the relevant header.
No functional nor performance change is intended.

The goal of this patch is clarifying that the RCU critical section
used by the tc actions extends up to the classifier's caller.

v1 -> v2:
 - preserve rcu lock in act_bpf: it's needed by eBPF helpers,
   as pointed out by Daniel

v3 -> v4:
 - fixed some typos in the commit message (JiriP)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/sched: user-space can't set unknown tcfa_action values
Paolo Abeni [Mon, 30 Jul 2018 12:30:42 +0000 (14:30 +0200)]
net/sched: user-space can't set unknown tcfa_action values

Currently, when initializing an action, the user-space can specify
and use arbitrary values for the tcfa_action field. If the value
is unknown by the kernel, is implicitly threaded as TC_ACT_UNSPEC.

This change explicitly checks for unknown values at action creation
time, and explicitly convert them to TC_ACT_UNSPEC. No functional
changes are introduced, but this will allow introducing tcfa_action
values not exposed to user-space in a later patch.

Note: we can't use the above to hide TC_ACT_REDIRECT from user-space,
as the latter is already part of uAPI.

v3 -> v4:
 - use an helper to check for action validity (JiriP)
 - emit an extack for invalid actions (JiriP)
v4 -> v5:
 - keep messages on a single line, drop net_warn (Marcelo)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoliquidio: remove redundant function cn23xx_dump_iq_regs
YueHaibing [Mon, 30 Jul 2018 11:03:41 +0000 (19:03 +0800)]
liquidio: remove redundant function cn23xx_dump_iq_regs

There are no in-tree callers of cn23xx_dump_iq_regs.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'socket-poll-related-cleanups-v2'
David S. Miller [Mon, 30 Jul 2018 16:10:25 +0000 (09:10 -0700)]
Merge branch 'socket-poll-related-cleanups-v2'

Christoph Hellwig says:

====================
socket poll related cleanups v2

A couple of cleanups I stumbled upon when studying the networking
poll code.

Changes since v1:
 - drop a dispute patch from this series (to be sent separately)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: remove sock_poll_busy_flag
Christoph Hellwig [Mon, 30 Jul 2018 07:42:13 +0000 (09:42 +0200)]
net: remove sock_poll_busy_flag

Fold it into the only caller to make the code simpler and easier to read.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: remove sock_poll_busy_loop
Christoph Hellwig [Mon, 30 Jul 2018 07:42:12 +0000 (09:42 +0200)]
net: remove sock_poll_busy_loop

There is no point in hiding this logic in a helper.  Also remove the
useless events != 0 check and only busy loop once we know we actually
have a poll method.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: don not detour through struct sock to find the poll waitqueue
Christoph Hellwig [Mon, 30 Jul 2018 07:42:11 +0000 (09:42 +0200)]
net: don not detour through struct sock to find the poll waitqueue

For any open socket file descriptor sock->sk->sk_wq->wait will always
point to sock->wq->wait.  That means we can do the shorter dereference
and removal a NULL check and don't have to not worry about any RCU
protection.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: simplify sock_poll_wait
Christoph Hellwig [Mon, 30 Jul 2018 07:42:10 +0000 (09:42 +0200)]
net: simplify sock_poll_wait

The wait_address argument is always directly derived from the filp
argument, so remove it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoact_bpf: Use kmemdup instead of duplicating it in tcf_bpf_init_from_ops
YueHaibing [Sat, 28 Jul 2018 10:38:06 +0000 (18:38 +0800)]
act_bpf: Use kmemdup instead of duplicating it in tcf_bpf_init_from_ops

Replace calls to kmalloc followed by a memcpy with a direct call to
kmemdup.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_bpf: Use kmemdup instead of duplicating it in cls_bpf_prog_from_ops
YueHaibing [Sat, 28 Jul 2018 10:35:15 +0000 (18:35 +0800)]
cls_bpf: Use kmemdup instead of duplicating it in cls_bpf_prog_from_ops

Replace calls to kmalloc followed by a memcpy with a direct call to
kmemdup.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoact_pedit: remove unnecessary semicolon
YueHaibing [Sat, 28 Jul 2018 10:29:01 +0000 (18:29 +0800)]
act_pedit: remove unnecessary semicolon

net/sched/act_pedit.c:289:2-3: Unneeded semicolon

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: remove redundant functions qed_get_cm_pq_idx_rl
YueHaibing [Sat, 28 Jul 2018 10:23:36 +0000 (18:23 +0800)]
qed: remove redundant functions qed_get_cm_pq_idx_rl

There are no in-tree callers of qed_get_cm_pq_idx_rl since it be there,
so it can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet-next: mediatek: cleanup unnecessary get chip id and its user
Sean Wang [Sat, 28 Jul 2018 05:35:56 +0000 (13:35 +0800)]
net-next: mediatek: cleanup unnecessary get chip id and its user

Since driver is devicetree-based, all device type and charateristic can be
determined by the compatible string and its data. It's unnecessary to
create another dependent function to check chip ID and then decide whether
the specific funciton is being supported on a certain device. It can be
totally replaced by the existing flag, so a cleanup is made by removing
the function and the only user, HWLRO.

MT2701 also have a missing HWLRO support in old code, so add it the same
patch.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet-next: mediatek: improve more with using dma_zalloc_coherent
Sean Wang [Sat, 28 Jul 2018 05:35:55 +0000 (13:35 +0800)]
net-next: mediatek: improve more with using dma_zalloc_coherent

Improve more in the existing code by reusing dma_zalloc_coherent instead
of dma_alloc_coherent with __GFP_ZERO or superfluous zeroing buffer.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosysfs: Fix regression when adding a file to an existing group
Tyler Hicks [Fri, 27 Jul 2018 21:33:27 +0000 (21:33 +0000)]
sysfs: Fix regression when adding a file to an existing group

Commit 5f81880d5204 ("sysfs, kobject: allow creating kobject belonging
to arbitrary users") incorrectly changed the argument passed as the
parent parameter when calling sysfs_add_file_mode_ns(). This caused some
sysfs attribute files to not be added correctly to certain groups.

Fixes: 5f81880d5204 ("sysfs, kobject: allow creating kobject belonging to arbitrary users")
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'mlx5e-updates-2018-07-27' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Sun, 29 Jul 2018 20:08:42 +0000 (13:08 -0700)]
Merge tag 'mlx5e-updates-2018-07-27' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-07-27 (Vxlan updates)

This series from Gal and Saeed provides updates to mlx5 vxlan implementation.

Gal, started with three cleanups to reflect the actual hardware vxlan state
- reflect 4789 UDP port default addition to software database
- check maximum number of vxlan  UDP ports
- cleanup an unused member in vxlan work

Then Gal provides performance optimization by replacing the
vxlan radix tree with a hash table.

Measuring mlx5e_vxlan_lookup_port execution time:

                      Radix Tree   Hash Table
     --------------- ------------ ------------
      Single Stream   161 ns       79  ns (51% improvement)
      Multi Stream    259 ns       136 ns (47% improvement)

    Measuring UDP stream packet rate, single fully utilized TX core:
    Radix Tree: 498,300 PPS
    Hash Table: 555,468 PPS (11% improvement)

Next, from Saeed, vxlan refactoring to allow sharing the vxlan table
between different mlx5 netdevice instances like PF and VF representors,
this is done by making mlx5 vxlan interface more generic and decoupling
it from PF netdevice structures and logic, then moving it into mlx5 core
as a low level interface so it can be used by VF representors, which is
illustrated in the last patch of the serious.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: qos_dscp_bridge: Fix
Petr Machata [Fri, 27 Jul 2018 22:48:13 +0000 (00:48 +0200)]
selftests: mlxsw: qos_dscp_bridge: Fix

There are two problems in this test case:

- When indexing in bash associative array, the subscript is interpreted as
  string, not as a variable name to be expanded.

- The keys stored to t0s and t1s are not DSCP values, but priority +
  base (i.e. the logical DSCP value, not the full bitfield value).

In combination these two bugs conspire to make the test just work,
except it doesn't really test anything and always passes.

Fix the above two problems in obvious manner.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mtu-related-changes'
David S. Miller [Sun, 29 Jul 2018 19:57:26 +0000 (12:57 -0700)]
Merge branch 'mtu-related-changes'

Stephen Hemminger says:

====================
mtu related changes

While looking at other MTU issues, noticed a couple oppurtunties
for improving user experience.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: report invalid mtu value via netlink extack
Stephen Hemminger [Fri, 27 Jul 2018 20:43:23 +0000 (13:43 -0700)]
net: report invalid mtu value via netlink extack

If an invalid MTU value is set through rtnetlink return extra error
information instead of putting message in kernel log. For other cases
where there is no visible API, keep the error report in the log.

Example:
# ip li set dev enp12s0 mtu 10000
Error: mtu greater than device maximum.

# ifconfig enp12s0 mtu 10000
SIOCSIFMTU: Invalid argument
# dmesg | tail -1
[ 2047.795467] enp12s0: mtu greater than device maximum

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: report min and max mtu network device settings
Stephen Hemminger [Fri, 27 Jul 2018 20:43:22 +0000 (13:43 -0700)]
net: report min and max mtu network device settings

Report the minimum and maximum MTU allowed on a device
via netlink so that it can be displayed by tools like
ip link.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agofailover: change mtu has RTNL
Stephen Hemminger [Fri, 27 Jul 2018 20:43:21 +0000 (13:43 -0700)]
failover: change mtu has RTNL

When changing MTU, RTNL is held so use rtnl_dereference
instead of rcu_dereference.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dcb: add DSCP to comment about priority selector types
Jakub Kicinski [Fri, 27 Jul 2018 20:11:00 +0000 (13:11 -0700)]
net: dcb: add DSCP to comment about priority selector types

Commit ee2059819450 ("net/dcb: Add dscp to priority selector type")
added a define for the new DSCP selector type created by
IEEE 802.1Qcd, but missed the comment enumerating all selector types.
Update the comment.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ethernet: ti: cpsw: add missed RX_CTAG feature for second slave
Ivan Khoronzhuk [Fri, 27 Jul 2018 16:54:39 +0000 (19:54 +0300)]
net: ethernet: ti: cpsw: add missed RX_CTAG feature for second slave

Seems it was missed while adding for first net dev in dual-emac mode.

Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'route-add-support-and-selftests-for-directed-broadcast-forwarding'
David S. Miller [Sun, 29 Jul 2018 19:37:06 +0000 (12:37 -0700)]
Merge branch 'route-add-support-and-selftests-for-directed-broadcast-forwarding'

Xin Long says:

====================
route: add support and selftests for directed broadcast forwarding

Patch 1/2 is the feature and 2/2 is the selftest. Check the changelog
on each of them to know the details.

v1->v2:
  - fix a typo in changelog.
  - fix an uapi break that Davide noticed.
  - flush route cache when bc_forwarding is changed.
  - add the selftest for this patch as Ido's suggestion.

v2->v3:
  - fix an incorrect 'if check' in devinet_conf_proc as David Ahern
    noticed.
  - extend the selftest after one David Ahern fix for vrf.

v3->v4:
  - improve the output log in the selftest as David Ahern suggested.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: add a selftest for directed broadcast forwarding
Xin Long [Fri, 27 Jul 2018 08:37:29 +0000 (16:37 +0800)]
selftests: add a selftest for directed broadcast forwarding

As Ido's suggestion, this patch is to add a selftest for directed
broadcast forwarding with vrf. It does the assertion by checking
the src IP of the echo-reply packet in ping_test_from.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoroute: add support for directed broadcast forwarding
Xin Long [Fri, 27 Jul 2018 08:37:28 +0000 (16:37 +0800)]
route: add support for directed broadcast forwarding

This patch implements the feature described in rfc1812#section-5.3.5.2
and rfc2644. It allows the router to forward directed broadcast when
sysctl bc_forwarding is enabled.

Note that this feature could be done by iptables -j TEE, but it would
cause some problems:
  - target TEE's gateway param has to be set with a specific address,
    and it's not flexible especially when the route wants forward all
    directed broadcasts.
  - this duplicates the directed broadcasts so this may cause side
    effects to applications.

Besides, to keep consistent with other os router like BSD, it's also
necessary to implement it in the route rx path.

Note that route cache needs to be flushed when bc_forwarding is
changed.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind
Vincent Bernat [Wed, 25 Jul 2018 11:19:13 +0000 (13:19 +0200)]
net/ipv6: allow any source address for sendmsg pktinfo with ip_nonlocal_bind

When freebind feature is set of an IPv6 socket, any source address can
be used when sending UDP datagrams using IPv6 PKTINFO ancillary
message. Global non-local bind feature was added in commit
35a256fee52c ("ipv6: Nonlocal bind") for IPv6. This commit also allows
IPv6 source address spoofing when non-local bind feature is enabled.

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'linux-can-next-for-4.19-20180727' of ssh://gitolite.kernel.org/pub/scm...
David S. Miller [Sun, 29 Jul 2018 15:35:03 +0000 (08:35 -0700)]
Merge tag 'linux-can-next-for-4.19-20180727' of ssh://gitolite./linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2018-01-16

this is a pull request for net-next/master consisting of 38 patches.

Dan Murphy's patch fixes the path to a file in the comment of the CAN
Error Message Frame Mask structure.

A patch by Colin Ian King fixes a typo in the cc770 driver.

The next patch is by me an sorts the Kconfigand Makefile entries of the
CAN-USB driver subdir alphabetically.

The patch by Jakob Unterwurzacher adds support for the UCAN USB-CAN
adapter.

YueHaibing's patch replaces a open coded skb_put()+memset() by
skb_put_zero() in the CAN-dev infrastructure.

Zhu Yi provides a patch to enable multi-queue CAN devices.

Three patches by Luc Van Oostenryck fix the return value of several
driver's xmit function, I contribute a patch for the a fourth driver.

Fabio Estevam's patch switches the flexcan driver to SPDX identifier.

Two patches by Jia-Ju Bai replace the mdelay() by a usleep_range() in
the sja1000 drivers.

The next 6 patches are by Anssi Hannula and refactor the xilinx CAN
driver and add support for the xilinx CAN FD core.

A patch by Gustavo A. R. Silva adds fallthrough annotation to the
peak_usb driver.

5 patches by Stephane Grosjean for the peak CANFD driver do some
cleanups and provide more improvements for further firmware releases.

The remaining 13 patches are by Jimmy Assarsson and the first clean up
the kvaser_usb driver, so that the later patches add support for the
Kvaser USB hydra family.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoqed: remove redundant functions qed_set_gft_event_id_cm_hdr
YueHaibing [Fri, 27 Jul 2018 13:24:27 +0000 (21:24 +0800)]
qed: remove redundant functions qed_set_gft_event_id_cm_hdr

There are no in-tree callers of qed_set_gft_event_id_cm_hdr.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoliquidio: remove redundant function cn23xx_dump_vf_iq_regs
YueHaibing [Fri, 27 Jul 2018 11:57:24 +0000 (19:57 +0800)]
liquidio: remove redundant function cn23xx_dump_vf_iq_regs

There are no in-tree callers.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'tls-Fix-improper-revert-in-zerocopy_from_iter'
David S. Miller [Sun, 29 Jul 2018 05:53:31 +0000 (22:53 -0700)]
Merge branch 'tls-Fix-improper-revert-in-zerocopy_from_iter'

Doron Roberts-Kedes says:

====================
tls: Fix improper revert in zerocopy_from_iter

This series fixes the improper iov_iter_revert introcded in
"tls: Fix zerocopy_from_iter iov handling".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotls: Fix improper revert in zerocopy_from_iter
Doron Roberts-Kedes [Thu, 26 Jul 2018 14:59:36 +0000 (07:59 -0700)]
tls: Fix improper revert in zerocopy_from_iter

The current code is problematic because the iov_iter is reverted and
never advanced in the non-error case. This patch skips the revert in the
non-error case. This patch also fixes the amount by which the iov_iter
is reverted. Currently, iov_iter is reverted by size, which can be
greater than the amount by which the iter was actually advanced.
Instead, only revert by the amount that the iter was advanced.

Fixes: 4718799817c5 ("tls: Fix zerocopy_from_iter iov handling")
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotls: Remove dead code in tls_sw_sendmsg
Doron Roberts-Kedes [Thu, 26 Jul 2018 14:59:35 +0000 (07:59 -0700)]
tls: Remove dead code in tls_sw_sendmsg

tls_push_record either returns 0 on success or a negative value on failure.
This patch removes code that would only be executed if tls_push_record
were to return a positive value.

Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mvneta-next'
David S. Miller [Sun, 29 Jul 2018 05:12:55 +0000 (22:12 -0700)]
Merge branch 'mvneta-next'

Gregory CLEMENT says:

====================
A fix and a few improvements on mvneta

This series brings some improvements for the mvneta driver and also
adds a fix.

Compared to the v2, the main change is another patch fixing a bug
in mtu_change.

Changelog:
v1 -> v2

 - In patch 2, use EXPORT_SYMBOL_GPL for mvneta_bm_get and
   mvneta_bm_put to be used in module, reported by kbuild test robot.

 - In patch 4, add the counter to the driver's ethtool state,
   suggested by David Miller.

 - In patch 6, use a single if, suggested by Marcin Wojtas

v2 -> v3

 - Adding a patch fixing the mtu change issue

 - Removing the inline keyword for mvneta_rx_refill() and let the
   comiler decided, suggested by David Miller.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: Improve the buffer allocation method for SWBM
Yelena Krivosheev [Wed, 18 Jul 2018 16:10:57 +0000 (18:10 +0200)]
net: mvneta: Improve the buffer allocation method for SWBM

With system having a small memory (around 256MB), the state "cannot
allocate memory to refill with new buffer" is reach pretty quickly.

By this patch we changed buffer allocation method to a better handling of
this use case by avoiding memory allocation issues.

Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory: extract from a larger patch]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: Verify hardware checksum only when offload checksum feature is set
Yelena Krivosheev [Wed, 18 Jul 2018 16:10:56 +0000 (18:10 +0200)]
net: mvneta: Verify hardware checksum only when offload checksum feature is set

If the checksum offload feature is not set, then there is no point to
check the status of the hardware.

[gregory: extract from a larger patch]
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: Allocate page for the descriptor
Gregory CLEMENT [Wed, 18 Jul 2018 16:10:55 +0000 (18:10 +0200)]
net: mvneta: Allocate page for the descriptor

Instead of trying to allocate the exact amount of memory for each
descriptor use a page for each of them, it allows to simplify the
allocation management and increase the performance of the driver.

Based on the work of Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: discriminate error cause for missed packet
Gregory CLEMENT [Wed, 18 Jul 2018 16:10:54 +0000 (18:10 +0200)]
net: mvneta: discriminate error cause for missed packet

In order to improve the diagnostic in case of error, make the distinction
between refill error and skb allocation error. Also make the information
available through the ethtool state.

Based on the work of Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: increase number of buffers in RX and TX queue
Yelena Krivosheev [Wed, 18 Jul 2018 16:10:53 +0000 (18:10 +0200)]
net: mvneta: increase number of buffers in RX and TX queue

The initial values were too small leading to poor performance when using
the software buffer management.

Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory: extract from a larger patch]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: remove data pointer usage from device_node structure
Gregory CLEMENT [Wed, 18 Jul 2018 16:10:52 +0000 (18:10 +0200)]
net: mvneta: remove data pointer usage from device_node structure

On year ago Rob Herring wanted to remove the data pointer from the
device_node structure[1]. The mvneta driver seemed to be the only one
which used (abused ?) it. However, the proposal of Rob to remove this
pointer from the driver introduced a regression, and I tested and fixed an
alternative way, but it was never submitted as a proper patch.

Now here it is: Instead of using the device_node structure ->data
pointer, we store the BM private data as the driver data of the BM
platform_device. The core mvneta code can retrieve it by doing a lookup
on which platform_device corresponds to the BM device tree node using
of_find_device_by_node(), and get its driver data

[1]https://www.spinics.net/lists/netdev/msg445197.html

Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: mvneta: fix mtu change on port without link
Yelena Krivosheev [Wed, 18 Jul 2018 16:10:51 +0000 (18:10 +0200)]
net: mvneta: fix mtu change on port without link

It is incorrect to enable TX/RX queues (call by mvneta_port_up()) for
port without link. Indeed MTU change for interface without link causes TX
queues to stuck.

Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP
network unit")
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
[gregory.clement: adding Fixes tags and rewording commit log]
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: ethernet: mvneta: Fix napi structure mixup on armada 3700
Andrew Lunn [Wed, 18 Jul 2018 16:10:50 +0000 (18:10 +0200)]
net: ethernet: mvneta: Fix napi structure mixup on armada 3700

The mvneta Ethernet driver is used on a few different Marvell SoCs.
Some SoCs have per cpu interrupts for Ethernet events. Some SoCs have
a single interrupt, independent of the CPU. The driver handles this by
having a per CPU napi structure when there are per CPU interrupts, and
a global napi structure when there is a single interrupt.

When the napi core calls mvneta_poll(), it passes the napi
instance. This was not being propagated through the call chain, and
instead the per-cpu napi instance was passed to napi_gro_receive()
call. This breaks when there is a single global napi instance.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 2636ac3cc2b4 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/mlx5e: Issue direct lookup on vxlan ports by vport representors
Saeed Mahameed [Sat, 19 May 2018 12:34:48 +0000 (05:34 -0700)]
net/mlx5e: Issue direct lookup on vxlan ports by vport representors

Remove uplink representor netdevice private structure lookup, and use
mlx5 core handle directly from representor private structure to lookup
vxlan ports.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, move vxlan logic to core driver
Saeed Mahameed [Wed, 9 May 2018 20:28:00 +0000 (13:28 -0700)]
net/mlx5e: Vxlan, move vxlan logic to core driver

Move vxlan logic and objects to mlx5 core dirver.
Since it going to be used from different mlx5 interfaces.
e.g. mlx5e PF NIC netdev and mlx5e E-Switch representors.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, add sync lock for add/del vxlan port
Saeed Mahameed [Tue, 8 May 2018 23:17:06 +0000 (16:17 -0700)]
net/mlx5e: Vxlan, add sync lock for add/del vxlan port

Vxlan API can and will be called from different mlx5 modules, we should
not count on mlx5e private state lock only, hence we introduce a vxlan
private mutex to sync between add/del vxlan port operations.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, return values for add/del port
Saeed Mahameed [Tue, 8 May 2018 21:33:19 +0000 (14:33 -0700)]
net/mlx5e: Vxlan, return values for add/del port

For a better API mlx5_vxlan_{add/del}_port can fail, make them return
error values.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, rename from mlx5e to mlx5
Saeed Mahameed [Tue, 8 May 2018 09:51:23 +0000 (02:51 -0700)]
net/mlx5e: Vxlan, rename from mlx5e to mlx5

Rename vxlan functions from mlx5e_vxlan_* to mlx5_vxlan_*.
Rename mlx5e_vxlan_db to mlx5_vxlan and move it from en.h to vxlan.c
since it is not related to mlx5e anymore.

Allocate mlx5_vxlan structure dynamically in order to make it easier to
move later to core driver and to make it private in vxlan.c.

This is in preparation to move vxlan API to mlx5 core.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, rename struct mlx5e_vxlan to mlx5_vxlan_port
Saeed Mahameed [Tue, 8 May 2018 09:23:49 +0000 (02:23 -0700)]
net/mlx5e: Vxlan, rename struct mlx5e_vxlan to mlx5_vxlan_port

The name mlx5e_vxlan will be used in downstream patch to describe
mlx5 vxlan structure that will replace mlx5e_vxlan_db.

Hence we rename struct mlx5e_vxlan to mlx5_vxlan_port which describes a
mlx5 vxlan port.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, move netdev only logic to en_main.c
Saeed Mahameed [Tue, 8 May 2018 08:49:43 +0000 (01:49 -0700)]
net/mlx5e: Vxlan, move netdev only logic to en_main.c

Create a direct vxlan API to add and delete vxlan ports from HW.
+void mlx5e_vxlan_add_port(struct mlx5e_priv *priv, u16 port);
+void mlx5e_vxlan_del_port(struct mlx5e_priv *priv, u16 port);

And move vxlan_add/del_work to en_main.c since they are netdev only
logic.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, add direct delete function
Saeed Mahameed [Tue, 8 May 2018 01:06:26 +0000 (18:06 -0700)]
net/mlx5e: Vxlan, add direct delete function

Add direct vxlan delete function to be called from vxlan_delete_work.
Needed in downstream patch.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
6 years agonet/mlx5e: Vxlan, cleanup an unused member in vxlan work
Gal Pressman [Tue, 13 Feb 2018 08:31:26 +0000 (10:31 +0200)]
net/mlx5e: Vxlan, cleanup an unused member in vxlan work

Cleanup the sa_family member of the vxlan work, it is unused/needed
anywhere in the code.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Vxlan, replace ports radix-tree with hash table
Gal Pressman [Tue, 26 Dec 2017 16:27:08 +0000 (18:27 +0200)]
net/mlx5e: Vxlan, replace ports radix-tree with hash table

The VXLAN database is accessed in the data path for each VXLAN TX skb in
order to check whether the UDP port is being offloaded or not.
The number of elements in the database is relatively small, we can
simplify the radix-tree to a hash table and speedup the lookup process.

Measuring mlx5e_vxlan_lookup_port execution time:

                  Radix Tree   Hash Table
 --------------- ------------ ------------
  Single Stream   161 ns       79  ns (51% improvement)
  Multi Stream    259 ns       136 ns (47% improvement)

Measuring UDP stream packet rate, single fully utilized TX core:
Radix Tree: 498,300 PPS
Hash Table: 555,468 PPS (11% improvement)

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Vxlan, check maximum number of UDP ports
Gal Pressman [Mon, 25 Dec 2017 16:40:52 +0000 (18:40 +0200)]
net/mlx5e: Vxlan, check maximum number of UDP ports

The NIC has a limited number of offloaded VXLAN UDP ports (usually 4).
Instead of letting the firmware fail when trying to add more ports than
it can handle, let the driver check it on its own.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Vxlan, reflect 4789 UDP port default addition to software database
Gal Pressman [Wed, 17 Jan 2018 09:02:31 +0000 (11:02 +0200)]
net/mlx5e: Vxlan, reflect 4789 UDP port default addition to software database

The hardware offloads 4789 UDP port (default VXLAN port) automatically.
Add it to the software database as well in order to reflect the hardware
state appropriately.

Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet: tipc: bcast: Replace GFP_ATOMIC with GFP_KERNEL in tipc_bcast_init()
Jia-Ju Bai [Fri, 27 Jul 2018 09:31:35 +0000 (17:31 +0800)]
net: tipc: bcast: Replace GFP_ATOMIC with GFP_KERNEL in tipc_bcast_init()

tipc_bcast_init() is never called in atomic context.
It calls kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: tipc: name_table: Replace GFP_ATOMIC with GFP_KERNEL in tipc_nametbl_init()
Jia-Ju Bai [Fri, 27 Jul 2018 09:28:25 +0000 (17:28 +0800)]
net: tipc: name_table: Replace GFP_ATOMIC with GFP_KERNEL in tipc_nametbl_init()

tipc_nametbl_init() is never called in atomic context.
It calls kzalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: sr9700: Replace mdelay() with msleep() in sr9700_bind()
Jia-Ju Bai [Fri, 27 Jul 2018 08:41:04 +0000 (16:41 +0800)]
net: usb: sr9700: Replace mdelay() with msleep() in sr9700_bind()

sr9700_bind() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: pegasus: Replace mdelay() with msleep() in setup_pegasus_II()
Jia-Ju Bai [Fri, 27 Jul 2018 08:36:29 +0000 (16:36 +0800)]
net: usb: pegasus: Replace mdelay() with msleep() in setup_pegasus_II()

setup_pegasus_II() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: marvell: Replace mdelay() with msleep() in m88e1116r_config_init()
Jia-Ju Bai [Fri, 27 Jul 2018 08:34:25 +0000 (16:34 +0800)]
net: phy: marvell: Replace mdelay() with msleep() in m88e1116r_config_init()

m88e1116r_config_init() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: nvidia: forcedeth: Replace GFP_ATOMIC with GFP_KERNEL in nv_probe()
Jia-Ju Bai [Fri, 27 Jul 2018 08:29:31 +0000 (16:29 +0800)]
net: nvidia: forcedeth: Replace GFP_ATOMIC with GFP_KERNEL in nv_probe()

nv_probe() is never called in atomic context.
It calls dma_alloc_coherent() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: jme: Replace mdelay() with msleep() and usleep_range() in jme_wait_link()
Jia-Ju Bai [Fri, 27 Jul 2018 08:25:07 +0000 (16:25 +0800)]
net: jme: Replace mdelay() with msleep() and usleep_range() in jme_wait_link()

jme_wait_link() is never called in atomic context.
It calls mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep() and usleep_range().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: hisilicon: hns: Replace mdelay() with msleep()
Jia-Ju Bai [Fri, 27 Jul 2018 08:01:41 +0000 (16:01 +0800)]
net: hisilicon: hns: Replace mdelay() with msleep()

hns_ppe_common_init_hw() and hns_xgmac_init() are never
called in atomic context.
They call mdelay() to busily wait, which is not necessary.
mdelay() can be replaced with msleep().

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: amd: pcnet32: Replace GFP_ATOMIC with GFP_KERNEL in pcnet32_alloc_ring()
Jia-Ju Bai [Fri, 27 Jul 2018 07:57:58 +0000 (15:57 +0800)]
net: amd: pcnet32: Replace GFP_ATOMIC with GFP_KERNEL in pcnet32_alloc_ring()

pcnet32_alloc_ring() is never called in atomic context.
It calls kcalloc() with GFP_ATOMIC, which is not necessary.
GFP_ATOMIC can be replaced with GFP_KERNEL.

This is found by a static analysis tool named DCNS written by myself.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosch_cake: Make gso-splitting configurable from userspace
Dave Taht [Fri, 27 Jul 2018 02:45:10 +0000 (19:45 -0700)]
sch_cake: Make gso-splitting configurable from userspace

This patch restores cake's deployed behavior at line rate to always
split gso, and makes gso splitting configurable from userspace.

running cake unlimited (unshaped) at 1gigE, local traffic:

no-split-gso bql limit: 131966
split-gso bql limit:   ~42392-45420

On this 4 stream test splitting gso apart results in halving the
observed interpacket latency at no loss in throughput.

Summary of tcp_nup test run 'gso-split' (at 2018-07-26 16:03:51.824728):

 Ping (ms) ICMP :         0.83         0.81 ms              341
 TCP upload avg :       235.43       235.39 Mbits/s         301
 TCP upload sum :       941.71       941.56 Mbits/s         301
 TCP upload::1  :       235.45       235.43 Mbits/s         271
 TCP upload::2  :       235.45       235.41 Mbits/s         289
 TCP upload::3  :       235.40       235.40 Mbits/s         288
 TCP upload::4  :       235.41       235.40 Mbits/s         291

verses

Summary of tcp_nup test run 'no-split-gso' (at 2018-07-26 16:37:23.563960):

                           avg       median          # data pts
 Ping (ms) ICMP :         1.67         1.73 ms              348
 TCP upload avg :       234.56       235.37 Mbits/s         301
 TCP upload sum :       938.24       941.49 Mbits/s         301
 TCP upload::1  :       234.55       235.38 Mbits/s         285
 TCP upload::2  :       234.57       235.37 Mbits/s         286
 TCP upload::3  :       234.58       235.37 Mbits/s         274
 TCP upload::4  :       234.54       235.42 Mbits/s         288

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: print ULD queue information managed by LLD
Rahul Lakkireddy [Fri, 27 Jul 2018 08:59:22 +0000 (14:29 +0530)]
cxgb4: print ULD queue information managed by LLD

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'l2tp-remove-unused-session-fields'
David S. Miller [Fri, 27 Jul 2018 20:34:54 +0000 (13:34 -0700)]
Merge branch 'l2tp-remove-unused-session-fields'

Guillaume Nault says:

====================
l2tp: remove unused session fields

Several fields of the session structures can be set, but remain unused
otherwise.

This series removes these fields and explicitely ignores the
associated ioctls and netlink attributes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agol2tp: drop ->mru from struct l2tp_session
Guillaume Nault [Fri, 27 Jul 2018 09:00:00 +0000 (11:00 +0200)]
l2tp: drop ->mru from struct l2tp_session

This field is not used.

Treat PPPIOC*MRU the same way as PPPIOC*FLAGS: "get" requests return 0,
while "set" requests vadidate the user supplied pointer but discard its
value.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agol2tp: drop ->flags from struct pppol2tp_session
Guillaume Nault [Fri, 27 Jul 2018 08:59:59 +0000 (10:59 +0200)]
l2tp: drop ->flags from struct pppol2tp_session

This field is not used.

Keep validating user input in PPPIOCSFLAGS. Even though we discard the
value, it would look wrong to succeed if an invalid address was passed
from userspace.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agol2tp: ignore L2TP_ATTR_VLAN_ID netlink attribute
Guillaume Nault [Fri, 27 Jul 2018 08:59:58 +0000 (10:59 +0200)]
l2tp: ignore L2TP_ATTR_VLAN_ID netlink attribute

The value of this attribute is never used.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agol2tp: ignore L2TP_ATTR_DATA_SEQ netlink attribute
Guillaume Nault [Fri, 27 Jul 2018 08:59:57 +0000 (10:59 +0200)]
l2tp: ignore L2TP_ATTR_DATA_SEQ netlink attribute

The value of this attribute is never used.

Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/rds/Kconfig: Correct the RDS depends
Anders Roxell [Fri, 27 Jul 2018 13:18:49 +0000 (15:18 +0200)]
net/rds/Kconfig: Correct the RDS depends

Remove prefix 'CONFIG_' from CONFIG_IPV6

Fixes: ba7d7e2677c0 ("net/rds/Kconfig: RDS should depend on IPV6")
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'mlxsw-Support-DSCP-prioritization-and-rewrite'
David S. Miller [Fri, 27 Jul 2018 20:17:50 +0000 (13:17 -0700)]
Merge branch 'mlxsw-Support-DSCP-prioritization-and-rewrite'

Ido Schimmel says:

====================
mlxsw: Support DSCP prioritization and rewrite

Petr says:

On ingress, a network device such as a switch assigns to packets
priority based on various criteria. Common options include interpreting
PCP and DSCP fields according to user configuration. When a packet
egresses the switch, a reverse process may rewrite PCP and/or DSCP
headers according to packet priority.

So far, mlxsw has supported prioritization based on PCP (802.1p priority
tag). This patch set introduces support for prioritization based on
DSCP, and DSCP rewrite.

To configure the DSCP-to-priority maps, the user is expected to invoke
ieee_setapp and ieee_delapp DCBNL ops, e.g. by using lldptool:

To decide whether or not to pay attention to DSCP values, the Spectrum
switch recognize a per-port configuration of trust level. Until the
first APP rule is added for a given port, this port's trust level stays
at PCP, meaning that PCP is used for packet prioritization. With the
first DSCP APP rule, the port is configured to trust DSCP instead, and
it stays there until all DSCP APP rules are removed again.

Besides the DSCP (value 5) selector, another selector that plays into
packet prioritization is Ethernet type (value 1) with PID of 0. Such APP
entries denote default priority[1]:

With this patch set, mlxsw uses these values to configure priority for
DSCP values not explicitly specified in DSCP APP map. In the future we
expect to also use this to configure default port priority for untagged
packets.

Access to DSCP-to-priority map, priority-to-DSCP map, and default
priority for a port is exposed through three new DCB helpers. Like the
already-existing dcb_ieee_getapp_mask() helper, these helpers operate in
terms of bitmaps, to support the arbitrary M:N mapping that the APP
rules allow. Such interface presents all the relevant information from
the APP database without necessitating exposition of iterators, locking
or other complex primitives. It is up to the driver to then digest the
mapping in a way that the device supports. In this patch set, mlxsw
resolves conflicts by favoring higher-numbered DSCP values and
priorities.

In this patchset:

- Patch #1 fixes a bug in DCB APP database management.
- Patch #2 adds the getters described above.
- Patches #3-#6 add Spectrum configuration registers.
- Patch #7 adds the mlxsw logic that configures the device according to
  APP rules.
- Patch #8 adds a self-test. The test is added to the subdirectory
  drivers/net/mlxsw. Even though it's not particularly specific to
  mlxsw, it's not suitable for running on soft devices (which don't
  support the ieee_getapp et.al.), and thus isn't a good fit for the
  general net/forwarding directory.

[1] 802.1Q-2014, Table D-9
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoselftests: mlxsw: Add test for trust-DSCP
Petr Machata [Fri, 27 Jul 2018 12:27:02 +0000 (15:27 +0300)]
selftests: mlxsw: Add test for trust-DSCP

Add a test that exercises the new code. Send DSCP-tagged packets, and
observe how they are prioritized in the switch and the DSCP is updated
on egress again.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: spectrum: Support ieee_setapp, ieee_delapp
Petr Machata [Fri, 27 Jul 2018 12:27:01 +0000 (15:27 +0300)]
mlxsw: spectrum: Support ieee_setapp, ieee_delapp

The APP TLVs are used for communicating priority-to-protocol ID maps for
a given netdevice. Support the following APP TLVs:

- DSCP (selector 5) to configure priority-to-DSCP code point maps. Use
  these maps to configure packet priority on ingress, and DSCP code
  point rewrite on egress.

- Default priority (selector 1, PID 0) to configure priority for the
  DSCP code points that don't have one assigned by the DSCP selector. In
  future this could also be used for assigning default port priority
  when a packet arrives without DSCP tagging.

Besides setting up the maps themselves, also configure port trust level
and rewrite bits.

Port trust level determines whether, for a packet arriving through a
certain port, the priority should be determined based on PCP or DSCP
header fields. So far, mlxsw kept the device default of trust-PCP. Now,
as soon as the first DSCP APP TLV is configured, switch to trust-DSCP.
Only when all DSCP APP TLVs are removed, switch back to trust-PCP again.
Note that the default priority APP TLV doesn't impact the trust level
configuration.

Rewrite bits determine whether DSCP and PCP fields of egressing packets
should be updated according to switch priority. When port trust is
switched to DSCP, enable rewrite of DSCP field.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: reg: Add QoS Priority to DSCP Mapping Register
Petr Machata [Fri, 27 Jul 2018 12:27:00 +0000 (15:27 +0300)]
mlxsw: reg: Add QoS Priority to DSCP Mapping Register

This register controls mapping from Priority to DSCP for purposes of
rewrite. Note that rewrite happens as the packet is transmitted provided
that the DSCP rewrite bit is enabled for the packet.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: reg: Add QoS ReWrite Enable Register
Petr Machata [Fri, 27 Jul 2018 12:26:59 +0000 (15:26 +0300)]
mlxsw: reg: Add QoS ReWrite Enable Register

This register configures the rewrite enable (whether PCP or DSCP value
in packet should be updated according to packet priority) per receive
port.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: reg: Add QoS Priority Trust State Register
Petr Machata [Fri, 27 Jul 2018 12:26:58 +0000 (15:26 +0300)]
mlxsw: reg: Add QoS Priority Trust State Register

The QPTS register controls the port policy to calculate the switch
priority and packet color based on incoming packet fields.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agomlxsw: reg: Add QoS Port DSCP to Priority Mapping Register
Petr Machata [Fri, 27 Jul 2018 12:26:57 +0000 (15:26 +0300)]
mlxsw: reg: Add QoS Port DSCP to Priority Mapping Register

The QPDPM register controls the mapping from DSCP field to Switch
Priority for IP packets.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dcb: Add priority-to-DSCP map getters
Petr Machata [Fri, 27 Jul 2018 12:26:56 +0000 (15:26 +0300)]
net: dcb: Add priority-to-DSCP map getters

On ingress, a network device such as a switch assigns to packets
priority based on various criteria. Common options include interpreting
PCP and DSCP fields according to user configuration. When a packet
egresses the switch, a reverse process may rewrite PCP and/or DSCP
values according to packet priority.

The following three functions support a) obtaining a DSCP-to-priority
map or vice versa, and b) finding default-priority entries in APP
database.

The DCB subsystem supports for APP entries a very generous M:N mapping
between priorities and protocol identifiers. Understandably,
several (say) DSCP values can map to the same priority. But this
asymmetry holds the other way around as well--one priority can map to
several DSCP values. For this reason, the following functions operate in
terms of bitmaps, with ones in positions that match some APP entry.

- dcb_ieee_getapp_dscp_prio_mask_map() to compute for a given netdevice
  a map of DSCP-to-priority-mask, which gives for each DSCP value a
  bitmap of priorities related to that DSCP value by APP, along the
  lines of dcb_ieee_getapp_mask().

- dcb_ieee_getapp_prio_dscp_mask_map() similarly to compute for a given
  netdevice a map from priorities to a bitmap of DSCPs.

- dcb_ieee_getapp_default_prio_mask() which finds all default-priority
  rules for a given port in APP database, and returns a mask of
  priorities allowed by these default-priority rules.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dcb: For wild-card lookups, use priority -1, not 0
Petr Machata [Fri, 27 Jul 2018 12:26:55 +0000 (15:26 +0300)]
net: dcb: For wild-card lookups, use priority -1, not 0

The function dcb_app_lookup walks the list of specified DCB APP entries,
looking for one that matches a given criteria: ifindex, selector,
protocol ID and optionally also priority. The "don't care" value for
priority is set to 0, because that priority has not been allowed under
CEE regime, which predates the IEEE standardization.

Under IEEE, 0 is a valid priority number. But because dcb_app_lookup
considers zero a wild card, attempts to add an APP entry with priority 0
fail when other entries exist for a given ifindex / selector / PID
triplet.

Fix by changing the wild-card value to -1.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: sched: don't dump chains only held by actions
Jiri Pirko [Fri, 27 Jul 2018 07:45:05 +0000 (09:45 +0200)]
net: sched: don't dump chains only held by actions

In case a chain is empty and not explicitly created by a user,
such chain should not exist. The only exception is if there is
an action "goto chain" pointing to it. In that case, don't show the
chain in the dump. Track the chain references held by actions and
use them to find out if a chain should or should not be shown
in chain dump.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec...
David S. Miller [Fri, 27 Jul 2018 16:33:37 +0000 (09:33 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2018-07-27

1) Extend the output_mark to also support the input direction
   and masking the mark values before applying to the skb.

2) Add a new lookup key for the upcomming xfrm interfaces.

3) Extend the xfrm lookups to match xfrm interface IDs.

4) Add virtual xfrm interfaces. The purpose of these interfaces
   is to overcome the design limitations that the existing
   VTI devices have.

  The main limitations that we see with the current VTI are the
  following:

  VTI interfaces are L3 tunnels with configurable endpoints.
  For xfrm, the tunnel endpoint are already determined by the SA.
  So the VTI tunnel endpoints must be either the same as on the
  SA or wildcards. In case VTI tunnel endpoints are same as on
  the SA, we get a one to one correlation between the SA and
  the tunnel. So each SA needs its own tunnel interface.

  On the other hand, we can have only one VTI tunnel with
  wildcard src/dst tunnel endpoints in the system because the
  lookup is based on the tunnel endpoints. The existing tunnel
  lookup won't work with multiple tunnels with wildcard
  tunnel endpoints. Some usecases require more than on
  VTI tunnel of this type, for example if somebody has multiple
  namespaces and every namespace requires such a VTI.

  VTI needs separate interfaces for IPv4 and IPv6 tunnels.
  So when routing to a VTI, we have to know to which address
  family this traffic class is going to be encapsulated.
  This is a lmitation because it makes routing more complex
  and it is not always possible to know what happens behind the
  VTI, e.g. when the VTI is move to some namespace.

  VTI works just with tunnel mode SAs. We need generic interfaces
  that ensures transfomation, regardless of the xfrm mode and
  the encapsulated address family.

  VTI is configured with a combination GRE keys and xfrm marks.
  With this we have to deal with some extra cases in the generic
  tunnel lookup because the GRE keys on the VTI are actually
  not GRE keys, the GRE keys were just reused for something else.
  All extensions to the VTI interfaces would require to add
  even more complexity to the generic tunnel lookup.

  So to overcome this, we developed xfrm interfaces with the
  following design goal:

  It should be possible to tunnel IPv4 and IPv6 through the same
  interface.

  No limitation on xfrm mode (tunnel, transport and beet).

  Should be a generic virtual interface that ensures IPsec
  transformation, no need to know what happens behind the
  interface.

  Interfaces should be configured with a new key that must match a
  new policy/SA lookup key.

  The lookup logic should stay in the xfrm codebase, no need to
  change or extend generic routing and tunnel lookups.

  Should be possible to use IPsec hardware offloads of the underlying
  interface.

5) Remove xfrm pcpu policy cache. This was added after the flowcache
   removal, but it turned out to make things even worse.
   From Florian Westphal.

6) Allow to update the set mark on SA updates.
   From Nathan Harold.

7) Convert some timestamps to time64_t.
   From Arnd Bergmann.

8) Don't check the offload_handle in xfrm code,
   it is an opaque data cookie for the driver.
   From Shannon Nelson.

9) Remove xfrmi interface ID from flowi. After this pach
   no generic code is touched anymore to do xfrm interface
   lookups. From Benedict Wong.

10) Allow to update the xfrm interface ID on SA updates.
    From Nathan Harold.

11) Don't pass zero to ERR_PTR() in xfrm_resolve_and_create_bundle.
    From YueHaibing.

12) Return more detailed errors on xfrm interface creation.
    From Benedict Wong.

13) Use PTR_ERR_OR_ZERO instead of IS_ERR + PTR_ERR.
    From the kbuild test robot.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocan: kvaser_usb: Simplify struct kvaser_cmd_cardinfo
Jimmy Assarsson [Fri, 16 Feb 2018 13:41:06 +0000 (14:41 +0100)]
can: kvaser_usb: Simplify struct kvaser_cmd_cardinfo

serial_number_high can be removed from the struct since it is never used in
the USBcan II firmware.

Signed-off-by: Jimmy Assarsson <jimmyassarsson@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Add support for Kvaser USB hydra family
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:29 +0000 (23:29 +0200)]
can: kvaser_usb: Add support for Kvaser USB hydra family

This patch adds support for a new Kvaser USB family, denoted hydra.
The hydra family currently contains USB devices with one CAN channel
up to five. There are devices with and without CAN FD support.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Christer Beskow <chbe@kvaser.com>
Signed-off-by: Nicklas Johansson <extnj@kvaser.com>
Signed-off-by: Martin Henriksson <mh@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Split driver into kvaser_usb_core.c and kvaser_usb_leaf.c
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:28 +0000 (23:29 +0200)]
can: kvaser_usb: Split driver into kvaser_usb_core.c and kvaser_usb_leaf.c

First part of adding support for Kvaser USB device family "hydra".

Split kvaser_usb.c into kvaser_usb/kvaser_usb{.h,_core.c,_leaf.c}.

kvaser_usb_core.c contains common functionality, such as USB
writing/reading and allocation of netdev.
kvaser_usb_leaf.c contains device specific code, used in
kvaser_usb_core.c.

struct kvaser_usb_dev_ops contains device specific functions that are
common for all devices in the family. While, struct kvaser_usb_dev_cfg
describes the device configurations in terms of CAN clock frequency,
timestamp frequency and CAN controller bittiming constants.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Add SPDX GPL-2.0 license identifier
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:27 +0000 (23:29 +0200)]
can: kvaser_usb: Add SPDX GPL-2.0 license identifier

Add SPDX GPL-2.0 license identifier to kvaser_usb.c.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Fix typos
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:26 +0000 (23:29 +0200)]
can: kvaser_usb: Fix typos

Fix some typos. Change can to CAN, when referring to
Controller Area Network.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Improve logging messages
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:25 +0000 (23:29 +0200)]
can: kvaser_usb: Improve logging messages

Replace dev->udev->dev.parent with &dev->intf->dev, when it is the
first argument passed to dev_* logging function call.

This will result in:
kvaser_usb 1-2:1.0: Format error
compared to
usb 1-2: Format error

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Refactor kvaser_usb_init_one()
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:24 +0000 (23:29 +0200)]
can: kvaser_usb: Refactor kvaser_usb_init_one()

Replace first parameter in kvaser_usb_init_one() with a pointer to
struct kvaser_usb.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Refactor kvaser_usb_get_endpoints()
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:23 +0000 (23:29 +0200)]
can: kvaser_usb: Refactor kvaser_usb_get_endpoints()

Replace parameters with struct kvaser_usb pointer. Rename the function
from kvaser_usb_get_endpoints() to kvaser_usb_setup_endpoints().

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Add pointer to struct usb_interface into struct kvaser_usb
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:22 +0000 (23:29 +0200)]
can: kvaser_usb: Add pointer to struct usb_interface into struct kvaser_usb

Add pointer to struct usb_interface into struct kvaser_usb.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Replace USB timeout constants with one define
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:21 +0000 (23:29 +0200)]
can: kvaser_usb: Replace USB timeout constants with one define

Replace USB timeout constants used when sending and receiving, with a
single constant.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Rename message/msg to command/cmd
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:20 +0000 (23:29 +0200)]
can: kvaser_usb: Rename message/msg to command/cmd

Rename message to command and msg to cmd, where appropriate. To make the
code more readable and to better match Kvaser's Linux drivers.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Remove unused commands and defines
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:19 +0000 (23:29 +0200)]
can: kvaser_usb: Remove unused commands and defines

Remove unused commands:
  struct kvaser_msg_cardinfo2
  struct leaf_msg_tx_acknowledge
  struct usbcan_msg_tx_acknowledge

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: kvaser_usb: Remove unnecessary return
Jimmy Assarsson [Wed, 18 Jul 2018 21:29:18 +0000 (23:29 +0200)]
can: kvaser_usb: Remove unnecessary return

Remove unnecessary return at end of void function.

Signed-off-by: Jimmy Assarsson <extja@kvaser.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: peak_canfd: rearrange the way resources are released
Stephane Grosjean [Thu, 21 Jun 2018 13:23:30 +0000 (15:23 +0200)]
can: peak_canfd: rearrange the way resources are released

This patch improves the sequence the resources are released by, first,

- disabling the IRQ in the controller, then by
- resetting the DMA logic, and finally, by
- adding a read cycle to ensure that the above commands have been received

before freeing the system interrupt.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: peak_canfd: fix typo in error message
Stephane Grosjean [Thu, 21 Jun 2018 13:23:29 +0000 (15:23 +0200)]
can: peak_canfd: fix typo in error message

This patch fixes a typo in the error message in pciefd_can_probe().

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: peak_canfd: use ndev irq instead of pci_dev one
Stephane Grosjean [Thu, 21 Jun 2018 13:23:28 +0000 (15:23 +0200)]
can: peak_canfd: use ndev irq instead of pci_dev one

This cosmetic change should facilitate in the future the use of MSI
rather than legacy INTx interrupts.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>