openwrt/staging/blogic.git
7 years agobpf, doc: Add arm32 as arch supporting eBPF JIT
Shubham Bansal [Wed, 23 Aug 2017 15:59:10 +0000 (21:29 +0530)]
bpf, doc: Add arm32 as arch supporting eBPF JIT

As eBPF JIT support for arm32 was added recently with
commit 39c13c204bb1150d401e27d41a9d8b332be47c49, it seems appropriate to
add arm32 as arch with support for eBPF JIT in bpf and sysctl docs as well.

Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bpf-verifier-fixes'
David S. Miller [Thu, 24 Aug 2017 05:38:08 +0000 (22:38 -0700)]
Merge branch 'bpf-verifier-fixes'

Edward Cree says:

====================
bpf: verifier fixes

Fix a couple of bugs introduced in my recent verifier patches.
Patch #2 does slightly increase the insn count on bpf_lxc.o, but only by
 about a hundred insns (i.e. 0.2%).

v2: added test for write-marks bug (patch #1); reworded comment on
 propagate_liveness() for clarity.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: document liveness analysis
Edward Cree [Wed, 23 Aug 2017 14:11:21 +0000 (15:11 +0100)]
bpf/verifier: document liveness analysis

The liveness tracking algorithm is quite subtle; add comments to explain it.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: remove varlen_map_value_access flag
Edward Cree [Wed, 23 Aug 2017 14:10:50 +0000 (15:10 +0100)]
bpf/verifier: remove varlen_map_value_access flag

The optimisation it does is broken when the 'new' register value has a
 variable offset and the 'old' was constant.  I broke it with my pointer
 types unification (see Fixes tag below), before which the 'new' value
 would have type PTR_TO_MAP_VALUE_ADJ and would thus not compare equal;
 other changes in that patch mean that its original behaviour (ignore
 min/max values) cannot be restored.
Tests on a sample set of cilium programs show no change in count of
 processed instructions.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: add a test for a pruning bug in the verifier
Alexei Starovoitov [Wed, 23 Aug 2017 14:10:26 +0000 (15:10 +0100)]
selftests/bpf: add a test for a pruning bug in the verifier

The test makes a read through a map value pointer, then considers pruning
 a branch where the register holds an adjusted map value pointer.  It
 should not prune, but currently it does.

Signed-off-by: Alexei Starovoitov <ast@fb.com>
[ecree@solarflare.com: added test-name and patch description]
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: when pruning a branch, ignore its write marks
Edward Cree [Wed, 23 Aug 2017 14:10:03 +0000 (15:10 +0100)]
bpf/verifier: when pruning a branch, ignore its write marks

The fact that writes occurred in reaching the continuation state does
 not screen off its reads from us, because we're not really its parent.
So detect 'not really the parent' in do_propagate_liveness, and ignore
 write marks in that case.
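
The rule described above can be sketched as a small model. This is only an illustration, not the kernel code: the `State` class, the `NUM_REGS` value and the flag constants are simplified stand-ins, and the real logic lives in do_propagate_liveness() in the verifier.

```python
from dataclasses import dataclass, field
from typing import Optional

# Simplified stand-ins for the verifier's liveness flags (assumed values).
REG_LIVE_NONE = 0
REG_LIVE_READ = 1
REG_LIVE_WRITTEN = 2
NUM_REGS = 10  # R0..R9; the frame pointer is left out of this model


@dataclass
class State:
    parent: Optional["State"] = None
    live: list = field(default_factory=lambda: [REG_LIVE_NONE] * NUM_REGS)


def propagate_liveness(state: State, parent: State) -> bool:
    """Copy read marks from `state` into `parent`; return True if any changed."""
    # Write marks only screen off reads when `parent` really is the parent.
    observe_writes = parent is state.parent
    touched = False
    for reg in range(NUM_REGS):
        if parent.live[reg] & REG_LIVE_READ:
            continue  # already marked as read
        if observe_writes and state.live[reg] & REG_LIVE_WRITTEN:
            continue  # written on the way from the real parent
        if state.live[reg] & REG_LIVE_READ:
            parent.live[reg] |= REG_LIVE_READ
            touched = True
    return touched
```

When a branch is pruned against an already-explored state, the state being pruned is not that state's parent, so `observe_writes` is false in this model and the read marks are copied regardless of write marks.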

Fixes: dc503a8ad984 ("bpf/verifier: track liveness for pruning")
Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: add a test for a bug in liveness-based pruning
Edward Cree [Wed, 23 Aug 2017 14:09:46 +0000 (15:09 +0100)]
selftests/bpf: add a test for a bug in liveness-based pruning

Writes in straight-line code should not prevent reads from propagating
 along jumps.  With current verifier code, the jump from 3 to 5 does not
 add a read mark on 3:R0 (because 5:R0 has a write mark), meaning that
 the jump from 1 to 3 gets pruned as safe even though R0 is NOT_INIT.

Verifier output:
0: (61) r2 = *(u32 *)(r1 +0)
1: (35) if r2 >= 0x0 goto pc+1
 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
2: (b7) r0 = 0
3: (35) if r2 >= 0x0 goto pc+1
 R0=inv0 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
4: (b7) r0 = 0
5: (95) exit

from 3 to 5: safe

from 1 to 3: safe
processed 8 insns, stack depth 0

Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agogre: remove duplicated assignment of iph
Colin Ian King [Wed, 23 Aug 2017 11:59:48 +0000 (12:59 +0100)]
gre: remove duplicated assignment of iph

iph is being assigned the same value twice; remove the redundant
first assignment. (Thanks to Nikolay Aleksandrov for pointing out
that the first assignment should be removed and not the second.)

Fixes warning:
net/ipv4/ip_gre.c:265:2: warning: Value stored to 'iph' is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: tipc: constify genl_ops
Arvind Yadav [Wed, 23 Aug 2017 10:52:20 +0000 (16:22 +0530)]
net: tipc: constify genl_ops

genl_ops are not supposed to change at runtime. All functions
working with genl_ops provided by <net/genetlink.h> work with
const genl_ops. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hinic: make functions set_ctrl0 and set_ctrl1 static
Colin Ian King [Wed, 23 Aug 2017 09:59:40 +0000 (10:59 +0100)]
net: hinic: make functions set_ctrl0 and set_ctrl1 static

The functions set_ctrl0 and set_ctrl1 are local to the source and do
not need to be in global scope, so make them static.

Cleans up sparse warnings:
symbol 'set_ctrl0' was not declared. Should it be static?
symbol 'set_ctrl1' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/sock: allow the user to set negative peek offset
Paolo Abeni [Wed, 23 Aug 2017 09:57:51 +0000 (11:57 +0200)]
net/sock: allow the user to set negative peek offset

This is necessary to allow the user to disable peeking with
offset once it's enabled.
Unix sockets already allow the above; with this patch we
permit it for udp[6] sockets, too.

Fixes: 627d2d6b5500 ("udp: enable MSG_PEEK at non-zero offset")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mlxsw-multichain-tc-offload'
David S. Miller [Thu, 24 Aug 2017 03:44:32 +0000 (20:44 -0700)]
Merge branch 'mlxsw-multichain-tc-offload'

Jiri Pirko says:

====================
mlxsw: spectrum: Introduce multichain TC offload

This patchset introduces offloading of rules added to a chain with a
non-zero index, which was previously forbidden. Also, the goto_chain
termination action is offloaded, allowing a jump to processing
of the desired chain.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum_flower: Offload goto_chain termination action
Jiri Pirko [Wed, 23 Aug 2017 08:08:22 +0000 (10:08 +0200)]
mlxsw: spectrum_flower: Offload goto_chain termination action

If action is gact goto_chain, offload it to HW by jumping to another
ruleset.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum_acl: Provide helper to lookup ruleset
Jiri Pirko [Wed, 23 Aug 2017 08:08:21 +0000 (10:08 +0200)]
mlxsw: spectrum_acl: Provide helper to lookup ruleset

We need to look up a ruleset in order to offload the goto_chain
termination action. This patch adds a helper for that.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum_acl: Allow to get group_id value for a ruleset
Jiri Pirko [Wed, 23 Aug 2017 08:08:20 +0000 (10:08 +0200)]
mlxsw: spectrum_acl: Allow to get group_id value for a ruleset

For goto_chain action we need to know group_id of a ruleset to jump to.
Provide infrastructure in order to get it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: add couple of goto_chain helpers
Jiri Pirko [Wed, 23 Aug 2017 08:08:19 +0000 (10:08 +0200)]
net: sched: add couple of goto_chain helpers

Add helpers to find out if a gact instance is goto_chain termination
action and to get chain index.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Offload multichain TC rules
Jiri Pirko [Wed, 23 Aug 2017 08:08:18 +0000 (10:08 +0200)]
mlxsw: spectrum: Offload multichain TC rules

Reflect chain index coming down from TC core and create a ruleset per
chain. Note that only chain 0, being the implicit chain, is bound to the
device for processing. The rest of chains have to be "jumped-to" by
actions.
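
As a rough illustration of the chain model described above, the sketch below walks rules starting from chain 0 and follows goto_chain actions. The `classify` function, the rule representation and the action tuples are hypothetical and are not the mlxsw or TC data structures.

```python
def classify(chains, packet):
    """Process `packet` starting at chain 0, following goto_chain actions.

    `chains` maps a chain index to a list of (match, action) pairs, where
    `match` is a predicate and `action` is a tuple such as ("drop",) or
    ("goto_chain", index). Returns the terminating action or ("miss",).
    """
    chain = 0          # only the implicit chain 0 is bound to the device
    visited = set()    # guard against goto_chain loops in this toy model
    while chain in chains and chain not in visited:
        visited.add(chain)
        for match, action in chains[chain]:
            if not match(packet):
                continue
            if action[0] == "goto_chain":
                chain = action[1]  # jump to another ruleset
                break
            return action
        else:
            return ("miss",)       # no rule in this chain matched
    return ("miss",)
```

In this model, rules in a non-zero chain are reachable only if some rule in an earlier chain jumps to them.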

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mvpp2-software-TSO-support'
David S. Miller [Thu, 24 Aug 2017 03:42:10 +0000 (20:42 -0700)]
Merge branch 'mvpp2-software-TSO-support'

Antoine Tenart says:

====================
net: mvpp2: software TSO support

This series adds the s/w TSO support in the PPv2 driver, in addition to
two cosmetic commits. As stated in patch 3/3:

With iperf on 10G ports, TSO shows a significant performance
improvement, by a factor of 2, reaching around 9.5Gbps in TX, as well as a
significant CPU usage drop (from 25% to 15%).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: software tso support
Antoine Ténart [Wed, 23 Aug 2017 07:46:56 +0000 (09:46 +0200)]
net: mvpp2: software tso support

The patch uses the tso API to implement the tso functionality in Marvell
PPv2 driver.

With iperf on 10G ports, TSO shows a significant performance
improvement, by a factor of 2, reaching around 9.5Gbps in TX, as well as a
significant CPU usage drop (from 25% to 15%).

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: unify the txq size define use
Antoine Ténart [Wed, 23 Aug 2017 07:46:55 +0000 (09:46 +0200)]
net: mvpp2: unify the txq size define use

The txq size is defined by MVPP2_AGGR_TXQ_SIZE, which is sometimes not
used directly but through variables. As it is a fixed value, use the
define everywhere in the driver.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: define the TSO header size in net/tso.h
Antoine Ténart [Wed, 23 Aug 2017 07:46:54 +0000 (09:46 +0200)]
net: define the TSO header size in net/tso.h

The TSO header size was defined in many drivers. Factorize the code and
define its size in net/tso.h.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv4: do metrics match when looking up and deleting a route
Xin Long [Wed, 23 Aug 2017 02:07:26 +0000 (10:07 +0800)]
ipv4: do metrics match when looking up and deleting a route

When ipv4 inserts a route, it compares fib_info entries with a memcmp
of fib_metrics. This means a route is also identified by its metrics.

But when removing a route, it looks the route up without
considering the metrics. As a result, the route with the
matching metrics can't be removed.

Thomas noticed this issue when doing the testing:

1. add:
   # ip route append 192.168.7.0/24 dev v window 1000
   # ip route append 192.168.7.0/24 dev v window 1001
   # ip route append 192.168.7.0/24 dev v window 1002
   # ip route append 192.168.7.0/24 dev v window 1003
2. delete:
   # ip route delete 192.168.7.0/24 dev v window 1002
3. show:
     192.168.7.0/24 proto boot scope link window 1001
     192.168.7.0/24 proto boot scope link window 1002
     192.168.7.0/24 proto boot scope link window 1003

The one with window 1002 wasn't deleted but the first one was.

This patch is to do metrics match when looking up and deleting
one route.
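
The behaviour in the example above can be reproduced with a toy routing table. This is a sketch only: `Route`, `delete_route` and `make_table` are hypothetical names, and the kernel's actual lookup works on fib_info structures rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    prefix: str
    dev: str
    metrics: dict = field(default_factory=dict)


def delete_route(table, prefix, dev, metrics, match_metrics=True):
    """Remove and return the first matching route, or None if none matches."""
    for i, route in enumerate(table):
        if (route.prefix, route.dev) != (prefix, dev):
            continue
        # Old behaviour (match_metrics=False): metrics are ignored on delete.
        if match_metrics and route.metrics != metrics:
            continue
        return table.pop(i)
    return None


def make_table():
    """Four routes that differ only in their window metric."""
    return [Route("192.168.7.0/24", "v", {"window": w})
            for w in (1000, 1001, 1002, 1003)]
```

With `match_metrics=False` the first route (window 1000) is removed, as in the report; with matching enabled the route with window 1002 is removed.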

Reported-by: Thomas Haller <thaller@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'tcp-sw-rx-timestamps'
David S. Miller [Thu, 24 Aug 2017 03:30:48 +0000 (20:30 -0700)]
Merge branch 'tcp-sw-rx-timestamps'

Mike Maloney says:

====================
net: Add software rx timestamp for TCP.

Add software rx timestamps for TCP, and a test to ensure consistency of
behavior between IP, UDP, and TCP implementation.

Changes since v1:
  -Initialize tss->ts[1] to 0 if caller requested any timestamps.
  -Fix test case to validate that tss->ts[1] is zero.
  -Fix tests to actually use a raw socket.
  -Fix --tcp flag to work on the test.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/net: Add a test to validate behavior of rx timestamps
Mike Maloney [Tue, 22 Aug 2017 21:08:49 +0000 (17:08 -0400)]
selftests/net: Add a test to validate behavior of rx timestamps

Validate the behavior of the combination of various timestamp socket
options, and ensure consistency across ip, udp, and tcp.

Signed-off-by: Mike Maloney <maloney@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: Extend SOF_TIMESTAMPING_RX_SOFTWARE to TCP recvmsg
Mike Maloney [Tue, 22 Aug 2017 21:08:48 +0000 (17:08 -0400)]
tcp: Extend SOF_TIMESTAMPING_RX_SOFTWARE to TCP recvmsg

When SOF_TIMESTAMPING_RX_SOFTWARE is enabled for tcp sockets, return the
timestamp corresponding to the highest sequence number data returned.

Previously, skb->tstamp was overwritten when a TCP packet was placed
in the out of order queue.  While the packet is in the ooo queue, save the
timestamp in the TCP_SKB_CB.  This space is shared with the gso_*
options, which are only used on the tx path, and a previously unused 4
byte hole.

When skbs are coalesced either in the sk_receive_queue or the
out_of_order_queue always choose the timestamp of the appended skb to
maintain the invariant of returning the timestamp of the last byte in
the recvmsg buffer.
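
The invariant above can be illustrated with a small model. The `Skb` class and both functions are hypothetical simplifications: real skbs carry far more state, and this `recvmsg` does not consume data from the queue.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    seq: int        # sequence number of the first byte
    data: bytes
    tstamp: float   # software rx timestamp


def coalesce(head: Skb, appended: Skb) -> Skb:
    """Merge two contiguous segments, keeping the appended skb's timestamp."""
    if head.seq + len(head.data) != appended.seq:
        raise ValueError("segments are not contiguous")
    return Skb(head.seq, head.data + appended.data, appended.tstamp)


def recvmsg(queue, nbytes):
    """Return (data, tstamp), where tstamp belongs to the skb that supplied
    the last byte returned."""
    out = b""
    tstamp = None
    for skb in queue:
        if len(out) >= nbytes:
            break
        out += skb.data[: nbytes - len(out)]
        tstamp = skb.tstamp
    return out, tstamp
```

Because the merged skb takes the later timestamp, a read that spans both original segments reports the same timestamp whether or not they were coalesced.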

Signed-off-by: Mike Maloney <maloney@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: change manner of detecting whether or not NIC firmware is loaded
Felix Manlunas [Tue, 22 Aug 2017 19:46:37 +0000 (12:46 -0700)]
liquidio: change manner of detecting whether or not NIC firmware is loaded

In the NIC firmware, the 1-bit flag indicating "firmware is loaded" moved
from SLI_SCRATCH_1 to SLI_SCRATCH_2 (these are Octeon general-purpose
scratch registers).  Make the PF driver conform to this change.

Remove code that sets the "firmware is loaded" flag because it's now the
firmware's job to do that.

In the code that detects whether or not the firmware is loaded, don't just
rely on checking the "firmware is loaded" flag because that may cause a
rare false negative.  Add code that deduces whether or not the firmware is
loaded; that will never give a false negative.

Also bump up driver version to match newer NIC firmware.

Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agogre: fix goto statement typo
William Tu [Wed, 23 Aug 2017 00:04:05 +0000 (17:04 -0700)]
gre: fix goto statement typo

Fix typo: pnet_tap_faied.

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bpf-minor-cleanups'
David S. Miller [Wed, 23 Aug 2017 04:26:30 +0000 (21:26 -0700)]
Merge branch 'bpf-minor-cleanups'

Daniel Borkmann says:

====================
Two minor BPF cleanups

Two minor cleanups on devmap and redirect I still had
in my queue.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: minor cleanups for dev_map
Daniel Borkmann [Tue, 22 Aug 2017 23:47:54 +0000 (01:47 +0200)]
bpf: minor cleanups for dev_map

Some minor code cleanups. While going over the code I also noticed that
we're currently accounting the bitmap for only one CPU, so fix that
up as well.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: misc xdp redirect cleanups
Daniel Borkmann [Tue, 22 Aug 2017 23:47:53 +0000 (01:47 +0200)]
bpf: misc xdp redirect cleanups

A few cleanups, including: bpf_redirect_map() is really XDP only due to
the return code. Move it to a more appropriate location where we do
the XDP redirect handling and change its name to bpf_xdp_redirect_map()
to make it consistent with the bpf_xdp_redirect() helper.

The xdp_do_redirect_map() helper can be static since it is only used from
the filter.c file. Drop the goto in xdp_do_generic_redirect() and only return errors
directly. In xdp_do_flush_map() only clear ri->map_to_flush which is the
arg we're using in that function, ri->map is cleared earlier along with
ri->ifindex.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: fix map value attribute for hash of maps
Daniel Borkmann [Tue, 22 Aug 2017 22:06:09 +0000 (00:06 +0200)]
bpf: fix map value attribute for hash of maps

Currently, iproute2's BPF ELF loader works fine with array of maps
when retrieving the fd from a pinned node and doing a selfcheck
against the provided map attributes from the object file, but we
fail to do the same for hash of maps and thus refuse to get the
map from pinned node.

Reason is that when allocating hash of maps, fd_htab_map_alloc() will
set the value size to sizeof(void *), and any user space map creation
requests are forced to set 4 bytes as value size. Thus, selfcheck
will complain about exposed 8 bytes on 64 bit archs vs. 4 bytes from
object file as value size. Contract is that fdinfo or BPF_MAP_GET_FD_BY_ID
returns the value size used to create the map.

Fix it by handling it the same way as we do for array of maps, which
means that we leave value size at 4 bytes and in the allocation phase
round up value size to 8 bytes. alloc_htab_elem() needs an adjustment
in order to copy rounded up 8 bytes due to bpf_fd_htab_map_update_elem()
calling into htab_map_update_elem() with the pointer of the map
pointer as value. Unlike array of maps where we just xchg(), we're
using the generic htab_map_update_elem() callback also used from helper
calls, which published the key/value already on return, so we need
to ensure to memcpy() the right size.
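
The size handling described above amounts to keeping the user-visible value size at 4 bytes while reserving a rounded-up slot per element. The sketch below assumes a 64-bit arch where sizeof(void *) is 8; `HashOfMapsModel` and its attribute names are hypothetical.

```python
from dataclasses import dataclass

FD_SIZE = 4        # value size that user space passes for a map-in-map
POINTER_SIZE = 8   # sizeof(void *) on a 64-bit arch (assumption)


def round_up(n: int, align: int) -> int:
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


@dataclass
class HashOfMapsModel:
    value_size: int = FD_SIZE  # what fdinfo would report back to user space

    @property
    def elem_value_bytes(self) -> int:
        # Bytes reserved (and copied) per element to hold the inner map pointer.
        return round_up(self.value_size, POINTER_SIZE)
```

In this model the reported value size stays at 4, matching the object file, while each element still has room for an 8-byte pointer.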

Fixes: bcc6b1b7ebf8 ("bpf: Add hash of maps support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMIPS,bpf: fix missing break in switch statement
Colin Ian King [Tue, 22 Aug 2017 22:46:06 +0000 (23:46 +0100)]
MIPS,bpf: fix missing break in switch statement

There is a missing break causing a fall-through and setting
ctx.use_bbit_insns to the wrong value. Fix this by adding the
missing break.

Detected with cppcheck:
"Variable 'ctx.use_bbit_insns' is reassigned a value before the old
one has been used. 'break;' missing?"

Fixes: 8d8d18c3283f ("MIPS,bpf: Fix using smp_processor_id() in preemptible splat.")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: use kvmalloc() for class hash tables
Eric Dumazet [Tue, 22 Aug 2017 19:26:46 +0000 (12:26 -0700)]
net: sched: use kvmalloc() for class hash tables

High order GFP_KERNEL allocations can stress the host badly.

Use modern kvmalloc_array()/kvfree() instead of custom
allocations.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: amd: constify zorro_device_id
Arvind Yadav [Tue, 22 Aug 2017 18:11:12 +0000 (23:41 +0530)]
net: amd: constify zorro_device_id

zorro_device_id are not supposed to change at runtime. All functions
working with zorro_device_id provided by <linux/zorro.h> work with
const zorro_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'net-mvpp2-MAC-GoP-configuration'
David S. Miller [Tue, 22 Aug 2017 21:32:20 +0000 (14:32 -0700)]
Merge branch 'net-mvpp2-MAC-GoP-configuration'

Antoine Tenart says:

====================
net: mvpp2: MAC/GoP configuration

This is based on net-next (e2a7c34fb285).

I removed the GoP interrupt and PHY optional parts in this v2 to ease
the review process as the MAC/GoP initialization seemed to start less
discussions :)

This series now only aims at making the PPv2 driver less dependent on
the firmware/bootloader initialization. Some patches clean up some parts
as well, and add new interface descriptions in the dt.

The current implementation of the PPv2 driver relies on the
firmware/bootloader initialization to configure some parts, as the Group
of Ports (GoP) and the MACs (GMAC and/or XLG MAC --for 10G--).  The
drawback is the kernel must be configured to match exactly what the
bootloader configures which is not convenient and is an issue when using
boards having an Ethernet port and an SFP port wired to the same GoP
port, as no dynamic configuration can be done.

This series adds the GoP and GMAC/XLG MAC initializations so that the
PPV2 does not have to rely on a previous initialization. One part is
still missing from this series, and that would be the 'comphy' which
provides shared serdes PHYs and which must be configured as well for a
full kernel initialization to work. This comphy support will be part of
a following up series. (This series was also tested with this 'comphy'
support, as it's nearly ready).

@Dave: patches 9 and 10 should go through the mvebu tree. Thanks!
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoDocumentation/bindings: net: marvell-pp2: add the system controller
Antoine Ténart [Tue, 22 Aug 2017 17:08:28 +0000 (19:08 +0200)]
Documentation/bindings: net: marvell-pp2: add the system controller

This patch documents the new marvell,system-controller property used by
the Marvell ppv2 network driver.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: initialize the GoP
Antoine Ténart [Tue, 22 Aug 2017 17:08:27 +0000 (19:08 +0200)]
net: mvpp2: initialize the GoP

The patch adds GoP (group of ports) initialization functions. The mvpp2
driver was relying on the firmware/bootloader initialization; this patch
moves this setup to the mvpp2 driver.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: set maximum packet size for 10G ports
Stefan Chulski [Tue, 22 Aug 2017 17:08:26 +0000 (19:08 +0200)]
net: mvpp2: set maximum packet size for 10G ports

Set the maximum packet size for XLG 10G ports. A missing maximum packet
size in the XLG configuration will cause a kernel panic if an oversized
packet is received by the port.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: initialize the XLG MAC when using a port
Antoine Ténart [Tue, 22 Aug 2017 17:08:25 +0000 (19:08 +0200)]
net: mvpp2: initialize the XLG MAC when using a port

This adds a routine to initialize the XLG MAC at the port level when
using a port and the XAUI/10GKR interface mode. This wasn't done until
this commit, and the mvpp2 driver was relying on the bootloader/firmware
initialization. This doesn't mean everything is configured in the mvpp2
driver now, but it helps reduce the gap.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: initialize the GMAC when using a port
Antoine Ténart [Tue, 22 Aug 2017 17:08:24 +0000 (19:08 +0200)]
net: mvpp2: initialize the GMAC when using a port

This adds a routine to initialize the GMAC at the port level when using
a port. This wasn't done until this commit, and the mvpp2 driver was
relying on the bootloader/firmware initialization. This doesn't mean
everything is configured in the mvpp2 driver now, but it helps reduce
the gap.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: move the mii configuration in the ndo_open path
Antoine Ténart [Tue, 22 Aug 2017 17:08:23 +0000 (19:08 +0200)]
net: mvpp2: move the mii configuration in the ndo_open path

This moves the mii configuration in the ndo_open path, to allow handling
different mii configurations later and to switch between these
configurations at runtime.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: fix the synchronization module bypass macro name
Antoine Ténart [Tue, 22 Aug 2017 17:08:22 +0000 (19:08 +0200)]
net: mvpp2: fix the synchronization module bypass macro name

The macro defining the bit to toggle to bypass or not the
synchronization module is wrongly named. Writing 1 will disable bypass.
This patch s/MVPP22_CTRL4_SYNC_BYPASS/MVPP22_CTRL4_SYNC_BYPASS_DIS/.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mvpp2: unify register definitions coding style
Antoine Ténart [Tue, 22 Aug 2017 17:08:21 +0000 (19:08 +0200)]
net: mvpp2: unify register definitions coding style

Cosmetic patch to use the same formatting rules on all register
definitions.

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Tested-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agogre: introduce native tunnel support for ERSPAN
William Tu [Tue, 22 Aug 2017 16:40:28 +0000 (09:40 -0700)]
gre: introduce native tunnel support for ERSPAN

The patch adds ERSPAN type II tunnel support.  The implementation
is based on the draft at [1].  One of the purposes is for a Linux
box to be able to receive ERSPAN monitoring traffic sent from
the Cisco switch, by creating an ERSPAN tunnel device.
In addition, the patch also adds ERSPAN TX, so Linux virtual
switch can redirect monitored traffic to the ERSPAN tunnel device.
The traffic will be encapsulated into ERSPAN and sent out.

The implementation reuses tunnel key as ERSPAN session ID, and
field 'erspan' as ERSPAN Index fields:
./ip link add dev ers11 type erspan seq key 100 erspan 123 \
local 172.16.1.200 remote 172.16.1.100

To use the above device as ERSPAN receiver, configure
Nexus 5000 switch as below:

monitor session 100 type erspan-source
  erspan-id 123
  vrf default
  destination ip 172.16.1.200
  source interface Ethernet1/11 both
  source interface Ethernet1/12 both
  no shut
monitor erspan origin ip-address 172.16.1.100 global

[1] https://tools.ietf.org/html/draft-foschiano-erspan-01
[2] iproute2 patch: http://marc.info/?l=linux-netdev&m=150306086924951&w=2
[3] test script: http://marc.info/?l=linux-netdev&m=150231021807304&w=2

Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: Meenakshi Vohra <mvohra@vmware.com>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoudp: remove unreachable ufo branches
Willem de Bruijn [Tue, 22 Aug 2017 15:39:57 +0000 (11:39 -0400)]
udp: remove unreachable ufo branches

Remove two references to ufo in the udp send path that are no longer
reachable now that ufo has been removed.

Commit 85f1bd9a7b5a ("udp: consistently apply ufo or fragmentation")
is a fix to ufo. It is safe to revert what remains of it.

Also, no skb can enter ip_append_page with skb_is_gso true now that
skb_shinfo(skb)->gso_type is no longer set in ip_append_page/_data.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: mdio-gpio: make mdiobb_ops const
Bhumika Goyal [Tue, 22 Aug 2017 08:13:29 +0000 (13:43 +0530)]
net: mdio-gpio: make mdiobb_ops const

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: freescale: fs_enet: make mdiobb_ops const
Bhumika Goyal [Tue, 22 Aug 2017 08:15:59 +0000 (13:45 +0530)]
net: ethernet: freescale: fs_enet: make mdiobb_ops const

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: ax88796: make mdiobb_ops const
Bhumika Goyal [Tue, 22 Aug 2017 08:11:19 +0000 (13:41 +0530)]
net: ethernet: ax88796: make mdiobb_ops const

Make this const as it is only stored in a const field of a
mdiobb_ctrl structure.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'tcp_conn_request-cleanup'
David S. Miller [Tue, 22 Aug 2017 21:16:13 +0000 (14:16 -0700)]
Merge branch 'tcp_conn_request-cleanup'

Tonghao Zhang says:

====================
tcp: Simplify the tcp_conn_request.

Just simplify the tcp_conn_request function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: Remove the unused parameter for tcp_try_fastopen.
Tonghao Zhang [Tue, 22 Aug 2017 06:33:49 +0000 (23:33 -0700)]
tcp: Remove the unused parameter for tcp_try_fastopen.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: Get a proper dst before checking it.
Tonghao Zhang [Tue, 22 Aug 2017 06:33:48 +0000 (23:33 -0700)]
tcp: Get a proper dst before checking it.

tcp_peer_is_proven needs a proper route to make the
determination, but dst is always NULL. This bug may
have been there since the beginning of the git tree. This
does not look serious enough to deserve backports to stable
versions.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'hv_netvsc-Ethtool-handler-to-change-UDP-hash-levels'
David S. Miller [Tue, 22 Aug 2017 21:08:12 +0000 (14:08 -0700)]
Merge branch 'hv_netvsc-Ethtool-handler-to-change-UDP-hash-levels'

Haiyang Zhang says:

====================
hv_netvsc: Ethtool handler to change UDP hash levels

The patch set adds the functions to switch UDP hash level between
L3 and L4 by ethtool command. UDP over IPv4 and v6 can be set
differently. The default hash level is L4. We currently only
allow switching TX hash level from within the guests.

The ethtool callback function is triggered from the command line, and
updates the per-device variables of the hash level.

On Azure, fragmented UDP packets are not yet supported with L4
hashing, and may have a high packet loss rate. Using L3 hashing is
recommended in this case. This ethtool option allows a user to
make this selection.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agohv_netvsc: Update netvsc Document for UDP hash level setting
Haiyang Zhang [Tue, 22 Aug 2017 02:22:40 +0000 (19:22 -0700)]
hv_netvsc: Update netvsc Document for UDP hash level setting

Update Documentation/networking/netvsc.txt for UDP hash level setting
and related info.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agohv_netvsc: Add ethtool handler to set and get UDP hash levels
Haiyang Zhang [Tue, 22 Aug 2017 02:22:39 +0000 (19:22 -0700)]
hv_netvsc: Add ethtool handler to set and get UDP hash levels

The patch adds the functions to switch UDP hash level between
L3 and L4 by ethtool command. UDP over IPv4 and v6 can be set
differently. The default hash level is L4. We currently only
allow switching TX hash level from within the guests.

On Azure, fragmented UDP packets have a high loss rate with L4
hashing. Using L3 hashing is recommended in this case.

For example, for UDP over IPv4 on eth0:
To include UDP port numbers in hashing:
ethtool -N eth0 rx-flow-hash udp4 sdfn
To exclude UDP port numbers in hashing:
ethtool -N eth0 rx-flow-hash udp4 sd
To show UDP hash level:
ethtool -n eth0 rx-flow-hash udp4
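
One plausible reason L4 hashing interacts badly with fragmentation,
offered here as an assumption rather than something stated in this patch:
only the first IP fragment carries the UDP header, so later fragments
cannot contribute port numbers to the hash. The sketch below uses a toy
hash (FNV-style mixing over 32-bit words, not the hash used by the driver
or the host) and a hypothetical struct udp_flow.

```c
#include <stdint.h>

enum hash_level { HASH_L3, HASH_L4 };

/* Hypothetical flow description, for illustration only. */
struct udp_flow {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	int ports_visible;	/* 0 for non-first fragments: no UDP header present */
};

/* Toy mixing step; stands in for whatever hash the real data path uses. */
static uint32_t mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 16777619u;
	return h;
}

/* L3 hashes addresses only; L4 also folds in the ports when they are visible. */
static uint32_t toy_flow_hash(const struct udp_flow *f, enum hash_level level)
{
	uint32_t h = 2166136261u;

	h = mix(h, f->saddr);
	h = mix(h, f->daddr);
	if (level == HASH_L4 && f->ports_visible)
		h = mix(h, ((uint32_t)f->sport << 16) | f->dport);
	return h;
}
```

With L3 hashing, every fragment of a datagram yields the same value; with
L4 hashing in this sketch, the first fragment and the later ones can
diverge.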

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agohv_netvsc: Clean up unused parameter from netvsc_get_rss_hash_opts()
Haiyang Zhang [Tue, 22 Aug 2017 02:22:38 +0000 (19:22 -0700)]
hv_netvsc: Clean up unused parameter from netvsc_get_rss_hash_opts()

The parameter "nvdev" is not in use.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agohv_netvsc: Clean up unused parameter from netvsc_get_hash()
Haiyang Zhang [Tue, 22 Aug 2017 02:22:37 +0000 (19:22 -0700)]
hv_netvsc: Clean up unused parameter from netvsc_get_hash()

The parameter "sk" is not in use.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'liquidio-VF-driver-will-notify-NIC-firmware-of-MTU-change'
David S. Miller [Tue, 22 Aug 2017 18:08:16 +0000 (11:08 -0700)]
Merge branch 'liquidio-VF-driver-will-notify-NIC-firmware-of-MTU-change'

Veerasenareddy Burru says:

====================
liquidio: VF driver will notify NIC firmware of MTU change

Make VF driver notify NIC firmware of MTU change.  Firmware needs this
information for MTU propagation and enforcement.

The first patch in this series moves a macro definition to a proper place
to prevent a build error in the second patch which has the code that sends
the notification.

Change Log:
  V1 -> V2
    * Add "From:" line to patch #1 and #2 to give credit to the author.
    * In patch #2, order local variable declarations from longest to
      shortest line.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: make VF driver notify NIC firmware of MTU change
Veerasenareddy Burru [Mon, 21 Aug 2017 19:35:59 +0000 (12:35 -0700)]
liquidio: make VF driver notify NIC firmware of MTU change

Signed-off-by: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: move macro definition to a proper place
Veerasenareddy Burru [Mon, 21 Aug 2017 19:35:56 +0000 (12:35 -0700)]
liquidio: move macro definition to a proper place

The macro LIO_CMD_WAIT_TM is not specific to the PF driver; it can be used
by the VF driver too, so move its definition from a PF-specific header file
to one that's common to PF and VF.

Signed-off-by: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoptp: make ptp_clock_info const
Bhumika Goyal [Mon, 21 Aug 2017 17:31:12 +0000 (23:01 +0530)]
ptp: make ptp_clock_info const

Make these const as they are only used in a copy operation.
Done using Coccinelle.

@match disable optional_qualifier@
identifier s;
@@
static struct ptp_clock_info s = {...};

@ref@
position p;
identifier match.s;
@@
s@p

@good1@
position ref.p;
identifier match.s,f,c;
expression e;
@@
(
e = s@p
|
e = s@p.f
|
c(...,s@p.f,...)
|
c(...,s@p,...)
)

@bad depends on  !good1@
position ref.p;
identifier match.s;
@@
s@p

@depends on forall !bad disable optional_qualifier@
identifier match.s;
@@
static
+ const
struct ptp_clock_info s;
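
For readers unfamiliar with the pattern the semantic patch matches, a
small C sketch with hypothetical names (demo_clock_info, demo_priv) in
place of ptp_clock_info and a driver's private data: the static structure
is only read as the source of a copy, so it can be const.

```c
/* Hypothetical stand-in for struct ptp_clock_info. */
struct demo_clock_info {
	const char *name;
	int max_adj;
};

/* Hypothetical per-device state that holds its own writable copy. */
struct demo_priv {
	struct demo_clock_info caps;
};

/* Only used as the source of a struct copy, hence const. */
static const struct demo_clock_info demo_caps_template = {
	.name = "demo clock",
	.max_adj = 1000,
};

static void demo_init(struct demo_priv *priv)
{
	priv->caps = demo_caps_template;	/* copy; the template is never modified */
}
```

Later changes go to priv->caps, leaving the template untouched.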

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: make ptp_clock_info const
Bhumika Goyal [Mon, 21 Aug 2017 17:06:50 +0000 (22:36 +0530)]
net: ethernet: make ptp_clock_info const

Make these const as they are only used in a copy operation.
Done using Coccinelle.

@match disable optional_qualifier@
identifier s;
@@
static struct ptp_clock_info s = {...};

@ref@
position p;
identifier match.s;
@@
s@p

@good1@
position ref.p;
identifier match.s,f,c;
expression e;
@@
(
e = s@p
|
e = s@p.f
|
c(...,s@p.f,...)
|
c(...,s@p,...)
)

@bad depends on  !good1@
position ref.p;
identifier match.s;
@@
s@p

@depends on forall !bad disable optional_qualifier@
identifier match.s;
@@
static
+ const
struct ptp_clock_info s;

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: hns3: Add support to change MTU in HNS3 hardware
Salil [Mon, 21 Aug 2017 16:05:24 +0000 (17:05 +0100)]
net: hns3: Add support to change MTU in HNS3 hardware

This patch adds the following support to the HNS3 driver:
1. Support to change the Maximum Transmission Unit of a
   port in the HNS NIC hardware.
2. Initializes the supported MTU range for the netdevice.

Signed-off-by: lipeng <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'Huawei-HiNIC-Ethernet-Driver'
David S. Miller [Tue, 22 Aug 2017 17:48:54 +0000 (10:48 -0700)]
Merge branch 'Huawei-HiNIC-Ethernet-Driver'

Aviad Krawczyk says:

====================
Huawei HiNIC Ethernet Driver

This patch set adds the HiNIC Ethernet driver for the
hinic family of PCIE Network Interface Cards.

Huawei's PCIE HiNIC card is a new Ethernet card, and hence a new driver
was needed.

The current driver is meant to be used for the Physical Function; support
for the Virtual Function and more features will follow once the
basic PF driver has been accepted.

Changes V7 -> V8:
1. Remove unnecessary cast from void * - Stephen Hemminger comment
https://lkml.org/lkml/2017/8/17/1008

Changes V6 -> V7:
1. Separate netpoll and MAINTAINERS patch - Sergei Shtylyov comment
https://lkml.org/lkml/2017/8/17/479

Changes V5 -> V6:
1. Fix cover letter Message-Id

Changes V4 -> V5:
1. Remove select_queue NOP - David Miller comment
        https://lkml.org/lkml/2017/8/16/625

Changes V3 -> V4:
1. Reverse christmas tree order - David Miller comment
        https://lkml.org/lkml/2017/8/3/862

Changes V2 -> V3:
1. Replace dev_ functions by netif_ functions - Joe Perches comment
        https://lkml.org/lkml/2017/7/19/424
2. Fix the driver directory in MAINTAINERS file - Sergei Shtylyov comment
        https://lkml.org/lkml/2017/7/19/615
3. Add a newline at the end of Makefile - David Miller comment
        https://lkml.org/lkml/2017/7/19/1345
4. Return a pointer as a val instead of in arg - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319
5. Change the error labels to err_xyz - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319
6. Remove check of Func Type in free function - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319
7. Remove !netdev check in remove function - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319
8. Use module_pci_driver - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319
9. Move the PCI device ID to the .c file - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319
10. Remove void * to avoid passing wrong ptr - Francois Romieu comment
        https://lkml.org/lkml/2017/7/19/1319

Changes V1 -> V2:
1. remove driver version - Andrew Lunn comment
        https://lkml.org/lkml/2017/7/12/372
2. replace kzalloc by devm_kzalloc for short clean - Andrew Lunn comment
        https://lkml.org/lkml/2017/7/12/372
3. replace pr_ functions by dev_ functions - Andrew Lunn comment
        https://lkml.org/lkml/2017/7/12/375
4. separate last patch by moving ops to a new patch - Andrew Lunn comment
        https://lkml.org/lkml/2017/7/12/377
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add Maintainer
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:08 +0000 (23:56 +0800)]
net-next/hinic: Add Maintainer

Update MAINTAINERS file

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add netpoll
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:07 +0000 (23:56 +0800)]
net-next/hinic: Add netpoll

Add more netdev operation - netpoll.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add ethtool and stats
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:06 +0000 (23:56 +0800)]
net-next/hinic: Add ethtool and stats

Add ethtool operations and statistics operations.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add Tx operation
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:05 +0000 (23:56 +0800)]
net-next/hinic: Add Tx operation

Add transmit operation for sending data by qp operations.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add Rx handler
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:04 +0000 (23:56 +0800)]
net-next/hinic: Add Rx handler

Set the io resources in the nic and handle rx events by qp operations.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add cmdq completion handler
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:03 +0000 (23:56 +0800)]
net-next/hinic: Add cmdq completion handler

Add cmdq completion handler for getting a notification about the
completion of cmdq commands.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add cmdq commands
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:02 +0000 (23:56 +0800)]
net-next/hinic: Add cmdq commands

Add cmdq commands for setting queue pair contexts in the nic.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add ceqs
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:01 +0000 (23:56 +0800)]
net-next/hinic: Add ceqs

Initialize the completion event queues and handle ceq events by calling
the registered handlers. Used for cmdq command completion.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Initialize cmdq
Aviad Krawczyk [Mon, 21 Aug 2017 15:56:00 +0000 (23:56 +0800)]
net-next/hinic: Initialize cmdq

Create the work queues for cmdq and update the nic about the work queue
contexts. cmdq commands are used for updating the nic about the qp
contexts.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Set qp context
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:59 +0000 (23:55 +0800)]
net-next/hinic: Set qp context

Update the nic about the resources of the queue pairs.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add qp resources
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:58 +0000 (23:55 +0800)]
net-next/hinic: Add qp resources

Create the resources for queue pair operations: doorbell area,
consumer index address and producer index address.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add wq
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:57 +0000 (23:55 +0800)]
net-next/hinic: Add wq

Create work queues to be used by the queue pairs.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add logical Txq and Rxq
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:56 +0000 (23:55 +0800)]
net-next/hinic: Add logical Txq and Rxq

Create the logical queues of the nic.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add Rx mode and link event handler
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:55 +0000 (23:55 +0800)]
net-next/hinic: Add Rx mode and link event handler

Add port management message for setting Rx mode in the card,
used for rx_mode netdev operation.
The link event handler is used for getting a notification about the
link state.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add port management commands
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:54 +0000 (23:55 +0800)]
net-next/hinic: Add port management commands

Add the port management commands that are sent as management messages.
The port management commands are used for netdev operations.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add aeqs
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:53 +0000 (23:55 +0800)]
net-next/hinic: Add aeqs

Handle aeq elements that are accumulated on the aeq by calling the
registered handler for the specific event.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add api cmd commands
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:52 +0000 (23:55 +0800)]
net-next/hinic: Add api cmd commands

Add the api cmd commands for sending management messages to the nic.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Add management messages
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:51 +0000 (23:55 +0800)]
net-next/hinic: Add management messages

Add the management messages for sending to api cmd and the asynchronous
event handler for the completion of the messages.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Initialize api cmd hw
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:50 +0000 (23:55 +0800)]
net-next/hinic: Initialize api cmd hw

Update the hardware about api cmd resources and initialize it.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Initialize api cmd resources
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:49 +0000 (23:55 +0800)]
net-next/hinic: Initialize api cmd resources

Initialize api cmd resources as part of management initialization.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Initialize hw device components
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:48 +0000 (23:55 +0800)]
net-next/hinic: Initialize hw device components

Initialize hw device by calling the initialization functions of aeqs and
management channel.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet-next/hinic: Initialize hw interface
Aviad Krawczyk [Mon, 21 Aug 2017 15:55:47 +0000 (23:55 +0800)]
net-next/hinic: Initialize hw interface

Initialize hw interface as part of the nic initialization for accessing hw.

Signed-off-by: Aviad Krawczyk <aviad.krawczyk@huawei.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: dwmac-rk: Add rv1108 gmac support
David Wu [Mon, 21 Aug 2017 10:12:55 +0000 (18:12 +0800)]
net: ethernet: stmmac: dwmac-rk: Add rv1108 gmac support

It only supports rmii interface. Add constants and callback functions
for the dwmac on rv1108 socs. As can be seen, the base structure is
the same, only registers and the bits in them moved slightly.

Signed-off-by: David Wu <david.wu@rock-chips.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoethernet: xircom: small clean up in setup_xirc2ps_cs()
Dan Carpenter [Mon, 21 Aug 2017 09:47:30 +0000 (12:47 +0300)]
ethernet: xircom: small clean up in setup_xirc2ps_cs()

The get_options() function takes the whole ARRAY_SIZE().  It doesn't
matter here because we don't use more than 7 elements.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoarm: eBPF JIT compiler
Shubham Bansal [Tue, 22 Aug 2017 06:32:33 +0000 (12:02 +0530)]
arm: eBPF JIT compiler

The JIT compiler emits ARM 32 bit instructions. Currently, it supports
eBPF only. Classic BPF is supported because of the conversion by BPF core.

This patch essentially changes the current implementation of the JIT compiler
for the Berkeley Packet Filter from classic to internal, with almost all
instructions from the eBPF ISA supported except the following:
BPF_ALU64 | BPF_DIV | BPF_K
BPF_ALU64 | BPF_DIV | BPF_X
BPF_ALU64 | BPF_MOD | BPF_K
BPF_ALU64 | BPF_MOD | BPF_X
BPF_STX | BPF_XADD | BPF_W
BPF_STX | BPF_XADD | BPF_DW

The implementation uses scratch space to emulate the 64 bit eBPF ISA on 32 bit
ARM because of the shortage of general purpose registers on ARM. Currently,
only LITTLE ENDIAN machines are supported in this eBPF JIT Compiler.
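
As a rough illustration of what emulating a 64 bit ISA on a 32 bit
machine involves, here is a C sketch of a 64 bit add built from two 32 bit
halves with an explicit carry. The struct reg64 type and add64() helper
are hypothetical; the JIT itself emits ARM instructions and is not written
this way.

```c
#include <stdint.h>

/* Hypothetical view of one 64 bit eBPF register as two 32 bit halves. */
struct reg64 {
	uint32_t lo;
	uint32_t hi;
};

/* 64 bit add from 32 bit operations: add the low words, then propagate the carry. */
static struct reg64 add64(struct reg64 a, struct reg64 b)
{
	struct reg64 r;

	r.lo = a.lo + b.lo;
	r.hi = a.hi + b.hi + (r.lo < a.lo);	/* unsigned wraparound signals a carry */
	return r;
}
```

Each eBPF register therefore needs two 32 bit slots, which is why a
register-starved target ends up spilling some of them to scratch memory.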

Tested on ARMv7 with QEMU by me (Shubham Bansal).

Testing results on ARMv7:

1) test_bpf: Summary: 341 PASSED, 0 FAILED, [312/333 JIT'ed]
2) test_tag: OK (40945 tests)
3) test_progs: Summary: 30 PASSED, 0 FAILED
4) test_lpm: OK
5) test_lru_map: OK

The above tests were all run with each of the following flag combinations enabled separately.

1) bpf_jit_enable=1
a) CONFIG_FRAME_POINTER enabled
b) CONFIG_FRAME_POINTER disabled
2) bpf_jit_enable=1 and bpf_jit_harden=2
a) CONFIG_FRAME_POINTER enabled
b) CONFIG_FRAME_POINTER disabled

See Documentation/networking/filter.txt for more information.

Signed-off-by: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Tue, 22 Aug 2017 00:06:42 +0000 (17:06 -0700)]
Merge git://git./linux/kernel/git/davem/net

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds [Mon, 21 Aug 2017 21:07:48 +0000 (14:07 -0700)]
Merge git://git./linux/kernel/git/davem/sparc

Pull sparc fixes from David Miller:
 "Just a couple small fixes, two of which have to do with gcc-7:

   1) Don't clobber kernel fixed registers in __multi4 libgcc helper.

   2) Fix a new uninitialized variable warning on sparc32 with gcc-7,
      from Thomas Petazzoni.

   3) Adjust pmd_t initializer on sparc32 to make gcc happy.

   4) If ATU isn't available, don't bark in the logs. From Tushar Dave"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
  sparc64: remove unnecessary log message
  sparc64: Don't clibber fixed registers in __multi4.
  mm: add pmd_t initializer __pmd() to work around a GCC bug.

7 years agosparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()
Thomas Petazzoni [Sun, 13 Aug 2017 21:14:58 +0000 (23:14 +0200)]
sparc: kernel/pcic: silence gcc 7.x warning in pcibios_fixup_bus()

When building the kernel for Sparc using gcc 7.x, the build fails
with:

arch/sparc/kernel/pcic.c: In function ‘pcibios_fixup_bus’:
arch/sparc/kernel/pcic.c:647:8: error: ‘cmd’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
    cmd |= PCI_COMMAND_IO;
        ^~

The simplified code looks like this:

unsigned int cmd;
[...]
pcic_read_config(dev->bus, dev->devfn, PCI_COMMAND, 2, &cmd);
[...]
cmd |= PCI_COMMAND_IO;

I.e., the code assumes that pcic_read_config() will always initialize
cmd. But that is not the case. Looking at pcic_read_config(), if
bus->number is != 0 or if the size is not one of 1, 2 or 4, *val will
not be initialized.

As a simple fix, we initialize cmd to zero at the beginning of
pcibios_fixup_bus.
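
The situation can be reproduced outside the kernel with a small sketch.
demo_read_config() and DEMO_COMMAND_IO are hypothetical stand-ins for
pcic_read_config() and PCI_COMMAND_IO; the returned values are made up.

```c
#define DEMO_COMMAND_IO 0x1u

/* Writes *val only for bus 0 and sizes 1, 2 or 4; otherwise leaves it untouched. */
static int demo_read_config(int bus, int size, unsigned int *val)
{
	if (bus != 0)
		return -1;

	switch (size) {
	case 1:
		*val = 0x12;
		return 0;
	case 2:
		*val = 0x1234;
		return 0;
	case 4:
		*val = 0x12345678;
		return 0;
	}
	return -1;
}

static unsigned int demo_fixup(int bus)
{
	unsigned int cmd = 0;	/* the fix: a defined value even if the read bails out */

	demo_read_config(bus, 2, &cmd);
	cmd |= DEMO_COMMAND_IO;
	return cmd;
}
```

Without the initializer, the bus != 0 path would OR a flag into an
indeterminate value, which is what gcc 7.x warns about.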

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: Add the invalid handle check in qdisc_class_find
Gao Feng [Fri, 18 Aug 2017 07:23:24 +0000 (15:23 +0800)]
net: sched: Add the invalid handle check in qdisc_class_find

Add a check for the invalid handle "0" to avoid an unnecessary search,
because the qdisc uses skb->priority as the handle value to look up, and
it is usually "0".
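
A minimal sketch of the idea, with a hypothetical hash table
(demo_class_hash) standing in for the qdisc class hash; the bucket count
and modulo hashing are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUCKETS 16

/* Hypothetical class entry chained within a bucket. */
struct demo_class {
	uint32_t classid;
	struct demo_class *next;
};

struct demo_class_hash {
	struct demo_class *bucket[DEMO_BUCKETS];
};

static struct demo_class *demo_class_find(const struct demo_class_hash *h,
					  uint32_t id)
{
	struct demo_class *cl;

	if (id == 0)	/* invalid handle: skip the bucket walk entirely */
		return NULL;

	for (cl = h->bucket[id % DEMO_BUCKETS]; cl; cl = cl->next)
		if (cl->classid == id)
			return cl;
	return NULL;
}
```

Since a zero handle can never match a class, the early return changes no
result and only saves the walk in the common case.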

Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotipc: don't reset stale broadcast send link
Jon Paul Maloy [Mon, 21 Aug 2017 15:59:30 +0000 (17:59 +0200)]
tipc: don't reset stale broadcast send link

When the broadcast send link after 100 attempts has failed to
transfer a packet to all peers, we consider it stale, and reset
it. Thereafter it needs to re-synchronize with the peers, something
currently done by just resetting and re-establishing all links to
all peers. This has turned out to be overkill, with potentially
unwanted consequences for the remaining cluster.

A closer analysis reveals that this can be done much more simply. When
this kind of failure happens, for reasons that may lie outside the
TIPC protocol, it is typically only one peer which is failing to
receive and acknowledge packets. It is hence sufficient to identify
and reset the links only to that peer to resolve the situation, without
having to reset the broadcast link at all. This solution entails a much
lower risk of negative consequences for the node itself as well as for
the overall cluster.

We implement this change in this commit.

Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupt...
Linus Torvalds [Mon, 21 Aug 2017 20:30:36 +0000 (13:30 -0700)]
Merge tag 'arc-4.13-rc7-fixes' of git://git./linux/kernel/git/vgupta/arc

Pull ARC fixes from Vineet Gupta:

 - PAE40 related updates

 - SLC errata for region ops

 - intc line masking by default

* tag 'arc-4.13-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  arc: Mask individual IRQ lines during core INTC init
  ARCv2: PAE40: set MSB even if !CONFIG_ARC_HAS_PAE40 but PAE exists in SoC
  ARCv2: PAE40: Explicitly set MSB counterpart of SLC region ops addresses
  ARC: dma: implement dma_unmap_page and sg variant
  ARCv2: SLC: Make sure busy bit is set properly for region ops
  ARC: [plat-sim] Include this platform unconditionally
  ARC: [plat-axs10x]: prepare dts files for enabling PAE40 on axs103
  ARC: defconfig: Cleanup from old Kconfig options

7 years agoMerge tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni...
Linus Torvalds [Mon, 21 Aug 2017 20:27:51 +0000 (13:27 -0700)]
Merge tag 'rtc-4.13-fixes' of git://git./linux/kernel/git/abelloni/linux

Pull RTC fix from Alexandre Belloni:
 "Fix regmap configuration for ds1307"

* tag 'rtc-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: ds1307: fix regmap config

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Mon, 21 Aug 2017 20:16:27 +0000 (13:16 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix IGMP handling wrt VRF, from David Ahern.

 2) Fix timer access to freed object in dccp, from Eric Dumazet.

 3) Use kmalloc_array() in ptr_ring to avoid overflow cases which are
    triggerable by userspace. Also from Eric Dumazet.

 4) Fix infinite loop in unmapping cleanup of nfp driver, from Colin Ian
    King.

 5) Correct datagram peek handling of empty SKBs, from Matthew Dawson.

 6) Fix use after free in TIPC, from Eric Dumazet.

 7) When replacing a route in ipv6 we need to reset the round robin
    pointer, from Wei Wang.

 8) Fix bug in pci_find_pcie_root_port() which was unearthed by the
    relaxed ordering changes, from Thierry Reding. I made sure to get
    an explicit ACK from Bjorn this time around :-)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
  ipv6: repair fib6 tree in failure case
  net_sched: fix order of queue length updates in qdisc_replace()
  tools lib bpf: improve warning
  switchdev: documentation: minor typo fixes
  bpf, doc: also add s390x as arch to sysctl description
  net: sched: fix NULL pointer dereference when action calls some targets
  rxrpc: Fix oops when discarding a preallocated service call
  irda: do not leak initialized list.dev to userspace
  net/mlx4_core: Enable 4K UAR if SRIOV module parameter is not enabled
  PCI: Allow PCI express root ports to find themselves
  tcp: when rearming RTO, if RTO time is in past then fire RTO ASAP
  net: check and errout if res->fi is NULL when RTM_F_FIB_MATCH is set
  ipv6: reset fn->rr_ptr when replacing route
  sctp: fully initialize the IPv6 address in sctp_v6_to_addr()
  tipc: fix use-after-free
  tun: handle register_netdevice() failures properly
  datagram: When peeking datagrams with offset < 0 don't skip empty skbs
  bpf, doc: improve sysctl knob description
  netxen: fix incorrect loop counter decrement
  nfp: fix infinite loop on umapping cleanup
  ...

7 years agopids: make task_tgid_nr_ns() safe
Oleg Nesterov [Mon, 21 Aug 2017 15:35:02 +0000 (17:35 +0200)]
pids: make task_tgid_nr_ns() safe

This was reported many times, and this was even mentioned in commit
52ee2dfdd4f5 ("pids: refactor vnr/nr_ns helpers to make them safe") but
somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is
not safe because task->group_leader points to nowhere after the exiting
task passes exit_notify(), rcu_read_lock() can not help.

We really need to change __unhash_process() to nullify group_leader,
parent, and real_parent, but this needs some cleanups.  Until then we
can turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and
fix the problem.

Reported-by: Troy Kensinger <tkensinger@google.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agonet: check type when freeing metadata dst
David Lamparter [Fri, 18 Aug 2017 12:31:35 +0000 (14:31 +0200)]
net: check type when freeing metadata dst

Commit 3fcece12bc1b ("net: store port/representator id in metadata_dst")
added a new type field to metadata_dst, but metadata_dst_free() wasn't
updated to check it before freeing the METADATA_IP_TUNNEL specific dst
cache entry.

This is not currently causing problems since it's far enough back in the
struct to be zeroed for the only other type currently in existence
(METADATA_HW_PORT_MUX), but nevertheless it's not correct.
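
The kind of check being added can be sketched as follows. The types here
(demo_metadata, demo_md_type) are hypothetical, and the union layout is an
assumption chosen to make the hazard obvious; the real metadata_dst layout
differs, as noted above.

```c
#include <stdlib.h>

enum demo_md_type { DEMO_MD_IP_TUNNEL, DEMO_MD_HW_PORT_MUX };

/* Hypothetical tagged object; only the tunnel variant owns a cache. */
struct demo_metadata {
	enum demo_md_type type;
	union {
		struct { void *dst_cache; } tun;
		struct { unsigned int port_id; } port;
	} u;
};

/* Returns 1 if type-specific state was released, 0 otherwise. */
static int demo_metadata_free(struct demo_metadata *md)
{
	int freed_cache = 0;

	if (md->type == DEMO_MD_IP_TUNNEL) {	/* touch tunnel state only for tunnels */
		free(md->u.tun.dst_cache);
		freed_cache = 1;
	}
	free(md);
	return freed_cache;
}
```

Without the type test, the port variant's fields would be interpreted as
tunnel state during teardown.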

Fixes: 3fcece12bc1b ("net: store port/representator id in metadata_dst")
Signed-off-by: David Lamparter <equinox@diac24.net>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Sridhar Samudrala <sridhar.samudrala@intel.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ipv6: put host and anycast routes on device with address
David Ahern [Thu, 17 Aug 2017 19:17:20 +0000 (12:17 -0700)]
net: ipv6: put host and anycast routes on device with address

One nagging difference between ipv4 and ipv6 is host routes for ipv6
addresses are installed using the loopback device or VRF / L3 Master
device. e.g.,

    2001:db8:1::/120 dev veth0 proto kernel metric 256 pref medium
    local 2001:db8:1::1 dev lo table local proto kernel metric 0 pref medium

Using the loopback device is convenient -- necessary for local tx, but
has some nasty side effects, most notably setting the 'lo' device down
causes all host routes for all local IPv6 addresses to be removed from the
FIB and completely breaks IPv6 networking across all interfaces.

This patch puts FIB entries for IPv6 routes against the device. This
simplifies the routes in the FIB, for example by making dst->dev and
rt6i_idev->dev the same (a future patch can look at removing the device
reference taken for rt6i_idev for FIB entries).

When copies are made on FIB lookups, the cloned route has dst->dev
set to loopback (or the L3 master device). This is needed for the
local Tx of packets to local addresses.

With fib entries allocated against the real network device, the addrconf
code that reinserts host routes on admin up of 'lo' is no longer needed.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodsa: remove unused net_device arg from handlers
Florian Westphal [Thu, 17 Aug 2017 14:47:00 +0000 (16:47 +0200)]
dsa: remove unused net_device arg from handlers

Compile tested only, but saw no warnings/errors with an
allmodconfig build.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>