openwrt/staging/blogic.git
5 years agoselftests: bpf: tests.h should depend on .c files, not the output
Stanislav Fomichev [Tue, 2 Apr 2019 17:08:32 +0000 (10:08 -0700)]
selftests: bpf: tests.h should depend on .c files, not the output

This makes sure we don't put headers as input files when doing
compilation, because clang complains about the following:

clang-9: error: cannot specify -o when generating multiple output files
../lib.mk:152: recipe for target 'xxx/tools/testing/selftests/bpf/test_verifier' failed
make: *** [xxx/tools/testing/selftests/bpf/test_verifier] Error 1
make: *** Waiting for unfinished jobs....
clang-9: error: cannot specify -o when generating multiple output files
../lib.mk:152: recipe for target 'xxx/tools/testing/selftests/bpf/test_progs' failed

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agobpf: add bpffs multi-dimensional array tests in test_btf
Yonghong Song [Fri, 29 Mar 2019 05:30:53 +0000 (22:30 -0700)]
bpf: add bpffs multi-dimensional array tests in test_btf

For multiple dimensional arrays like below,
  int a[2][3]
both llvm and pahole generated one BTF_KIND_ARRAY type like
  . element_type: int
  . index_type: unsigned int
  . number of elements: 6

Such a collapsed BTF_KIND_ARRAY type will cause the divergence
in BTF vs. the user code. In the compile-once-run-everywhere
project, the header file is generated from BTF and used for bpf
program, and the definition in the header file will be different
from what user expects.

But the kernel actually supports chained multi-dimensional array
types properly. The above "int a[2][3]" can be represented as
  Type #n:
    . element_type: int
    . index_type: unsigned int
    . number of elements: 3
  Type #(n+1):
    . element_type: type #n
    . index_type: unsigned int
    . number of elements: 2

The following llvm commit
  https://reviews.llvm.org/rL357215
also enables llvm to generated proper chained multi-dimensional arrays.

The test_btf already has a raw test ("struct test #1") for chained
multi-dimensional arrays. This patch added amended bpffs test for
chained multi-dimensional arrays.

Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoMerge branch 'variable-stack-access'
Alexei Starovoitov [Fri, 29 Mar 2019 19:05:36 +0000 (12:05 -0700)]
Merge branch 'variable-stack-access'

Andrey Ignatov says:

====================
The patch set adds support for stack access with variable offset from helpers.

Patch 1 is the main patch in the set and provides more details.
Patch 2 adds selftests for new functionality.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: Test variable offset stack access
Andrey Ignatov [Fri, 29 Mar 2019 01:01:58 +0000 (18:01 -0700)]
selftests/bpf: Test variable offset stack access

Test different scenarios of indirect variable-offset stack access: out of
bound access (>0), min_off below initialized part of the stack,
max_off+size above initialized part of the stack, initialized stack.

Example of output:
  ...
  #856/p indirect variable-offset stack access, out of bound OK
  #857/p indirect variable-offset stack access, max_off+size > max_initialized OK
  #858/p indirect variable-offset stack access, min_off < min_initialized OK
  #859/p indirect variable-offset stack access, ok OK
  ...

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: Support variable offset stack access from helpers
Andrey Ignatov [Fri, 29 Mar 2019 01:01:57 +0000 (18:01 -0700)]
bpf: Support variable offset stack access from helpers

Currently there is a difference in how verifier checks memory access for
helper arguments for PTR_TO_MAP_VALUE and PTR_TO_STACK with regard to
variable part of offset.

check_map_access, that is used for PTR_TO_MAP_VALUE, can handle variable
offsets just fine, so that BPF program can call a helper like this:

  some_helper(map_value_ptr + off, size);

, where offset is unknown at load time, but is checked by program to be
in a safe rage (off >= 0 && off + size < map_value_size).

But it's not the case for check_stack_boundary, that is used for
PTR_TO_STACK, and same code with pointer to stack is rejected by
verifier:

  some_helper(stack_value_ptr + off, size);

For example:
  0: (7a) *(u64 *)(r10 -16) = 0
  1: (7a) *(u64 *)(r10 -8) = 0
  2: (61) r2 = *(u32 *)(r1 +0)
  3: (57) r2 &= 4
  4: (17) r2 -= 16
  5: (0f) r2 += r10
  6: (18) r1 = 0xffff888111343a80
  8: (85) call bpf_map_lookup_elem#1
  invalid variable stack read R2 var_off=(0xfffffffffffffff0; 0x4)

Add support for variable offset access to check_stack_boundary so that
if offset is checked by program to be in a safe range it's accepted by
verifier.

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agotools/bpf: generate pkg-config file for libbpf
Luca Boccassi [Thu, 28 Mar 2019 11:33:53 +0000 (11:33 +0000)]
tools/bpf: generate pkg-config file for libbpf

Generate a libbpf.pc file at build time so that users can rely
on pkg-config to find the library, its CFLAGS and LDFLAGS.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Thu, 28 Mar 2019 00:37:58 +0000 (17:37 -0700)]
Merge git://git./linux/kernel/git/davem/net

5 years agoinet: switch IP ID generator to siphash
Eric Dumazet [Wed, 27 Mar 2019 19:40:33 +0000 (12:40 -0700)]
inet: switch IP ID generator to siphash

According to Amit Klein and Benny Pinkas, IP ID generation is too weak
and might be used by attackers.

Even with recent net_hash_mix() fix (netns: provide pure entropy for net_hash_mix())
having 64bit key and Jenkins hash is risky.

It is time to switch to siphash and its 128bit keys.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Amit Klein <aksecurity@gmail.com>
Reported-by: Benny Pinkas <benny@pinkas.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: mdio-bcm-unimac: remove redundant !timeout check
Colin Ian King [Wed, 27 Mar 2019 16:15:20 +0000 (16:15 +0000)]
net: phy: mdio-bcm-unimac: remove redundant !timeout check

The check for zero timeout is always true at the end of the proceeding
while loop; the only other exit path in the loop is if the unimac MDIO
is not busy.  Remove the redundant zero timeout check and always
return -ETIMEDOUT on this timeout return path.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: fix zerocopy and notsent_lowat issues
Eric Dumazet [Tue, 26 Mar 2019 15:34:55 +0000 (08:34 -0700)]
tcp: fix zerocopy and notsent_lowat issues

My recent patch had at least three problems :

1) TX zerocopy wants notification when skb is acknowledged,
   thus we need to call skb_zcopy_clear() if the skb is
   cached into sk->sk_tx_skb_cache

2) Some applications might expect precise EPOLLOUT
   notifications, so we need to update sk->sk_wmem_queued
   and call sk_mem_uncharge() from sk_wmem_free_skb()
   in all cases. The SOCK_QUEUE_SHRUNK flag must also be set.

3) Reuse of saved skb should have used skb_cloned() instead
  of simply checking if the fast clone has been freed.

Fixes: 472c2e07eef0 ("tcp: add one skb cache for tx")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: openvswitch: Add a new action check_pkt_len
Numan Siddique [Tue, 26 Mar 2019 00:43:46 +0000 (06:13 +0530)]
net: openvswitch: Add a new action check_pkt_len

This patch adds a new action - 'check_pkt_len' which checks the
packet length and executes a set of actions if the packet
length is greater than the specified length or executes
another set of actions if the packet length is lesser or equal to.

This action takes below nlattrs
  * OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for

  * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER - Nested actions
    to apply if the packet length is greater than the specified 'pkt_len'

  * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL - Nested
    actions to apply if the packet length is lesser or equal to the
    specified 'pkt_len'.

The main use case for adding this action is to solve the packet
drops because of MTU mismatch in OVN virtual networking solution.
When a VM (which belongs to a logical switch of OVN) sends a packet
destined to go via the gateway router and if the nic which provides
external connectivity, has a lesser MTU, OVS drops the packet
if the packet length is greater than this MTU.

With the help of this action, OVN will check the packet length
and if it is greater than the MTU size, it will generate an
ICMP packet (type 3, code 4) and includes the next hop mtu in it
so that the sender can fragment the packets.

Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Gregory Rose <gvrose8192@gmail.com>
CC: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ethtool-add-support-for-Fast-Link-Down-as-new-PHY-tunable'
David S. Miller [Wed, 27 Mar 2019 20:51:49 +0000 (13:51 -0700)]
Merge branch 'ethtool-add-support-for-Fast-Link-Down-as-new-PHY-tunable'

Heiner Kallweit says:

====================
ethtool: add support for Fast Link Down as new PHY tunable

This adds support for Fast Link Down as new PHY tunable.
Fast Link Down reduces the time until a link down event is reported
for 1000BaseT. According to the standard it's 750ms what is too long
for several use cases.

This is the kernel-related series, the ethtool userspace extension
I'd submit once the kernel part has been applied.

v2:
- add describing comment in patch 1
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: marvell: add PHY tunable fast link down support for 88E1540
Heiner Kallweit [Mon, 25 Mar 2019 18:35:41 +0000 (19:35 +0100)]
net: phy: marvell: add PHY tunable fast link down support for 88E1540

1000BaseT standard requires that a link is reported as down earliest
after 750ms. Several use case however require a much faster detecion
of a broken link. Fast Link Down supports this by intentionally
violating a the standard. This patch exposes the Fast Link Down
feature of 88E1540 and 88E6390. These PHY's can be found as internal
PHY's in several switches: 88E635288E624088E617688E6172,
and 88E6390(X). Fast Link Down and EEE are mutually exclusive.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoethtool: add PHY Fast Link Down support
Heiner Kallweit [Mon, 25 Mar 2019 18:34:58 +0000 (19:34 +0100)]
ethtool: add PHY Fast Link Down support

This adds support for Fast Link Down as new PHY tunable.
Fast Link Down reduces the time until a link down event is reported
for 1000BaseT. According to the standard it's 750ms what is too long
for several use cases.

v2:
- add comment describing the constants

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/core: Allow the compiler to verify declaration and definition consistency
Bart Van Assche [Mon, 25 Mar 2019 16:17:23 +0000 (09:17 -0700)]
net/core: Allow the compiler to verify declaration and definition consistency

Instead of declaring a function in a .c file, declare it in a header
file and include that header file from the source files that define
and that use the function. That allows the compiler to verify
consistency of declaration and definition. See also commit
52267790ef52 ("sock: add MSG_ZEROCOPY") # v4.14.

Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/core: Fix rtnetlink kernel-doc headers
Bart Van Assche [Mon, 25 Mar 2019 16:17:22 +0000 (09:17 -0700)]
net/core: Fix rtnetlink kernel-doc headers

This patch avoids that the following warnings are reported when building
with W=1:

net/core/rtnetlink.c:3580: warning: Function parameter or member 'ndm' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'tb' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'dev' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'addr' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'vid' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3580: warning: Function parameter or member 'flags' not described in 'ndo_dflt_fdb_add'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'ndm' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'tb' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'dev' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'addr' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3718: warning: Function parameter or member 'vid' not described in 'ndo_dflt_fdb_del'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'skb' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'cb' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'filter_dev' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Function parameter or member 'idx' not described in 'ndo_dflt_fdb_dump'
net/core/rtnetlink.c:3861: warning: Excess function parameter 'nlh' description in 'ndo_dflt_fdb_dump'

Cc: Hubert Sokolowski <hubert.sokolowski@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/core: Document __skb_flow_dissect() flags argument
Bart Van Assche [Mon, 25 Mar 2019 16:17:21 +0000 (09:17 -0700)]
net/core: Document __skb_flow_dissect() flags argument

This patch avoids that the following warning is reported when building
with W=1:

warning: Function parameter or member 'flags' not described in '__skb_flow_dissect'

Cc: Tom Herbert <tom@herbertland.com>
Fixes: cd79a2382aa5 ("flow_dissector: Add flags argument to skb_flow_dissector functions") # v4.3.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/core: Document all dev_ioctl() arguments
Bart Van Assche [Mon, 25 Mar 2019 16:17:20 +0000 (09:17 -0700)]
net/core: Document all dev_ioctl() arguments

This patch avoids that the following warnings are reported when building
with W=1:

net/core/dev_ioctl.c:378: warning: Function parameter or member 'ifr' not described in 'dev_ioctl'
net/core/dev_ioctl.c:378: warning: Function parameter or member 'need_copyout' not described in 'dev_ioctl'
net/core/dev_ioctl.c:378: warning: Excess function parameter 'arg' description in 'dev_ioctl'

Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: 44c02a2c3dc5 ("dev_ioctl(): move copyin/copyout to callers") # v4.16.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/core: Document reuseport_add_sock() bind_inany argument
Bart Van Assche [Mon, 25 Mar 2019 16:17:19 +0000 (09:17 -0700)]
net/core: Document reuseport_add_sock() bind_inany argument

This patch avoids that the following warning is reported when building
with W=1:

warning: Function parameter or member 'bind_inany' not described in 'reuseport_add_sock'

Cc: Martin KaFai Lau <kafai@fb.com>
Fixes: 2dbb9b9e6df6 ("bpf: Introduce BPF_PROG_TYPE_SK_REUSEPORT") # v4.19.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: mv88e6xxx: remove unneeded cmode initialization
Heiner Kallweit [Sat, 23 Mar 2019 18:45:13 +0000 (19:45 +0100)]
net: dsa: mv88e6xxx: remove unneeded cmode initialization

This partially reverts ed8fe20205ac ("net: dsa: mv88e6xxx: prevent
interrupt storm caused by mv88e6390x_port_set_cmode"). I missed
that chip->ports[].cmode is overwritten anyway by the cmode
caching in mv88e6xxx_setup().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobnx2x: Utilize FW 7.13.11.0.
Sudarsana Reddy Kalluru [Wed, 27 Mar 2019 11:40:43 +0000 (04:40 -0700)]
bnx2x: Utilize FW 7.13.11.0.

Commit 8fcf0ec44c11f "bnx2x: Add FW 7.13.11.0" added said .bin FW to
linux-firmware; This patch incorporates the FW in the bnx2x driver.
This introduces few FW fixes and the support for Tx VLAN filtering.

Please consider applying it to 'net-next' tree.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofou: Support binding FoU socket
Kristian Evensen [Wed, 27 Mar 2019 10:16:03 +0000 (11:16 +0100)]
fou: Support binding FoU socket

An FoU socket is currently bound to the wildcard-address. While this
works fine, there are several use-cases where the use of the
wildcard-address is not desirable. For example, I use FoU on some
multi-homed servers and would like to use FoU on only one of the
interfaces.

This commit adds support for binding FoU sockets to a given source
address/interface, as well as connecting the socket to a given
destination address/port. udp_tunnel already provides the required
infrastructure, so most of the code added is for exposing and setting
the different attributes (local address, peer address, etc.).

The lookups performed when we add, delete or get an FoU-socket has also
been updated to compare all the attributes a user can set. Since the
comparison now involves several elements, I have added a separate
comparison-function instead of open-coding.

In order to test the code and ensure that the new comparison code works
correctly, I started by creating a wildcard socket bound to port 1234 on
my machine. I then tried to create a non-wildcarded socket bound to the
same port, as well as fetching and deleting the socket (including source
address, peer address or interface index in the netlink request).  Both
the create, fetch and delete request failed. Deleting/fetching the
socket was only successful when my netlink request attributes matched
those used to create the socket.

I then repeated the tests, but with a socket bound to a local ip
address, a socket bound to a local address + interface, and a bound
socket that was also «connected» to a peer. Add only worked when no
socket with the matching source address/interface (or wildcard) existed,
while fetch/delete was only successful when all attributes matched.

In addition to testing that the new code work, I also checked that the
current behavior is kept. If none of the new attributes are provided,
then an FoU-socket is configured as before (i.e., wildcarded).  If any
of the new attributes are provided, the FoU-socket is configured as
expected.

v1->v2:
* Fixed building with IPv6 disabled (kbuild).
* Fixed a return type warning and make the ugly comparison function more
readable (kbuild).
* Describe more in detail what has been tested (thanks David Miller).
* Make peer port required if peer address is specified.

Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 27 Mar 2019 19:22:57 +0000 (12:22 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "Fixes here and there, a couple new device IDs, as usual:

   1) Fix BQL race in dpaa2-eth driver, from Ioana Ciornei.

   2) Fix 64-bit division in iwlwifi, from Arnd Bergmann.

   3) Fix documentation for some eBPF helpers, from Quentin Monnet.

   4) Some UAPI bpf header sync with tools, also from Quentin Monnet.

   5) Set descriptor ownership bit at the right time for jumbo frames in
      stmmac driver, from Aaro Koskinen.

   6) Set IFF_UP properly in tun driver, from Eric Dumazet.

   7) Fix load/store doubleword instruction generation in powerpc eBPF
      JIT, from Naveen N. Rao.

   8) nla_nest_start() return value checks all over, from Kangjie Lu.

   9) Fix asoc_id handling in SCTP after the SCTP_*_ASSOC changes this
      merge window. From Marcelo Ricardo Leitner and Xin Long.

  10) Fix memory corruption with large MTUs in stmmac, from Aaro
      Koskinen.

  11) Do not use ipv4 header for ipv6 flows in TCP and DCCP, from Eric
      Dumazet.

  12) Fix topology subscription cancellation in tipc, from Erik Hugne.

  13) Memory leak in genetlink error path, from Yue Haibing.

  14) Valid control actions properly in packet scheduler, from Davide
      Caratti.

  15) Even if we get EEXIST, we still need to rehash if a shrink was
      delayed. From Herbert Xu.

  16) Fix interrupt mask handling in interrupt handler of r8169, from
      Heiner Kallweit.

  17) Fix leak in ehea driver, from Wen Yang"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (168 commits)
  dpaa2-eth: fix race condition with bql frame accounting
  chelsio: use BUG() instead of BUG_ON(1)
  net: devlink: skip info_get op call if it is not defined in dumpit
  net: phy: bcm54xx: Encode link speed and activity into LEDs
  tipc: change to check tipc_own_id to return in tipc_net_stop
  net: usb: aqc111: Extend HWID table by QNAP device
  net: sched: Kconfig: update reference link for PIE
  net: dsa: qca8k: extend slave-bus implementations
  net: dsa: qca8k: remove leftover phy accessors
  dt-bindings: net: dsa: qca8k: support internal mdio-bus
  dt-bindings: net: dsa: qca8k: fix example
  net: phy: don't clear BMCR in genphy_soft_reset
  bpf, libbpf: clarify bump in libbpf version info
  bpf, libbpf: fix version info and add it to shared object
  rxrpc: avoid clang -Wuninitialized warning
  tipc: tipc clang warning
  net: sched: fix cleanup NULL pointer exception in act_mirr
  r8169: fix cable re-plugging issue
  net: ethernet: ti: fix possible object reference leak
  net: ibm: fix possible object reference leak
  ...

5 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 27 Mar 2019 18:19:13 +0000 (11:19 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-03-26

This series contains more updates to the ice driver only.

Jeremiah provides his first patch to the Linux kernel to clean up
un-necessary newlines in driver log messages.

Mitch updates the ice driver to use existing status codes in the iavf
driver so that when errors occur, it will not report nonsensical
results.  Adds support for VF admin queue interrupts by programming the
VPINT_MBX_CTL register array.

Brett adds a check for a bit that we set while preparing for a reset, to
ensure we are prepared to do a proper reset.  Also implemented PCI error
handling operations.  Went through and audited the hot path with pahole
and made modifications based on the results since 2 structures were
taking up more space than necessary due to cache alignment issues.
Fixed an issue where when flow control was disabled, the state of flow
control was being displayed as "Unknown".

Anirudh fixes adaptive interrupt moderation changes by adding code that
was missed, that should have been added in the initial patch to add that
support.  Cleaned up a function prototype that was never implemented.
Did additional code cleanup by removing unneeded braces and redundant
code comments.

Akeem fixes an issue that occurs when the VF is attempting to remove the
default LAN/MAC address, which is programmed by the administrator by
updating the error message to explicitly say that the VF cannot change
the MAC programmed by the PF.

Preethi fixes the driver to not fall into the error path when a added
filter already exists, but instead continue to process the rest of the
function and add appropriate checks after adding MAC filters.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-mvpp2-Classifier-updates-and-cleanups'
David S. Miller [Wed, 27 Mar 2019 18:10:58 +0000 (11:10 -0700)]
Merge branch 'net-mvpp2-Classifier-updates-and-cleanups'

Maxime Chevallier says:

====================
net: mvpp2: Classifier updates and cleanups

In preparation for the future addition of classification offload
support, this series cleans-up the current code dealing with the
classification engines.

As of today, the classification code only deals with RSS, which is a
very narrow use-case when considering all of the classifier's features.

This lead to design and naming decisions that don't really stand when
considering using more classification features.

This series therefore includes quite a lot a renaming of functions and
macros, and tries to make the naming schemes more consistent and the
code more readable.

The debugfs interface that allows reading the various Parsing and
Classification engines has been cleaned-up and made more generic,
allowing to read the Flow Table and C2 engine hit counters on a
per-entry basis.

The Classifier itself has been made more robust by introducing the
lu_type field in classification lookups, that prevents false-positive
matches in the future. We also initialise the various engine in a more
extensive and less error-prone way by assining default values to all
entries in the C2 and Flow table.

This is a pretty big series considering it's mostly cleanup, but my goal
is to make the future series that will contain new big features easier
to review, focusing on the real logic.

Besides the debugfs interface, this series doesn't intend to introduce any
new features or change in behaviours.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Rework C2 engine macros
Maxime Chevallier [Wed, 27 Mar 2019 08:44:22 +0000 (09:44 +0100)]
net: mvpp2: cls: Rework C2 engine macros

The C2 classification engine has a 256 entry TCAM, used for ternary
matches on an 8 byte Header Extracted Key. For now, we compute the
various indices for classification and RSS that use this engine thanks
to a set of macros.

This commit mainly renames the macros used to make it clear that they
should be used with the C2 engine, but also make use of the full 256
entries in the engine. For now, the C2 entries are only used for RSS.

These entries are put at the end of the TCAM range, in case we want to
add higher priority matches later on.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Initialize lookup priorities for all entries in the flow
Maxime Chevallier [Wed, 27 Mar 2019 08:44:21 +0000 (09:44 +0100)]
net: mvpp2: cls: Initialize lookup priorities for all entries in the flow

When classifying a packet pertaining to a given flow, the classifier
will issue multiple lookup commands until it finds one with the 'last'
bit set. It expects all prorities to be assign continuously (although
not necessarily in an ordered fashion) from 0 to the number of lookups.

We can initialize this once, and make sure unused lookups are given an
empty port map. This avoids having to maintain priorities and the
information of which lookup is the last.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Invalidate all C2 entries except the ones we use
Maxime Chevallier [Wed, 27 Mar 2019 08:44:20 +0000 (09:44 +0100)]
net: mvpp2: cls: Invalidate all C2 entries except the ones we use

C2 TCAM entries can be invalidated to avoid unwanted matches. Make sure
all entries are invalidated at init, then validate only the ones we use.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Rename the flow table macros
Maxime Chevallier [Wed, 27 Mar 2019 08:44:19 +0000 (09:44 +0100)]
net: mvpp2: cls: Rename the flow table macros

The Flow Table dictates what lookups will be issued for each flow type.
The lookup sequence for each flow is similar, and the index of each
lookup is computed by some macros.

There are similar mechanisms for the C2 TCAM lookups, so in order to
avoid confusion, rename the flow table index computing macros with a
common prefix.

The only difference in behaviour is that we now use the very first entry
in the flow for the RSS lookup (the first entry was previously unused).

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Don't use the sequence attribute for classification
Maxime Chevallier [Wed, 27 Mar 2019 08:44:18 +0000 (09:44 +0100)]
net: mvpp2: cls: Don't use the sequence attribute for classification

The classifier allows to combine multiple lookups in one "sequence" that
is counted as a single lookup to an engine, with a single result.

We don't actually use that feature, so remove any places where we set
this field, so that the classifier doesn't try to interpret these
fields.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Rename classifer per-port functions
Maxime Chevallier [Wed, 27 Mar 2019 08:44:17 +0000 (09:44 +0100)]
net: mvpp2: cls: Rename classifer per-port functions

This commit renames some of the classifier functions to follow the
naming 'mvpp2_port_*' that's used for function that act on a given port.

This commit is purely cosmetic.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Move C2 read/write helpers around
Maxime Chevallier [Wed, 27 Mar 2019 08:44:16 +0000 (09:44 +0100)]
net: mvpp2: cls: Move C2 read/write helpers around

Move C2 read/write helpers higher in the file to ease future work that
rely on these helpers

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Write C2 TCAM data last when writing a C2 entry
Maxime Chevallier [Wed, 27 Mar 2019 08:44:15 +0000 (09:44 +0100)]
net: mvpp2: cls: Write C2 TCAM data last when writing a C2 entry

When writing a C2 entry to hardware, some registers writes will only
take effect when the TCAM_DATA4 register is written. This includes all
C2 TCAM registers, and the C2 invalidate register.

To make sure we always write C2 entries correctly, document that
behaviour with a comment, and move TCAM writes to the end of the
mvpp2_cls_c2_write helper.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Use iterators to go through the cls_table
Maxime Chevallier [Wed, 27 Mar 2019 08:44:14 +0000 (09:44 +0100)]
net: mvpp2: cls: Use iterators to go through the cls_table

The cls_table is a global read-only table containing the different
parameters that are used by various tables in the classifier. It
describes the links between the Header Parser, the decoding table and
the flow_table.

There are several possible way we want to iterate over that table,
depending on wich classifier engine we want to configure. For the Header
Parser, we want to iterate over each entry. For the Decoding table, we
want to iterate over each entry having a unique flow_id. Finally, when
configuring an ethtool flow, we want to iterate over each entry having a
unique flow_id and that has a given flow_type.

This commit introduces some iterator to both provide syntactic sugar and
also clarify the way we want to iterate over the table.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: debugfs: Allow reading the C2 engine table from debugfs
Maxime Chevallier [Wed, 27 Mar 2019 08:44:13 +0000 (09:44 +0100)]
net: mvpp2: debugfs: Allow reading the C2 engine table from debugfs

PPv2's Classifier uses multiple engines to perform classification. So
far, only the C2 engine is used, which has a 256 entries TCAM.

So far, we only accessed the relevant entries from the C2 engines, which
are the one implementing RSS. To implement and debug ntuple
classification offload, beaing able to see the hit count for each C2
entry is helpful, so this commit moves the logic to a dedicated
directory allowing to access each entry.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: debugfs: Allow reading the flow table from debugfs
Maxime Chevallier [Wed, 27 Mar 2019 08:44:12 +0000 (09:44 +0100)]
net: mvpp2: debugfs: Allow reading the flow table from debugfs

The Classifier flow table is the central part of the PPv2 Classifier,
since it describes all classification steps performed for each flow.

It has 512 entries, shared between all ports, which are divided into
sequences that are pointed-to by the decoding table. Being able to see
which entries in the flow table were hit is a key point when
implementing and debugging classification offload.

This commit allows reading each flow table entry's hit count
independently, with a clear-on-read behaviour.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: debugfs: Store debugfs entries data in mvpp2 struct
Maxime Chevallier [Wed, 27 Mar 2019 08:44:11 +0000 (09:44 +0100)]
net: mvpp2: debugfs: Store debugfs entries data in mvpp2 struct

The current way to store the required private data needed to access
various debugfs entries is to alloc them on the fly, share them within
the entries that need to access them, and finally have one entry free
that data upon closing. This leads to hard to maintain code, and is very
error-prone.

This commit stores all debugfs related data in the same place, making
sure this is allocated only when the debugfs directory is successfully
created, so that we don't waste memory when we don't use this feature.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Make the flow definitions const
Maxime Chevallier [Wed, 27 Mar 2019 08:44:10 +0000 (09:44 +0100)]
net: mvpp2: cls: Make the flow definitions const

The cls_flow table represent the overall configuration of the
classifier, used to match the different traffic classes in the Parsing
and Classification engines.

This configuration is static, and applies to all PPv2 instances, we must
therefore keep it const so that no modifications of this table are
performed at runtime.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Rename MVPP2_N_FLOWS to MVPP2_N_PRS_FLOWS
Maxime Chevallier [Wed, 27 Mar 2019 08:44:09 +0000 (09:44 +0100)]
net: mvpp2: cls: Rename MVPP2_N_FLOWS to MVPP2_N_PRS_FLOWS

The macro definition MVPP2_N_FLOWS is ambiguous because it really
represents the number of entries in the Header Parser that are used to
identify the classification flows.

Rename the macro to clearly state that we represent the number of flows
in the Header Parser.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: use Lookup Type in classification engines
Maxime Chevallier [Wed, 27 Mar 2019 08:44:08 +0000 (09:44 +0100)]
net: mvpp2: cls: use Lookup Type in classification engines

The PPv2 classifier allows to perform multiple lookups on the same
engine when classifying a packet. These lookups can match similar parts
of a packet header, but perform different actions upon matching. To
differentiate these types of lookups, it's possible to specify a Lookup
Type in the flow table entries, which will be part of the key for the
lookup engines.

This commit introduces the use of Lookup Types for C2 matches. Since for
now we only perform C2 lookups to enable RSS, we only need one Lookup
Type.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Start cls flow entries from beginning of table
Maxime Chevallier [Wed, 27 Mar 2019 08:44:07 +0000 (09:44 +0100)]
net: mvpp2: cls: Start cls flow entries from beginning of table

The Classifier flow table has 512 entries, that contains lookups
commands executed consecutively for every flow. Since we have 21
different flows, we have to carefully manage the flow table use.

As of today, the start index of a lookup sequence is computed
directly based in the flow->id. There are 8 reserved flow ids, from
0-7, which don't have any corresponding sequence in the flow table. We
can therefore ignore them when computing the index, and make so that the
first non-reserved flow point to the very beginning of the flow table.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Suggested-by: Alan Winkowski <walan@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: cls: Add missing MAC_DA field extraction
Maxime Chevallier [Wed, 27 Mar 2019 08:44:06 +0000 (09:44 +0100)]
net: mvpp2: cls: Add missing MAC_DA field extraction

PPv2's classifier supports extracting the MAC Destination Address from the
L2 header to perform RSS and flow steering. Add the missing case when
setting the Header Extracted Key fields in the flow table.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: Don't use an int to store netdev_features_t
Maxime Chevallier [Wed, 27 Mar 2019 08:44:05 +0000 (09:44 +0100)]
net: mvpp2: Don't use an int to store netdev_features_t

int is not long enough to store all netdev_features, use the correct
dedicated type to store them when building the list of dev->features.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Wed, 27 Mar 2019 04:44:13 +0000 (21:44 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2019-03-26

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) introduce bpf_tcp_check_syncookie() helper for XDP and tc, from Lorenz.

2) allow bpf_skb_ecn_set_ce() in tc, from Peter.

3) numerous bpf tc tunneling improvements, from Willem.

4) and other miscellaneous improvements from Adrian, Alan, Daniel, Ivan, Stanislav.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoice: Remove "2 BITS" comment
Anirudh Venkataramanan [Tue, 19 Feb 2019 23:04:11 +0000 (15:04 -0800)]
ice: Remove "2 BITS" comment

Some enums in ice_tx_desc_cmd_bits have a trailing /* 2 BITS */ comment,
but the value has just one bit set (ex. ICE_TX_DESC_CMD_L4T_EOFT_SCTP
has the value 0x200 (i.e. only bit 9 is set). This is confusing and
misleading. So remove the comment.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Update comment regarding the ITR_GRAN_S
Brett Creeley [Tue, 19 Feb 2019 23:04:10 +0000 (15:04 -0800)]
ice: Update comment regarding the ITR_GRAN_S

Since the driver now hard codes the ITR granularity to 2 us in the
GLINT_CTL register the comment next to ITR_GRAN_S needs to be updated.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Update function header for __ice_vsi_get_qs
Anirudh Venkataramanan [Tue, 19 Feb 2019 23:04:09 +0000 (15:04 -0800)]
ice: Update function header for __ice_vsi_get_qs

Remove some redundant text in the function header for __ice_vsi_get_qs

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Remove unnecessary braces
Anirudh Venkataramanan [Tue, 19 Feb 2019 23:04:08 +0000 (15:04 -0800)]
ice: Remove unnecessary braces

Single statement if conditions don't need braces. Remove it.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Remove unused function prototype
Anirudh Venkataramanan [Tue, 19 Feb 2019 23:04:07 +0000 (15:04 -0800)]
ice: Remove unused function prototype

Commit 37bb83901286 ("ice: Move common functions out of ice_main.c part
7/7") seems to have inadvertently introduced a function prototype for
ice_vsi_cfg_tc without a corresponding function implementation. Remove it.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Add missing case in print_link_msg for printing flow control
Brett Creeley [Tue, 19 Feb 2019 23:04:06 +0000 (15:04 -0800)]
ice: Add missing case in print_link_msg for printing flow control

Currently we aren't checking for the ICE_FC_NONE case for the current
flow control mode. This is causing "Unknown" to be printed for the
current flow control method if flow control is disabled. Fix this by
adding the case for ICE_FC_NONE to print "None".

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Audit hotpath structures with pahole
Brett Creeley [Tue, 19 Feb 2019 23:04:05 +0000 (15:04 -0800)]
ice: Audit hotpath structures with pahole

Currently the ice_q_vector structure and ice_ring_container structure
are taking up more space than necessary due to cache alignment holes
and unnecessary variables respectively. This is not helping the
driver's performance. The following fixes were done to improve cache
alignment, reduce wasted space, and increase performance.

1. Remove the ice_latency_range enum as it is unused.
2. Remove the latency_range variable in the ice_ring_container structure.
3. Change the size of the itr_idx in the ice_ring_container structure
   from an int to an u16. This reduced the size of ice_ring_container
   structure to 32 Bytes so it has no holes or padding.
4. Re-arrange the ice_q_vector structure using pahole to align
   members as best as possible in regards to 64 Byte cache line size.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Do not bail out when filter already exists
Preethi Banala [Tue, 19 Feb 2019 23:04:04 +0000 (15:04 -0800)]
ice: Do not bail out when filter already exists

If filter already exists, do not go through error path flow but instead
continue to process rest of the function. Hence have an appropriate check
after adding MAC filters.

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix issue with VF attempt to delete default MAC address
Akeem G Abodunrin [Wed, 20 Feb 2019 17:11:48 +0000 (09:11 -0800)]
ice: Fix issue with VF attempt to delete default MAC address

This patch fixes issue that occurs when VF is attempting to remove
default LAN/MAC address, which is programmed by the administrator.
We shouldn't return error for the call by the VF, but continue with
the remaining steps to handle MAC opcode. Also update the dev_err
message to explicitly say that VF can't change MAC programmed by PF.

Also change "mac" to "MAC" for kernel print statements in the same file.

Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoMerge tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Tue, 26 Mar 2019 21:25:48 +0000 (14:25 -0700)]
Merge tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

  Stable fixes:
   - Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()
   - fix mount/umount race in nlmclnt.
   - NFSv4.1 don't free interrupted slot on open

  Bugfixes:
   - Don't let RPC_SOFTCONN tasks time out if the transport is connected
   - Fix a typo in nfs_init_timeout_values()
   - Fix layoutstats handling during read failovers
   - fix uninitialized variable warning"

* tag 'nfs-for-5.1-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: fix uninitialized variable warning
  pNFS/flexfiles: Fix layoutstats handling during read failovers
  NFS: Fix a typo in nfs_init_timeout_values()
  SUNRPC: Don't let RPC_SOFTCONN tasks time out if the transport is connected
  NFSv4.1 don't free interrupted slot on open
  NFS: fix mount/umount race in nlmclnt.
  NFS: Fix nfs4_lock_state refcounting in nfs4_alloc_{lock,unlock}data()

5 years agoice: enable VF admin queue interrupts
Mitch Williams [Tue, 19 Feb 2019 23:04:02 +0000 (15:04 -0800)]
ice: enable VF admin queue interrupts

The VPINT_MBX_CTL register array must be programmed to enable VF admin
queue interrupts. Without this, VFs never get interrupts on vector 0,
and some VF drivers will fail to init.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix for adaptive interrupt moderation
Anirudh Venkataramanan [Tue, 19 Feb 2019 23:04:01 +0000 (15:04 -0800)]
ice: Fix for adaptive interrupt moderation

commit 63f545ed1285 ("ice: Add support for adaptive interrupt moderation")
was meant to add support for adaptive interrupt moderation but there was
an error on my part while formatting the patch, and thus only part of the
patch ended up being submitted.

This patch rectifies the error by adding the rest of the code.

Fixes: 63f545ed1285 ("ice: Add support for adaptive interrupt moderation")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Implement pci_error_handler ops
Brett Creeley [Wed, 13 Feb 2019 18:51:15 +0000 (10:51 -0800)]
ice: Implement pci_error_handler ops

This patch implements the following pci_error_handler ops:
.error_detected = ice_pci_err_detected
.slot_reset = ice_pci_err_slot_reset
.reset_notify = ice_pci_err_reset_notify
.reset_prepare = ice_pci_err_reset_prepare
.reset_done = ice_pci_err_reset_done
.resume = ice_pci_err_resume

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Put __ICE_PREPARED_FOR_RESET check in ice_prepare_for_reset
Brett Creeley [Wed, 13 Feb 2019 18:51:14 +0000 (10:51 -0800)]
ice: Put __ICE_PREPARED_FOR_RESET check in ice_prepare_for_reset

Currently we check if the __ICE_PREPARED_FOR_RESET bit is set prior to
calling ice_prepare_for_reset in ice_reset_subtask(), but we aren't
checking that bit in ice_do_reset() before calling
ice_prepare_for_reset(). This is not consistent and can cause issues if
ice_prepare_for_reset() is called prior to ice_do_reset(). Fix this by
checking if the __ICE_PREPARED_FOR_RESET bit is set internal to
ice_prepare_for_reset().

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: use virt channel status codes
Mitch Williams [Wed, 13 Feb 2019 18:51:13 +0000 (10:51 -0800)]
ice: use virt channel status codes

When communicating with the AVF driver, we need to use the status codes
from virtchnl.h, not our own ice-specific codes. Without this, when an
error occurs, the VF will report nonsensical results.

NOTE: this depends on changes made to include/linux/avf/virtchnl.h by
commit bb58fd7eeffc ("i40e: Update status codes")

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Remove unnecessary newlines from log messages
Jeremiah Kyle [Wed, 13 Feb 2019 18:51:12 +0000 (10:51 -0800)]
ice: Remove unnecessary newlines from log messages

Two log messages contained newlines in the middle of the message. This
resulted in unexpected driver log output.

This patch removes the newlines to restore consistency with the rest of
the driver log messages.

Signed-off-by: Jeremiah Kyle <jeremiah.kyle@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoSUNRPC: fix uninitialized variable warning
Alakesh Haloi [Tue, 26 Mar 2019 02:00:01 +0000 (02:00 +0000)]
SUNRPC: fix uninitialized variable warning

Avoid following compiler warning on uninitialized variable

net/sunrpc/xprtsock.c: In function ‘xs_read_stream_request.constprop’:
net/sunrpc/xprtsock.c:525:10: warning: ‘read’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   return read;
          ^~~~
net/sunrpc/xprtsock.c:529:23: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  return ret < 0 ? ret : read;
         ~~~~~~~~~~~~~~^~~~~~

Signed-off-by: Alakesh Haloi <alakesh.haloi@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
5 years agoopenvswitch: add seqadj extension when NAT is used.
Flavio Leitner [Mon, 25 Mar 2019 18:58:31 +0000 (15:58 -0300)]
openvswitch: add seqadj extension when NAT is used.

When the conntrack is initialized, there is no helper attached
yet so the nat info initialization (nf_nat_setup_info) skips
adding the seqadj ext.

A helper is attached later when the conntrack is not confirmed
but is going to be committed. In this case, if NAT is needed then
adds the seqadj ext as well.

Fixes: 16ec3d4fbb96 ("openvswitch: Fix cached ct with helper.")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: bpf: don't depend on hardcoded perf sample_freq
Stanislav Fomichev [Tue, 19 Mar 2019 21:53:24 +0000 (14:53 -0700)]
selftests: bpf: don't depend on hardcoded perf sample_freq

When running stacktrace_build_id_nmi, try to query
kernel.perf_event_max_sample_rate sysctl and use it as a sample_freq.
If there was an error reading sysctl, fallback to 5000.

kernel.perf_event_max_sample_rate sysctl can drift and/or can be
adjusted by the perf tool, so assuming a fixed number might be
problematic on a long running machine.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agodpaa2-eth: use netif_receive_skb_list
Ioana Ciornei [Mon, 25 Mar 2019 13:42:39 +0000 (13:42 +0000)]
dpaa2-eth: use netif_receive_skb_list

Take advantage of the software Rx batching by using
netif_receive_skb_list instead of napi_gro_receive.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: fix race condition with bql frame accounting
Ioana Ciornei [Mon, 25 Mar 2019 13:06:22 +0000 (13:06 +0000)]
dpaa2-eth: fix race condition with bql frame accounting

It might happen that Tx conf acknowledges a frame before it was
subscribed in bql, as subscribing was previously done after the enqueue
operation.

This patch moves the netdev_tx_sent_queue call before the actual frame
enqueue, so that this can never happen.

Fixes: 569dac6a5a0d ("dpaa2-eth: bql support")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agochelsio: use BUG() instead of BUG_ON(1)
Arnd Bergmann [Mon, 25 Mar 2019 12:49:16 +0000 (13:49 +0100)]
chelsio: use BUG() instead of BUG_ON(1)

clang warns about possible bugs in a dead code branch after
BUG_ON(1) when CONFIG_PROFILE_ALL_BRANCHES is enabled:

 drivers/net/ethernet/chelsio/cxgb4/sge.c:479:3: error: variable 'buf_size' is used uninitialized whenever 'if'
      condition is false [-Werror,-Wsometimes-uninitialized]
                BUG_ON(1);
                ^~~~~~~~~
 include/asm-generic/bug.h:61:36: note: expanded from macro 'BUG_ON'
 #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                                   ^~~~~~~~~~~~~~~~~~~
 include/linux/compiler.h:48:23: note: expanded from macro 'unlikely'
 #  define unlikely(x)   (__branch_check__(x, 0, __builtin_constant_p(x)))
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 drivers/net/ethernet/chelsio/cxgb4/sge.c:482:9: note: uninitialized use occurs here
        return buf_size;
               ^~~~~~~~
 drivers/net/ethernet/chelsio/cxgb4/sge.c:479:3: note: remove the 'if' if its condition is always true
                BUG_ON(1);
                ^
 include/asm-generic/bug.h:61:32: note: expanded from macro 'BUG_ON'
 #define BUG_ON(condition) do { if (unlikely(condition)) BUG(); } while (0)
                               ^
 drivers/net/ethernet/chelsio/cxgb4/sge.c:459:14: note: initialize the variable 'buf_size' to silence this warning
        int buf_size;
                    ^
                     = 0

Use BUG() here to create simpler code that clang understands
correctly.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: fix return value check in tipc_mcast_send_sync()
Wei Yongjun [Mon, 25 Mar 2019 06:31:09 +0000 (06:31 +0000)]
tipc: fix return value check in tipc_mcast_send_sync()

Fix the return value check which testing the wrong variable
in tipc_mcast_send_sync().

Fixes: c55c8edafa91 ("tipc: smooth change between replicast and broadcast")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-phy-aquantia-report-Aquantia-specific-settings-and-features'
David S. Miller [Tue, 26 Mar 2019 18:33:43 +0000 (11:33 -0700)]
Merge branch 'net-phy-aquantia-report-Aquantia-specific-settings-and-features'

Heiner Kallweit says:

====================
net: phy: aquantia: report Aquantia-specific settings and features

This series detects and reports quite some Aquantia-specific settings
and features.

v2:
- propagate timeout in patch 2
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: inform about proprietary 1000Base-T2 mode being in use
Heiner Kallweit [Sun, 24 Mar 2019 10:09:41 +0000 (11:09 +0100)]
net: phy: aquantia: inform about proprietary 1000Base-T2 mode being in use

The AQCS109 supports a proprietary 2-pair 1Gbps mode. The standard
registers don't allow to tell between 1000BaseT and 1000BaseT2.
Add reporting this proprietary mode based on a vendor register.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: report PHY details like firmware version
Heiner Kallweit [Sun, 24 Mar 2019 10:08:13 +0000 (11:08 +0100)]
net: phy: aquantia: report PHY details like firmware version

Add reporting firmware details. These details are available only once
the firmware has finished initializing the chip. This can take some
time and we need to poll for init completion.

v2:
- Propagate timeout in aqr107_wait_reset_complete(). Don't bail out
  completely on timeout because chip may be functional even w/o
  firmware image.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: print remote capabilities if link partner is Aquantia PHY
Heiner Kallweit [Sun, 24 Mar 2019 10:04:21 +0000 (11:04 +0100)]
net: phy: aquantia: print remote capabilities if link partner is Aquantia PHY

If both link partners are Aquantia PHY's then additional information is
exchanged as part of the auto-negotiation. Report remote capabilities
if link partner is Aquantia PHY.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: Avoid null pointer when failing to connect to PHY
Vladimir Oltean [Sat, 23 Mar 2019 23:24:07 +0000 (01:24 +0200)]
net: dsa: Avoid null pointer when failing to connect to PHY

When phylink_of_phy_connect fails, dsa_slave_phy_setup tries to save the
day by connecting to an alternative PHY, none other than a PHY on the
switch's internal MDIO bus, at an address equal to the port's index.

However this does not take into consideration the scenario when the
switch that failed to probe an external PHY does not have an internal
MDIO bus at all.

Fixes: aab9c4067d23 ("net: dsa: Plug in PHYLINK support")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: devlink: skip info_get op call if it is not defined in dumpit
Jiri Pirko [Sat, 23 Mar 2019 23:21:03 +0000 (00:21 +0100)]
net: devlink: skip info_get op call if it is not defined in dumpit

In dumpit, unlike doit, the check for info_get op being defined
is missing. Add it and avoid null pointer dereference in case driver
does not define this op.

Fixes: f9cf22882c60 ("devlink: add device information API")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: bcm54xx: Encode link speed and activity into LEDs
Vladimir Oltean [Sat, 23 Mar 2019 22:18:46 +0000 (00:18 +0200)]
net: phy: bcm54xx: Encode link speed and activity into LEDs

Previously the green and amber LEDs on this quad PHY were solid, to
indicate an encoding of the link speed (10/100/1000).

This keeps the LEDs always on just as before, but now they flash on
Rx/Tx activity.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: change to check tipc_own_id to return in tipc_net_stop
Xin Long [Sat, 23 Mar 2019 16:48:22 +0000 (00:48 +0800)]
tipc: change to check tipc_own_id to return in tipc_net_stop

When running a syz script, a panic occurred:

[  156.088228] BUG: KASAN: use-after-free in tipc_disc_timeout+0x9c9/0xb20 [tipc]
[  156.094315] Call Trace:
[  156.094844]  <IRQ>
[  156.095306]  dump_stack+0x7c/0xc0
[  156.097346]  print_address_description+0x65/0x22e
[  156.100445]  kasan_report.cold.3+0x37/0x7a
[  156.102402]  tipc_disc_timeout+0x9c9/0xb20 [tipc]
[  156.106517]  call_timer_fn+0x19a/0x610
[  156.112749]  run_timer_softirq+0xb51/0x1090

It was caused by the netns freed without deleting the discoverer timer,
while later on the netns would be accessed in the timer handler.

The timer should have been deleted by tipc_net_stop() when cleaning up a
netns. However, tipc has been able to enable a bearer and start d->timer
without the local node_addr set since Commit 52dfae5c85a4 ("tipc: obtain
node identity from interface by default"), which caused the timer not to
be deleted in tipc_net_stop() then.

So fix it in tipc_net_stop() by changing to check local node_id instead
of local node_addr, as Jon suggested.

While at it, remove the calling of tipc_nametbl_withdraw() there, since
tipc_nametbl_stop() will take of the nametbl's freeing after.

Fixes: 52dfae5c85a4 ("tipc: obtain node identity from interface by default")
Reported-by: syzbot+a25307ad099309f1c2b9@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: usb: aqc111: Extend HWID table by QNAP device
Dmitry Bezrukov [Sat, 23 Mar 2019 13:59:53 +0000 (13:59 +0000)]
net: usb: aqc111: Extend HWID table by QNAP device

New device of QNAP based on aqc111u
Add this ID to blacklist of cdc_ether driver as well

Signed-off-by: Dmitry Bezrukov <dmitry.bezrukov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: Kconfig: update reference link for PIE
Leslie Monis [Sat, 23 Mar 2019 13:41:33 +0000 (19:11 +0530)]
net: sched: Kconfig: update reference link for PIE

RFC 8033 replaces the IETF draft for PIE

Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: simplify aqr_config_aneg
Heiner Kallweit [Sat, 23 Mar 2019 13:35:20 +0000 (14:35 +0100)]
net: phy: aquantia: simplify aqr_config_aneg

Simplify aqr_config_aneg().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: qca8k: extend slave-bus implementations
Christian Lamparter [Fri, 22 Mar 2019 00:05:03 +0000 (01:05 +0100)]
net: dsa: qca8k: extend slave-bus implementations

This patch implements accessors for the QCA8337 MDIO access
through the MDIO_MASTER register, which makes it possible to
access the PHYs on slave-bus through the switch. In cases
where the switch ports are already mapped via external
"phy-phandles", the internal mdio-bus is disabled in order to
prevent a duplicated discovery and enumeration of the same
PHYs. Don't use mixed external and internal mdio-bus
configurations, as this is not supported by the hardware.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: qca8k: remove leftover phy accessors
Christian Lamparter [Fri, 22 Mar 2019 00:05:02 +0000 (01:05 +0100)]
net: dsa: qca8k: remove leftover phy accessors

This belated patch implements Andrew Lunn's request of
"remove the phy_read() and phy_write() functions."
<https://lore.kernel.org/patchwork/comment/902734/>

While seemingly harmless, this causes the switch's user
port PHYs to get registered twice. This is because the
DSA subsystem will create a slave mdio-bus not knowing
that the qca8k_phy_(read|write) accessors operate on
the external mdio-bus. So the same "bus" gets effectively
duplicated.

Cc: stable@vger.kernel.org
Fixes: 6b93fb46480a ("net-next: dsa: add new driver for qca8xxx family")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: dsa: qca8k: support internal mdio-bus
Christian Lamparter [Fri, 22 Mar 2019 00:05:01 +0000 (01:05 +0100)]
dt-bindings: net: dsa: qca8k: support internal mdio-bus

This patch updates the qca8k's binding to document to the
approach for using the internal mdio-bus of the supported
qca8k switches.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: dsa: qca8k: fix example
Christian Lamparter [Fri, 22 Mar 2019 00:05:00 +0000 (01:05 +0100)]
dt-bindings: net: dsa: qca8k: fix example

In the example, the phy at phy@0 is clashing with
the switch0@0 at the same address. Usually, the switches
are accessible through pseudo PHYs which in case of the
qca8k are located at 0x10 - 0x18.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'for-5.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Tue, 26 Mar 2019 17:32:13 +0000 (10:32 -0700)]
Merge tag 'for-5.1-rc2-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fsync fixes: i_size for truncate vs fsync, dio vs buffered during
   snapshotting, remove complicated but incomplete assertion

 - removed excessive warnigs, misreported device stats updates

 - fix raid56 page mapping for 32bit arch

 - fixes reported by static analyzer

* tag 'for-5.1-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix assertion failure on fsync with NO_HOLES enabled
  btrfs: Avoid possible qgroup_rsv_size overflow in btrfs_calculate_inode_block_rsv_size
  btrfs: Fix bound checking in qgroup_trace_new_subtree_blocks
  btrfs: raid56: properly unmap parity page in finish_parity_scrub()
  btrfs: don't report readahead errors and don't update statistics
  Btrfs: fix file corruption after snapshotting due to mix of buffered/DIO writes
  btrfs: remove WARN_ON in log_dir_items
  Btrfs: fix incorrect file size after shrinking truncate and fsync

5 years agoMerge tag 'trace-v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 26 Mar 2019 17:21:55 +0000 (10:21 -0700)]
Merge tag 'trace-v5.1-rc2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Three small fixes:

   - A fix to a double free in the histogram code

   - Uninitialized variable fix

   - Use NULL instead of zero fix and spelling fixes"

* tag 'trace-v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix warning using plain integer as NULL & spelling corrections
  tracing: initialize variable in create_dyn_event()
  tracing: Remove unnecessary var_ref destroy in track_data_destroy()

5 years agoMerge tag 'locks-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Linus Torvalds [Tue, 26 Mar 2019 17:06:29 +0000 (10:06 -0700)]
Merge tag 'locks-v5.1' of git://git./linux/kernel/git/jlayton/linux

Pull file locking bugfix from Jeff Layton:
 "Just a single fix for a bug that crept into POSIX lock deadlock
  detection in v5.0"

* tag 'locks-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  locks: wake any locks blocked on request before deadlock check

5 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Tue, 26 Mar 2019 16:41:27 +0000 (09:41 -0700)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-03-25

This series contains updates to the ice driver only.

Victor updates the ice driver to be able to update the VSI queue
configuration dynamically, by providing the ability to increase or
decrease the VSI's number of queues.

Michal fixes an issue when the VM starts or the VF driver is reloaded,
the VLAN switch rule was lost (i.e. not added), so ensure it gets added
in these cases.

Brett updates the driver to support link events over the admin receive
queue, instead of polling link events.

Maciej refactors the code a bit to introduce a new function to fetch the
receiver buffer and do the DMA synchronization to reduce the code
duplication.  Also added ice_can_reuse_rx_page() to verify whether the
page can be reused so that in the future, we can use this check
elsewhere in the driver.  Additional driver optimizations so that we can
drop the ice_pull_tail() altogether.  Added support for bulk updates of
refcount instead of doing it one by one.  Refactored the page counting
and buffer recycling so that we can use this code to clean up receive
buffers when there is no skb allocated, like XDP.  Added
DMA_ATTR_WEAK_ORDERING and DMA_ATTR_SKIP_CPU_SYNC attributes to the DMA
API during the mapping operations on the receive side, so that nonx86
platforms will be able to sync with what is being used (2k buffers)
instead of the entire page.

Dave fixes the driver to perform the most intrusive of the resets
requested and clear the other request bits so that we do not end up with
repeated reset, after reset.

Bruce adds a iterator macro to clean up several for() loops.

Chinh modifies the packet flags to be more generic so that they can be
used for both receive and transmit.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoftrace: Fix warning using plain integer as NULL & spelling corrections
Hariprasad Kelam [Sat, 23 Mar 2019 18:35:23 +0000 (00:05 +0530)]
ftrace: Fix warning using plain integer as NULL & spelling corrections

Changed  0 --> NULL to avoid sparse warning
Corrected spelling mistakes reported by checkpatch.pl
Sparse warning below:

sudo make C=2 CF=-D__CHECK_ENDIAN__ M=kernel/trace

CHECK   kernel/trace/ftrace.c
kernel/trace/ftrace.c:3007:24: warning: Using plain integer as NULL pointer
kernel/trace/ftrace.c:4758:37: warning: Using plain integer as NULL pointer

Link: http://lkml.kernel.org/r/20190323183523.GA2244@hari-Inspiron-1545
Signed-off-by: Hariprasad Kelam <hariprasad.kelam@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
5 years agotracing: initialize variable in create_dyn_event()
Frank Rowand [Fri, 22 Mar 2019 06:58:20 +0000 (23:58 -0700)]
tracing: initialize variable in create_dyn_event()

Fix compile warning in create_dyn_event(): 'ret' may be used uninitialized
in this function [-Wuninitialized].

Link: http://lkml.kernel.org/r/1553237900-8555-1-git-send-email-frowand.list@gmail.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 5448d44c3855 ("tracing: Add unified dynamic event framework")
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
5 years agotracing: Remove unnecessary var_ref destroy in track_data_destroy()
Tom Zanussi [Wed, 20 Mar 2019 17:53:33 +0000 (12:53 -0500)]
tracing: Remove unnecessary var_ref destroy in track_data_destroy()

Commit 656fe2ba85e8 (tracing: Use hist trigger's var_ref array to
destroy var_refs) centralized the destruction of all the var_refs
in one place so that other code didn't have to do it.

The track_data_destroy() added later ignored that and also destroyed
the track_data var_ref, causing a double-free error flagged by KASAN.

==================================================================
BUG: KASAN: use-after-free in destroy_hist_field+0x30/0x70
Read of size 8 at addr ffff888086df2210 by task bash/1694

CPU: 6 PID: 1694 Comm: bash Not tainted 5.1.0-rc1-test+ #15
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03
07/14/2016
Call Trace:
 dump_stack+0x71/0xa0
 ? destroy_hist_field+0x30/0x70
 print_address_description.cold.3+0x9/0x1fb
 ? destroy_hist_field+0x30/0x70
 ? destroy_hist_field+0x30/0x70
 kasan_report.cold.4+0x1a/0x33
 ? __kasan_slab_free+0x100/0x150
 ? destroy_hist_field+0x30/0x70
 destroy_hist_field+0x30/0x70
 track_data_destroy+0x55/0xe0
 destroy_hist_data+0x1f0/0x350
 hist_unreg_all+0x203/0x220
 event_trigger_open+0xbb/0x130
 do_dentry_open+0x296/0x700
 ? stacktrace_count_trigger+0x30/0x30
 ? generic_permission+0x56/0x200
 ? __x64_sys_fchdir+0xd0/0xd0
 ? inode_permission+0x55/0x200
 ? security_inode_permission+0x18/0x60
 path_openat+0x633/0x22b0
 ? path_lookupat.isra.50+0x420/0x420
 ? __kasan_kmalloc.constprop.12+0xc1/0xd0
 ? kmem_cache_alloc+0xe5/0x260
 ? getname_flags+0x6c/0x2a0
 ? do_sys_open+0x149/0x2b0
 ? do_syscall_64+0x73/0x1b0
 ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
 ? _raw_write_lock_bh+0xe0/0xe0
 ? __kernel_text_address+0xe/0x30
 ? unwind_get_return_address+0x2f/0x50
 ? __list_add_valid+0x2d/0x70
 ? deactivate_slab.isra.62+0x1f4/0x5a0
 ? getname_flags+0x6c/0x2a0
 ? set_track+0x76/0x120
 do_filp_open+0x11a/0x1a0
 ? may_open_dev+0x50/0x50
 ? _raw_spin_lock+0x7a/0xd0
 ? _raw_write_lock_bh+0xe0/0xe0
 ? __alloc_fd+0x10f/0x200
 do_sys_open+0x1db/0x2b0
 ? filp_open+0x50/0x50
 do_syscall_64+0x73/0x1b0
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fa7b24a4ca2
Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 85 7a 0d 00 8b 00 85 c0
75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff
0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
RSP: 002b:00007fffbafb3af0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 000055d3648ade30 RCX: 00007fa7b24a4ca2
RDX: 0000000000000241 RSI: 000055d364a55240 RDI: 00000000ffffff9c
RBP: 00007fffbafb3bf0 R08: 0000000000000020 R09: 0000000000000002
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000003 R14: 0000000000000001 R15: 000055d364a55240
==================================================================

So remove the track_data_destroy() destroy_hist_field() call for that
var_ref.

Link: http://lkml.kernel.org/r/1deffec420f6a16d11dd8647318d34a66d1989a9.camel@linux.intel.com
Fixes: 466f4528fbc69 ("tracing: Generalize hist trigger onmax and save action")
Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
5 years agonet: phy: don't clear BMCR in genphy_soft_reset
Heiner Kallweit [Fri, 22 Mar 2019 19:00:20 +0000 (20:00 +0100)]
net: phy: don't clear BMCR in genphy_soft_reset

So far we effectively clear the BMCR register. Some PHY's can deal
with this (e.g. because they reset BMCR to a default as part of a
soft-reset) whilst on others this causes issues because e.g. the
autoneg bit is cleared. Marvell is an example, see also thread [0].
So let's be a little bit more gentle and leave all bits we're not
interested in as-is. This change is needed for PHY drivers to
properly deal with the original patch.

[0] https://marc.info/?t=155264050700001&r=1&w=2

Fixes: 6e2d85ec0559 ("net: phy: Stop with excessive soft reset")
Tested-by: Phil Reid <preid@electromag.com.au>
Tested-by: liweihang <liweihang@hisilicon.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "parport: daisy: use new parport device model"
Linus Torvalds [Mon, 25 Mar 2019 21:49:00 +0000 (14:49 -0700)]
Revert "parport: daisy: use new parport device model"

This reverts commit 1aec4211204d9463d1fd209eb50453de16254599.

Steven Rostedt reports that it causes a hang at bootup and bisected it
to this commit.

The troigger is apparently a module alias for "parport_lowlevel" that
points to "parport_pc", which causes a hang with

    modprobe -q -- parport_lowlevel

blocking forever with a backtrace like this:

    wait_for_completion_killable+0x1c/0x28
    call_usermodehelper_exec+0xa7/0x108
    __request_module+0x351/0x3d8
    get_lowlevel_driver+0x28/0x41 [parport]
    __parport_register_driver+0x39/0x1f4 [parport]
    daisy_drv_init+0x31/0x4f [parport]
    parport_bus_init+0x5d/0x7b [parport]
    parport_default_proc_register+0x26/0x1000 [parport]
    do_one_initcall+0xc2/0x1e0
    do_init_module+0x50/0x1d4
    load_module+0x1c2e/0x21b3
    sys_init_module+0xef/0x117

Supid says:
 "Due to the new device model daisy driver will now try to find the
  parallel ports while trying to register its driver so that it can bind
  with them. Now, since daisy driver is loaded while parport bus is
  initialising the list of parport is still empty and it tries to load
  the lowlevel driver, which has an alias set to parport_pc, now causes
  a deadlock"

But I don't think the daisy driver should be loaded by the parport
initialization in the first place, so let's revert the whole change.

If the daisy driver can just initialize separately on its own (like a
driver should), instead of hooking into the parport init sequence
directly, this issue probably would go away.

Reported-and-bisected-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoice: Create a generic name for the ice_rx_flg64_bits structure
Chinh T Cao [Wed, 13 Feb 2019 18:51:11 +0000 (10:51 -0800)]
ice: Create a generic name for the ice_rx_flg64_bits structure

This structure is used to define the packet flags. These flags are
applicable for both TX and RX packet. Thus, this patch changes its
name from ice_rx_flag64_bits to ice_flg64_bits, and its member definition.

Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
Reviewed-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: add and use new ice_for_each_traffic_class() macro
Bruce Allan [Wed, 13 Feb 2019 18:51:10 +0000 (10:51 -0800)]
ice: add and use new ice_for_each_traffic_class() macro

There are numerous for() loops iterating over each of the max traffic
classes.  Use a simple iterator macro instead to make the code cleaner.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: change VF VSI tc info along with num_queues
Preethi Banala [Wed, 13 Feb 2019 18:51:09 +0000 (10:51 -0800)]
ice: change VF VSI tc info along with num_queues

Update VF VSI tc info along with vsi->num_txq/num_rxq when VF requests to
configure queues.

Signed-off-by: Preethi Banala <preethi.banala@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Prevent unintended multiple chain resets
Dave Ertman [Wed, 13 Feb 2019 18:51:08 +0000 (10:51 -0800)]
ice: Prevent unintended multiple chain resets

In the current implementation of ice_reset_subtask, if multiple reset
types are set in the pf->state, the most intrusive one is meant to be
performed only, but the bits requesting the other types are not being
cleared. This would lead to another reset being performed the next time
the service task is scheduled.

Change the flow of ice_reset_subtask so that all reset request bits in
pf->state are cleared, and we still perform the most intrusive of the
resets requested.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: map Rx buffer pages with DMA attributes
Maciej Fijalkowski [Wed, 13 Feb 2019 18:51:07 +0000 (10:51 -0800)]
ice: map Rx buffer pages with DMA attributes

Provide DMA_ATTR_WEAK_ORDERING and DMA_ATTR_SKIP_CPU_SYNC attributes to
the DMA API during the mapping operations on Rx side. With this change
the non-x86 platforms will be able to sync only with what is being used
(2k buffer) instead of entire page. This should yield a slight
performance improvement.

Furthermore, DMA unmap may destroy the changes that were made to the
buffer by CPU when platform is not a x86 one. DMA_ATTR_SKIP_CPU_SYNC
attribute usage fixes this issue.

Also add a sync_single_for_device call during the Rx buffer assignment,
to make sure that the cache lines are cleared before device attempting
to write to the buffer.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Limit the ice_add_rx_frag to frag addition
Maciej Fijalkowski [Wed, 13 Feb 2019 18:51:06 +0000 (10:51 -0800)]
ice: Limit the ice_add_rx_frag to frag addition

Refactor ice_fetch_rx_buf and ice_add_rx_frag in a way that we have
standalone functions that do either the skb construction or frag
addition to previously constructed skb.

The skb handling between rx_bufs is spread among various functions. The
ice_get_rx_buf will retrieve the skb pointer from rx_buf and if it is a
NULL pointer then we do the ice_construct_skb, otherwise we add a frag
to the current skb via ice_add_rx_frag. Then, on the ice_put_rx_buf the
skb pointer that belongs to rx_buf will be cleared. Moving further, if
the current frame is not EOP frame we assign the current skb to the
rx_buf that is pointed by updated next_to_clean indicator.

What is more during the buffer reuse let's assign each member of
ice_rx_buf individually so we avoid the unnecessary copy of skb.

Last but not least, this logic split will allow us for better code reuse
when adding a support for build_skb.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Gather the rx buf clean-up logic for better reuse
Maciej Fijalkowski [Wed, 13 Feb 2019 18:51:05 +0000 (10:51 -0800)]
ice: Gather the rx buf clean-up logic for better reuse

Pull out the code responsible for page counting and buffer recycling so
that it will be possible to clean up the Rx buffers in cases where we
won't allocate skb (ex. XDP)

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Introduce bulk update for page count
Maciej Fijalkowski [Wed, 13 Feb 2019 18:51:04 +0000 (10:51 -0800)]
ice: Introduce bulk update for page count

{get,put}_page are atomic operations which we use for page count
handling. The current logic for refcount handling is that we increment
it when passing a skb with the data from the first half of page up to
netstack and recycle the second half of page. This operation protects us
from losing a page since the network stack can decrement the refcount of
page from skb.

The performance can be gently improved by doing the bulk updates of
refcount instead of doing it one by one. During the buffer initialization,
maximize the page's refcount and don't allow the refcount to become
less than two.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Get rid of ice_pull_tail
Maciej Fijalkowski [Wed, 13 Feb 2019 18:51:03 +0000 (10:51 -0800)]
ice: Get rid of ice_pull_tail

Instead of adding a frag and later when dealing with EOP frame accessing
that frag in order to copy the headers onto linear part of skb, we can do
this in ice_add_rx_frag in case where the data_len is still 0 and frame
won't fit onto the linear part as a whole.

Function comment of ice_pull_tail was a bit misleading because of
mentioned optimizations that can be performed (drop a frag/maintaining
accurate truesize of skb) - it seems that this part of logic was dropped
and the comment was not updated to reflect this change.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>