openwrt/staging/blogic.git
5 years agoMerge branch 'sctp-add-some-missing-events-from-rfc5061'
Jakub Kicinski [Thu, 10 Oct 2019 00:10:44 +0000 (17:10 -0700)]
Merge branch 'sctp-add-some-missing-events-from-rfc5061'

Xin Long says:
====================
There are 4 events defined in rfc5061 missed in linux sctp:
SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED, SCTP_ADDR_MADE_PRIM and
SCTP_SEND_FAILED_EVENT.

This patchset is to add them up.
====================

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agosctp: add SCTP_SEND_FAILED_EVENT event
Xin Long [Tue, 8 Oct 2019 11:27:36 +0000 (19:27 +0800)]
sctp: add SCTP_SEND_FAILED_EVENT event

This patch is to add a new event SCTP_SEND_FAILED_EVENT described in
rfc6458#section-6.1.11. It's a update of SCTP_SEND_FAILED event:

  struct sctp_sndrcvinfo ssf_info is replaced with
  struct sctp_sndinfo ssfe_info in struct sctp_send_failed_event.

SCTP_SEND_FAILED is being deprecated, but we don't remove it in this
patch. Both are being processed in sctp_datamsg_destroy() when the
corresp event flag is set.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agosctp: add SCTP_ADDR_MADE_PRIM event
Xin Long [Tue, 8 Oct 2019 11:27:35 +0000 (19:27 +0800)]
sctp: add SCTP_ADDR_MADE_PRIM event

sctp_ulpevent_nofity_peer_addr_change() would be called in
sctp_assoc_set_primary() to send SCTP_ADDR_MADE_PRIM event
when this transport is set to the primary path of the asoc.

This event is described in rfc6458#section-6.1.2:

  SCTP_ADDR_MADE_PRIM:  This address has now been made the primary
     destination address.  This notification is provided whenever an
     address is made primary.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agosctp: add SCTP_ADDR_REMOVED event
Xin Long [Tue, 8 Oct 2019 11:27:34 +0000 (19:27 +0800)]
sctp: add SCTP_ADDR_REMOVED event

sctp_ulpevent_nofity_peer_addr_change() is called in
sctp_assoc_rm_peer() to send SCTP_ADDR_REMOVED event
when this transport is removed from the asoc.

This event is described in rfc6458#section-6.1.2:

  SCTP_ADDR_REMOVED:  The address is no longer part of the
     association.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agosctp: add SCTP_ADDR_ADDED event
Xin Long [Tue, 8 Oct 2019 11:27:33 +0000 (19:27 +0800)]
sctp: add SCTP_ADDR_ADDED event

A helper sctp_ulpevent_nofity_peer_addr_change() will be extracted
to make peer_addr_change event and enqueue it, and the helper will
be called in sctp_assoc_add_peer() to send SCTP_ADDR_ADDED event.

This event is described in rfc6458#section-6.1.2:

  SCTP_ADDR_ADDED:  The address is now part of the association.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: stmmac: add flexible PPS to dwmac 4.10a
Antonio Borneo [Mon, 7 Oct 2019 15:43:06 +0000 (17:43 +0200)]
net: stmmac: add flexible PPS to dwmac 4.10a

All the registers and the functionalities used in the callback
dwmac5_flex_pps_config() are common between dwmac 4.10a [1] and
5.00a [2].

Reuse the same callback for dwmac 4.10a too.

Tested on STM32MP15x, based on dwmac 4.10a.

[1] DWC Ethernet QoS Databook 4.10a October 2014
[2] DWC Ethernet QoS Databook 5.00a September 2017

Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agoMerge tag 'spi-ptp-api' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Jakub Kicinski [Wed, 9 Oct 2019 22:06:44 +0000 (15:06 -0700)]
Merge tag 'spi-ptp-api' of https://git./linux/kernel/git/broonie/spi

Pull in a dependency for Vladimir's work on more precise
packet time stamping.

Mark Brown says:

====================
spi: Add a PTP API

For detailed timestamping of operations.
====================

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agoRevert "tun: call dev_get_valid_name() before register_netdevice()"
Eric Dumazet [Tue, 8 Oct 2019 21:20:34 +0000 (14:20 -0700)]
Revert "tun: call dev_get_valid_name() before register_netdevice()"

This reverts commit 0ad646c81b2182f7fa67ec0c8c825e0ee165696d.

As noticed by Jakub, this is no longer needed after
commit 11fc7d5a0a2d ("tun: fix memory leak in error path")

This no longer exports dev_get_valid_name() for the exclusive
use of tun driver.

Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: tipc: prepare attrs in __tipc_nl_compat_dumpit()
Jiri Pirko [Tue, 8 Oct 2019 11:01:51 +0000 (13:01 +0200)]
net: tipc: prepare attrs in __tipc_nl_compat_dumpit()

__tipc_nl_compat_dumpit() calls tipc_nl_publ_dump() which expects
the attrs to be available by genl_dumpit_info(cb)->attrs. Add info
struct and attr parsing in compat dumpit function.

Reported-by: syzbot+8d37c50ffb0f52941a5e@syzkaller.appspotmail.com
Fixes: 057af7071344 ("net: tipc: have genetlink code to parse the attrs during dumpit")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: genetlink: always allocate separate attrs for dumpit ops
Jiri Pirko [Tue, 8 Oct 2019 10:31:43 +0000 (12:31 +0200)]
net: genetlink: always allocate separate attrs for dumpit ops

Individual dumpit ops (start, dumpit, done) are locked by genl_lock
if !family->parallel_ops. However, multiple
genl_family_rcv_msg_dumpit() calls may in in flight in parallel.
Each has a separate struct genl_dumpit_info allocated
but they share the same family->attrbuf. Fix this by allocating separate
memory for attrs for dumpit ops, for non-parallel_ops (for parallel_ops
it is done already).

Reported-by: syzbot+495688b736534bb6c6ad@syzkaller.appspotmail.com
Reported-by: syzbot+ff59dc711f2cff879a05@syzkaller.appspotmail.com
Reported-by: syzbot+dbe02e13bcce52bcf182@syzkaller.appspotmail.com
Reported-by: syzbot+9cb7edb2906ea1e83006@syzkaller.appspotmail.com
Fixes: bf813b0afeae ("net: genetlink: parse attrs and store in contect info struct during dumpit")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agoMerge branch 'hns3-next' into net-next
Jakub Kicinski [Wed, 9 Oct 2019 00:26:41 +0000 (17:26 -0700)]
Merge branch 'hns3-next' into net-next

Huazhong Tan says:

====================
This patch-set includes some new features for the HNS3 ethernet
controller driver.

[patch 01/06] adds support for configuring VF link status on the host.

[patch 02/06] adds support for configuring VF spoof check.

[patch 03/06] adds support for configuring VF trust.

[patch 04/06] adds support for configuring VF bandwidth on the host.

[patch 05/06] adds support for configuring VF MAC on the host.

[patch 06/06] adds support for tx-scatter-gather-fraglist.
====================

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: hns3: support tx-scatter-gather-fraglist feature
Yunsheng Lin [Tue, 8 Oct 2019 01:20:09 +0000 (09:20 +0800)]
net: hns3: support tx-scatter-gather-fraglist feature

The hardware supports up to 8 TX BD for non-tso skb and up to
63 TX BD for TSO skb. Currently, the hns3 driver supports RX skb
with fraglist when HW GRO is enabled, when the stack forwards a
RX skb with fraglist, the stack need to linearize the skb before
sending to other interface without TX fraglist support.

This patch adds support for TX fraglist. The performance increases
from 1 GByte to 1.5 GByte for one iperf TCP stream during
forwarding test after this patch. BTW, the minimum BD number of
ring should be updated to 72 for supporting TX fraglist.

This patch also changes the error handling of some function that
called by hns3_fill_desc, which returns BD num when there is no
error, change some macro to more meaningful name.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: hns3: add support for configuring VF MAC from the host
Huazhong Tan [Tue, 8 Oct 2019 01:20:08 +0000 (09:20 +0800)]
net: hns3: add support for configuring VF MAC from the host

This patch adds support of configuring VF MAC from the host
for the HNS3 driver.

BTW, the parameter init in the hns3_init_mac_addr is
unnecessary now, since the MAC address will not read from
NCL_CONFIG when doing reset, so it should be removed,
otherwise it will affect VF's MAC address initialization.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: hns3: add support for configuring bandwidth of VF on the host
Yonglong Liu [Tue, 8 Oct 2019 01:20:07 +0000 (09:20 +0800)]
net: hns3: add support for configuring bandwidth of VF on the host

This patch adds support for configuring bandwidth of VF on the host
for HNS3 drivers.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: hns3: add support for setting VF trust
Jian Shen [Tue, 8 Oct 2019 01:20:06 +0000 (09:20 +0800)]
net: hns3: add support for setting VF trust

This patch adds supports for setting VF trust by host. If specified
VF is trusted, then it can enable promisc(include allmulti mode).
If a trusted VF enabled promisc, and being untrusted, host will
disable promisc mode for this VF.

For VF will update its promisc mode from set_rx_mode now, so it's
unnecessary to set broadcst promisc mode when initialization or
reset.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: hns3: add support for spoof check setting
Jian Shen [Tue, 8 Oct 2019 01:20:05 +0000 (09:20 +0800)]
net: hns3: add support for spoof check setting

This patch adds support for spoof check configuration for VFs.
When it is enabled, "spoof checking" is done for both mac address
and VLAN. For each VF, the HW ensures that the source MAC address
(or VLAN) of every outgoing packet exists in the MAC-list (or
VLAN-list) configured for RX filtering for that VF. If not,
the packet is dropped.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: hns3: add support for setting VF link status on the host
Yufeng Mo [Tue, 8 Oct 2019 01:20:04 +0000 (09:20 +0800)]
net: hns3: add support for setting VF link status on the host

This patch adds support to configure VF link properties.
The options are auto, enable, and disable. Even if the PF
is down, the communication between VFs will be normal
if the VFs are set to enable. The commands are as follows:

'ip link set <pf> vf <vf_id> state <auto|enable|disable>'
change the VF status

'ip link show'
show the setting status

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agotun: fix memory leak in error path
Eric Dumazet [Mon, 7 Oct 2019 19:21:05 +0000 (12:21 -0700)]
tun: fix memory leak in error path

syzbot reported a warning [1] that triggered after recent Jiri patch.

This exposes a bug that we hit already in the past (see commit
ff244c6b29b1 ("tun: handle register_netdevice() failures properly")
for details)

tun uses priv->destructor without an ndo_init() method.

register_netdevice() can return an error, but will
not call priv->destructor() in some cases. Jiri recent
patch added one more.

A long term fix would be to transfer the initialization
of what we destroy in ->destructor() in the ndo_init()

This looks a bit risky given the complexity of tun driver.

A simpler fix is to detect after the failed register_netdevice()
if the tun_free_netdev() function was called already.

[1]
ODEBUG: free active (active state 0) object type: timer_list hint: tun_flow_cleanup+0x0/0x280 drivers/net/tun.c:457
WARNING: CPU: 0 PID: 8653 at lib/debugobjects.c:481 debug_print_object+0x168/0x250 lib/debugobjects.c:481
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 8653 Comm: syz-executor976 Not tainted 5.4.0-rc1-next-20191004 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 panic+0x2dc/0x755 kernel/panic.c:220
 __warn.cold+0x2f/0x3c kernel/panic.c:581
 report_bug+0x289/0x300 lib/bug.c:195
 fixup_bug arch/x86/kernel/traps.c:174 [inline]
 fixup_bug arch/x86/kernel/traps.c:169 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:267
 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:286
 invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
RIP: 0010:debug_print_object+0x168/0x250 lib/debugobjects.c:481
Code: dd 80 b9 e6 87 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 b5 00 00 00 48 8b 14 dd 80 b9 e6 87 48 c7 c7 e0 ae e6 87 e8 80 84 ff fd <0f> 0b 83 05 e3 ee 80 06 01 48 83 c4 20 5b 41 5c 41 5d 41 5e 5d c3
RSP: 0018:ffff888095997a28 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815cb526 RDI: ffffed1012b32f37
RBP: ffff888095997a68 R08: ffff8880a92ac580 R09: ffffed1015d04101
R10: ffffed1015d04100 R11: ffff8880ae820807 R12: 0000000000000001
R13: ffffffff88fb5340 R14: ffffffff81627110 R15: ffff8880aa41eab8
 __debug_check_no_obj_freed lib/debugobjects.c:963 [inline]
 debug_check_no_obj_freed+0x2d4/0x43f lib/debugobjects.c:994
 kfree+0xf8/0x2c0 mm/slab.c:3755
 kvfree+0x61/0x70 mm/util.c:593
 netdev_freemem net/core/dev.c:9384 [inline]
 free_netdev+0x39d/0x450 net/core/dev.c:9533
 tun_set_iff drivers/net/tun.c:2871 [inline]
 __tun_chr_ioctl+0x317b/0x3f30 drivers/net/tun.c:3075
 tun_chr_ioctl+0x2b/0x40 drivers/net/tun.c:3355
 vfs_ioctl fs/ioctl.c:47 [inline]
 file_ioctl fs/ioctl.c:539 [inline]
 do_vfs_ioctl+0xdb6/0x13e0 fs/ioctl.c:726
 ksys_ioctl+0xab/0xd0 fs/ioctl.c:743
 __do_sys_ioctl fs/ioctl.c:750 [inline]
 __se_sys_ioctl fs/ioctl.c:748 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:748
 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441439
Code: e8 9c ae 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff61c37438 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441439
RDX: 0000000020000400 RSI: 00000000400454ca RDI: 0000000000000004
RBP: 00007fff61c37470 R08: 0000000000000001 R09: 0000000100000000
R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
R13: 0000000000000005 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..

Fixes: ff92741270bf ("net: introduce name_node struct to be used in hashlist")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonetdevsim: fix spelling mistake "forbidded" -> "forbid"
Colin Ian King [Tue, 8 Oct 2019 08:17:47 +0000 (09:17 +0100)]
netdevsim: fix spelling mistake "forbidded" -> "forbid"

There is a spelling mistake in a NL_SET_ERR_MSG_MOD message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonet: phy: mscc: make arrays static, makes object smaller
Colin Ian King [Mon, 7 Oct 2019 12:03:08 +0000 (13:03 +0100)]
net: phy: mscc: make arrays static, makes object smaller

Don't populate const arrays on the stack but instead make them
static. Makes the object code smaller by 1058 bytes.

Before:
   text    data     bss     dec     hex filename
  29879    6144       0   36023    8cb7 drivers/net/phy/mscc.o

After:
   text    data     bss     dec     hex filename
  28437    6528       0   34965    8895 drivers/net/phy/mscc.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agonfp: bpf: make array exp_mask static, makes object smaller
Colin Ian King [Mon, 7 Oct 2019 11:52:39 +0000 (12:52 +0100)]
nfp: bpf: make array exp_mask static, makes object smaller

Don't populate the array exp_mask on the stack but instead make it
static. Makes the object code smaller by 224 bytes.

Before:
   text    data     bss     dec     hex filename
  77832    2290       0   80122   138fa ethernet/netronome/nfp/bpf/jit.o

After:
   text    data     bss     dec     hex filename
  77544    2354       0   79898   1381a ethernet/netronome/nfp/bpf/jit.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
5 years agospi: Add a PTP system timestamp to the transfer structure
Vladimir Oltean [Thu, 5 Sep 2019 01:01:12 +0000 (04:01 +0300)]
spi: Add a PTP system timestamp to the transfer structure

SPI is one of the interfaces used to access devices which have a POSIX
clock driver (real time clocks, 1588 timers etc). The fact that the SPI
bus is slow is not what the main problem is, but rather the fact that
drivers don't take a constant amount of time in transferring data over
SPI. When there is a high delay in the readout of time, there will be
uncertainty in the value that has been read out of the peripheral.
When that delay is constant, the uncertainty can at least be
approximated with a certain accuracy which is fine more often than not.

Timing jitter occurs all over in the kernel code, and is mainly caused
by having to let go of the CPU for various reasons such as preemption,
servicing interrupts, going to sleep, etc. Another major reason is CPU
dynamic frequency scaling.

It turns out that the problem of retrieving time from a SPI peripheral
with high accuracy can be solved by the use of "PTP system
timestamping" - a mechanism to correlate the time when the device has
snapshotted its internal time counter with the Linux system time at that
same moment. This is sufficient for having a precise time measurement -
it is not necessary for the whole SPI transfer to be transmitted "as
fast as possible", or "as low-jitter as possible". The system has to be
low-jitter for a very short amount of time to be effective.

This patch introduces a PTP system timestamping mechanism in struct
spi_transfer. This is to be used by SPI device drivers when they need to
know the exact time at which the underlying device's time was
snapshotted. More often than not, SPI peripherals have a very exact
timing for when their SPI-to-interconnect bridge issues a transaction
for snapshotting and reading the time register, and that will be
dependent on when the SPI-to-interconnect bridge figures out that this
is what it should do, aka as soon as it sees byte N of the SPI transfer.
Since spi_device drivers are the ones who'd know best how the peripheral
behaves in this regard, expose a mechanism in spi_transfer which allows
them to specify which word (or word range) from the transfer should be
timestamped.

Add a default implementation of the PTP system timestamping in the SPI
core. This is not going to be satisfactory performance-wise, but should
at least increase the likelihood that SPI device drivers will use PTP
system timestamping in the future.
There are 3 entry points from the core towards the SPI controller
drivers:

- transfer_one: The driver is passed individual spi_transfers to
  execute. This is the easiest to timestamp.

- transfer_one_message: The core passes the driver an entire spi_message
  (a potential batch of spi_transfers). The core puts the same pre and
  post timestamp to all transfers within a message. This is not ideal,
  but nothing better can be done by default anyway, since the core has
  no insight into how the driver batches the transfers.

- transfer: Like transfer_one_message, but for unqueued drivers (i.e.
  the driver implements its own queue scheduling).

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20190905010114.26718-3-olteanv@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
5 years agoMerge branch 'dpaa2-eth-misc-cleanup'
David S. Miller [Mon, 7 Oct 2019 14:08:09 +0000 (10:08 -0400)]
Merge branch 'dpaa2-eth-misc-cleanup'

Ioana Ciornei says:

====================
dpaa2-eth: misc cleanup

This patch set consists of some cleanup patches ranging from removing dead
code to fixing a minor issue in ethtool stats. Also, unbounded while loops
are removed from the driver by adding a maximum number of retries for DPIO
portal commands.

Changes in v2:
 - return -ETIMEDOUT where possible if the number of retries is hit
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: Avoid unbounded while loops
Ioana Radulescu [Mon, 7 Oct 2019 11:38:28 +0000 (14:38 +0300)]
dpaa2-eth: Avoid unbounded while loops

Throughout the driver there are several places where we wait
indefinitely for DPIO portal commands to be executed, while
the portal returns a busy response code.

Even though in theory we are guaranteed the portals become
available eventually, in practice the QBMan hardware module
may become unresponsive in various corner cases.

Make sure we can never get stuck in an infinite while loop
by adding a retry counter for all portal commands.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: Fix minor bug in ethtool stats reporting
Ioana Radulescu [Mon, 7 Oct 2019 11:38:27 +0000 (14:38 +0300)]
dpaa2-eth: Fix minor bug in ethtool stats reporting

Don't print error message for a successful return value.

Fixes: d84c3a4ded96 ("dpaa2-eth: Add new DPNI statistics counters")
Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: Cleanup dead code
Ioana Radulescu [Mon, 7 Oct 2019 11:38:26 +0000 (14:38 +0300)]
dpaa2-eth: Cleanup dead code

Remove one function call whose result was not used anywhere.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns3: make array tick_array static, makes object smaller
Colin Ian King [Mon, 7 Oct 2019 11:09:35 +0000 (12:09 +0100)]
net: hns3: make array tick_array static, makes object smaller

Don't populate the array tick_array on the stack but instead make it
static. Makes the object code smaller by 29 bytes.

Before:
   text    data     bss     dec     hex filename
  19191     432       0   19623    4ca7 hisilicon/hns3/hns3pf/hclge_tm.o

After:
   text    data     bss     dec     hex filename
  19098     496       0   19594    4c8a hisilicon/hns3/hns3pf/hclge_tm.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: hns: make arrays static, makes object smaller
Colin Ian King [Mon, 7 Oct 2019 10:55:10 +0000 (11:55 +0100)]
net: hns: make arrays static, makes object smaller

Don't populate the arrays port_map and sl_map on the stack but
instead make them static. Makes the object code smaller by 64 bytes.

Before:
   text    data     bss     dec     hex filename
  49575    6872      64   56511    dcbf hisilicon/hns/hns_dsaf_main.o

After:
   text    data     bss     dec     hex filename
  49350    7032      64   56446    dc7e hisilicon/hns/hns_dsaf_main.o

(gcc version 9.2.1, amd64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-tls-minor-micro-optimizations'
David S. Miller [Mon, 7 Oct 2019 13:58:28 +0000 (09:58 -0400)]
Merge branch 'net-tls-minor-micro-optimizations'

Jakub Kicinski says:

====================
net/tls: minor micro optimizations

This set brings a number of minor code changes from my tree which
don't have a noticeable impact on performance but seem reasonable
nonetheless.

First sk_msg_sg copy array is converted to a bitmap, zeroing that
structure takes a lot of time, hence we should try to keep it
small.

Next two conditions are marked as unlikely, GCC seemed to had
little trouble correctly reasoning about those.

Patch 4 adds parameters to tls_device_decrypted() to avoid
walking structures, as all callers already have the relevant
pointers.

Lastly two boolean members of TLS context structures are
converted to a bitfield.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: store decrypted on a single bit
Jakub Kicinski [Mon, 7 Oct 2019 04:09:32 +0000 (21:09 -0700)]
net/tls: store decrypted on a single bit

Use a single bit instead of boolean to remember if packet
was already decrypted.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: store async_capable on a single bit
Jakub Kicinski [Mon, 7 Oct 2019 04:09:31 +0000 (21:09 -0700)]
net/tls: store async_capable on a single bit

Store async_capable on a single bit instead of a full integer
to save space.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: pass context to tls_device_decrypted()
Jakub Kicinski [Mon, 7 Oct 2019 04:09:30 +0000 (21:09 -0700)]
net/tls: pass context to tls_device_decrypted()

Avoid unnecessary pointer chasing and calculations, callers already
have most of the state tls_device_decrypted() needs.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: make allocation failure unlikely
Jakub Kicinski [Mon, 7 Oct 2019 04:09:29 +0000 (21:09 -0700)]
net/tls: make allocation failure unlikely

Make sure GCC realizes it's unlikely that allocations will fail.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: mark sk->err being set as unlikely
Jakub Kicinski [Mon, 7 Oct 2019 04:09:28 +0000 (21:09 -0700)]
net/tls: mark sk->err being set as unlikely

Tell GCC sk->err is not likely to be set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sockmap: use bitmap for copy info
Jakub Kicinski [Mon, 7 Oct 2019 04:09:27 +0000 (21:09 -0700)]
net: sockmap: use bitmap for copy info

Don't use bool array in struct sk_msg_sg, save 12 bytes.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: core: use helper skb_ensure_writable in more places
Heiner Kallweit [Sun, 6 Oct 2019 16:52:43 +0000 (18:52 +0200)]
net: core: use helper skb_ensure_writable in more places

Use helper skb_ensure_writable in two more places to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Make ipv6_mc_may_pull() return bool.
David S. Miller [Mon, 7 Oct 2019 13:37:27 +0000 (09:37 -0400)]
ipv6: Make ipv6_mc_may_pull() return bool.

Consistent with how pskb_may_pull() also now does so.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: core: change return type of pskb_may_pull to bool
Heiner Kallweit [Sun, 6 Oct 2019 16:19:54 +0000 (18:19 +0200)]
net: core: change return type of pskb_may_pull to bool

This function de-facto returns a bool, so let's change the return type
accordingly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ena-set_channels'
David S. Miller [Mon, 7 Oct 2019 13:30:03 +0000 (09:30 -0400)]
Merge branch 'ena-set_channels'

Sameeh Jubran says:

====================
ena: Support ethtool set_channels

Difference from v2:
* ethtool's set/get channels: Switched to using combined instead of
  separate rx/tx
* Fixed error handling in set_channels
* Fixed indentation and cosmetic issues as requested by Jakub Kicinski

Difference from v1:
* Dropped the print from patch 0002 - "net: ena: multiple queue creation
  related cleanups" as requested by David Miller
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: ethtool: support set_channels callback
Sameeh Jubran [Sun, 6 Oct 2019 12:33:28 +0000 (15:33 +0300)]
net: ena: ethtool: support set_channels callback

Set channels callback enables the user to change the count of queues
used by the driver using ethtool. We decided to currently support only
equal number of rx and tx queues, this might change in the future.

Also rename dev_up to dev_was_up in ena_update_queue_count() to make
it clearer.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: remove redundant print of number of queues
Sameeh Jubran [Sun, 6 Oct 2019 12:33:27 +0000 (15:33 +0300)]
net: ena: remove redundant print of number of queues

The number of queues can be derived using ethtool, no need to print
it in ena_probe()

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: make ethtool -l show correct max number of queues
Sameeh Jubran [Sun, 6 Oct 2019 12:33:26 +0000 (15:33 +0300)]
net: ena: make ethtool -l show correct max number of queues

- Update ena_ethtool:ena_get_channels() to return adapter->max_io_queues
  so that ethtool -l returns the correct maximum queue number.

- Change the name of ena_calc_io_queue_num() to
  ena_calc_max_io_queue_num() as it returns the maximum number of io
  queues and actual number of queues can be smaller if changed
  by ethtool -L which is implemented in a later commit.

- Change variable name from io_queue_num to max_num_io_queues in
  ena_calc_max_io_queue_num() and ena_probe().

- Make all types of variables that convey the number and sizeof queues
  to be u32, for consistency with the API between the driver and the
  device.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: ethtool: get_channels: use combined only
Sameeh Jubran [Sun, 6 Oct 2019 12:33:25 +0000 (15:33 +0300)]
net: ena: ethtool: get_channels: use combined only

Since we use the same IRQ and NAPI to service RX and TX then we need to
use a combined channel instead of rx and tx channels.

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: multiple queue creation related cleanups
Sameeh Jubran [Sun, 6 Oct 2019 12:33:24 +0000 (15:33 +0300)]
net: ena: multiple queue creation related cleanups

- Rename ena_calc_queue_size() to ena_calc_io_queue_size() for clarity
  and consistency
- Remove redundant number of io queues parameter in functions
  ena_enable_msix() and ena_enable_msix_and_set_admin_interrupts(),
  which already get adapter parameter, so use adapter->num_io_queues
  in the function instead.
- Use the local variable ena_dev instead of ctx->ena_dev in
  ena_calc_io_queue_size
- Fix multi row comment alignments

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ena: change num_queues to num_io_queues for clarity and consistency
Sameeh Jubran [Sun, 6 Oct 2019 12:33:23 +0000 (15:33 +0300)]
net: ena: change num_queues to num_io_queues for clarity and consistency

Most places in the code refer to the IO queues as io_queues and not
simply queues. Examples - max_io_queues_per_vf, ENA_MAX_NUM_IO_QUEUES,
ena_destroy_all_io_queues() etc..

We are also adding the new max_num_io_queues field to struct ena_adapter
in the following commit.

The changes included in this commit are:
struct ena_adapter->num_queues => struct ena_adapter->num_io_queues

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'samples-pktgen-allow-to-specify-destination-IP-range'
David S. Miller [Mon, 7 Oct 2019 13:26:32 +0000 (09:26 -0400)]
Merge branch 'samples-pktgen-allow-to-specify-destination-IP-range'

Daniel T. Lee says:

====================
samples: pktgen: allow to specify destination IP range

Currently, pktgen script supports specify destination port range.

To further extend the capabilities, this commit allows to specify destination
IP range with CIDR when running pktgen script.

Specifying destination IP range will be useful on various situation such as
testing RSS/RPS with randomizing n-tuple.

This patchset fixes the problem with checking the command result on proc_cmd,
and add feature to allow destination IP range.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosamples: pktgen: allow to specify destination IP range (CIDR)
Daniel T. Lee [Sat, 5 Oct 2019 08:25:09 +0000 (17:25 +0900)]
samples: pktgen: allow to specify destination IP range (CIDR)

Currently, kernel pktgen has the feature to specify destination
address range for sending packet. (e.g. pgset "dst_min/dst_max")

But on samples, each pktgen script doesn't have any option to achieve this.

This commit adds the feature to specify the destination address range with CIDR.

    -d : ($DEST_IP)   destination IP. CIDR (e.g. 198.18.0.0/15) is also allowed

    # ./pktgen_sample01_simple.sh -6 -d fe80::20/126 -p 3000 -n 4
    # tcpdump ip6 and udp
    05:14:18.082285 IP6 fe80::99.71 > fe80::23.3000: UDP, length 16
    05:14:18.082564 IP6 fe80::99.43 > fe80::23.3000: UDP, length 16
    05:14:18.083366 IP6 fe80::99.107 > fe80::22.3000: UDP, length 16
    05:14:18.083585 IP6 fe80::99.97 > fe80::21.3000: UDP, length 16

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosamples: pktgen: add helper functions for IP(v4/v6) CIDR parsing
Daniel T. Lee [Sat, 5 Oct 2019 08:25:08 +0000 (17:25 +0900)]
samples: pktgen: add helper functions for IP(v4/v6) CIDR parsing

This commit adds CIDR parsing and IP validate helper function to parse
single IP or range of IP with CIDR. (e.g. 198.18.0.0/15)

Validating the address should be preceded prior to the parsing.
Helpers will be used in prior to set target address in samples/pktgen.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosamples: pktgen: fix proc_cmd command result check logic
Daniel T. Lee [Sat, 5 Oct 2019 08:25:07 +0000 (17:25 +0900)]
samples: pktgen: fix proc_cmd command result check logic

Currently, proc_cmd is used to dispatch command to 'pg_ctrl', 'pg_thread',
'pg_set'. proc_cmd is designed to check command result with grep the
"Result:", but this might fail since this string is only shown in
'pg_thread' and 'pg_set'.

This commit fixes this logic by grep-ing the "Result:" string only when
the command is not for 'pg_ctrl'.

For clarity of an execution flow, 'errexit' flag has been set.

To cleanup pktgen on exit, trap has been added for EXIT signal.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosamples: pktgen: make variable consistent with option
Daniel T. Lee [Sat, 5 Oct 2019 08:25:06 +0000 (17:25 +0900)]
samples: pktgen: make variable consistent with option

This commit changes variable names that can cause confusion.

For example, variable DST_MIN is quite confusing since the
keyword 'udp_dst_min' and keyword 'dst_min' is used with pg_ctrl.

On the following commit, 'dst_min' will be used to set destination IP,
and the existing variable name DST_MIN should be changed.

Variable names are matched to the exact keyword used with pg_ctrl.

Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'netdevsim-implement-devlink-dev_info-op'
David S. Miller [Mon, 7 Oct 2019 13:11:07 +0000 (09:11 -0400)]
Merge branch 'netdevsim-implement-devlink-dev_info-op'

Jiri Pirko says:

====================
netdevsim: implement devlink dev_info op

Initial implementation of devlink dev_info op - just driver name is
filled up and sent to user. Bundled with selftest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: add netdevsim devlink dev info test
Jiri Pirko [Mon, 7 Oct 2019 08:27:09 +0000 (10:27 +0200)]
selftests: add netdevsim devlink dev info test

Add test to verify netdevsim driver name returned by devlink dev info.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetdevsim: implement devlink dev_info op
Jiri Pirko [Mon, 7 Oct 2019 08:27:08 +0000 (10:27 +0200)]
netdevsim: implement devlink dev_info op

Do simple dev_info devlink operation implementation which only fills up
the driver name.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: devlink: fix reporter dump dumpit
Jiri Pirko [Mon, 7 Oct 2019 07:28:31 +0000 (09:28 +0200)]
net: devlink: fix reporter dump dumpit

In order for attrs to be prepared for reporter dump dumpit callback,
set GENL_DONT_VALIDATE_DUMP_STRICT instead of GENL_DONT_VALIDATE_DUMP.

Fixes: ee85da535fe3 ("devlink: have genetlink code to parse the attrs during dumpit"
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'stmmac-next'
David S. Miller [Sun, 6 Oct 2019 16:46:31 +0000 (18:46 +0200)]
Merge branch 'stmmac-next'

Jose Abreu says:

====================
net: stmmac: Improvements for -next

Improvements for -next. More info in commit logs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Implement L3/L4 Filters in GMAC4+
Jose Abreu [Sun, 6 Oct 2019 11:17:14 +0000 (13:17 +0200)]
net: stmmac: Implement L3/L4 Filters in GMAC4+

GMAC4+ cores support Layer 3 and Layer 4 filtering. Add the
corresponding callbacks in these cores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: selftests: Add tests for VLAN Perfect Filtering
Jose Abreu [Sun, 6 Oct 2019 11:17:13 +0000 (13:17 +0200)]
net: stmmac: selftests: Add tests for VLAN Perfect Filtering

Add two new tests for VLAN Perfect Filtering. While at it, increase a
little bit the tests strings lenght so that we can have more descriptive
test names.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Fallback to VLAN Perfect filtering if HASH is not available
Jose Abreu [Sun, 6 Oct 2019 11:17:12 +0000 (13:17 +0200)]
net: stmmac: Fallback to VLAN Perfect filtering if HASH is not available

If VLAN Hash Filtering is not available we can fallback to perfect
filtering instead. Let's implement this in XGMAC and GMAC cores and let
the user use this filter.

VLAN VID=0 always passes filter so we check if more than 2 VLANs are
created and return proper error code if so because perfect filtering
only supports 1 VID at a time.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfc: s3fwrn5: fix platform_no_drv_owner.cocci warning
YueHaibing [Sun, 6 Oct 2019 10:53:52 +0000 (18:53 +0800)]
nfc: s3fwrn5: fix platform_no_drv_owner.cocci warning

Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfc: nfcmrvl: fix platform_no_drv_owner.cocci warning
YueHaibing [Sun, 6 Oct 2019 10:52:17 +0000 (18:52 +0800)]
nfc: nfcmrvl: fix platform_no_drv_owner.cocci warning

Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: ksz9477: fix platform_no_drv_owner.cocci warning
YueHaibing [Sun, 6 Oct 2019 10:50:07 +0000 (18:50 +0800)]
net: dsa: ksz9477: fix platform_no_drv_owner.cocci warning

Remove .owner field if calls are used which set it automatically
Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/rds: Add missing include file
YueHaibing [Sun, 6 Oct 2019 07:08:32 +0000 (15:08 +0800)]
net/rds: Add missing include file

Fix build error:

net/rds/ib_cm.c: In function rds_dma_hdrs_alloc:
net/rds/ib_cm.c:475:13: error: implicit declaration of function dma_pool_zalloc; did you mean mempool_alloc? [-Werror=implicit-function-declaration]
   hdrs[i] = dma_pool_zalloc(pool, GFP_KERNEL, &hdr_daddrs[i]);
             ^~~~~~~~~~~~~~~
             mempool_alloc

net/rds/ib.c: In function rds_ib_dev_free:
net/rds/ib.c:111:3: error: implicit declaration of function dma_pool_destroy; did you mean mempool_destroy? [-Werror=implicit-function-declaration]
   dma_pool_destroy(rds_ibdev->rid_hdrs_pool);
   ^~~~~~~~~~~~~~~~
   mempool_destroy

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 9b17f5884be4 ("net/rds: Use DMA memory pool allocation for rds_header")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-Query-number-of-modules-from-firmware'
David S. Miller [Sun, 6 Oct 2019 16:31:40 +0000 (18:31 +0200)]
Merge branch 'mlxsw-Query-number-of-modules-from-firmware'

Ido Schimmel says:

====================
mlxsw: Query number of modules from firmware

Vadim says:

The patchset adds support for a new field "num_of_modules" of Management
General Peripheral Information Register (MGPIR), providing the maximum
number of QSFP modules, which can be supported by the system.

It allows to obtain the number of QSFP modules directly from this field,
as a static data, instead of old method of getting this info through
"network port to QSFP module" mapping. With the old method, in case of
port dynamic re-configuration some modules can logically "disappear" as
a result of port split operations, which can cause some modules to
appear missing.

Such scenario can happen on a system equipped with a BMC card, while PCI
chip driver at host CPU side can perform some ports "split" or "unsplit"
operations, while BMC side I2C chip driver reads the "port-to-module"
mapping.

Add common API for FW "minor" and "subminor" versions validation and
share it between PCI and I2C based drivers.

Add FW version validation for "minimal" driver, because use of new field
"num_of_modules" in MGPIR register is not backward compatible.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: minimal: Add validation for FW version
Vadim Pasternak [Sun, 6 Oct 2019 06:34:52 +0000 (09:34 +0300)]
mlxsw: minimal: Add validation for FW version

Add validation for FW version in order to prevent driver initialization
in case FW version is older than expected. FW version validation is
necessary, because use of a new field 'num_of_modules' in MGPIR register
is not backward compatible. FW 'minor' and 'subminor' versions are
expected to be greater than or equal to 2000 and 1886, respectively.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: core: Push minor/subminor fw version check into helper
Vadim Pasternak [Sun, 6 Oct 2019 06:34:51 +0000 (09:34 +0300)]
mlxsw: core: Push minor/subminor fw version check into helper

Add new API for FW "minor" and "subminor" version validation for
sharing it between "spectrum" and "minimal" drivers.
Use it in "spectrum" driver.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: thermal: Provide optimization for QSFP modules number detection
Vadim Pasternak [Sun, 6 Oct 2019 06:34:50 +0000 (09:34 +0300)]
mlxsw: thermal: Provide optimization for QSFP modules number detection

Use new field "num_of_modules" of MGPIR register for "thermal" interface
in order to get the number of modules supported by system directly from
the system configuration, instead of getting it from port to module
mapping info.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: hwmon: Provide optimization for QSFP modules number detection
Vadim Pasternak [Sun, 6 Oct 2019 06:34:49 +0000 (09:34 +0300)]
mlxsw: hwmon: Provide optimization for QSFP modules number detection

Use new field "num_of_modules" of MGPIR register for "hwmon" interface
in order to get the number of modules supported by system directly from
the system configuration, instead of getting it from port to module
mapping info.

Reading this info through MGPIR register is faster and does not depend
on possible dynamic re-configuration of ports.
In case of port dynamic re-configuration some modules can logically
"disappear" as a result of port split and un-spilt operations, which
can cause missing of some modules, in case this info is taken from port
to module mapping info.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: reg: Extend MGPIR register with new field exposing the number of QSFP modules
Vadim Pasternak [Sun, 6 Oct 2019 06:34:48 +0000 (09:34 +0300)]
mlxsw: reg: Extend MGPIR register with new field exposing the number of QSFP modules

Extend MGPIR - Management General Peripheral Information Register
with new field "num_of_modules" exposing the number of modules
supported by specific system.

Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'netdevsim-allow-to-test-reload-failures'
David S. Miller [Sun, 6 Oct 2019 16:28:42 +0000 (18:28 +0200)]
Merge branch 'netdevsim-allow-to-test-reload-failures'

Jiri Pirko says:

====================
netdevsim: allow to test reload failures

Allow user to test devlink reload failures: Fail to reload and fail
during reload.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: test netdevsim reload forbid and fail
Jiri Pirko [Sun, 6 Oct 2019 06:30:02 +0000 (08:30 +0200)]
selftests: test netdevsim reload forbid and fail

Extend netdevsim reload test by simulation of failures.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetdevsim: add couple of debugfs bools to debug devlink reload
Jiri Pirko [Sun, 6 Oct 2019 06:30:01 +0000 (08:30 +0200)]
netdevsim: add couple of debugfs bools to debug devlink reload

Add flag to disallow reload and another one that causes reload to
always fail.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-genetlink-parse-attrs-for-dumpit-callback'
David S. Miller [Sun, 6 Oct 2019 13:44:47 +0000 (15:44 +0200)]
Merge branch 'net-genetlink-parse-attrs-for-dumpit-callback'

Jiri Pirko says:

====================
net: genetlink: parse attrs for dumpit() callback

In generic netlink, parsing attributes for doit() callback is already
implemented. They are available in info->attrs.

For dumpit() however, each user which is interested in attributes have to
parse it manually. Even though the attributes may be (depending on flag)
already validated (by parse function).

Make usage of attributes in dumpit() more convenient and prepare
info->attrs too.

Patchset also make the existing users of genl_family_attrbuf() converted
to use info->attrs and removes the helper.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: have genetlink code to parse the attrs during dumpit
Jiri Pirko [Sat, 5 Oct 2019 18:04:42 +0000 (20:04 +0200)]
devlink: have genetlink code to parse the attrs during dumpit

Benefit from the fact that the generic netlink code can parse the attrs
for dumpit op and avoid need to parse it in the op callback.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: genetlink: remove unused genl_family_attrbuf()
Jiri Pirko [Sat, 5 Oct 2019 18:04:41 +0000 (20:04 +0200)]
net: genetlink: remove unused genl_family_attrbuf()

genl_family_attrbuf() function is no longer used by anyone, so remove it.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: tipc: allocate attrs locally instead of using genl_family_attrbuf in compat_dumpit()
Jiri Pirko [Sat, 5 Oct 2019 18:04:40 +0000 (20:04 +0200)]
net: tipc: allocate attrs locally instead of using genl_family_attrbuf in compat_dumpit()

As this is the last user of genl_family_attrbuf, convert to allocate
attrs locally and do it in a similar way this is done in compat_doit().

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: tipc: have genetlink code to parse the attrs during dumpit
Jiri Pirko [Sat, 5 Oct 2019 18:04:39 +0000 (20:04 +0200)]
net: tipc: have genetlink code to parse the attrs during dumpit

Benefit from the fact that the generic netlink code can parse the attrs
for dumpit op and avoid need to parse it in the op callback.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: nfc: have genetlink code to parse the attrs during dumpit
Jiri Pirko [Sat, 5 Oct 2019 18:04:38 +0000 (20:04 +0200)]
net: nfc: have genetlink code to parse the attrs during dumpit

Benefit from the fact that the generic netlink code can parse the attrs
for dumpit op and avoid need to parse it in the op callback.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ieee802154: have genetlink code to parse the attrs during dumpit
Jiri Pirko [Sat, 5 Oct 2019 18:04:37 +0000 (20:04 +0200)]
net: ieee802154: have genetlink code to parse the attrs during dumpit

Benefit from the fact that the generic netlink code can parse the attrs
for dumpit op and avoid need to parse it in the op callback.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: genetlink: parse attrs and store in contect info struct during dumpit
Jiri Pirko [Sat, 5 Oct 2019 18:04:36 +0000 (20:04 +0200)]
net: genetlink: parse attrs and store in contect info struct during dumpit

Extend the dumpit info struct for attrs. Instead of existing attribute
validation do parse them and save in the info struct. Caller can benefit
from this and does not have to do parse itself. In order to properly
free attrs, genl_family pointer needs to be added to dumpit info struct
as well.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: genetlink: push attrbuf allocation and parsing to a separate function
Jiri Pirko [Sat, 5 Oct 2019 18:04:35 +0000 (20:04 +0200)]
net: genetlink: push attrbuf allocation and parsing to a separate function

To be re-usable by dumpit as well, push the code that is taking care of
attrbuf allocation and parting from doit into separate function.
Introduce a helper to free the buffer too.

Check family->maxattr too before calling kfree() to be symmetrical with
the allocation check.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: genetlink: introduce dump info struct to be available during dumpit op
Jiri Pirko [Sat, 5 Oct 2019 18:04:34 +0000 (20:04 +0200)]
net: genetlink: introduce dump info struct to be available during dumpit op

Currently the cb->data is taken by ops during non-parallel dumping.
Introduce a new structure genl_dumpit_info and store the ops there.
Distribute the info to both non-parallel and parallel dumping. Also add
a helper genl_dumpit_info() to easily get the info structure in the
dumpit callback from cb.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: genetlink: push doit/dumpit code from genl_family_rcv_msg
Jiri Pirko [Sat, 5 Oct 2019 18:04:33 +0000 (20:04 +0200)]
net: genetlink: push doit/dumpit code from genl_family_rcv_msg

Currently the function genl_family_rcv_msg() is quite big. Since it is
quite convenient, push code that is related to doit and dumpit ops into
separate functions.

Do small changes on the way, like rc/err unification, NULL check etc.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoopenvswitch: Allow attaching helper in later commit
Yi-Hung Wei [Fri, 4 Oct 2019 16:26:44 +0000 (09:26 -0700)]
openvswitch: Allow attaching helper in later commit

This patch allows to attach conntrack helper to a confirmed conntrack
entry.  Currently, we can only attach alg helper to a conntrack entry
when it is in the unconfirmed state.  This patch enables an use case
that we can firstly commit a conntrack entry after it passed some
initial conditions.  After that the processing pipeline will further
check a couple of packets to determine if the connection belongs to
a particular application, and attach alg helper to the connection
in a later stage.

Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'create-netdevsim-instances-in-namespace'
David S. Miller [Sat, 5 Oct 2019 23:34:15 +0000 (16:34 -0700)]
Merge branch 'create-netdevsim-instances-in-namespace'

Jiri Pirko says:

====================
create netdevsim instances in namespace

Allow user to create netdevsim devlink and netdevice instances in a
network namespace according to the namespace where the user resides in.
Add a selftest to test this.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: test creating netdevsim inside network namespace
Jiri Pirko [Sat, 5 Oct 2019 06:10:33 +0000 (08:10 +0200)]
selftests: test creating netdevsim inside network namespace

Add a test that creates netdevsim instance inside network namespace
and verifies that the related devlink instance and port netdevices
reside in the namespace.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetdevsim: create devlink and netdev instances in namespace
Jiri Pirko [Sat, 5 Oct 2019 06:10:32 +0000 (08:10 +0200)]
netdevsim: create devlink and netdev instances in namespace

When user does create new netdevsim instance using sysfs bus file,
create the devlink instance and related netdev instance in the namespace
of the caller.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: devlink: export devlink net setter
Jiri Pirko [Sat, 5 Oct 2019 06:10:31 +0000 (08:10 +0200)]
net: devlink: export devlink net setter

For newly allocated devlink instance allow drivers to set net struct

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-tls-add-ctrl-path-tracing-and-statistics'
David S. Miller [Sat, 5 Oct 2019 23:29:01 +0000 (16:29 -0700)]
Merge branch 'net-tls-add-ctrl-path-tracing-and-statistics'

Jakub Kicinski says:

====================
net/tls: add ctrl path tracing and statistics

This set adds trace events related to TLS offload and basic MIB stats
for TLS.

First patch contains the TLS offload related trace points. Those are
helpful in troubleshooting offload issues, especially around the
resync paths.

Second patch adds a tracepoint to the fastpath of device offload,
it's separated out in case there will be objections to adding
fast path tracepoints. Again, it's quite useful for debugging
offload issues.

Next four patches add MIB statistics. The statistics are implemented
as per-cpu per-netns counters. Since there are currently no fast path
statistics we could move to atomic variables. Per-CPU seem more common.

Most basic statistics are number of created and live sessions, broken
out to offloaded and non-offloaded. Users seem to like those a lot.

Next there is a statistic for decryption errors. These are primarily
useful for device offload debug, in normal deployments decryption
errors should not be common.

Last but not least a counter for device RX resync.
====================

Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: add TlsDeviceRxResync statistic
Jakub Kicinski [Fri, 4 Oct 2019 23:19:27 +0000 (16:19 -0700)]
net/tls: add TlsDeviceRxResync statistic

Add a statistic for number of RX resyncs sent down to the NIC.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: add TlsDecryptError stat
Jakub Kicinski [Fri, 4 Oct 2019 23:19:26 +0000 (16:19 -0700)]
net/tls: add TlsDecryptError stat

Add a statistic for TLS record decryption errors.

Since devices are supposed to pass records as-is when they
encounter errors this statistic will count bad records in
both pure software and inline crypto configurations.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: add statistics for installed sessions
Jakub Kicinski [Fri, 4 Oct 2019 23:19:25 +0000 (16:19 -0700)]
net/tls: add statistics for installed sessions

Add SNMP stats for number of sockets with successfully
installed sessions.  Break them down to software and
hardware ones.  Note that if hardware offload fails
stack uses software implementation, and counts the
session appropriately.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: add skeleton of MIB statistics
Jakub Kicinski [Fri, 4 Oct 2019 23:19:24 +0000 (16:19 -0700)]
net/tls: add skeleton of MIB statistics

Add a skeleton structure for adding TLS statistics.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: add device decrypted trace point
Jakub Kicinski [Fri, 4 Oct 2019 23:19:23 +0000 (16:19 -0700)]
net/tls: add device decrypted trace point

Add a tracepoint to the TLS offload's fast path. This tracepoint
can be used to track the decrypted and encrypted status of received
records. Records decrypted by the device should have decrypted set
to 1, records which have neither decrypted nor decrypted set are
partially decrypted, require re-encryption and therefore are most
expensive to deal with.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: add tracing for device/offload events
Jakub Kicinski [Fri, 4 Oct 2019 23:19:22 +0000 (16:19 -0700)]
net/tls: add tracing for device/offload events

Add tracing of device-related interaction to aid performance
analysis, especially around resync:

 tls:tls_device_offload_set
 tls:tls_device_rx_resync_send
 tls:tls_device_rx_resync_nh_schedule
 tls:tls_device_rx_resync_nh_delay
 tls:tls_device_tx_resync_req
 tls:tls_device_tx_resync_send

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
David S. Miller [Sat, 5 Oct 2019 20:37:23 +0000 (13:37 -0700)]
Merge git://git./linux/kernel/git/netdev/net

5 years agoMerge tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahi...
Linus Torvalds [Sat, 5 Oct 2019 19:56:59 +0000 (12:56 -0700)]
Merge tag 'kbuild-fixes-v5.4' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - remove unneeded ar-option and KBUILD_ARFLAGS

 - remove long-deprecated SUBDIRS

 - fix modpost to suppress false-positive warnings for UML builds

 - fix namespace.pl to handle relative paths to ${objtree}, ${srctree}

 - make setlocalversion work for /bin/sh

 - make header archive reproducible

 - fix some Makefiles and documents

* tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kheaders: make headers archive reproducible
  kbuild: update compile-test header list for v5.4-rc2
  kbuild: two minor updates for Documentation/kbuild/modules.rst
  scripts/setlocalversion: clear local variable to make it work for sh
  namespace: fix namespace.pl script to support relative paths
  video/logo: do not generate unneeded logo C files
  video/logo: remove unneeded *.o pattern from clean-files
  integrity: remove pointless subdir-$(CONFIG_...)
  integrity: remove unneeded, broken attempt to add -fshort-wchar
  modpost: fix static EXPORT_SYMBOL warnings for UML build
  kbuild: correct formatting of header in kbuild module docs
  kbuild: remove SUBDIRS support
  kbuild: remove ar-option and KBUILD_ARFLAGS

5 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 5 Oct 2019 19:53:27 +0000 (12:53 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Twelve patches mostly small but obvious fixes or cosmetic but small
  updates"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix Nport ID display value
  scsi: qla2xxx: Fix N2N link up fail
  scsi: qla2xxx: Fix N2N link reset
  scsi: qla2xxx: Optimize NPIV tear down process
  scsi: qla2xxx: Fix stale mem access on driver unload
  scsi: qla2xxx: Fix unbound sleep in fcport delete path.
  scsi: qla2xxx: Silence fwdump template message
  scsi: hisi_sas: Make three functions static
  scsi: megaraid: disable device when probe failed after enabled device
  scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
  scsi: qedf: Remove always false 'tmp_prio < 0' statement
  scsi: ufs: skip shutdown if hba is not powered
  scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF

5 years agoMerge branch 'readdir' (readdir speedup and sanity checking)
Linus Torvalds [Sat, 5 Oct 2019 19:03:27 +0000 (12:03 -0700)]
Merge branch 'readdir' (readdir speedup and sanity checking)

This makes getdents() and getdents64() do sanity checking on the
pathname that it gives to user space.  And to mitigate the performance
impact of that, it first cleans up the way it does the user copying, so
that the code avoids doing the SMAP/PAN updates between each part of the
dirent structure write.

I really wanted to do this during the merge window, but didn't have
time.  The conversion of filldir to unsafe_put_user() is something I've
had around for years now in a private branch, but the extra pathname
checking finally made me clean it up to the point where it is mergable.

It's worth noting that the filename validity checking really should be a
bit smarter: it would be much better to delay the error reporting until
the end of the readdir, so that non-corrupted filenames are still
returned.  But that involves bigger changes, so let's see if anybody
actually hits the corrupt directory entry case before worrying about it
further.

* branch 'readdir':
  Make filldir[64]() verify the directory entry filename is valid
  Convert filldir[64]() from __put_user() to unsafe_put_user()

5 years agoMake filldir[64]() verify the directory entry filename is valid
Linus Torvalds [Sat, 5 Oct 2019 18:32:52 +0000 (11:32 -0700)]
Make filldir[64]() verify the directory entry filename is valid

This has been discussed several times, and now filesystem people are
talking about doing it individually at the filesystem layer, so head
that off at the pass and just do it in getdents{64}().

This is partially based on a patch by Jann Horn, but checks for NUL
bytes as well, and somewhat simplified.

There's also commentary about how it might be better if invalid names
due to filesystem corruption don't cause an immediate failure, but only
an error at the end of the readdir(), so that people can still see the
filenames that are ok.

There's also been discussion about just how much POSIX strictly speaking
requires this since it's about filesystem corruption.  It's really more
"protect user space from bad behavior" as pointed out by Jann.  But
since Eric Biederman looked up the POSIX wording, here it is for context:

 "From readdir:

   The readdir() function shall return a pointer to a structure
   representing the directory entry at the current position in the
   directory stream specified by the argument dirp, and position the
   directory stream at the next entry. It shall return a null pointer
   upon reaching the end of the directory stream. The structure dirent
   defined in the <dirent.h> header describes a directory entry.

  From definitions:

   3.129 Directory Entry (or Link)

   An object that associates a filename with a file. Several directory
   entries can associate names with the same file.

  ...

   3.169 Filename

   A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
   characters composing the name may be selected from the set of all
   character values excluding the slash character and the null byte. The
   filenames dot and dot-dot have special meaning. A filename is
   sometimes referred to as a 'pathname component'."

Note that I didn't bother adding the checks to any legacy interfaces
that nobody uses.

Also note that if this ends up being noticeable as a performance
regression, we can fix that to do a much more optimized model that
checks for both NUL and '/' at the same time one word at a time.

We haven't really tended to optimize 'memchr()', and it only checks for
one pattern at a time anyway, and we really _should_ check for NUL too
(but see the comment about "soft errors" in the code about why it
currently only checks for '/')

See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
lookup code looks for pathname terminating characters in parallel.

Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoConvert filldir[64]() from __put_user() to unsafe_put_user()
Linus Torvalds [Sun, 22 May 2016 04:59:07 +0000 (21:59 -0700)]
Convert filldir[64]() from __put_user() to unsafe_put_user()

We really should avoid the "__{get,put}_user()" functions entirely,
because they can easily be mis-used and the original intent of being
used for simple direct user accesses no longer holds in a post-SMAP/PAN
world.

Manually optimizing away the user access range check makes no sense any
more, when the range check is generally much cheaper than the "enable
user accesses" code that the __{get,put}_user() functions still need.

So instead of __put_user(), use the unsafe_put_user() interface with
user_access_{begin,end}() that really does generate better code these
days, and which is generally a nicer interface.  Under some loads, the
multiple user writes that filldir() does are actually quite noticeable.

This also makes the dirent name copy use unsafe_put_user() with a couple
of macros.  We do not want to make function calls with SMAP/PAN
disabled, and the code this generates is quite good when the
architecture uses "asm goto" for unsafe_put_user() like x86 does.

Note that this doesn't bother with the legacy cases.  Nobody should use
them anyway, so performance doesn't really matter there.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>