Julian Wiedmann [Thu, 27 Jun 2019 15:01:22 +0000 (17:01 +0200)]
s390/qeth: dynamically allocate simple IPA cmds
This patch reduces the usage of the write channel's static cmd buffers,
by dynamically allocating all simple IPA cmds (eg. STARTLAN, SETVMAC).
It also converts the OSN path.
Doing so requires some changes to how we calculate the cmd length.
Currently when building IPA cmds, we're quite generous in how much data
we send down to the device (basically the size of the biggest cmd we
know). This is no real concern at the moment, since the static cmd
buffers are backed with zeroed pages. But for dynamic allocations, the
exact length matters. So this patch also adds the needed length
calculations to each cmd path.
Commands that have multiple subtypes (eg. SETADP) of differing length
will be converted with follow-up patches.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Thu, 27 Jun 2019 13:04:52 +0000 (15:04 +0200)]
net/smc: common release code for non-accepted sockets
There are common steps when releasing an accepted or unaccepted socket.
Move this code into a common routine.
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 27 Jun 2019 16:54:35 +0000 (09:54 -0700)]
Merge branch 'net-ipv4-fix-circular-list-infinite-loop'
Florian Westphal says:
====================
net: ipv4: fix circular-list infinite loop
Tariq and Ran reported a regression caused by net-next commit
2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list").
This happens when net.ipv4.conf.$dev.promote_secondaries sysctl is
enabled -- we can arrange for ifa->next to point at ifa, so next
process that tries to walk the list loops forever.
Fix this and extend rtnetlink.sh with a small test case for this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Thu, 27 Jun 2019 12:03:33 +0000 (14:03 +0200)]
selftests: rtnetlink: add small test case with 'promote_secondaries' enabled
This exercises the 'promote_secondaries' code path.
Without previous fix, this triggers infinite loop/soft lockup:
ifconfig process spinning at 100%, never to return.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Thu, 27 Jun 2019 12:03:32 +0000 (14:03 +0200)]
net: ipv4: fix infinite loop on secondary addr promotion
secondary address promotion causes infinite loop -- it arranges
for ifa->ifa_next to point back to itself.
Problem is that 'prev_prom' and 'last_prim' might point at the same entry,
so 'last_sec' pointer must be obtained after prev_prom->next update.
Fixes: 2638eb8b50cf ("net: ipv4: provide __rcu annotation for ifa_list")
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maxime Chevallier [Thu, 27 Jun 2019 08:52:26 +0000 (10:52 +0200)]
net: ethtool: Allow parsing ETHER_FLOW types when using flow_rule
When parsing an ethtool_rx_flow_spec, users can specify an ethernet flow
which could contain matches based on the ethernet header, such as the
MAC address, the VLAN tag or the ethertype.
ETHER_FLOW uses the src and dst ethernet addresses, along with the
ethertype as keys. Matches based on the vlan tag are also possible, but
they are specified using the special FLOW_EXT flag.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by: Pablo Neira Ayuso <pablo@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 26 Jun 2019 21:09:33 +0000 (14:09 -0700)]
Merge branch 'macb-build-fixes'
Palmer Dabbelt says:
====================
net: macb: Fix compilation on systems without COMMON_CLK, v2
Our patch to add support for the FU540-C000 broke compilation on at
least powerpc allyesconfig, which was found as part of the linux-next
build regression tests. This must have somehow slipped through the
cracks, as the patch has been reverted in linux-next for a while now.
This patch applies on top of the offending commit, which is the only one
I've even tried it on as I'm not sure how this subsystem makes it to
Linus.
This patch set fixes the issue by adding a dependency of COMMON_CLK to
the MACB Kconfig entry, which avoids the build failure by disabling MACB
on systems where it wouldn't compile. All known users of MACB have
COMMON_CLK, so this shouldn't cause any issues. This is a significantly
simpler approach than disabling just the FU540-C000 support.
I've also included a second patch to indicate this is a driver for a
Cadence device that was originally written by an engineer at Atmel. The
only relation is that I stumbled across it when writing the first patch.
Changes since v1 <
20190624061603.1704-1-palmer@sifive.com>:
* Disable MACB on systems without COMMON_CLK, instead of just disabling
the FU540-C000 support on these systems.
* Update the commit message to reflect the driver was written by Atmel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Palmer Dabbelt [Tue, 25 Jun 2019 08:48:28 +0000 (01:48 -0700)]
net: macb: Kconfig: Rename Atmel to Cadence
The help text makes it look like NET_VENDOR_CADENCE enables support for
Atmel devices, when in reality it's a driver written by Atmel that
supports Cadence devices. This may confuse users that have this device
on a non-Atmel SoC.
The fix is just s/Atmel/Cadence/, but I did go and re-wrap the Kconfig
help text as that change caused it to go over 80 characters.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Palmer Dabbelt [Tue, 25 Jun 2019 08:48:27 +0000 (01:48 -0700)]
net: macb: Kconfig: Make MACB depend on COMMON_CLK
commit
c218ad559020 ("macb: Add support for SiFive FU540-C000") added a
dependency on the common clock framework to the macb driver, but didn't
express that dependency in Kconfig. As a result macb now fails to
compile on systems without COMMON_CLK, which specifically causes a build
failure on powerpc allyesconfig.
This patch adds the dependency, which results in the macb driver no
longer being selectable on systems without the common clock framework.
All known systems that have this device already support the common clock
framework, so this should not cause trouble for any uses. Supporting
both the FU540-C000 and systems without COMMON_CLK is quite ugly.
I've build tested this on powerpc allyesconfig and RISC-V defconfig
(which selects MACB), but I have not even booted the resulting kernels.
Fixes: c218ad559020 ("macb: Add support for SiFive FU540-C000")
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Taht [Sat, 22 Jun 2019 17:07:34 +0000 (10:07 -0700)]
Allow 0.0.0.0/8 as a valid address range
The longstanding prohibition against using 0.0.0.0/8 dates back
to two issues with the early internet.
There was an interoperability problem with BSD 4.2 in 1984, fixed in
BSD 4.3 in 1986. BSD 4.2 has long since been retired.
Secondly, addresses of the form 0.x.y.z were initially defined only as
a source address in an ICMP datagram, indicating "node number x.y.z on
this IPv4 network", by nodes that know their address on their local
network, but do not yet know their network prefix, in RFC0792 (page
19). This usage of 0.x.y.z was later repealed in RFC1122 (section
3.2.2.7), because the original ICMP-based mechanism for learning the
network prefix was unworkable on many networks such as Ethernet (which
have longer addresses that would not fit into the 24 "node number"
bits). Modern networks use reverse ARP (RFC0903) or BOOTP (RFC0951)
or DHCP (RFC2131) to find their full 32-bit address and CIDR netmask
(and other parameters such as default gateways). 0.x.y.z has had
16,777,215 addresses in 0.0.0.0/8 space left unused and reserved for
future use, since 1989.
This patch allows for these 16m new IPv4 addresses to appear within
a box or on the wire. Layer 2 switches don't care.
0.0.0.0/32 is still prohibited, of course.
Signed-off-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: John Gilmore <gnu@toad.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 21 Jun 2019 23:27:16 +0000 (16:27 -0700)]
rtnetlink: skip metrics loop for dst_default_metrics
dst_default_metrics has all of the metrics initialized to 0, so nothing
will be added to the skb in rtnetlink_put_metrics. Avoid the loop if
metrics is from dst_default_metrics.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 26 Jun 2019 20:05:42 +0000 (13:05 -0700)]
Merge branch 'skfp-cleanups'
Puranjay Mohan says:
====================
net: fddi: skfp: Use PCI generic definitions instead of private duplicates
This patch series removes the private duplicates of PCI definitions in
favour of generic definitions defined in pci_regs.h.
This driver only uses some of the generic PCI definitons,
which are included from pci_regs.h and thier private versions
are removed from skfbi.h with all other private defines.
The skfbi.h defines PCI_REV_ID and other private defines with different
names, these are renamed to Generic PCI names to make them
compatible with defines in pci_regs.h.
All unused defines are removed from skfbi.h.
Changes in v5:
Removed unused PCI definitions which were left in v4
Changes in v4:
Removed unused PCI definitions which were left in v3
Changes in v3:
Renamed all local PCI definitions to Generic names.
Corrected coding style mistakes.
Changes in v2:
Converted individual patches to a series.
Made sure that individual patches build correctly
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Puranjay Mohan [Fri, 21 Jun 2019 15:40:37 +0000 (21:10 +0530)]
net: fddi: skfp: Remove unused private PCI definitions
Remove unused private PCI definitions from skfbi.h because generic PCI
symbols are already included from pci_regs.h.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Puranjay Mohan [Fri, 21 Jun 2019 15:40:36 +0000 (21:10 +0530)]
net: fddi: skfp: Include generic PCI definitions
Include the uapi/linux/pci_regs.h header file which contains the generic
PCI defines.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Puranjay Mohan [Fri, 21 Jun 2019 15:40:35 +0000 (21:10 +0530)]
net: fddi: skfp: Rename local PCI defines to match generic PCI defines
Rename the PCI_REV_ID and other local defines to Generic PCI define names
in skfbi.h and drvfbi.c to make it compatible with the pci_regs.h.
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 26 Jun 2019 17:12:17 +0000 (10:12 -0700)]
Merge tag 'wireless-drivers-next-for-davem-2019-06-26' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valu says:
====================
wireless-drivers-next patches for 5.3
First set of patches for 5.3, but not that many patches this time.
This pull request fails to compile with the tip tree due to
ktime_get_boot_ns() API changes there. It should be easy for Linus to
fix it in p54 driver once he pulls this, an example resolution here:
https://lkml.kernel.org/r/
20190625160432.
533aa140@canb.auug.org.au
Major changes:
airo
* switch to use skcipher interface
p54
* support boottime in scan results
rtw88
* add fast xmit support
* add random mac address on scan support
rt2x00
* add software watchdog to detect hangs, it's disabled by default
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Hunter [Wed, 26 Jun 2019 10:23:22 +0000 (11:23 +0100)]
net: stmmac: Fix crash observed if PHY does not support EEE
If the PHY does not support EEE mode, then a crash is observed when the
ethernet interface is enabled. The crash occurs, because if the PHY does
not support EEE, then although the EEE timer is never configured, it is
still marked as enabled and so the stmmac ethernet driver is still
trying to update the timer by calling mod_timer(). This triggers a BUG()
in the mod_timer() because we are trying to update a timer when there is
no callback function set because timer_setup() was never called for this
timer.
The problem is caused because we return true from the function
stmmac_eee_init(), marking the EEE timer as enabled, even when we have
not configured the EEE timer. Fix this by ensuring that we return false
if the PHY does not support EEE and hence, 'eee_active' is not set.
Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Hunter [Wed, 26 Jun 2019 10:23:21 +0000 (11:23 +0100)]
net: stmmac: Fix possible deadlock when disabling EEE support
When stmmac_eee_init() is called to disable EEE support, then the timer
for EEE support is stopped and we return from the function. Prior to
stopping the timer, a mutex was acquired but in this case it is never
released and so could cause a deadlock. Fix this by releasing the mutex
prior to returning from stmmax_eee_init() when stopping the EEE timer.
Fixes: 74371272f97f ("net: stmmac: Convert to phylink and remove phylib logic")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 26 Jun 2019 10:05:28 +0000 (03:05 -0700)]
ipv6: fix suspicious RCU usage in rt6_dump_route()
syzbot reminded us that rt6_nh_dump_exceptions() needs to be called
with rcu_read_lock()
net/ipv6/route.c:1593 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
2 locks held by syz-executor609/8966:
#0:
00000000b7dbe288 (rtnl_mutex){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
#1:
00000000f2d87c21 (&(&tb->tb6_lock)->rlock){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
#1:
00000000f2d87c21 (&(&tb->tb6_lock)->rlock){+...}, at: fib6_dump_table.isra.0+0x37e/0x570 net/ipv6/ip6_fib.c:533
stack backtrace:
CPU: 0 PID: 8966 Comm: syz-executor609 Not tainted 5.2.0-rc5+ #43
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5250
fib6_nh_get_excptn_bucket+0x18e/0x1b0 net/ipv6/route.c:1593
rt6_nh_dump_exceptions+0x45/0x4d0 net/ipv6/route.c:5541
rt6_dump_route+0x904/0xc50 net/ipv6/route.c:5640
fib6_dump_node+0x168/0x280 net/ipv6/ip6_fib.c:467
fib6_walk_continue+0x4a9/0x8e0 net/ipv6/ip6_fib.c:1986
fib6_walk+0x9d/0x100 net/ipv6/ip6_fib.c:2034
fib6_dump_table.isra.0+0x38a/0x570 net/ipv6/ip6_fib.c:534
inet6_dump_fib+0x93c/0xb00 net/ipv6/ip6_fib.c:624
rtnl_dump_all+0x295/0x490 net/core/rtnetlink.c:3445
netlink_dump+0x558/0xfb0 net/netlink/af_netlink.c:2244
__netlink_dump_start+0x5b1/0x7d0 net/netlink/af_netlink.c:2352
netlink_dump_start include/linux/netlink.h:226 [inline]
rtnetlink_rcv_msg+0x73d/0xb00 net/core/rtnetlink.c:5182
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:646 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:665
sock_write_iter+0x27c/0x3e0 net/socket.c:994
call_write_iter include/linux/fs.h:1872 [inline]
new_sync_write+0x4d3/0x770 fs/read_write.c:483
__vfs_write+0xe1/0x110 fs/read_write.c:496
vfs_write+0x20c/0x580 fs/read_write.c:558
ksys_write+0x14f/0x290 fs/read_write.c:611
__do_sys_write fs/read_write.c:623 [inline]
__se_sys_write fs/read_write.c:620 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:620
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4401b9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007ffc8e134978 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
00000000004002c8 RCX:
00000000004401b9
RDX:
000000000000001c RSI:
0000000020000000 RDI: 00
Fixes: 1e47b4837f3b ("ipv6: Dump route exceptions if requested")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 26 Jun 2019 10:04:50 +0000 (03:04 -0700)]
ipv4: fix suspicious RCU usage in fib_dump_info_fnhe()
sysbot reported that we lack appropriate rcu_read_lock()
protection in fib_dump_info_fnhe()
net/ipv4/route.c:2875 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz-executor609/8966:
#0:
00000000b7dbe288 (rtnl_mutex){+.+.}, at: netlink_dump+0xe7/0xfb0 net/netlink/af_netlink.c:2199
stack backtrace:
CPU: 0 PID: 8966 Comm: syz-executor609 Not tainted 5.2.0-rc5+ #43
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5250
fib_dump_info_fnhe+0x9d9/0x1080 net/ipv4/route.c:2875
fn_trie_dump_leaf net/ipv4/fib_trie.c:2141 [inline]
fib_table_dump+0x64a/0xd00 net/ipv4/fib_trie.c:2175
inet_dump_fib+0x83c/0xa90 net/ipv4/fib_frontend.c:1004
rtnl_dump_all+0x295/0x490 net/core/rtnetlink.c:3445
netlink_dump+0x558/0xfb0 net/netlink/af_netlink.c:2244
__netlink_dump_start+0x5b1/0x7d0 net/netlink/af_netlink.c:2352
netlink_dump_start include/linux/netlink.h:226 [inline]
rtnetlink_rcv_msg+0x73d/0xb00 net/core/rtnetlink.c:5182
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5237
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8ae/0xd70 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:646 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:665
sock_write_iter+0x27c/0x3e0 net/socket.c:994
call_write_iter include/linux/fs.h:1872 [inline]
new_sync_write+0x4d3/0x770 fs/read_write.c:483
__vfs_write+0xe1/0x110 fs/read_write.c:496
vfs_write+0x20c/0x580 fs/read_write.c:558
ksys_write+0x14f/0x290 fs/read_write.c:611
__do_sys_write fs/read_write.c:623 [inline]
__se_sys_write fs/read_write.c:620 [inline]
__x64_sys_write+0x73/0xb0 fs/read_write.c:620
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4401b9
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007ffc8e134978 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
00000000004002c8 RCX:
00000000004401b9
RDX:
000000000000001c RSI:
0000000020000000 RDI:
0000000000000003
RBP:
00000000006ca018 R08:
00000000004002c8 R09:
00000000004002c8
R10:
0000000000000010 R11:
0000000000000246 R12:
0000000000401a40
R13:
0000000000401ad0 R14:
0000000000000000 R15:
0000000000000000
Fixes: ee28906fd7a1 ("ipv4: Dump route exceptions if requested")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 25 Jun 2019 16:59:56 +0000 (09:59 -0700)]
Revert "net: ena: ethtool: add extra properties retrieval via get_priv_flags"
This reverts commit
315c28d2b714 ("net: ena: ethtool: add extra properties retrieval via get_priv_flags").
As discussed at netconf and on the mailing list we can't allow
for the the abuse of private flags for exposing arbitrary device
labels.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 26 Jun 2019 15:59:15 +0000 (11:59 -0400)]
Merge branch 'net-hns3-some-code-optimizations-bugfixes'
Huazhong Tan says:
====================
net: hns3: some code optimizations & bugfixes
This patch-set includes code optimizations and bugfixes for
the HNS3 ethernet controller driver.
[patch 1/11] fixes a selftest issue when doing autoneg.
[patch 2/11 - 3-11] adds two code optimizations about VLAN issue.
[patch 4/11] restores the MAC autoneg state after reset.
[patch 5/11 - 8/11] adds some code optimizations and bugfixes about
HW errors handling.
[patch 9/11 - 11/11] fixes some issues related to driver loading and
unloading.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Thu, 20 Jun 2019 08:52:45 +0000 (16:52 +0800)]
net: hns3: add exception handling when enable NIC HW error interrupts
If we failed to enable NIC HW error interrupts during client
initialization in some cases, we should do exception handling to clear
flags and free the resources.
Fixes: 00ea6e5fda9d ("net: hns3: delay and separate enabling of NIC and ROCE HW errors")
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 20 Jun 2019 08:52:44 +0000 (16:52 +0800)]
net: hns3: fixes wrong place enabling ROCE HW error when loading
The ROCE HW errors should only be enabled when initializing ROCE's
client, the current code enable it no matter initializing NIC or
ROCE client.
So this patch fixes it.
Fixes: 00ea6e5fda9d ("net: hns3: delay and separate enabling of NIC and ROCE HW errors")
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 20 Jun 2019 08:52:43 +0000 (16:52 +0800)]
net: hns3: fix race conditions between reset and module loading & unloading
When loading or unloading module, it should wait for the reset task
done before it un-initializes the client, otherwise the reset task
may cause a NULL pointer reference.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Thu, 20 Jun 2019 08:52:42 +0000 (16:52 +0800)]
net: hns3: add check to number of buffer descriptors
This patch adds check to number of bds before we allocate memory for
them. If we get an invalid bd num in some cases, it will cause a memory
overflow.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Thu, 20 Jun 2019 08:52:41 +0000 (16:52 +0800)]
net: hns3: remove override_pci_need_reset
We add override_pci_need_reset to prevent redundant and unwanted PF
resets if a RAS error occurs in commit
69b51bbb03f7 ("net: hns3: fix
to stop multiple HNS reset due to the AER changes").
Now in HNS3 driver, we use hw_err_reset_req to record reset level that
we need to recover from a RAS error. This variable cans solve above
issue as override_pci_need_reset, so this patch removes
override_pci_need_reset.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Thu, 20 Jun 2019 08:52:40 +0000 (16:52 +0800)]
net: hns3: modify handling of out of memory in hclge_err.c
Users should be informed if HNS driver failed to allocate memory for
descriptor when handling hw errors. This patch solve above issues.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Weihang Li [Thu, 20 Jun 2019 08:52:39 +0000 (16:52 +0800)]
net: hns3: code optimizaition of hclge_handle_hw_ras_error()
This patch optimizes hclge_handle_hw_ras_error() to make the code logic
clearer.
1. If there was no NIC or Roce RAS when we read
HCLGE_RAS_PF_OTHER_INT_STS_REG, we return directly.
2. Because NIC and Roce RAS may occurs at the same time, so we should
check value of revision at first before we handle Roce RAS instead
of only checking it in branch of no NIC RAS is detected.
3. Check HCLGE_STATE_RST_HANDLING each time before we want to return
PCI_ERS_RESULT_NEED_RESET.
4. Remove checking of HCLGE_RAS_REG_NFE_MASK and
HCLGE_RAS_REG_ROCEE_ERR_MASK because if hw_err_reset_req is not
zero, it proves that we have set it in handling of NIC or Roce RAS.
Signed-off-by: Weihang Li <liweihang@hisilicon.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 20 Jun 2019 08:52:38 +0000 (16:52 +0800)]
net: hns3: restore the MAC autoneg state after reset
When doing global reset, the MAC autoneg state of fibre
port is set to default, which may cause user configuration
lost. This patch fixes it by restore the MAC autoneg state
after reset.
Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 20 Jun 2019 08:52:37 +0000 (16:52 +0800)]
net: hns3: sync VLAN filter entries when kill VLAN ID failed
When HW is resetting, firmware is unable to handle commands
from driver. So if remove VLAN device from stack at this time,
it will fail to remove the VLAN ID from HW VLAN filter, then
the VLAN filter status is unsynced with stack.
This patch fixes it by recording the VLAN ID delete failed,
and removes them again when reset complete.
Fixes: 44e626f720c3 ("net: hns3: fix VLAN offload handle for VLAN inserted by port")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 20 Jun 2019 08:52:36 +0000 (16:52 +0800)]
net: hns3: remove VF VLAN filter entry inexistent warning print
For VF VLAN filter is disabled when VF VLAN table is full, then the
new VLAN ID won't be added into VF VLAN table, it will always print
fail log when remove these VLAN IDs. If user has added too many
VLANs, it will cause massive verbose print logs.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Thu, 20 Jun 2019 08:52:35 +0000 (16:52 +0800)]
net: hns3: fix selftest fail issue for fibre port with autoneg on
When doing selftest for fibre port with autoneg on, the MAC speed
may be incorrect, which may cause the selftest failed. This patch
fixes it by halting autoneg during the selftest.
Fixes: 22f48e24a23d ("net: hns3: add autoneg and change speed support for fibre port")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Tue, 25 Jun 2019 18:18:52 +0000 (14:18 -0400)]
tc-testing: add ingress qdisc tests
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Tue, 25 Jun 2019 17:37:00 +0000 (19:37 +0200)]
tipc: rename function msg_get_wrapped() to msg_inner_hdr()
We rename the inline function msg_get_wrapped() to the more
comprehensible msg_inner_hdr().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Tue, 25 Jun 2019 16:08:13 +0000 (18:08 +0200)]
tipc: eliminate unnecessary skb expansion during retransmission
We increase the allocated headroom for the buffer copies to be
retransmitted. This eliminates the need for the lower stack levels
(UDP/IP/L2) to expand the headroom in order to add their own headers.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Tue, 25 Jun 2019 15:36:43 +0000 (17:36 +0200)]
tipc: simplify stale link failure criteria
In commit
a4dc70d46cf1 ("tipc: extend link reset criteria for stale
packet retransmission") we made link retransmission failure events
dependent on the link tolerance, and not only of the number of failed
retransmission attempts, as we did earlier. This works well. However,
keeping the original, additional criteria of 99 failed retransmissions
is now redundant, and may in some cases lead to failure detection
times in the order of minutes instead of the expected 1.5 sec link
tolerance value.
We now remove this criteria altogether.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lucas Bates [Tue, 25 Jun 2019 01:00:27 +0000 (21:00 -0400)]
tc-testing: Restore original behaviour for namespaces in tdc
This patch restores the original behaviour for tdc prior to the
introduction of the plugin system, where the network namespace
functionality was split from the main script.
It introduces the concept of required plugins for testcases,
and will automatically load any plugin that isn't already
enabled when said plugin is required by even one testcase.
Additionally, the -n option for the nsPlugin is deprecated
so the default action is to make use of the namespaces.
Instead, we introduce -N to not use them, but still create
the veth pair.
buildebpfPlugin's -B option is also deprecated.
If a test cases requires the features of a specific plugin
in order to pass, it should instead include a new key/value
pair describing plugin interactions:
"plugins": {
"requires": "buildebpfPlugin"
},
A test case can have more than one required plugin: a list
can be inserted as the value for 'requires'.
Signed-off-by: Lucas Bates <lucasb@mojatatu.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 25 Jun 2019 19:42:12 +0000 (12:42 -0700)]
Merge git://git./linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patches contains Netfilter updates for net-next:
1) .br_defrag indirection depends on CONFIG_NF_DEFRAG_IPV6, from wenxu.
2) Remove unnecessary memset() in ipset, from Florent Fourcot.
3) Merge control plane addition and deletion in ipset, also from Florent.
4) A few missing check for nla_parse() in ipset, from Aditya Pakki
and Jozsef Kadlecsik.
5) Incorrect cleanup in error path of xt_set version 3, from Jozsef.
6) Memory accounting problems when resizing in ipset, from Stefano Brivio.
7) Jozsef updates his email to @netfilter.org, this batch comes with a
conflict resolution with recent SPDX header updates.
8) Add to create custom conntrack expectations via nftables, from
Stephane Veyret.
9) A lookup optimization for conntrack, from Florian Westphal.
10) Check for supported flags in xt_owner.
11) Support for pernet sysctl in br_netfilter, patches
from Christian Brauner.
12) Patches to move common synproxy infrastructure to nf_synproxy.c,
to prepare the synproxy support for nf_tables, patches from
Fernando Fernandez Mancera.
13) Support to restore expiration time in set element, from Laura Garcia.
14) Fix recent rewrite of netfilter IPv6 to avoid indirections
when CONFIG_IPV6 is unset, from Arnd Bergmann.
15) Always reset vlan tag on skbuff fraglist when refragmenting in
bridge conntrack, from wenxu.
16) Support to match IPv4 options in nf_tables, from Stephen Suryaputra.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ard Biesheuvel [Mon, 17 Jun 2019 08:43:38 +0000 (10:43 +0200)]
airo: switch to skcipher interface
The AIRO driver applies a ctr(aes) on a buffer of considerable size
(2400 bytes), and instead of invoking the crypto API to handle this
in its entirety, it open codes the counter manipulation and invokes
the AES block cipher directly.
Let's fix this, by switching to the sync skcipher API instead.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:01:00 +0000 (12:01 +0200)]
rt2800: do not enable watchdog by default
Make watchdog disabled by default and add module parameter to enable it.
User will have to create file in /etc/modprobe.d/ with
options rt2800lib watchdog=1
to enable the watchdog or load "rt2800lib watchdog=1" module manually
before loading rt2800{soc,pci,usb} module.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:00:59 +0000 (12:00 +0200)]
rt2x00: add restart hw
Add ieee80211_restart_hw() to watchdog and debugfs file for testing
if restart works as expected.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:00:58 +0000 (12:00 +0200)]
rt2800: do not nullify initialization vector data
If we restart hw we should keep existing IV (initialization vector)
otherwise HW encryption will be broken after restart.
Also fix some coding style issues on the way.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:00:57 +0000 (12:00 +0200)]
rt2800: add pre_reset_hw callback
Add routine to cleanup interfaces data before hw reset as
ieee80211_restart_hw() will do setup interfaces again.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:00:56 +0000 (12:00 +0200)]
rt2800: initial watchdog implementation
Add watchdog for rt2800 devices. For now it only detect hung
and print error.
[Note: I verified that printing messages from process context is
fine on MT7620 (WT3020) platform that have problem when printk
is called from interrupt context].
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:00:55 +0000 (12:00 +0200)]
rt2800: add helpers for reading dma done index
For mmio we do not properlly trace dma done Q_INDEX_DMA_DONE index
for TX queues. That would require implementing INT_SOURCE_CSR_*_DMA_DONE
interrupts, what is rather not worth to do due to adding extra
CPU load (small but still somewhat not necessary otherwise).
We can just read TX DMA done indexes from registers directly. What
will be used by watchdog.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Stanislaw Gruszka [Sat, 15 Jun 2019 10:00:54 +0000 (12:00 +0200)]
rt2x00: allow to specify watchdog interval
Allow subdriver to change watchdog interval by intialize
link->watchdog_interval value before rt2x00link_register().
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Christian Lamparter [Sat, 15 Jun 2019 10:00:09 +0000 (12:00 +0200)]
p54: remove dead branch in op_conf_tx callback
This patch removes the error branch for (queue > dev->queues).
It is no longer needed anymore as the "queue" value is validated by
cfg80211's parse_txq_params() before the driver code gets called.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Tzu-En Huang [Fri, 14 Jun 2019 07:24:15 +0000 (15:24 +0800)]
rtw88: fix typo rtw_writ16_set
rtw_writ16_set should be rtw_write16_set
Signed-off-by: Tzu-En Huang <tehuang@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:14 +0000 (15:24 +0800)]
rtw88: rsvd page should go though management queue
The hardware default uses management queue to transmit frames that are
downloaded into reserved page, so we need to clearly assign the frames
to use qsel in TX_DESC_QSEL_MGMT to avoid using wrong queue.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:13 +0000 (15:24 +0800)]
rtw88: restore DACK results to save time
DACK is done right after the hardware has been turned on, which
means it will be done every time we leave the IDLE state.
But it takes ~2 seconds to finish DACK.
We can back up the results and restore them. And it only takes a few
milliseconds to restore the results to the hardware, saving a lot of
time.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:12 +0000 (15:24 +0800)]
rtw88: power on again if it was already on
We could fail to power on because it was already on. If the return
value is -EALREADY, power off and then power on again to turn on the
hardware as expected.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:11 +0000 (15:24 +0800)]
rtw88: 8822c: use more accurate ofdm fa counting
8822c used to count OFDM FA count by subtracting tx count from FA count.
But it need to substract more counters to be accurate.
However, we can count it by adding up all of the FA counters we want.
And it is simpler to add than list all of the components to substract.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:10 +0000 (15:24 +0800)]
rtw88: 8822c: disable rx clock gating before counter reset
Driver Could fail to reset counter if rx clock gating is not disabled.
So we need to disable rx clock gating before resetting counters.
Otherwise counters may increase unexpected.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Chien-Hsun Liao [Fri, 14 Jun 2019 07:24:09 +0000 (15:24 +0800)]
rtw88: 8822c: update channel and bandwidth BB setting
In 2G channels, the cck source and rxagc should be set to different
values based on different bandwidth to increase the performance of rx
sensitivity.
To improve rx throughput performance, the values of sbd subtune and
pt_opt should be changed in different bandwidth.
Signed-off-by: Chien-Hsun Liao <ben.liao@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Chien-Hsun Liao [Fri, 14 Jun 2019 07:24:08 +0000 (15:24 +0800)]
rtw88: 8822c: add rf write protection when switching channel
Collision of writing rf registers could occur if the driver writes
rf registers by direct write while the hardware is writing other rf
registers by pi write simultaneously.
Hardware pi write can be triggered by rf calibrations sometimes, so
the driver can not always write rf registers by direct write
protection. Direct write protection can make sure that there is no
hardware pi write during the direct write.
According to some experiments, if we add direct write protection
when switching channel, the performance of rf calibration will not
be affected.
Signed-off-by: Chien-Hsun Liao <ben.liao@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Chin-Yen Lee [Fri, 14 Jun 2019 07:24:07 +0000 (15:24 +0800)]
rtw88: add beacon function setting
Add beacon function setting routines for each hardware port.
If beacon function is not enabled, the hardware is not able
to synchronize with AP's beacon and can miss the beacons
under some scenarios such as PS mode.
For AP and Adhoc modes that require to send beacons, do not
update the TSF, otherwise the beacon interval may be affected.
Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:06 +0000 (15:24 +0800)]
rtw88: add support for random mac scan
When driver uses random mac address to scan, the unicast probe response
will not be received because the addr1 is not matched. Configure port
address by requested mac address to receive probe response from AP.
To support random mac scan, we need to configure the mac address during
scan period to receive unicast prop_resp. After scan is completed,
configure the mac address back to the original one that the port used.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Yan-Hsuan Chuang [Fri, 14 Jun 2019 07:24:05 +0000 (15:24 +0800)]
rtw88: add fast xmit support
With dynamic power save support, rtw88 is able to support fast tx
path, claim it to mac80211.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Greg Kroah-Hartman [Wed, 12 Jun 2019 14:26:55 +0000 (16:26 +0200)]
iwlegacy: 4965: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value. This driver was saving the debugfs file away to be
removed at a later time. However, the 80211 core would delete the whole
directory that the debugfs files are created in, after it asks the
driver to do the deletion, so just rely on the 80211 core to do all of
the cleanup for us, making us not need to keep a pointer to the dentries
around at all.
This cleans up the structure of the driver data a bit and makes the code
a tiny bit smaller.
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Greg Kroah-Hartman [Wed, 12 Jun 2019 14:26:54 +0000 (16:26 +0200)]
iwlegacy: 3945: no need to check return value of debugfs_create functions
When calling debugfs functions, there is no need to ever check the
return value. This driver was saving the debugfs file away to be
removed at a later time. However, the 80211 core would delete the whole
directory that the debugfs files are created in, after it asks the
driver to do the deletion, so just rely on the 80211 core to do all of
the cleanup for us, making us not need to keep a pointer to the dentries
around at all.
This cleans up the structure of the driver data a bit and makes the code
a tiny bit smaller.
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Michael Büsch [Mon, 10 Jun 2019 18:49:27 +0000 (20:49 +0200)]
ssb/gpio: Remove unnecessary WARN_ON from driver_gpio
The WARN_ON triggers on older BCM4401-B0 100Base-TX ethernet controllers.
The warning serves no purpose. So let's just remove it.
Reported-by: H Buus <ubuntu@hbuus.com>
Signed-off-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Colin Ian King [Sat, 8 Jun 2019 10:58:00 +0000 (11:58 +0100)]
rtlwifi: rtl8188ee: remove redundant assignment to rtstatus
Variable rtstatus is being initialized with a value that is never read
as rtstatus is being re-assigned a little later on. The assignment is
redundant and hence can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Gustavo A. R. Silva [Fri, 7 Jun 2019 19:17:45 +0000 (14:17 -0500)]
qtnfmac: Use struct_size() in kzalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct ieee80211_regdomain {
...
struct ieee80211_reg_rule reg_rules[];
};
instance = kzalloc(sizeof(*mac->rd) +
sizeof(struct ieee80211_reg_rule) *
count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, reg_rules, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Lorenzo Bianconi [Fri, 7 Jun 2019 11:48:10 +0000 (13:48 +0200)]
mt7601u: fix possible memory leak when the device is disconnected
When the device is disconnected while passing traffic it is possible
to receive out of order urbs causing a memory leak since the skb linked
to the current tx urb is not removed. Fix the issue deallocating the skb
cleaning up the tx ring. Moreover this patch fixes the following kernel
warning
[ 57.480771] usb 1-1: USB disconnect, device number 2
[ 57.483451] ------------[ cut here ]------------
[ 57.483462] TX urb mismatch
[ 57.483481] WARNING: CPU: 1 PID: 32 at drivers/net/wireless/mediatek/mt7601u/dma.c:245 mt7601u_complete_tx+0x165/00
[ 57.483483] Modules linked in:
[ 57.483496] CPU: 1 PID: 32 Comm: kworker/1:1 Not tainted 5.2.0-rc1+ #72
[ 57.483498] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-2.fc30 04/01/2014
[ 57.483502] Workqueue: usb_hub_wq hub_event
[ 57.483507] RIP: 0010:mt7601u_complete_tx+0x165/0x1e0
[ 57.483510] Code: 8b b5 10 04 00 00 8b 8d 14 04 00 00 eb 8b 80 3d b1 cb e1 00 00 75 9e 48 c7 c7 a4 ea 05 82 c6 05 f
[ 57.483513] RSP: 0000:
ffffc900000a0d28 EFLAGS:
00010092
[ 57.483516] RAX:
000000000000000f RBX:
ffff88802c0a62c0 RCX:
ffffc900000a0c2c
[ 57.483518] RDX:
0000000000000000 RSI:
0000000000000000 RDI:
ffffffff810a8371
[ 57.483520] RBP:
ffff88803ced6858 R08:
0000000000000000 R09:
0000000000000001
[ 57.483540] R10:
0000000000000002 R11:
0000000000000000 R12:
0000000000000046
[ 57.483542] R13:
ffff88802c0a6c88 R14:
ffff88803baab540 R15:
ffff88803a0cc078
[ 57.483548] FS:
0000000000000000(0000) GS:
ffff88803eb00000(0000) knlGS:
0000000000000000
[ 57.483550] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 57.483552] CR2:
000055e7f6780100 CR3:
0000000028c86000 CR4:
00000000000006a0
[ 57.483554] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 57.483556] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 57.483559] Call Trace:
[ 57.483561] <IRQ>
[ 57.483565] __usb_hcd_giveback_urb+0x77/0xe0
[ 57.483570] xhci_giveback_urb_in_irq.isra.0+0x8b/0x140
[ 57.483574] handle_cmd_completion+0xf5b/0x12c0
[ 57.483577] xhci_irq+0x1f6/0x1810
[ 57.483581] ? lockdep_hardirqs_on+0x9e/0x180
[ 57.483584] ? _raw_spin_unlock_irq+0x24/0x30
[ 57.483588] __handle_irq_event_percpu+0x3a/0x260
[ 57.483592] handle_irq_event_percpu+0x1c/0x60
[ 57.483595] handle_irq_event+0x2f/0x4c
[ 57.483599] handle_edge_irq+0x7e/0x1a0
[ 57.483603] handle_irq+0x17/0x20
[ 57.483607] do_IRQ+0x54/0x110
[ 57.483610] common_interrupt+0xf/0xf
[ 57.483612] </IRQ>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Lorenzo Bianconi [Fri, 7 Jun 2019 11:48:09 +0000 (13:48 +0200)]
mt7601u: do not schedule rx_tasklet when the device has been disconnected
Do not schedule rx_tasklet when the usb dongle is disconnected.
Moreover do not grub rx_lock in mt7601u_kill_rx since usb_poison_urb
can run concurrently with urb completion and we can unlink urbs from rx
ring in any order.
This patch fixes the common kernel warning reported when
the device is removed.
[ 24.921354] usb 3-14: USB disconnect, device number 7
[ 24.921593] ------------[ cut here ]------------
[ 24.921594] RX urb mismatch
[ 24.921675] WARNING: CPU: 4 PID: 163 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0xcb/0xd0 [mt7601u]
[ 24.921769] CPU: 4 PID: 163 Comm: kworker/4:2 Tainted: G OE 4.19.31-041931-generic #
201903231635
[ 24.921770] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P1.30 05/23/2014
[ 24.921782] Workqueue: usb_hub_wq hub_event
[ 24.921797] RIP: 0010:mt7601u_complete_rx+0xcb/0xd0 [mt7601u]
[ 24.921800] RSP: 0018:
ffff9bd9cfd03d08 EFLAGS:
00010086
[ 24.921802] RAX:
0000000000000000 RBX:
ffff9bd9bf043540 RCX:
0000000000000006
[ 24.921803] RDX:
0000000000000007 RSI:
0000000000000096 RDI:
ffff9bd9cfd16420
[ 24.921804] RBP:
ffff9bd9cfd03d28 R08:
0000000000000002 R09:
00000000000003a8
[ 24.921805] R10:
0000002f485fca34 R11:
0000000000000000 R12:
ffff9bd9bf043c1c
[ 24.921806] R13:
ffff9bd9c62fa3c0 R14:
0000000000000082 R15:
0000000000000000
[ 24.921807] FS:
0000000000000000(0000) GS:
ffff9bd9cfd00000(0000) knlGS:
0000000000000000
[ 24.921808] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 24.921808] CR2:
00007fb2648b0000 CR3:
0000000142c0a004 CR4:
00000000001606e0
[ 24.921809] Call Trace:
[ 24.921812] <IRQ>
[ 24.921819] __usb_hcd_giveback_urb+0x8b/0x140
[ 24.921821] usb_hcd_giveback_urb+0xca/0xe0
[ 24.921828] xhci_giveback_urb_in_irq.isra.42+0x82/0xf0
[ 24.921834] handle_cmd_completion+0xe02/0x10d0
[ 24.921837] xhci_irq+0x274/0x4a0
[ 24.921838] xhci_msi_irq+0x11/0x20
[ 24.921851] __handle_irq_event_percpu+0x44/0x190
[ 24.921856] handle_irq_event_percpu+0x32/0x80
[ 24.921861] handle_irq_event+0x3b/0x5a
[ 24.921867] handle_edge_irq+0x80/0x190
[ 24.921874] handle_irq+0x20/0x30
[ 24.921889] do_IRQ+0x4e/0xe0
[ 24.921891] common_interrupt+0xf/0xf
[ 24.921892] </IRQ>
[ 24.921900] RIP: 0010:usb_hcd_flush_endpoint+0x78/0x180
[ 24.921354] usb 3-14: USB disconnect, device number 7
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Colin Ian King [Fri, 31 May 2019 14:14:12 +0000 (15:14 +0100)]
rtlwifi: remove redundant assignment to variable k
The assignment of 0 to variable k is never read once we break out of
the loop, so the assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Colin Ian King [Thu, 30 May 2019 18:40:44 +0000 (19:40 +0100)]
rtlwifi: remove redundant assignment to variable badworden
The variable badworden is assigned with a value that is never read and
it is re-assigned a new value immediately afterwards. The assignment is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Ping-Ke Shih [Wed, 29 May 2019 06:57:30 +0000 (14:57 +0800)]
rtlwifi: rtl8192cu: fix error handle when usb probe failed
rtl_usb_probe() must do error handle rtl_deinit_core() only if
rtl_init_core() is done, otherwise goto error_out2.
| usb 1-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
| rtl_usb: reg 0xf0, usbctrl_vendorreq TimeOut! status:0xffffffb9 value=0x0
| rtl8192cu: Chip version 0x10
| rtl_usb: reg 0xa, usbctrl_vendorreq TimeOut! status:0xffffffb9 value=0x0
| rtl_usb: Too few input end points found
| INFO: trying to register non-static key.
| the code is fine but needs lockdep annotation.
| turning off the locking correctness validator.
| CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted
5.1.0-rc4-319354-g9a33b36 #3
| Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
| Google 01/01/2011
| Workqueue: usb_hub_wq hub_event
| Call Trace:
| __dump_stack lib/dump_stack.c:77 [inline]
| dump_stack+0xe8/0x16e lib/dump_stack.c:113
| assign_lock_key kernel/locking/lockdep.c:786 [inline]
| register_lock_class+0x11b8/0x1250 kernel/locking/lockdep.c:1095
| __lock_acquire+0xfb/0x37c0 kernel/locking/lockdep.c:3582
| lock_acquire+0x10d/0x2f0 kernel/locking/lockdep.c:4211
| __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
| _raw_spin_lock_irqsave+0x44/0x60 kernel/locking/spinlock.c:152
| rtl_c2hcmd_launcher+0xd1/0x390
| drivers/net/wireless/realtek/rtlwifi/base.c:2344
| rtl_deinit_core+0x25/0x2d0 drivers/net/wireless/realtek/rtlwifi/base.c:574
| rtl_usb_probe.cold+0x861/0xa70
| drivers/net/wireless/realtek/rtlwifi/usb.c:1093
| usb_probe_interface+0x31d/0x820 drivers/usb/core/driver.c:361
| really_probe+0x2da/0xb10 drivers/base/dd.c:509
| driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
| __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
| bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
| __device_attach+0x223/0x3a0 drivers/base/dd.c:844
| bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
| device_add+0xad2/0x16e0 drivers/base/core.c:2106
| usb_set_configuration+0xdf7/0x1740 drivers/usb/core/message.c:2021
| generic_probe+0xa2/0xda drivers/usb/core/generic.c:210
| usb_probe_device+0xc0/0x150 drivers/usb/core/driver.c:266
| really_probe+0x2da/0xb10 drivers/base/dd.c:509
| driver_probe_device+0x21d/0x350 drivers/base/dd.c:671
| __device_attach_driver+0x1d8/0x290 drivers/base/dd.c:778
| bus_for_each_drv+0x163/0x1e0 drivers/base/bus.c:454
| __device_attach+0x223/0x3a0 drivers/base/dd.c:844
| bus_probe_device+0x1f1/0x2a0 drivers/base/bus.c:514
| device_add+0xad2/0x16e0 drivers/base/core.c:2106
| usb_new_device.cold+0x537/0xccf drivers/usb/core/hub.c:2534
| hub_port_connect drivers/usb/core/hub.c:5089 [inline]
| hub_port_connect_change drivers/usb/core/hub.c:5204 [inline]
| port_event drivers/usb/core/hub.c:5350 [inline]
| hub_event+0x138e/0x3b00 drivers/usb/core/hub.c:5432
| process_one_work+0x90f/0x1580 kernel/workqueue.c:2269
| worker_thread+0x9b/0xe20 kernel/workqueue.c:2415
| kthread+0x313/0x420 kernel/kthread.c:253
| ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Reported-by: syzbot+1fcc5ef45175fc774231@syzkaller.appspotmail.com
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Swati Kushwaha [Fri, 21 Jun 2019 14:14:44 +0000 (19:44 +0530)]
mwifiex: ignore processing invalid command response
Firmware can send invalid command response, the processing of
which can attempt to modify unexpected context and cause issues.
To fix this, driver should check that the command response ID is
same as the one it downloaded, and ignore processing of invalid
response.
Signed-off-by: Swati Kushwaha <swatiuma@marvell.com>
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Sharvari Harisangam [Wed, 12 Jun 2019 15:12:11 +0000 (20:42 +0530)]
mwifiex: update set_mac_address logic
In set_mac_address, driver check for interfaces with same bss_type
For first STA entry, this would return 3 interfaces since all priv's have
bss_type as 0 due to kzalloc. Thus mac address gets changed for STA
unexpected. This patch adds check for first STA and avoids mac address
change. This patch also adds mac_address change for p2p based on bss_num
type.
Signed-off-by: Sharvari Harisangam <sharvari@marvell.com>
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Brian Norris [Tue, 4 Jun 2019 17:31:44 +0000 (10:31 -0700)]
mwifiex: print PCI mmap with %pK
Unadorned '%p' has restrictive policies these days, such that it usually
just prints garbage at early boot (see
Documentation/core-api/printk-formats.rst, "kernel will print
``(ptrval)`` until it gathers enough entropy"). Annotating with %pK
(for "kernel pointer") allows the kptr_restrict sysctl to control
printing policy better.
We might just as well drop this message entirely, but this fix was easy
enough for now.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Brian Norris [Tue, 4 Jun 2019 17:28:58 +0000 (10:28 -0700)]
mwifiex: drop 'set_consistent_dma_mask' log message
This message is pointless.
While we're at it, include the error code in the error message, which is
not pointless.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Alan Stern [Mon, 20 May 2019 14:44:21 +0000 (10:44 -0400)]
p54usb: Fix race between disconnect and firmware loading
The syzbot fuzzer found a bug in the p54 USB wireless driver. The
issue involves a race between disconnect and the firmware-loader
callback routine, and it has several aspects.
One big problem is that when the firmware can't be loaded, the
callback routine tries to unbind the driver from the USB _device_ (by
calling device_release_driver) instead of from the USB _interface_ to
which it is actually bound (by calling usb_driver_release_interface).
The race involves access to the private data structure. The driver's
disconnect handler waits for a completion that is signalled by the
firmware-loader callback routine. As soon as the completion is
signalled, you have to assume that the private data structure may have
been deallocated by the disconnect handler -- even if the firmware was
loaded without errors. However, the callback routine does access the
private data several times after that point.
Another problem is that, in order to ensure that the USB device
structure hasn't been freed when the callback routine runs, the driver
takes a reference to it. This isn't good enough any more, because now
that the callback routine calls usb_driver_release_interface, it has
to ensure that the interface structure hasn't been freed.
Finally, the driver takes an unnecessary reference to the USB device
structure in the probe function and drops the reference in the
disconnect handler. This extra reference doesn't accomplish anything,
because the USB core already guarantees that a device structure won't
be deallocated while a driver is still bound to any of its interfaces.
To fix these problems, this patch makes the following changes:
Call usb_driver_release_interface() rather than
device_release_driver().
Don't signal the completion until after the important
information has been copied out of the private data structure,
and don't refer to the private data at all thereafter.
Lock udev (the interface's parent) before unbinding the driver
instead of locking udev->parent.
During the firmware loading process, take a reference to the
USB interface instead of the USB device.
Don't take an unnecessary reference to the device during probe
(and then don't drop it during disconnect).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+200d4bb11b23d929335f@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Acked-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Pablo Neira Ayuso [Mon, 24 Jun 2019 23:32:59 +0000 (01:32 +0200)]
Merge git://git./linux/kernel/git/davem/net-next
Resolve conflict between
d2912cb15bdd ("treewide: Replace GPLv2
boilerplate/reference with SPDX - rule 500") removing the GPL disclaimer
and
fe03d4745675 ("Update my email address") which updates Jozsef
Kadlecsik's email.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
David S. Miller [Mon, 24 Jun 2019 21:54:06 +0000 (14:54 -0700)]
Merge branch 'cxgb4-Reference-count-MPS-TCAM-entries-within-a-PF'
Raju Rangoju says:
====================
cxgb4: Reference count MPS TCAM entries within a PF
Firmware reference counts the MPS TCAM entries by PF and VF,
but it does not do it for usage within a PF or VF. This patch
adds the support to track MPS TCAM entries within a PF.
v2->v3:
Fixed the compiler errors due to incorrect patch
Also, removed the new blank line at EOF
v1->v2:
Use refcount_t type instead of atomic_t for mps reference count
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Raju Rangoju [Mon, 24 Jun 2019 17:35:35 +0000 (23:05 +0530)]
cxgb4: Add MPS refcounting for alloc/free mac filters
This patch adds reference counting support for
alloc/free mac filters
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Raju Rangoju [Mon, 24 Jun 2019 17:35:34 +0000 (23:05 +0530)]
cxgb4: Add MPS TCAM refcounting for cxgb4 change mac
This patch adds TCAM reference counting
support for cxgb4 change mac path
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Raju Rangoju [Mon, 24 Jun 2019 17:35:33 +0000 (23:05 +0530)]
cxgb4: Add MPS TCAM refcounting for raw mac filters
This patch adds TCAM reference counting
support for raw mac filters.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Raju Rangoju [Mon, 24 Jun 2019 17:35:32 +0000 (23:05 +0530)]
cxgb4: Re-work the logic for mps refcounting
Remove existing mps refcounting code which was
added only for encap filters and add necessary
data structures/functions to support mps reference
counting for all the mac filters. Also add wrapper
functions for allocating and freeing encap mac
filters.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Icenowy Zheng [Thu, 20 Jun 2019 13:47:44 +0000 (15:47 +0200)]
net: stmmac: sun8i: force select external PHY when no internal one
The PHY selection bit also exists on SoCs without an internal PHY; if it's
set to 1 (internal PHY, default value) then the MAC will not make use of
any PHY on such SoCs.
This problem appears when adapting for H6, which has no real internal PHY
(the "internal PHY" on H6 is not on-die, but on a co-packaged AC200 chip,
connected via RMII interface at GPIO bank A).
Force the PHY selection bit to 0 when the SOC doesn't have an internal PHY,
to address the problem of a wrong default value.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Ondrej Jirman <megous@megous.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Icenowy Zheng [Thu, 20 Jun 2019 13:47:43 +0000 (15:47 +0200)]
net: stmmac: sun8i: add support for Allwinner H6 EMAC
The EMAC on Allwinner H6 is just like the one on A64. The "internal PHY" on
H6 is on a co-packaged AC200 chip, and it's not really internal (it's
connected via RMII at PA GPIO bank).
Add support for the Allwinner H6 EMAC in the dwmac-sun8i driver.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Ondrej Jirman <megous@megous.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Jun 2019 17:18:49 +0000 (10:18 -0700)]
Merge branch 'cached-route-listings'
Stefano Brivio says:
====================
Fix listing (IPv4, IPv6) and flushing (IPv6) of cached route exceptions
For IPv6 cached routes, the commands 'ip -6 route list cache' and
'ip -6 route flush cache' don't work at all after route exceptions have
been moved to a separate hash table in commit
2b760fcf5cfb ("ipv6: hook
up exception table to store dst cache").
For IPv4 cached routes, the command 'ip route list cache' has also
stopped working in kernel 3.5 after commit
4895c771c7f0 ("ipv4: Add FIB
nexthop exceptions.") introduced storage for route exceptions as a
separate entity.
Fix this by allowing userspace to clearly request cached routes with
the RTM_F_CLONED flag used as a filter (in conjuction with strict
checking) and by retrieving and dumping cached routes if requested.
If strict checking is not requested (iproute2 < 5.0.0), we don't have a
way to consistently filter results on other selectors (e.g. on tables),
so skip filtering entirely and dump both regular routes and exceptions.
For IPv4, cache flushing uses a completely different mechanism, so it
wasn't affected. Listing of exception routes (modified routes pre-3.5) was
tested against these versions of kernel and iproute2:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0
3.5-rc4 + + + + +
4.4
4.9
4.14
4.15
4.19
5.0
5.1
fixed + + + + +
For IPv6, a separate iproute2 patch is required. Versions of iproute2
and kernel tested:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0 5.1.0, patched
3.18 list + + + + + +
flush + + + + + +
4.4 list + + + + + +
flush + + + + + +
4.9 list + + + + + +
flush + + + + + +
4.14 list + + + + + +
flush + + + + + +
4.15 list
flush
4.19 list
flush
5.0 list
flush
5.1 list
flush
with list + + + + + +
fix flush + + + +
v7: Make sure r->rtm_tos is initialised in 3/11, move loop over nexthop
objects in 4/11, add comments about usage of "skip" counters in commit
messages of 4/11 and 8/11
v6: Target for net-next, rebase and adapt to nexthop objects for IPv6 paths.
Merge selftests into this series (as they were addressed for net-next).
A number of minor changes detailed in logs of single patches.
v5: Skip filtering altogether if no strict checking is requested: selecting
routes or exceptions only would be inconsistent with the fact we can't
filter on tables. Drop 1/8 (non-strict dump filter function no longer
needed), replace 2/8 (don't use NLM_F_MATCH, decide to skip routes or
exceptions in filter function), drop 6/8 (2/8 is enough for IPv6 too).
Introduce dump_routes and dump_exceptions flags in filter, adapt other
patches to that.
v4: Fix the listing issue also for IPv4, making the behaviour consistent
with IPv6. Honour NLM_F_MATCH as per RFC 3549 and allow usage of
RTM_F_CLONED filter. Split patches into smaller logical changes.
v3: Drop check on RTM_F_CLONED and rework logic of return values of
rt6_dump_route()
v2: Add count of routes handled in partial dumps, and skip them, in patch 1/2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:30 +0000 (17:45 +0200)]
selftests: pmtu: Make list_flush_ipv6_exception test more demanding
Instead of just listing and flushing two cached exceptions, create
a relatively big number of them, and count how many are listed. Single
netlink dump messages contain approximately 25 entries each, and this
way we can make sure the partial dump tracking mechanism is working
properly.
While at it, also ensure that no cached routes can be listed after
flush, and remove 'sleep 1' calls, they are not actually needed.
v7: No changes
v6:
- Merge this patch into series including fix, as it's also targeted
for net-next. No actual changes
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:29 +0000 (17:45 +0200)]
selftests: pmtu: Introduce list_flush_ipv4_exception test case
This test checks that route exceptions can be successfully listed and
flushed using ip -6 route {list,flush} cache.
v7: No changes
v6:
- Merge this patch into series including fix, as it's also targeted
for net-next
- Drop left-over print of 'ip route list cache | wc -l'
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:28 +0000 (17:45 +0200)]
ip6_fib: Don't discard nodes with valid routing information in fib6_locate_1()
When we perform an inexact match on FIB nodes via fib6_locate_1(), longer
prefixes will be preferred to shorter ones. However, it might happen that
a node, with higher fn_bit value than some other, has no valid routing
information.
In this case, we'll pick that node, but it will be discarded by the check
on RTN_RTINFO in fib6_locate(), and we might miss nodes with valid routing
information but with lower fn_bit value.
This is apparent when a routing exception is created for a default route:
# ip -6 route list
fc00:1::/64 dev veth_A-R1 proto kernel metric 256 pref medium
fc00:2::/64 dev veth_A-R2 proto kernel metric 256 pref medium
fc00:4::1 via fc00:2::2 dev veth_A-R2 metric 1024 pref medium
fe80::/64 dev veth_A-R1 proto kernel metric 256 pref medium
fe80::/64 dev veth_A-R2 proto kernel metric 256 pref medium
default via fc00:1::2 dev veth_A-R1 metric 1024 pref medium
# ip -6 route list cache
fc00:4::1 via fc00:2::2 dev veth_A-R2 metric 1024 expires 593sec mtu 1500 pref medium
fc00:3::1 via fc00:1::2 dev veth_A-R1 metric 1024 expires 593sec mtu 1500 pref medium
# ip -6 route flush cache # node for default route is discarded
Failed to send flush request: No such process
# ip -6 route list cache
fc00:3::1 via fc00:1::2 dev veth_A-R1 metric 1024 expires 586sec mtu 1500 pref medium
Check right away if the node has a RTN_RTINFO flag, before replacing the
'prev' pointer, that indicates the longest matching prefix found so far.
Fixes: 38fbeeeeccdb ("ipv6: prepare fib6_locate() for exception table")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:27 +0000 (17:45 +0200)]
ipv6: Dump route exceptions if requested
Since commit
2b760fcf5cfb ("ipv6: hook up exception table to store dst
cache"), route exceptions reside in a separate hash table, and won't be
found by walking the FIB, so they won't be dumped to userspace on a
RTM_GETROUTE message.
This causes 'ip -6 route list cache' and 'ip -6 route flush cache' to
have no function anymore:
# ip -6 route get fc00:3::1
fc00:3::1 via fc00:1::2 dev veth_A-R1 src fc00:1::1 metric 1024 expires 539sec mtu 1400 pref medium
# ip -6 route get fc00:4::1
fc00:4::1 via fc00:2::2 dev veth_A-R2 src fc00:2::1 metric 1024 expires 536sec mtu 1500 pref medium
# ip -6 route list cache
# ip -6 route flush cache
# ip -6 route get fc00:3::1
fc00:3::1 via fc00:1::2 dev veth_A-R1 src fc00:1::1 metric 1024 expires 520sec mtu 1400 pref medium
# ip -6 route get fc00:4::1
fc00:4::1 via fc00:2::2 dev veth_A-R2 src fc00:2::1 metric 1024 expires 519sec mtu 1500 pref medium
because iproute2 lists cached routes using RTM_GETROUTE, and flushes them
by listing all the routes, and deleting them with RTM_DELROUTE one by one.
If cached routes are requested using the RTM_F_CLONED flag together with
strict checking, or if no strict checking is requested (and hence we can't
consistently apply filters), look up exceptions in the hash table
associated with the current fib6_info in rt6_dump_route(), and, if present
and not expired, add them to the dump.
We might be unable to dump all the entries for a given node in a single
message, so keep track of how many entries were handled for the current
node in fib6_walker, and skip that amount in case we start from the same
partially dumped node.
When a partial dump restarts, as the starting node might change when
'sernum' changes, we have no guarantee that we need to skip the same
amount of in-node entries. Therefore, we need two counters, and we need to
zero the in-node counter if the node from which the dump is resumed
differs.
Note that, with the current version of iproute2, this only fixes the
'ip -6 route list cache': on a flush command, iproute2 doesn't pass
RTM_F_CLONED and, due to this inconsistency, 'ip -6 route flush cache' is
still unable to fetch the routes to be flushed. This will be addressed in
a patch for iproute2.
To flush cached routes, a procfs entry could be introduced instead: that's
how it works for IPv4. We already have a rt6_flush_exception() function
ready to be wired to it. However, this would not solve the issue for
listing.
Versions of iproute2 and kernel tested:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0 5.1.0, patched
3.18 list + + + + + +
flush + + + + + +
4.4 list + + + + + +
flush + + + + + +
4.9 list + + + + + +
flush + + + + + +
4.14 list + + + + + +
flush + + + + + +
4.15 list
flush
4.19 list
flush
5.0 list
flush
5.1 list
flush
with list + + + + + +
fix flush + + + +
v7:
- Explain usage of "skip" counters in commit message (suggested by
David Ahern)
v6:
- Rebase onto net-next, use recently introduced nexthop walker
- Make rt6_nh_dump_exceptions() a separate function (suggested by David
Ahern)
v5:
- Use dump_routes and dump_exceptions from filter, ignore NLM_F_MATCH,
update test results (flushing works with iproute2 < 5.0.0 now)
v4:
- Split NLM_F_MATCH and strict check handling in separate patches
- Filter routes using RTM_F_CLONED: if it's not set, only return
non-cached routes, and if it's set, only return cached routes:
change requested by David Ahern and Martin Lau. This implies that
iproute2 needs a separate patch to be able to flush IPv6 cached
routes. This is not ideal because we can't fix the breakage caused
by
2b760fcf5cfb entirely in kernel. However, two years have passed
since then, and this makes it more tolerable
v3:
- More descriptive comment about expired exceptions in rt6_dump_route()
- Swap return values of rt6_dump_route() (suggested by Martin Lau)
- Don't zero skip_in_node in case we don't dump anything in a given pass
(also suggested by Martin Lau)
- Remove check on RTM_F_CLONED altogether: in the current UAPI semantic,
it's just a flag to indicate the route was cloned, not to filter on
routes
v2: Add tracking of number of entries to be skipped in current node after
a partial dump. As we restart from the same node, if not all the
exceptions for a given node fit in a single message, the dump will
not terminate, as suggested by Martin Lau. This is a concrete
possibility, setting up a big number of exceptions for the same route
actually causes the issue, suggested by David Ahern.
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:26 +0000 (17:45 +0200)]
ipv6/route: Change return code of rt6_dump_route() for partial node dumps
In the next patch, we are going to add optional dump of exceptions to
rt6_dump_route().
Change the return code of rt6_dump_route() to accomodate partial node
dumps: we might dump multiple routes per node, and might be able to dump
only a given number of them, so fib6_dump_node() will need to know how
many routes have been dumped on partial dump, to restart the dump from the
point where it was interrupted.
Note that fib6_dump_node() is the only caller and already handles all
non-negative return codes as success: those become -1 to signal that we're
done with the node. If we fail, return 0, as we were unable to dump the
single route in the node, but we're not done with it.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:25 +0000 (17:45 +0200)]
ipv6/route: Don't match on fc_nh_id if not set in ip6_route_del()
If fc_nh_id isn't set, we shouldn't try to match against it. This
actually matters just for the RTF_CACHE below (where this case is
already handled): if iproute2 gets a route exception and tries to
delete it, it won't reference it by fc_nh_id, even if a nexthop
object might be associated to the originating route.
Fixes: 5b98324ebe29 ("ipv6: Allow routes to use nexthop objects")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:24 +0000 (17:45 +0200)]
Revert "net/ipv6: Bail early if user only wants cloned entries"
This reverts commit
08e814c9e8eb5a982cbd1e8f6bd255d97c51026f: as we
are preparing to fix listing and dumping of IPv6 cached routes, we
need to allow RTM_F_CLONED as a flag to match routes against while
dumping them.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:23 +0000 (17:45 +0200)]
ipv4: Dump route exceptions if requested
Since commit
4895c771c7f0 ("ipv4: Add FIB nexthop exceptions."), cached
exception routes are stored as a separate entity, so they are not dumped
on a FIB dump, even if the RTM_F_CLONED flag is passed.
This implies that the command 'ip route list cache' doesn't return any
result anymore.
If the RTM_F_CLONED is passed, and strict checking requested, retrieve
nexthop exception routes and dump them. If no strict checking is
requested, filtering can't be performed consistently: dump everything in
that case.
With this, we need to add an argument to the netlink callback in order to
track how many entries were already dumped for the last leaf included in
a partial netlink dump.
A single additional argument is sufficient, even if we traverse logically
nested structures (nexthop objects, hash table buckets, bucket chains): it
doesn't matter if we stop in the middle of any of those, because they are
always traversed the same way. As an example, s_i values in [], s_fa
values in ():
node (fa) #1 [1]
nexthop #1
bucket #1 -> #0 in chain (1)
bucket #2 -> #0 in chain (2) -> #1 in chain (3) -> #2 in chain (4)
bucket #3 -> #0 in chain (5) -> #1 in chain (6)
nexthop #2
bucket #1 -> #0 in chain (7) -> #1 in chain (8)
bucket #2 -> #0 in chain (9)
--
node (fa) #2 [2]
nexthop #1
bucket #1 -> #0 in chain (1) -> #1 in chain (2)
bucket #2 -> #0 in chain (3)
it doesn't matter if we stop at (3), (4), (7) for "node #1", or at (2)
for "node #2": walking flattens all that.
It would even be possible to drop the distinction between the in-tree
(s_i) and in-node (s_fa) counter, but a further improvement might
advise against this. This is only as accurate as the existing tracking
mechanism for leaves: if a partial dump is restarted after exceptions
are removed or expired, we might skip some non-dumped entries.
To improve this, we could attach a 'sernum' attribute (similar to the
one used for IPv6) to nexthop entities, and bump this counter whenever
exceptions change: having a distinction between the two counters would
make this more convenient.
Listing of exception routes (modified routes pre-3.5) was tested against
these versions of kernel and iproute2:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0
3.5-rc4 + + + + +
4.4
4.9
4.14
4.15
4.19
5.0
5.1
fixed + + + + +
v7:
- Move loop over nexthop objects to route.c, and pass struct fib_info
and table ID to it, not a struct fib_alias (suggested by David Ahern)
- While at it, note that the NULL check on fa->fa_info is redundant,
and the check on RTNH_F_DEAD is also not consistent with what's done
with regular route listing: just keep it for nhc_flags
- Rename entry point function for dumping exceptions to
fib_dump_info_fnhe(), and rearrange arguments for consistency with
fib_dump_info()
- Rename fnhe_dump_buckets() to fnhe_dump_bucket() and make it handle
one bucket at a time
- Expand commit message to describe why we can have a single "skip"
counter for all exceptions stored in bucket chains in nexthop objects
(suggested by David Ahern)
v6:
- Rebased onto net-next
- Loop over nexthop paths too. Move loop over fnhe buckets to route.c,
avoids need to export rt_fill_info() and to touch exceptions from
fib_trie.c. Pass NULL as flow to rt_fill_info(), it now allows that
(suggested by David Ahern)
Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:22 +0000 (17:45 +0200)]
ipv4/route: Allow NULL flowinfo in rt_fill_info()
In the next patch, we're going to use rt_fill_info() to dump exception
routes upon RTM_GETROUTE with NLM_F_ROOT, meaning userspace is requesting
a dump and not a specific route selection, which in turn implies the input
interface is not relevant. Update rt_fill_info() to handle a NULL
flowinfo.
v7: If fl4 is NULL, explicitly set r->rtm_tos to 0: it's not initialised
otherwise (spotted by David Ahern)
v6: New patch
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:21 +0000 (17:45 +0200)]
ipv4/fib_frontend: Allow RTM_F_CLONED flag to be used for filtering
This functionally reverts the check introduced by commit
e8ba330ac0c5 ("rtnetlink: Update fib dumps for strict data checking")
as modified by commit
e4e92fb160d7 ("net/ipv4: Bail early if user only
wants prefix entries").
As we are preparing to fix listing of IPv4 cached routes, we need to
give userspace a way to request them.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 21 Jun 2019 15:45:20 +0000 (17:45 +0200)]
fib_frontend, ip6_fib: Select routes or exceptions dump from RTM_F_CLONED
The following patches add back the ability to dump IPv4 and IPv6 exception
routes, and we need to allow selection of regular routes or exceptions.
Use RTM_F_CLONED as filter to decide whether to dump routes or exceptions:
iproute2 passes it in dump requests (except for IPv6 cache flush requests,
this will be fixed in iproute2) and this used to work as long as
exceptions were stored directly in the FIB, for both IPv4 and IPv6.
Caveat: if strict checking is not requested (that is, if the dump request
doesn't go through ip_valid_fib_dump_req()), we can't filter on protocol,
tables or route types.
In this case, filtering on RTM_F_CLONED would be inconsistent: we would
fix 'ip route list cache' by returning exception routes and at the same
time introduce another bug in case another selector is present, e.g. on
'ip route list cache table main' we would return all exception routes,
without filtering on tables.
Keep this consistent by applying no filters at all, and dumping both
routes and exceptions, if strict checking is not requested. iproute2
currently filters results anyway, and no unwanted results will be
presented to the user. The kernel will just dump more data than needed.
v7: No changes
v6: Rebase onto net-next, no changes
v5: New patch: add dump_routes and dump_exceptions flags in filter and
simply clear the unwanted one if strict checking is enabled, don't
ignore NLM_F_MATCH and don't set filter_set if NLM_F_MATCH is set.
Skip filtering altogether if no strict checking is requested:
selecting routes or exceptions only would be inconsistent with the
fact we can't filter on tables.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 21 Jun 2019 15:30:02 +0000 (17:30 +0200)]
net: macb: use GRO
This patch updates the macb driver to use NAPI GRO helpers when
receiving SKBs. This improves performances.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 21 Jun 2019 15:28:55 +0000 (17:28 +0200)]
net: macb: use NAPI_POLL_WEIGHT
Use NAPI_POLL_WEIGHT, the default NAPI poll() weight instead of
redefining our own value (which turns out to be 64 as well).
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Jun 2019 16:02:56 +0000 (09:02 -0700)]
Merge branch 'ipv4-fix-bugs-when-enable-route_localnet'
Shijie Luo says:
====================
ipv4: fix bugs when enable route_localnet
When enable route_localnet, route of the 127/8 address is enabled.
But in some situations like arp_announce=2, ARP requests or reply
work abnormally.
This patchset fix some bugs when enable route_localnet.
Change History:
V2:
- Change a single patch to a patchset.
- Add bug fix for arp_ignore = 3.
- Add a couple of test for enabling route_localnet in selftests.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shijie Luo [Tue, 18 Jun 2019 15:14:05 +0000 (15:14 +0000)]
selftests: add route_localnet test script
Add a simple scripts to exercise several situations when enable
route_localnet.
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Zhiqiang liu <liuzhiqiang26@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shijie Luo [Tue, 18 Jun 2019 15:14:04 +0000 (15:14 +0000)]
ipv4: fix confirm_addr_indev() when enable route_localnet
When arp_ignore=3, the NIC won't reply for scope host addresses, but
if enable route_locanet, we need to reply ip address with head 127 and
scope RT_SCOPE_HOST.
Fixes: d0daebc3d622 ("ipv4: Add interface option to enable routing of 127.0.0.0/8")
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shijie Luo [Tue, 18 Jun 2019 15:14:03 +0000 (15:14 +0000)]
ipv4: fix inet_select_addr() when enable route_localnet
Suppose we have two interfaces eth0 and eth1 in two hosts, follow
the same steps in the two hosts:
# sysctl -w net.ipv4.conf.eth1.route_localnet=1
# sysctl -w net.ipv4.conf.eth1.arp_announce=2
# ip route del 127.0.0.0/8 dev lo table local
and then set ip to eth1 in host1 like:
# ifconfig eth1 127.25.3.4/24
set ip to eth2 in host2 and ping host1:
# ifconfig eth1 127.25.3.14/24
# ping -I eth1 127.25.3.4
Well, host2 cannot connect to host1.
When set a ip address with head 127, the scope of the address defaults
to RT_SCOPE_HOST. In this situation, host2 will use arp_solicit() to
send a arp request for the mac address of host1 with ip
address 127.25.3.14. When arp_announce=2, inet_select_addr() cannot
select a correct saddr with condition ifa->ifa_scope > scope, because
ifa_scope is RT_SCOPE_HOST and scope is RT_SCOPE_LINK. Then,
inet_select_addr() will go to no_in_dev to lookup all interfaces to find
a primary ip and finally get the primary ip of eth0.
Here I add a localnet_scope defaults to RT_SCOPE_HOST, and when
route_localnet is enabled, this value changes to RT_SCOPE_LINK to make
inet_select_addr() find a correct primary ip as saddr of arp request.
Fixes: d0daebc3d622 ("ipv4: Add interface option to enable routing of 127.0.0.0/8")
Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>