Vishal Kulkarni [Fri, 13 Dec 2019 01:09:39 +0000 (06:39 +0530)]
cxgb4: Fix kernel panic while accessing sge_info
The sge_info debugfs collects offload queue info even when offload
capability is disabled and leads to panic.
[ 144.139871] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 144.139874] CR2:
0000000000000000 CR3:
000000082d456005 CR4:
00000000001606e0
[ 144.139876] Call Trace:
[ 144.139887] sge_queue_start+0x12/0x30 [cxgb4]
[ 144.139897] seq_read+0x1d4/0x3d0
[ 144.139906] full_proxy_read+0x50/0x70
[ 144.139913] vfs_read+0x89/0x140
[ 144.139916] ksys_read+0x55/0xd0
[ 144.139924] do_syscall_64+0x5b/0x1d0
[ 144.139933] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 144.139936] RIP: 0033:0x7f4b01493990
Fix this crash by skipping the offload queue access in sge_qinfo when
offload capability is disabled
Signed-off-by: Herat Ramani <herat@chelsio.com>
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Ursula Braun [Thu, 12 Dec 2019 21:35:58 +0000 (22:35 +0100)]
net/smc: add fallback check to connect()
FASTOPEN setsockopt() or sendmsg() may switch the SMC socket to fallback
mode. Once fallback mode is active, the native TCP socket functions are
called. Nevertheless there is a small race window, when FASTOPEN
setsockopt/sendmsg runs in parallel to a connect(), and switch the
socket into fallback mode before connect() takes the sock lock.
Make sure the SMC-specific connect setup is omitted in this case.
This way a syzbot-reported refcount problem is fixed, triggered by
different threads running non-blocking connect() and FASTOPEN_KEY
setsockopt.
Reported-by: syzbot+96d3f9ff6a86d37e44c8@syzkaller.appspotmail.com
Fixes: 6d6dd528d5af ("net/smc: fix refcount non-blocking connect() -part 2")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Russell King [Fri, 13 Dec 2019 10:06:30 +0000 (10:06 +0000)]
net: phylink: fix interface passed to mac_link_up
A mismerge between the following two commits:
c678726305b9 ("net: phylink: ensure consistent phy interface mode")
27755ff88c0e ("net: phylink: Add phylink_mac_link_{up, down} wrapper functions")
resulted in the wrong interface being passed to the mac_link_up()
function. Fix this up.
Fixes: b4b12b0d2f02 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Thadeu Lima de Souza Cascardo [Fri, 13 Dec 2019 10:39:02 +0000 (07:39 -0300)]
selftests: net: tls: remove recv_rcvbuf test
This test only works when [1] is applied, which was rejected.
Basically, the errors are reported and cleared. In this particular case of
tls sockets, following reads will block.
The test case was originally submitted with the rejected patch, but, then,
was included as part of a different patchset, possibly by mistake.
[1] https://lore.kernel.org/netdev/
20191007035323.4360-2-jakub.kicinski@netronome.com/#t
Thanks Paolo Pisati for pointing out the original patchset where this
appeared.
Fixes: 65190f77424d (selftests/tls: add a test for fragmented messages)
Reported-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Jakub Kicinski [Sun, 15 Dec 2019 01:16:12 +0000 (17:16 -0800)]
Merge branch 'gtp-fix-several-bugs-in-gtp-module'
Taehee Yoo says:
====================
gtp: fix several bugs in gtp module
This patchset fixes several bugs in the GTP module.
1. Do not allow adding duplicate TID and ms_addr pdp context.
In the current code, duplicate TID and ms_addr pdp context could be added.
So, RX and TX path could find correct pdp context.
2. Fix wrong condition in ->dumpit() callback.
->dumpit() callback is re-called if dump packet size is too big.
So, before return, it saves last position and then restart from
last dump position.
TID value is used to find last dump position.
GTP module allows adding zero TID value. But ->dumpit() callback ignores
zero TID value.
So, dump would not work correctly if dump packet size too big.
3. Fix use-after-free in ipv4_pdp_find().
RX and TX patch always uses gtp->tid_hash and gtp->addr_hash.
but while packet processing, these hash pointer would be freed.
So, use-after-free would occur.
4. Fix panic because of zero size hashtable
GTP hashtable size could be set by user-space.
If hashsize is set to 0, hashtable will not work and panic will occur.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Taehee Yoo [Wed, 11 Dec 2019 08:23:48 +0000 (08:23 +0000)]
gtp: avoid zero size hashtable
GTP default hashtable size is 1024 and userspace could set specific
hashtable size with IFLA_GTP_PDP_HASHSIZE. If hashtable size is set to 0
from userspace, hashtable will not work and panic will occur.
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Taehee Yoo [Wed, 11 Dec 2019 08:23:34 +0000 (08:23 +0000)]
gtp: fix an use-after-free in ipv4_pdp_find()
ipv4_pdp_find() is called in TX packet path of GTP.
ipv4_pdp_find() internally uses gtp->tid_hash to lookup pdp context.
In the current code, gtp->tid_hash and gtp->addr_hash are freed by
->dellink(), which is gtp_dellink().
But gtp_dellink() would be called while packets are processing.
So, gtp_dellink() should not free gtp->tid_hash and gtp->addr_hash.
Instead, dev->priv_destructor() would be used because this callback
is called after all packet processing safely.
Test commands:
ip link add veth1 type veth peer name veth2
ip a a 172.0.0.1/24 dev veth1
ip link set veth1 up
ip a a 172.99.0.1/32 dev lo
gtp-link add gtp1 &
gtp-tunnel add gtp1 v1 200 100 172.99.0.2 172.0.0.2
ip r a 172.99.0.2/32 dev gtp1
ip link set gtp1 mtu 1500
ip netns add ns2
ip link set veth2 netns ns2
ip netns exec ns2 ip a a 172.0.0.2/24 dev veth2
ip netns exec ns2 ip link set veth2 up
ip netns exec ns2 ip a a 172.99.0.2/32 dev lo
ip netns exec ns2 ip link set lo up
ip netns exec ns2 gtp-link add gtp2 &
ip netns exec ns2 gtp-tunnel add gtp2 v1 100 200 172.99.0.1 172.0.0.1
ip netns exec ns2 ip r a 172.99.0.1/32 dev gtp2
ip netns exec ns2 ip link set gtp2 mtu 1500
hping3 172.99.0.2 -2 --flood &
ip link del gtp1
Splat looks like:
[ 72.568081][ T1195] BUG: KASAN: use-after-free in ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.568916][ T1195] Read of size 8 at addr
ffff8880b9a35d28 by task hping3/1195
[ 72.569631][ T1195]
[ 72.569861][ T1195] CPU: 2 PID: 1195 Comm: hping3 Not tainted 5.5.0-rc1 #199
[ 72.570547][ T1195] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 72.571438][ T1195] Call Trace:
[ 72.571764][ T1195] dump_stack+0x96/0xdb
[ 72.572171][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.572761][ T1195] print_address_description.constprop.5+0x1be/0x360
[ 72.573400][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.573971][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.574544][ T1195] __kasan_report+0x12a/0x16f
[ 72.575014][ T1195] ? ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.575593][ T1195] kasan_report+0xe/0x20
[ 72.576004][ T1195] ipv4_pdp_find.isra.12+0x130/0x170 [gtp]
[ 72.576577][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ ... ]
[ 72.647671][ T1195] BUG: unable to handle page fault for address:
ffff8880b9a35d28
[ 72.648512][ T1195] #PF: supervisor read access in kernel mode
[ 72.649158][ T1195] #PF: error_code(0x0000) - not-present page
[ 72.649849][ T1195] PGD
a6c01067 P4D
a6c01067 PUD
11fb07067 PMD
11f939067 PTE
800fffff465ca060
[ 72.652958][ T1195] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 72.653834][ T1195] CPU: 2 PID: 1195 Comm: hping3 Tainted: G B 5.5.0-rc1 #199
[ 72.668062][ T1195] RIP: 0010:ipv4_pdp_find.isra.12+0x86/0x170 [gtp]
[ ... ]
[ 72.679168][ T1195] Call Trace:
[ 72.679603][ T1195] gtp_build_skb_ip4+0x199/0x1420 [gtp]
[ 72.681915][ T1195] ? ipv4_pdp_find.isra.12+0x170/0x170 [gtp]
[ 72.682513][ T1195] ? lock_acquire+0x164/0x3b0
[ 72.682966][ T1195] ? gtp_dev_xmit+0x35e/0x890 [gtp]
[ 72.683481][ T1195] gtp_dev_xmit+0x3c2/0x890 [gtp]
[ ... ]
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Taehee Yoo [Wed, 11 Dec 2019 08:23:17 +0000 (08:23 +0000)]
gtp: fix wrong condition in gtp_genl_dump_pdp()
gtp_genl_dump_pdp() is ->dumpit() callback of GTP module and it is used
to dump pdp contexts. it would be re-executed because of dump packet size.
If dump packet size is too big, it saves current dump pointer
(gtp interface pointer, bucket, TID value) then it restarts dump from
last pointer.
Current GTP code allows adding zero TID pdp context but dump code
ignores zero TID value. So, last dump pointer will not be found.
In addition, this patch adds missing rcu_read_lock() in
gtp_genl_dump_pdp().
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Taehee Yoo [Wed, 11 Dec 2019 08:23:00 +0000 (08:23 +0000)]
gtp: do not allow adding duplicate tid and ms_addr pdp context
GTP RX packet path lookups pdp context with TID. If duplicate TID pdp
contexts are existing in the list, it couldn't select correct pdp context.
So, TID value should be unique.
GTP TX packet path lookups pdp context with ms_addr. If duplicate ms_addr pdp
contexts are existing in the list, it couldn't select correct pdp context.
So, ms_addr value should be unique.
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Mahesh Bandewar [Fri, 6 Dec 2019 23:44:55 +0000 (15:44 -0800)]
bonding: fix active-backup transition after link failure
After the recent fix in commit
1899bb325149 ("bonding: fix state
transition issue in link monitoring"), the active-backup mode with
miimon initially come-up fine but after a link-failure, both members
transition into backup state.
Following steps to reproduce the scenario (eth1 and eth2 are the
slaves of the bond):
ip link set eth1 up
ip link set eth2 down
sleep 1
ip link set eth2 up
ip link set eth1 down
cat /sys/class/net/eth1/bonding_slave/state
cat /sys/class/net/eth2/bonding_slave/state
Fixes: 1899bb325149 ("bonding: fix state transition issue in link monitoring")
CC: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Jakub Kicinski [Sat, 14 Dec 2019 21:01:17 +0000 (13:01 -0800)]
Merge branch 'bnx2x-bug-fixes'
Manish Chopra says:
====================
bnx2x: bug fixes
This series has two driver changes, one to fix some unexpected
hardware behaviour casued during the parity error recovery in
presence of SR-IOV VFs and another one related for fixing resource
management in the driver among the PFs configured on an engine.
Please consider applying it to "net".
V1->V2:
=======
Fix the compilation errors reported by kbuild test robot
on the patch #1 with CONFIG_BNX2X_SRIOV=n
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Manish Chopra [Wed, 11 Dec 2019 17:59:56 +0000 (09:59 -0800)]
bnx2x: Fix logic to get total no. of PFs per engine
Driver doesn't calculate total number of PFs configured on a
given engine correctly which messed up resources in the PFs
loaded on that engine, leading driver to exceed configuration
of resources (like vlan filters etc.) beyond the limit per
engine, which ended up with asserts from the firmware.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Manish Chopra [Wed, 11 Dec 2019 17:59:55 +0000 (09:59 -0800)]
bnx2x: Do not handle requests from VFs after parity
Parity error from the hardware will cause PF to lose the state
of their VFs due to PF's internal reload and hardware reset following
the parity error. Restrict any configuration request from the VFs after
the parity as it could cause unexpected hardware behavior, only way
for VFs to recover would be to trigger FLR on VFs and reload them.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Arnd Bergmann [Wed, 11 Dec 2019 12:56:10 +0000 (13:56 +0100)]
net: ethernet: ti: build cpsw-common for switchdev
Without the common part of the driver, the new file fails to link:
drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_probe':
cpsw_new.c:(.text+0x312c): undefined reference to `ti_cm_get_macid'
Use the same Makefile hack as before, and build cpsw-common.o for
any driver that needs it.
Fixes: ed3525eda4c4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Arnd Bergmann [Wed, 11 Dec 2019 12:56:09 +0000 (13:56 +0100)]
net: ethernet: ti: select PAGE_POOL for switchdev driver
The new driver misses a dependency:
drivers/net/ethernet/ti/cpsw_new.o: In function `cpsw_rx_handler':
cpsw_new.c:(.text+0x259c): undefined reference to `__page_pool_put_page'
cpsw_new.c:(.text+0x25d0): undefined reference to `page_pool_alloc_pages'
drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_fill_rx_channels':
cpsw_priv.c:(.text+0x22d8): undefined reference to `page_pool_alloc_pages'
cpsw_priv.c:(.text+0x2420): undefined reference to `__page_pool_put_page'
drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_create_xdp_rxqs':
cpsw_priv.c:(.text+0x2624): undefined reference to `page_pool_create'
drivers/net/ethernet/ti/cpsw_priv.o: In function `cpsw_run_xdp':
cpsw_priv.c:(.text+0x2dc8): undefined reference to `__page_pool_put_page'
Other drivers use 'select' for PAGE_POOL, so do the same here.
Fixes: ed3525eda4c4 ("net: ethernet: ti: introduce cpsw switchdev based driver part 1 - dual-emac")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Haiyang Zhang [Wed, 11 Dec 2019 22:26:27 +0000 (14:26 -0800)]
hv_netvsc: Fix tx_table init in rndis_set_subchannel()
Host can provide send indirection table messages anytime after RSS is
enabled by calling rndis_filter_set_rss_param(). So the host provided
table values may be overwritten by the initialization in
rndis_set_subchannel().
To prevent this problem, move the tx_table initialization before calling
rndis_filter_set_rss_param().
Fixes: a6fb6aa3cfa9 ("hv_netvsc: Set tx_table to equal weight after subchannels open")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Russell King [Tue, 10 Dec 2019 22:33:05 +0000 (22:33 +0000)]
net: marvell: mvpp2: phylink requires the link interrupt
phylink requires the MAC to report when its link status changes when
operating in inband modes. Failure to report link status changes
means that phylink has no idea when the link events happen, which
results in either the network interface's carrier remaining up or
remaining permanently down.
For example, with a fiber module, if the interface is brought up and
link is initially established, taking the link down at the far end
will cut the optical power. The SFP module's LOS asserts, we
deactivate the link, and the network interface reports no carrier.
When the far end is brought back up, the SFP module's LOS deasserts,
but the MAC may be slower to establish link. If this happens (which
in my tests is a certainty) then phylink never hears that the MAC
has established link with the far end, and the network interface is
stuck reporting no carrier. This means the interface is
non-functional.
Avoiding the link interrupt when we have phylink is basically not
an option, so remove the !port->phylink from the test.
Fixes: 4bb043262878 ("net: mvpp2: phylink support")
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de>
Tested-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Jakub Kicinski [Sat, 14 Dec 2019 17:57:36 +0000 (09:57 -0800)]
Merge branch 'tcp-take-care-of-empty-skbs-in-write-queue'
Eric Dumazet says:
====================
tcp: take care of empty skbs in write queue
We understood recently that TCP sockets could have an empty
skb at the tail of the write queue, leading to various problems.
This patch series :
1) Make sure we do not send an empty packet since this
was unintended and causing crashes in old kernels.
2) Change tcp_write_queue_empty() to not be fooled by
the presence of an empty skb.
3) Fix a bug that could trigger suboptimal epoll()
application behavior under memory pressure.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Thu, 12 Dec 2019 20:55:31 +0000 (12:55 -0800)]
tcp: refine rule to allow EPOLLOUT generation under mem pressure
At the time commit
ce5ec440994b ("tcp: ensure epoll edge trigger
wakeup when write queue is empty") was added to the kernel,
we still had a single write queue, combining rtx and write queues.
Once we moved the rtx queue into a separate rb-tree, testing
if sk_write_queue is empty has been suboptimal.
Indeed, if we have packets in the rtx queue, we probably want
to delay the EPOLLOUT generation at the time incoming packets
will free them, making room, but more importantly avoiding
flooding application with EPOLLOUT events.
Solution is to use tcp_rtx_and_write_queues_empty() helper.
Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Thu, 12 Dec 2019 20:55:30 +0000 (12:55 -0800)]
tcp: refine tcp_write_queue_empty() implementation
Due to how tcp_sendmsg() is implemented, we can have an empty
skb at the tail of the write queue.
Most [1] tcp_write_queue_empty() callers want to know if there is
anything to send (payload and/or FIN)
Instead of checking if the sk_write_queue is empty, we need
to test if tp->write_seq == tp->snd_nxt
[1] tcp_send_fin() was the only caller that expected to
see if an skb was in the write queue, I have changed the code
to reuse the tcp_write_queue_tail() result.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Thu, 12 Dec 2019 20:55:29 +0000 (12:55 -0800)]
tcp: do not send empty skb from tcp_write_xmit()
Backport of commit
fdfc5c8594c2 ("tcp: remove empty skb from
write queue in error cases") in linux-4.14 stable triggered
various bugs. One of them has been fixed in commit
ba2ddb43f270
("tcp: Don't dequeue SYN/FIN-segments from write-queue"), but
we still have crashes in some occasions.
Root-cause is that when tcp_sendmsg() has allocated a fresh
skb and could not append a fragment before being blocked
in sk_stream_wait_memory(), tcp_write_xmit() might be called
and decide to send this fresh and empty skb.
Sending an empty packet is not only silly, it might have caused
many issues we had in the past with tp->packets_out being
out of sync.
Fixes: c65f7f00c587 ("[TCP]: Simplify SKB data portion allocation with NETIF_F_SG.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Cc: Jason Baron <jbaron@akamai.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Thu, 12 Dec 2019 18:32:13 +0000 (10:32 -0800)]
6pack,mkiss: fix possible deadlock
We got another syzbot report [1] that tells us we must use
write_lock_irq()/write_unlock_irq() to avoid possible deadlock.
[1]
WARNING: inconsistent lock state
5.5.0-rc1-syzkaller #0 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage.
syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes:
ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
__raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline]
_raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319
sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657
tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489
tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585
tiocsetd drivers/tty/tty_io.c:2337 [inline]
tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597
vfs_ioctl fs/ioctl.c:47 [inline]
file_ioctl fs/ioctl.c:545 [inline]
do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732
ksys_ioctl+0xab/0xd0 fs/ioctl.c:749
__do_sys_ioctl fs/ioctl.c:756 [inline]
__se_sys_ioctl fs/ioctl.c:754 [inline]
__x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
irq event stamp: 3946
hardirqs last enabled at (3945): [<
ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
hardirqs last enabled at (3945): [<
ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199
hardirqs last disabled at (3946): [<
ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42
softirqs last enabled at (2658): [<
ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline]
softirqs last enabled at (2658): [<
ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222
softirqs last disabled at (2656): [<
ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline]
softirqs last disabled at (2656): [<
ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(disc_data_lock);
<Interrupt>
lock(disc_data_lock);
*** DEADLOCK ***
5 locks held by syz-executor826/9605:
#0:
ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
#1:
ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413
#2:
ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
#2:
ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116
#3:
ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823
#4:
ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288
stack backtrace:
CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x197/0x210 lib/dump_stack.c:118
print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101
valid_state kernel/locking/lockdep.c:3112 [inline]
mark_lock_irq kernel/locking/lockdep.c:3309 [inline]
mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666
mark_usage kernel/locking/lockdep.c:3554 [inline]
__lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909
lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
__raw_read_lock include/linux/rwlock_api_smp.h:149 [inline]
_raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223
sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138
sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402
tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536
tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50
tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387
uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104
serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761
serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834
serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline]
serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850
serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126
__handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149
handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189
handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206
handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830
generic_handle_irq_desc include/linux/irqdesc.h:156 [inline]
do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250
common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607
</IRQ>
RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline]
RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579
Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7
RSP: 0018:
ffffc90001eafa20 EFLAGS:
00000246 ORIG_RAX:
ffffffffffffffd7
RAX:
0000000000000000 RBX:
ffff88809fd9e0c0 RCX:
1ffffffff13266dd
RDX:
0000000000000000 RSI:
0000000000000008 RDI:
0000000000000000
RBP:
ffffc90001eafa60 R08:
1ffff11013d22898 R09:
ffffed1013d22899
R10:
ffffed1013d22898 R11:
ffff88809e9144c7 R12:
ffff8880a905e138
R13:
ffff88809e9144c0 R14:
0000000000000000 R15:
dffffc0000000000
mutex_optimistic_spin kernel/locking/mutex.c:673 [inline]
__mutex_lock_common kernel/locking/mutex.c:962 [inline]
__mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121
tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19
tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665
__fput+0x2ff/0x890 fs/file_table.c:280
____fput+0x16/0x20 fs/file_table.c:313
task_work_run+0x145/0x1c0 kernel/task_work.c:113
exit_task_work include/linux/task_work.h:22 [inline]
do_exit+0x8e7/0x2ef0 kernel/exit.c:797
do_group_exit+0x135/0x360 kernel/exit.c:895
__do_sys_exit_group kernel/exit.c:906 [inline]
__se_sys_exit_group kernel/exit.c:904 [inline]
__x64_sys_exit_group+0x44/0x50 kernel/exit.c:904
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x43fef8
Code: Bad RIP value.
RSP: 002b:
00007ffdb07d2338 EFLAGS:
00000246 ORIG_RAX:
00000000000000e7
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
000000000043fef8
RDX:
0000000000000000 RSI:
000000000000003c RDI:
0000000000000000
RBP:
00000000004bf730 R08:
00000000000000e7 R09:
ffffffffffffffd0
R10:
00000000004002c8 R11:
0000000000000246 R12:
0000000000000001
R13:
00000000006d1180 R14:
0000000000000000 R15:
0000000000000000
Fixes: 6e4e2f811bad ("6pack,mkiss: fix lock inconsistency")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Eric Dumazet [Sat, 14 Dec 2019 02:20:41 +0000 (18:20 -0800)]
tcp/dccp: fix possible race __inet_lookup_established()
Michal Kubecek and Firo Yang did a very nice analysis of crashes
happening in __inet_lookup_established().
Since a TCP socket can go from TCP_ESTABLISH to TCP_LISTEN
(via a close()/socket()/listen() cycle) without a RCU grace period,
I should not have changed listeners linkage in their hash table.
They must use the nulls protocol (Documentation/RCU/rculist_nulls.txt),
so that a lookup can detect a socket in a hash list was moved in
another one.
Since we added code in commit
d296ba60d8e2 ("soreuseport: Resolve
merge conflict for v4/v6 ordering fix"), we have to add
hlist_nulls_add_tail_rcu() helper.
Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Reported-by: Firo Yang <firo.yang@suse.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Link: https://lore.kernel.org/netdev/20191120083919.GH27852@unicorn.suse.cz/
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Thomas Falcon [Wed, 11 Dec 2019 15:38:39 +0000 (09:38 -0600)]
net/ibmvnic: Fix typo in retry check
This conditional is missing a bang, with the intent
being to break when the retry count reaches zero.
Fixes: 476d96ca9cc5 ("ibmvnic: Bound waits for device queries")
Suggested-by: Juliet Kim <julietk@linux.vnet.ibm.com>
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Hangbin Liu [Wed, 11 Dec 2019 14:20:16 +0000 (22:20 +0800)]
ipv6/addrconf: only check invalid header values when NETLINK_F_STRICT_CHK is set
In commit
4b1373de73a3 ("net: ipv6: addr: perform strict checks also for
doit handlers") we add strict check for inet6_rtm_getaddr(). But we did
the invalid header values check before checking if NETLINK_F_STRICT_CHK
is set. This may break backwards compatibility if user already set the
ifm->ifa_prefixlen, ifm->ifa_flags, ifm->ifa_scope in their netlink code.
I didn't move the nlmsg_len check because I thought it's a valid check.
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 4b1373de73a3 ("net: ipv6: addr: perform strict checks also for doit handlers")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Arnd Bergmann [Tue, 10 Dec 2019 19:56:34 +0000 (20:56 +0100)]
ptp: clockmatrix: add I2C dependency
Without I2C, we get a link failure:
drivers/ptp/ptp_clockmatrix.o: In function `idtcm_xfer.isra.3':
ptp_clockmatrix.c:(.text+0xcc): undefined reference to `i2c_transfer'
drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_init':
ptp_clockmatrix.c:(.init.text+0x14): undefined reference to `i2c_register_driver'
drivers/ptp/ptp_clockmatrix.o: In function `idtcm_driver_exit':
ptp_clockmatrix.c:(.exit.text+0x10): undefined reference to `i2c_del_driver'
Fixes: 3a6ba7dc7799 ("ptp: Add a ptp clock driver for IDT ClockMatrix.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Jonathan Lemon [Tue, 10 Dec 2019 16:39:46 +0000 (08:39 -0800)]
bnxt: apply computed clamp value for coalece parameter
After executing "ethtool -C eth0 rx-usecs-irq 0", the box becomes
unresponsive, likely due to interrupt livelock. It appears that
a minimum clamp value for the irq timer is computed, but is never
applied.
Fix by applying the corrected clamp value.
Fixes: 74706afa712d ("bnxt_en: Update interrupt coalescing logic.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Vivien Didelot [Thu, 12 Dec 2019 17:59:08 +0000 (12:59 -0500)]
mailmap: add entry for myself
I no longer work at Savoir-faire Linux but even though MAINTAINERS is
up-to-date, some emails are still sent to my old email address.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Subash Abhinov Kasiviswanathan [Thu, 12 Dec 2019 04:00:15 +0000 (04:00 +0000)]
MAINTAINERS: Add maintainers for rmnet
Add myself and Sean as maintainers for rmnet driver.
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Manish Chopra [Thu, 12 Dec 2019 14:49:28 +0000 (06:49 -0800)]
qede: Fix multicast mac configuration
Driver doesn't accommodate the configuration for max number
of multicast mac addresses, in such particular case it leaves
the device with improper/invalid multicast configuration state,
causing connectivity issues (in lacp bonding like scenarios).
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cristian Birsan [Thu, 12 Dec 2019 11:52:47 +0000 (13:52 +0200)]
net: usb: lan78xx: Fix suspend/resume PHY register access error
Lan78xx driver accesses the PHY registers through MDIO bus over USB
connection. When performing a suspend/resume, the PHY registers can be
accessed before the USB connection is resumed. This will generate an
error and will prevent the device to resume correctly.
This patch adds the dependency between the MDIO bus and USB device to
allow correct handling of suspend/resume.
Fixes: ce85e13ad6ef ("lan78xx: Update to use phylib instead of mii_if_info.")
Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 12 Dec 2019 04:13:45 +0000 (20:13 -0800)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
pull-request: bpf 2019-12-11
The following pull-request contains BPF updates for your *net* tree.
We've added 8 non-merge commits during the last 1 day(s) which contain
a total of 10 files changed, 126 insertions(+), 18 deletions(-).
The main changes are:
1) Make BPF trampoline co-exist with ftrace-based tracers, from Alexei.
2) Fix build in minimal configurations, from Arnd.
3) Fix mips, riscv bpf_tail_call limit, from Paul.
4) Fix bpftool segfault, from Toke.
5) Fix samples/bpf, from Daniel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel T. Lee [Thu, 5 Dec 2019 08:01:14 +0000 (17:01 +0900)]
samples: bpf: fix syscall_tp due to unused syscall
Currently, open() is called from the user program and it calls the syscall
'sys_openat', not the 'sys_open'. This leads to an error of the program
of user side, due to the fact that the counter maps are zero since no
function such 'sys_open' is called.
This commit adds the kernel bpf program which are attached to the
tracepoint 'sys_enter_openat' and 'sys_enter_openat'.
Fixes: 1da236b6be963 ("bpf: add a test case for syscalls/sys_{enter|exit}_* tracepoints")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Daniel T. Lee [Thu, 5 Dec 2019 08:01:13 +0000 (17:01 +0900)]
samples: bpf: Replace symbol compare of trace_event
Previously, when this sample is added, commit
1c47910ef8013
("samples/bpf: add perf_event+bpf example"), a symbol 'sys_read' and
'sys_write' has been used without no prefixes. But currently there are
no exact symbols with these under kallsyms and this leads to failure.
This commit changes exact compare to substring compare to keep compatible
with exact symbol or prefixed symbol.
Fixes: 1c47910ef8013 ("samples/bpf: add perf_event+bpf example")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191205080114.19766-2-danieltimlee@gmail.com
Alexei Starovoitov [Mon, 9 Dec 2019 00:01:14 +0000 (16:01 -0800)]
selftests/bpf: Test function_graph tracer and bpf trampoline together
Add simple test script to execute funciton graph tracer while BPF trampoline
attaches and detaches from the functions being graph traced.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191209000114.1876138-4-ast@kernel.org
Alexei Starovoitov [Mon, 9 Dec 2019 00:01:13 +0000 (16:01 -0800)]
bpf: Make BPF trampoline use register_ftrace_direct() API
Make BPF trampoline attach its generated assembly code to kernel functions via
register_ftrace_direct() API. It helps ftrace-based tracers co-exist with BPF
trampoline on the same kernel function. It also switches attaching logic from
arch specific text_poke to generic ftrace that is available on many
architectures. text_poke is still necessary for bpf-to-bpf attach and for
bpf_tail_call optimization.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191209000114.1876138-3-ast@kernel.org
Toke Høiland-Jørgensen [Tue, 10 Dec 2019 18:14:12 +0000 (19:14 +0100)]
bpftool: Don't crash on missing jited insns or ksyms
When the kptr_restrict sysctl is set, the kernel can fail to return
jited_ksyms or jited_prog_insns, but still have positive values in
nr_jited_ksyms and jited_prog_len. This causes bpftool to crash when
trying to dump the program because it only checks the len fields not
the actual pointers to the instructions and ksyms.
Fix this by adding the missing checks.
Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Fixes: f84192ee00b7 ("tools: bpftool: resolve calls without using imm field")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191210181412.151226-1-toke@redhat.com
Arnd Bergmann [Tue, 10 Dec 2019 20:35:46 +0000 (21:35 +0100)]
bpf: Fix build in minimal configurations, again
Building with -Werror showed another failure:
kernel/bpf/btf.c: In function 'btf_get_prog_ctx_type.isra.31':
kernel/bpf/btf.c:3508:63: error: array subscript 0 is above array bounds of 'u8[0]' {aka 'unsigned char[0]'} [-Werror=array-bounds]
ctx_type = btf_type_member(conv_struct) + bpf_ctx_convert_map[prog_type] * 2;
I don't actually understand why the array is empty, but a similar
fix has addressed a related problem, so I suppose we can do the
same thing here.
Fixes: ce27709b8162 ("bpf: Fix build in minimal configurations")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191210203553.2941035-1-arnd@arndb.de
Paul Chaignon [Mon, 9 Dec 2019 18:52:52 +0000 (19:52 +0100)]
bpf, mips: Limit to 33 tail calls
All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls
limit at runtime. In addition, a test was recently added, in tailcalls2,
to check this limit.
This patch updates the tail call limit in MIPS' JIT compiler to allow
33 tail calls.
Fixes: b6bd53f9c4e8 ("MIPS: Add missing file for eBPF JIT.")
Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/b8eb2caac1c25453c539248e56ca22f74b5316af.1575916815.git.paul.chaignon@gmail.com
Paul Chaignon [Mon, 9 Dec 2019 18:52:07 +0000 (19:52 +0100)]
bpf, riscv: Limit to 33 tail calls
All BPF JIT compilers except RISC-V's and MIPS' enforce a 33-tail calls
limit at runtime. In addition, a test was recently added, in tailcalls2,
to check this limit.
This patch updates the tail call limit in RISC-V's JIT compiler to allow
33 tail calls. I tested it using the above selftest on an emulated
RISCV64.
Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G")
Reported-by: Mahshid Khezri <khezri.mahshid@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/966fe384383bf23a0ee1efe8d7291c78a3fb832b.1575916815.git.paul.chaignon@gmail.com
Netanel Belgazal [Tue, 10 Dec 2019 11:27:44 +0000 (11:27 +0000)]
net: ena: fix napi handler misbehavior when the napi budget is zero
In netpoll the napi handler could be called with budget equal to zero.
Current ENA napi handler doesn't take that into consideration.
The napi handler handles Rx packets in a do-while loop.
Currently, the budget check happens only after decrementing the
budget, therefore the napi handler, in rare cases, could run over
MAX_INT packets.
In addition to that, this moves all budget related variables to int
calculation and stop mixing u32 to avoid ambiguity
Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2019 01:45:04 +0000 (17:45 -0800)]
Merge branch 'tipc-fix-some-issues'
Tuong Lien says:
====================
tipc: fix some issues
This series consists of some bug-fixes for TIPC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tuong Lien [Tue, 10 Dec 2019 08:21:05 +0000 (15:21 +0700)]
tipc: fix use-after-free in tipc_disc_rcv()
In the function 'tipc_disc_rcv()', the 'msg_peer_net_hash()' is called
to read the header data field but after the message skb has been freed,
that might result in a garbage value...
This commit fixes it by defining a new local variable to store the data
first, just like the other header fields' handling.
Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tuong Lien [Tue, 10 Dec 2019 08:21:04 +0000 (15:21 +0700)]
tipc: fix retrans failure due to wrong destination
When a user message is sent, TIPC will check if the socket has faced a
congestion at link layer. If that happens, it will make a sleep to wait
for the congestion to disappear. This leaves a gap for other users to
take over the socket (e.g. multi threads) since the socket is released
as well. Also, in case of connectionless (e.g. SOCK_RDM), user is free
to send messages to various destinations (e.g. via 'sendto()'), then
the socket's preformatted header has to be updated correspondingly
prior to the actual payload message building.
Unfortunately, the latter action is done before the first action which
causes a condition issue that the destination of a certain message can
be modified incorrectly in the middle, leading to wrong destination
when that message is built. Consequently, when the message is sent to
the link layer, it gets stuck there forever because the peer node will
simply reject it. After a number of retransmission attempts, the link
is eventually taken down and the retransmission failure is reported.
This commit fixes the problem by rearranging the order of actions to
prevent the race condition from occurring, so the message building is
'atomic' and its header will not be modified by anyone.
Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tuong Lien [Tue, 10 Dec 2019 08:21:03 +0000 (15:21 +0700)]
tipc: fix potential hanging after b/rcast changing
In commit
c55c8edafa91 ("tipc: smooth change between replicast and
broadcast"), we allow instant switching between replicast and broadcast
by sending a dummy 'SYN' packet on the last used link to synchronize
packets on the links. The 'SYN' message is an object of link congestion
also, so if that happens, a 'SOCK_WAKEUP' will be scheduled to be sent
back to the socket...
However, in that commit, we simply use the same socket 'cong_link_cnt'
counter for both the 'SYN' & normal payload message sending. Therefore,
if both the replicast & broadcast links are congested, the counter will
be not updated correctly but overwritten by the latter congestion.
Later on, when the 'SOCK_WAKEUP' messages are processed, the counter is
reduced one by one and eventually overflowed. Consequently, further
activities on the socket will only wait for the false congestion signal
to disappear but never been met.
Because sending the 'SYN' message is vital for the mechanism, it should
be done anyway. This commit fixes the issue by marking the message with
an error code e.g. 'TIPC_ERR_NO_PORT', so its sending should not face a
link congestion, there is no need to touch the socket 'cong_link_cnt'
either. In addition, in the event of any error (e.g. -ENOBUFS), we will
purge the entire payload message queue and make a return immediately.
Fixes: c55c8edafa91 ("tipc: smooth change between replicast and broadcast")
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tuong Lien [Tue, 10 Dec 2019 08:21:02 +0000 (15:21 +0700)]
tipc: fix name table rbtree issues
The current rbtree for service ranges in the name table is built based
on the 'lower' & 'upper' range values resulting in a flaw in the rbtree
searching. Some issues have been observed in case of range overlapping:
Case #1: unable to withdraw a name entry:
After some name services are bound, all of them are withdrawn by user
but one remains in the name table forever. This corrupts the table and
that service becomes dummy i.e. no real port.
E.g.
/
{22, 22}
/
/
---> {10, 50}
/ \
/ \
{10, 30} {20, 60}
The node {10, 30} cannot be removed since the rbtree searching stops at
the node's ancestor i.e. {10, 50}, so starting from it will never reach
the finding node.
Case #2: failed to send data in some cases:
E.g. Two service ranges: {20, 60}, {10, 50} are bound. The rbtree for
this service will be one of the two cases below depending on the order
of the bindings:
{20, 60} {10, 50} <--
/ \ / \
/ \ / \
{10, 50} NIL <-- NIL {20, 60}
(a) (b)
Now, try to send some data to service {30}, there will be two results:
(a): Failed, no route to host.
(b): Ok.
The reason is that the rbtree searching will stop at the pointing node
as shown above.
Case #3: Same as case #2b above but if the data sending's scope is
local and the {10, 50} is published by a peer node, then it will result
in 'no route to host' even though the other {20, 60} is for example on
the local node which should be able to get the data.
The issues are actually due to the way we built the rbtree. This commit
fixes it by introducing an additional field to each node - named 'max',
which is the largest 'upper' of that node subtree. The 'max' value for
each subtrees will be propagated correctly whenever a node is inserted/
removed or the tree is rebalanced by the augmented rbtree callbacks.
By this way, we can change the rbtree searching appoarch to solve the
issues above. Another benefit from this is that we can now improve the
searching for a next range matching e.g. in case of multicast, so get
rid of the unneeded looping over all nodes in the tree.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 11 Dec 2019 01:37:14 +0000 (17:37 -0800)]
Merge branch 'bnxt_en-Error-recovery-fixes'
Michael Chan says:
====================
bnxt_en: Error recovery fixes.
This patch series contains fixes mostly for the error recovery feature
and related areas. Please queue the series for -stable also. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 10 Dec 2019 07:49:13 +0000 (02:49 -0500)]
bnxt_en: Add missing devlink health reporters for VFs.
The VF driver also needs to create the health reporters since
VFs are also involved in firmware reset and recovery. Modify
bnxt_dl_register() and bnxt_dl_unregister() so that they can
be called by the VFs to register/unregister devlink. Only the PF
will register the devlink parameters. With devlink registered,
we can now create the health reporters on the VFs.
Fixes: 6763c779c2d8 ("bnxt_en: Add new FW devlink_health_reporter")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 10 Dec 2019 07:49:12 +0000 (02:49 -0500)]
bnxt_en: Fix the logic that creates the health reporters.
Fix the logic to properly check the fw capabilities and create the
devlink health reporters only when needed. The current code creates
the reporters unconditionally as long as bp->fw_health is valid, and
that's not correct.
Call bnxt_dl_fw_reporters_create() directly from the init and reset
code path instead of from bnxt_dl_register(). This allows the
reporters to be adjusted when capabilities change. The same
applies to bnxt_dl_fw_reporters_destroy().
Fixes: 6763c779c2d8 ("bnxt_en: Add new FW devlink_health_reporter")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 10 Dec 2019 07:49:11 +0000 (02:49 -0500)]
bnxt_en: Remove unnecessary NULL checks for fw_health
After fixing the allocation of bp->fw_health in the previous patch,
the driver will not go through the fw reset and recovery code paths
if bp->fw_health allocation fails. So we can now remove the
unnecessary NULL checks.
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 10 Dec 2019 07:49:10 +0000 (02:49 -0500)]
bnxt_en: Fix bp->fw_health allocation and free logic.
bp->fw_health needs to be allocated for either the firmware initiated
reset feature or the driver initiated error recovery feature. The
current code is not allocating bp->fw_health for all the necessary cases.
This patch corrects the logic to allocate bp->fw_health correctly when
needed. If allocation fails, we clear the feature flags.
We also add the the missing kfree(bp->fw_health) when the driver is
unloaded. If we get an async reset message from the firmware, we also
need to make sure that we have a valid bp->fw_health before proceeding.
Fixes: 07f83d72d238 ("bnxt_en: Discover firmware error recovery capabilities.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Tue, 10 Dec 2019 07:49:09 +0000 (02:49 -0500)]
bnxt_en: Return error if FW returns more data than dump length
If any change happened in the configuration of VF in VM while
collecting live dump, there could be a race and firmware can return
more data than allocated dump length. Fix it by keeping track of
the accumulated core dump length copied so far and abort the copy
with error code if the next chunk of core dump will exceed the
original dump length.
Fixes: 6c5657d085ae ("bnxt_en: Add support for ethtool get dump.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 10 Dec 2019 07:49:08 +0000 (02:49 -0500)]
bnxt_en: Free context memory in the open path if firmware has been reset.
This will trigger new context memory to be rediscovered and allocated
during the re-probe process after a firmware reset. Without this, the
newly reset firmware does not have valid context memory and the driver
will eventually fail to allocate some resources.
Fixes: ec5d31e3c15d ("bnxt_en: Handle firmware reset status during IF_UP.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 10 Dec 2019 07:49:07 +0000 (02:49 -0500)]
bnxt_en: Fix MSIX request logic for RDMA driver.
The logic needs to check both bp->total_irqs and the reserved IRQs in
hw_resc->resv_irqs if applicable and see if both are enough to cover
the L2 and RDMA requested vectors. The current code is only checking
bp->total_irqs and can fail in some code paths, such as the TX timeout
code path with the RDMA driver requesting vectors after recovery. In
this code path, we have not reserved enough MSIX resources for the
RDMA driver yet.
Fixes: 75720e6323a1 ("bnxt_en: Keep track of reserved IRQs.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephan Gerhold [Mon, 9 Dec 2019 18:53:43 +0000 (19:53 +0100)]
NFC: nxp-nci: Fix probing without ACPI
devm_acpi_dev_add_driver_gpios() returns -ENXIO if CONFIG_ACPI
is disabled (e.g. on device tree platforms).
In this case, nxp-nci will silently fail to probe.
The other NFC drivers only log a debug message if
devm_acpi_dev_add_driver_gpios() fails.
Do the same in nxp-nci to fix this problem.
Fixes: ad0acfd69add ("NFC: nxp-nci: Get rid of code duplication in ->probe()")
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Mon, 9 Dec 2019 16:58:52 +0000 (17:58 +0100)]
tc-testing: unbreak full listing of tdc testcases
the following command currently fails:
[root@fedora tc-testing]# ./tdc.py -l
The following test case IDs are not unique:
{'6f5e'}
Please correct them before continuing.
this happens because there are two tests having the same id:
[root@fedora tc-testing]# grep -r 6f5e tc-tests/*
tc-tests/actions/pedit.json: "id": "6f5e",
tc-tests/filters/basic.json: "id": "6f5e",
fix it replacing the latest duplicate id with a brand new one:
[root@fedora tc-testing]# sed -i 's/6f5e//1' tc-tests/filters/basic.json
[root@fedora tc-testing]# ./tdc.py -i
Fixes: 4717b05328ba ("tc-testing: Introduced tdc tests for basic filter")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chuhong Yuan [Mon, 9 Dec 2019 16:22:07 +0000 (00:22 +0800)]
fjes: fix missed check in fjes_acpi_add
fjes_acpi_add() misses a check for platform_device_register_simple().
Add a check to fix it.
Fixes: 658d439b2292 ("fjes: Introduce FUJITSU Extended Socket Network Device driver")
Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mao Wenan [Mon, 9 Dec 2019 13:31:25 +0000 (21:31 +0800)]
af_packet: set defaule value for tmo
There is softlockup when using TPACKET_V3:
...
NMI watchdog: BUG: soft lockup - CPU#2 stuck for 60010ms!
(__irq_svc) from [<
c0558a0c>] (_raw_spin_unlock_irqrestore+0x44/0x54)
(_raw_spin_unlock_irqrestore) from [<
c027b7e8>] (mod_timer+0x210/0x25c)
(mod_timer) from [<
c0549c30>]
(prb_retire_rx_blk_timer_expired+0x68/0x11c)
(prb_retire_rx_blk_timer_expired) from [<
c027a7ac>]
(call_timer_fn+0x90/0x17c)
(call_timer_fn) from [<
c027ab6c>] (run_timer_softirq+0x2d4/0x2fc)
(run_timer_softirq) from [<
c021eaf4>] (__do_softirq+0x218/0x318)
(__do_softirq) from [<
c021eea0>] (irq_exit+0x88/0xac)
(irq_exit) from [<
c0240130>] (msa_irq_exit+0x11c/0x1d4)
(msa_irq_exit) from [<
c0209cf0>] (handle_IPI+0x650/0x7f4)
(handle_IPI) from [<
c02015bc>] (gic_handle_irq+0x108/0x118)
(gic_handle_irq) from [<
c0558ee4>] (__irq_usr+0x44/0x5c)
...
If __ethtool_get_link_ksettings() is failed in
prb_calc_retire_blk_tmo(), msec and tmo will be zero, so tov_in_jiffies
is zero and the timer expire for retire_blk_timer is turn to
mod_timer(&pkc->retire_blk_timer, jiffies + 0),
which will trigger cpu usage of softirq is 100%.
Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.")
Tested-by: Xiao Jiangfeng <xiaojiangfeng@huawei.com>
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko [Mon, 9 Dec 2019 11:19:24 +0000 (13:19 +0200)]
net: ethernet: ti: davinci_cpdma: fix warning "device driver frees DMA memory with different size"
The TI CPSW(s) driver produces warning with DMA API debug options enabled:
WARNING: CPU: 0 PID: 1033 at kernel/dma/debug.c:1025 check_unmap+0x4a8/0x968
DMA-API: cpsw
48484000.ethernet: device driver frees DMA memory with different size
[device address=0x00000000abc6aa02] [map size=64 bytes] [unmap size=42 bytes]
CPU: 0 PID: 1033 Comm: ping Not tainted 5.3.0-dirty #41
Hardware name: Generic DRA72X (Flattened Device Tree)
[<
c0112c60>] (unwind_backtrace) from [<
c010d270>] (show_stack+0x10/0x14)
[<
c010d270>] (show_stack) from [<
c09bc564>] (dump_stack+0xd8/0x110)
[<
c09bc564>] (dump_stack) from [<
c013b93c>] (__warn+0xe0/0x10c)
[<
c013b93c>] (__warn) from [<
c013b9ac>] (warn_slowpath_fmt+0x44/0x6c)
[<
c013b9ac>] (warn_slowpath_fmt) from [<
c01e0368>] (check_unmap+0x4a8/0x968)
[<
c01e0368>] (check_unmap) from [<
c01e08a8>] (debug_dma_unmap_page+0x80/0x90)
[<
c01e08a8>] (debug_dma_unmap_page) from [<
c0752414>] (__cpdma_chan_free+0x114/0x16c)
[<
c0752414>] (__cpdma_chan_free) from [<
c07525c4>] (__cpdma_chan_process+0x158/0x17c)
[<
c07525c4>] (__cpdma_chan_process) from [<
c0753690>] (cpdma_chan_process+0x3c/0x5c)
[<
c0753690>] (cpdma_chan_process) from [<
c0758660>] (cpsw_tx_mq_poll+0x48/0x94)
[<
c0758660>] (cpsw_tx_mq_poll) from [<
c0803018>] (net_rx_action+0x108/0x4e4)
[<
c0803018>] (net_rx_action) from [<
c010230c>] (__do_softirq+0xec/0x598)
[<
c010230c>] (__do_softirq) from [<
c0143914>] (do_softirq.part.4+0x68/0x74)
[<
c0143914>] (do_softirq.part.4) from [<
c0143a44>] (__local_bh_enable_ip+0x124/0x17c)
[<
c0143a44>] (__local_bh_enable_ip) from [<
c0871590>] (ip_finish_output2+0x294/0xb7c)
[<
c0871590>] (ip_finish_output2) from [<
c0875440>] (ip_output+0x210/0x364)
[<
c0875440>] (ip_output) from [<
c0875e2c>] (ip_send_skb+0x1c/0xf8)
[<
c0875e2c>] (ip_send_skb) from [<
c08a7fd4>] (raw_sendmsg+0x9a8/0xc74)
[<
c08a7fd4>] (raw_sendmsg) from [<
c07d6b90>] (sock_sendmsg+0x14/0x24)
[<
c07d6b90>] (sock_sendmsg) from [<
c07d8260>] (__sys_sendto+0xbc/0x100)
[<
c07d8260>] (__sys_sendto) from [<
c01011ac>] (__sys_trace_return+0x0/0x14)
Exception stack(0xea9a7fa8 to 0xea9a7ff0)
...
The reason is that cpdma_chan_submit_si() now stores original buffer length
(sw_len) in CPDMA descriptor instead of adjusted buffer length (hw_len)
used to map the buffer.
Hence, fix an issue by passing correct buffer length in CPDMA descriptor.
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Fixes: 6670acacd59e ("net: ethernet: ti: davinci_cpdma: add dma mapped submit")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 9 Dec 2019 22:03:33 +0000 (14:03 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Wait for rcu grace period after releasing netns in ctnetlink,
from Florian Westphal.
2) Incorrect command type in flowtable offload ndo invocation,
from wenxu.
3) Incorrect callback type in flowtable offload flow tuple
updates, also from wenxu.
4) Fix compile warning on flowtable offload infrastructure due to
possible reference to uninitialized variable, from Nathan Chancellor.
5) Do not inline nf_ct_resolve_clash(), this is called from slow
path / stress situations. From Florian Westphal.
6) Missing IPv6 flow selector description in flowtable offload.
7) Missing check for NETDEV_UNREGISTER in nf_tables offload
infrastructure, from wenxu.
8) Update NAT selftest to use randomized netns names, from
Florian Westphal.
9) Restore nfqueue bridge support, from Marco Oliverio.
10) Compilation warning in SCTP_CHUNKMAP_*() on xt_sctp header.
From Phil Sutter.
11) Fix bogus lookup/get match for non-anonymous rbtree sets.
12) Missing netlink validation for NFT_SET_ELEM_INTERVAL_END
elements.
13) Missing netlink validation for NFT_DATA_VALUE after
nft_data_init().
14) If rule specifies no actions, offload infrastructure returns
EOPNOTSUPP.
15) Module refcount leak in object updates.
16) Missing sanitization for ARP traffic from br_netfilter, from
Eric Dumazet.
17) Compilation breakage on big-endian due to incorrect memcpy()
size in the flowtable offload infrastructure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Sat, 7 Dec 2019 17:38:12 +0000 (18:38 +0100)]
netfilter: nf_flow_table_offload: Correct memcpy size for flow_overload_mangle()
In function 'memcpy',
inlined from 'flow_offload_mangle' at net/netfilter/nf_flow_table_offload.c:112:2,
inlined from 'flow_offload_port_dnat' at net/netfilter/nf_flow_table_offload.c:373:2,
inlined from 'nf_flow_rule_route_ipv4' at net/netfilter/nf_flow_table_offload.c:424:3:
./include/linux/string.h:376:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter
376 | __read_overflow2();
| ^~~~~~~~~~~~~~~~~~
The original u8* was done in the hope to make this more adaptable but
consensus is to keep this like it is in tc pedit.
Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Martin Schiller [Mon, 9 Dec 2019 07:21:34 +0000 (08:21 +0100)]
net/x25: add new state X25_STATE_5
This is needed, because if the flag X25_ACCPT_APPRV_FLAG is not set on a
socket (manual call confirmation) and the channel is cleared by remote
before the manual call confirmation was sent, this situation needs to
be handled.
Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 9 Dec 2019 06:56:34 +0000 (08:56 +0200)]
selftests: forwarding: Delete IPv6 address at the end
When creating the second host in h2_create(), two addresses are assigned
to the interface, but only one is deleted. When running the test twice
in a row the following error is observed:
$ ./router_bridge_vlan.sh
TEST: ping [ OK ]
TEST: ping6 [ OK ]
TEST: vlan [ OK ]
$ ./router_bridge_vlan.sh
RTNETLINK answers: File exists
TEST: ping [ OK ]
TEST: ping6 [ OK ]
TEST: vlan [ OK ]
Fix this by deleting the address during cleanup.
Fixes: 5b1e7f9ebd56 ("selftests: forwarding: Test routed bridge interface")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 9 Dec 2019 06:55:20 +0000 (08:55 +0200)]
mlxsw: spectrum_router: Remove unlikely user-triggerable warning
In case the driver vetoes the addition of an IPv6 multipath route, the
IPv6 stack will emit delete notifications for the sibling routes that
were already added to the FIB trie. Since these siblings are not present
in hardware, a warning will be generated.
Have the driver ignore notifications for routes it does not have.
Fixes: ebee3cad835f ("ipv6: Add IPv6 multipath notifications for add / replace")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 9 Dec 2019 05:45:54 +0000 (13:45 +0800)]
sctp: fully initialize v4 addr in some functions
Syzbot found a crash:
BUG: KMSAN: uninit-value in crc32_body lib/crc32.c:112 [inline]
BUG: KMSAN: uninit-value in crc32_le_generic lib/crc32.c:179 [inline]
BUG: KMSAN: uninit-value in __crc32c_le_base+0x4fa/0xd30 lib/crc32.c:202
Call Trace:
crc32_body lib/crc32.c:112 [inline]
crc32_le_generic lib/crc32.c:179 [inline]
__crc32c_le_base+0x4fa/0xd30 lib/crc32.c:202
chksum_update+0xb2/0x110 crypto/crc32c_generic.c:90
crypto_shash_update+0x4c5/0x530 crypto/shash.c:107
crc32c+0x150/0x220 lib/libcrc32c.c:47
sctp_csum_update+0x89/0xa0 include/net/sctp/checksum.h:36
__skb_checksum+0x1297/0x12a0 net/core/skbuff.c:2640
sctp_compute_cksum include/net/sctp/checksum.h:59 [inline]
sctp_packet_pack net/sctp/output.c:528 [inline]
sctp_packet_transmit+0x40fb/0x4250 net/sctp/output.c:597
sctp_outq_flush_transports net/sctp/outqueue.c:1146 [inline]
sctp_outq_flush+0x1823/0x5d80 net/sctp/outqueue.c:1194
sctp_outq_uncork+0xd0/0xf0 net/sctp/outqueue.c:757
sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1781 [inline]
sctp_side_effects net/sctp/sm_sideeffect.c:1184 [inline]
sctp_do_sm+0x8fe1/0x9720 net/sctp/sm_sideeffect.c:1155
sctp_primitive_REQUESTHEARTBEAT+0x175/0x1a0 net/sctp/primitive.c:185
sctp_apply_peer_addr_params+0x212/0x1d40 net/sctp/socket.c:2433
sctp_setsockopt_peer_addr_params net/sctp/socket.c:2686 [inline]
sctp_setsockopt+0x189bb/0x19090 net/sctp/socket.c:4672
The issue was caused by transport->ipaddr set with uninit addr param, which
was passed by:
sctp_transport_init net/sctp/transport.c:47 [inline]
sctp_transport_new+0x248/0xa00 net/sctp/transport.c:100
sctp_assoc_add_peer+0x5ba/0x2030 net/sctp/associola.c:611
sctp_process_param net/sctp/sm_make_chunk.c:2524 [inline]
where 'addr' is set by sctp_v4_from_addr_param(), and it doesn't initialize
the padding of addr->v4.
Later when calling sctp_make_heartbeat(), hbinfo.daddr(=transport->ipaddr)
will become the part of skb, and the issue occurs.
This patch is to fix it by initializing the padding of addr->v4 in
sctp_v4_from_addr_param(), as well as other functions that do the similar
thing, and these functions shouldn't trust that the caller initializes the
memory, as Marcelo suggested.
Reported-by: syzbot+6dcbfea81cd3d4dd0b02@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 7 Dec 2019 22:10:34 +0000 (14:10 -0800)]
bonding: fix bond_neigh_init()
1) syzbot reported an uninit-value in bond_neigh_setup() [1]
bond_neigh_setup() uses a temporary on-stack 'struct neigh_parms parms',
but only clears parms.neigh_setup field.
A stacked bonding device would then enter bond_neigh_setup()
and read garbage from parms->dev.
If we get really unlucky and garbage is matching @dev, then we
could recurse and eventually crash.
Let's make sure the whole structure is cleared to avoid surprises.
2) bond_neigh_setup() can be called while another cpu manipulates
the master device, removing or adding a slave.
We need at least rcu protection to prevent use-after-free.
Note: Prior code does not support a stack of bonding devices,
this patch does not attempt to fix this, and leave a comment instead.
[1]
BUG: KMSAN: uninit-value in bond_neigh_setup+0xa4/0x110 drivers/net/bonding/bond_main.c:3655
CPU: 0 PID: 11256 Comm: syz-executor.0 Not tainted 5.4.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
__msan_warning+0x57/0xa0 mm/kmsan/kmsan_instr.c:245
bond_neigh_setup+0xa4/0x110 drivers/net/bonding/bond_main.c:3655
bond_neigh_init+0x216/0x4b0 drivers/net/bonding/bond_main.c:3626
___neigh_create+0x169e/0x2c40 net/core/neighbour.c:613
__neigh_create+0xbd/0xd0 net/core/neighbour.c:674
ip6_finish_output2+0x149a/0x2670 net/ipv6/ip6_output.c:113
__ip6_finish_output+0x83d/0x8f0 net/ipv6/ip6_output.c:142
ip6_finish_output+0x2db/0x420 net/ipv6/ip6_output.c:152
NF_HOOK_COND include/linux/netfilter.h:294 [inline]
ip6_output+0x5d3/0x720 net/ipv6/ip6_output.c:175
dst_output include/net/dst.h:436 [inline]
NF_HOOK include/linux/netfilter.h:305 [inline]
mld_sendpack+0xebd/0x13d0 net/ipv6/mcast.c:1682
mld_send_cr net/ipv6/mcast.c:1978 [inline]
mld_ifc_timer_expire+0x116b/0x1680 net/ipv6/mcast.c:2477
call_timer_fn+0x232/0x530 kernel/time/timer.c:1404
expire_timers kernel/time/timer.c:1449 [inline]
__run_timers+0xd60/0x1270 kernel/time/timer.c:1773
run_timer_softirq+0x2d/0x50 kernel/time/timer.c:1786
__do_softirq+0x4a1/0x83a kernel/softirq.c:293
invoke_softirq kernel/softirq.c:375 [inline]
irq_exit+0x230/0x280 kernel/softirq.c:416
exiting_irq+0xe/0x10 arch/x86/include/asm/apic.h:536
smp_apic_timer_interrupt+0x48/0x70 arch/x86/kernel/apic/apic.c:1138
apic_timer_interrupt+0x2e/0x40 arch/x86/entry/entry_64.S:835
</IRQ>
RIP: 0010:kmsan_free_page+0x18d/0x1c0 mm/kmsan/kmsan_shadow.c:439
Code: 4c 89 ff 44 89 f6 e8 82 0d ee ff 65 ff 0d 9f 26 3b 60 65 8b 05 98 26 3b 60 85 c0 75 24 e8 5b f6 35 ff 4c 89 6d d0 ff 75 d0 9d <48> 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 0f 0b 0f 0b 0f
RSP: 0018:
ffffb328034af818 EFLAGS:
00000246 ORIG_RAX:
ffffffffffffff13
RAX:
0000000000000000 RBX:
ffffe2d7471f8360 RCX:
0000000000000000
RDX:
ffffffffadea7000 RSI:
0000000000000004 RDI:
ffff93496fcda104
RBP:
ffffb328034af850 R08:
ffff934a47e86d00 R09:
ffff93496fc41900
R10:
0000000000000000 R11:
0000000000000000 R12:
0000000000000001
R13:
0000000000000246 R14:
0000000000000000 R15:
ffffe2d7472225c0
free_pages_prepare mm/page_alloc.c:1138 [inline]
free_pcp_prepare mm/page_alloc.c:1230 [inline]
free_unref_page_prepare+0x1d9/0x770 mm/page_alloc.c:3025
free_unref_page mm/page_alloc.c:3074 [inline]
free_the_page mm/page_alloc.c:4832 [inline]
__free_pages+0x154/0x230 mm/page_alloc.c:4840
__vunmap+0xdac/0xf20 mm/vmalloc.c:2277
__vfree mm/vmalloc.c:2325 [inline]
vfree+0x7c/0x170 mm/vmalloc.c:2355
copy_entries_to_user net/ipv6/netfilter/ip6_tables.c:883 [inline]
get_entries net/ipv6/netfilter/ip6_tables.c:1041 [inline]
do_ip6t_get_ctl+0xfa4/0x1030 net/ipv6/netfilter/ip6_tables.c:1709
nf_sockopt net/netfilter/nf_sockopt.c:104 [inline]
nf_getsockopt+0x481/0x4e0 net/netfilter/nf_sockopt.c:122
ipv6_getsockopt+0x264/0x510 net/ipv6/ipv6_sockglue.c:1400
tcp_getsockopt+0x1c6/0x1f0 net/ipv4/tcp.c:3688
sock_common_getsockopt+0x13f/0x180 net/core/sock.c:3110
__sys_getsockopt+0x533/0x7b0 net/socket.c:2129
__do_sys_getsockopt net/socket.c:2144 [inline]
__se_sys_getsockopt+0xe1/0x100 net/socket.c:2141
__x64_sys_getsockopt+0x62/0x80 net/socket.c:2141
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45d20a
Code: b8 34 01 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 8d 8b fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 37 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 6a 8b fb ff c3 66 0f 1f 84 00 00 00 00 00
RSP: 002b:
0000000000a6f618 EFLAGS:
00000212 ORIG_RAX:
0000000000000037
RAX:
ffffffffffffffda RBX:
0000000000a6f640 RCX:
000000000045d20a
RDX:
0000000000000041 RSI:
0000000000000029 RDI:
0000000000000003
RBP:
0000000000717cc0 R08:
0000000000a6f63c R09:
0000000000004000
R10:
0000000000a6f740 R11:
0000000000000212 R12:
0000000000000003
R13:
0000000000000000 R14:
0000000000000029 R15:
0000000000715b00
Local variable description: ----parms@bond_neigh_init
Variable was created at:
bond_neigh_init+0x8c/0x4b0 drivers/net/bonding/bond_main.c:3617
bond_neigh_init+0x8c/0x4b0 drivers/net/bonding/bond_main.c:3617
Fixes: 9918d5bf329d ("bonding: modify only neigh_parms owned by us")
Fixes: 234bcf8a499e ("net/bonding: correctly proxy slave neigh param setup ndo function")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Jay Vosburgh <j.vosburgh@gmail.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 7 Dec 2019 20:23:21 +0000 (12:23 -0800)]
neighbour: remove neigh_cleanup() method
neigh_cleanup() has not been used for seven years, and was a wrong design.
Messing with shared pointer in bond_neigh_init() without proper
memory barriers would at least trigger syzbot complains eventually.
It is time to remove this stuff.
Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 9 Dec 2019 17:27:47 +0000 (09:27 -0800)]
Merge tag 'linux-can-fixes-for-5.5-
20191208' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2019-12-08
this is a pull request of 13 patches for net/master.
The first two patches are by Dan Murphy. He adds himself as a maintainer to the
m-can MMIO and tcan SPI driver.
The next two patches the j1939 stack. The first one is by Oleksij Rempel and
fixes a locking problem found by the syzbot, the second one is by me an fixes a
mistake in the documentation.
Srinivas Neeli fixes missing RX CAN packets on CANFD2.0 in the xilinx driver.
Sean Nyekjaer fixes a possible deadlock in the the flexcan driver after
suspend/resume. Joakim Zhang contributes two patches for the flexcan driver
that fix problems with the low power enter/exit.
The next 4 patches all target the tcan part of the m_can driver. Sean Nyekjaer
adds the required delay after reset and fixes the device tree binding example.
Dan Murphy's patches make the wake-gpio optional.
In the last patch Xiaolong Huang fixes several kernel memory info leaks to the
USB device in the kvaser_usb_leaf driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 7 Dec 2019 22:43:39 +0000 (14:43 -0800)]
netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()
syzbot is kind enough to remind us we need to call skb_may_pull()
BUG: KMSAN: uninit-value in br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
CPU: 1 PID: 11631 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c9/0x220 lib/dump_stack.c:118
kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
__msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
nf_hook_slow+0x18b/0x3f0 net/netfilter/core.c:512
nf_hook include/linux/netfilter.h:260 [inline]
NF_HOOK include/linux/netfilter.h:303 [inline]
__br_forward+0x78f/0xe30 net/bridge/br_forward.c:109
br_flood+0xef0/0xfe0 net/bridge/br_forward.c:234
br_handle_frame_finish+0x1a77/0x1c20 net/bridge/br_input.c:162
nf_hook_bridge_pre net/bridge/br_input.c:245 [inline]
br_handle_frame+0xfb6/0x1eb0 net/bridge/br_input.c:348
__netif_receive_skb_core+0x20b9/0x51a0 net/core/dev.c:4830
__netif_receive_skb_one_core net/core/dev.c:4927 [inline]
__netif_receive_skb net/core/dev.c:5043 [inline]
process_backlog+0x610/0x13c0 net/core/dev.c:5874
napi_poll net/core/dev.c:6311 [inline]
net_rx_action+0x7a6/0x1aa0 net/core/dev.c:6379
__do_softirq+0x4a1/0x83a kernel/softirq.c:293
do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1091
</IRQ>
do_softirq kernel/softirq.c:338 [inline]
__local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:190
local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline]
__dev_queue_xmit+0x38e8/0x4200 net/core/dev.c:3819
dev_queue_xmit+0x4b/0x60 net/core/dev.c:3825
packet_snd net/packet/af_packet.c:2959 [inline]
packet_sendmsg+0x8234/0x9100 net/packet/af_packet.c:2984
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
__sys_sendto+0xc44/0xc70 net/socket.c:1952
__do_sys_sendto net/socket.c:1964 [inline]
__se_sys_sendto+0x107/0x130 net/socket.c:1960
__x64_sys_sendto+0x6e/0x90 net/socket.c:1960
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45a679
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007f0a3c9e5c78 EFLAGS:
00000246 ORIG_RAX:
000000000000002c
RAX:
ffffffffffffffda RBX:
0000000000000006 RCX:
000000000045a679
RDX:
000000000000000e RSI:
0000000020000200 RDI:
0000000000000003
RBP:
000000000075bf20 R08:
00000000200000c0 R09:
0000000000000014
R10:
0000000000000000 R11:
0000000000000246 R12:
00007f0a3c9e66d4
R13:
00000000004c8ec1 R14:
00000000004dfe28 R15:
00000000ffffffff
Uninit was created at:
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
slab_alloc_node mm/slub.c:2773 [inline]
__kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
__kmalloc_reserve net/core/skbuff.c:141 [inline]
__alloc_skb+0x306/0xa10 net/core/skbuff.c:209
alloc_skb include/linux/skbuff.h:1049 [inline]
alloc_skb_with_frags+0x18c/0xa80 net/core/skbuff.c:5662
sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2244
packet_alloc_skb net/packet/af_packet.c:2807 [inline]
packet_snd net/packet/af_packet.c:2902 [inline]
packet_sendmsg+0x63a6/0x9100 net/packet/af_packet.c:2984
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg net/socket.c:657 [inline]
__sys_sendto+0xc44/0xc70 net/socket.c:1952
__do_sys_sendto net/socket.c:1964 [inline]
__se_sys_sendto+0x107/0x130 net/socket.c:1960
__x64_sys_sendto+0x6e/0x90 net/socket.c:1960
do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: c4e70a87d975 ("netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.c")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 6 Dec 2019 21:49:58 +0000 (22:49 +0100)]
netfilter: nf_tables_offload: return EOPNOTSUPP if rule specifies no actions
If the rule only specifies the matching side, return EOPNOTSUPP.
Otherwise, the front-end relies on the drivers to reject this rule.
Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 6 Dec 2019 21:25:55 +0000 (22:25 +0100)]
netfilter: nf_tables: skip module reference count bump on object updates
Use __nft_obj_type_get() instead, otherwise there is a module reference
counter leak.
Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 6 Dec 2019 21:09:14 +0000 (22:09 +0100)]
netfilter: nf_tables: validate NFT_DATA_VALUE after nft_data_init()
Userspace might bogusly sent NFT_DATA_VERDICT in several netlink
attributes that assume NFT_DATA_VALUE. Moreover, make sure that error
path invokes nft_data_release() to decrement the reference count on the
chain object.
Fixes: 96518518cc41 ("netfilter: add nftables")
Fixes: 0f3cd9b36977 ("netfilter: nf_tables: add range expression")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 6 Dec 2019 20:55:20 +0000 (21:55 +0100)]
netfilter: nf_tables: validate NFT_SET_ELEM_INTERVAL_END
Only NFTA_SET_ELEM_KEY and NFTA_SET_ELEM_FLAGS make sense for elements
whose NFT_SET_ELEM_INTERVAL_END flag is set on.
Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 6 Dec 2019 19:23:29 +0000 (20:23 +0100)]
netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
The existing rbtree implementation might store consecutive elements
where the closing element and the opening element might overlap, eg.
[ a, a+1) [ a+1, a+2)
This patch removes the optimization for non-anonymous sets in the exact
matching case, where it is assumed to stop searching in case that the
closing element is found. Instead, invalidate candidate interval and
keep looking further in the tree.
The lookup/get operation might return false, while there is an element
in the rbtree. Moreover, the get operation returns true as if a+2 would
be in the tree. This happens with named sets after several set updates.
The existing lookup optimization (that only works for the anonymous
sets) might not reach the opening [ a+1,... element if the closing
...,a+1) is found in first place when walking over the rbtree. Hence,
walking the full tree in that case is needed.
This patch fixes the lookup and get operations.
Fixes: e701001e7cbe ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Fixes: ba0e4d9917b4 ("netfilter: nf_tables: get set elements via netlink")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Thu, 5 Dec 2019 12:35:11 +0000 (13:35 +0100)]
netfilter: uapi: Avoid undefined left-shift in xt_sctp.h
With 'bytes(__u32)' being 32, a left-shift of 31 may happen which is
undefined for the signed 32-bit value 1. Avoid this by declaring 1 as
unsigned.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Sun, 8 Dec 2019 22:57:55 +0000 (14:57 -0800)]
Linux 5.5-rc1
Linus Torvalds [Sun, 8 Dec 2019 21:28:11 +0000 (13:28 -0800)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) More jumbo frame fixes in r8169, from Heiner Kallweit.
2) Fix bpf build in minimal configuration, from Alexei Starovoitov.
3) Use after free in slcan driver, from Jouni Hogander.
4) Flower classifier port ranges don't work properly in the HW offload
case, from Yoshiki Komachi.
5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin.
6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk.
7) Fix flow dissection in dsa TX path, from Alexander Lobakin.
8) Stale syncookie timestampe fixes from Guillaume Nault.
[ Did an evil merge to silence a warning introduced by this pull - Linus ]
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
r8169: fix rtl_hw_jumbo_disable for RTL8168evl
net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()
r8169: add missing RX enabling for WoL on RTL8125
vhost/vsock: accept only packets with the right dst_cid
net: phy: dp83867: fix hfs boot in rgmii mode
net: ethernet: ti: cpsw: fix extra rx interrupt
inet: protect against too small mtu values.
gre: refetch erspan header from skb->data after pskb_may_pull()
pppoe: remove redundant BUG_ON() check in pppoe_pernet
tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
tcp: tighten acceptance of ACKs not matching a child socket
tcp: fix rejected syncookies due to stale timestamps
lpc_eth: kernel BUG on remove
tcp: md5: fix potential overestimation of TCP option space
net: sched: allow indirect blocks to bind to clsact in TC
net: core: rename indirect block ingress cb function
net-sysfs: Call dev_hold always in netdev_queue_add_kobject
net: dsa: fix flow dissection on Tx path
net/tls: Fix return values to avoid ENOTSUPP
net: avoid an indirect call in ____sys_recvmsg()
...
Linus Torvalds [Sun, 8 Dec 2019 20:23:42 +0000 (12:23 -0800)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Eleven patches, all in drivers (no core changes) that are either minor
cleanups or small fixes.
They were late arriving, but still safe for -rc1"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: MAINTAINERS: Add the linux-scsi mailing list to the ISCSI entry
scsi: megaraid_sas: Make poll_aen_lock static
scsi: sd_zbc: Improve report zones error printout
scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI
scsi: qla2xxx: unregister ports after GPN_FT failure
scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan
scsi: pm80xx: Remove unused include of linux/version.h
scsi: pm80xx: fix logic to break out of loop when register value is 2 or 3
scsi: scsi_transport_sas: Fix memory leak when removing devices
scsi: lpfc: size cpu map by last cpu id set
scsi: ibmvscsi_tgt: Remove unneeded variable rc
Linus Torvalds [Sun, 8 Dec 2019 20:12:18 +0000 (12:12 -0800)]
Merge tag '5.5-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Nine cifs/smb3 fixes:
- one fix for stable (oops during oplock break)
- two timestamp fixes including important one for updating mtime at
close to avoid stale metadata caching issue on dirty files (also
improves perf by using SMB2_CLOSE_FLAG_POSTQUERY_ATTRIB over the
wire)
- two fixes for "modefromsid" mount option for file create (now
allows mode bits to be set more atomically and accurately on create
by adding "sd_context" on create when modefromsid specified on
mount)
- two fixes for multichannel found in testing this week against
different servers
- two small cleanup patches"
* tag '5.5-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
smb3: improve check for when we send the security descriptor context on create
smb3: fix mode passed in on create for modetosid mount option
cifs: fix possible uninitialized access and race on iface_list
cifs: Fix lookup of SMB connections on multichannel
smb3: query attributes on file close
smb3: remove unused flag passed into close functions
cifs: remove redundant assignment to pointer pneg_ctxt
fs: cifs: Fix atime update check vs mtime
CIFS: Fix NULL-pointer dereference in smb2_push_mandatory_locks
Linus Torvalds [Sun, 8 Dec 2019 19:08:28 +0000 (11:08 -0800)]
Merge branch 'work.misc' of git://git./linux/kernel/git/viro/vfs
Pull misc vfs cleanups from Al Viro:
"No common topic, just three cleanups".
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
make __d_alloc() static
fs/namespace: add __user to open_tree and move_mount syscalls
fs/fnctl: fix missing __user in fcntl_rw_hint()
Xiaolong Huang [Sat, 7 Dec 2019 14:40:24 +0000 (22:40 +0800)]
can: kvaser_usb: kvaser_usb_leaf: Fix some info-leaks to USB devices
Uninitialized Kernel memory can leak to USB devices.
Fix this by using kzalloc() instead of kmalloc().
Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com>
Fixes: 7259124eac7d ("can: kvaser_usb: Split driver into kvaser_usb_core.c and kvaser_usb_leaf.c")
Cc: linux-stable <stable@vger.kernel.org> # >= v4.19
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Dan Murphy [Wed, 4 Dec 2019 17:51:12 +0000 (11:51 -0600)]
can: tcan45x: Make wake-up GPIO an optional GPIO
The device has the ability to disable the wake-up pin option. The
wake-up pin can be either force to GND or Vsup and does not have to be
tied to a GPIO. In order for the device to not use the wake-up feature
write the register to disable the WAKE_CONFIG option.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Cc: Sean Nyekjaer <sean@geanix.com>
Reviewed-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Dan Murphy [Wed, 4 Dec 2019 17:51:11 +0000 (11:51 -0600)]
dt-bindings: tcan4x5x: Make wake-gpio an optional gpio
The wake-up of the device can be configured as an optional feature of
the device. Move the wake-up gpio from a requried property to an
optional property.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Cc: Rob Herring <robh@kernel.org>
Reviewed-by: Sean Nyekjaer <sean@geanix.com>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Sean Nyekjaer [Fri, 6 Dec 2019 15:29:23 +0000 (16:29 +0100)]
dt-bindings: can: tcan4x5x: reset pin is active high
Change the reset pin example to active high to be in line with
the datasheet
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Sean Nyekjaer [Fri, 6 Dec 2019 15:29:22 +0000 (16:29 +0100)]
can: m_can: tcan4x5x: add required delay after reset
According to section "8.3.8 RST Pin" in the datasheet we are required to
wait >700us after the device is reset.
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Acked-by: Dan Murphy <dmurphy@ti.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.4
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Joakim Zhang [Wed, 4 Dec 2019 11:36:14 +0000 (11:36 +0000)]
can: flexcan: poll MCR_LPM_ACK instead of GPR ACK for stop mode acknowledgment
Stop Mode is entered when Stop Mode is requested at chip level and
MCR[LPM_ACK] is asserted by the FlexCAN.
Double check with IP owner, the MCR[LPM_ACK] bit should be polled for
stop mode acknowledgment, not the acknowledgment from chip level which
is used to gate flexcan clocks.
This patch depends on:
b7603d080ffc ("can: flexcan: add low power enter/exit acknowledgment helper")
Fixes: 5f186c257fa4 (can: flexcan: fix stop mode acknowledgment)
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Joakim Zhang [Wed, 4 Dec 2019 11:36:11 +0000 (11:36 +0000)]
can: flexcan: add low power enter/exit acknowledgment helper
The MCR[LPMACK] read-only bit indicates that FlexCAN is in a lower-power
mode (Disabled mode, Doze mode, Stop mode).
The CPU can poll this bit to know when FlexCAN has actually entered low
power mode. The low power enter/exit acknowledgment helper will reduce
code duplication for disabled mode, doze mode and stop mode.
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Sean Nyekjaer [Wed, 4 Dec 2019 11:36:06 +0000 (11:36 +0000)]
can: flexcan: fix possible deadlock and out-of-order reception after wakeup
When suspending, and there is still CAN traffic on the interfaces the
flexcan immediately wakes the platform again. As it should :-). But it
throws this error msg:
[ 3169.378661] PM: noirq suspend of devices failed
On the way down to suspend the interface that throws the error message
calls flexcan_suspend() but fails to call flexcan_noirq_suspend(). That
means flexcan_enter_stop_mode() is called, but on the way out of suspend
the driver only calls flexcan_resume() and skips flexcan_noirq_resume(),
thus it doesn't call flexcan_exit_stop_mode(). This leaves the flexcan
in stop mode, and with the current driver it can't recover from this
even with a soft reboot, it requires a hard reboot.
This patch fixes the deadlock when using self wakeup, by calling
flexcan_exit_stop_mode() from flexcan_resume() instead of
flexcan_noirq_resume().
This also fixes another issue: CAN frames are received out-of-order in
first IRQ handler run after wakeup.
The problem is that the wakeup latency from frame reception to the IRQ
handler (where the CAN frames are sorted by timestamp) is much bigger
than the time stamp counter wrap around time. This means it's
impossible to sort the CAN frames by timestamp.
The reason is that the controller exits stop mode during noirq resume,
which means it receives frames immediately, but interrupt handling is
still not possible.
So exit stop mode during resume stage instead of noirq resume fixes this
issue.
Fixes: de3578c198c6 ("can: flexcan: add self wakeup support")
Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Tested-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Srinivas Neeli [Tue, 3 Dec 2019 12:16:36 +0000 (17:46 +0530)]
can: xilinx_can: Fix missing Rx can packets on CANFD2.0
CANFD2.0 core uses BRAM for storing acceptance filter ID(AFID) and MASK
(AFMASK)registers. So by default AFID and AFMASK registers contain random
data. Due to random data, we are not able to receive all CAN ids.
Initializing AFID and AFMASK registers with Zero before enabling
acceptance filter to receive all packets irrespective of ID and Mask.
Fixes: 0db9071353a0 ("can: xilinx: add can 2.0 support")
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Srinivas Neeli <srinivas.neeli@xilinx.com>
Reviewed-by: Naga Sureshkumar Relli <naga.sureshkumar.relli@xilinx.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.0
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Thu, 21 Nov 2019 09:47:50 +0000 (10:47 +0100)]
can: j1939: fix address claim code example
During development the define J1939_PGN_ADDRESS_REQUEST was renamed to
J1939_PGN_REQUEST. It was forgotten to adjust the documentation
accordingly.
This patch fixes the name of the symbol.
Reported-by: https://github.com/linux-can/can-utils/issues/159#issuecomment-556538798
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Cc: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Oleksij Rempel [Fri, 6 Dec 2019 14:18:35 +0000 (15:18 +0100)]
can: j1939: j1939_sk_bind(): take priv after lock is held
syzbot reproduced following crash:
===============================================================================
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9844 Comm: syz-executor.0 Not tainted 5.4.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:__lock_acquire+0x1254/0x4a00 kernel/locking/lockdep.c:3828
Code: 00 0f 85 96 24 00 00 48 81 c4 f0 00 00 00 5b 41 5c 41 5d 41 5e 41
5f 5d c3 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 ea 03 <80> 3c 02
00 0f 85 0b 28 00 00 49 81 3e 20 19 78 8a 0f 84 5f ee ff
RSP: 0018:
ffff888099c3fb48 EFLAGS:
00010006
RAX:
dffffc0000000000 RBX:
0000000000000000 RCX:
0000000000000000
RDX:
0000000000000218 RSI:
0000000000000000 RDI:
0000000000000001
RBP:
ffff888099c3fc60 R08:
0000000000000001 R09:
0000000000000001
R10:
fffffbfff146e1d0 R11:
ffff888098720400 R12:
00000000000010c0
R13:
0000000000000000 R14:
00000000000010c0 R15:
0000000000000000
FS:
00007f0559e98700(0000) GS:
ffff8880ae800000(0000)
knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fe4d89e0000 CR3:
0000000099606000 CR4:
00000000001406f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
_raw_spin_lock_bh+0x33/0x50 kernel/locking/spinlock.c:175
spin_lock_bh include/linux/spinlock.h:343 [inline]
j1939_jsk_del+0x32/0x210 net/can/j1939/socket.c:89
j1939_sk_bind+0x2ea/0x8f0 net/can/j1939/socket.c:448
__sys_bind+0x239/0x290 net/socket.c:1648
__do_sys_bind net/socket.c:1659 [inline]
__se_sys_bind net/socket.c:1657 [inline]
__x64_sys_bind+0x73/0xb0 net/socket.c:1657
do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x45a679
Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007f0559e97c78 EFLAGS:
00000246 ORIG_RAX:
0000000000000031
RAX:
ffffffffffffffda RBX:
0000000000000003 RCX:
000000000045a679
RDX:
0000000000000018 RSI:
0000000020000240 RDI:
0000000000000003
RBP:
000000000075bf20 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
00007f0559e986d4
R13:
00000000004c09e9 R14:
00000000004d37d0 R15:
00000000ffffffff
Modules linked in:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 9844 at kernel/locking/mutex.c:1419
mutex_trylock+0x279/0x2f0 kernel/locking/mutex.c:1427
===============================================================================
This issues was caused by null pointer deference. Where j1939_sk_bind()
was using currently not existing priv.
Possible scenario may look as following:
cpu0 cpu1
bind()
bind()
j1939_sk_bind()
j1939_sk_bind()
priv = jsk->priv;
priv = jsk->priv;
lock_sock(sock->sk);
priv = j1939_netdev_start(ndev);
j1939_jsk_add(priv, jsk);
jsk->priv = priv;
relase_sock(sock->sk);
lock_sock(sock->sk);
j1939_jsk_del(priv, jsk);
..... ooops ......
With this patch we move "priv = jsk->priv;" after the lock, to avoid
assigning of wrong priv pointer.
Reported-by: syzbot+99e9e1b200a1e363237d@syzkaller.appspotmail.com
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: linux-stable <stable@vger.kernel.org> # >= v5.4
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Dan Murphy [Thu, 5 Dec 2019 17:57:16 +0000 (11:57 -0600)]
MAINTAINERS: Add myself as a maintainer for TCAN4x5x
Adding myself to support the TI TCAN4X5X SPI CAN device.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Dan Murphy [Thu, 5 Dec 2019 17:57:15 +0000 (11:57 -0600)]
MAINTAINERS: Add myself as a maintainer for MMIO m_can
Since I refactored the code to create a m_can framework and we
have a MMIO MCAN IP as well add myself to help maintain the code.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Linus Torvalds [Sun, 8 Dec 2019 02:38:17 +0000 (18:38 -0800)]
Merge tag 'ntb-5.5' of git://github.com/jonmason/ntb
Pull NTB update from Jon Mason:
"Just a simple patch to add a new Hygon Device ID to the AMD NTB device
driver"
* tag 'ntb-5.5' of git://github.com/jonmason/ntb:
NTB: Add Hygon Device ID
Linus Torvalds [Sun, 8 Dec 2019 02:33:01 +0000 (18:33 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull more input updates from Dmitry Torokhov:
- fixups for Synaptics RMI4 driver
- a quirk for Goodinx touchscreen on Teclast tablet
- a new keycode definition for activating privacy screen feature found
on a few "enterprise" laptops
- updates to snvs_pwrkey driver
- polling uinput device for writing (which is always allowed) now works
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers
Input: synaptics-rmi4 - re-enable IRQs in f34v7_do_reflash
Input: goodix - add upside-down quirk for Teclast X89 tablet
Input: add privacy screen toggle keycode
Input: uinput - fix returning EPOLLOUT from uinput_poll
Input: snvs_pwrkey - remove gratuitous NULL initializers
Input: snvs_pwrkey - send key events for i.MX6 S, DL and Q
Linus Torvalds [Sun, 8 Dec 2019 01:07:18 +0000 (17:07 -0800)]
Merge tag 'iomap-5.5-merge-14' of git://git./fs/xfs/xfs-linux
Pull iomap fixes from Darrick Wong:
"Fix a race condition and a use-after-free error:
- Fix a UAF when reporting writeback errors
- Fix a race condition when handling page uptodate on fragmented file
with blocksize < pagesize"
* tag 'iomap-5.5-merge-14' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: stop using ioend after it's been freed in iomap_finish_ioend()
iomap: fix sub-page uptodate handling
Linus Torvalds [Sun, 8 Dec 2019 01:05:33 +0000 (17:05 -0800)]
Merge tag 'xfs-5.5-merge-17' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"Fix a couple of resource management errors and a hang:
- fix a crash in the log setup code when log mounting fails
- fix a hang when allocating space on the realtime device
- fix a block leak when freeing space on the realtime device"
* tag 'xfs-5.5-merge-17' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix mount failure crash on invalid iclog memory access
xfs: don't check for AG deadlock for realtime files in bunmapi
xfs: fix realtime file data space leak
Linus Torvalds [Sun, 8 Dec 2019 00:59:25 +0000 (16:59 -0800)]
Merge tag 'for-linus-5.5-ofs1' of git://git./linux/kernel/git/hubcap/linux
Pull orangefs update from Mike Marshall:
"orangefs: posix open permission checking...
Orangefs has no open, and orangefs checks file permissions on each
file access. Posix requires that file permissions be checked on open
and nowhere else. Orangefs-through-the-kernel needs to seem posix
compliant.
The VFS opens files, even if the filesystem provides no method. We can
see if a file was successfully opened for read and or for write by
looking at file->f_mode.
When writes are flowing from the page cache, file is no longer
available. We can trust the VFS to have checked file->f_mode before
writing to the page cache.
The mode of a file might change between when it is opened and IO
commences, or it might be created with an arbitrary mode.
We'll make sure we don't hit EACCES during the IO stage by using
UID 0"
[ This is "posixish", but not a great solution in the long run, since a
proper secure network server shouldn't really trust the client like this.
But proper and secure POSIX behavior requires an open method and a
resulting cookie for IO of some kind, or similar. - Linus ]
* tag 'for-linus-5.5-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: posix open permission checking...
Linus Torvalds [Sun, 8 Dec 2019 00:56:00 +0000 (16:56 -0800)]
Merge tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"This is a relatively quiet cycle for nfsd, mainly various bugfixes.
Possibly most interesting is Trond's fixes for some callback races
that were due to my incomplete understanding of rpc client shutdown.
Unfortunately at the last minute I've started noticing a new
intermittent failure to send callbacks. As the logic seems basically
correct, I'm leaving Trond's patches in for now, and hope to find a
fix in the next week so I don't have to revert those patches"
* tag 'nfsd-5.5' of git://linux-nfs.org/~bfields/linux: (24 commits)
nfsd: depend on CRYPTO_MD5 for legacy client tracking
NFSD fixing possible null pointer derefering in copy offload
nfsd: check for EBUSY from vfs_rmdir/vfs_unink.
nfsd: Ensure CLONE persists data and metadata changes to the target file
SUNRPC: Fix backchannel latency metrics
nfsd: restore NFSv3 ACL support
nfsd: v4 support requires CRYPTO_SHA256
nfsd: Fix cld_net->cn_tfm initialization
lockd: remove __KERNEL__ ifdefs
sunrpc: remove __KERNEL__ ifdefs
race in exportfs_decode_fh()
nfsd: Drop LIST_HEAD where the variable it declares is never used.
nfsd: document callback_wq serialization of callback code
nfsd: mark cb path down on unknown errors
nfsd: Fix races between nfsd4_cb_release() and nfsd4_shutdown_callback()
nfsd: minor 4.1 callback cleanup
SUNRPC: Fix svcauth_gss_proxy_init()
SUNRPC: Trace gssproxy upcall results
sunrpc: fix crash when cache_head become valid before update
nfsd: remove private bin2hex implementation
...
Linus Torvalds [Sun, 8 Dec 2019 00:50:55 +0000 (16:50 -0800)]
Merge tag 'nfs-for-5.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Features:
- NFSv4.2 now supports cross device offloaded copy (i.e. offloaded
copy of a file from one source server to a different target
server).
- New RDMA tracepoints for debugging congestion control and Local
Invalidate WRs.
Bugfixes and cleanups
- Drop the NFSv4.1 session slot if nfs4_delegreturn_prepare waits for
layoutreturn
- Handle bad/dead sessions correctly in nfs41_sequence_process()
- Various bugfixes to the delegation return operation.
- Various bugfixes pertaining to delegations that have been revoked.
- Cleanups to the NFS timespec code to avoid unnecessary conversions
between timespec and timespec64.
- Fix unstable RDMA connections after a reconnect
- Close race between waking an RDMA sender and posting a receive
- Wake pending RDMA tasks if connection fails
- Fix MR list corruption, and clean up MR usage
- Fix another RPCSEC_GSS issue with MIC buffer space"
* tag 'nfs-for-5.5-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (79 commits)
SUNRPC: Capture completion of all RPC tasks
SUNRPC: Fix another issue with MIC buffer space
NFS4: Trace lock reclaims
NFS4: Trace state recovery operation
NFSv4.2 fix memory leak in nfs42_ssc_open
NFSv4.2 fix kfree in __nfs42_copy_file_range
NFS: remove duplicated include from nfs4file.c
NFSv4: Make _nfs42_proc_copy_notify() static
NFS: Fallocate should use the nfs4_fattr_bitmap
NFS: Return -ETXTBSY when attempting to write to a swapfile
fs: nfs: sysfs: Remove NULL check before kfree
NFS: remove unneeded semicolon
NFSv4: add declaration of current_stateid
NFSv4.x: Drop the slot if nfs4_delegreturn_prepare waits for layoutreturn
NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process()
nfsv4: Move NFSPROC4_CLNT_COPY_NOTIFY to end of list
SUNRPC: Avoid RPC delays when exiting suspend
NFS: Add a tracepoint in nfs_fh_to_dentry()
NFSv4: Don't retry the GETATTR on old stateid in nfs4_delegreturn_done()
NFSv4: Handle NFS4ERR_OLD_STATEID in delegreturn
...