Linus Torvalds [Fri, 17 Apr 2020 17:08:08 +0000 (10:08 -0700)]
Merge tag 'block-5.7-2020-04-17' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Fix for a driver tag leak in error handling (John)
- Remove now defunct Kconfig selection from dasd (Stefan)
- blk-wbt trace fiexs (Tommi)
* tag 'block-5.7-2020-04-17' of git://git.kernel.dk/linux-block:
blk-wbt: Drop needless newlines from tracepoint format strings
blk-wbt: Use tracepoint_string() for wbt_step tracepoint string literals
s390/dasd: remove IOSCHED_DEADLINE from DASD Kconfig
blk-mq: Put driver tag in blk_mq_dispatch_rq_list() when no budget
Linus Torvalds [Fri, 17 Apr 2020 17:06:16 +0000 (10:06 -0700)]
Merge tag 'libata-5.7-2020-04-17' of git://git.kernel.dk/linux-block
Pull libata fixlet from Jens Axboe:
"Add yet another Comet Lake PCI ID for ahci"
* tag 'libata-5.7-2020-04-17' of git://git.kernel.dk/linux-block:
ahci: Add Intel Comet Lake PCH-U PCI ID
Linus Torvalds [Fri, 17 Apr 2020 17:00:33 +0000 (10:00 -0700)]
Merge tag 'for-5.7-rc1-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"A regression fix for a warning caused by running balance and snapshot
creation in parallel"
* tag 'for-5.7-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix setting last_trans for reloc roots
Linus Torvalds [Fri, 17 Apr 2020 16:56:28 +0000 (09:56 -0700)]
Merge tag 'pm-5.7-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management update from Rafael Wysocki:
"Allow the operating performance points (OPP) core to be used in the
case when the same driver is used on different platforms, some of
which have an OPP table and some of which have a clock node (Rajendra
Nayak)"
* tag 'pm-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
opp: Manage empty OPP tables with clk handle
Linus Torvalds [Fri, 17 Apr 2020 16:48:50 +0000 (09:48 -0700)]
Merge tag 'sound-5.7-rc2' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"One significant regression fix is for HD-audio buffer preallocation.
In 5.6 it was set to non-prompt for x86 and forced to 0, but this
turned out to be problematic for some applications, hence it gets
reverted. Distros would need to restore CONFIG_SND_HDA_PREALLOC_SIZE
value to the earlier values they've used in the past.
Other than that, we've received quite a few small fixes for HD-audio
and USB-audio. Most of them are for dealing with the broken TRX40
mobos and the runtime PM without HD-audio codecs"
* tag 'sound-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda: call runtime_allow() for all hda controllers
ALSA: hda: Allow setting preallocation again for x86
ALSA: hda: Explicitly permit using autosuspend if runtime PM is supported
ALSA: hda: Skip controller resume if not needed
ALSA: hda: Keep the controller initialization even if no codecs found
ALSA: hda: Release resources at error in delayed probe
ALSA: hda: Honor PM disablement in PM freeze and thaw_noirq ops
ALSA: hda: Don't release card at firmware loading error
ALSA: usb-audio: Check mapping at creating connector controls, too
ALSA: usb-audio: Don't create jack controls for PCM terminals
ALSA: usb-audio: Don't override ignore_ctl_error value from the map
ALSA: usb-audio: Filter error from connector kctl ops, too
ALSA: hda/realtek - Enable the headset mic on Asus FX505DT
ALSA: ctxfi: Remove unnecessary cast in kfree
Tommi Rantala [Fri, 17 Apr 2020 13:00:23 +0000 (16:00 +0300)]
blk-wbt: Drop needless newlines from tracepoint format strings
Drop needless newlines from tracepoint format strings, they only add
empty lines to perf tracing output.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Tommi Rantala [Fri, 17 Apr 2020 13:00:22 +0000 (16:00 +0300)]
blk-wbt: Use tracepoint_string() for wbt_step tracepoint string literals
Use tracepoint_string() for string literals that are used in the
wbt_step tracepoint, so that userspace tools can display the string
content.
Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stefan Haberland [Fri, 17 Apr 2020 09:48:35 +0000 (11:48 +0200)]
s390/dasd: remove IOSCHED_DEADLINE from DASD Kconfig
CONFIG_IOSCHED_DEADLINE was removed with
commit
f382fb0bcef4 ("block: remove legacy IO schedulers")
and setting of the scheduler was removed with
commit
a5fd8ddce2af ("s390/dasd: remove setting of scheduler from driver").
So get rid of the select.
Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Josef Bacik [Fri, 10 Apr 2020 15:42:48 +0000 (11:42 -0400)]
btrfs: fix setting last_trans for reloc roots
I made a mistake with my previous fix, I assumed that we didn't need to
mess with the reloc roots once we were out of the part of relocation where
we are actually moving the extents.
The subtle thing that I missed is that btrfs_init_reloc_root() also
updates the last_trans for the reloc root when we do
btrfs_record_root_in_trans() for the corresponding fs_root. I've added a
comment to make sure future me doesn't make this mistake again.
This showed up as a WARN_ON() in btrfs_copy_root() because our
last_trans didn't == the current transid. This could happen if we
snapshotted a fs root with a reloc root after we set
rc->create_reloc_tree = 0, but before we actually merge the reloc root.
Worth mentioning that the regression produced the following warning
when running snapshot creation and balance in parallel:
BTRFS info (device sdc): relocating block group
30408704 flags metadata|dup
------------[ cut here ]------------
WARNING: CPU: 0 PID: 12823 at fs/btrfs/ctree.c:191 btrfs_copy_root+0x26f/0x430 [btrfs]
CPU: 0 PID: 12823 Comm: btrfs Tainted: G W 5.6.0-rc7-btrfs-next-58 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:btrfs_copy_root+0x26f/0x430 [btrfs]
RSP: 0018:
ffffb96e044279b8 EFLAGS:
00010202
RAX:
0000000000000009 RBX:
ffff9da70bf61000 RCX:
ffffb96e04427a48
RDX:
ffff9da733a770c8 RSI:
ffff9da70bf61000 RDI:
ffff9da694163818
RBP:
ffff9da733a770c8 R08:
fffffffffffffff8 R09:
0000000000000002
R10:
ffffb96e044279a0 R11:
0000000000000000 R12:
ffff9da694163818
R13:
fffffffffffffff8 R14:
ffff9da6d2512000 R15:
ffff9da714cdac00
FS:
00007fdeacf328c0(0000) GS:
ffff9da735e00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
000055a2a5b8a118 CR3:
00000001eed78002 CR4:
00000000003606f0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
? create_reloc_root+0x49/0x2b0 [btrfs]
? kmem_cache_alloc_trace+0xe5/0x200
create_reloc_root+0x8b/0x2b0 [btrfs]
btrfs_reloc_post_snapshot+0x96/0x5b0 [btrfs]
create_pending_snapshot+0x610/0x1010 [btrfs]
create_pending_snapshots+0xa8/0xd0 [btrfs]
btrfs_commit_transaction+0x4c7/0xc50 [btrfs]
? btrfs_mksubvol+0x3cd/0x560 [btrfs]
btrfs_mksubvol+0x455/0x560 [btrfs]
__btrfs_ioctl_snap_create+0x15f/0x190 [btrfs]
btrfs_ioctl_snap_create_v2+0xa4/0xf0 [btrfs]
? mem_cgroup_commit_charge+0x6e/0x540
btrfs_ioctl+0x12d8/0x3760 [btrfs]
? do_raw_spin_unlock+0x49/0xc0
? _raw_spin_unlock+0x29/0x40
? __handle_mm_fault+0x11b3/0x14b0
? ksys_ioctl+0x92/0xb0
ksys_ioctl+0x92/0xb0
? trace_hardirqs_off_thunk+0x1a/0x1c
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5c/0x280
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fdeabd3bdd7
Fixes: 2abc726ab4b8 ("btrfs: do not init a reloc root if we aren't relocating")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Linus Torvalds [Fri, 17 Apr 2020 01:14:13 +0000 (18:14 -0700)]
Merge tag 'nfs-for-5.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfix from Trond Myklebust:
"Fix an ABBA spinlock issue in pnfs_update_layout()"
* tag 'nfs-for-5.7-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix an ABBA spinlock issue in pnfs_update_layout()
Linus Torvalds [Thu, 16 Apr 2020 22:00:57 +0000 (15:00 -0700)]
Merge tag 'tag-chrome-platform-fixes-for-v5.7-rc2' of git://git./linux/kernel/git/chrome-platform/linux
Pull chrome-platform fixes from Benson Leung:
"Two small fixes for cros_ec_sensorhub_ring.c, addressing issues
introduced in the cros_ec_sensorhub FIFO support commit"
* tag 'tag-chrome-platform-fixes-for-v5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec_sensorhub: Add missing '\n' in log messages
platform/chrome: cros_ec_sensorhub: Off by one in cros_sensorhub_send_sample()
Linus Torvalds [Thu, 16 Apr 2020 21:52:29 +0000 (14:52 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Disable RISCV BPF JIT builds when !MMU, from Björn Töpel.
2) nf_tables leaves dangling pointer after free, fix from Eric Dumazet.
3) Out of boundary write in __xsk_rcv_memcpy(), fix from Li RongQing.
4) Adjust icmp6 message source address selection when routes have a
preferred source address set, from Tim Stallard.
5) Be sure to validate HSR protocol version when creating new links,
from Taehee Yoo.
6) CAP_NET_ADMIN should be sufficient to manage l2tp tunnels even in
non-initial namespaces, from Michael Weiß.
7) Missing release firmware call in mlx5, from Eran Ben Elisha.
8) Fix variable type in macsec_changelink(), caught by KASAN. Fix from
Taehee Yoo.
9) Fix pause frame negotiation in marvell phy driver, from Clemens
Gruber.
10) Record RX queue early enough in tun packet paths such that XDP
programs will see the correct RX queue index, from Gilberto Bertin.
11) Fix double unlock in mptcp, from Florian Westphal.
12) Fix offset overflow in ARM bpf JIT, from Luke Nelson.
13) marvell10g needs to soft reset PHY when coming out of low power
mode, from Russell King.
14) Fix MTU setting regression in stmmac for some chip types, from
Florian Fainelli.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
amd-xgbe: Use __napi_schedule() in BH context
mISDN: make dmril and dmrim static
net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes
net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode
tipc: fix incorrect increasing of link window
Documentation: Fix tcp_challenge_ack_limit default value
net: tulip: make early_486_chipsets static
dt-bindings: net: ethernet-phy: add desciption for ethernet-phy-id1234.d400
ipv6: remove redundant assignment to variable err
net/rds: Use ERR_PTR for rds_message_alloc_sgs()
net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge
selftests/bpf: Check for correct program attach/detach in xdp_attach test
libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
libbpf: Always specify expected_attach_type on program load if supported
xsk: Add missing check on user supplied headroom size
mac80211: fix channel switch trigger from unknown mesh peer
mac80211: fix race in ieee80211_register_hw()
net: marvell10g: soft-reset the PHY when coming out of low power
net: marvell10g: report firmware version
net/cxgb4: Check the return from t4_query_params properly
...
Sebastian Andrzej Siewior [Thu, 16 Apr 2020 15:57:40 +0000 (17:57 +0200)]
amd-xgbe: Use __napi_schedule() in BH context
The driver uses __napi_schedule_irqoff() which is fine as long as it is
invoked with disabled interrupts by everybody. Since the commit
mentioned below the driver may invoke xgbe_isr_task() in tasklet/softirq
context. This may lead to list corruption if another driver uses
__napi_schedule_irqoff() in IRQ context.
Use __napi_schedule() which safe to use from IRQ and softirq context.
Fixes: 85b85c853401d ("amd-xgbe: Re-issue interrupt if interrupt status not cleared")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Yan [Wed, 15 Apr 2020 08:42:26 +0000 (16:42 +0800)]
mISDN: make dmril and dmrim static
Fix the following sparse warning:
drivers/isdn/hardware/mISDN/mISDNisar.c:746:12: warning: symbol 'dmril'
was not declared. Should it be static?
drivers/isdn/hardware/mISDN/mISDNisar.c:749:12: warning: symbol 'dmrim'
was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 14 Apr 2020 22:39:52 +0000 (15:39 -0700)]
net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes
After commit
bfcb813203e619a8960a819bf533ad2a108d8105 ("net: dsa:
configure the MTU for switch ports") my Lamobo R1 platform which uses
an allwinner,sun7i-a20-gmac compatible Ethernet MAC started to fail
by rejecting a MTU of 1536. The reason for that is that the DMA
capabilities are not readable on this version of the IP, and there
is also no 'tx-fifo-depth' property being provided in Device Tree. The
property is documented as optional, and is not provided.
Chen-Yu indicated that the FIFO sizes are 4KB for TX and 16KB for RX, so
provide these values through platform data as an immediate fix until
various Device Tree sources get updated accordingly.
Fixes: eaf4fac47807 ("net: stmmac: Do not accept invalid MTU values")
Suggested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
DENG Qingfang [Tue, 14 Apr 2020 06:34:08 +0000 (14:34 +0800)]
net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode
In VLAN-unaware mode, the Egress Tag (EG_TAG) field in Port VLAN
Control register must be set to Consistent to let tagged frames pass
through as is, otherwise their tags will be stripped.
Fixes: 83163f7dca56 ("net: dsa: mediatek: add VLAN support for MT7530")
Signed-off-by: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 16 Apr 2020 17:45:47 +0000 (10:45 -0700)]
Merge tag 'selinux-pr-
20200416' of git://git./linux/kernel/git/pcmoore/selinux
Pull SELinux fix from Paul Moore:
"One small SELinux fix to ensure we cleanup properly on an error
condition"
* tag 'selinux-pr-
20200416' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: free str on error in str_read()
Linus Torvalds [Thu, 16 Apr 2020 17:29:34 +0000 (10:29 -0700)]
Merge tag 'ceph-for-5.7-rc2' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
- a set of patches for a deadlock on "rbd map" error path
- a fix for invalid pointer dereference and uninitialized variable use
on asynchronous create and unlink error paths.
* tag 'ceph-for-5.7-rc2' of git://github.com/ceph/ceph-client:
ceph: fix potential bad pointer deref in async dirops cb's
rbd: don't mess with a page vector in rbd_notify_op_lock()
rbd: don't test rbd_dev->opts in rbd_dev_image_release()
rbd: call rbd_dev_unprobe() after unwatching and flushing notifies
rbd: avoid a deadlock on header_rwsem when flushing notifies
Linus Torvalds [Thu, 16 Apr 2020 17:14:22 +0000 (10:14 -0700)]
Merge tag 'trace-v5.7-rc1' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"This fixes a small race between allocating a snapshot buffer and
setting the snapshot trigger.
On a slow machine, the trigger can occur before the snapshot is
allocated causing a warning to be displayed in the ring buffer, and no
snapshot triggering. Reversing the allocation and the enabling of the
trigger fixes the problem"
* tag 'trace-v5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation
Vasily Averin [Tue, 14 Apr 2020 20:33:16 +0000 (21:33 +0100)]
keys: Fix proc_keys_next to increase position index
If seq_file .next function does not change position index,
read after some lseek can generate unexpected output:
$ dd if=/proc/keys bs=1 # full usual output
0f6bfdf5 I--Q--- 2 perm
3f010000 1000 1000 user
4af2f79ab8848d0a: 740
1fb91b32 I--Q--- 3 perm
1f3f0000 1000 65534 keyring _uid.1000: 2
27589480 I--Q--- 1 perm
0b0b0000 0 0 user invocation_id: 16
2f33ab67 I--Q--- 152 perm
3f030000 0 0 keyring _ses: 2
33f1d8fa I--Q--- 4 perm
3f030000 1000 1000 keyring _ses: 1
3d427fda I--Q--- 2 perm
3f010000 1000 1000 user
69ec44aec7678e5a: 740
3ead4096 I--Q--- 1 perm
1f3f0000 1000 65534 keyring _uid_ses.1000: 1
521+0 records in
521+0 records out
521 bytes copied, 0,
00123769 s, 421 kB/s
But a read after lseek in middle of last line results in the partial
last line and then a repeat of the final line:
$ dd if=/proc/keys bs=500 skip=1
dd: /proc/keys: cannot skip to specified offset
g _uid_ses.1000: 1
3ead4096 I--Q--- 1 perm
1f3f0000 1000 65534 keyring _uid_ses.1000: 1
0+1 records in
0+1 records out
97 bytes copied, 0,
000135035 s, 718 kB/s
and a read after lseek beyond end of file results in the last line being
shown:
$ dd if=/proc/keys bs=1000 skip=1 # read after lseek beyond end of file
dd: /proc/keys: cannot skip to specified offset
3ead4096 I--Q--- 1 perm
1f3f0000 1000 65534 keyring _uid_ses.1000: 1
0+1 records in
0+1 records out
76 bytes copied, 0,
000119981 s, 633 kB/s
See https://bugzilla.kernel.org/show_bug.cgi?id=206283
Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kai-Heng Feng [Thu, 16 Apr 2020 06:35:40 +0000 (14:35 +0800)]
ahci: Add Intel Comet Lake PCH-U PCI ID
Add Intel Comet Lake PCH-U PCI ID to the list of supported controllers.
Set default SATA LPM so the SoC can enter S0ix.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
John Garry [Thu, 16 Apr 2020 11:18:51 +0000 (19:18 +0800)]
blk-mq: Put driver tag in blk_mq_dispatch_rq_list() when no budget
If in blk_mq_dispatch_rq_list() we find no budget, then we break of the
dispatch loop, but the request may keep the driver tag, evaulated
in 'nxt' in the previous loop iteration.
Fix by putting the driver tag for that request.
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 16 Apr 2020 00:37:48 +0000 (17:37 -0700)]
Merge tag 'efi-urgent-2020-04-15' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Misc EFI fixes, including the boot failure regression caused by the
BSS section not being cleared by the loaders"
* tag 'efi-urgent-2020-04-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Revert struct layout change to fix kexec boot regression
efi/x86: Don't remap text<->rodata gap read-only for mixed mode
efi/x86: Fix the deletion of variables in mixed mode
efi/libstub/file: Merge file name buffers to reduce stack usage
Documentation/x86, efi/x86: Clarify EFI handover protocol and its requirements
efi/arm: Deal with ADR going out of range in efi_enter_kernel()
efi/x86: Always relocate the kernel for EFI handover entry
efi/x86: Move efi stub globals from .bss to .data
efi/libstub/x86: Remove redundant assignment to pointer hdr
efi/cper: Use scnprintf() for avoiding potential buffer overflow
Tuong Lien [Wed, 15 Apr 2020 11:34:49 +0000 (18:34 +0700)]
tipc: fix incorrect increasing of link window
In commit
16ad3f4022bb ("tipc: introduce variable window congestion
control"), we allow link window to change with the congestion avoidance
algorithm. However, there is a bug that during the slow-start if packet
retransmission occurs, the link will enter the fast-recovery phase, set
its window to the 'ssthresh' which is never less than 300, so the link
window suddenly increases to that limit instead of decreasing.
Consequently, two issues have been observed:
- For broadcast-link: it can leave a gap between the link queues that a
new packet will be inserted and sent before the previous ones, i.e. not
in-order.
- For unicast: the algorithm does not work as expected, the link window
jumps to the slow-start threshold whereas packet retransmission occurs.
This commit fixes the issues by avoiding such the link window increase,
but still decreasing if the 'ssthresh' is lowered.
Fixes: 16ad3f4022bb ("tipc: introduce variable window congestion control")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cambda Zhu [Wed, 15 Apr 2020 09:54:04 +0000 (17:54 +0800)]
Documentation: Fix tcp_challenge_ack_limit default value
The default value of tcp_challenge_ack_limit has been changed from
100 to 1000 and this patch fixes its documentation.
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Yan [Wed, 15 Apr 2020 08:42:48 +0000 (16:42 +0800)]
net: tulip: make early_486_chipsets static
Fix the following sparse warning:
drivers/net/ethernet/dec/tulip/tulip_core.c:1280:28: warning: symbol
'early_486_chipsets' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Jonker [Wed, 15 Apr 2020 20:01:49 +0000 (22:01 +0200)]
dt-bindings: net: ethernet-phy: add desciption for ethernet-phy-id1234.d400
The description below is already in use in
'rk3228-evb.dts', 'rk3229-xms6.dts' and 'rk3328.dtsi'
but somehow never added to a document, so add
"ethernet-phy-id1234.d400", "ethernet-phy-ieee802.3-c22"
for ethernet-phy nodes on Rockchip platforms to
'ethernet-phy.yaml'.
Signed-off-by: Johan Jonker <jbx6244@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 15 Apr 2020 23:16:30 +0000 (00:16 +0100)]
ipv6: remove redundant assignment to variable err
The variable err is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ondrej Mosnacek [Tue, 14 Apr 2020 14:23:51 +0000 (16:23 +0200)]
selinux: free str on error in str_read()
In [see "Fixes:"] I missed the fact that str_read() may give back an
allocated pointer even if it returns an error, causing a potential
memory leak in filename_trans_read_one(). Fix this by making the
function free the allocated string whenever it returns a non-zero value,
which also makes its behavior more obvious and prevents repeating the
same mistake in the future.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID:
1461665 ("Resource leaks")
Fixes: c3a276111ea2 ("selinux: optimize storage of filename transitions")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Jason Gunthorpe [Tue, 14 Apr 2020 23:02:07 +0000 (20:02 -0300)]
net/rds: Use ERR_PTR for rds_message_alloc_sgs()
Returning the error code via a 'int *ret' when the function returns a
pointer is very un-kernely and causes gcc 10's static analysis to choke:
net/rds/message.c: In function ‘rds_message_map_pages’:
net/rds/message.c:358:10: warning: ‘ret’ may be used uninitialized in this function [-Wmaybe-uninitialized]
358 | return ERR_PTR(ret);
Use a typical ERR_PTR return instead.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Tue, 14 Apr 2020 19:36:15 +0000 (22:36 +0300)]
net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge
To rehash a previous explanation given in commit
1c44ce560b4d ("net:
mscc: ocelot: fix vlan_filtering when enslaving to bridge before link is
up"), the switch driver operates the in a mode where a single VLAN can
be transmitted as untagged on a particular egress port. That is the
"native VLAN on trunk port" use case.
The configuration for this native VLAN is driven in 2 ways:
- Set the egress port rewriter to strip the VLAN tag for the native
VID (as it is egress-untagged, after all).
- Configure the ingress port to drop untagged and priority-tagged
traffic, if there is no native VLAN. The intention of this setting is
that a trunk port with no native VLAN should not accept untagged
traffic.
Since both of the above configurations for the native VLAN should only
be done if VLAN awareness is requested, they are actually done from the
ocelot_port_vlan_filtering function, after the basic procedure of
toggling the VLAN awareness flag of the port.
But there's a problem with that simplistic approach: we are trying to
juggle with 2 independent variables from a single function:
- Native VLAN of the port - its value is held in port->vid.
- VLAN awareness state of the port - currently there are some issues
here, more on that later*.
The actual problem can be seen when enslaving the switch ports to a VLAN
filtering bridge:
0. The driver configures a pvid of zero for each port, when in
standalone mode. While the bridge configures a default_pvid of 1 for
each port that gets added as a slave to it.
1. The bridge calls ocelot_port_vlan_filtering with vlan_aware=true.
The VLAN-filtering-dependent portion of the native VLAN
configuration is done, considering that the native VLAN is 0.
2. The bridge calls ocelot_vlan_add with vid=1, pvid=true,
untagged=true. The native VLAN changes to 1 (change which gets
propagated to hardware).
3. ??? - nobody calls ocelot_port_vlan_filtering again, to reapply the
VLAN-filtering-dependent portion of the native VLAN configuration,
for the new native VLAN of 1. One can notice that after toggling "ip
link set dev br0 type bridge vlan_filtering 0 && ip link set dev br0
type bridge vlan_filtering 1", the new native VLAN finally makes it
through and untagged traffic finally starts flowing again. But
obviously that shouldn't be needed.
So it is clear that 2 independent variables need to both re-trigger the
native VLAN configuration. So we introduce the second variable as
ocelot_port->vlan_aware.
*Actually both the DSA Felix driver and the Ocelot driver already had
each its own variable:
- Ocelot: ocelot_port_private->vlan_aware
- Felix: dsa_port->vlan_filtering
but the common Ocelot library needs to work with a single, common,
variable, so there is some refactoring done to move the vlan_aware
property from the private structure into the common ocelot_port
structure.
Fixes: 97bb69e1e36e ("net: mscc: ocelot: break apart ocelot_vlan_port_apply")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 15 Apr 2020 18:35:42 +0000 (11:35 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2020-04-15
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 3 day(s) which contain
a total of 11 files changed, 238 insertions(+), 95 deletions(-).
The main changes are:
1) Fix offset overflow for BPF_MEM BPF_DW insn mapping on arm32 JIT,
from Luke Nelson and Xi Wang.
2) Prevent mprotect() to make frozen & mmap()'ed BPF map writeable
again, from Andrii Nakryiko and Jann Horn.
3) Fix type of old_fd in bpf_xdp_set_link_opts to int in libbpf and add
selftests, from Toke Høiland-Jørgensen.
4) Fix AF_XDP to check that headroom cannot be larger than the available
space in the chunk, from Magnus Karlsson.
5) Fix reset of XDP prog when expected_fd is set, from David Ahern.
6) Fix a segfault in bpftool's struct_ops command when BTF is not
available, from Daniel T. Lee.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 15 Apr 2020 18:27:23 +0000 (11:27 -0700)]
Merge tag 'mac80211-for-net-2020-04-15' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A couple of fixes:
* FTM responder policy netlink validation fix
(but the only user validates again later)
* kernel-doc fixes
* a fix for a race in mac80211 radio registration vs. userspace
* a mesh channel switch fix
* a fix for a syzbot reported kasprintf() issue
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Toke Høiland-Jørgensen [Tue, 14 Apr 2020 14:50:25 +0000 (16:50 +0200)]
selftests/bpf: Check for correct program attach/detach in xdp_attach test
David Ahern noticed that there was a bug in the EXPECTED_FD code so
programs did not get detached properly when that parameter was supplied.
This case was not included in the xdp_attach tests; so let's add it to be
sure that such a bug does not sneak back in down.
Fixes: 87854a0b57b3 ("selftests/bpf: Add tests for attaching XDP programs")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200414145025.182163-2-toke@redhat.com
Toke Høiland-Jørgensen [Tue, 14 Apr 2020 14:50:24 +0000 (16:50 +0200)]
libbpf: Fix type of old_fd in bpf_xdp_set_link_opts
The 'old_fd' parameter used for atomic replacement of XDP programs is
supposed to be an FD, but was left as a u32 from an earlier iteration of
the patch that added it. It was converted to an int when read, so things
worked correctly even with negative values, but better change the
definition to correctly reflect the intention.
Fixes: bd5ca3ef93cd ("libbpf: Add function to set link XDP fd while specifying old program")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200414145025.182163-1-toke@redhat.com
Andrii Nakryiko [Tue, 14 Apr 2020 18:26:45 +0000 (11:26 -0700)]
libbpf: Always specify expected_attach_type on program load if supported
For some types of BPF programs that utilize expected_attach_type, libbpf won't
set load_attr.expected_attach_type, even if expected_attach_type is known from
section definition. This was done to preserve backwards compatibility with old
kernels that didn't recognize expected_attach_type attribute yet (which was
added in
5e43f899b03a ("bpf: Check attach type at prog load time"). But this
is problematic for some BPF programs that utilize newer features that require
kernel to know specific expected_attach_type (e.g., extended set of return
codes for cgroup_skb/egress programs).
This patch makes libbpf specify expected_attach_type by default, but also
detect support for this field in kernel and not set it during program load.
This allows to have a good metadata for bpf_program
(e.g., bpf_program__get_extected_attach_type()), but still work with old
kernels (for cases where it can work at all).
Additionally, due to expected_attach_type being always set for recognized
program types, bpf_program__attach_cgroup doesn't have to do extra checks to
determine correct attach type, so remove that additional logic.
Also adjust section_names selftest to account for this change.
More detailed discussion can be found in [0].
[0] https://lore.kernel.org/bpf/
20200412003604.GA15986@rdna-mbp.dhcp.thefacebook.com/
Fixes: 5cf1e9145630 ("bpf: cgroup inet skb programs can return 0 to 3")
Fixes: 5e43f899b03a ("bpf: Check attach type at prog load time")
Reported-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20200414182645.1368174-1-andriin@fb.com
Magnus Karlsson [Tue, 14 Apr 2020 07:35:15 +0000 (09:35 +0200)]
xsk: Add missing check on user supplied headroom size
Add a check that the headroom cannot be larger than the available
space in the chunk. In the current code, a malicious user can set the
headroom to a value larger than the chunk size minus the fixed XDP
headroom. That way packets with a length larger than the supported
size in the umem could get accepted and result in an out-of-bounds
write.
Fixes: c0c77d8fb787 ("xsk: add user memory registration support sockopt")
Reported-by: Bui Quang Minh <minhquangbui99@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207225
Link: https://lore.kernel.org/bpf/1586849715-23490-1-git-send-email-magnus.karlsson@intel.com
Tamizh chelvam [Sat, 28 Mar 2020 13:53:24 +0000 (19:23 +0530)]
mac80211: fix channel switch trigger from unknown mesh peer
Previously mesh channel switch happens if beacon contains
CSA IE without checking the mesh peer info. Due to that
channel switch happens even if the beacon is not from
its own mesh peer. Fixing that by checking if the CSA
originated from the same mesh network before proceeding
for channel switch.
Signed-off-by: Tamizh chelvam <tamizhr@codeaurora.org>
Link: https://lore.kernel.org/r/1585403604-29274-1-git-send-email-tamizhr@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Sumit Garg [Tue, 7 Apr 2020 10:10:55 +0000 (15:40 +0530)]
mac80211: fix race in ieee80211_register_hw()
A race condition leading to a kernel crash is observed during invocation
of ieee80211_register_hw() on a dragonboard410c device having wcn36xx
driver built as a loadable module along with a wifi manager in user-space
waiting for a wifi device (wlanX) to be active.
Sequence diagram for a particular kernel crash scenario:
user-space ieee80211_register_hw() ieee80211_tasklet_handler()
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| | |
|<---phy0----wiphy_register() |
|-----iwd if_add---->| |
| |<---IRQ----(RX packet)
| Kernel crash |
| due to unallocated |
| workqueue. |
| | |
| alloc_ordered_workqueue() |
| | |
| Misc wiphy init. |
| | |
| ieee80211_if_add() |
| | |
As evident from above sequence diagram, this race condition isn't specific
to a particular wifi driver but rather the initialization sequence in
ieee80211_register_hw() needs to be fixed. So re-order the initialization
sequence and the updated sequence diagram would look like:
user-space ieee80211_register_hw() ieee80211_tasklet_handler()
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| | |
| alloc_ordered_workqueue() |
| | |
| Misc wiphy init. |
| | |
|<---phy0----wiphy_register() |
|-----iwd if_add---->| |
| |<---IRQ----(RX packet)
| | |
| ieee80211_if_add() |
| | |
Cc: stable@vger.kernel.org
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/1586254255-28713-1-git-send-email-sumit.garg@linaro.org
[Johannes: fix rtnl imbalances]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Xiao Yang [Tue, 14 Apr 2020 01:51:45 +0000 (09:51 +0800)]
tracing: Fix the race between registering 'snapshot' event trigger and triggering 'snapshot' operation
Traced event can trigger 'snapshot' operation(i.e. calls snapshot_trigger()
or snapshot_count_trigger()) when register_snapshot_trigger() has completed
registration but doesn't allocate buffer for 'snapshot' event trigger. In
the rare case, 'snapshot' operation always detects the lack of allocated
buffer so make register_snapshot_trigger() allocate buffer first.
trigger-snapshot.tc in kselftest reproduces the issue on slow vm:
-----------------------------------------------------------
cat trace
...
ftracetest-3028 [002] .... 236.784290: sched_process_fork: comm=ftracetest pid=3028 child_comm=ftracetest child_pid=3036
<...>-2875 [003] .... 240.460335: tracing_snapshot_instance_cond: *** SNAPSHOT NOT ALLOCATED ***
<...>-2875 [003] .... 240.460338: tracing_snapshot_instance_cond: *** stopping trace here! ***
-----------------------------------------------------------
Link: http://lkml.kernel.org/r/20200414015145.66236-1-yangx.jy@cn.fujitsu.com
Cc: stable@vger.kernel.org
Fixes: 93e31ffbf417a ("tracing: Add 'snapshot' event trigger command")
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
David S. Miller [Tue, 14 Apr 2020 23:48:09 +0000 (16:48 -0700)]
Merge branch 'Fix-88x3310-leaving-power-save-mode'
Russell King says:
====================
Fix 88x3310 leaving power save mode
This series fixes a problem with the 88x3310 PHY on Macchiatobin
coming out of powersave mode noticed by Matteo Croce. It seems
that certain PHY firmwares do not properly exit powersave mode,
resulting in a fibre link not coming up.
The solution appears to be to soft-reset the PHY after clearing
the powersave bit.
We add support for reporting the PHY firmware version to the kernel
log, and use it to trigger this new behaviour if we have v0.3.x.x
or more recent firmware on the PHY. This, however, is a guess as
the firmware revision documentation does not mention this issue,
and we know that v0.2.1.0 works without this fix but v0.3.3.0 and
later does not.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 14 Apr 2020 19:49:08 +0000 (20:49 +0100)]
net: marvell10g: soft-reset the PHY when coming out of low power
Soft-reset the PHY when coming out of low power mode, which seems to
be necessary with firmware versions 0.3.3.0 and 0.3.10.0.
This depends on ("net: marvell10g: report firmware version")
Fixes: c9cc1c815d36 ("net: phy: marvell10g: place in powersave mode at probe")
Reported-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 14 Apr 2020 19:49:03 +0000 (20:49 +0100)]
net: marvell10g: report firmware version
Report the firmware version when probing the PHY to allow issues
attributable to firmware to be diagnosed.
Tested-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Gunthorpe [Tue, 14 Apr 2020 15:27:08 +0000 (12:27 -0300)]
net/cxgb4: Check the return from t4_query_params properly
Positive return values are also failures that don't set val,
although this probably can't happen. Fixes gcc 10 warning:
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c: In function ‘t4_phy_fw_ver’:
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:3747:14: warning: ‘val’ may be used uninitialized in this function [-Wmaybe-uninitialized]
3747 | *phy_fw_ver = val;
Fixes: 01b6961410b7 ("cxgb4: Add PHY firmware support for T420-BT cards")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atsushi Nemoto [Tue, 14 Apr 2020 01:12:34 +0000 (10:12 +0900)]
net: stmmac: socfpga: Allow all RGMII modes
Allow all the RGMII modes to be used. (Not only "rgmii", "rgmii-id"
but "rgmii-txid", "rgmii-rxid")
Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Apr 2020 23:33:26 +0000 (16:33 -0700)]
Merge branch 'mv88e6xxx-fixed-link-fixes'
Andrew Lunn says:
====================
mv88e6xxx fixed link fixes
Recent changes for how the MAC is configured broke fixed links, as
used by CPU/DSA ports, and for SFPs when phylink cannot be used. The
first fix is unchanged from v1. The second fix takes a different
solution than v1. If a CPU or DSA port is known to have a PHYLINK
instance, configure the port down before instantiating the PHYLINK, so
it is in the down state as expected by PHYLINK.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 14 Apr 2020 00:34:39 +0000 (02:34 +0200)]
net: dsa: Down cpu/dsa ports phylink will control
DSA and CPU ports can be configured in two ways. By default, the
driver should configure such ports to there maximum bandwidth. For
most use cases, this is sufficient. When this default is insufficient,
a phylink instance can be bound to such ports, and phylink will
configure the port, e.g. based on fixed-link properties. phylink
assumes the port is initially down. Given that the driver should have
already configured it to its maximum speed, ask the driver to down
the port before instantiating the phylink instance.
Fixes: 30c4a5b0aad8 ("net: mv88e6xxx: use resolved link config in mac_link_up()")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 14 Apr 2020 00:34:38 +0000 (02:34 +0200)]
net: dsa: mv88e6xxx: Configure MAC when using fixed link
The
88e6185 is reporting it has detected a PHY, when a port is
connected to an SFP. As a result, the fixed-phy configuration is not
being applied. That then breaks packet transfer, since the port is
reported as being down.
Add additional conditions to check the interface mode, and if it is
fixed always configure the port on link up/down, independent of the
PPU status.
Fixes: 30c4a5b0aad8 ("net: mv88e6xxx: use resolved link config in mac_link_up()")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Apr 2020 23:30:14 +0000 (16:30 -0700)]
Merge branch 'ionic-address-automated-build-complaints'
Shannon Nelson says:
====================
ionic: address automated build complaints
Kernel build checks found a couple of things to improve.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Apr 2020 17:33:11 +0000 (10:33 -0700)]
ionic: fix unused assignment
Remove an unused initialized value.
Fixes: 7e4d47596b68 ("ionic: replay filters after fw upgrade")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Mon, 13 Apr 2020 17:33:10 +0000 (10:33 -0700)]
ionic: add dynamic_debug header
Add the appropriate header for using dynamic_hex_dump(), which
seems to be incidentally included in some configurations but
not all.
Fixes: 7e4d47596b68 ("ionic: replay filters after fw upgrade")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Mon, 13 Apr 2020 12:57:14 +0000 (13:57 +0100)]
rxrpc: Fix DATA Tx to disable nofrag for UDP on AF_INET6 socket
Fix the DATA packet transmission to disable nofrag for UDPv4 on an AF_INET6
socket as well as UDPv6 when trying to transmit fragmentably.
Without this, packets filled to the normal size used by the kernel AFS
client of 1412 bytes be rejected by udp_sendmsg() with EMSGSIZE
immediately. The ->sk_error_report() notification hook is called, but
rxrpc doesn't generate a trace for it.
This is a temporary fix; a more permanent solution needs to involve
changing the size of the packets being filled in accordance with the MTU,
which isn't currently done in AF_RXRPC. The reason for not doing so was
that, barring the last packet in an rx jumbo packet, jumbos can only be
assembled out of 1412-byte packets - and the plan was to construct jumbos
on the fly at transmission time.
Also, there's no point turning on IPV6_MTU_DISCOVER, since IPv6 has to
engage in this anyway since fragmentation is only done by the sender. We
can then condense the switch-statement in rxrpc_send_data_packet().
Fixes: 75b54cb57ca3 ("rxrpc: Add IPv6 support")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atsushi Nemoto [Fri, 10 Apr 2020 03:16:16 +0000 (12:16 +0900)]
net: phy: micrel: use genphy_read_status for KSZ9131
KSZ9131 will not work with some switches due to workaround for KSZ9031
introduced in commit
d2fd719bcb0e83cb39cfee22ee800f98a56eceb3
("net/phy: micrel: Add workaround for bad autoneg").
Use genphy_read_status instead of dedicated ksz9031_read_status.
Fixes: bff5b4b37372 ("net: phy: micrel: add Microchip KSZ9131 initial driver")
Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rafael J. Wysocki [Tue, 14 Apr 2020 21:13:53 +0000 (23:13 +0200)]
Merge branch 'opp/linux-next' of git://git./linux/kernel/git/vireshk/pm
Merge an operating performance points (OPP) framework update for
5.7-rc2 from Viresh Kumar:
"This contains a single patch that lets the OPP core to be used by
several IO drivers without making a lot of changes in them for the
case where the same driver may be used by a platform with an OPP
table or a clock node on another one. I am looking to get this into
5.7 release itself, which will enable other users (in multiple
frameworks) to get merged without waiting for the dependency to get
resolved."
* 'opp/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Manage empty OPP tables with clk handle
David S. Miller [Tue, 14 Apr 2020 20:07:19 +0000 (13:07 -0700)]
Merge tag 'wireless-drivers-2020-04-14' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.7
First set of fixes for v5.6. Fixes for a crash and for two compiler
warnings.
brcmfmac
* fix a crash related to monitor interface
ath11k
* fix compiler warnings without CONFIG_THERMAL
rtw88
* fix compiler warnings related to suspend and resume functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Zou Wei [Mon, 13 Apr 2020 11:57:56 +0000 (19:57 +0800)]
bpf: remove unneeded conversion to bool in __mark_reg_unknown
This issue was detected by using the Coccinelle software:
kernel/bpf/verifier.c:1259:16-21: WARNING: conversion to bool not needed here
The conversion to bool is unneeded, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/1586779076-101346-1-git-send-email-zou_wei@huawei.com
David Ahern [Sun, 12 Apr 2020 13:32:04 +0000 (07:32 -0600)]
xdp: Reset prog in dev_change_xdp_fd when fd is negative
The commit mentioned in the Fixes tag reuses the local prog variable
when looking up an expected_fd. The variable is not reset when fd < 0
causing a detach with the expected_fd set to actually call
dev_xdp_install for the existing program. The end result is that the
detach does not happen.
Fixes: 92234c8f15c8 ("xdp: Support specifying expected existing program when attaching XDP")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200412133204.43847-1-dsahern@kernel.org
Daniel T. Lee [Fri, 10 Apr 2020 02:06:12 +0000 (11:06 +0900)]
tools, bpftool: Fix struct_ops command invalid pointer free
In commit
65c93628599d ("bpftool: Add struct_ops support") a new
type of command named struct_ops has been added. This command requires
a kernel with CONFIG_DEBUG_INFO_BTF=y set and for retrieving BTF info
in bpftool, the helper get_btf_vmlinux() is used.
When running this command on kernel without BTF debug info, this will
lead to 'btf_vmlinux' variable being an invalid(error) pointer. And by
this, btf_free() causes a segfault when executing 'bpftool struct_ops'.
This commit adds pointer validation with IS_ERR not to free invalid
pointer, and this will fix the segfault issue.
Fixes: 65c93628599d ("bpftool: Add struct_ops support")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200410020612.2930667-1-danieltimlee@gmail.com
Andrii Nakryiko [Fri, 10 Apr 2020 20:26:13 +0000 (13:26 -0700)]
selftests/bpf: Validate frozen map contents stays frozen
Test that frozen and mmap()'ed BPF map can't be mprotect()'ed as writable or
executable memory. Also validate that "downgrading" from writable to read-only
doesn't screw up internal writable count accounting for the purposes of map
freezing.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200410202613.3679837-2-andriin@fb.com
Andrii Nakryiko [Fri, 10 Apr 2020 20:26:12 +0000 (13:26 -0700)]
bpf: Prevent re-mmap()'ing BPF map as writable for initially r/o mapping
VM_MAYWRITE flag during initial memory mapping determines if already mmap()'ed
pages can be later remapped as writable ones through mprotect() call. To
prevent user application to rewrite contents of memory-mapped as read-only and
subsequently frozen BPF map, remove VM_MAYWRITE flag completely on initially
read-only mapping.
Alternatively, we could treat any memory-mapping on unfrozen map as writable
and bump writecnt instead. But there is little legitimate reason to map
BPF map as read-only and then re-mmap() it as writable through mprotect(),
instead of just mmap()'ing it as read/write from the very beginning.
Also, at the suggestion of Jann Horn, drop unnecessary refcounting in mmap
operations. We can just rely on VMA holding reference to BPF map's file
properly.
Fixes: fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/bpf/20200410202613.3679837-1-andriin@fb.com
Luke Nelson [Thu, 9 Apr 2020 22:17:52 +0000 (15:17 -0700)]
arm, bpf: Fix offset overflow for BPF_MEM BPF_DW
This patch fixes an incorrect check in how immediate memory offsets are
computed for BPF_DW on arm.
For BPF_LDX/ST/STX + BPF_DW, the 32-bit arm JIT breaks down an 8-byte
access into two separate 4-byte accesses using off+0 and off+4. If off
fits in imm12, the JIT emits a ldr/str instruction with the immediate
and avoids the use of a temporary register. While the current check off
<= 0xfff ensures that the first immediate off+0 doesn't overflow imm12,
it's not sufficient for the second immediate off+4, which may cause the
second access of BPF_DW to read/write the wrong address.
This patch fixes the problem by changing the check to
off <= 0xfff - 4 for BPF_DW, ensuring off+4 will never overflow.
A side effect of simplifying the check is that it now allows using
negative immediate offsets in ldr/str. This means that small negative
offsets can also avoid the use of a temporary register.
This patch introduces no new failures in test_verifier or test_bpf.c.
Fixes: c5eae692571d6 ("ARM: net: bpf: improve 64-bit store implementation")
Fixes: ec19e02b343db ("ARM: net: bpf: fix LDX instructions")
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com
Linus Torvalds [Tue, 14 Apr 2020 18:58:04 +0000 (11:58 -0700)]
Merge tag 'hyperv-fixes-signed' of git://git./linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- a series from Tianyu Lan to fix crash reporting on Hyper-V
- three miscellaneous cleanup patches
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/Hyper-V: Report crash data in die() when panic_on_oops is set
x86/Hyper-V: Report crash register data when sysctl_record_panic_msg is not set
x86/Hyper-V: Report crash register data or kmsg before running crash kernel
x86/Hyper-V: Trigger crash enlightenment only once during system crash.
x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
x86/Hyper-V: Unload vmbus channel in hv panic callback
x86: hyperv: report value of misc_features
hv_debugfs: Make hv_debug_root static
hv: hyperv_vmbus.h: Replace zero-length array with flexible-array member
Linus Torvalds [Tue, 14 Apr 2020 18:51:30 +0000 (11:51 -0700)]
Merge tag 'for-5.7-rc1-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"We have a few regressions and one fix for stable:
- revert fsync optimization
- fix lost i_size update
- fix a space accounting leak
- build fix, add back definition of a deprecated ioctl flag
- fix search condition for old roots in relocation"
* tag 'for-5.7-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: re-instantiate the removed BTRFS_SUBVOL_CREATE_ASYNC definition
btrfs: fix reclaim counter leak of space_info objects
btrfs: make full fsyncs always operate on the entire file again
btrfs: fix lost i_size update after cloning inline extent
btrfs: check commit root generation in should_ignore_root
Linus Torvalds [Tue, 14 Apr 2020 18:47:30 +0000 (11:47 -0700)]
Merge tag 'afs-fixes-
20200413' of git://git./linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
- Fix the decoding of fetched file status records so that the xdr
pointer is advanced under all circumstances.
- Fix the decoding of a fetched file status record that indicates an
inline abort (ie. an error) so that it sets the flag saying the
decoder stored the abort code.
- Fix the decoding of the result of the rename operation so that it
doesn't skip the decoding of the second fetched file status (ie. that
of the dest dir) in the case that the source and dest dirs were the
same as this causes the xdr pointer not to be advanced, leading to
incorrect decoding of subsequent parts of the reply.
- Fix the dump of a bad YFSFetchStatus record to dump the full length.
- Fix a race between local editing of directory contents and accessing
the dir for reading or d_revalidate by using the same lock in both.
- Fix afs_d_revalidate() to not accidentally reverse the version on a
dentry when it's meant to be bringing it forward.
* tag 'afs-fixes-
20200413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix afs_d_validate() to set the right directory version
afs: Fix race between post-modification dir edit and readdir/d_revalidate
afs: Fix length of dump of bad YFSFetchStatus record
afs: Fix rename operation status delivery
afs: Fix decoding of inline abort codes from version 1 status records
afs: Fix missing XDR advance in xdr_decode_{AFS,YFS}FSFetchStatus()
Hui Wang [Tue, 14 Apr 2020 14:27:25 +0000 (22:27 +0800)]
ALSA: hda: call runtime_allow() for all hda controllers
Before the pci_driver->probe() is called, the pci subsystem calls
runtime_forbid() and runtime_get_sync() on this pci dev, so only call
runtime_put_autosuspend() is not enough to enable the runtime_pm on
this device.
For controllers with vgaswitcheroo feature, the pci/quirks.c will call
runtime_allow() for this dev, then the controllers could enter
rt_idle/suspend/resume, but for non-vgaswitcheroo controllers like
Intel hda controllers, the runtime_pm is not enabled because the
runtime_allow() is not called.
Since it is no harm calling runtime_allow() twice, here let hda
driver call runtime_allow() for all controllers. Then the runtime_pm
is enabled on all controllers after the put_autosuspend() is called.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Link: https://lore.kernel.org/r/20200414142725.6020-1-hui.wang@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Arnd Bergmann [Wed, 8 Apr 2020 18:53:51 +0000 (20:53 +0200)]
rtw88: avoid unused function warnings
The rtw88 driver defines emtpy functions with multiple indirections
but gets one of these wrong:
drivers/net/wireless/realtek/rtw88/pci.c:1347:12: error: 'rtw_pci_resume' defined but not used [-Werror=unused-function]
1347 | static int rtw_pci_resume(struct device *dev)
| ^~~~~~~~~~~~~~
drivers/net/wireless/realtek/rtw88/pci.c:1342:12: error: 'rtw_pci_suspend' defined but not used [-Werror=unused-function]
1342 | static int rtw_pci_suspend(struct device *dev)
Better simplify it to rely on the conditional reference in
SIMPLE_DEV_PM_OPS(), and mark the functions as __maybe_unused to avoid
warning about it.
I'm not sure if these are needed at all given that the functions
don't do anything, but they were only recently added.
Fixes: 44bc17f7f5b3 ("rtw88: support wowlan feature for 8822c")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200408185413.218643-1-arnd@arndb.de
Lothar Rubusch [Wed, 8 Apr 2020 23:10:13 +0000 (23:10 +0000)]
cfg80211: fix kernel-doc notation
Update missing kernel-doc annotations and fix of related warnings
at 'make htmldocs'.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Link: https://lore.kernel.org/r/20200408231013.28370-1-l.rubusch@gmail.com
[fix indentation, attribute references]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tuomas Tynkkynen [Fri, 10 Apr 2020 12:32:57 +0000 (15:32 +0300)]
mac80211_hwsim: Use kstrndup() in place of kasprintf()
syzbot reports a warning:
precision 33020 too large
WARNING: CPU: 0 PID: 9618 at lib/vsprintf.c:2471 set_precision+0x150/0x180 lib/vsprintf.c:2471
vsnprintf+0xa7b/0x19a0 lib/vsprintf.c:2547
kvasprintf+0xb2/0x170 lib/kasprintf.c:22
kasprintf+0xbb/0xf0 lib/kasprintf.c:59
hwsim_del_radio_nl+0x63a/0x7e0 drivers/net/wireless/mac80211_hwsim.c:3625
genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline]
...
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Thus it seems that kasprintf() with "%.*s" format can not be used for
duplicating a string with arbitrary length. Replace it with kstrndup().
Note that later this string is limited to NL80211_WIPHY_NAME_MAXLEN == 64,
but the code is simpler this way.
Reported-by: syzbot+6693adf1698864d21734@syzkaller.appspotmail.com
Reported-by: syzbot+a4aee3f42d7584d76761@syzkaller.appspotmail.com
Cc: stable@kernel.org
Signed-off-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Link: https://lore.kernel.org/r/20200410123257.14559-1-tuomas.tynkkynen@iki.fi
[johannes: add note about length limit]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Sat, 11 Apr 2020 22:40:30 +0000 (00:40 +0200)]
nl80211: fix NL80211_ATTR_FTM_RESPONDER policy
The nested policy here should be established using the
NLA_POLICY_NESTED() macro so the length is properly
filled in.
Cc: stable@vger.kernel.org
Fixes: 81e54d08d9d8 ("cfg80211: support FTM responder configuration/statistics")
Link: https://lore.kernel.org/r/20200412004029.9d0722bb56c8.Ie690bfcc4a1a61ff8d8ca7e475d59fcaa52fb2da@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ard Biesheuvel [Fri, 10 Apr 2020 07:43:20 +0000 (09:43 +0200)]
efi/x86: Revert struct layout change to fix kexec boot regression
Commit
0a67361dcdaa29 ("efi/x86: Remove runtime table address from kexec EFI setup data")
removed the code that retrieves the non-remapped UEFI runtime services
pointer from the data structure provided by kexec, as it was never really
needed on the kexec boot path: mapping the runtime services table at its
non-remapped address is only needed when calling SetVirtualAddressMap(),
which never happens during a kexec boot in the first place.
However, dropping the 'runtime' member from struct efi_setup_data was a
mistake. That struct is shared ABI between the kernel and the kexec tooling
for x86, and so we cannot simply change its layout. So let's put back the
removed field, but call it 'unused' to reflect the fact that we never look
at its contents. While at it, add a comment to remind our future selves
that the layout is external ABI.
Fixes: 0a67361dcdaa29 ("efi/x86: Remove runtime table address from kexec EFI setup data")
Reported-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ard Biesheuvel [Thu, 9 Apr 2020 13:04:34 +0000 (15:04 +0200)]
efi/x86: Don't remap text<->rodata gap read-only for mixed mode
Commit
d9e3d2c4f10320 ("efi/x86: Don't map the entire kernel text RW for mixed mode")
updated the code that creates the 1:1 memory mapping to use read-only
attributes for the 1:1 alias of the kernel's text and rodata sections, to
protect it from inadvertent modification. However, it failed to take into
account that the unused gap between text and rodata is given to the page
allocator for general use.
If the vmap'ed stack happens to be allocated from this region, any by-ref
output arguments passed to EFI runtime services that are allocated on the
stack (such as the 'datasize' argument taken by GetVariable() when invoked
from efivar_entry_size()) will be referenced via a read-only mapping,
resulting in a page fault if the EFI code tries to write to it:
BUG: unable to handle page fault for address:
00000000386aae88
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD
fd61063 P4D
fd61063 PUD
fd62063 PMD
386000e1
Oops: 0003 [#1] SMP PTI
CPU: 2 PID: 255 Comm: systemd-sysv-ge Not tainted 5.6.0-rc4-default+ #22
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0008:0x3eaeed95
Code: ... <89> 03 be 05 00 00 80 a1 74 63 b1 3e 83 c0 48 e8 44 d2 ff ff eb 05
RSP: 0018:
000000000fd73fa0 EFLAGS:
00010002
RAX:
0000000000000001 RBX:
00000000386aae88 RCX:
000000003e9f1120
RDX:
0000000000000001 RSI:
0000000000000000 RDI:
0000000000000001
RBP:
000000000fd73fd8 R08:
00000000386aae88 R09:
0000000000000000
R10:
0000000000000002 R11:
0000000000000000 R12:
0000000000000000
R13:
ffffc0f040220000 R14:
0000000000000000 R15:
0000000000000000
FS:
00007f21160ac940(0000) GS:
ffff9cf23d500000(0000) knlGS:
0000000000000000
CS: 0008 DS: 0018 ES: 0018 CR0:
0000000080050033
CR2:
00000000386aae88 CR3:
000000000fd6c004 CR4:
00000000003606e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
Modules linked in:
CR2:
00000000386aae88
---[ end trace
a8bfbd202e712834 ]---
Let's fix this by remapping text and rodata individually, and leave the
gaps mapped read-write.
Fixes: d9e3d2c4f10320 ("efi/x86: Don't map the entire kernel text RW for mixed mode")
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200409130434.6736-10-ardb@kernel.org
Gary Lin [Thu, 9 Apr 2020 13:04:33 +0000 (15:04 +0200)]
efi/x86: Fix the deletion of variables in mixed mode
efi_thunk_set_variable() treated the NULL "data" pointer as an invalid
parameter, and this broke the deletion of variables in mixed mode.
This commit fixes the check of data so that the userspace program can
delete a variable in mixed mode.
Fixes: 8319e9d5ad98ffcc ("efi/x86: Handle by-ref arguments covering multiple pages in mixed mode")
Signed-off-by: Gary Lin <glin@suse.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200408081606.1504-1-glin@suse.com
Link: https://lore.kernel.org/r/20200409130434.6736-9-ardb@kernel.org
Ard Biesheuvel [Thu, 9 Apr 2020 13:04:32 +0000 (15:04 +0200)]
efi/libstub/file: Merge file name buffers to reduce stack usage
Arnd reports that commit
9302c1bb8e47 ("efi/libstub: Rewrite file I/O routine")
reworks the file I/O routines in a way that triggers the following
warning:
drivers/firmware/efi/libstub/file.c:240:1: warning: the frame size
of 1200 bytes is larger than 1024 bytes [-Wframe-larger-than=]
We can work around this issue dropping an instance of efi_char16_t[256]
from the stack frame, and reusing the 'filename' field of the file info
struct that we use to obtain file information from EFI (which contains
the file name even though we already know it since we used it to open
the file in the first place)
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200409130434.6736-8-ardb@kernel.org
Ard Biesheuvel [Thu, 9 Apr 2020 13:04:31 +0000 (15:04 +0200)]
Documentation/x86, efi/x86: Clarify EFI handover protocol and its requirements
The EFI handover protocol was introduced on x86 to permit the boot
loader to pass a populated boot_params structure as an additional
function argument to the entry point. This allows the bootloader to
pass the base and size of a initrd image, which is more flexible
than relying on the EFI stub's file I/O routines, which can only
access the file system from which the kernel image itself was loaded
from firmware.
This approach requires a fair amount of internal knowledge regarding
the layout of the boot_params structure on the part of the boot loader,
as well as knowledge regarding the allowed placement of the initrd in
memory, and so it has been deprecated in favour of a new initrd loading
method that is based on existing UEFI protocols and best practices.
So update the x86 boot protocol documentation to clarify that the EFI
handover protocol has been deprecated, and while at it, add a note that
invoking the EFI handover protocol still requires the PE/COFF image to
be loaded properly (as opposed to simply being copied into memory).
Also, drop the code32_start header field from the list of values that
need to be provided, as this is no longer required.
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200409130434.6736-7-ardb@kernel.org
Ard Biesheuvel [Thu, 9 Apr 2020 13:04:30 +0000 (15:04 +0200)]
efi/arm: Deal with ADR going out of range in efi_enter_kernel()
Commit
0698fac4ac2a ("efi/arm: Clean EFI stub exit code from cache instead of avoiding it")
introduced a PC-relative reference to 'call_cache_fn' into
efi_enter_kernel(), which lives way at the end of head.S. In some cases,
the ARM version of the ADR instruction does not have sufficient range,
resulting in a build error:
arch/arm/boot/compressed/head.S:1453: Error: invalid constant (
fffffffffffffbe4) after fixup
ARM defines an alternative with a wider range, called ADRL, but this does
not exist for Thumb-2. At the same time, the ADR instruction in Thumb-2
has a wider range, and so it does not suffer from the same issue.
So let's switch to ADRL for ARM builds, and keep the ADR for Thumb-2 builds.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200409130434.6736-6-ardb@kernel.org
Arvind Sankar [Thu, 9 Apr 2020 13:04:29 +0000 (15:04 +0200)]
efi/x86: Always relocate the kernel for EFI handover entry
Commit
d5cdf4cfeac9 ("efi/x86: Don't relocate the kernel unless necessary")
tries to avoid relocating the kernel in the EFI stub as far as possible.
However, when systemd-boot is used to boot a unified kernel image [1],
the image is constructed by embedding the bzImage as a .linux section in
a PE executable that contains a small stub loader from systemd that will
call the EFI stub handover entry, together with additional sections and
potentially an initrd. When this image is constructed, by for example
dracut, the initrd is placed after the bzImage without ensuring that at
least init_size bytes are available for the bzImage. If the kernel is
not relocated by the EFI stub, this could result in the compressed
kernel's startup code in head_{32,64}.S overwriting the initrd.
To prevent this, unconditionally relocate the kernel if the EFI stub was
entered via the handover entry point.
[1] https://systemd.io/BOOT_LOADER_SPECIFICATION/#type-2-efi-unified-kernel-images
Fixes: d5cdf4cfeac9 ("efi/x86: Don't relocate the kernel unless necessary")
Reported-by: Sergey Shatunov <me@prok.pw>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200406180614.429454-2-nivedita@alum.mit.edu
Link: https://lore.kernel.org/r/20200409130434.6736-5-ardb@kernel.org
Arvind Sankar [Thu, 9 Apr 2020 13:04:28 +0000 (15:04 +0200)]
efi/x86: Move efi stub globals from .bss to .data
Commit
3ee372ccce4d ("x86/boot/compressed/64: Remove .bss/.pgtable from bzImage")
removed the .bss section from the bzImage.
However, while a PE loader is required to zero-initialize the .bss
section before calling the PE entry point, the EFI handover protocol
does not currently document any requirement that .bss be initialized by
the bootloader prior to calling the handover entry.
When systemd-boot is used to boot a unified kernel image [1], the image
is constructed by embedding the bzImage as a .linux section in a PE
executable that contains a small stub loader from systemd together with
additional sections and potentially an initrd. As the .bss section
within the bzImage is no longer explicitly present as part of the file,
it is not initialized before calling the EFI handover entry.
Furthermore, as the size of the embedded .linux section is only the size
of the bzImage file itself, the .bss section's memory may not even have
been allocated.
In particular, this can result in efi_disable_pci_dma being true even
when it was not specified via the command line or configuration option,
which in turn causes crashes while booting on some systems.
To avoid issues, place all EFI stub global variables into the .data
section instead of .bss. As of this writing, only boolean flags for a
few command line arguments and the sys_table pointer were in .bss and
will now move into the .data section.
[1] https://systemd.io/BOOT_LOADER_SPECIFICATION/#type-2-efi-unified-kernel-images
Fixes: 3ee372ccce4d ("x86/boot/compressed/64: Remove .bss/.pgtable from bzImage")
Reported-by: Sergey Shatunov <me@prok.pw>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200406180614.429454-1-nivedita@alum.mit.edu
Link: https://lore.kernel.org/r/20200409130434.6736-4-ardb@kernel.org
Colin Ian King [Thu, 9 Apr 2020 13:04:27 +0000 (15:04 +0200)]
efi/libstub/x86: Remove redundant assignment to pointer hdr
The pointer hdr is being assigned a value that is never read and
it is being updated later with a new value. The assignment is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200402102537.503103-1-colin.king@canonical.com
Link: https://lore.kernel.org/r/20200409130434.6736-3-ardb@kernel.org
Takashi Iwai [Thu, 9 Apr 2020 13:04:26 +0000 (15:04 +0200)]
efi/cper: Use scnprintf() for avoiding potential buffer overflow
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200311072145.5001-1-tiwai@suse.de
Link: https://lore.kernel.org/r/20200409130434.6736-2-ardb@kernel.org
Takashi Iwai [Mon, 13 Apr 2020 20:19:19 +0000 (22:19 +0200)]
ALSA: hda: Allow setting preallocation again for x86
The commit
c31427d0d21e ("ALSA: hda: No preallocation on x86
platforms") changed CONFIG_SND_HDA_PREALLOC_SIZE setup and its default
to zero for x86, as the preallocation should work almost all cases.
However, this expectation was too naive; some applications try to
allocate as the max buffer size as possible, and it leads to the
memory exhaustion. More badly, the commit changed the kconfig no
longer adjustable for x86, so you can't fix it statically (although it
can be still adjusted via procfs).
So, practically seen, it's more recommended to set a reasonable limit
for x86, too. This patch follows to that experience, and changes the
default to 2048 and allow the kconfig adjustable again.
Fixes: c31427d0d21e ("ALSA: hda: No preallocation on x86 platforms")
Cc: <stable@vger.kernel.org>
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207223
Link: https://lore.kernel.org/r/20200413201919.24241-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Trond Myklebust [Mon, 13 Apr 2020 19:55:21 +0000 (15:55 -0400)]
NFS: Fix an ABBA spinlock issue in pnfs_update_layout()
We need to drop the inode spinlock while calling nfs4_select_rw_stateid(),
since nfs4_copy_delegation_stateid() could take the delegation lock.
Note that it is safe to do this, since all other calls to
pnfs_update_layout() for that inode will find themselves blocked by
the lock we hold on NFS_LAYOUT_FIRST_LAYOUTGET.
Fixes: fc51b1cf391d ("NFS: Beware when dereferencing the delegation cred")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Geert Uytterhoeven [Mon, 13 Apr 2020 09:35:52 +0000 (11:35 +0200)]
m68k: Drop redundant generic-y += hardirq.h
The cleanup in commit
630f289b7114c0e6 ("asm-generic: make more
kernel-space headers mandatory") did not take into account the recently
added line for hardirq.h in commit
acc45648b9aefa90 ("m68k: Switch to
asm-generic/hardirq.h"), leading to the following message during the
build:
scripts/Makefile.asm-generic:25: redundant generic-y found in arch/m68k/include/asm/Kbuild: hardirq.h
Fix this by dropping the now redundant line.
Fixes: 630f289b7114c0e6 ("asm-generic: make more kernel-space headers mandatory")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Layton [Wed, 8 Apr 2020 12:41:38 +0000 (08:41 -0400)]
ceph: fix potential bad pointer deref in async dirops cb's
The new async dirops callback routines can pass ERR_PTR values to
ceph_mdsc_free_path, which could cause an oops. Make ceph_mdsc_free_path
ignore ERR_PTR values. Also, ensure that the pr_warn messages look sane
even if ceph_mdsc_build_path fails.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Roy Spliet [Mon, 13 Apr 2020 08:20:34 +0000 (10:20 +0200)]
ALSA: hda: Explicitly permit using autosuspend if runtime PM is supported
This fixes runtime PM not working after a suspend-to-RAM cycle at least for
the codec-less HDA device found on NVIDIA GPUs.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Signed-off-by: Roy Spliet <nouveau@spliet.org>
Link: https://lore.kernel.org/r/20200413082034.25166-7-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Mon, 13 Apr 2020 08:20:33 +0000 (10:20 +0200)]
ALSA: hda: Skip controller resume if not needed
The HD-audio controller does system-suspend and resume operations by
directly calling its helpers __azx_runtime_suspend() and
__azx_runtime_resume(). However, in general, we don't have to resume
always the device fully at the system resume; typically, if a device
has been runtime-suspended, we can leave it to runtime resume.
Usually for achieving this, the driver would call
pm_runtime_force_suspend() and pm_runtime_force_resume() pairs in the
system suspend and resume ops. Unfortunately, this doesn't work for
the resume path in our case. For handling the jack detection at the
system resume, a child codec device may need the (literally) forcibly
resume even if it's been runtime-suspended, and for that, the
controller device must be also resumed even if it's been suspended.
This patch is an attempt to improve the situation. It replaces the
direct __azx_runtime_suspend()/_resume() calls with with
pm_runtime_force_suspend() and pm_runtime_force_resume() with a slight
trick as we've done for the codec side. More exactly:
- azx_has_pm_runtime() check is dropped from azx_runtime_suspend() and
azx_runtime_resume(), so that it can be properly executed from the
system-suspend/resume path
- The WAKEEN handling depends on the card's power state now; it's set
and cleared only for the runtime-suspend
- azx_resume() checks whether any codec may need the forcible resume
beforehand. If the forcible resume is required, it does temporary
PM refcount up/down for actually triggering the runtime resume.
- A new helper function, hda_codec_need_resume(), is introduced for
checking whether the codec needs a forcible runtime-resume, and the
existing code is rewritten with that.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Link: https://lore.kernel.org/r/20200413082034.25166-6-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Mon, 13 Apr 2020 08:20:32 +0000 (10:20 +0200)]
ALSA: hda: Keep the controller initialization even if no codecs found
Currently, when the HD-audio controller driver doesn't detect any
codecs, it tries to abort the probe. But this abort happens at the
delayed probe, i.e. the primary probe call already returned success,
hence the driver is never unbound until user does so explicitly.
As a result, it may leave the HD-audio device in the running state
without the runtime PM. More badly, if the device is a HD-audio bus
that is tied with a GPU, GPU cannot reach to the full power down and
consumes unnecessarily much power.
This patch changes the logic after no-codec situation; it continues
probing without the further codec initialization but keep the
controller driver running normally.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Tested-by: Roy Spliet <nouveau@spliet.org>
Link: https://lore.kernel.org/r/20200413082034.25166-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Mon, 13 Apr 2020 08:20:31 +0000 (10:20 +0200)]
ALSA: hda: Release resources at error in delayed probe
snd-hda-intel driver handles the most of its probe task in the delayed
work (either via workqueue or via firmware loader). When an error
happens in the later delayed probe, we can't deregister the device
itself because the probe callback already returned success and the
device was bound. So, for now, we set hda->init_failed flag and make
the rest untouched until the device gets really unbound.
However, this leaves the device up running, keeping the resources
without any use that prevents other operations.
In this patch, we release the resources at first when a probe error
happens in the delayed probe stage, but keeps the top-level object, so
that the PM and other ops can still refer to the object itself.
Also for simplicity, snd_hda_intel object is allocated via devm, so
that we can get rid of the explicit kfree calls.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Link: https://lore.kernel.org/r/20200413082034.25166-4-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Mon, 13 Apr 2020 08:20:30 +0000 (10:20 +0200)]
ALSA: hda: Honor PM disablement in PM freeze and thaw_noirq ops
freeze_noirq and thaw_noirq need to check the PM availability like
other PM ops. There are cases where the device got disabled due to
the error, and the PM operation should be ignored for that.
Fixes: 3e6db33aaf1d ("ALSA: hda - Set SKL+ hda controller power at freeze() and thaw()")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Link: https://lore.kernel.org/r/20200413082034.25166-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Mon, 13 Apr 2020 08:20:29 +0000 (10:20 +0200)]
ALSA: hda: Don't release card at firmware loading error
At the error path of the firmware loading error, the driver tries to
release the card object and set NULL to drvdata. This may be referred
badly at the possible PM action, as the driver itself is still bound
and the PM callbacks read the card object.
Instead, we continue the probing as if it were no option set. This is
often a better choice than the forced abort, too.
Fixes: 5cb543dba986 ("ALSA: hda - Deferred probing with request_firmware_nowait()")
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Link: https://lore.kernel.org/r/20200413082034.25166-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Christophe JAILLET [Sat, 11 Apr 2020 14:58:44 +0000 (16:58 +0200)]
platform/chrome: cros_ec_sensorhub: Add missing '\n' in log messages
Message logged by 'dev_xxx()' or 'pr_xxx()' should end with a '\n'.
Fixes: 145d59baff59 ("platform/chrome: cros_ec_sensorhub: Add FIFO support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
David Howells [Sat, 11 Apr 2020 07:50:45 +0000 (08:50 +0100)]
afs: Fix afs_d_validate() to set the right directory version
If a dentry's version is somewhere between invalid_before and the current
directory version, we should be setting it forward to the current version,
not backwards to the invalid_before version. Note that we're only doing
this at all because dentry::d_fsdata isn't large enough on a 32-bit system.
Fix this by using a separate variable for invalid_before so that we don't
accidentally clobber the current dir version.
Fixes: a4ff7401fbfa ("afs: Keep track of invalid-before version for dentry coherency")
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Fri, 10 Apr 2020 14:23:27 +0000 (15:23 +0100)]
afs: Fix race between post-modification dir edit and readdir/d_revalidate
AFS directories are retained locally as a structured file, with lookup
being effected by a local search of the file contents. When a modification
(such as mkdir) happens, the dir file content is modified locally rather
than redownloading the directory.
The directory contents are accessed in a number of ways, with a number of
different locks schemes:
(1) Download of contents - dvnode->validate_lock/write in afs_read_dir().
(2) Lookup and readdir - dvnode->validate_lock/read in afs_dir_iterate(),
downgrading from (1) if necessary.
(3) d_revalidate of child dentry - dvnode->validate_lock/read in
afs_do_lookup_one() downgrading from (1) if necessary.
(4) Edit of dir after modification - page locks on individual dir pages.
Unfortunately, because (4) uses different locking scheme to (1) - (3),
nothing protects against the page being scanned whilst the edit is
underway. Even download is not safe as it doesn't lock the pages - relying
instead on the validate_lock to serialise as a whole (the theory being that
directory contents are treated as a block and always downloaded as a
block).
Fix this by write-locking dvnode->validate_lock around the edits. Care
must be taken in the rename case as there may be two different dirs - but
they need not be locked at the same time. In any case, once the lock is
taken, the directory version must be rechecked, and the edit skipped if a
later version has been downloaded by revalidation (there can't have been
any local changes because the VFS holds the inode lock, but there can have
been remote changes).
Fixes: 63a4681ff39c ("afs: Locally edit directory data for mkdir/create/unlink/...")
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Wed, 1 Apr 2020 22:32:12 +0000 (23:32 +0100)]
afs: Fix length of dump of bad YFSFetchStatus record
Fix the length of the dump of a bad YFSFetchStatus record. The function
was copied from the AFS version, but the YFS variant contains bigger fields
and extra information, so expand the dump to match.
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Wed, 8 Apr 2020 19:56:20 +0000 (20:56 +0100)]
afs: Fix rename operation status delivery
The afs_deliver_fs_rename() and yfs_deliver_fs_rename() functions both only
decode the second file status returned unless the parent directories are
different - unfortunately, this means that the xdr pointer isn't advanced
and the volsync record will be read incorrectly in such an instance.
Fix this by always decoding the second status into the second
status/callback block which wasn't being used if the dirs were the same.
The afs_update_dentry_version() calls that update the directory data
version numbers on the dentries can then unconditionally use the second
status record as this will always reflect the state of the destination dir
(the two records will be identical if the destination dir is the same as
the source dir)
Fixes: 260a980317da ("[AFS]: Add "directory write" support.")
Fixes: 30062bd13e36 ("afs: Implement YFS support in the fs client")
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Wed, 8 Apr 2020 16:32:10 +0000 (17:32 +0100)]
afs: Fix decoding of inline abort codes from version 1 status records
If we're decoding an AFSFetchStatus record and we see that the version is 1
and the abort code is set and we're expecting inline errors, then we store
the abort code and ignore the remaining status record (which is correct),
but we don't set the flag to say we got a valid abort code.
This can affect operation of YFS.RemoveFile2 when removing a file and the
operation of {,Y}FS.InlineBulkStatus when prospectively constructing or
updating of a set of inodes during a lookup.
Fix this to indicate the reception of a valid abort code.
Fixes: a38a75581e6e ("afs: Fix unlink to handle YFS.RemoveFile2 better")
Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Wed, 8 Apr 2020 15:13:20 +0000 (16:13 +0100)]
afs: Fix missing XDR advance in xdr_decode_{AFS,YFS}FSFetchStatus()
If we receive a status record that has VNOVNODE set in the abort field,
xdr_decode_AFSFetchStatus() and xdr_decode_YFSFetchStatus() don't advance
the XDR pointer, thereby corrupting anything subsequent decodes from the
same block of data.
This has the potential to affect AFS.InlineBulkStatus and
YFS.InlineBulkStatus operation, but probably doesn't since the status
records are extracted as individual blocks of data and the buffer pointer
is reset between blocks.
It does affect YFS.RemoveFile2 operation, corrupting the volsync record -
though that is not currently used.
Other operations abort the entire operation rather than returning an error
inline, in which case there is no decoding to be done.
Fix this by unconditionally advancing the xdr pointer.
Fixes: 684b0f68cf1c ("afs: Fix AFSFetchStatus decoder to provide OpenAFS compatibility")
Signed-off-by: David Howells <dhowells@redhat.com>
Rajendra Nayak [Wed, 8 Apr 2020 13:46:27 +0000 (19:16 +0530)]
opp: Manage empty OPP tables with clk handle
With OPP core now supporting DVFS for IO devices, we have instances of
IO devices (same IP block) which require an OPP on some platforms/SoCs
while just needing to scale the clock on some others.
In order to avoid conditional code in every driver which supports such
devices (to check for availability of OPPs and then deciding to do
either dev_pm_opp_set_rate() or clk_set_rate()) add support to manage
empty OPP tables with a clk handle.
This makes dev_pm_opp_set_rate() equivalent of a clk_set_rate() for
devices with just a clk and no OPPs specified, and makes
dev_pm_opp_set_rate(0) bail out without throwing an error.
Signed-off-by: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Ilya Dryomov [Tue, 17 Mar 2020 14:18:48 +0000 (15:18 +0100)]
rbd: don't mess with a page vector in rbd_notify_op_lock()
rbd_notify_op_lock() isn't interested in a notify reply. Instead of
accepting that page vector just to free it, have watch-notify code take
care of it.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Ilya Dryomov [Mon, 16 Mar 2020 16:16:28 +0000 (17:16 +0100)]
rbd: don't test rbd_dev->opts in rbd_dev_image_release()
rbd_dev->opts is used to distinguish between the image that is being
mapped and a parent. However, because we no longer establish watch for
read-only mappings, this test is imprecise and results in unnecessary
rbd_unregister_watch() calls.
Make it consistent with need_watch in rbd_dev_image_probe().
Fixes: b9ef2b8858a0 ("rbd: don't establish watch for read-only mappings")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Ilya Dryomov [Mon, 16 Mar 2020 14:52:54 +0000 (15:52 +0100)]
rbd: call rbd_dev_unprobe() after unwatching and flushing notifies
rbd_dev_unprobe() is supposed to undo most of rbd_dev_image_probe(),
including rbd_dev_header_info(), which means that rbd_dev_header_info()
isn't supposed to be called after rbd_dev_unprobe().
However, rbd_dev_image_release() calls rbd_dev_unprobe() before
rbd_unregister_watch(). This is racy because a header update notify
can sneak in:
"rbd unmap" thread ceph-watch-notify worker
rbd_dev_image_release()
rbd_dev_unprobe()
free and zero out header
rbd_watch_cb()
rbd_dev_refresh()
rbd_dev_header_info()
read in header
The same goes for "rbd map" because rbd_dev_image_probe() calls
rbd_dev_unprobe() on errors. In both cases this results in a memory
leak.
Fixes: fd22aef8b47c ("rbd: move rbd_unregister_watch() call into rbd_dev_image_release()")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>