David S. Miller [Fri, 21 Feb 2020 23:22:45 +0000 (15:22 -0800)]
Merge git://git./linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-02-21
The following pull-request contains BPF updates for your *net-next* tree.
We've added 25 non-merge commits during the last 4 day(s) which contain
a total of 33 files changed, 2433 insertions(+), 161 deletions(-).
The main changes are:
1) Allow for adding TCP listen sockets into sock_map/hash so they can be used
with reuseport BPF programs, from Jakub Sitnicki.
2) Add a new bpf_program__set_attach_target() helper for adding libbpf support
to specify the tracepoint/function dynamically, from Eelco Chaudron.
3) Add bpf_read_branch_records() BPF helper which helps use cases like profile
guided optimizations, from Daniel Xu.
4) Enable bpf_perf_event_read_value() in all tracing programs, from Song Liu.
5) Relax BTF mandatory check if only used for libbpf itself e.g. to process
BTF defined maps, from Andrii Nakryiko.
6) Move BPF selftests -mcpu compilation attribute from 'probe' to 'v3' as it has
been observed that former fails in envs with low memlock, from Yonghong Song.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Feb 2020 21:39:34 +0000 (13:39 -0800)]
Merge git://git./linux/kernel/git/netdev/net
Conflict resolution of ice_virtchnl_pf.c based upon work by
Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Fri, 21 Feb 2020 21:29:46 +0000 (22:29 +0100)]
Merge branch 'bpf-sockmap-listen'
Jakub Sitnicki says:
====================
This patch set turns SOCK{MAP,HASH} into generic collections for TCP
sockets, both listening and established. Adding support for listening
sockets enables us to use these BPF map types with reuseport BPF programs.
Why? SOCKMAP and SOCKHASH, in comparison to REUSEPORT_SOCKARRAY, allow
the socket to be in more than one map at the same time.
Having a BPF map type that can hold listening sockets, and gracefully
co-exist with reuseport BPF is important if, in the future, we want
BPF programs that run at socket lookup time [0]. Cover letter for v1 of
this series tells the full story of how we got here [1].
Although SOCK{MAP,HASH} are not a drop-in replacement for SOCKARRAY just
yet, because UDP support is lacking, it's a step in this direction. We're
working with Lorenz on extending SOCK{MAP,HASH} to hold UDP sockets, and
expect to post RFC series for sockmap + UDP in the near future.
I've dropped Acks from all patches that have been touched since v6.
The audit for missing READ_ONCE annotations for access to sk_prot is
ongoing. Thus far I've found one location specific to TCP listening sockets
that needed annotating. This got fixed it in this iteration. I wonder if
sparse checker could be put to work to identify places where we have
sk_prot access while not holding sk_lock...
The patch series depends on another one, posted earlier [2], that has
been split out of it.
v6 -> v7:
- Extended the series to cover SOCKHASH. (patches 4-8, 10-11) (John)
- Rebased onto recent bpf-next. Resolved conflicts in recent fixes to
sk_state checks on sockmap/sockhash update path. (patch 4)
- Added missing READ_ONCE annotation in sock_copy. (patch 1)
- Split out patches that simplify sk_psock_restore_proto [2].
v5 -> v6:
- Added a fix-up for patch 1 which I forgot to commit in v5. Sigh.
v4 -> v5:
- Rebase onto recent bpf-next to resolve conflicts. (Daniel)
v3 -> v4:
- Make tcp_bpf_clone parameter names consistent across function declaration
and definition. (Martin)
- Use sock_map_redirect_okay helper everywhere we need to take a different
action for listening sockets. (Lorenz)
- Expand comment explaining the need for a callback from reuseport to
sockarray code in reuseport_detach_sock. (Martin)
- Mention the possibility of using a u64 counter for reuseport IDs in the
future in the description for patch 10. (Martin)
v2 -> v3:
- Generate reuseport ID when group is created. Please see patch 10
description for details. (Martin)
- Fix the build when CONFIG_NET_SOCK_MSG is not selected by either
CONFIG_BPF_STREAM_PARSER or CONFIG_TLS. (kbuild bot & John)
- Allow updating sockmap from BPF on BPF_SOCK_OPS_TCP_LISTEN_CB callback. An
oversight in previous iterations. Users may want to populate the sockmap with
listening sockets from BPF as well.
- Removed RCU read lock assertion in sock_map_lookup_sys. (Martin)
- Get rid of a warning when child socket was cloned with parent's psock
state. (John)
- Check for tcp_bpf_unhash rather than tcp_bpf_recvmsg when deciding if
sk_proto needs restoring on clone. Check for recvmsg in the context of
listening socket cloning was confusing. (Martin)
- Consolidate sock_map_sk_is_suitable with sock_map_update_okay. This led
to adding dedicated predicates for sockhash. Update self-tests
accordingly. (John)
- Annotate unlikely branch in bpf_{sk,msg}_redirect_map when socket isn't
in a map, or isn't a valid redirect target. (John)
- Document paired READ/WRITE_ONCE annotations and cover shared access in
more detail in patch 2 description. (John)
- Correct a couple of log messages in sockmap_listen self-tests so the
message reflects the actual failure.
- Rework reuseport tests from sockmap_listen suite so that ENOENT error
from bpf_sk_select_reuseport handler does not happen on happy path.
v1 -> v2:
- af_ops->syn_recv_sock callback is no longer overridden and burdened with
restoring sk_prot and clearing sk_user_data in the child socket. As child
socket is already hashed when syn_recv_sock returns, it is too late to
put it in the right state. Instead patches 3 & 4 address restoring
sk_prot and clearing sk_user_data before we hash the child socket.
(Pointed out by Martin Lau)
- Annotate shared access to sk->sk_prot with READ_ONCE/WRITE_ONCE macros as
we write to it from sk_msg while socket might be getting cloned on
another CPU. (Suggested by John Fastabend)
- Convert tests for SOCKMAP holding listening sockets to return-on-error
style, and hook them up to test_progs. Also use BPF skeleton for setup.
Add new tests to cover the race scenario discovered during v1 review.
RFC -> v1:
- Switch from overriding proto->accept to af_ops->syn_recv_sock, which
happens earlier. Clearing the psock state after accept() does not work
for child sockets that become orphaned (never got accepted). v4-mapped
sockets need special care.
- Return the socket cookie on SOCKMAP lookup from syscall to be on par with
REUSEPORT_SOCKARRAY. Requires SOCKMAP to take u64 on lookup/update from
syscall.
- Make bpf_sk_redirect_map (ingress) and bpf_msg_redirect_map (egress)
SOCKMAP helpers fail when target socket is a listening one.
- Make bpf_sk_select_reuseport helper fail when target is a TCP established
socket.
- Teach libbpf to recognize SK_REUSEPORT program type from section name.
- Add a dedicated set of tests for SOCKMAP holding listening sockets,
covering map operations, overridden socket callbacks, and BPF helpers.
[0] https://lore.kernel.org/bpf/
20190828072250.29828-1-jakub@cloudflare.com/
[1] https://lore.kernel.org/bpf/
20191123110751.6729-1-jakub@cloudflare.com/
[2] https://lore.kernel.org/bpf/
20200217121530.754315-1-jakub@cloudflare.com/
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:23 +0000 (17:10 +0000)]
selftests/bpf: Tests for sockmap/sockhash holding listening sockets
Now that SOCKMAP and SOCKHASH map types can store listening sockets,
user-space and BPF API is open to a new set of potential pitfalls.
Exercise the map operations, with extra attention to code paths susceptible
to races between map ops and socket cloning, and BPF helpers that work with
SOCKMAP/SOCKHASH to gain confidence that all works as expected.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-12-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:22 +0000 (17:10 +0000)]
selftests/bpf: Extend SK_REUSEPORT tests to cover SOCKMAP/SOCKHASH
Parametrize the SK_REUSEPORT tests so that the map type for storing sockets
is not hard-coded in the test setup routine.
This, together with careful state cleaning after the tests, lets us run the
test cases for REUSEPORT_ARRAY, SOCKMAP, and SOCKHASH to have test coverage
for all supported map types. The last two support only TCP sockets at the
moment.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-11-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:21 +0000 (17:10 +0000)]
net: Generate reuseport group ID on group creation
Commit
736b46027eb4 ("net: Add ID (if needed) to sock_reuseport and expose
reuseport_lock") has introduced lazy generation of reuseport group IDs that
survive group resize.
By comparing the identifier we check if BPF reuseport program is not trying
to select a socket from a BPF map that belongs to a different reuseport
group than the one the packet is for.
Because SOCKARRAY used to be the only BPF map type that can be used with
reuseport BPF, it was possible to delay the generation of reuseport group
ID until a socket from the group was inserted into BPF map for the first
time.
Now that SOCK{MAP,HASH} can be used with reuseport BPF we have two options,
either generate the reuseport ID on map update, like SOCKARRAY does, or
allocate an ID from the start when reuseport group gets created.
This patch takes the latter approach to keep sockmap free of calls into
reuseport code. This streamlines the reuseport_id access as its lifetime
now matches the longevity of reuseport object.
The cost of this simplification, however, is that we allocate reuseport IDs
for all SO_REUSEPORT users. Even those that don't use SOCKARRAY in their
setups. With the way identifiers are currently generated, we can have at
most S32_MAX reuseport groups, which hopefully is sufficient. If we ever
get close to the limit, we can switch an u64 counter like sk_cookie.
Another change is that we now always call into SOCKARRAY logic to unlink
the socket from the map when unhashing or closing the socket. Previously we
did it only when at least one socket from the group was in a BPF map.
It is worth noting that this doesn't conflict with sockmap tear-down in
case a socket is in a SOCK{MAP,HASH} and belongs to a reuseport
group. sockmap tear-down happens first:
prot->unhash
`- tcp_bpf_unhash
|- tcp_bpf_remove
| `- while (sk_psock_link_pop(psock))
| `- sk_psock_unlink
| `- sock_map_delete_from_link
| `- __sock_map_delete
| `- sock_map_unref
| `- sk_psock_put
| `- sk_psock_drop
| `- rcu_assign_sk_user_data(sk, NULL)
`- inet_unhash
`- reuseport_detach_sock
`- bpf_sk_reuseport_detach
`- WRITE_ONCE(sk->sk_user_data, NULL)
Suggested-by: Martin Lau <kafai@fb.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200218171023.844439-10-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:20 +0000 (17:10 +0000)]
bpf: Allow selecting reuseport socket from a SOCKMAP/SOCKHASH
SOCKMAP & SOCKHASH now support storing references to listening
sockets. Nothing keeps us from using these map types a collection of
sockets to select from in BPF reuseport programs. Whitelist the map types
with the bpf_sk_select_reuseport helper.
The restriction that the socket has to be a member of a reuseport group
still applies. Sockets in SOCKMAP/SOCKHASH that don't have sk_reuseport_cb
set are not a valid target and we signal it with -EINVAL.
The main benefit from this change is that, in contrast to
REUSEPORT_SOCKARRAY, SOCK{MAP,HASH} don't impose a restriction that a
listening socket can be just one BPF map at the same time.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200218171023.844439-9-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:19 +0000 (17:10 +0000)]
bpf, sockmap: Let all kernel-land lookup values in SOCKMAP/SOCKHASH
Don't require the kernel code, like BPF helpers, that needs access to
SOCK{MAP,HASH} map contents to live in net/core/sock_map.c. Expose the
lookup operation to all kernel-land.
Lookup from BPF context is not whitelisted yet. While syscalls have a
dedicated lookup handler.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-8-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:18 +0000 (17:10 +0000)]
bpf, sockmap: Return socket cookie on lookup from syscall
Tooling that populates the SOCK{MAP,HASH} with sockets from user-space
needs a way to inspect its contents. Returning the struct sock * that the
map holds to user-space is neither safe nor useful. An approach established
by REUSEPORT_SOCKARRAY is to return a socket cookie (a unique identifier)
instead.
Since socket cookies are u64 values, SOCK{MAP,HASH} need to support such a
value size for lookup to be possible. This requires special handling on
update, though. Attempts to do a lookup on a map holding u32 values will be
met with ENOSPC error.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-7-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:17 +0000 (17:10 +0000)]
bpf, sockmap: Don't set up upcalls and progs for listening sockets
Now that sockmap/sockhash can hold listening sockets, when setting up the
psock we will (i) grab references to verdict/parser progs, and (2) override
socket upcalls sk_data_ready and sk_write_space.
However, since we cannot redirect to listening sockets so we don't need to
link the socket to the BPF progs. And more importantly we don't want the
listening socket to have overridden upcalls because they would get
inherited by child sockets cloned from it.
Introduce a separate initialization path for listening sockets that does
not change the upcalls and ignores the BPF progs.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-6-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:16 +0000 (17:10 +0000)]
bpf, sockmap: Allow inserting listening TCP sockets into sockmap
In order for sockmap/sockhash types to become generic collections for
storing TCP sockets we need to loosen the checks during map update, while
tightening the checks in redirect helpers.
Currently sock{map,hash} require the TCP socket to be in established state,
which prevents inserting listening sockets.
Change the update pre-checks so the socket can also be in listening state.
Since it doesn't make sense to redirect with sock{map,hash} to listening
sockets, add appropriate socket state checks to BPF redirect helpers too.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-5-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:15 +0000 (17:10 +0000)]
tcp_bpf: Don't let child socket inherit parent protocol ops on copy
Prepare for cloning listening sockets that have their protocol callbacks
overridden by sk_msg. Child sockets must not inherit parent callbacks that
access state stored in sk_user_data owned by the parent.
Restore the child socket protocol callbacks before it gets hashed and any
of the callbacks can get invoked.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-4-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:14 +0000 (17:10 +0000)]
net, sk_msg: Clear sk_user_data pointer on clone if tagged
sk_user_data can hold a pointer to an object that is not intended to be
shared between the parent socket and the child that gets a pointer copy on
clone. This is the case when sk_user_data points at reference-counted
object, like struct sk_psock.
One way to resolve it is to tag the pointer with a no-copy flag by
repurposing its lowest bit. Based on the bit-flag value we clear the child
sk_user_data pointer after cloning the parent socket.
The no-copy flag is stored in the pointer itself as opposed to externally,
say in socket flags, to guarantee that the pointer and the flag are copied
from parent to child socket in an atomic fashion. Parent socket state is
subject to change while copying, we don't hold any locks at that time.
This approach relies on an assumption that sk_user_data holds a pointer to
an object aligned at least 2 bytes. A manual audit of existing users of
rcu_dereference_sk_user_data helper confirms our assumption.
Also, an RCU-protected sk_user_data is not likely to hold a pointer to a
char value or a pathological case of "struct { char c; }". To be safe, warn
when the flag-bit is set when setting sk_user_data to catch any future
misuses.
It is worth considering why clearing sk_user_data unconditionally is not an
option. There exist users, DRBD, NVMe, and Xen drivers being among them,
that rely on the pointer being copied when cloning the listening socket.
Potentially we could distinguish these users by checking if the listening
socket has been created in kernel-space via sock_create_kern, and hence has
sk_kern_sock flag set. However, this is not the case for NVMe and Xen
drivers, which create sockets without marking them as belonging to the
kernel.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200218171023.844439-3-jakub@cloudflare.com
Jakub Sitnicki [Tue, 18 Feb 2020 17:10:13 +0000 (17:10 +0000)]
net, sk_msg: Annotate lockless access to sk_prot on clone
sk_msg and ULP frameworks override protocol callbacks pointer in
sk->sk_prot, while tcp accesses it locklessly when cloning the listening
socket, that is with neither sk_lock nor sk_callback_lock held.
Once we enable use of listening sockets with sockmap (and hence sk_msg),
there will be shared access to sk->sk_prot if socket is getting cloned
while being inserted/deleted to/from the sockmap from another CPU:
Read side:
tcp_v4_rcv
sk = __inet_lookup_skb(...)
tcp_check_req(sk)
inet_csk(sk)->icsk_af_ops->syn_recv_sock
tcp_v4_syn_recv_sock
tcp_create_openreq_child
inet_csk_clone_lock
sk_clone_lock
READ_ONCE(sk->sk_prot)
Write side:
sock_map_ops->map_update_elem
sock_map_update_elem
sock_map_update_common
sock_map_link_no_progs
tcp_bpf_init
tcp_bpf_update_sk_prot
sk_psock_update_proto
WRITE_ONCE(sk->sk_prot, ops)
sock_map_ops->map_delete_elem
sock_map_delete_elem
__sock_map_delete
sock_map_unref
sk_psock_put
sk_psock_drop
sk_psock_restore_proto
tcp_update_ulp
WRITE_ONCE(sk->sk_prot, proto)
Mark the shared access with READ_ONCE/WRITE_ONCE annotations.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200218171023.844439-2-jakub@cloudflare.com
Linus Torvalds [Fri, 21 Feb 2020 21:02:49 +0000 (13:02 -0800)]
Merge tag 'linux-watchdog-5.6-rc3' of git://linux-watchdog.org/linux-watchdog
Pull watchdog fixes from Wim Van Sebroeck:
- mtk_wdt needs RESET_CONTROLLER to build
- da9062 driver fixes:
- fix power management ops
- do not ping the hw during stop()
- add dependency on I2C
* tag 'linux-watchdog-5.6-rc3' of git://www.linux-watchdog.org/linux-watchdog:
watchdog: da9062: Add dependency on I2C
watchdog: da9062: fix power management ops
watchdog: da9062: do not ping the hw during stop()
watchdog: fix mtk_wdt.c RESET_CONTROLLER build error
Linus Torvalds [Fri, 21 Feb 2020 20:57:05 +0000 (12:57 -0800)]
Merge tag 'char-misc-5.6-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char/misc driver fixes for 5.6-rc3.
Also included in here are some updates for some documentation files
that I seem to be maintaining these days.
The driver fixes are:
- small fixes for the habanalabs driver
- fsi driver bugfix
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Documentation/process: Swap out the ambassador for Canonical
habanalabs: patched cb equals user cb in device memset
habanalabs: do not halt CoreSight during hard reset
habanalabs: halt the engines before hard-reset
MAINTAINERS: remove unnecessary ':' characters
fsi: aspeed: add unspecified HAS_IOMEM dependency
COPYING: state that all contributions really are covered by this file
Documentation/process: Change Microsoft contact for embargoed hardware issues
embargoed-hardware-issues: drop Amazon contact as the email address now bounces
Documentation/process: Add Arm contact for embargoed HW issues
Linus Torvalds [Fri, 21 Feb 2020 20:53:53 +0000 (12:53 -0800)]
Merge tag 'staging-5.6-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.6-rc3, along with the
removal of an unused/unneeded driver as well.
The android vsoc driver is not needed anymore by anyone, so it was
removed.
The other driver fixes are:
- ashmem bugfixes
- greybus audio driver bugfix
- wireless driver bugfixes and tiny cleanups to error paths
All of these have been in linux-next for a while now with no reported
issues"
* tag 'staging-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: rtl8723bs: Remove unneeded goto statements
staging: rtl8188eu: Remove some unneeded goto statements
staging: rtl8723bs: Fix potential overuse of kernel memory
staging: rtl8188eu: Fix potential overuse of kernel memory
staging: rtl8723bs: Fix potential security hole
staging: rtl8188eu: Fix potential security hole
staging: greybus: use after free in gb_audio_manager_remove_all()
staging: android: Delete the 'vsoc' driver
staging: rtl8723bs: fix copy of overlapping memory
staging: android: ashmem: Disallow ashmem memory from being remapped
staging: vt6656: fix sign of rx_dbm to bb_pre_ed_rssi.
Linus Torvalds [Fri, 21 Feb 2020 20:48:29 +0000 (12:48 -0800)]
Merge tag 'tty-5.6-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial driver fixes from Greg KH:
"Here are a number of small tty and serial driver fixes for 5.6-rc3
that resolve a bunch of reported issues.
They are:
- vt selection and ioctl fixes
- serdev bugfix
- atmel serial driver fixes
- qcom serial driver fixes
- other minor serial driver fixes
All of these have been in linux-next for a while with no reported
issues"
* tag 'tty-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
vt: selection, close sel_buffer race
vt: selection, handle pending signals in paste_selection
serial: cpm_uart: call cpm_muram_init before registering console
tty: serial: qcom_geni_serial: Fix RX cancel command failure
serial: 8250: Check UPF_IRQ_SHARED in advance
tty: serial: imx: setup the correct sg entry for tx dma
vt: vt_ioctl: fix race in VT_RESIZEX
vt: fix scrollback flushing on background consoles
tty: serial: tegra: Handle RX transfer in PIO mode if DMA wasn't started
tty/serial: atmel: manage shutdown in case of RS485 or ISO7816 mode
serdev: ttyport: restore client ops on deregistration
serial: ar933x_uart: set UART_CS_{RX,TX}_READY_ORIDE
Linus Torvalds [Fri, 21 Feb 2020 20:44:53 +0000 (12:44 -0800)]
Merge tag 'usb-5.6-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB/Thunderbolt fixes from Greg KH:
"Here are a number of small USB driver fixes for 5.6-rc3.
Included in here are:
- MAINTAINER file updates
- USB gadget driver fixes
- usb core quirk additions and fixes for regressions
- xhci driver fixes
- usb serial driver id additions and fixes
- thunderbolt bugfix
Thunderbolt patches come in through here now that USB4 is really
thunderbolt.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (34 commits)
USB: misc: iowarrior: add support for the 100 device
thunderbolt: Prevent crash if non-active NVMem file is read
usb: gadget: udc-xilinx: Fix xudc_stop() kernel-doc format
USB: misc: iowarrior: add support for the 28 and 28L devices
USB: misc: iowarrior: add support for 2 OEMed devices
USB: Fix novation SourceControl XL after suspend
xhci: Fix memory leak when caching protocol extended capability PSI tables - take 2
Revert "xhci: Fix memory leak when caching protocol extended capability PSI tables"
MAINTAINERS: Sort entries in database for THUNDERBOLT
usb: dwc3: debug: fix string position formatting mixup with ret and len
usb: gadget: serial: fix Tx stall after buffer overflow
usb: gadget: ffs: ffs_aio_cancel(): Save/restore IRQ flags
usb: dwc2: Fix SET/CLEAR_FEATURE and GET_STATUS flows
usb: dwc2: Fix in ISOC request length checking
usb: gadget: composite: Support more than 500mA MaxPower
usb: gadget: composite: Fix bMaxPower for SuperSpeedPlus
usb: gadget: u_audio: Fix high-speed max packet size
usb: dwc3: gadget: Check for IOC/LST bit in TRB->ctrl fields
USB: core: clean up endpoint-descriptor parsing
USB: quirks: blacklist duplicate ep on Sound Devices USBPre2
...
Linus Torvalds [Fri, 21 Feb 2020 20:18:02 +0000 (12:18 -0800)]
Merge tag 'drm-fixes-2020-02-21' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Varied fixes for rc3.
i915 is the largest, they are seeing some ACPI problems with their CI
which hopefully get solved soon [1].
msm has a bunch of fixes for new hw added in the merge, a bunch of
amdgpu fixes, and nouveau adds support for some new firmwares for
turing tu11x GPUs that were just released into linux-firmware by
nvidia, they operate the same as the ones we already have for tu10x so
should be fine to hook up.
Otherwise it's just misc fixes for panfrost and sun4i.
core:
- Allow only one rotation argument, and allow zero rotation in video
cmdline.
i915:
- Workaround missing Display Stream Compression (DSC) state readout
by forcing modeset when its enabled at probe
- Fix EHL port clock voltage level requirements
- Fix queuing retire workers on the virtual engine
- Fix use of partially initialized waiters
- Stop using drm_pci_alloc/drm_pci/free
- Fix rewind of RING_TAIL by forcing a context reload
- Fix locking on resetting ring->head
- Propagate our bug filing URL change to stable kernels
panfrost:
- Small compiler warning fix for panfrost.
- Fix when using performance counters in panfrost when using per fd
address space.
sun4xi:
- Fix dt binding
nouveau:
- tu11x modesetting fix
- ACR/GR firmware support for tu11x (fw is public now)
msm:
- fix UBWC on GPU and display side for sc7180
- fix DSI suspend/resume issue encountered on sc7180
- fix some breakage on so called "linux-android" devices
(fallout from sc7180/a618 support, not seen earlier due to
bootloader/firmware differences)
- couple other misc fixes
amdgpu:
- HDCP fixes
- xclk fix for raven
- GFXOFF fixes"
[1] The Intel suspend testing should now be fixed by commit
63fb9623427f
("ACPI: PM: s2idle: Check fixed wakeup events in acpi_s2idle_wake()")
* tag 'drm-fixes-2020-02-21' of git://anongit.freedesktop.org/drm/drm: (39 commits)
drm/amdgpu/display: clean up hdcp workqueue handling
drm/amdgpu: add is_raven_kicker judgement for raven1
drm/i915/gt: Avoid resetting ring->head outside of its timeline mutex
drm/i915/execlists: Always force a context reload when rewinding RING_TAIL
drm/i915: Wean off drm_pci_alloc/drm_pci_free
drm/i915/gt: Protect defer_request() from new waiters
drm/i915/gt: Prevent queuing retire workers on the virtual engine
drm/i915/dsc: force full modeset whenever DSC is enabled at probe
drm/i915/ehl: Update port clock voltage level requirements
drm/i915: Update drm/i915 bug filing URL
MAINTAINERS: Update drm/i915 bug filing URL
drm/i915: Initialise basic fence before acquiring seqno
drm/i915/gem: Require per-engine reset support for non-persistent contexts
drm/nouveau/kms/gv100-: Re-set LUT after clearing for modesets
drm/nouveau/gr/tu11x: initial support
drm/nouveau/acr/tu11x: initial support
drm/amdgpu/gfx10: disable gfxoff when reading rlc clock
drm/amdgpu/gfx9: disable gfxoff when reading rlc clock
drm/amdgpu/soc15: fix xclk for raven
drm/amd/powerplay: always refetch the enabled features status on dpm enablement
...
Linus Torvalds [Fri, 21 Feb 2020 19:59:51 +0000 (11:59 -0800)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Limit xt_hashlimit hash table size to avoid OOM or hung tasks, from
Cong Wang.
2) Fix deadlock in xsk by publishing global consumer pointers when NAPI
is finished, from Magnus Karlsson.
3) Set table field properly to RT_TABLE_COMPAT when necessary, from
Jethro Beekman.
4) NLA_STRING attributes are not necessary NULL terminated, deal wiht
that in IFLA_ALT_IFNAME. From Eric Dumazet.
5) Fix checksum handling in atlantic driver, from Dmitry Bezrukov.
6) Handle mtu==0 devices properly in wireguard, from Jason A.
Donenfeld.
7) Fix several lockdep warnings in bonding, from Taehee Yoo.
8) Fix cls_flower port blocking, from Jason Baron.
9) Sanitize internal map names in libbpf, from Toke Høiland-Jørgensen.
10) Fix RDMA race in qede driver, from Michal Kalderon.
11) Fix several false lockdep warnings by adding conditions to
list_for_each_entry_rcu(), from Madhuparna Bhowmik.
12) Fix sleep in atomic in mlx5 driver, from Huy Nguyen.
13) Fix potential deadlock in bpf_map_do_batch(), from Yonghong Song.
14) Hey, variables declared in switch statement before any case
statements are not initialized. I learn something every day. Get
rids of this stuff in several parts of the networking, from Kees
Cook.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (99 commits)
bnxt_en: Issue PCIe FLR in kdump kernel to cleanup pending DMAs.
bnxt_en: Improve device shutdown method.
net: netlink: cap max groups which will be considered in netlink_bind()
net: thunderx: workaround BGX TX Underflow issue
ionic: fix fw_status read
net: disable BRIDGE_NETFILTER by default
net: macb: Properly handle phylink on at91rm9200
s390/qeth: fix off-by-one in RX copybreak check
s390/qeth: don't warn for napi with 0 budget
s390/qeth: vnicc Fix EOPNOTSUPP precedence
openvswitch: Distribute switch variables for initialization
net: ip6_gre: Distribute switch variables for initialization
net: core: Distribute switch variables for initialization
udp: rehash on disconnect
net/tls: Fix to avoid gettig invalid tls record
bpf: Fix a potential deadlock with bpf_map_do_batch
bpf: Do not grab the bucket spinlock by default on htab batch ops
ice: Wait for VF to be reset/ready before configuration
ice: Don't tell the OS that link is going down
ice: Don't reject odd values of usecs set by user
...
David S. Miller [Fri, 21 Feb 2020 19:59:13 +0000 (11:59 -0800)]
Merge branch 'Migrate-QRTR-Nameservice-to-Kernel'
Manivannan Sadhasivam says:
====================
Migrate QRTR Nameservice to Kernel
This patchset migrates the Qualcomm IPC Router (QRTR) Nameservice from userspace
to kernel under net/qrtr.
The userspace implementation of it can be found here:
https://github.com/andersson/qrtr/blob/master/src/ns.c
This change is required for enabling the WiFi functionality of some Qualcomm
WLAN devices using ATH11K without any dependency on a userspace daemon. Since
the QRTR NS is not usually packed in most of the distros, users need to clone,
build and install it to get the WiFi working. It will become a hassle when the
user doesn't have any other source of network connectivity.
The original userspace code is published under BSD3 license. For migrating it
to Linux kernel, I have adapted Dual BSD/GPL license.
This patchset has been verified on Dragonboard410c and Intel NUC with QCA6390
WLAN device.
Changes in v2:
* Sorted the local variables in reverse XMAS tree order
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manivannan Sadhasivam [Thu, 20 Feb 2020 15:13:27 +0000 (20:43 +0530)]
net: qrtr: Fix the local node ID as 1
In order to start the QRTR nameservice, the local node ID needs to be
valid. Hence, fix it to 1. Previously, the node ID was configured through
a userspace tool before starting the nameservice daemon. Since we have now
integrated the nameservice handling to kernel, this change is necessary
for making it functional.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manivannan Sadhasivam [Thu, 20 Feb 2020 15:13:26 +0000 (20:43 +0530)]
net: qrtr: Migrate nameservice to kernel from userspace
The QRTR nameservice has been maintained in userspace for some time. This
commit migrates it to Linux kernel. This change is required in order to
eliminate the need of starting a userspace daemon for making the WiFi
functional for ath11k based devices. Since the QRTR NS is not usually
packed in most of the distros, users need to clone, build and install it
to get the WiFi working. It will become a hassle when the user doesn't
have any other source of network connectivity.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Murphy [Tue, 18 Feb 2020 14:11:30 +0000 (08:11 -0600)]
net: phy: dp83867: Add speed optimization feature
Set the speed optimization bit on the DP83867 PHY.
This feature can also be strapped on the 64 pin PHY devices
but the 48 pin devices do not have the strap pin available to enable
this feature in the hardware. PHY team suggests to have this bit set.
With this bit set the PHY will auto negotiate and report the link
parameters in the PHYSTS register. This register provides a single
location within the register set for quick access to commonly accessed
information.
In this case when auto negotiation is on the PHY core reads the bits
that have been configured or if auto negotiation is off the PHY core
reads the BMCR register and sets the phydev parameters accordingly.
This Giga bit PHY can throttle the speed to 100Mbps or 10Mbps to accomodate a
4-wire cable. If this should occur the PHYSTS register contains the
current negotiated speed and duplex mode.
In overriding the genphy_read_status the dp83867_read_status will do a
genphy_read_status to setup the LP and pause bits. And then the PHYSTS
register is read and the phydev speed and duplex mode settings are
updated.
Signed-off-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 21 Feb 2020 19:40:10 +0000 (11:40 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
- A few y2038 fixes which missed the merge window while dependencies
in NFS were being sorted out.
- A bunch of fixes. Some minor, some not.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: use tabs for SAFESETID
lib/stackdepot.c: fix global out-of-bounds in stack_slabs
mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM
mm/vmscan.c: don't round up scan size for online memory cgroup
lib/string.c: update match_string() doc-strings with correct behavior
mm/memcontrol.c: lost css_put in memcg_expand_shrinker_maps()
mm/swapfile.c: fix a comment in sys_swapon()
scripts/get_maintainer.pl: deprioritize old Fixes: addresses
get_maintainer: remove uses of P: for maintainer name
selftests/vm: add missed tests in run_vmtests
include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap
Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"
y2038: hide timeval/timespec/itimerval/itimerspec types
y2038: remove unused time32 interfaces
y2038: remove ktime to/from timespec/timeval conversion
Randy Dunlap [Fri, 21 Feb 2020 04:04:33 +0000 (20:04 -0800)]
MAINTAINERS: use tabs for SAFESETID
Use tabs for indentation instead of spaces for SAFESETID. All (!) other
entries in MAINTAINERS use tabs (according to my simple grepping).
Link: http://lkml.kernel.org/r/2bb2e52a-2694-816d-57b4-6cabfadd6c1a@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Micah Morton <mortonm@chromium.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 21 Feb 2020 04:04:30 +0000 (20:04 -0800)]
lib/stackdepot.c: fix global out-of-bounds in stack_slabs
Walter Wu has reported a potential case in which init_stack_slab() is
called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been
initialized. In that case init_stack_slab() will overwrite
stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in a memory
corruption.
Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com
Fixes: cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reported-by: Walter Wu <walter-zh.wu@mediatek.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wei Yang [Fri, 21 Feb 2020 04:04:27 +0000 (20:04 -0800)]
mm/sparsemem: pfn_to_page is not valid yet on SPARSEMEM
When we use SPARSEMEM instead of SPARSEMEM_VMEMMAP, pfn_to_page()
doesn't work before sparse_init_one_section() is called.
This leads to a crash when hotplug memory:
BUG: unable to handle page fault for address:
0000000006400000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP PTI
CPU: 3 PID: 221 Comm: kworker/u16:1 Tainted: G W 5.5.0-next-
20200205+ #343
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
RIP: 0010:__memset+0x24/0x30
Code: cc cc cc cc cc cc 0f 1f 44 00 00 49 89 f9 48 89 d1 83 e2 07 48 c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 <f3> 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 f3
RSP: 0018:
ffffb43ac0373c80 EFLAGS:
00010a87
RAX:
ffffffffffffffff RBX:
ffff8a1518800000 RCX:
0000000000050000
RDX:
0000000000000000 RSI:
00000000000000ff RDI:
0000000006400000
RBP:
0000000000140000 R08:
0000000000100000 R09:
0000000006400000
R10:
0000000000000000 R11:
0000000000000002 R12:
0000000000000000
R13:
0000000000000028 R14:
0000000000000000 R15:
ffff8a153ffd9280
FS:
0000000000000000(0000) GS:
ffff8a153ab00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000006400000 CR3:
0000000136fca000 CR4:
00000000000006e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
sparse_add_section+0x1c9/0x26a
__add_pages+0xbf/0x150
add_pages+0x12/0x60
add_memory_resource+0xc8/0x210
__add_memory+0x62/0xb0
acpi_memory_device_add+0x13f/0x300
acpi_bus_attach+0xf6/0x200
acpi_bus_scan+0x43/0x90
acpi_device_hotplug+0x275/0x3d0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x1a7/0x370
worker_thread+0x30/0x380
kthread+0x112/0x130
ret_from_fork+0x35/0x40
We should use memmap as it did.
On x86 the impact is limited to x86_32 builds, or x86_64 configurations
that override the default setting for SPARSEMEM_VMEMMAP.
Other memory hotplug archs (arm64, ia64, and ppc) also default to
SPARSEMEM_VMEMMAP=y.
[dan.j.williams@intel.com: changelog update]
{rppt@linux.ibm.com: changelog update]
Link: http://lkml.kernel.org/r/20200219030454.4844-1-bhe@redhat.com
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gavin Shan [Fri, 21 Feb 2020 04:04:24 +0000 (20:04 -0800)]
mm/vmscan.c: don't round up scan size for online memory cgroup
Commit
68600f623d69 ("mm: don't miss the last page because of round-off
error") makes the scan size round up to @denominator regardless of the
memory cgroup's state, online or offline. This affects the overall
reclaiming behavior: the corresponding LRU list is eligible for
reclaiming only when its size logically right shifted by @sc->priority
is bigger than zero in the former formula.
For example, the inactive anonymous LRU list should have at least 0x4000
pages to be eligible for reclaiming when we have 60/12 for
swappiness/priority and without taking scan/rotation ratio into account.
After the roundup is applied, the inactive anonymous LRU list becomes
eligible for reclaiming when its size is bigger than or equal to 0x1000
in the same condition.
(0x4000 >> 12) * 60 / (60 + 140 + 1) = 1
((0x1000 >> 12) * 60) + 200) / (60 + 140 + 1) = 1
aarch64 has 512MB huge page size when the base page size is 64KB. The
memory cgroup that has a huge page is always eligible for reclaiming in
that case.
The reclaiming is likely to stop after the huge page is reclaimed,
meaing the further iteration on @sc->priority and the silbing and child
memory cgroups will be skipped. The overall behaviour has been changed.
This fixes the issue by applying the roundup to offlined memory cgroups
only, to give more preference to reclaim memory from offlined memory
cgroup. It sounds reasonable as those memory is unlikedly to be used by
anyone.
The issue was found by starting up 8 VMs on a Ampere Mustang machine,
which has 8 CPUs and 16 GB memory. Each VM is given with 2 vCPUs and
2GB memory. It took 264 seconds for all VMs to be completely up and
784MB swap is consumed after that. With this patch applied, it took 236
seconds and 60MB swap to do same thing. So there is 10% performance
improvement for my case. Note that KSM is disable while THP is enabled
in the testing.
total used free shared buff/cache available
Mem: 16196 10065 2049 16 4081 3749
Swap: 8175 784 7391
total used free shared buff/cache available
Mem: 16196 11324 3656 24 1215 2936
Swap: 8175 60 8115
Link: http://lkml.kernel.org/r/20200211024514.8730-1-gshan@redhat.com
Fixes: 68600f623d69 ("mm: don't miss the last page because of round-off error")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexandru Ardelean [Fri, 21 Feb 2020 04:04:21 +0000 (20:04 -0800)]
lib/string.c: update match_string() doc-strings with correct behavior
There were a few attempts at changing behavior of the match_string()
helpers (i.e. 'match_string()' & 'sysfs_match_string()'), to change &
extend the behavior according to the doc-string.
But the simplest approach is to just fix the doc-strings. The current
behavior is fine as-is, and some bugs were introduced trying to fix it.
As for extending the behavior, new helpers can always be introduced if
needed.
The match_string() helpers behave more like 'strncmp()' in the sense
that they go up to n elements or until the first NULL element in the
array of strings.
This change updates the doc-strings with this info.
Link: http://lkml.kernel.org/r/20200213072722.8249-1-alexandru.ardelean@analog.com
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "Tobin C . Harding" <tobin@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasily Averin [Fri, 21 Feb 2020 04:04:18 +0000 (20:04 -0800)]
mm/memcontrol.c: lost css_put in memcg_expand_shrinker_maps()
for_each_mem_cgroup() increases css reference counter for memory cgroup
and requires to use mem_cgroup_iter_break() if the walk is cancelled.
Link: http://lkml.kernel.org/r/c98414fb-7e1f-da0f-867a-9340ec4bd30b@virtuozzo.com
Fixes: 0a4465d34028 ("mm, memcg: assign memcg-aware shrinkers bitmap to memcg")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christoph Hellwig [Fri, 21 Feb 2020 04:04:15 +0000 (20:04 -0800)]
mm/swapfile.c: fix a comment in sys_swapon()
claim_swapfile now always takes i_rwsem.
Link: http://lkml.kernel.org/r/20200114161225.309792-2-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Douglas Anderson [Fri, 21 Feb 2020 04:04:12 +0000 (20:04 -0800)]
scripts/get_maintainer.pl: deprioritize old Fixes: addresses
Recently, I found that get_maintainer was causing me to send emails to
the old addresses for maintainers. Since I usually just trust the
output of get_maintainer to know the right email address, I didn't even
look carefully and fired off two patch series that went to the wrong
place. Oops.
The problem was introduced recently when trying to add signatures from
Fixes. The problem was that these email addresses were added too early
in the process of compiling our list of places to send. Things added to
the list earlier are considered more canonical and when we later added
maintainer entries we ended up deduplicating to the old address.
Here are two examples using mainline commits (to make it easier to
replicate) for the two maintainers that I messed up recently:
$ git format-patch
d8549bcd0529~..
d8549bcd0529
$ ./scripts/get_maintainer.pl 0001-clk-Add-clk_hw*.patch | grep Boyd
Stephen Boyd <sboyd@codeaurora.org>...
$ git format-patch
6d1238aa3395~..
6d1238aa3395
$ ./scripts/get_maintainer.pl 0001-arm64-dts-qcom-qcs404*.patch | grep Andy
Andy Gross <andy.gross@linaro.org>
Let's move the adding of addresses from Fixes: to the end since the
email addresses from these are much more likely to be older.
After this patch the above examples get the right addresses for the two
examples.
Link: http://lkml.kernel.org/r/20200127095001.1.I41fba9f33590bfd92cd01960161d8384268c6569@changeid
Fixes: 2f5bd343694e ("scripts/get_maintainer.pl: add signatures from Fixes: <badcommit> lines in commit message")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Andy Gross <agross@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Fri, 21 Feb 2020 04:04:09 +0000 (20:04 -0800)]
get_maintainer: remove uses of P: for maintainer name
Commit
1ca84ed6425f ("MAINTAINERS: Reclaim the P: tag for Maintainer
Entry Profile") changed the use of the "P:" tag from "Person" to
"Profile (ie: special subsystem coding styles and characteristics)"
Change how get_maintainer.pl parses the "P:" tag to match.
Link: http://lkml.kernel.org/r/ca53823fc5d25c0be32ad937d0207a0589c08643.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Dan Williams <dan.j.william@intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 21 Feb 2020 04:04:06 +0000 (20:04 -0800)]
selftests/vm: add missed tests in run_vmtests
The commits introducing 'mlock-random-test'[1], 'map_fiex_noreplace'[2],
and 'thuge-gen'[3] have not added those in the 'run_vmtests' script and
thus the 'run_tests' command of kselftests doesn't run those. This
commit adds those in the script.
'gup_benchmark' and 'transhuge-stress' are also not included in the
'run_vmtests', but this commit does not add those because those are for
performance measurement rather than pass/fail tests.
[1] commit
26b4224d9961 ("selftests: expanding more mlock selftest")
[2] commit
91cbacc34512 ("tools/testing/selftests/vm/map_fixed_noreplace.c: add test for MAP_FIXED_NOREPLACE")
[3] commit
fcc1f2d5dd34 ("selftests: add a test program for variable huge page sizes in mmap/shmget")
Link: http://lkml.kernel.org/r/20200206085144.29126-1-sj38.park@gmail.com
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Christian Borntraeger [Fri, 21 Feb 2020 04:04:03 +0000 (20:04 -0800)]
include/uapi/linux/swab.h: fix userspace breakage, use __BITS_PER_LONG for swap
QEMU has a funny new build error message when I use the upstream kernel
headers:
CC block/file-posix.o
In file included from /home/cborntra/REPOS/qemu/include/qemu/timer.h:4,
from /home/cborntra/REPOS/qemu/include/qemu/timed-average.h:29,
from /home/cborntra/REPOS/qemu/include/block/accounting.h:28,
from /home/cborntra/REPOS/qemu/include/block/block_int.h:27,
from /home/cborntra/REPOS/qemu/block/file-posix.c:30:
/usr/include/linux/swab.h: In function `__swab':
/home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:34: error: "sizeof" is not defined, evaluates to 0 [-Werror=undef]
20 | #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
| ^~~~~~
/home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:41: error: missing binary operator before token "("
20 | #define BITS_PER_LONG (sizeof (unsigned long) * BITS_PER_BYTE)
| ^
cc1: all warnings being treated as errors
make: *** [/home/cborntra/REPOS/qemu/rules.mak:69: block/file-posix.o] Error 1
rm tests/qemu-iotests/socket_scm_helper.o
This was triggered by commit
d5767057c9a ("uapi: rename ext2_swab() to
swab() and share globally in swab.h"). That patch is doing
#include <asm/bitsperlong.h>
but it uses BITS_PER_LONG.
The kernel file asm/bitsperlong.h provide only __BITS_PER_LONG.
Let us use the __ variant in swap.h
Link: http://lkml.kernel.org/r/20200213142147.17604-1-borntraeger@de.ibm.com
Fixes: d5767057c9a ("uapi: rename ext2_swab() to swab() and share globally in swab.h")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Joe Perches <joe@perches.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Cc: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ioanna Alifieraki [Fri, 21 Feb 2020 04:04:00 +0000 (20:04 -0800)]
Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"
This reverts commit
a97955844807e327df11aa33869009d14d6b7de0.
Commit
a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage
in exit_sem()") removes a lock that is needed. This leads to a process
looping infinitely in exit_sem() and can also lead to a crash. There is
a reproducer available in [1] and with the commit reverted the issue
does not reproduce anymore.
Using the reproducer found in [1] is fairly easy to reach a point where
one of the child processes is looping infinitely in exit_sem between
for(;;) and if (semid == -1) block, while it's trying to free its last
sem_undo structure which has already been freed by freeary().
Each sem_undo struct is on two lists: one per semaphore set (list_id)
and one per process (list_proc). The list_id list tracks undos by
semaphore set, and the list_proc by process.
Undo structures are removed either by freeary() or by exit_sem(). The
freeary function is invoked when the user invokes a syscall to remove a
semaphore set. During this operation freeary() traverses the list_id
associated with the semaphore set and removes the undo structures from
both the list_id and list_proc lists.
For this case, exit_sem() is called at process exit. Each process
contains a struct sem_undo_list (referred to as "ulp") which contains
the head for the list_proc list. When the process exits, exit_sem()
traverses this list to remove each sem_undo struct. As in freeary(),
whenever a sem_undo struct is removed from list_proc, it is also removed
from the list_id list.
Removing elements from list_id is safe for both exit_sem() and freeary()
due to sem_lock(). Removing elements from list_proc is not safe;
freeary() locks &un->ulp->lock when it performs
list_del_rcu(&un->list_proc) but exit_sem() does not (locking was
removed by commit
a97955844807 ("ipc,sem: remove uneeded sem_undo_list
lock usage in exit_sem()").
This can result in the following situation while executing the
reproducer [1] : Consider a child process in exit_sem() and the parent
in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).
- The list_proc for the child contains the last two undo structs A and
B (the rest have been removed either by exit_sem() or freeary()).
- The semid for A is 1 and semid for B is 2.
- exit_sem() removes A and at the same time freeary() removes B.
- Since A and B have different semid sem_lock() will acquire different
locks for each process and both can proceed.
The bug is that they remove A and B from the same list_proc at the same
time because only freeary() acquires the ulp lock. When exit_sem()
removes A it makes ulp->list_proc.next to point at B and at the same
time freeary() removes B setting B->semid=-1.
At the next iteration of for(;;) loop exit_sem() will try to remove B.
The only way to break from for(;;) is for (&un->list_proc ==
&ulp->list_proc) to be true which is not. Then exit_sem() will check if
B->semid=-1 which is and will continue looping in for(;;) until the
memory for B is reallocated and the value at B->semid is changed.
At that point, exit_sem() will crash attempting to unlink B from the
lists (this can be easily triggered by running the reproducer [1] a
second time).
To prove this scenario instrumentation was added to keep information
about each sem_undo (un) struct that is removed per process and per
semaphore set (sma).
CPU0 CPU1
[caller holds sem_lock(sma for A)] ...
freeary() exit_sem()
... ...
... sem_lock(sma for B)
spin_lock(A->ulp->lock) ...
list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc)
Undo structures A and B have different semid and sem_lock() operations
proceed. However they belong to the same list_proc list and they are
removed at the same time. This results into ulp->list_proc.next
pointing to the address of B which is already removed.
After reverting commit
a97955844807 ("ipc,sem: remove uneeded
sem_undo_list lock usage in exit_sem()") the issue was no longer
reproducible.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=
1694779
Link: http://lkml.kernel.org/r/20191211191318.11860-1-ioanna-maria.alifieraki@canonical.com
Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()")
Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Herton R. Krzesinski <herton@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: <malat@debian.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 21 Feb 2020 04:03:57 +0000 (20:03 -0800)]
y2038: hide timeval/timespec/itimerval/itimerspec types
There are no in-kernel users remaining, but there may still be users that
include linux/time.h instead of sys/time.h from user space, so leave the
types available to user space while hiding them from kernel space.
Only the __kernel_old_* versions of these types remain now.
Link: http://lkml.kernel.org/r/20200110154232.4104492-4-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 21 Feb 2020 04:03:54 +0000 (20:03 -0800)]
y2038: remove unused time32 interfaces
No users remain, so kill these off before we grow new ones.
Link: http://lkml.kernel.org/r/20200110154232.4104492-3-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 21 Feb 2020 04:03:50 +0000 (20:03 -0800)]
y2038: remove ktime to/from timespec/timeval conversion
A couple of helpers are now obsolete and can be removed, so drivers can no
longer start using them and instead use y2038-safe interfaces.
Link: http://lkml.kernel.org/r/20200110154232.4104492-2-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Fri, 21 Feb 2020 00:46:18 +0000 (01:46 +0100)]
ACPI: PM: s2idle: Check fixed wakeup events in acpi_s2idle_wake()
Commit
fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from
waking up the system") overlooked the fact that fixed events can wake
up the system too and broke RTC wakeup from suspend-to-idle as a
result.
Fix this issue by checking the fixed events in acpi_s2idle_wake() in
addition to checking wakeup GPEs and break out of the suspend-to-idle
loop if the status bits of any enabled fixed events are set then.
Fixes: fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system")
Reported-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vladimir Oltean [Fri, 21 Feb 2020 14:46:24 +0000 (16:46 +0200)]
enetc: remove "depends on (ARCH_LAYERSCAPE || COMPILE_TEST)"
ARCH_LAYERSCAPE isn't needed for this driver, it builds and
sends/receives traffic without this config option just fine.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Fri, 21 Feb 2020 14:38:57 +0000 (09:38 -0500)]
tc-testing: updated tdc tests for basic filter with u16 extended match rules
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilias Apalodimas [Fri, 21 Feb 2020 09:15:19 +0000 (11:15 +0200)]
net: page_pool: Add documentation on page_pool API
Add documentation explaining the basic functionality and design
principles of the API
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Fri, 21 Feb 2020 02:46:54 +0000 (12:46 +1000)]
Merge tag 'drm-intel-fixes-2020-02-20' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.6-rc3:
- Workaround missing Display Stream Compression (DSC) state readout by
forcing modeset when its enabled at probe
- Fix EHL port clock voltage level requirements
- Fix queuing retire workers on the virtual engine
- Fix use of partially initialized waiters
- Stop using drm_pci_alloc/drm_pci/free
- Fix rewind of RING_TAIL by forcing a context reload
- Fix locking on resetting ring->head
- Propagate our bug filing URL change to stable kernels
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87y2sxtsrd.fsf@intel.com
Dave Airlie [Fri, 21 Feb 2020 02:30:23 +0000 (12:30 +1000)]
Merge tag 'drm-misc-fixes-2020-02-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.6-rc3:
- Fix dt binding for sunxi.
- Allow only 1 rotation argument, and allow 0 rotation in video cmdline.
- Small compiler warning fix for panfrost.
- Fix when using performance counters in panfrost when using per fd address space.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f5a6370d-9898-6c72-43e4-5bb56a99b6f2@linux.intel.com
Yonghong Song [Fri, 21 Feb 2020 00:43:54 +0000 (16:43 -0800)]
docs/bpf: Update bpf development Q/A file
bpf now has its own mailing list bpf@vger.kernel.org.
Update the bpf_devel_QA.rst file to reflect this.
Also llvm has switch to github with llvm and clang
in the same repo https://github.com/llvm/llvm-project.git.
Update the QA file with newer build instructions.
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200221004354.930952-1-yhs@fb.com
Andrii Nakryiko [Thu, 20 Feb 2020 23:05:46 +0000 (15:05 -0800)]
selftests/bpf: Fix trampoline_count clean up logic
Libbpf's Travis CI tests caught this issue. Ensure bpf_link and bpf_object
clean up is performed correctly.
Fixes: d633d57902a5 ("selftest/bpf: Add test for allowed trampolines count")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20200220230546.769250-1-andriin@fb.com
Alexei Starovoitov [Fri, 21 Feb 2020 01:48:40 +0000 (17:48 -0800)]
Merge branch 'set_attach_target'
Eelco Chaudron says:
====================
Currently when you want to attach a trace program to a bpf program
the section name needs to match the tracepoint/function semantics.
However the addition of the bpf_program__set_attach_target() API
allows you to specify the tracepoint/function dynamically.
The call flow would look something like this:
xdp_fd = bpf_prog_get_fd_by_id(id);
trace_obj = bpf_object__open_file("func.o", NULL);
prog = bpf_object__find_program_by_title(trace_obj,
"fentry/myfunc");
bpf_program__set_expected_attach_type(prog, BPF_TRACE_FENTRY);
bpf_program__set_attach_target(prog, xdp_fd,
"xdpfilt_blk_all");
bpf_object__load(trace_obj)
v1 -> v2: Remove requirement for attach type hint in API
v2 -> v3: Moved common warning to __find_vmlinux_btf_id, requested by Andrii
Updated the xdp_bpf2bpf test to use this new API
v3 -> v4: Split up patch, update libbpf.map version
v4 -> v5: Fix return code, and prog assignment in test case
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Eelco Chaudron [Thu, 20 Feb 2020 13:26:45 +0000 (13:26 +0000)]
selftests/bpf: Update xdp_bpf2bpf test to use new set_attach_target API
Use the new bpf_program__set_attach_target() API in the xdp_bpf2bpf
selftest so it can be referenced as an example on how to use it.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/158220520562.127661.14289388017034825841.stgit@xdp-tutorial
Eelco Chaudron [Thu, 20 Feb 2020 13:26:35 +0000 (13:26 +0000)]
libbpf: Add support for dynamic program attach target
Currently when you want to attach a trace program to a bpf program
the section name needs to match the tracepoint/function semantics.
However the addition of the bpf_program__set_attach_target() API
allows you to specify the tracepoint/function dynamically.
The call flow would look something like this:
xdp_fd = bpf_prog_get_fd_by_id(id);
trace_obj = bpf_object__open_file("func.o", NULL);
prog = bpf_object__find_program_by_title(trace_obj,
"fentry/myfunc");
bpf_program__set_expected_attach_type(prog, BPF_TRACE_FENTRY);
bpf_program__set_attach_target(prog, xdp_fd,
"xdpfilt_blk_all");
bpf_object__load(trace_obj)
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158220519486.127661.7964708960649051384.stgit@xdp-tutorial
Eelco Chaudron [Thu, 20 Feb 2020 13:26:24 +0000 (13:26 +0000)]
libbpf: Bump libpf current version to v0.0.8
New development cycles starts, bump to v0.0.8.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/158220518424.127661.8278643006567775528.stgit@xdp-tutorial
David S. Miller [Fri, 21 Feb 2020 00:05:42 +0000 (16:05 -0800)]
Merge branch 'bnxt_en-shutdown-and-kexec-kdump-related-fixes'
Michael Chan says:
====================
bnxt_en: shutdown and kexec/kdump related fixes.
2 small patches to fix kexec shutdown and kdump kernel driver init issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 20 Feb 2020 22:26:35 +0000 (17:26 -0500)]
bnxt_en: Issue PCIe FLR in kdump kernel to cleanup pending DMAs.
If crashed kernel does not shutdown the NIC properly, PCIe FLR
is required in the kdump kernel in order to initialize all the
functions properly.
Fixes: d629522e1d66 ("bnxt_en: Reduce memory usage when running in kdump kernel.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasundhara Volam [Thu, 20 Feb 2020 22:26:34 +0000 (17:26 -0500)]
bnxt_en: Improve device shutdown method.
Especially when bnxt_shutdown() is called during kexec, we need to
disable MSIX and disable Bus Master to completely quiesce the device.
Make these 2 calls unconditionally in the shutdown method.
Fixes: c20dc142dd7b ("bnxt_en: Disable bus master during PCI shutdown and driver unload.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Thu, 20 Feb 2020 14:42:13 +0000 (16:42 +0200)]
net: netlink: cap max groups which will be considered in netlink_bind()
Since nl_groups is a u32 we can't bind more groups via ->bind
(netlink_bind) call, but netlink has supported more groups via
setsockopt() for a long time and thus nlk->ngroups could be over 32.
Recently I added support for per-vlan notifications and increased the
groups to 33 for NETLINK_ROUTE which exposed an old bug in the
netlink_bind() code causing out-of-bounds access on archs where unsigned
long is 32 bits via test_bit() on a local variable. Fix this by capping the
maximum groups in netlink_bind() to BITS_PER_TYPE(u32), effectively
capping them at 32 which is the minimum of allocated groups and the
maximum groups which can be bound via netlink_bind().
CC: Christophe Leroy <christophe.leroy@c-s.fr>
CC: Richard Guy Briggs <rgb@redhat.com>
Fixes: 4f520900522f ("netlink: have netlink per-protocol bind function return an error code.")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 21 Feb 2020 00:00:14 +0000 (16:00 -0800)]
Merge branch '1GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2020-02-19
This series contains updates to e1000e and igc drivers.
Ben Dooks adds a missing cpu_to_le64() in the e1000e transmit ring flush
function.
Jia-Ju Bai replaces a couple of udelay() with usleep_range() where we
could sleep while holding a spinlock in e1000e.
Chen Zhou make 2 functions static in igc,
Sasha finishes the legacy power management support in igc by adding
resume and schedule suspend requests. Also added register dump
functionality in the igc driver. Added device id support for the next
generation of i219 devices in e1000e. Fixed a typo in the igc driver
that referenced a device that is not support in the driver. Added the
missing PTP support when suspending now that igc has legacy power
management support. Added PCIe error detection, slot reset and resume
capability in igc. Added WoL support for igc as well. Lastly, added a
code comment to distinguish between interrupt and flag definitions.
Vitaly adds device id support for Tiger Lake platforms, which has
another next generation of i219 device in e1000e.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Harvey [Wed, 19 Feb 2020 23:19:36 +0000 (15:19 -0800)]
net: thunderx: workaround BGX TX Underflow issue
While it is not yet understood why a TX underflow can easily occur
for SGMII interfaces resulting in a TX wedge. It has been found that
disabling/re-enabling the LMAC resolves the issue.
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Reviewed-by: Robert Jones <rjones@gateworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Wed, 19 Feb 2020 22:59:42 +0000 (14:59 -0800)]
ionic: fix fw_status read
The fw_status field is only 8 bits, so fix the read. Also,
we only want to look at the one status bit, to allow for future
use of the other bits, and watch for a bad PCI read.
Fixes: 97ca486592c0 ("ionic: add heartbeat check")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 20 Feb 2020 23:15:16 +0000 (15:15 -0800)]
Merge branch 'next-integrity' of git://git./linux/kernel/git/zohar/linux-integrity
Pull IMA fixes from Mimi Zohar:
"Two bug fixes and an associated change for each.
The one that adds SM3 to the IMA list of supported hash algorithms is
a simple change, but could be considered a new feature"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: add sm3 algorithm to hash algorithm configuration list
crypto: rename sm3-256 to sm3 in hash_algo_name
efi: Only print errors about failing to get certs if EFI vars are found
x86/ima: use correct identifier for SetupMode variable
David S. Miller [Thu, 20 Feb 2020 23:04:49 +0000 (15:04 -0800)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2020-02-19
This series contains updates to the ice driver only.
Avinash adds input validation for software DCB configurations received
via lldptool or pcap to ensure bad bandwidth inputs are not inputted
which could cause the loss of link.
Paul update the malicious driver detection event messages to rate limit
once per second and to include the total number of receive|transmit MDD
event count.
Dan updates how TCAM entries are managed to ensure when overriding
pre-existing TCAM entries, properly delete the existing entry and remove
it from the change/update list.
Brett ensures we clear the relevant values in the QRXFLXP_CNTXT register
for VF queues to ensure the receive queue data is not stale.
Avinash adds required DCBNL operations for configuring ETS in software
DCB CEE mode. Also added code to detect if DCB is in IEEE or CEE mode
to properly report what mode we are in.
Dave fixes the driver to properly report the current maximum TC, not the
maximum allowed number of TCs.
Krzysztof adds support for AF_XDP feature in the ice driver.
Jake increases the maximum time that the driver will wait for a PR reset
to account for possibility of a slightly longer than expected PD reset.
Jesse fixes a number of strings which did not have line feeds, so add
line feeds so that messages do not rum together, creating a jumbled
mess.
Bruce adds support for additional E810 and E823 device ids. Also
updated the product name change for E822 devices.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Kiryanov [Wed, 19 Feb 2020 21:40:06 +0000 (13:40 -0800)]
net: disable BRIDGE_NETFILTER by default
The description says 'If unsure, say N.' but
the module is built as M by default (once
the dependencies are satisfied).
When the module is selected (Y or M), it enables
NETFILTER_FAMILY_BRIDGE and SKB_EXTENSIONS
which alter kernel internal structures.
We (Android Studio Emulator) currently do not
use this module and think this it is more consistent
to have it disabled by default as opposite to
disabling it explicitly to prevent enabling
NETFILTER_FAMILY_BRIDGE and SKB_EXTENSIONS.
Signed-off-by: Roman Kiryanov <rkir@google.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandre Belloni [Wed, 19 Feb 2020 14:15:51 +0000 (15:15 +0100)]
net: macb: Properly handle phylink on at91rm9200
at91ether_init was handling the phy mode and speed but since the switch to
phylink, the NCFGR register got overwritten by macb_mac_config(). The issue
is that the RM9200_RMII bit and the MACB_CLK_DIV32 field are cleared
but never restored as they conflict with the PAE, GBE and PCSSEL bits.
Add new capability to differentiate between EMAC and the other versions of
the IP and use it to set and avoid clearing the relevant bits.
Also, this fixes a NULL pointer dereference in macb_mac_link_up as the EMAC
doesn't use any rings/bufffers/queues.
Fixes: 7897b071ac3b ("net: macb: convert to phylink")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrii Nakryiko [Thu, 20 Feb 2020 06:26:35 +0000 (22:26 -0800)]
libbpf: Relax check whether BTF is mandatory
If BPF program is using BTF-defined maps, BTF is required only for
libbpf itself to process map definitions. If after that BTF fails to
be loaded into kernel (e.g., if it doesn't support BTF at all), this
shouldn't prevent valid BPF program from loading. Existing
retry-without-BTF logic for creating maps will succeed to create such
maps without any problems. So, presence of .maps section shouldn't make
BTF required for kernel. Update the check accordingly.
Validated by ensuring simple BPF program with BTF-defined maps is still
loaded on old kernel without BTF support and map is correctly parsed and
created.
Fixes: abd29c931459 ("libbpf: allow specifying map definitions using BTF")
Reported-by: Julia Kartseva <hex@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200220062635.1497872-1-andriin@fb.com
David S. Miller [Thu, 20 Feb 2020 18:30:47 +0000 (10:30 -0800)]
Merge branch 's390-fixes'
Julian Wiedmann says:
====================
s390/qeth: fixes 2020-02-20
please apply the following patch series for qeth to netdev's net tree.
This corrects three minor issues:
1) return a more fitting errno when VNICC cmds are not supported,
2) remove a bogus WARN in the NAPI code, and
3) be _very_ pedantic about the RX copybreak.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 20 Feb 2020 14:54:56 +0000 (15:54 +0100)]
s390/qeth: fix off-by-one in RX copybreak check
The RX copybreak is intended as the _max_ value where the frame's data
should be copied. So for frame_len == copybreak, don't build an SG skb.
Fixes: 4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 20 Feb 2020 14:54:55 +0000 (15:54 +0100)]
s390/qeth: don't warn for napi with 0 budget
Calling napi->poll() with 0 budget is a legitimate use by netpoll.
Fixes: a1c3ed4c9ca0 ("qeth: NAPI support for l2 and l3 discipline")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandra Winter [Thu, 20 Feb 2020 14:54:54 +0000 (15:54 +0100)]
s390/qeth: vnicc Fix EOPNOTSUPP precedence
When getting or setting VNICC parameters, the error code EOPNOTSUPP
should have precedence over EBUSY.
EBUSY is used because vnicc feature and bridgeport feature are mutually
exclusive, which is a temporary condition.
Whereas EOPNOTSUPP indicates that the HW does not support all or parts of
the vnicc feature.
This issue causes the vnicc sysfs params to show 'blocked by bridgeport'
for HW that does not support VNICC at all.
Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Thu, 20 Feb 2020 08:00:07 +0000 (09:00 +0100)]
net: use netif_is_bridge_port() to check for IFF_BRIDGE_PORT
Trivial cleanup, so that all bridge port-specific code can be found in
one go.
CC: Johannes Berg <johannes@sipsolutions.net>
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilias Apalodimas [Thu, 20 Feb 2020 07:41:55 +0000 (09:41 +0200)]
net: page_pool: API cleanup and comments
Functions starting with __ usually indicate those which are exported,
but should not be called directly. Update some of those declared in the
API and make it more readable.
page_pool_unmap_page() and page_pool_release_page() were doing
exactly the same thing calling __page_pool_clean_page(). Let's
rename __page_pool_clean_page() to page_pool_release_page() and
export it in order to show up on perf logs and get rid of
page_pool_unmap_page().
Finally rename __page_pool_put_page() to page_pool_put_page() since we
can now directly call it from drivers and rename the existing
page_pool_put_page() to page_pool_put_full_page() since they do the same
thing but the latter is trying to sync the full DMA area.
This patch also updates netsec, mvneta and stmmac drivers which use
those functions.
Suggested-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Feb 2020 18:04:34 +0000 (10:04 -0800)]
Merge branch 'mlxsw-Preparation-for-RTNL-removal'
Ido Schimmel says:
====================
mlxsw: Preparation for RTNL removal
The driver currently acquires RTNL in its route insertion path, which
contributes to very large control plane latencies. This patch set
prepares mlxsw for RTNL removal from its route insertion path in a
follow-up patch set.
Patches #1-#2 protect shared resources - KVDL and counter pool - with
their own locks. All allocations of these resources are currently
performed under RTNL, so no locks were required.
Patches #3-#7 ensure that updates to mirroring sessions only take place
in case there are active mirroring sessions. This allows us to avoid
taking RTNL when it is unnecessary, as updating of the mirroring
sessions must be performed under RTNL for the time being.
Patches #8-#10 replace the use of APIs that assume that RTNL is taken
with their RCU counterparts. Specifically, patches #8 and #9 replace
__in_dev_get_rtnl() with __in_dev_get_rcu() under RCU read-side critical
section. Patch #10 replaces __dev_get_by_index() with
dev_get_by_index_rcu().
Patches #11-#15 perform small adjustments in the code to make it easier
to later introduce a router lock instead of relying on RTNL.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:08:00 +0000 (09:08 +0200)]
mlxsw: spectrum_nve: Make tunnel initialization symmetric
The device supports a single VTEP whose configuration is shared between
all VXLAN tunnels.
While the shared configuration is cleared upon the destruction of the
last tunnel - in mlxsw_sp_nve_tunnel_fini() - it is set in
mlxsw_sp_nve_fid_enable(), after calling mlxsw_sp_nve_tunnel_init().
Make tunnel initialization and destruction symmetric and set the
configuration in mlxsw_sp_nve_tunnel_init().
This will later allow us to protect the shared configuration with a
lock.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:59 +0000 (09:07 +0200)]
mlxsw: spectrum: Export function to check if RIF exists
After the previous patch, all the callers of mlxsw_sp_rif_find_by_dev()
outside of the routing code use it to understand if a RIF exists for the
passed netdev.
Therefore, export a function to check if a RIF exists and make
mlxsw_sp_rif_find_by_dev() internal to the routing code.
This will later allow us to more easily introduce the router lock which
will also protect the RIFs.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:58 +0000 (09:07 +0200)]
mlxsw: spectrum: Prevent RIF access outside of routing code
There are currently 5 users of mlxsw_sp_rif_find_by_dev() outside of the
routing code. Only one call site actually needs to dereference the
router interface (RIF). The rest merely need to know if a RIF exists for
the provided netdev.
Convert this call site to query the needed information directly from the
routing code instead of dereferencing the RIF.
This will later allow us to replace mlxsw_sp_rif_find_by_dev() with a
function that checks if a RIF exist.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:57 +0000 (09:07 +0200)]
mlxsw: spectrum_router: Prepare function for router lock introduction
The function de-associates the port-vlan from its router interface
(RIF). It is called both from the netdev notifier block and the inetaddr
notifier block that will soon hold the router lock.
Make sure that router code calls the internal version, as it will
already have the router lock held when the function is called.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:56 +0000 (09:07 +0200)]
mlxsw: spectrum_router: Prepare function for router lock introduction
The function removes the FDB entry that directs the macvlan's MAC to the
router port. It is called from both the netdev notifier block and the
inetaddr notifier block that will soon hold the router lock.
Make sure that only the netdev notifier calls the exported version, so
that is will take the router lock, which will already be held by the
inetaddr notifier.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:55 +0000 (09:07 +0200)]
mlxsw: spectrum_router: Do not assume RTNL is taken when resolving underlay device
The function that resolves the underlay device of the IPIP tunnel
assumes that RTNL is taken, but this will not be correct when RTNL is
removed from the route insertion path.
Convert the function to use dev_get_by_index_rcu() instead of
__dev_get_by_index() and make sure it is always called from an RCU
read-side critical section.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:54 +0000 (09:07 +0200)]
mlxsw: spectrum_router: Do not assume RTNL is taken during RIF teardown
IPv6 addresses are deleted in an atomic context, so the driver defers
the potential teardown of the associated router interface (RIF) to a
work item that takes RTNL.
The RIF is only destroyed if the associated netdev does not have any IP
addresses (both IPv4 and IPv6). The IPv4 device ('struct in_device') is
currently fetched via __in_dev_get_rtnl() which assumes RTNL is taken.
Since RTNL is going to be removed, convert it to use __in_dev_get_rcu()
from an RCU read-side critical section.
Note that the IPv6 device ('struct inet6_dev') is fetched via
__in6_dev_get(), which does not require RTNL.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:53 +0000 (09:07 +0200)]
mlxsw: spectrum_router: Do not assume RTNL is taken during nexthop init
RTNL is going to be removed from route insertion path, so use
__in_dev_get_rcu() from an RCU read-side critical section instead of
__in_dev_get_rtnl() which assumes RTNL is taken.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:52 +0000 (09:07 +0200)]
mlxsw: spectrum_span: Only update mirroring agents if present
In order not to needlessly schedule the work item that updates the
mirroring agents, only schedule it if there are any mirroring agents
present.
This is done by adding an atomic counter that counts the active
mirroring agents.
It is incremented / decremented whenever a mirroring agent is created /
destroyed. It is read before scheduling the work item and in the
devlink-resource occupancy callback.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:51 +0000 (09:07 +0200)]
mlxsw: spectrum: Convert callers to use new mirroring API
Previous patch added a work item in the mirroring code that will take
care of updating the active mirroring agents in response to different
events.
Change the mirroring agents update function - mlxsw_sp_span_respin() -
to invoke this work item when called.
Therefore there is no need for callers to schedule a work item
themselves.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:50 +0000 (09:07 +0200)]
mlxsw: spectrum_span: Prepare work item to update mirroring agents
The driver updates its mirroring agents whenever it receives a
notification about an event that can affect these. For example, the
addition of a route might require the driver to change the egress port
of an ERSPAN session.
Currently, RTNL needs to be held when these agents are updates, so the
driver either:
1. Calls directly into the mirroring code, in case RTNL is held
2. Schedules a work item that will take RTNL and call into the mirroring
code
Simplify this by having the mirroring code schedule the work item for
the update instead of requiring callers to schedule a work item
themselves.
The conversion of the callers will be done in the next patch to make
review easier.
This will later allow us to remove RTNL from different parts of the
driver. It will also allow us to only schedule the work item in case
there are active mirroring agents, which is information private to the
mirroring code.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:49 +0000 (09:07 +0200)]
mlxsw: spectrum_span: Use struct_size() to simplify allocation
Allocate the main mirroring struct and the individual structs for the
different mirroring agents in a single allocation.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:48 +0000 (09:07 +0200)]
mlxsw: spectrum_span: Do no expose mirroring agents to entire driver
The struct holding the different mirroring agents is currently allocated
as part of the main driver struct. This is unlike other driver modules.
Allocate the memory required to store the different mirroring agents as
part of the initialization of the mirroring module.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:47 +0000 (09:07 +0200)]
mlxsw: spectrum: Protect counter pool with a lock
The counter pool is a shared resource. It is used by both the ACL code
to allocate counters for actions and by the routing code to allocate
counters for adjacency entries (for example).
Currently, all allocations are protected by RTNL, but this is going to
change with the removal of RTNL from the routing code.
Therefore, protect counter allocations with a spin lock.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 20 Feb 2020 07:07:46 +0000 (09:07 +0200)]
mlxsw: spectrum_kvdl: Protect allocations with a lock
The KVDL is used to store objects allocated throughout various places
in the driver. For example, both nexthops (adjacency entries) and ACL
actions are stored in the KVDL.
Currently, all allocations are protected by RTNL, but this is going to
change with the removal of RTNL from the routing code.
Therefore, protect KVDL allocations with a lock. A mutex is used since
the free operation can block in Spectrum-2.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Thu, 20 Feb 2020 06:50:19 +0000 (14:50 +0800)]
net: remove unused macro from fib_trie.c
TNODE_KMALLOC_MAX and VERSION are not used, so remove them
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Thu, 20 Feb 2020 06:49:02 +0000 (14:49 +0800)]
net: neigh: remove unused NEIGH_SYSCTL_MS_JIFFIES_ENTRY
this macro is never used, so remove it
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Thu, 20 Feb 2020 06:23:09 +0000 (22:23 -0800)]
openvswitch: Distribute switch variables for initialization
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
net/openvswitch/flow_netlink.c: In function ‘validate_set’:
net/openvswitch/flow_netlink.c:2711:29: warning: statement will never be executed [-Wswitch-unreachable]
2711 | const struct ovs_key_ipv4 *ipv4_key;
| ^~~~~~~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Thu, 20 Feb 2020 06:23:07 +0000 (22:23 -0800)]
net: ip6_gre: Distribute switch variables for initialization
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
net/ipv6/ip6_gre.c: In function ‘ip6gre_err’:
net/ipv6/ip6_gre.c:440:32: warning: statement will never be executed [-Wswitch-unreachable]
440 | struct ipv6_tlv_tnl_enc_lim *tel;
| ^~~
net/ipv6/ip6_tunnel.c: In function ‘ip6_tnl_err’:
net/ipv6/ip6_tunnel.c:520:32: warning: statement will never be executed [-Wswitch-unreachable]
520 | struct ipv6_tlv_tnl_enc_lim *tel;
| ^~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Thu, 20 Feb 2020 06:23:04 +0000 (22:23 -0800)]
net: core: Distribute switch variables for initialization
Variables declared in a switch statement before any case statements
cannot be automatically initialized with compiler instrumentation (as
they are not part of any execution flow). With GCC's proposed automatic
stack variable initialization feature, this triggers a warning (and they
don't get initialized). Clang's automatic stack variable initialization
(via CONFIG_INIT_STACK_ALL=y) doesn't throw a warning, but it also
doesn't initialize such variables[1]. Note that these warnings (or silent
skipping) happen before the dead-store elimination optimization phase,
so even when the automatic initializations are later elided in favor of
direct initializations, the warnings remain.
To avoid these problems, move such variables into the "case" where
they're used or lift them up into the main function body.
net/core/skbuff.c: In function ‘skb_checksum_setup_ip’:
net/core/skbuff.c:4809:7: warning: statement will never be executed [-Wswitch-unreachable]
4809 | int err;
| ^~~
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Thu, 20 Feb 2020 02:00:06 +0000 (12:00 +1000)]
Merge branch 'linux-5.6' of git://github.com/skeggsb/linux into drm-fixes
Nothing major here, another TU1xx modesetting fix, and hooking up
ACR/GR support on TU11x now that NVIDIA have made the firmware
available.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/
Linus Torvalds [Thu, 20 Feb 2020 01:22:10 +0000 (17:22 -0800)]
Merge tag 'linux-kselftest-5.6-rc3' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"Fixes to build failures and other test bugs"
* tag 'linux-kselftest-5.6-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: openat2: fix build error on newer glibc
selftests: use LDLIBS for libraries instead of LDFLAGS
selftests: fix too long argument
selftests: allow detection of build failures
Kernel selftests: tpm2: check for tpm support
selftests/ftrace: Have pid filter test use instance flag
selftests: fix spelling mistaked "chaigned" -> "chained"
Dave Airlie [Thu, 20 Feb 2020 01:01:28 +0000 (11:01 +1000)]
Merge tag 'drm-msm-fixes-2020-02-16' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
+ fix UBWC on GPU and display side for sc7180
+ fix DSI suspend/resume issue encountered on sc7180
+ fix some breakage on so called "linux-android" devices
(fallout from sc7180/a618 support, not seen earlier
due to bootloader/firmware differences)
+ couple other misc fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/
Sasha Neftin [Mon, 3 Feb 2020 08:11:50 +0000 (10:11 +0200)]
igc: Add comment
Separate interrupt and flag definitions.
Made the code clear.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sasha Neftin [Mon, 3 Feb 2020 07:55:20 +0000 (09:55 +0200)]
igc: Add WOL support
This patch adds a define and WOL support for an i225 parts.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sasha Neftin [Wed, 29 Jan 2020 14:30:07 +0000 (16:30 +0200)]
igc: Add pcie error handler support
Add pcie error detection, slot reset and resume capability
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sasha Neftin [Wed, 22 Jan 2020 09:21:13 +0000 (11:21 +0200)]
igc: Complete to commit Add basic skeleton for PTP
commit
5f2958052c58 ("igc: Add basic skeleton for PTP") added basic
support for PTP, what's missing is support for suspending.
Legacy power management has been added. Now we can add
the suspend method to the igc_shutdown.
By cleaning the runtime storage for timestamp this avoids a possible
invalid memory access when the system comes back from suspend state.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Vitaly Lifshits [Tue, 21 Jan 2020 23:46:28 +0000 (15:46 -0800)]
e1000e: Add support for Tiger Lake device
Added support for a device id that is a part of the Intel Tiger Lake
platform.
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>