Daniel Borkmann [Fri, 8 Sep 2017 23:40:35 +0000 (01:40 +0200)]
bpf: make error reporting in bpf_warn_invalid_xdp_action more clear
Differentiate between an illegal XDP action code and one that is
merely unsupported by the driver, to provide better feedback when
we throw a one-time warning here. Reason is that with 814abfabef3c
("xdp: add bpf_redirect helper function") not all drivers
support the new XDP return code yet and thus they will
fall into their 'default' case when checking for return
codes after program return, which then triggers a
bpf_warn_invalid_xdp_action() stating that the return
code is illegal, but from XDP perspective it's not.
I decided not to place something like a XDP_ACT_MAX define
into uapi i) given we don't have this either for all other
program types, ii) future action codes could have further
encoding there, which would render such define unsuitable
and we wouldn't be able to rip it out again, and iii) we
rarely add new action codes.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
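A rough userspace sketch of the distinction described in the commit above, not the kernel code itself: a value above the highest defined action code is illegal, while a defined code that lands in a driver's 'default' case is merely unsupported by that driver. The enum mirrors the XDP action codes of that time; the helper name xdp_action_complaint and the use of XDP_REDIRECT as a local upper bound are assumptions made for illustration.

```c
/* XDP action codes as they existed at the time of this commit. */
enum xdp_action {
	XDP_ABORTED = 0,
	XDP_DROP,
	XDP_PASS,
	XDP_TX,
	XDP_REDIRECT,
};

/*
 * Hypothetical helper: called from a driver's 'default' case, i.e. the
 * driver did not recognise 'act'. Picks the wording for the warning.
 */
static const char *xdp_action_complaint(unsigned int act)
{
	/* Local upper bound only; deliberately not a uapi define. */
	const unsigned int act_max = XDP_REDIRECT;

	return act > act_max ? "Illegal" : "Driver unsupported";
}
```

Keeping the bound local to the function matches the reasoning in the message for not adding an XDP_ACT_MAX define to uapi.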
Florian Fainelli [Fri, 8 Sep 2017 22:38:07 +0000 (15:38 -0700)]
Revert "mdio_bus: Remove unneeded gpiod NULL check"
This reverts commit 95b80bf3db03c2bf572a357cf74b9a6aefef0a4a ("mdio_bus:
Remove unneeded gpiod NULL check"). That commit assumed that GPIOLIB
checks for NULL descriptors and that the caller's checks could therefore
be dropped, but this is not safe when CONFIG_GPIOLIB is disabled in the
kernel. If we do call
gpiod_set_value_cansleep() on a GPIO descriptor we will issue warnings
coming from the inline stubs declared in include/linux/gpio/consumer.h.
Fixes: 95b80bf3db03 ("mdio_bus: Remove unneeded gpiod NULL check")
Reported-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
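To illustrate why the caller-side check matters in the revert above, here is a small userspace model. It assumes, as the message states, that the stub used without GPIOLIB warns when called; the names stub_gpiod_set_value_cansleep, stub_warnings and reset_pulse are hypothetical and not taken from the driver.

```c
#include <stddef.h>

struct gpio_desc;	/* opaque, as in the real consumer API */

/* Counts how often the stand-in stub below "warned". */
static int stub_warnings;

/*
 * Stand-in for the inline stub used when GPIOLIB is compiled out: it
 * cannot do anything useful, so it only warns. Hypothetical model, not
 * the header's actual text.
 */
static void stub_gpiod_set_value_cansleep(struct gpio_desc *desc, int value)
{
	(void)desc;
	(void)value;
	stub_warnings++;
}

/* Caller-side guard of the kind the revert restores. */
static void reset_pulse(struct gpio_desc *reset_gpio)
{
	if (!reset_gpio)
		return;		/* optional GPIO absent: nothing to toggle */

	stub_gpiod_set_value_cansleep(reset_gpio, 1);
	stub_gpiod_set_value_cansleep(reset_gpio, 0);
}
```

With the guard in place, a missing (NULL) descriptor never reaches the stub, so no warning is issued.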
David S. Miller [Sat, 9 Sep 2017 04:11:01 +0000 (21:11 -0700)]
Merge branch 'xdp-bpf-fixes'
John Fastabend says:
====================
net: Fixes for XDP/BPF
The following fixes, UAPI updates, and small improvement:
i. XDP needs to be called inside RCU with preempt disabled.
ii. Not strictly a bug fix, but we already have an attach command in the
sockmap UAPI. To avoid having a single kernel released with
only the attach and not the detach, I'm pushing this into the net branch.
It's early in the RC cycle so I think this is OK (not ideal but better
than supporting a UAPI with a missing detach forever).
iii. Final patch replaces cpu_relax with cond_resched in devmap.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 8 Sep 2017 21:01:10 +0000 (14:01 -0700)]
bpf: devmap, use cond_resched instead of cpu_relax
Be a bit more friendly about waiting for flush bits to complete.
Replace the cpu_relax() with a cond_resched().
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 8 Sep 2017 21:00:49 +0000 (14:00 -0700)]
bpf: add support for sockmap detach programs
The bpf map sockmap supports adding programs via attach commands. This
patch adds the detach command to keep the API symmetric and allow
users to remove previously added programs. Otherwise the user would
have to delete the map and re-add it to get in this state.
This also adds a series of additional tests to capture detach operation
and also attaching/detaching invalid prog types.
API note: socks will run (or not run) programs depending on the state
of the map at the time the sock is added. We do not for example walk
the map and remove programs from previously attached socks.
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 8 Sep 2017 21:00:30 +0000 (14:00 -0700)]
net: rcu lock and preempt disable missing around generic xdp
do_xdp_generic must be called inside rcu critical section with preempt
disabled to ensure BPF programs are valid and per-cpu variables used
for redirect operations are consistent. This patch ensures this is true
and fixes the splat below.
The netif_receive_skb_internal() code path is now broken into two rcu
critical sections. I decided it was better to limit the preempt_enable/disable
block to just the xdp static key portion and the fallout is more
rcu_read_lock/unlock calls. Seems like the best option to me.
[ 607.596901] =============================
[ 607.596906] WARNING: suspicious RCU usage
[ 607.596912] 4.13.0-rc4+ #570 Not tainted
[ 607.596917] -----------------------------
[ 607.596923] net/core/dev.c:3948 suspicious rcu_dereference_check() usage!
[ 607.596927]
[ 607.596927] other info that might help us debug this:
[ 607.596927]
[ 607.596933]
[ 607.596933] rcu_scheduler_active = 2, debug_locks = 1
[ 607.596938] 2 locks held by pool/14624:
[ 607.596943] #0: (rcu_read_lock_bh){......}, at: [<ffffffff95445ffd>] ip_finish_output2+0x14d/0x890
[ 607.596973] #1: (rcu_read_lock_bh){......}, at: [<ffffffff953c8e3a>] __dev_queue_xmit+0x14a/0xfd0
[ 607.597000]
[ 607.597000] stack backtrace:
[ 607.597006] CPU: 5 PID: 14624 Comm: pool Not tainted 4.13.0-rc4+ #570
[ 607.597011] Hardware name: Dell Inc. Precision Tower 5810/0HHV7N, BIOS A17 03/01/2017
[ 607.597016] Call Trace:
[ 607.597027] dump_stack+0x67/0x92
[ 607.597040] lockdep_rcu_suspicious+0xdd/0x110
[ 607.597054] do_xdp_generic+0x313/0xa50
[ 607.597068] ? time_hardirqs_on+0x5b/0x150
[ 607.597076] ? mark_held_locks+0x6b/0xc0
[ 607.597088] ? netdev_pick_tx+0x150/0x150
[ 607.597117] netif_rx_internal+0x205/0x3f0
[ 607.597127] ? do_xdp_generic+0xa50/0xa50
[ 607.597144] ? lock_downgrade+0x2b0/0x2b0
[ 607.597158] ? __lock_is_held+0x93/0x100
[ 607.597187] netif_rx+0x119/0x190
[ 607.597202] loopback_xmit+0xfd/0x1b0
[ 607.597214] dev_hard_start_xmit+0x127/0x4e0
Fixes: d445516966dc ("net: xdp: support xdp generic on virtual devices")
Fixes: b5cdae3291f7 ("net: Generic XDP")
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 7 Sep 2017 22:14:51 +0000 (00:14 +0200)]
bpf: don't select potentially stale ri->map from buggy xdp progs
We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri->map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:
i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri->map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.
ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.
iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri->map (will be fixed by Jesper), so it's
not being reset. Later loading e.g. a native prog B which would,
say, call bpf_xdp_redirect() and then return XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and use it, causing
a use after free on map access.
The fix thus needs to avoid accessing stale ri->map pointers. A naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri->map
be handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.
The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to its own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used since, as per calling convention, we scratch
R0 to R5 anyway, so they become inaccessible and the program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.
Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.
Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
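The owner check described in the commit above can be modelled in userspace roughly as follows. This is a simplified sketch under stated assumptions: the struct layout, the field name map_owner and all function names are illustrative stand-ins, and per-CPU storage, RCU and the verifier rewrite are not modelled.

```c
#include <stddef.h>

struct bpf_map { int id; };
struct bpf_prog { int id; };

/* Simplified stand-in for the per-CPU redirect state. */
struct redirect_info {
	struct bpf_map *map;
	const struct bpf_prog *map_owner;	/* prog that set 'map' */
};

/* Models the map-redirect helper: the prog pointer supplied via the
 * verifier rewrite is recorded next to the map. */
static void redirect_map_helper(struct redirect_info *ri,
				struct bpf_map *map,
				const struct bpf_prog *self)
{
	ri->map = map;
	ri->map_owner = self;
}

/* Models the plain redirect helper: clears map and owner so that only
 * the last helper call takes effect. */
static void redirect_helper(struct redirect_info *ri)
{
	ri->map = NULL;
	ri->map_owner = NULL;
}

/*
 * Models the check done at redirect time. Returns the map to use, or
 * NULL if none was set or if the recorded owner is not the running
 * prog (stale pointer). The state is cleared either way.
 */
static struct bpf_map *redirect_take_map(struct redirect_info *ri,
					 const struct bpf_prog *running)
{
	struct bpf_map *map = ri->map;
	const struct bpf_prog *owner = ri->map_owner;

	ri->map = NULL;
	ri->map_owner = NULL;

	if (map && owner != running)
		return NULL;	/* stale: set by a different prog */
	return map;
}
```

Because the owning prog holds a reference on its map, a matching owner implies the map is still alive; a mismatch means the pointer was left behind by another program and is dropped without being dereferenced.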
Kees Cook [Thu, 7 Sep 2017 19:35:14 +0000 (12:35 -0700)]
net: tulip: Constify tulip_tbl
It looks like all users of tulip_tbl are reads, so mark this table
as read-only.
$ git grep tulip_tbl # edited to avoid line-wraps...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs&~RxPollInt, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ...
interrupt.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs | TimerInt,
pnic.c: iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
tulip.h: extern struct tulip_chip_table tulip_tbl[];
tulip_core.c:struct tulip_chip_table tulip_tbl[] = {
tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR5);
tulip_core.c:iowrite32(tulip_tbl[tp->chip_id].valid_intrs, ioaddr + CSR7);
tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer,
tulip_core.c:const char *chip_name = tulip_tbl[chip_idx].chip_name;
tulip_core.c:if (pci_resource_len (pdev, 0) < tulip_tbl[chip_idx].io_size)
tulip_core.c:ioaddr = pci_iomap(..., tulip_tbl[chip_idx].io_size);
tulip_core.c:tp->flags = tulip_tbl[chip_idx].flags;
tulip_core.c:setup_timer(&tp->timer, tulip_tbl[tp->chip_id].media_timer,
tulip_core.c:INIT_WORK(&tp->media_work, tulip_tbl[tp->chip_id].media_task);
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: netdev@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Thu, 7 Sep 2017 15:32:30 +0000 (18:32 +0300)]
net: ethernet: ti: netcp_core: no need in netif_napi_del
Don't remove rx_napi specifically just before free_netdev();
free_netdev() is supposed to do that itself, and doing it here is
confusing w/o the matching tx_napi deletion.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathieu Malaterre [Thu, 7 Sep 2017 11:24:20 +0000 (13:24 +0200)]
davicom: Display proper debug level up to 6
This makes it explicit that some messages are of the form:
dm9000_dbg(db, 5, ...
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Thu, 7 Sep 2017 09:25:50 +0000 (12:25 +0300)]
net: phy: sfp: rename dt properties to match the binding
Make the Rx rate select control gpio property name match the documented
binding. This would make the addition of 'rate-select1-gpios' for SFP+
support more natural.
Also, make the MOD-DEF0 gpio property name match the documentation.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Thu, 7 Sep 2017 09:25:49 +0000 (12:25 +0300)]
dt-binding: net: sfp binding documentation
Add device-tree binding documentation for SFP transceivers. Support for SFP
transceivers has been recently introduced (drivers/net/phy/sfp.c).
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Thu, 7 Sep 2017 09:25:48 +0000 (12:25 +0300)]
dt-bindings: add SFF vendor prefix
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Thu, 7 Sep 2017 08:09:59 +0000 (11:09 +0300)]
dt-bindings: net: don't confuse with generic PHY property
This complements commit 9a94b3a4bd (dt-binding: phy: don't confuse with
Ethernet phy properties).
The generic PHY 'phys' property sometimes appears in the same node with
the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
ethernet.txt to reduce confusion.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haishuang Yan [Thu, 7 Sep 2017 06:08:35 +0000 (14:08 +0800)]
ip6_tunnel: fix setting hop_limit value for ipv6 tunnel
Similar to vxlan/geneve tunnel, if hop_limit is zero, it should fall
back to ip6_dst_hoplimit().
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haishuang Yan [Thu, 7 Sep 2017 06:08:34 +0000 (14:08 +0800)]
ip_tunnel: fix setting ttl and tos value in collect_md mode
ttl and tos variables are declared and assigned, but are not used in
iptunnel_xmit() function.
Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Cc: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 8 Sep 2017 22:48:47 +0000 (15:48 -0700)]
ipv6: fix typo in fib6_net_exit()
IPv6 FIB should use FIB6_TABLE_HASHSZ, not FIB_TABLE_HASHSZ.
Fixes: ba1cc08d9488 ("ipv6: fix memory leak with multiple tables during netns destruction")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 8 Sep 2017 19:44:47 +0000 (12:44 -0700)]
tcp: fix a request socket leak
While the cited commit fixed a possible deadlock, it added a leak
of the request socket, since reqsk_put() must be called if the BPF
filter decided the ACK packet must be dropped.
Fixes: d624d276d1dd ("tcp: fix possible deadlock in TCP stack vs BPF filter")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Sep 2017 18:35:55 +0000 (11:35 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) Fix SCTP connection setup when IPVS module is loaded and any scheduler
is registered, from Xin Long.
2) Don't create a SCTP connection from SCTP ABORT packets, also from
Xin Long.
3) WARN_ON() and drop packet, instead of BUG_ON() races when calling
nf_nat_setup_info(). This is specifically a longstanding problem
when br_netfilter with conntrack support is in place, patch from
Florian Westphal.
4) Avoid softlock splats via iptables-restore, also from Florian.
5) Revert NAT hashtable conversion to rhashtable, semantics of rhlist
are different from our simple NAT hashtable, this has been causing
problems in the recent Linux kernel releases. From Florian.
6) Add per-bucket spinlock for NAT hashtable, so at least we restore
one of the benefits we got from the previous rhashtable conversion.
7) Fix incorrect hashtable size in memory allocation in xt_hashlimit,
from Zhizhou Tian.
8) Fix build/link problems with hashlimit and 32-bit arches, to address
recent fallout from a new hashlimit mode, from Vishwanath Pai.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Sep 2017 17:09:57 +0000 (10:09 -0700)]
Merge tag 'wireless-drivers-for-davem-2017-09-08' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.14
Few fixes to regressions introduced in the last one or two releases.
The iwlwifi fix is for a regression reported by Linus.
rtlwifi
* fix two antenna selection related bugs
iwlwifi
* fix regression with older firmwares
brcmfmac
* workaround firmware crash for bcm4345
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Fri, 8 Sep 2017 14:35:21 +0000 (11:35 -0300)]
sctp: fix missing wake ups in some situations
Commit fb586f25300f ("sctp: delay calls to sk_data_ready() as much as
possible") minimized the number of wake ups that are triggered in case
the association receives a packet with multiple data chunks on it and/or
when io_events are enabled, and then commit 0970f5b36659 ("sctp: signal
sk_data_ready earlier on data chunks reception") moved the wake up to as
soon as possible. It thus relies on the state machine running later to
clean the flag that the event was already generated.
The issue is that there are 2 call paths that call
sctp_ulpq_tail_event() outside of the state machine, causing the flag to
linger and possibly omitting a needed wake up in the sequence.
One of the call paths is when enabling SCTP_SENDER_DRY_EVENTS via
setsockopt(SCTP_EVENTS), as noticed by Harald Welte. The other is when
partial reliability triggers removal of chunks from the send queue when
the application calls sendmsg().
This commit fixes it by not setting the flag in case the socket is not
owned by the user, as it won't be cleaned later. This works for
user-initiated calls and also for rx path processing.
Fixes: fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible")
Reported-by: Harald Welte <laforge@gnumonks.org>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vishwanath Pai [Fri, 8 Sep 2017 05:38:58 +0000 (01:38 -0400)]
netfilter: xt_hashlimit: fix build error caused by 64bit division
64bit division causes build/link errors on 32bit architectures. It
prints out error messages like:
ERROR: "__aeabi_uldivmod" [net/netfilter/xt_hashlimit.ko] undefined!
The value of avg passed through by userspace in BYTE mode cannot exceed
U32_MAX. Which means 64bit division in user2rate_bytes is unnecessary.
To fix this I have changed the type of param 'user' to u32.
Since anything greater than U32_MAX is an invalid input we error out in
hashlimit_mt_check_common() when this is the case.
Changes in v2:
Making return type as u32 would cause an overflow for small
values of 'user' (for example 2, 3 etc). To avoid this I bumped up
'r' to u64 again as well as the return type. This is OK since the
variable that stores the result is u64. We still avoid 64bit
division here since 'user' is u32.
Fixes: bea74641e378 ("netfilter: xt_hashlimit: add rate match mode")
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
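The type choices from the xt_hashlimit change above can be sketched in userspace like this. It loosely follows the v2 description (32-bit argument, 64-bit intermediate and return value); the function name, the exact formula and the shift constant are assumptions for illustration, not a copy of the kernel function.

```c
#include <stdint.h>

/* Assumed value: byte accounting in 16-byte steps. */
#define BYTE_SHIFT 4

static uint64_t user2rate_bytes_sketch(uint32_t user)
{
	uint64_t r;

	/* 32-bit by 32-bit division: no 64-bit division helper needed. */
	r = user ? UINT32_MAX / user : UINT32_MAX;

	/* The shift happens in 64 bits, so small 'user' cannot overflow. */
	r = (r - 1) << BYTE_SHIFT;
	return r;
}
```

For user = 2 the result no longer fits in 32 bits, which is the overflow the v2 note refers to when it widens 'r' and the return type back to u64.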
Zhizhou Tian [Fri, 8 Sep 2017 03:00:16 +0000 (11:00 +0800)]
netfilter: xt_hashlimit: alloc hashtable with right size
struct xt_byteslimit_htable used hlist_head, but memory allocation is
done through sizeof(struct list_head).
Signed-off-by: Zhizhou Tian <zhizhou.tian@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
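The size mismatch fixed above is easy to see once the two head types are written out. The sketch below redeclares them for userspace with the same shapes as the kernel's list types; the two helper functions are hypothetical.

```c
#include <stddef.h>

/* Same shapes as the kernel's list types, redeclared for userspace. */
struct list_head {
	struct list_head *next, *prev;
};

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Bytes needed for a bucket array of 'n' hlist heads. */
static size_t bucket_bytes(size_t n)
{
	return n * sizeof(struct hlist_head);
}

/* What the mismatched sizeof would have requested instead. */
static size_t bucket_bytes_mismatched(size_t n)
{
	return n * sizeof(struct list_head);
}
```

On common ABIs a list_head holds two pointers and an hlist_head one, so the mismatched sizeof asks for twice the memory the bucket array needs.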
Florian Westphal [Wed, 6 Sep 2017 12:47:57 +0000 (14:47 +0200)]
netfilter: core: remove erroneous warn_on
kernel test robot reported:
WARNING: CPU: 0 PID: 1244 at net/netfilter/core.c:218 __nf_hook_entries_try_shrink+0x49/0xcd
[..]
After allowing batching in nf_unregister_net_hooks it's possible that an earlier
call to __nf_hook_entries_try_shrink already compacted the list.
If this happens we don't need to do anything.
Fixes: d3ad2c17b4047 ("netfilter: core: batch nf_unregister_net_hooks synchronize_net calls")
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Aaron Conole <aconole@bytheb.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 6 Sep 2017 12:39:52 +0000 (14:39 +0200)]
netfilter: nat: use keyed locks
No need to serialize on a single lock; we can partition the table and
add/delete in parallel to different slots.
This restores one of the advantages that got lost with the rhlist
revert.
Cc: Ivan Babrou <ibobrik@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 6 Sep 2017 12:39:51 +0000 (14:39 +0200)]
netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable"
This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1.
It was not a good idea. The custom hash table was a much better
fit for this purpose.
A fast lookup is not essential, in fact for most cases there is no lookup
at all because original tuple is not taken and can be used as-is.
What needs to be fast is insertion and deletion.
rhlist removal however requires a rhlist walk.
We can have thousands of entries in such a list if source port/addresses
are reused for multiple flows, if this happens removal requests are so
expensive that deletions of a few thousand flows can take several
seconds(!).
The advantages that we got from rhashtable are:
1) table auto-sizing
2) multiple locks
1) would be nice to have, but it is not essential as we have at
most one lookup per new flow, so even a million flows in the bysource
table are not a problem compared to current deletion cost.
2) is easy to add to custom hash table.
I tried to add hlist_node to rhlist to speed up rhltable_remove but this
isn't doable without changing semantics. rhltable_remove_fast will
check that the to-be-deleted object is part of the table and that
requires a list walk that we want to avoid.
Furthermore, using hlist_node increases size of struct rhlist_head, which
in turn increases nf_conn size.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821
Reported-by: Ivan Babrou <ibobrik@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Fri, 1 Sep 2017 20:41:03 +0000 (22:41 +0200)]
netfilter: xtables: add scheduling opportunity in get_counters
There are reports about spurious softlockups during iptables-restore; a
backtrace I saw points at get_counters -- it uses a sequence lock and also
has an unbounded restart loop.
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Thu, 31 Aug 2017 11:45:24 +0000 (13:45 +0200)]
netfilter: nf_nat: don't bug when mapping already exists
It seems preferable to limp along if we have a conflicting mapping;
it's certainly better than a BUG().
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Sabrina Dubroca [Fri, 8 Sep 2017 08:26:19 +0000 (10:26 +0200)]
ipv6: fix memory leak with multiple tables during netns destruction
fib6_net_exit only frees the main and local tables. If another table was
created with fib6_alloc_table, we leak it when the netns is destroyed.
Fix this in the same way ip_fib_net_exit cleans up tables, by walking
through the whole hashtable of fib6_table's. We can get rid of the
special cases for local and main, since they're also part of the
hashtable.
Reproducer:
ip netns add x
ip -net x -6 rule add from 6003:1::/64 table 100
ip netns del x
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 58f09b78b730 ("[NETNS][IPV6] ip6_fib - make it per network namespace")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 20 Aug 2017 05:38:08 +0000 (13:38 +0800)]
netfilter: ipvs: do not create conn for ABORT packet in sctp_conn_schedule
There's no reason for ipvs to create a conn for an ABORT packet
even if sysctl_sloppy_sctp is set.
This patch is to accept it without creating a conn, just as ipvs
does for tcp's RST packet.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Xin Long [Sun, 20 Aug 2017 05:38:07 +0000 (13:38 +0800)]
netfilter: ipvs: fix the issue that sctp_conn_schedule drops non-INIT packet
Commit 5e26b1b3abce ("ipvs: support scheduling inverse and icmp SCTP
packets") changed to check packet type early. It introduced a side
effect: if it's not an INIT packet, ports will be set as NULL, and
the packet will be dropped later.
As a result, sctp couldn't create a connection when the ipvs module is
loaded and any scheduler is registered on the server.
Li Shuang reproduced it by running the cmds on sctp server:
# ipvsadm -A -t 1.1.1.1:80 -s rr
# ipvsadm -D -t 1.1.1.1:80
then the server couldn't work any more.
This patch is to return 1 when it's not an INIT packet. It means ipvs
will accept it without creating a conn for it, just like what it does
for tcp.
Fixes: 5e26b1b3abce ("ipvs: support scheduling inverse and icmp SCTP packets")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Ian W MORRISON [Wed, 30 Aug 2017 22:51:03 +0000 (08:51 +1000)]
brcmfmac: feature check for multi-scheduled scan fails on bcm4345 devices
The firmware feature check introduced for multi-scheduled scan is also
failing for bcm4345 devices resulting in a firmware crash.
The reason for this crash has not yet been root-caused, so this patch avoids
the feature check for those devices as a short-term fix.
Fixes: 9fe929aaace6 ("brcmfmac: add firmware feature detection for gscan feature")
Cc: <stable@vger.kernel.org> # v4.13
Signed-off-by: Ian W MORRISON <ianwmorrison@gmail.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Håkon Bugge [Wed, 6 Sep 2017 16:35:51 +0000 (18:35 +0200)]
rds: Fix incorrect statistics counting
In rds_send_xmit() there is logic to batch the sends. However, if
another thread has acquired the lock and has incremented the send_gen,
it is considered a race and we yield. The code incrementing the
s_send_lock_queue_raced statistics counter did not count this event
correctly.
This commit counts the race condition correctly.
Changes from v1:
- Removed check for *someone_on_xmit()*
- Fixed incorrect indentation
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 6 Sep 2017 13:38:58 +0000 (15:38 +0200)]
isdn: isdnloop: fix logic error in isdnloop_sendbuf
gcc-7 found an ancient bug in the loop driver, leading to a condition that
is always false, meaning we ignore the contents of 'card->flags' here:
drivers/isdn/isdnloop/isdnloop.c:412:37: error: ?: using integer constants in boolean context, the expression will always evaluate to 'true' [-Werror=int-in-bool-context]
This changes the parentheses in the expression to ensure we actually
compare the flag bits, rather than comparing a constant. As Joe Perches
pointed out, an earlier patch of mine incorrectly assumed this was a
false-positive warning.
Cc: Joe Perches <joe@perches.com>
Link: https://patchwork.kernel.org/patch/9840289/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
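The precedence problem behind the isdnloop fix above can be reproduced in a few lines. This is a sketch with hypothetical flag values and function names; the "old" variant spells out how the original expression parsed instead of repeating it verbatim.

```c
/* Hypothetical flag values for illustration only. */
#define FLAG_B1ACTIVE 1u
#define FLAG_B2ACTIVE 2u

/*
 * How the old expression parsed: '&' binds tighter than '?:', so the
 * mask test merely picks between two nonzero constants and the result
 * is nonzero regardless of 'flags'.
 */
static int channel_active_old(unsigned int flags, unsigned int channel)
{
	unsigned int picked = (flags & channel) ? FLAG_B2ACTIVE : FLAG_B1ACTIVE;

	return picked != 0;
}

/* Intended meaning: choose the flag for the channel, then test it. */
static int channel_active_fixed(unsigned int flags, unsigned int channel)
{
	unsigned int want = channel ? FLAG_B2ACTIVE : FLAG_B1ACTIVE;

	return (flags & want) != 0;
}
```

The old form reports "active" even with no flag bits set, so a negated check on it can never fire, which is the always-false condition the message describes.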
Paolo Abeni [Wed, 6 Sep 2017 12:44:36 +0000 (14:44 +0200)]
udp: drop head states only when all skb references are gone
After commit 0ddf3fb2c43d ("udp: preserve skb->dst if required
for IP options processing") we clear the skb head state as soon
as the skb carrying them is first processed.
Since the same skb can be processed several times when MSG_PEEK
is used, we can end up lacking the required head states, and
eventually oopsing.
Fix this by clearing the skb head state only when processing the
last skb reference.
Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 5 Sep 2017 09:26:33 +0000 (17:26 +0800)]
ip6_gre: update mtu properly in ip6gre_err
Now when processing ICMPV6_PKT_TOOBIG, ip6gre_err only subtracts the
offset of gre header from mtu info. The expected mtu of gre device
should also subtract gre header. Otherwise, the next packets still
can't be sent out.
Jianlin found this issue when using the topo:
client(ip6gre)<---->(nic1)route(nic2)<----->(ip6gre)server
and reducing nic2's mtu, then both tcp and sctp's performance with
big size data became 0.
This patch is to fix it by also subtracting grehdr (tun->tun_hlen)
from mtu info when updating gre device's mtu in ip6gre_err(). It
also needs to subtract ETH_HLEN if the gre dev's type is ARPHRD_ETHER.
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
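The MTU arithmetic from the ip6_gre fix above, written out as a small userspace function. The function and parameter names are illustrative, and the sketch does not guard against underflow or enforce any minimum MTU.

```c
#include <stdint.h>

#define ETH_HLEN	14	/* Ethernet header length in bytes */
#define ARPHRD_ETHER	1	/* Ethernet device type */

/*
 * 'info' is the MTU reported in the ICMPv6 Packet Too Big message,
 * 'offset' the offset of the GRE header within the offending packet,
 * 'tun_hlen' the GRE header length. Parameter names are illustrative.
 */
static uint32_t gre_dev_mtu(uint32_t info, uint32_t offset,
			    uint32_t tun_hlen, unsigned int dev_type)
{
	uint32_t mtu = info - offset - tun_hlen;

	if (dev_type == ARPHRD_ETHER)
		mtu -= ETH_HLEN;
	return mtu;
}
```

For example, with a reported MTU of 1400, an offset of 40 and a 4-byte GRE header, the device MTU becomes 1356, or 1342 when the device carries Ethernet frames.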
Jiri Pirko [Wed, 6 Sep 2017 11:14:19 +0000 (13:14 +0200)]
net: sched: fix memleak for chain zero
There's a memleak happening for chain 0. The thing is, chain 0 needs to
be always present, not created on demand. Therefore tcf_block_get upon
creation of block calls the tcf_chain_create function directly. The
chain is created with refcnt == 1, which is not correct in this case and
causes the memleak. So move the refcnt increment into tcf_chain_get
function even for the case when chain needs to be created.
Reported-by: Jakub Kicinski <kubakici@wp.pl>
Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
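A toy model of the reference counting described in the chain zero fix above. All names are hypothetical and the block's chain list is reduced to a single slot; the point is only that creation takes no reference and the get path is the sole place that does.

```c
#include <stdlib.h>

struct chain {
	unsigned int index;
	unsigned int refcnt;
};

/* After the fix: creation no longer takes a reference by itself. */
static struct chain *chain_create(unsigned int index)
{
	struct chain *c = calloc(1, sizeof(*c));

	if (c)
		c->index = index;	/* refcnt stays 0 */
	return c;
}

/* Lookup-or-create path: the only place that takes a reference.
 * 'slot' stands in for the block's chain list. */
static struct chain *chain_get(struct chain **slot, unsigned int index)
{
	if (!*slot)
		*slot = chain_create(index);
	if (*slot)
		(*slot)->refcnt++;
	return *slot;
}

/* Drops a reference; returns 1 when no references remain. */
static int chain_put(struct chain *c)
{
	return --c->refcnt == 0;
}
```

In this model a chain created directly at block setup, as chain 0 is, starts with no references, so nothing is left over that would keep it from being freed at teardown.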
David S. Miller [Thu, 7 Sep 2017 16:40:58 +0000 (09:40 -0700)]
Merge tag 'mac80211-for-davem-2017-09-07' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Back from a long absence, so we have a number of things:
* a remain-on-channel fix from Avi
* hwsim TX power fix from Beni
* null-PTR dereference with iTXQ in some rare configurations (Chunho)
* 40 MHz custom regdomain fixes (Emmanuel)
* look at right place in HT/VHT capability parsing (Igor)
* complete A-MPDU teardown properly (Ilan)
* Mesh ID Element ordering fix (Liad)
* avoid tracing warning in ht_dbg() (Sharon)
* fix print of assoc/reassoc (Simon)
* fix encrypted VLAN with iTXQ (myself)
* fix calling context of TX queue wake (myself)
* fix a deadlock with ath10k aggregation (myself)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Luca Coelho [Thu, 7 Sep 2017 07:51:52 +0000 (10:51 +0300)]
iwlwifi: mvm: only send LEDS_CMD when the FW supports it
The LEDS_CMD command is only supported in some newer FW versions
(e.g. iwlwifi-8000C-31.ucode), so we can't send it to older versions
(such as iwlwifi-8000C-27.ucode).
To fix this, check for a new bit in the FW capabilities TLV that tells
when the command is supported.
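The check described above amounts to testing one bit in a capability bitmap
before sending the command. A hedged sketch follows; the bit position, the
function names and the send callback are all hypothetical and do not
reflect the real iwlwifi TLV layout.

```python
LED_CMD_CAPA_BIT = 5   # hypothetical bit position, for illustration only


def fw_has_capa(capa_bitmap, bit):
    """Return True if the firmware advertises the given capability bit."""
    return bool((capa_bitmap >> bit) & 1)


def send_led_cmd(capa_bitmap, led_on, send):
    """Send the LED command only when the firmware says it supports it.

    send is a caller-supplied callable standing in for the real command
    path. Returns True if the command was sent.
    """
    if not fw_has_capa(capa_bitmap, LED_CMD_CAPA_BIT):
        return False           # older firmware: skip the command
    send(("LEDS_CMD", led_on))
    return True
```

Firmware that lacks the bit never sees the command, which is the behaviour
the commit wants for older images.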
Note that the current version of -31.ucode in linux-firmware.git
(31.532993.0) does not have this capability bit set, so the LED won't
work, even though this version should support it. But we will update
this firmware soon, so it won't be a problem anymore.
Fixes: 7089ae634c50 ("iwlwifi: mvm: use firmware LED command where applicable")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Larry Finger [Mon, 4 Sep 2017 17:51:34 +0000 (12:51 -0500)]
rtlwifi: btcoexist: Fix antenna selection code
In commit 87d8a9f35202 ("rtlwifi: btcoex: call bind to setup btcoex"),
the code turns on a call to exhalbtc_bind_bt_coex_withadapter(). This
routine contains a bug that causes incorrect antenna selection for those
HP laptops with only one antenna and an incorrectly programmed EFUSE.
These boxes are the ones that need the ant_sel module parameter.
Fixes: 87d8a9f35202 ("rtlwifi: btcoex: call bind to setup btcoex")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Cc: Yan-Hsuan Chuang <yhchuang@realtek.com>
Cc: Birming Chiu <birming@realtek.com>
Cc: Shaofu <shaofu@realtek.com>
Cc: Steven Ting <steventing@realtek.com>
Cc: Stable <stable@vger.kernel.org> # 4.13+
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Larry Finger [Mon, 4 Sep 2017 17:51:33 +0000 (12:51 -0500)]
rtlwifi: btcoexist: Fix breakage of ant_sel for rtl8723be
In commit bcd37f4a0831 ("rtlwifi: btcoex: 23b 2ant: let bt transmit when
hw initialisation done"), there is an additional error when the module
parameter ant_sel is used to select the auxiliary antenna. The error is
that the antenna selection is not checked when writing the antenna
selection register.
Fixes: bcd37f4a0831 ("rtlwifi: btcoex: 23b 2ant: let bt transmit when hw initialisation done")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Ping-Ke Shih <pkshih@realtek.com>
Cc: Yan-Hsuan Chuang <yhchuang@realtek.com>
Cc: Birming Chiu <birming@realtek.com>
Cc: Shaofu <shaofu@realtek.com>
Cc: Steven Ting <steventing@realtek.com>
Cc: Stable <stable@vger.kernel.org> # 4.12+
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Kleber Sacilotto de Souza [Wed, 6 Sep 2017 09:08:06 +0000 (11:08 +0200)]
tipc: remove unnecessary call to dev_net()
The net device is already stored in the 'net' variable, so no need to call
dev_net() again.
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 6 Sep 2017 03:53:29 +0000 (11:53 +0800)]
netlink: access nlk groups safely in netlink bind and getname
Currently no lock protects accesses to nlk ngroups/groups in netlink
bind and getname. These accesses are safe against the setting of nlk
groups in netlink_release, but not against netlink_realloc_groups called
by netlink_setsockopt.
netlink_lock_table is needed in both netlink bind and getname when
accessing nlk groups.
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 6 Sep 2017 03:47:12 +0000 (11:47 +0800)]
netlink: fix an use-after-free issue for nlk groups
ChunYu found a netlink use-after-free issue by syzkaller:
[28448.842981] BUG: KASAN: use-after-free in __nla_put+0x37/0x40 at addr ffff8807185e2378
[28448.969918] Call Trace:
[...]
[28449.117207] __nla_put+0x37/0x40
[28449.132027] nla_put+0xf5/0x130
[28449.146261] sk_diag_fill.isra.4.constprop.5+0x5a0/0x750 [netlink_diag]
[28449.176608] __netlink_diag_dump+0x25a/0x700 [netlink_diag]
[28449.202215] netlink_diag_dump+0x176/0x240 [netlink_diag]
[28449.226834] netlink_dump+0x488/0xbb0
[28449.298014] __netlink_dump_start+0x4e8/0x760
[28449.317924] netlink_diag_handler_dump+0x261/0x340 [netlink_diag]
[28449.413414] sock_diag_rcv_msg+0x207/0x390
[28449.432409] netlink_rcv_skb+0x149/0x380
[28449.467647] sock_diag_rcv+0x2d/0x40
[28449.484362] netlink_unicast+0x562/0x7b0
[28449.564790] netlink_sendmsg+0xaa8/0xe60
[28449.661510] sock_sendmsg+0xcf/0x110
[28449.865631] __sys_sendmsg+0xf3/0x240
[28450.000964] SyS_sendmsg+0x32/0x50
[28450.016969] do_syscall_64+0x25c/0x6c0
[28450.154439] entry_SYSCALL64_slow_path+0x25/0x25
It was caused by the lack of protection between the freeing of nlk groups in
netlink_release and the access to nlk groups in sk_diag_dump_groups. A similar
issue also exists in netlink_seq_show().
This patch defers the freeing of nlk groups to deferred_put_nlk_sk.
Reported-by: ChunYu Wang <chunwang@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gao Feng [Mon, 4 Sep 2017 06:21:12 +0000 (14:21 +0800)]
sched: Use __qdisc_drop instead of kfree_skb in sch_prio and sch_qfq
Commit 520ac30f4551 ("net_sched: drop packets after root qdisc lock
is released") made a big change to tc for performance. Two places
in sch_prio and sch_qfq were not converted by that commit. Convert
them now to use __qdisc_drop.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Sun, 3 Sep 2017 14:32:16 +0000 (17:32 +0300)]
dt-binding: phy: don't confuse with Ethernet phy properties
The generic PHY 'phys' property sometimes appears in the same node with
the Ethernet PHY 'phy' or 'phy-handle' properties. Add a warning in
phy-bindings.txt to reduce confusion.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 6 Sep 2017 22:17:17 +0000 (15:17 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.14:
API:
- Defer scompress scratch buffer allocation to first use.
- Add __crypto_xor that takes separate src and dst operands.
- Add ahash multiple registration interface.
- Revamped aead/skcipher algif code to fix async IO properly.
Drivers:
- Add non-SIMD fallback code path on ARM for SVE.
- Add AMD Security Processor framework for ccp.
- Add support for RSA in ccp.
- Add XTS-AES-256 support for CCP version 5.
- Add support for PRNG in sun4i-ss.
- Add support for DPAA2 in caam.
- Add ARTPEC crypto support.
- Add Freescale RNGC hwrng support.
- Add Microchip / Atmel ECC driver.
- Add support for STM32 HASH module"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits)
crypto: af_alg - get_page upon reassignment to TX SGL
crypto: cavium/nitrox - Fix an error handling path in 'nitrox_probe()'
crypto: inside-secure - fix an error handling path in safexcel_probe()
crypto: rockchip - Don't dequeue the request when device is busy
crypto: cavium - add release_firmware to all return case
crypto: sahara - constify platform_device_id
MAINTAINERS: Add ARTPEC crypto maintainer
crypto: axis - add ARTPEC-6/7 crypto accelerator driver
crypto: hash - add crypto_(un)register_ahashes()
dt-bindings: crypto: add ARTPEC crypto
crypto: algif_aead - fix comment regarding memory layout
crypto: ccp - use dma_mapping_error to check map error
lib/mpi: fix build with clang
crypto: sahara - Remove leftover from previous used spinlock
crypto: sahara - Fix dma unmap direction
crypto: af_alg - consolidation of duplicate code
crypto: caam - Remove unused dentry members
crypto: ccp - select CONFIG_CRYPTO_RSA
crypto: ccp - avoid uninitialized variable warning
crypto: serpent - improve __serpent_setkey with UBSAN
...
Linus Torvalds [Wed, 6 Sep 2017 21:45:08 +0000 (14:45 -0700)]
Merge git://git./linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Support ipv6 checksum offload in sunvnet driver, from Shannon
Nelson.
2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric
Dumazet.
3) Allow generic XDP to work on virtual devices, from John Fastabend.
4) Add bpf device maps and XDP_REDIRECT, which can be used to build
arbitrary switching frameworks using XDP. From John Fastabend.
5) Remove UFO offloads from the tree, gave us little other than bugs.
6) Remove the IPSEC flow cache, from Florian Westphal.
7) Support ipv6 route offload in mlxsw driver.
8) Support VF representors in bnxt_en, from Sathya Perla.
9) Add support for forward error correction modes to ethtool, from
Vidya Sagar Ravipati.
10) Add time filter for packet scheduler action dumping, from Jamal Hadi
Salim.
11) Extend the zerocopy sendmsg() used by virtio and tap to regular
sockets via MSG_ZEROCOPY. From Willem de Bruijn.
12) Significantly rework value tracking in the BPF verifier, from Edward
Cree.
13) Add new jump instructions to eBPF, from Daniel Borkmann.
14) Rework rtnetlink plumbing so that operations can be run without
taking the RTNL semaphore. From Florian Westphal.
15) Support XDP in tap driver, from Jason Wang.
16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal.
17) Add Huawei hinic ethernet driver.
18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan
Delalande.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits)
i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
i40e: avoid NVM acquire deadlock during NVM update
drivers: net: xgene: Remove return statement from void function
drivers: net: xgene: Configure tx/rx delay for ACPI
drivers: net: xgene: Read tx/rx delay for ACPI
rocker: fix kcalloc parameter order
rds: Fix non-atomic operation on shared flag variable
net: sched: don't use GFP_KERNEL under spin lock
vhost_net: correctly check tx avail during rx busy polling
net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
rxrpc: Make service connection lookup always check for retry
net: stmmac: Delete dead code for MDIO registration
gianfar: Fix Tx flow control deactivation
cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
cxgb4: Fix pause frame count in t4_get_port_stats
cxgb4: fix memory leak
tun: rename generic_xdp to skb_xdp
tun: reserve extra headroom only when XDP is set
net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
net: dsa: bcm_sf2: Advertise number of egress queues
...
Linus Torvalds [Wed, 6 Sep 2017 21:11:03 +0000 (14:11 -0700)]
Merge tag 'wberr-v4.14-1' of git://git./linux/kernel/git/jlayton/linux
Pull writeback error handling updates from Jeff Layton:
"This pile continues the work from last cycle on better tracking
writeback errors. In v4.13 we added some basic errseq_t infrastructure
and converted a few filesystems to use it.
This set continues refining that infrastructure, adds documentation,
and converts most of the other filesystems to use it. The main
exception at this point is the NFS client"
* tag 'wberr-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
ecryptfs: convert to file_write_and_wait in ->fsync
mm: remove optimizations based on i_size in mapping writeback waits
fs: convert a pile of fsync routines to errseq_t based reporting
gfs2: convert to errseq_t based writeback error reporting for fsync
fs: convert sync_file_range to use errseq_t based error-tracking
mm: add file_fdatawait_range and file_write_and_wait
fuse: convert to errseq_t based error tracking for fsync
mm: consolidate dax / non-dax checks for writeback
Documentation: add some docs for errseq_t
errseq: rename __errseq_set to errseq_set
Linus Torvalds [Wed, 6 Sep 2017 20:43:26 +0000 (13:43 -0700)]
Merge tag 'locks-v4.14-1' of git://git./linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"This pile just has a few file locking fixes from Ben Coddington. There
are a couple of cleanup patches + an attempt to bring sanity to the
l_pid value that is reported back to userland on an F_GETLK request.
After a few gyrations, he came up with a way for filesystems to
communicate to the VFS layer code whether the pid should be translated
according to the namespace or presented as-is to userland"
* tag 'locks-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
locks: restore a warn for leaked locks on close
fs/locks: Remove fl_nspid and use fs-specific l_pid for remote locks
fs/locks: Use allocation rather than the stack in fcntl_getlk()
Linus Torvalds [Wed, 6 Sep 2017 20:39:23 +0000 (13:39 -0700)]
Merge tag 'dlm-4.14' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This set includes a bunch of minor code cleanups that have
accumulated, probably from code analyzers people like to run. There is
one nice fix that avoids some socket leaks by switching to use
sock_create_lite()"
* tag 'dlm-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: use sock_create_lite inside tcp_accept_from_sock
uapi linux/dlm_netlink.h: include linux/dlmconstants.h
dlm: avoid double-free on error path in dlm_device_{register,unregister}
dlm: constify kset_uevent_ops structure
dlm: print log message when cluster name is not set
dlm: Delete an unnecessary variable initialisation in dlm_ls_start()
dlm: Improve a size determination in two functions
dlm: Use kcalloc() in two functions
dlm: Use kmalloc_array() in make_member_array()
dlm: Delete an error message for a failed memory allocation in dlm_recover_waiters_pre()
dlm: Improve a size determination in dlm_recover_waiters_pre()
dlm: Use kcalloc() in dlm_scan_waiters()
dlm: Improve a size determination in table_seq_start()
dlm: Add spaces for better code readability
dlm: Replace six seq_puts() calls by seq_putc()
dlm: Make dismatch error message more clear
dlm: Fix kernel memory disclosure
Linus Torvalds [Wed, 6 Sep 2017 19:59:41 +0000 (12:59 -0700)]
Merge tag 'ext4_for_linus' of git://git./linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"Scalability improvements when allocating inodes, and some
miscellaneous bug fixes and cleanups"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: avoid Y2038 overflow in recently_deleted()
ext4: fix fault handling when mounted with -o dax,ro
ext4: fix quota inconsistency during orphan cleanup for read-only mounts
ext4: fix incorrect quotaoff if the quota feature is enabled
ext4: remove useless test and assignment in strtohash functions
ext4: backward compatibility support for Lustre ea_inode implementation
ext4: remove timebomb in ext4_decode_extra_time()
ext4: use sizeof(*ptr)
ext4: in ext4_seek_{hole,data}, return -ENXIO for negative offsets
ext4: reduce lock contention in __ext4_new_inode
ext4: cleanup goto next group
ext4: do not unnecessarily allocate buffer in recently_deleted()
Linus Torvalds [Wed, 6 Sep 2017 19:19:23 +0000 (12:19 -0700)]
Merge tag 'xfs-4.14-merge-7' of git://git./fs/xfs/xfs-linux
Pull XFS updates from Darrick Wong:
"Here are the changes for xfs for 4.14. Most of these are cleanups and
fixes for bad behavior, as we're mostly focusing on improving
reliability this cycle (read: there's potentially a lot of stuff on the
horizon for 4.15 so better to spend a few weeks killing other bugs
now).
Summary:
- Write unmount record for a ro mount to avoid unnecessary log replay
- Clean up orphaned inodes when mounting fs readonly
- Resubmit inode log items when buffer writeback fails to avoid
umount hang
- Fix log recovery corruption problems when log headers wrap around
the end
- Avoid infinite loop searching for free inodes when inode counters
are wrong
- Evict inodes involved with log redo so that we don't leak them
later
- Fix a potential race between reclaim and inode cluster freeing
- Refactor the inode joining code w.r.t. transaction rolling &
deferred ops
- Fix a bug where the log doesn't properly deal with dirty buffers
that are about to become ordered buffers
- Fix the extent swap code to deal with making dirty buffers ordered
properly
- Consolidate page fault handlers
- Refactor the incore extent manipulation functions to use the iext
abstractions instead of directly modifying with extent data
- Disable crashy chattr +/-x until we fix it
- Don't allow us to set S_DAX for v2 inodes
- Various cleanups
- Clarify some documentation
- Fix a problem where fsync and a log commit race to send the disk a
flush command, resulting in a small window where power fail data
loss could occur
- Simplify some rmap operations in the fcollapse code
- Fix some use-after-free problems in async writeback"
* tag 'xfs-4.14-merge-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (44 commits)
xfs: use kmem_free to free return value of kmem_zalloc
xfs: open code end_buffer_async_write in xfs_finish_page_writeback
xfs: don't set v3 xflags for v2 inodes
xfs: fix compiler warnings
fsmap: fix documentation of FMR_OF_LAST
xfs: simplify the rmap code in xfs_bmse_merge
xfs: remove unused flags arg from xfs_file_iomap_begin_delay
xfs: fix incorrect log_flushed on fsync
xfs: disable per-inode DAX flag
xfs: replace xfs_qm_get_rtblks with a direct call to xfs_bmap_count_leaves
xfs: rewrite xfs_bmap_count_leaves using xfs_iext_get_extent
xfs: use xfs_iext_*_extent helpers in xfs_bmap_split_extent_at
xfs: use xfs_iext_*_extent helpers in xfs_bmap_shift_extents
xfs: move some code around inside xfs_bmap_shift_extents
xfs: use xfs_iext_get_extent in xfs_bmap_first_unused
xfs: switch xfs_bmap_local_to_extents to use xfs_iext_insert
xfs: add a xfs_iext_update_extent helper
xfs: consolidate the various page fault handlers
iomap: return VM_FAULT_* codes from iomap_page_mkwrite
xfs: relog dirty buffers during swapext bmbt owner change
...
Linus Torvalds [Wed, 6 Sep 2017 18:42:31 +0000 (11:42 -0700)]
Merge tag 'gfs2-4.14.fixes' of git://git./linux/kernel/git/gfs2/linux-gfs2
Pull GFS2 updates from Bob Peterson:
"We've got a whopping 29 GFS2 patches for this merge window, mainly
because we held some back from the previous merge window until we
could get them perfected and well tested. We have a couple patch sets,
including my patch set for protecting glock gl_object and Andreas
Gruenbacher's patch set to fix the long-standing shrink-slab hang,
plus a bunch of assorted bugs and cleanups.
Summary:
- I fixed a bug whereby an IO error would lead to a double-brelse.
- Andreas Gruenbacher made a minor cleanup to call his relatively new
function, gfs2_holder_initialized, rather than doing it manually.
This was just missed by a previous patch set.
- Jan Kara fixed a bug whereby the SGID was being cleared when
inheriting ACLs.
- Andreas found a bug and fixed it in his previous patch, "Get rid of
flush_delayed_work in gfs2_evict_inode". A call to
flush_delayed_work was deleted from gfs2_inode_lookup and added to
gfs2_create_inode.
- Wang Xibo found and fixed a list_add call in inode_go_lock that
specified the parameters in the wrong order.
- Coly Li submitted a patch to add the REQ_PRIO to some of GFS2's
metadata reads that were accidentally missing them.
- I submitted a 4-patch set to protect the glock gl_object field.
GFS2 was setting and checking gl_object with no locking mechanism,
so the value was occasionally stomped on, which caused file system
corruption.
- I submitted a small cleanup to function gfs2_clear_rgrpd. It was
needlessly adding rgrp glocks to the lru list, then pulling them
back off immediately. The rgrp glocks don't use the lru list
anyway, so doing so was just a waste of time.
- I submitted a patch that checks the GLOF_LRU flag on a glock before
trying to remove it from the lru_list. This avoids a lot of
unnecessary spin_lock contention.
- I submitted a patch to delete GFS2's debugfs files only after we
evict all the glocks. Before this patch, GFS2 would delete the
debugfs files, and if unmount hung waiting for a glock, there was
no way to debug the problem. Now, if a hang occurs during umount,
we can examine the debugfs files to figure out why it's hung.
- Andreas Gruenbacher submitted a patch to fix some trivial typos.
- Andreas also submitted a five-part patch set to fix the
longstanding hang involving the slab shrinker: dlm requires memory,
calls the inode shrinker, which calls gfs2's evict, which calls
back into DLM before it can evict an inode.
- Abhi Das submitted a patch to forcibly flush the active items list
to relieve memory pressure. This fixes a long-standing bug whereby
GFS2 was getting hung permanently in balance_dirty_pages.
- Thomas Tai submitted a patch to fix a slab corruption problem due
to a residual pointer left in the lock_dlm lockstruct.
- I submitted a patch to withdraw the file system if IO errors are
encountered while writing to the journals or statfs system file
which were previously not being sent back up. Before, some IO
errors were sometimes not detected for several hours, and at
recovery time, the journal errors made journal replay impossible.
- Andreas has a patch to fix an annoying format-truncation compiler
warning so GFS2 compiles cleanly.
- I have a patch that fixes a handful of sparse compiler warnings.
- Andreas fixed up a useless gl_object warning caused by an earlier
patch.
- Arvind Yadav added a patch to properly constify our rhashtable
params declaration.
- I added a patch to fix a regression caused by the non-recursive
delete and truncate patch that caused file system blocks to not be
properly freed.
- Ernesto A. Fernández added a patch to fix a place where GFS2 would
send back the wrong return code setting extended attributes.
- Ernesto also added a patch to fix a case in which GFS2 was
improperly setting an inode's i_mode, potentially granting access
to the wrong users"
* tag 'gfs2-4.14.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (29 commits)
gfs2: preserve i_mode if __gfs2_set_acl() fails
gfs2: don't return ENODATA in __gfs2_xattr_set unless replacing
GFS2: Fix non-recursive truncate bug
gfs2: constify rhashtable_params
GFS2: Fix gl_object warnings
GFS2: Fix up some sparse warnings
gfs2: Silence gcc format-truncation warning
GFS2: Withdraw for IO errors writing to the journal or statfs
gfs2: fix slab corruption during mounting and umounting gfs file system
gfs2: forcibly flush ail to relieve memory pressure
gfs2: Clean up waiting on glocks
gfs2: Defer deleting inodes under memory pressure
gfs2: gfs2_evict_inode: Put glocks asynchronously
gfs2: Get rid of gfs2_set_nlink
gfs2: gfs2_glock_get: Wait on freeing glocks
gfs2: Fix trivial typos
GFS2: Delete debugfs files only after we evict the glocks
GFS2: Don't waste time locking lru_lock for non-lru glocks
GFS2: Don't bother trying to add rgrps to the lru list
GFS2: Clear gl_object when deleting an inode in gfs2_delete_inode
...
Johannes Berg [Wed, 6 Sep 2017 13:01:42 +0000 (15:01 +0200)]
mac80211: fix deadlock in driver-managed RX BA session start
When an RX BA session is started by the driver, and it has to tell
mac80211 about it, the corresponding bit in tid_rx_manage_offl gets
set and the BA session work is scheduled. Upon testing this bit, it
will call __ieee80211_start_rx_ba_session(), thus deadlocking as it
already holds the ampdu_mlme.mtx, which that function acquires again.
Fix this by adding ___ieee80211_start_rx_ba_session(), a version of
the function that requires the mutex already held.
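The pattern described above (a public entry point that takes the mutex and an
inner variant that expects it to be held already) can be modelled in
userspace as follows. This is a sketch with hypothetical names, not the
mac80211 code; threading.Lock stands in for the mutex because, like it, it
cannot be taken twice by the same caller.

```python
import threading


class Sta:
    def __init__(self):
        self.mtx = threading.Lock()   # non-reentrant, like a kernel mutex
        self.rx_sessions = set()


def start_rx_ba_session_locked(sta, tid):
    """Inner variant: the caller must already hold sta.mtx."""
    assert sta.mtx.locked()
    sta.rx_sessions.add(tid)


def start_rx_ba_session(sta, tid):
    """Outer variant: takes the mutex, then calls the inner variant."""
    with sta.mtx:
        start_rx_ba_session_locked(sta, tid)


def ba_session_work(sta, pending_tids):
    """Work item that already runs under the mutex.

    Calling start_rx_ba_session() from here would block forever on the
    second acquire, so it uses the inner variant instead.
    """
    with sta.mtx:
        for tid in pending_tids:
            start_rx_ba_session_locked(sta, tid)
```

Callers that do not hold the mutex keep using the outer function, so the
locking rules stay visible in the function names.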
Cc: stable@vger.kernel.org
Fixes: 699cb58c8a52 ("mac80211: manage RX BA session offload without SKB queue")
Reported-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ilan Peer [Wed, 6 Sep 2017 14:32:40 +0000 (17:32 +0300)]
mac80211: Complete ampdu work schedule during session tear down
Commit 7a7c0a6438b8 ("mac80211: fix TX aggregation start/stop callback race")
added a cancellation of the ampdu work after the loop that stopped the
Tx and Rx BA sessions. However, in some cases, e.g., during HW reconfig,
the low level driver might call mac80211 APIs to complete the stopping
of the BA sessions, which would queue the ampdu work to handle the actual
completion. This work needs to be performed as otherwise mac80211 data
structures would not be properly synced.
Fix this by checking if BA session STOP_CB bit is set after the BA session
cancellation and properly clean the session.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
[Johannes: the work isn't flushed because that could do other things we
don't want, and the locking situation isn't clear]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Emmanuel Grumbach [Wed, 6 Sep 2017 10:45:40 +0000 (13:45 +0300)]
cfg80211: honor NL80211_RRF_NO_HT40{MINUS,PLUS}
Honor the NL80211_RRF_NO_HT40{MINUS,PLUS} flags in
reg_process_ht_flags_channel. Not doing so can lead
to a firmware assert in iwlwifi, for example.
Fixes: b0d7aa59592b ("cfg80211: allow wiphy specific regdomain management")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Wed, 6 Sep 2017 03:03:40 +0000 (20:03 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2017-09-05
This series contains fixes for i40e only.
These two patches fix an issue where our nvmupdate tool does not work on RHEL 7.4
and newer kernels. In fact, using the nvmupdate tool on newer kernels can
leave the cards non-functional unless these patches are applied.
Anjali reworks the locking around NVM access so that NVM acquire timeouts,
which were causing the failed firmware updates, no longer occur.
Jake correctly updates the wb_desc when reading the NVM through the AdminQ.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 6 Sep 2017 03:03:35 +0000 (20:03 -0700)]
Merge git://git./linux/kernel/git/davem/net
Jacob Keller [Fri, 1 Sep 2017 20:43:08 +0000 (13:43 -0700)]
i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq
When introducing the functions to read the NVM through the AdminQ, we
did not correctly mark the wb_desc.
Fixes: 7073f46e443e ("i40e: Add AQ commands for NVM Update for X722", 2015-06-05)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Fri, 1 Sep 2017 20:42:49 +0000 (13:42 -0700)]
i40e: avoid NVM acquire deadlock during NVM update
X722 devices use the AdminQ to access the NVM, and this requires taking
the AdminQ lock. Because of this, we lock the AdminQ during
i40e_read_nvm(), which is also called in places where the lock is
already held, such as the firmware update path which wants to lock once
and then unlock when finished after performing several tasks.
Although this should have only affected X722 devices, commit
96a39aed25e6 ("i40e: Acquire NVM lock before reads on all devices",
2016-12-02) added locking for all NVM reads, regardless of device
family.
This resulted in us accidentally causing NVM acquire timeouts on all
devices, causing failed firmware updates which left the eeprom in
a corrupt state.
Create unsafe non-locked variants of i40e_read_nvm_word and
i40e_read_nvm_buffer, __i40e_read_nvm_word and __i40e_read_nvm_buffer
respectively. These variants will not take the NVM lock and are expected
to only be called in places where the NVM lock is already held if
needed.
Since the only caller of i40e_read_nvm_buffer() was in such a path,
remove it entirely in favor of the unsafe version. If necessary we can
always add it back in the future.
Additionally, we now need to hold the NVM lock in i40e_validate_checksum
because the call to i40e_calc_nvm_checksum now assumes that the NVM lock
is held. We can further move the call to read I40E_SR_SW_CHECKSUM_WORD
up a bit so that we do not need to acquire the NVM lock twice.
This should resolve firmware updates and also fix a potential race that
could have caused the driver to report an invalid NVM checksum upon
driver load.
Reported-by: Stefan Assmann <sassmann@kpanic.de>
Fixes: 96a39aed25e6 ("i40e: Acquire NVM lock before reads on all devices", 2016-12-02)
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 5 Sep 2017 21:58:25 +0000 (14:58 -0700)]
Merge branch 'xgene-Misc-bug-fixes'
Iyappan Subramanian says:
====================
drivers: net: xgene: Misc bug fixes
This patch set fixes bugs in the handling of the ACPI case when
reading and programming tx/rx delay values.
====================
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Tue, 5 Sep 2017 18:16:32 +0000 (11:16 -0700)]
drivers: net: xgene: Remove return statement from void function
Commit 183db4 ("drivers: net: xgene: Correct probe sequence handling")
changed the return type of xgene_enet_check_phy_handle() to void.
This patch removes the return statement from the last line.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Quan Nguyen [Tue, 5 Sep 2017 18:16:31 +0000 (11:16 -0700)]
drivers: net: xgene: Configure tx/rx delay for ACPI
This patch fixes configuring tx/rx delay values for ACPI.
Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Tue, 5 Sep 2017 18:16:30 +0000 (11:16 -0700)]
drivers: net: xgene: Read tx/rx delay for ACPI
This patch fixes reading tx/rx delay values for ACPI.
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Quan Nguyen <qnguyen@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zahari Doychev [Tue, 5 Sep 2017 19:49:58 +0000 (21:49 +0200)]
rocker: fix kcalloc parameter order
The function calls to kcalloc use wrong parameter order and incorrect flags
values. GFP_KERNEL is used instead of flags now and the order is corrected.
The change was done using the following coccinelle script:
@@
expression E1,E2;
type T;
@@
-kcalloc(E1, E2, sizeof(T))
+kcalloc(E2, sizeof(T), GFP_KERNEL)
Signed-off-by: Zahari Doychev <zahari.doychev@linux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Håkon Bugge [Tue, 5 Sep 2017 15:42:01 +0000 (17:42 +0200)]
rds: Fix non-atomic operation on shared flag variable
The bits in m_flags in struct rds_message are used for a plurality of
reasons, and from different contexts. To avoid any missing updates to
m_flags, use the atomic set_bit() instead of the non-atomic equivalent.
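To illustrate why a plain read-modify-write on a shared flags word can lose
updates, here is a small userspace model. The first function replays one
unlucky interleaving by hand; the class below it stands in for an atomic
set_bit() by serialising each update with a lock. All names are hypothetical
and nothing here is RDS code.

```python
import threading


def lost_update(flags, bit_a, bit_b):
    """Replay two non-atomic updates that interleave badly.

    Both contexts load the old value before either stores, so the
    second store overwrites the first and bit_a is lost.
    """
    loaded_a = flags                 # context A loads
    loaded_b = flags                 # context B loads the same old value
    flags = loaded_a | (1 << bit_a)  # A stores
    flags = loaded_b | (1 << bit_b)  # B stores, discarding A's bit
    return flags


class AtomicFlags:
    """Stand-in for a flags word updated with an atomic set_bit()."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def set_bit(self, nr):
        # The whole read-modify-write happens as one indivisible step.
        with self._lock:
            self.value |= 1 << nr
```

In the replayed interleaving only the second bit survives, whereas updates
through AtomicFlags.set_bit() from any number of threads all end up in the
word.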
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 5 Sep 2017 15:31:23 +0000 (08:31 -0700)]
net: sched: don't use GFP_KERNEL under spin lock
The new TC IDR code uses GFP_KERNEL under a spin lock, which leads
to:
[ 582.621091] BUG: sleeping function called from invalid context at ../mm/slab.h:416
[ 582.629721] in_atomic(): 1, irqs_disabled(): 0, pid: 3379, name: tc
[ 582.636939] 2 locks held by tc/3379:
[ 582.641049] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff910354ce>] rtnetlink_rcv_msg+0x92e/0x1400
[ 582.650958] #1: (&(&tn->idrinfo->lock)->rlock){+.-.+.}, at: [<ffffffff9110a5e0>] tcf_idr_create+0x2f0/0x8e0
[ 582.662217] Preemption disabled at:
[ 582.662222] [<ffffffff9110a5e0>] tcf_idr_create+0x2f0/0x8e0
[ 582.672592] CPU: 9 PID: 3379 Comm: tc Tainted: G W 4.13.0-rc7-debug-00648-g43503a79b9f0 #287
[ 582.683432] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
[ 582.691937] Call Trace:
...
[ 582.742460] kmem_cache_alloc+0x286/0x540
[ 582.747055] radix_tree_node_alloc.constprop.6+0x4a/0x450
[ 582.753209] idr_get_free_cmn+0x627/0xf80
...
[ 582.815525] idr_alloc_cmn+0x1a8/0x270
...
[ 582.833804] tcf_idr_create+0x31b/0x8e0
...
Try to preallocate the memory with idr_prealloc(GFP_KERNEL)
(as suggested by Eric Dumazet), and change the allocation
flags under spin lock.
Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
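The shape of the fix can be modelled in userspace roughly as follows. This is
a sketch under stated assumptions, not the act_api code: the 'locked' field
stands in for holding the spin lock, malloc() stands in for an allocation
that may sleep, and every name is hypothetical. In the kernel the
corresponding helpers are, as far as I know, idr_preload() and
idr_preload_end() around an idr_alloc() call that uses non-sleeping flags.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: 'locked' stands in for holding a spin lock. */
struct table_sketch {
	bool locked;
	void *spare;     /* node preallocated outside the lock */
	void *slots[8];
	size_t used;
};

/* May "sleep", so it only allocates while the lock is not held. */
static void table_prealloc(struct table_sketch *t)
{
	if (!t->locked && !t->spare)
		t->spare = malloc(64);
}

/* Runs under the lock: consumes the spare instead of allocating. */
static int table_insert_locked(struct table_sketch *t)
{
	if (!t->locked || !t->spare || t->used == 8)
		return -1;
	t->slots[t->used] = t->spare;
	t->spare = NULL;
	return (int)t->used++;
}

/* Preallocate first, then take the lock and insert; returns the new id. */
static int table_add(struct table_sketch *t)
{
	int id;

	table_prealloc(t);      /* allocation that may sleep, outside the lock */
	t->locked = true;       /* spin_lock() in the real code */
	id = table_insert_locked(t);
	t->locked = false;      /* spin_unlock() in the real code */
	return id;
}
```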
Jason Wang [Tue, 5 Sep 2017 01:22:05 +0000 (09:22 +0800)]
vhost_net: correctly check tx avail during rx busy polling
In the past we checked tx avail through vhost_enable_notify(), which is
wrong: it only checks whether the guest has filled more available
buffers since the last avail idx synchronization, which was just
done by vhost_vq_avail_empty(). What we really want is to check for
pending buffers in the avail ring. Fix this by calling
vhost_vq_avail_empty() instead.
This issue can be seen by running the netperf TCP_RR benchmark as a
client from the guest (but not the host). With this fix, TCP_RR from guest to
localhost recovers from 1375.91 trans per sec to 55235.28 trans per
sec on my laptop (Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz).
Fixes: 030881372460 ("vhost_net: basic polling support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Corentin Labbe [Mon, 4 Sep 2017 16:30:14 +0000 (18:30 +0200)]
net: mdio-mux: add mdio_mux parameter to mdio_mux_init()
mdio_mux_init() uses the parameter dev for two distinct things:
1) Have a device for all devm_ functions
2) Get device_node from it
Since these are two distinct purposes, this patch adds a parameter mdio_mux
that is linked to task 2.
This will also make it possible to register an of_node mdio-mux that lacks a
direct owning device.
For example, an mdio-mux which is a subnode of a real device.
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Mon, 4 Sep 2017 14:28:28 +0000 (15:28 +0100)]
rxrpc: Make service connection lookup always check for retry
When an RxRPC service packet comes in, the target connection is looked up
by an rb-tree search under RCU and a read-locked seqlock; the seqlock retry
check is, however, currently skipped if we got a match, but probably
shouldn't be in case the connection we found gets replaced whilst we're
doing a search.
Make the lookup procedure always go through need_seqretry(), even if the
lookup was successful. This makes sure we always pick up on a write-lock
event.
On the other hand, since we don't take a ref on the object, but rely on RCU
to prevent its destruction after dropping the seqlock, I'm not sure this is
necessary.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
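The idea of always checking for a retry can be sketched with a bare sequence
counter in userspace. This is a simplified model with hypothetical names; it
is not the rxrpc code, it omits the lock fallback that need_seqretry()
supports, and a real seqlock reader also needs the appropriate memory
barriers.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical sequence counter: odd while a writer is active. */
struct seq_sketch {
	atomic_uint seq;
};

struct conn_sketch {
	unsigned int id;
};

/*
 * Look up 'id' in a small array. The sequence is re-checked after every
 * pass, whether or not a match was found, so a result obtained while a
 * writer was active is never returned without a retry.
 */
static const struct conn_sketch *lookup(struct seq_sketch *s,
					const struct conn_sketch *tab,
					size_t n, unsigned int id)
{
	const struct conn_sketch *found;
	unsigned int start;

	do {
		start = atomic_load(&s->seq);
		found = NULL;
		for (size_t i = 0; i < n; i++) {
			if (tab[i].id == id) {
				found = &tab[i];
				break;
			}
		}
	} while ((start & 1u) || atomic_load(&s->seq) != start);

	return found;
}
```

The version described as buggy would correspond to returning straight out of
the inner loop on a match, which skips the sequence check at the bottom.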
Romain Perier [Mon, 4 Sep 2017 08:41:36 +0000 (10:41 +0200)]
net: stmmac: Delete dead code for MDIO registration
This code is no longer used; the logging function was changed by commit
fbca164776e4 ("net: stmmac: Use the right logging function in stmmac_mdio_register").
It previously showed information about the type of the IRQ: whether it is
polled, ignored or a normal interrupt. As we don't want information loss,
I have moved this code to phy_attached_print().
Fixes: fbca164776e4 ("net: stmmac: Use the right logging function in stmmac_mdio_register")
Signed-off-by: Romain Perier <romain.perier@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Mon, 4 Sep 2017 07:45:28 +0000 (10:45 +0300)]
gianfar: Fix Tx flow control deactivation
The wrong register is checked for the Tx flow control bit;
it should have been maccfg1, not maccfg2.
This went unnoticed for so long probably because the impact is
hardly visible, not to mention the tangled code from adjust_link().
First, link flow control (i.e. handling of Rx/Tx link level pause frames)
is disabled by default (needs to be enabled via 'ethtool -A').
Secondly, maccfg2 always returns 0 for tx_flow_oldval (except for a few
old boards), which results in Tx flow control remaining always on
once activated.
Fixes: 45b679c9a3ccd9e34f28e6ec677b812a860eb8eb ("gianfar: Implement PAUSE frame generation support")
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Mon, 4 Sep 2017 05:55:34 +0000 (11:25 +0530)]
cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6
MPS_TX_INT_CAUSE[Bubble] is a normal condition for T6, hence
ignore this interrupt for T6.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Mon, 4 Sep 2017 05:47:36 +0000 (11:17 +0530)]
cxgb4: Fix pause frame count in t4_get_port_stats
MPS_STAT_CTL[CountPauseStatTx] and MPS_STAT_CTL[CountPauseStatRx]
only control whether or not Pause Frames will be counted as part
of the 64-Byte Tx/Rx Frame counters. These bits do not control
whether Pause Frames are counted in the Total Tx/Rx Frames/Bytes
counters.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Mon, 4 Sep 2017 05:46:28 +0000 (11:16 +0530)]
cxgb4: fix memory leak
Do not reuse the loop counter which is used to iterate over
the ports, so that sched_tbl will be freed for all the ports.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
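A small sketch of the bug class, with hypothetical types and sizes rather
than the cxgb4 structures: the outer loop over ports and the inner loop each
need their own counter.

```c
#include <stddef.h>

#define NPORTS_SKETCH 4
#define NCLASSES_SKETCH 3

/* Hypothetical per-port state: 'freed' records that cleanup ran. */
struct port_sketch {
	int class_state[NCLASSES_SKETCH];
	int freed;
};

/*
 * The outer loop walks ports with 'i'; the inner loop uses its own
 * counter 'j'. Reusing 'i' for the inner loop would clobber the port
 * index, so some ports would be skipped and their tables leaked.
 * Returns the number of ports cleaned up.
 */
static int cleanup_ports(struct port_sketch *ports)
{
	int freed = 0;

	for (size_t i = 0; i < NPORTS_SKETCH; i++) {
		for (size_t j = 0; j < NCLASSES_SKETCH; j++)
			ports[i].class_state[j] = 0;
		ports[i].freed = 1;
		freed++;
	}
	return freed;
}
```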
Jason Wang [Mon, 4 Sep 2017 03:36:09 +0000 (11:36 +0800)]
tun: rename generic_xdp to skb_xdp
Rename "generic_xdp" to "skb_xdp" to avoid confusing it with the
generic XDP which will be done at netif_receive_skb().
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Mon, 4 Sep 2017 03:36:08 +0000 (11:36 +0800)]
tun: reserve extra headroom only when XDP is set
We reserve headroom unconditionally, which could cause unnecessary
stress on socket memory accounting because of increased truesize. Fix
this by only reserving extra headroom when XDP is set.
Cc: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 5 Sep 2017 19:50:00 +0000 (12:50 -0700)]
Merge tag 'devprop-4.14-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull device properties framework updates from Rafael Wysocki:
"These introduce fwnode operations for all of the separate types of
'firmware nodes' that can be handled by the device properties
framework, make the framework use const fwnode arguments all over, add
a helper for the consolidated handling of node references and switch
over the framework to the new UUID API.
Specifics:
- Introduce fwnode operations for all of the separate types of
'firmware nodes' that can be handled by the device properties
framework and drop the type field from struct fwnode_handle (Sakari
Ailus, Arnd Bergmann).
- Make the device properties framework use const fwnode arguments
where possible (Sakari Ailus).
- Add a helper for the consolidated handling of node references to
the device properties framework (Sakari Ailus).
- Switch over the ACPI part of the device properties framework to the
new UUID API (Andy Shevchenko)"
* tag 'devprop-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: device property: Switch to use new generic UUID API
device property: export irqchip_fwnode_ops
device property: Introduce fwnode_property_get_reference_args
device property: Constify fwnode property API
device property: Constify argument to pset fwnode backend
ACPI: Constify internal fwnode arguments
ACPI: Constify acpi_bus helper functions, switch to macros
ACPI: Prepare for constifying acpi_get_next_subnode() fwnode argument
device property: Get rid of struct fwnode_handle type field
ACPI: Use IS_ERR_OR_NULL() instead of non-NULL check in is_acpi_data_node()
Linus Torvalds [Tue, 5 Sep 2017 19:45:03 +0000 (12:45 -0700)]
Merge tag 'acpi-4.14-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These include a usual ACPICA code update (this time to upstream
revision 20170728), a fix for a boot crash on some systems with
Thunderbolt devices connected at boot time, a rework of the handling
of PCI bridges when setting up device wakeup, new support for Apple
device properties, support for DMA configurations reported via ACPI on
ARM64, APEI-related updates, ACPI EC driver updates and assorted minor
modifications in several places.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20170728
including:
* Alias operator handling update (Bob Moore).
* Deferred resolution of reference package elements (Bob Moore).
* Support for the _DMA method in walk resources (Bob Moore).
* Tables handling update and support for deferred table
verification (Lv Zheng).
* Update of SMMU models for IORT (Robin Murphy).
* Compiler and disassembler updates (Alex James, Erik Schmauss,
Ganapatrao Kulkarni, James Morse).
* Tools updates (Erik Schmauss, Lv Zheng).
* Assorted minor fixes and cleanups (Bob Moore, Kees Cook, Lv
Zheng, Shao Ming).
- Rework the initialization of non-wakeup GPEs with method handlers
in order to address a boot crash on some systems with Thunderbolt
devices connected at boot time where we miss an early hotplug event
due to a delay in GPE enabling (Rafael Wysocki).
- Rework the handling of PCI bridges when setting up ACPI-based
device wakeup in order to avoid disabling wakeup for bridges
prematurely (Rafael Wysocki).
- Consolidate Apple DMI checks throughout the tree, add support for
Apple device properties to the device properties framework and use
these properties for the handling of I2C and SPI devices on Apple
systems (Lukas Wunner).
- Add support for _DMA to the ACPI-based device properties lookup
code and make it possible to use the information from there to
configure DMA regions on ARM64 systems (Lorenzo Pieralisi).
- Fix several issues in the APEI code, add support for exporting the
BERT error region over sysfs and update APEI MAINTAINERS entry with
reviewers information (Borislav Petkov, Dongjiu Geng, Loc Ho, Punit
Agrawal, Tony Luck, Yazen Ghannam).
- Fix a potential initialization ordering issue in the ACPI EC driver
and clean it up somewhat (Lv Zheng).
- Update the ACPI SPCR driver to extend the existing XGENE 8250
workaround in it to a new platform (m400) and to work around an
Xgene UART clock issue (Graeme Gregory).
- Add a new utility function to the ACPI core to support using ACPI
OEM ID / OEM Table ID / Revision for system identification in
blacklisting or similar and switch over the existing code already
using this information to this new interface (Toshi Kani).
- Fix an xpower PMIC issue related to GPADC reads that always return
0 without extra pin manipulations (Hans de Goede).
- Add statements to print debug messages in a couple of places in the
ACPI core for easier diagnostics (Rafael Wysocki).
- Clean up the ACPI processor driver slightly (Colin Ian King, Hanjun
Guo).
- Clean up the ACPI x86 boot code somewhat (Andy Shevchenko).
- Add a quirk for Dell OptiPlex 9020M to the ACPI backlight driver
(Alex Hung).
- Assorted fixes, cleanups and updates related to ACPI (Amitoj Kaur
Chawla, Bhumika Goyal, Frank Rowand, Jean Delvare, Punit Agrawal,
Ronald Tschalär, Sumeet Pawnikar)"
* tag 'acpi-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (75 commits)
ACPI / APEI: Suppress message if HEST not present
intel_pstate: convert to use acpi_match_platform_list()
ACPI / blacklist: add acpi_match_platform_list()
ACPI, APEI, EINJ: Subtract any matching Register Region from Trigger resources
ACPI: make device_attribute const
ACPI / sysfs: Extend ACPI sysfs to provide access to boot error region
ACPI: APEI: fix the wrong iteration of generic error status block
ACPI / processor: make function acpi_processor_check_duplicates() static
ACPI / EC: Clean up EC GPE mask flag
ACPI: EC: Fix possible issues related to EC initialization order
ACPI / PM: Add debug statements to acpi_pm_notify_handler()
ACPI: Add debug statements to acpi_global_event_handler()
ACPI / scan: Enable GPEs before scanning the namespace
ACPICA: Make it possible to enable runtime GPEs earlier
ACPICA: Dispatch active GPEs at init time
ACPI: SPCR: work around clock issue on xgene UART
ACPI: SPCR: extend XGENE 8250 workaround to m400
ACPI / LPSS: Don't abort ACPI scan on missing mem resource
mailbox: pcc: Drop uninformative output during boot
ACPI/IORT: Add IORT named component memory address limits
...
Linus Torvalds [Tue, 5 Sep 2017 19:19:08 +0000 (12:19 -0700)]
Merge tag 'pm-4.14-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"This time (again) cpufreq gets the majority of changes which mostly
are driver updates (including a major consolidation of intel_pstate),
some schedutil governor modifications and core cleanups.
There also are some changes in the system suspend area, mostly related
to diagnostics and debug messages plus some renames of things related
to suspend-to-idle. One major change here is that suspend-to-idle is
now going to be preferred over S3 on systems where the ACPI tables
indicate to do so and provide requisite support (the Low Power Idle S0
_DSM in particular). The system sleep documentation and the tools
related to it are updated too.
The rest is a few cpuidle changes (nothing major), devfreq updates,
generic power domains (genpd) framework updates and a few assorted
modifications elsewhere.
Specifics:
- Drop the P-state selection algorithm based on a PID controller from
intel_pstate and make it use the same P-state selection method
(based on the CPU load) for all types of systems in the active mode
(Rafael Wysocki, Srinivas Pandruvada).
- Rework the cpufreq core and governors to make it possible to take
cross-CPU utilization updates into account and modify the schedutil
governor to actually do so (Viresh Kumar).
- Clean up the handling of transition latency information in the
cpufreq core and untangle it from the information on which drivers
cannot do dynamic frequency switching (Viresh Kumar).
- Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek
cpufreq driver and update its DT bindings (Sean Wang).
- Modify the cpufreq dt-platdev driver to automatically create
cpufreq devices for the new (v2) Operating Performance Points (OPP)
DT bindings and update its whitelist of supported systems (Viresh
Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley
Xiao).
- Add support for Ux500 to the cpufreq-dt driver and drop the
obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann).
- Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem
Nguyen).
- Fix and clean up assorted issues in the cpufreq drivers and core
(Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva,
Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla).
- Update the IO-wait boost handling in the schedutil governor to make
it less aggressive (Joel Fernandes).
- Rework system suspend diagnostics to make it print fewer messages
to the kernel log by default, add a sysfs knob to allow more
suspend-related messages to be printed and add Low Power S0 Idle
constraints checks to the ACPI suspend-to-idle code (Rafael
Wysocki, Srinivas Pandruvada).
- Prefer suspend-to-idle over S3 on ACPI-based systems with the
ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM
interface present in the ACPI tables (Rafael Wysocki).
- Update documentation related to system sleep and rename a number of
items in the code to make it clearer that they are related to
suspend-to-idle (Rafael Wysocki).
- Export a variable allowing device drivers to check the target
system sleep state from the core system suspend code (Florian
Fainelli).
- Clean up the cpuidle subsystem to handle the polling state on x86
in a more straightforward way and to use %pOF instead of full_name
(Rafael Wysocki, Rob Herring).
- Update the devfreq framework to fix and clean up a few minor issues
(Chanwoo Choi, Rob Herring).
- Extend diagnostics in the generic power domains (genpd) framework
and clean it up slightly (Thara Gopinath, Rob Herring).
- Fix and clean up a couple of issues in the operating performance
points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz).
- Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling
(AVS) driver (David Wu).
- Fix the usage of notifiers in CPU power management on some
platforms (Alex Shi).
- Update the pm-graph system suspend/hibernation and boot profiling
utility (Todd Brandt).
- Make it possible to run the cpupower utility without CPU0 (Prarit
Bhargava)"
* tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits)
cpuidle: Make drivers initialize polling state
cpuidle: Move polling state initialization code to separate file
cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol
cpufreq: imx6q: Fix imx6sx low frequency support
cpufreq: speedstep-lib: make several arrays static, makes code smaller
PM: docs: Delete the obsolete states.txt document
PM: docs: Describe high-level PM strategies and sleep states
PM / devfreq: Fix memory leak when fail to register device
PM / devfreq: Add dependency on PM_OPP
PM / devfreq: Move private devfreq_update_stats() into devfreq
PM / devfreq: Convert to using %pOF instead of full_name
PM / AVS: rockchip-io: add io selectors and supplies for RV1108
cpufreq: ti: Fix 'of_node_put' being called twice in error handling path
cpufreq: dt-platdev: Drop few entries from whitelist
cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2
ARM: ux500: don't select CPUFREQ_DT
cpuidle: Convert to using %pOF instead of full_name
cpufreq: Convert to using %pOF instead of full_name
PM / Domains: Convert to using %pOF instead of full_name
cpufreq: Cap the default transition delay value to 10 ms
...
Linus Torvalds [Tue, 5 Sep 2017 18:54:41 +0000 (11:54 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID update from Jiri Kosina:
- Wacom driver fixes/updates (device name generation improvements,
touch ring status support) from Jason Gerecke
- T100 touchpad support from Hans de Goede
- support for batteries driven by HID input reports, from Dmitry
Torokhov
- Arnd pointed out that driver_lock semaphore is superfluous, as driver
core already provides all the necessary concurrency protection.
Removal patch from Binoy Jayan
- logical minimum numbering improvements in sensor-hub driver, from
Srinivas Pandruvada
- support for Microsoft Win8 Wireless Radio Controls extensions from
João Paulo Rechi Vita
- assorted small fixes and device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (28 commits)
HID: prodikeys: constify snd_rawmidi_ops structures
HID: sensor: constify platform_device_id
HID: input: throttle battery uevents
HID: usbmouse: constify usb_device_id and fix space before '[' error
HID: usbkbd: constify usb_device_id and fix space before '[' error.
HID: hid-sensor-hub: Force logical minimum to 1 for power and report state
HID: wacom: Do not completely map WACOM_HID_WD_TOUCHRINGSTATUS usage
HID: asus: Add T100CHI bluetooth keyboard dock touchpad support
HID: ntrig: constify attribute_group structures.
HID: logitech-hidpp: constify attribute_group structures.
HID: sensor: constify attribute_group structures.
HID: multitouch: constify attribute_group structures.
HID: multitouch: use proper symbolic constant for 0xff310076 application
HID: multitouch: Support Asus T304UA media keys
HID: multitouch: Support HID_GD_WIRELESS_RADIO_CTLS
HID: input: optionally use device id in battery name
HID: input: map digitizer battery usage
HID: Remove the semaphore driver_lock
HID: wacom: add USB_HID dependency
HID: add ALWAYS_POLL quirk for Logitech 0xc077
...
David S. Miller [Tue, 5 Sep 2017 18:53:34 +0000 (11:53 -0700)]
Merge branch 'dsa-tx-queues'
Florian Fainelli says:
====================
net: dsa: Allow switch drivers to indicate number of TX queues
This patch series extracts the parts of the patch set that are likely not to be
controversial and actually bring multi-queue support to DSA-created network
devices.
With these patches, we can now use sch_multiq as documented under
Documentation/networking/multiqueue.txt and let applications decide the switch
port output queue they want to use. Currently only Broadcom tags utilize that
information.
Resending based on David's feedback regarding the patches not in patchwork.
Changes in v2:
- use a proper define for the number of TX queues in bcm_sf2.c (Andrew)
Changes from RFC:
- dropped the ability to configure RX queues since we don't do anything with
those just yet
- dropped the patches that dealt with binding the DSA slave network devices
queues with their master network devices queues this will be worked on
separately.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 4 Sep 2017 03:27:03 +0000 (20:27 -0700)]
net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping
Even though TC2QOS mapping is for switch egress queues, we need to
configure it correctly in order for the Broadcom tag ingress (CPU ->
switch) queue selection to work correctly since there is a 1:1 mapping
between switch egress queues and ingress queues.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 4 Sep 2017 03:27:02 +0000 (20:27 -0700)]
net: dsa: bcm_sf2: Advertise number of egress queues
The switch supports 8 egress queues per port, so indicate that such that
net/dsa/slave.c::dsa_slave_create can allocate the right number of TX queues.
While at it use SF2_NUM_EGRESS_QUEUE as a define for the number of queues we
support.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 4 Sep 2017 03:27:01 +0000 (20:27 -0700)]
net: dsa: tag_brcm: Set output queue from skb queue mapping
We originally used skb->priority but that was not quite correct as this
bitfield needs to contain the egress switch queue we intend to send this
SKB to.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 4 Sep 2017 03:27:00 +0000 (20:27 -0700)]
net: dsa: Allow switch drivers to indicate number of TX queues
Let switch drivers indicate how many TX queues they support. Some
switches, such as Broadcom Starfighter 2 are designed with 8 egress
queues. Future changes will allow us to leverage the queue mapping and
direct the transmission towards a particular queue.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 3 Sep 2017 14:44:13 +0000 (17:44 +0300)]
bridge: switchdev: Use an helper to clear forward mark
Instead of using ifdef in the C file.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Tested-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 5 Sep 2017 18:49:48 +0000 (11:49 -0700)]
Merge tag 'gpio-v4.14-1' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of the GPIO changes for the v4.14 cycle.
Not so many changes this time, phew. David Daney and Bartosz
Golaszewski did all the really interesting work in infrastructure
improvement across GPIO and IRQ core, hats off to them and to tglx
and Marc Z for general help with these patch sets.
Core changes:
- Allow the GPIO irqchip to allocate IRQs dynamically. This is an
important change on systems where only a restricted number of IRQs,
fewer than the number of GPIO lines, can be utilized. Now we can
allocate these on a first-come-first-served basis instead of
hogging up valuable IRQ lines.
- Serious fix-up of the kerneldoc documentation and inclusion into
the kerneldoc builds.
- Pulled in the IRQ simulator from the IRQ core tree and use this in
the GPIO mockup driver for exhaustive testing of interrupt
abilities.
New drivers:
- New driver for ThunderX and OCTEON-TX. This is especially
interesting as it picks up improvements from the IRQ core that
allow us to handle fasteoi ACKs upwards in a hierarchy when there
are IRQ flag latches on several levels in a hierarchy. Very
interesting work here.
- New subdriver for Renesas R-Car r8a7745 (RZ/G1E).
Misc:
- Several fixes and improvements for Xilinx Zynq GPIO.
- Support an enablement GPIO for the 74x164 GPIO.
- Switch a bunch of chips to use devres to allocate irq descriptors.
- A bunch of constification fixes"
* tag 'gpio-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (63 commits)
gpio: mockup: remove unused variable gc
gpio: pl061: constify amba_id
Revert "gpiolib: request the gpio before querying its direction"
gpio: twl6040: remove unneeded forward declaration
gpio: zevio: make gpio_chip const
gpio: add gpio_add_lookup_tables() to add several tables at once
gpio: rcar: Add r8a7745 (RZ/G1E) support
gpio: brcmstb: check return value of gpiochip_irqchip_add()
MAINTAINERS: Add entry for THUNDERX GPIO Driver.
gpio: Add gpio driver support for ThunderX and OCTEON-TX
gpio: mockup: use irq_sim
gpio: mxs: use devres for irq generic chip
gpio: mxc: use devres for irq generic chip
gpio: pch: use devres for irq generic chip
gpio: ml-ioh: use devres for irq generic chip
gpio: sta2x11: use devres for irq generic chip
gpio: sta2x11: disallow unbinding the driver
gpio: mxs: disallow unbinding the driver
gpio: mxc: disallow unbinding the driver
gpio: aspeed: Remove reference to clock name in debounce warning message
...
Thomas Meyer [Sun, 3 Sep 2017 12:19:31 +0000 (14:19 +0200)]
net/mlx4_core: Use ARRAY_SIZE macro
Use ARRAY_SIZE macro, rather than explicitly coding some variant of it
yourself.
Found with: find -type f -name "*.c" -o -name "*.h" | xargs perl -p -i -e
's/\bsizeof\s*\(\s*(\w+)\s*\)\s*\/\s*sizeof\s*\(\s*\1\s*\[\s*0\s*\]\s*\)/ARRAY_SIZE(\1)/g'
and manual check/verification.
Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
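For readers unfamiliar with the idiom, a minimal sketch of what the macro
replaces. ARRAY_SIZE_SKETCH and the speeds_sketch table are hypothetical; the
kernel's ARRAY_SIZE additionally rejects non-array arguments at compile time.

```c
#include <stddef.h>

/* Simplified form; the kernel's version adds a compile-time array check. */
#define ARRAY_SIZE_SKETCH(arr) (sizeof(arr) / sizeof((arr)[0]))

static const int speeds_sketch[] = { 10, 100, 1000, 10000 };

/* Sum all entries without hard-coding the element count. */
static long sum_speeds(void)
{
	long sum = 0;

	for (size_t i = 0; i < ARRAY_SIZE_SKETCH(speeds_sketch); i++)
		sum += speeds_sketch[i];
	return sum;
}
```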
Linus Torvalds [Tue, 5 Sep 2017 18:45:33 +0000 (11:45 -0700)]
Merge tag 'pinctrl-v4.14-1' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"This is the big bulk of pin control changes for the v4.14 kernel.
There are just a few bigger changes (new drivers mostly) and then a
lot of small patches all over the place.
Core changes:
- Decision to wrap the sleep mode of the Spreadtrum and in the future
others into a specially tagged state. The generic DT bindings and
the new Spreadtrum driver conform to this. Others should be moved
over if possible.
New drivers:
- Spreadtrum SoCs especially the SC9860 SoC.
- Storlink/Cortina Gemini 3512 and 3516 SoCs.
New subdrivers:
- Intel Denverton subdriver.
- Intel Cannon Lake subdriver.
- Intel Lewisburg subdriver.
- Allwinner sunxi: R40 subdriver for A10.
- Socionext uniphier PXs3 subdriver.
- Rockchip RK3128 subdriver.
- Renesas SH-PFC R8A77995 subdriver.
Miscellaneous:
- Qualcomm APQ8064 can handle general purpose clock muxing.
- Mediatek MT7623 PCIe mux data fixed up.
- Intel GPIO IRQs are disabled during suspend.
- Several fixes and additions to Renesas r8a7796.
- Qualcomm SPMI GPIO supports dtest route and LV/MV subtype.
- Input schmitt trigger support in Rockchip RV1108.
- Aspeed G4 and G5 USB host/device pin control added.
- Qualcomm IPQ4019 has matured with a few missing pin groups and
control bits put in place.
- Lots of constification, this is the latest in coccinelle fixes"
* tag 'pinctrl-v4.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (147 commits)
Revert "pinctrl: sunxi: Don't enforce bias disable (for now)"
pinctrl: uniphier: fix members of rmii group for Pro4
pinctrl: Delete an error message
pinctrl: core: Delete an error message
pinctrl: intel: Read back TX buffer state
pinctrl: rockchip: Add rv1108 recalculated iomux support
pinctrl: intel: Decrease indentation in intel_gpio_set()
pinctrl: rza1: Remove suffix from gpiochip label
pinctrl: qcom: spmi-gpio: Correct power_source range check
pinctrl: freescale: make mxs_regs const
pinctrl: aspeed: Rework strap register write logic for the AST2500
pinctrl: rza1: off by one in rza1_parse_gpiochip()
pinctrl: qcom: General Purpose clocks for apq8064
pinctrl: sprd: Add Spreadtrum pin control driver
dt-bindings: pinctrl: Add DT bindings for Spreadtrum SC9860
pinctrl: Add sleep related state to indicate sleep related configs
pinctrl: mediatek: update PCIe mux data for MT7623
pinctrl: intel: Add Intel Lewisburg GPIO support
pinctrl: intel: Add Intel Cannon Lake PCH-H pin controller support
pinctrl: aspeed: Fix ast2500 strap register write logic
...
Linus Torvalds [Tue, 5 Sep 2017 18:43:30 +0000 (11:43 -0700)]
Merge tag 'regulator-v4.14' of git://git./linux/kernel/git/broonie/regulator
Pull regulator updates from Mark Brown:
"This is an extremely quiet release for the regulator subsystem, it's
all fairly minor fixes and cleanups plus a few new drivers and device
ID additions:
- Support for MediaTek MT6380, Ricoh RC5T619 and ST Voltage Reference
Buffers"
* tag 'regulator-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (24 commits)
regulator: Add support for stm32-vrefbuf
regulator: Add STM32 Voltage Reference Buffer
regulator: pv88090: Exception handling for out of bounds
regulator: da9063: Return an error code on probe failure
regulator: rn5t618: add RC5T619 PMIC support
regulator: ltc3589: constify i2c_device_id
regulator: fan53555: fix I2C device ids
regulator: add fixes with MT6397 dt-bindings shouldn't reference driver
regulator: add fixes with MT6323 dt-bindings shouldn't reference driver
regulator: add fixes with MT6311 dt-bindings shouldn't reference driver
regulator: Add document for MediaTek MT6380 regulator
regulator: mt6380: Add support for MT6380
regulator: pwm-regulator: Remove unneeded gpiod NULL check
regulator: core: fix a possible race in disable_work handling
regulator: fan53555: Use of_device_get_match_data() to simplify probe
regulator: of: regulator_of_get_init_data() missing of_node_get()
regulator: pwm-regulator: fix example syntax
regulator: Convert to using %pOF instead of full_name
regulator: cpcap: Add OF mode mapping
regulator: cpcap: Fix standby mode
...
Linus Torvalds [Tue, 5 Sep 2017 18:40:38 +0000 (11:40 -0700)]
Merge tag 'spi-v4.14' of git://git./linux/kernel/git/broonie/spi
Pull spi updates from Mark Brown:
"A fairly quiet release for the SPI subsystem:
- Move to using IDR for allocating bus numbers
- Modernisation of the ep93xx driver, removing a lot of open coding
and using the framework more
- The tools have been moved to use the standard tools build system
and an install target added (there will be a fairly trivial
conflict with tip resulting from the changes in the main tools
Makefile)
- A refactoring of the Qualcomm QUP driver which enables new variants
to be supported
- Explicit support for the Freescale i.MX53 and i.MX6 SPI, Renesas
R-Car H3 and Rockchip RV1108 controllers"
* tag 'spi-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (71 commits)
spi: spi-falcon: drop check of boot select
spi: imx: fix use of native chip-selects with devicetree
spi: pl022: constify amba_id
spi: imx: fix little-endian build
spi: omap: Allocate bus number from spi framework
spi: Kernel coding style fixes
spi: imx: dynamic burst length adjust for PIO mode
spi: Pick spi bus number from Linux idr or spi alias
spi: rockchip: configure CTRLR1 according to size and data frame
spi: altera: Consolidate TX/RX data register access
spi: altera: Switch to SPI core transfer queue management
spi: rockchip: add compatible string for rv1108 spi
spi: qup: fix 64-bit build warning
spi: qup: hide warning for uninitialized variable
spi: spi-ep93xx: use the default master transfer queueing mechanism
spi: spi-ep93xx: remove private data 'current_msg'
spi: spi-ep93xx: pass the spi_master pointer around
spi: spi-ep93xx: absorb the interrupt enable/disable helpers
spi: spi-ep93xx: add spi master prepare_transfer_hardware()
spi: spi-ep93xx: use 32-bit read/write for all registers
...
David S. Miller [Tue, 5 Sep 2017 18:40:08 +0000 (11:40 -0700)]
Merge branch 'flow_dissector-fixes'
Tom Herbert says:
====================
flow_dissector: Flow dissector fixes
This patch set fixes some basic issues with the __skb_flow_dissect
function.
Items addressed:
- Clean up control flow in the function; in particular eliminate a
bunch of gotos and implement a simplified control flow model
- Add limits for number of encapsulations and headers that can be
dissected
v2:
- Simplify the logic for limits on flow dissection. Just set the
limit based on the number of headers the flow dissector can
process. The accounted headers include encapsulation headers,
extension headers, and other shim headers.
Tested:
Ran normal traffic, GUE, and VXLAN traffic.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Fri, 1 Sep 2017 21:04:12 +0000 (14:04 -0700)]
flow_dissector: Add limit for number of headers to dissect
In flow dissector there are no limits to the number of nested
encapsulations or headers that might be dissected, which makes for a
nice DoS attack. This patch sets a limit on the number of headers
that flow dissector will parse.
Headers include network layer headers, transport layer headers, shim
headers for encapsulation, IPv6 extension headers, etc. The limit for
the maximum number of headers to parse has been set to fifteen to
account for a reasonable number of encapsulations, extension headers,
and VLANs in a packet. Note that this limit does not supersede the
STOP_AT_* flags, which may stop processing before the headers limit is
reached.
Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
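The header limit described in the commit above can be illustrated with
a small userspace sketch. This is not the kernel implementation: the
names sketch_dissect, enum sketch_hdr and SKETCH_MAX_HEADERS are
hypothetical and exist only for this example. The only detail taken
from the commit message is the cap of fifteen headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap for this sketch; the commit message says fifteen. */
#define SKETCH_MAX_HEADERS 15

/* Hypothetical header kinds, not the kernel's own types. */
enum sketch_hdr {
	SKETCH_HDR_ENCAP,	/* shim/encapsulation header */
	SKETCH_HDR_EXT,		/* extension header */
	SKETCH_HDR_TRANSPORT	/* transport header: dissection is done */
};

/*
 * Walk a chain of headers, charging one unit of budget per header.
 * Returns true if a transport header is reached within the budget,
 * false if the chain ends or the budget is exhausted first.
 * *parsed receives the number of headers actually examined.
 */
static bool sketch_dissect(const enum sketch_hdr *hdrs, size_t n,
			   size_t *parsed)
{
	size_t i;

	assert(hdrs != NULL && parsed != NULL);
	*parsed = 0;

	for (i = 0; i < n; i++) {
		if (*parsed >= SKETCH_MAX_HEADERS)
			return false;	/* budget exhausted: stop here */
		(*parsed)++;
		if (hdrs[i] == SKETCH_HDR_TRANSPORT)
			return true;
	}
	return false;
}
```

With a chain of twenty encapsulation headers followed by a transport
header, the sketch stops after examining fifteen headers, so the cost
of dissection stays bounded no matter how deep the nesting is.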
Tom Herbert [Fri, 1 Sep 2017 21:04:11 +0000 (14:04 -0700)]
flow_dissector: Cleanup control flow
__skb_flow_dissect is riddled with gotos that make discerning the flow,
debugging, and extending the capability difficult. This patch
reorganizes things so that we only perform gotos after the two main
switch statements (no gotos within the cases now). It also eliminates
several goto labels so that there are only two labels that can be the
target of a goto.
Reported-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
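The control flow pattern described in the commit above (cases record a
result, and jumps happen only after the switch) might look roughly like
the following userspace sketch. It is an illustration under assumptions,
not the actual __skb_flow_dissect code: it shows a single switch rather
than two, and sketch_parse, enum sketch_ret and enum sketch_proto are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes set by each case of the switch. */
enum sketch_ret {
	SKETCH_AGAIN,		/* outer header stripped, parse the next one */
	SKETCH_OUT_GOOD,	/* dissection finished successfully */
	SKETCH_OUT_BAD		/* unknown or malformed header */
};

/* Hypothetical protocol tags, not kernel constants. */
enum sketch_proto { SKETCH_VLAN, SKETCH_IP, SKETCH_UNKNOWN };

static bool sketch_parse(const enum sketch_proto *protos, size_t n)
{
	enum sketch_ret ret;
	bool ok = false;
	size_t i = 0;

	assert(protos != NULL);

again:
	if (i >= n)
		goto out;

	/* Cases only record what should happen next; none of them jump. */
	switch (protos[i]) {
	case SKETCH_VLAN:
		ret = SKETCH_AGAIN;
		break;
	case SKETCH_IP:
		ret = SKETCH_OUT_GOOD;
		break;
	default:
		ret = SKETCH_OUT_BAD;
		break;
	}

	/* The only jumps happen here, after the switch. */
	if (ret == SKETCH_AGAIN) {
		i++;
		goto again;
	}
	ok = (ret == SKETCH_OUT_GOOD);
out:
	return ok;
}
```

Keeping the jumps in one place leaves just two labels (again and out)
in this sketch, which makes it easier to see every path through the
function.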
Linus Torvalds [Tue, 5 Sep 2017 18:35:16 +0000 (11:35 -0700)]
Merge tag 'edac_for_4.14' of git://git./linux/kernel/git/bp/bp
Pull EDAC updates from Borislav Petkov:
- pnd2_edac: A minimal sideband driver (Tony Luck)
- small-ish cleanups and fixes all over the place
* tag 'edac_for_4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, mce_amd: Get rid of local var in amd_filter_mce()
EDAC, mce_amd: Get rid of most struct cpuinfo_x86 uses
EDAC, mce_amd: Rename decode_smca_errors() to decode_smca_error()
EDAC: Make device_type const
EDAC, pnd2: Properly toggle hidden state for P2SB PCI device
EDAC, pnd2: Conditionally unhide/hide the P2SB PCI device to read BAR
EDAC, pnd2: Mask off the lower four bits of a BAR
EDAC, thunderx: Fix error handling path in thunderx_lmc_probe()
EDAC, altera: Fix error handling path in altr_edac_device_probe()
EDAC, pnd2: Build in a minimal sideband driver for Apollo Lake
EDAC, sb_edac: Classify memory mirroring modes
EDAC, cpc925, ppc4xx: Convert to using %pOF instead of full_name
EDAC: Get rid of mci->mod_ver
EDAC: Constify attribute_group structures
EDAC, mce_amd: Use cpu_to_node() to find the node ID
Linus Torvalds [Tue, 5 Sep 2017 18:08:17 +0000 (11:08 -0700)]
Merge tag 'char-misc-4.14-rc1' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big char/misc driver update for 4.14-rc1.
Lots of different stuff in here, it's been an active development cycle
for some reason. Highlights are:
- updated binder driver, this brings binder up to date with what
shipped in the Android O release, plus some more changes that
happened since then that are in the Android development trees.
- coresight updates and fixes
- mux driver file renames to be a bit "nicer"
- intel_th driver updates
- normal set of hyper-v updates and changes
- small fpga subsystem and driver updates
- lots of const code changes all over the driver trees
- extcon driver updates
- fmc driver subsystem updates
- w1 subsystem minor reworks and new features and drivers added
- spmi driver updates
Plus a smattering of other minor driver updates and fixes.
All of these have been in linux-next with no reported issues for a
while"
* tag 'char-misc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (244 commits)
ANDROID: binder: don't queue async transactions to thread.
ANDROID: binder: don't enqueue death notifications to thread todo.
ANDROID: binder: Don't BUG_ON(!spin_is_locked()).
ANDROID: binder: Add BINDER_GET_NODE_DEBUG_INFO ioctl
ANDROID: binder: push new transactions to waiting threads.
ANDROID: binder: remove proc waitqueue
android: binder: Add page usage in binder stats
android: binder: fixup crash introduced by moving buffer hdr
drivers: w1: add hwmon temp support for w1_therm
drivers: w1: refactor w1_slave_show to make the temp reading functionality separate
drivers: w1: add hwmon support structures
eeprom: idt_89hpesx: Support both ACPI and OF probing
mcb: Fix an error handling path in 'chameleon_parse_cells()'
MCB: add support for SC31 to mcb-lpc
mux: make device_type const
char: virtio: constify attribute_group structures.
Documentation/ABI: document the nvmem sysfs files
lkdtm: fix spelling mistake: "incremeted" -> "incremented"
perf: cs-etm: Fix ETMv4 CONFIGR entry in perf.data file
nvmem: include linux/err.h from header
...
Linus Torvalds [Tue, 5 Sep 2017 17:41:21 +0000 (10:41 -0700)]
Merge tag 'driver-core-4.14-rc1' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core update from Greg KH:
"Here is the "big" driver core update for 4.14-rc1.
It's really not all that big, the largest thing here being some
firmware tests to help ensure that that crazy api is working properly.
There's also a new uevent for when a driver is bound or unbound from a
device, fixing a hole in the driver model that's been there since the
very beginning. Many thanks to Dmitry for being persistent and
pointing out how wrong I was about this all along :)
Patches for the new uevents are already in the systemd tree, if people
want to play around with them.
Otherwise just a number of other small api changes and updates here,
nothing major. All of these patches have been in linux-next for a
while with no reported issues"
* tag 'driver-core-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (28 commits)
driver core: bus: Fix a potential double free
Do not disable driver and bus shutdown hook when class shutdown hook is set.
base: topology: constify attribute_group structures.
base: Convert to using %pOF instead of full_name
kernfs: Clarify lockdep name for kn->count
fbdev: uvesafb: remove DRIVER_ATTR() usage
xen: xen-pciback: remove DRIVER_ATTR() usage
driver core: Document struct device:dma_ops
mod_devicetable: Remove excess description from structured comment
test_firmware: add batched firmware tests
firmware: enable a debug print for batched requests
firmware: define pr_fmt
firmware: send -EINTR on signal abort on fallback mechanism
test_firmware: add test case for SIGCHLD on sync fallback
initcall_debug: add deferred probe times
Input: axp20x-pek - switch to using devm_device_add_group()
Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01
Input: gpio_keys - use devm_device_add_group() for attributes
driver core: add devm_device_add_group() and friends
driver core: add device_{add|remove}_group() helpers
...
Linus Torvalds [Tue, 5 Sep 2017 17:36:26 +0000 (10:36 -0700)]
Merge tag 'staging-4.14-rc1' of git://git./linux/kernel/git/gregkh/staging
Pull staging/IIO driver updates from Greg KH:
"Here is the big staging and IIO driver update for 4.14-rc1.
Lots of staging driver fixes and cleanups, including some reorganizing
of the lustre header files to try to impose some sanity on what is,
and what is not, the uapi for that filesystem.
There are some tty core changes in here as well, as the speakup
drivers need them, and that's ok with me, they are sane and the
speakup code is getting nicer because of it.
There is also the addition of the obligatory new wifi driver, just
because it has been a release or two since we added our last one...
Other than that, lots and lots of small coding style fixes, as usual.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (612 commits)
staging:rtl8188eu:core Fix remove unneccessary else block
staging: typec: fusb302: make structure fusb302_psy_desc static
staging: unisys: visorbus: make two functions static
staging: fsl-dpaa2/eth: fix off-by-one FD ctrl bitmaks
staging: r8822be: Simplify deinit_priv()
staging: r8822be: Remove some dead code
staging: vboxvideo: Use CONFIG_DRM_KMS_FB_HELPER to check for fbdefio availability
staging:rtl8188eu Fix comparison to NULL
staging: rts5208: rename mmc_ddr_tunning_rx_cmd to mmc_ddr_tuning_rx_cmd
Staging: Pi433: style fix - tabs and spaces
staging: pi433: fix spelling mistake: "preample" -> "preamble"
staging:rtl8188eu:core Fix Code Indent
staging: typec: fusb302: Export current-limit through a power_supply class dev
staging: typec: fusb302: Add support for USB2 charger detection through extcon
staging: typec: fusb302: Use client->irq as irq if set
staging: typec: fusb302: Get max snk mv/ma/mw from device-properties
staging: typec: fusb302: Set max supply voltage to 5V
staging: typec: tcpm: Add get_current_limit tcpc_dev callback
staging:rtl8188eu Use __func__ instead of function name
staging: lustre: coding style fixes found by checkpatch.pl
...