Neal Cardwell [Fri, 14 Jul 2017 21:49:24 +0000 (17:49 -0400)]
tcp_bbr: remove sk_pacing_rate=0 transient during init
Fix a corner case noticed by Eric Dumazet, where BBR's setting
sk->sk_pacing_rate to 0 during initialization could theoretically
cause packets in the sending host to hang if there were packets "in
flight" in the pacing infrastructure at the time the BBR congestion
control state is initialized. This could occur if the pacing
infrastructure happened to race with bbr_init() in a way such that the
pacer read the 0 rather than the immediately following non-zero pacing
rate.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Fri, 14 Jul 2017 21:49:23 +0000 (17:49 -0400)]
tcp_bbr: introduce bbr_init_pacing_rate_from_rtt() helper
Introduce a helper to initialize the BBR pacing rate unconditionally,
based on the current cwnd and RTT estimate. This is a pure refactor,
but is needed for two following fixes.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Fri, 14 Jul 2017 21:49:22 +0000 (17:49 -0400)]
tcp_bbr: introduce bbr_bw_to_pacing_rate() helper
Introduce a helper to convert a BBR bandwidth and gain factor to a
pacing rate in bytes per second. This is a pure refactor, but is
needed for two following fixes.
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell [Fri, 14 Jul 2017 21:49:21 +0000 (17:49 -0400)]
tcp_bbr: cut pacing rate only if filled pipe
In bbr_set_pacing_rate(), which decides whether to cut the pacing
rate, there was some code that considered exiting STARTUP to be
equivalent to the notion of filling the pipe (i.e.,
bbr_full_bw_reached()). Specifically, as the code was structured,
exiting STARTUP and going into PROBE_RTT could cause us to cut the
pacing rate down to something silly and low, based on whatever
bandwidth samples we've had so far, when it's possible that all of
them have been small app-limited bandwidth samples that are not
representative of the bandwidth available in the path. (The code was
correct at the time it was written, but the state machine changed
without this spot being adjusted correspondingly.)
Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Greg Rose [Fri, 14 Jul 2017 19:42:49 +0000 (12:42 -0700)]
openvswitch: Fix for force/commit action failures
When there is an established connection in direction A->B, it is
possible to receive a packet on port B which then executes
ct(commit,force) without first performing ct() - ie, a lookup.
In this case, we would expect that this packet can delete the existing
entry so that we can commit a connection with direction B->A. However,
currently we only perform a check in skb_nfct_cached() for whether
OVS_CS_F_TRACKED is set and OVS_CS_F_INVALID is not set, ie that a
lookup previously occurred. In the above scenario, a lookup has not
occurred but we should still be able to statelessly look up the
existing entry and potentially delete the entry if it is in the
opposite direction.
This patch extends the check to also hint that if the action has the
force flag set, then we will lookup the existing entry so that the
force check at the end of skb_nfct_cached has the ability to delete
the connection.
Fixes: dd41d330b03 ("openvswitch: Add force commit.")
CC: Pravin Shelar <pshelar@nicira.com>
CC: dev@openvswitch.org
Signed-off-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Potapenko [Fri, 14 Jul 2017 16:32:45 +0000 (18:32 +0200)]
sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()
If the length field of the iterator (|pos.p| or |err|) is past the end
of the chunk, we shouldn't access it.
This bug has been detected by KMSAN. For the following pair of system
calls:
socket(PF_INET6, SOCK_STREAM, 0x84 /* IPPROTO_??? */) = 3
sendto(3, "A", 1, MSG_OOB, {sa_family=AF_INET6, sin6_port=htons(0),
inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0,
sin6_scope_id=0}, 28) = 1
the tool has reported a use of uninitialized memory:
==================================================================
BUG: KMSAN: use of uninitialized memory in sctp_rcv+0x17b8/0x43b0
CPU: 1 PID: 2940 Comm: probe Not tainted 4.11.0-rc5+ #2926
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:16
dump_stack+0x172/0x1c0 lib/dump_stack.c:52
kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
__msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
__sctp_rcv_init_lookup net/sctp/input.c:1074
__sctp_rcv_lookup_harder net/sctp/input.c:1233
__sctp_rcv_lookup net/sctp/input.c:1255
sctp_rcv+0x17b8/0x43b0 net/sctp/input.c:170
sctp6_rcv+0x32/0x70 net/sctp/ipv6.c:984
ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
NF_HOOK ./include/linux/netfilter.h:257
ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
dst_input ./include/net/dst.h:492
ip6_rcv_finish net/ipv6/ip6_input.c:69
NF_HOOK ./include/linux/netfilter.h:257
ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
__netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
__netif_receive_skb net/core/dev.c:4246
process_backlog+0x667/0xba0 net/core/dev.c:4866
napi_poll net/core/dev.c:5268
net_rx_action+0xc95/0x1590 net/core/dev.c:5333
__do_softirq+0x485/0x942 kernel/softirq.c:284
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
</IRQ>
do_softirq kernel/softirq.c:328
__local_bh_enable_ip+0x25b/0x290 kernel/softirq.c:181
local_bh_enable+0x37/0x40 ./include/linux/bottom_half.h:31
rcu_read_unlock_bh ./include/linux/rcupdate.h:931
ip6_finish_output2+0x19b2/0x1cf0 net/ipv6/ip6_output.c:124
ip6_finish_output+0x764/0x970 net/ipv6/ip6_output.c:149
NF_HOOK_COND ./include/linux/netfilter.h:246
ip6_output+0x456/0x520 net/ipv6/ip6_output.c:163
dst_output ./include/net/dst.h:486
NF_HOOK ./include/linux/netfilter.h:257
ip6_xmit+0x1841/0x1c00 net/ipv6/ip6_output.c:261
sctp_v6_xmit+0x3b7/0x470 net/sctp/ipv6.c:225
sctp_packet_transmit+0x38cb/0x3a20 net/sctp/output.c:632
sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
sctp_side_effects net/sctp/sm_sideeffect.c:1773
sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
sock_sendmsg_nosec net/socket.c:633
sock_sendmsg net/socket.c:643
SYSC_sendto+0x608/0x710 net/socket.c:1696
SyS_sendto+0x8a/0xb0 net/socket.c:1664
do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
RIP: 0033:0x401133
RSP: 002b:
00007fff6d99cd38 EFLAGS:
00000246 ORIG_RAX:
000000000000002c
RAX:
ffffffffffffffda RBX:
00000000004002b0 RCX:
0000000000401133
RDX:
0000000000000001 RSI:
0000000000494088 RDI:
0000000000000003
RBP:
00007fff6d99cd90 R08:
00007fff6d99cd50 R09:
000000000000001c
R10:
0000000000000001 R11:
0000000000000246 R12:
0000000000000000
R13:
00000000004063d0 R14:
0000000000406460 R15:
0000000000000000
origin:
save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:211
slab_alloc_node mm/slub.c:2743
__kmalloc_node_track_caller+0x200/0x360 mm/slub.c:4351
__kmalloc_reserve net/core/skbuff.c:138
__alloc_skb+0x26b/0x840 net/core/skbuff.c:231
alloc_skb ./include/linux/skbuff.h:933
sctp_packet_transmit+0x31e/0x3a20 net/sctp/output.c:570
sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
sctp_side_effects net/sctp/sm_sideeffect.c:1773
sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
sock_sendmsg_nosec net/socket.c:633
sock_sendmsg net/socket.c:643
SYSC_sendto+0x608/0x710 net/socket.c:1696
SyS_sendto+0x8a/0xb0 net/socket.c:1664
do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
==================================================================
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasily Averin [Fri, 14 Jul 2017 09:04:16 +0000 (12:04 +0300)]
ipv4: ip_do_fragment: fix headroom tests
Some time ago David Woodhouse reported skb_under_panic
when we try to push ethernet header to fragmented ipv6 skbs.
It was fixed for ipv6 by Florian Westphal in
commit
1d325d217c7f ("ipv6: ip6_fragment: fix headroom tests and skb leak")
However similar problem still exist in ipv4.
It does not trigger skb_under_panic due paranoid check
in ip_finish_output2, however according to Alexey Kuznetsov
current state is abnormal and ip_fragment should be fixed too.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhu Yanjun [Fri, 14 Jul 2017 03:01:27 +0000 (23:01 -0400)]
mlx4_en: remove unnecessary returned value check
The function __mlx4_zone_remove_one_entry always returns zero. So
it is not necessary to check it.
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason A. Donenfeld [Mon, 10 Jul 2017 12:00:32 +0000 (14:00 +0200)]
ioc3-eth: store pointer to net_device for priviate area
Computing the alignment manually for going from priv to pub is probably
not such a good idea, and in general the assumption that going from priv
to pub is possible trivially could change, so rather than relying on
that, we change things to just store a pointer to pub. This was sugested
by DaveM in [1].
[1] http://www.spinics.net/lists/netdev/msg443992.html
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 15 Jul 2017 21:28:28 +0000 (14:28 -0700)]
Merge branch 'bgmac-stingray-soc'
Abhishek Shah says:
====================
Extend BGMAC driver for Stingray SoC
The patchset extends Broadcom BGMAC driver for Broadcom Stingray SoC.
This patchset is based on Linux-4.12 and tested on NS2 and Stingray.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Abhishek Shah [Thu, 13 Jul 2017 19:04:09 +0000 (00:34 +0530)]
Documentation: devicetree: net: optional idm regs for bgmac
Specifying IDM register space in DT is not mendatory for SoCs
where firmware takes care of IDM operations. This patch updates
BGMAC driver's DT binding documentation indicating the same.
Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Oza Oza <oza.oza@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Abhishek Shah [Thu, 13 Jul 2017 19:04:08 +0000 (00:34 +0530)]
net: ethernet: bgmac: Make IDM register space optional
IDM operations are usually one time ops and should be done in
firmware itself. Driver is not supposed to touch IDM registers.
However, for some SoCs', driver is performing IDM read/writes.
So this patch masks IDM operations in case firmware is taking
care of IDM operations.
Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com>
Reviewed-by: Oza Oza <oza.oza@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Abhishek Shah [Thu, 13 Jul 2017 19:04:07 +0000 (00:34 +0530)]
net: ethernet: bgmac: Remove unnecessary 'return' from platform_bgmac_idm_write
Return type for idm register write callback should be void as 'writel'
API is used for write operation. However, there no need to have 'return'
in this function.
Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com>
Reviewed-by: Oza Oza <oza.oza@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Fri, 14 Jul 2017 14:07:33 +0000 (22:07 +0800)]
sctp: fix an array overflow when all ext chunks are set
Marcelo noticed an array overflow caused by commit
c28445c3cb07
("sctp: add reconf_enable in asoc ep and netns"), in which sctp
would add SCTP_CID_RECONF into extensions when reconf_enable is
set in sctp_make_init and sctp_make_init_ack.
Then now when all ext chunks are set, 4 ext chunk ids can be put
into extensions array while extensions array size is 3. It would
cause a kernel panic because of this overflow.
This patch is to fix it by defining extensions array size is 4 in
both sctp_make_init and sctp_make_init_ack.
Fixes: c28445c3cb07 ("sctp: add reconf_enable in asoc ep and netns")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 14 Jul 2017 12:07:05 +0000 (14:07 +0200)]
liquidio: fix possible eeprom format string overflow
gcc reports that the temporary buffer for computing the
string length may be too small here:
drivers/net/ethernet/cavium/liquidio/lio_ethtool.c: In function 'lio_get_eeprom_len':
/drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:345:21: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
len = sprintf(buf, "boardname:%s serialnum:%s maj:%lld min:%lld\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:345:6: note: 'sprintf' output between 35 and 167 bytes into a destination of size 128
len = sprintf(buf, "boardname:%s serialnum:%s maj:%lld min:%lld\n",
This extends it to 192 bytes, which is certainly enough. As far
as I could tell, there are no other constraints that require a specific
maximum size.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 14 Jul 2017 12:07:04 +0000 (14:07 +0200)]
vmxnet3: avoid format strint overflow warning
gcc-7 notices that "-event-%d" could be more than 11 characters long
if we had larger 'vector' numbers:
drivers/net/vmxnet3/vmxnet3_drv.c: In function 'vmxnet3_activate_dev':
drivers/net/vmxnet3/vmxnet3_drv.c:2095:40: error: 'sprintf' may write a terminating nul past the end of the destination [-Werror=format-overflow=]
sprintf(intr->event_msi_vector_name, "%s-event-%d",
^~~~~~~~~~~~~
drivers/net/vmxnet3/vmxnet3_drv.c:2095:3: note: 'sprintf' output between 9 and 33 bytes into a destination of size 32
The current code is safe, but making the string a little longer
is harmless and lets gcc see that it's ok.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 14 Jul 2017 12:07:03 +0000 (14:07 +0200)]
net: thunder_bgx: avoid format string overflow warning
gcc warns that the temporary buffer might be too small here:
drivers/net/ethernet/cavium/thunder/thunder_bgx.c: In function 'bgx_probe':
drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:16: error: '%d' directive writing between 1 and 10 bytes into a region of size between 9 and 11 [-Werror=format-overflow=]
sprintf(str, "BGX%d LMAC%d mode", bgx->bgx_id, lmacid);
^~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:16: note: directive argument in the range [0,
2147483647]
drivers/net/ethernet/cavium/thunder/thunder_bgx.c:1020:3: note: 'sprintf' output between 16 and 27 bytes into a destination of size 20
This probably can't happen, but it can't hurt to make it long
enough for the theoretical limit.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 14 Jul 2017 12:07:02 +0000 (14:07 +0200)]
bnx2x: fix format overflow warning
gcc notices that large queue numbers would overflow the queue name
string:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c: In function 'bnx2x_get_strings':
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:25: error: '%d' directive writing between 1 and 10 bytes into a region of size 5 [-Werror=format-overflow=]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:25: note: directive argument in the range [0,
2147483647]
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:3165:5: note: 'sprintf' output between 2 and 11 bytes into a destination of size 5
There is a hard limit in place that makes the number at most two
digits, so the code is fine. This changes it to use snprintf()
to truncate instead of overflowing, which shuts up that warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 14 Jul 2017 12:07:01 +0000 (14:07 +0200)]
net: niu: fix format string overflow warning:
We get a warning for the port_name string that might be longer than
six characters if we had more than 10 ports:
drivers/net/ethernet/sun/niu.c: In function 'niu_put_parent':
drivers/net/ethernet/sun/niu.c:9563:21: error: '%d' directive writing between 1 and 3 bytes into a region of size 2 [-Werror=format-overflow=]
sprintf(port_name, "port%d", port);
^~~~~~~~
drivers/net/ethernet/sun/niu.c:9563:21: note: directive argument in the range [0, 255]
drivers/net/ethernet/sun/niu.c:9563:2: note: 'sprintf' output between 6 and 8 bytes into a destination of size 6
sprintf(port_name, "port%d", port);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/sun/niu.c: In function 'niu_pci_init_one':
drivers/net/ethernet/sun/niu.c:9538:22: error: '%d' directive writing between 1 and 3 bytes into a region of size 2 [-Werror=format-overflow=]
sprintf(port_name, "port%d", port);
^~~~~~~~
drivers/net/ethernet/sun/niu.c:9538:22: note: directive argument in the range [0, 255]
drivers/net/ethernet/sun/niu.c:9538:3: note: 'sprintf' output between 6 and 8 bytes into a destination of size 6
While we know that the port number is small, there is no harm in
making the format string two bytes longer to avoid the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 14 Jul 2017 12:07:00 +0000 (14:07 +0200)]
isdn: divert: fix sprintf buffer overflow warning
One string we pass into the cs->info buffer might be too long,
as pointed out by gcc:
drivers/isdn/divert/isdn_divert.c: In function 'll_callback':
drivers/isdn/divert/isdn_divert.c:488:22: error: '%d' directive writing between 1 and 3 bytes into a region of size between 1 and 69 [-Werror=format-overflow=]
sprintf(cs->info, "%d 0x%lx %s %s %s %s 0x%x 0x%x %d %d %s\n",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/isdn/divert/isdn_divert.c:488:22: note: directive argument in the range [0, 255]
drivers/isdn/divert/isdn_divert.c:488:4: note: 'sprintf' output 25 or more bytes (assuming 129) into a destination of size 90
This is unlikely to actually cause problems, so let's use snprintf
as a simple workaround to shut up the warning and truncate the
buffer instead.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Timur Tabi [Thu, 13 Jul 2017 20:45:41 +0000 (15:45 -0500)]
net: qcom/emac: fix double free of SGMII IRQ during shutdown
If the interface is not up, then don't try to close it during a
shutdown. This avoids possible double free of the IRQ, which
can happen during a shutdown.
Fixes: 03eb3eb4d4d5 ("net: qcom/emac: add shutdown function")
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Kulhavy [Thu, 13 Jul 2017 17:40:57 +0000 (19:40 +0200)]
smsc95xx: use ethtool_op_get_ts_info()
This change enables the use of SW timestamping on Raspberry PI.
smsc95xx uses the usbnet transmit function usbnet_start_xmit(), which
implements software timestamping. However the SOF_TIMESTAMPING_TX_SOFTWARE
capability was missing and only SOF_TIMESTAMPING_RX_SOFTWARE was announced.
By using ethtool_op_get_ts_info() as get_ts_info() also the
SOF_TIMESTAMPING_TX_SOFTWARE is announced.
Signed-off-by: Petr Kulhavy <brain@jikos.cz>
Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Thu, 13 Jul 2017 17:12:18 +0000 (13:12 -0400)]
net sched actions: rename act_get_notify() to tcf_get_notify()
Make name consistent with other TC event notification routines, such as
tcf_add_notify() and tcf_del_notify()
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iván Briano [Thu, 13 Jul 2017 16:46:58 +0000 (09:46 -0700)]
net/packet: Fix Tx queue selection for AF_PACKET
When PACKET_QDISC_BYPASS is not used, Tx queue selection will be done
before the packet is enqueued, taking into account any mappings set by
a queuing discipline such as mqprio without hardware offloading. This
selection may be affected by a previously saved queue_mapping, either on
the Rx path, or done before the packet reaches the device, as it's
currently the case for AF_PACKET.
In order for queue selection to work as expected when using traffic
control, there can't be another selection done before that point is
reached, so move the call to packet_pick_tx_queue to
packet_direct_xmit, leaving the default xmit path as it was before
PACKET_QDISC_BYPASS was introduced.
A forward declaration of packet_pick_tx_queue() is introduced to avoid
the need to reorder the functions within the file.
Fixes: d346a3fae3ff ("packet: introduce PACKET_QDISC_BYPASS socket option")
Signed-off-by: Iván Briano <ivan.briano@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Thu, 13 Jul 2017 13:09:10 +0000 (16:09 +0300)]
net: bridge: fix dest lookup when vlan proto doesn't match
With 802.1ad support the vlan_ingress code started checking for vlan
protocol mismatch which causes the current tag to be inserted and the
bridge vlan protocol & pvid to be set. The vlan tag insertion changes
the skb mac_header and thus the lookup mac dest pointer which was loaded
prior to calling br_allowed_ingress in br_handle_frame_finish is VLAN_HLEN
bytes off now, pointing to the last two bytes of the destination mac and
the first four of the source mac causing lookups to always fail and
broadcasting all such packets to all ports. Same thing happens for locally
originated packets when passing via br_dev_xmit. So load the dest pointer
after the vlan checks and possible skb change.
Fixes: 8580e2117c06 ("bridge: Prepare for 802.1ad vlan filtering support")
Reported-by: Anitha Narasimha Murthy <anitha@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Thu, 13 Jul 2017 13:06:50 +0000 (18:36 +0530)]
cxgb4: ptp_clock_register() returns error pointers
Check ptp_clock_register() return not only for NULL but
also for error pointers, and also nullify adapter->ptp_clock
if ptp_clock_register() fails.
Fixes: 9c33e4208bce ("cxgb4: Add PTP Hardware Clock (PHC) support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LiuJian [Thu, 13 Jul 2017 10:57:54 +0000 (18:57 +0800)]
net: hns: add acpi function of xge led control
The current code only support DT method to control xge led.
This patch is the implementation of acpi method to control xge led.
Signed-off-by: LiuJian <liujian56@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Daode Huang <huangdaode@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Wed, 12 Jul 2017 22:56:41 +0000 (15:56 -0700)]
netpoll: shut up a kernel warning on refcount
When we convert atomic_t to refcount_t, a new kernel warning
on "increment on 0" is introduced in the netpoll code,
zap_completion_queue(). In fact for this special case, we know
the refcount is 0 and we just have to set it to 1 to satisfy
the following dev_kfree_skb_any(), so we can just use
refcount_set(..., 1) instead.
Fixes: 633547973ffc ("net: convert sk_buff.users from atomic_t to refcount_t")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Reshetova, Elena <elena.reshetova@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enrico Mioso [Tue, 11 Jul 2017 15:21:52 +0000 (17:21 +0200)]
cdc_ncm: Set NTB format again after altsetting switch for Huawei devices
Some firmwares in Huawei E3372H devices have been observed to switch back
to NTB 32-bit format after altsetting switch.
This patch implements a driver flag to check for the device settings and
set NTB format to 16-bit again if needed.
The flag has been activated for devices controlled by the huawei_cdc_ncm.c
driver.
V1->V2:
- fixed broken error checks
- some corrections to the commit message
V2->V3:
- variable name changes, to clarify what's happening
- check (and possibly set) the NTB format later in the common bind code path
Signed-off-by: Enrico Mioso <mrkiko.rs@gmail.com>
Reported-and-tested-by: Christian Panton <christian@panton.org>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
CC: Bjørn Mork <bjorn@mork.no>
CC: Christian Panton <christian@panton.org>
CC: linux-usb@vger.kernel.org
CC: netdev@vger.kernel.org
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Blumenstingl [Mon, 10 Jul 2017 12:35:23 +0000 (14:35 +0200)]
mdio: mux: fix parsing mux registers outside of the PHY address range
mdio_mux_init parses the child nodes of the MDIO mux. When using
"mdio-mux-mmioreg" the child nodes are describing the register value
that is written to switch between the MDIO busses.
The change which makes the error messages more verbose changed the
parsing of the "reg" property from a simple of_property_read_u32 call
to of_mdio_parse_addr. On a Khadas VIM (based on the Meson GXL SoC,
which uses mdio-mux-mmioreg) this prevents registering the MDIO mux
(because the "reg" values on the MDIO mux child nodes are 0x2009087f
and 0xe40908ff) and leads to the following errors:
mdio-mux-mmioreg
c883455c.eth-phy-mux: /soc/periphs@
c8834000/eth-phy-mux/mdio@
e40908ff PHY address -
469169921 is too large
mdio-mux-mmioreg
c883455c.eth-phy-mux: Error: Failed to find reg for child /soc/periphs@
c8834000/eth-phy-mux/mdio@
e40908ff
mdio-mux-mmioreg
c883455c.eth-phy-mux: /soc/periphs@
c8834000/eth-phy-mux/mdio@
2009087f PHY address
537462911 is too large
mdio-mux-mmioreg
c883455c.eth-phy-mux: Error: Failed to find reg for child /soc/periphs@
c8834000/eth-phy-mux/mdio@
2009087f
mdio-mux-mmioreg
c883455c.eth-phy-mux: Error: No acceptable child buses found
mdio-mux-mmioreg
c883455c.eth-phy-mux: failed to register mdio-mux bus /soc/periphs@
c8834000/eth-phy-mux
(as a result of that ethernet is not working, because the PHY which is
connected through the mux' child MDIO bus, which is not being
registered).
Fix this by reverting the change from of_mdio_parse_addr to
of_mdio_parse_addr.
Fixes: 342fa1964439 ("mdio: mux: make child bus walking more permissive and errors more verbose")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 13 Jul 2017 20:36:40 +0000 (13:36 -0700)]
net: set fib rule refcount after malloc
The configure callback of fib_rules_ops can change the refcnt of a
fib rule. For instance, mlxsw takes a refcnt when adding the processing
of the rule to a work queue. Thus the rule refcnt can not be reset to
to 1 afterwards. Move the refcnt setting to after the allocation.
Fixes: 5361e209dd30 ("net: avoid one splat in fib_nl_delrule()")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rolf Eike Beer [Thu, 13 Jul 2017 14:46:43 +0000 (16:46 +0200)]
netlink: correctly document nla_put_u64_64bit()
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Thu, 13 Jul 2017 13:15:07 +0000 (18:45 +0530)]
cxgb4: add new T5 pci device id's
Add 0x50a3 and 0x50a4 T5 device id's
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 13 Jul 2017 11:22:24 +0000 (12:22 +0100)]
dccp: make const array error_code static
Don't populate array error_code on the stack but make it static. Makes
the object code smaller by almost 250 bytes:
Before:
text data bss dec hex filename
10366 983 0 11349 2c55 net/dccp/input.o
After:
text data bss dec hex filename
10161 1039 0 11200 2bc0 net/dccp/input.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 11 Jul 2017 11:47:33 +0000 (12:47 +0100)]
rt2x00: make const array glrt_table static
Don't populate array glrt_table on the stack but make it static.
Makes the object code a smaller by over 670 bytes:
Before:
text data bss dec hex filename
131772 4733 0 136505 21539 rt2800lib.o
After:
text data bss dec hex filename
131043 4789 0 135832 21298 rt2800lib.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 11 Jul 2017 11:18:48 +0000 (12:18 +0100)]
net: stmmac: make const array route_possibilities static
Don't populate array route_possibilities on the stack but make it
static const. Makes the object code a little smaller by 85 bytes:
Before:
text data bss dec hex filename
9901 2448 0 12349 303d dwmac4_core.o
After:
text data bss dec hex filename
9760 2504 0 12264 2fe8 dwmac4_core.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 11 Jul 2017 10:52:23 +0000 (11:52 +0100)]
net: broadcom: bnx2x: make a couple of const arrays static
Don't populate various tables on the stack but make them static const.
Makes the object code smaller by nearly 200 bytes:
Before:
text data bss dec hex filename
113468 11200 0 124668 1e6fc bnx2x_ethtool.o
After:
text data bss dec hex filename
113129 11344 0 124473 1e639 bnx2x_ethtool.o
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Bogendoerfer [Thu, 13 Jul 2017 08:57:40 +0000 (10:57 +0200)]
xgene: Don't fail probe, if there is no clk resource for SGMII interfaces
This change fixes following problem
[ 1.827940] xgene-enet: probe of
1f210030.ethernet failed with error -2
which leads to a missing ethernet interface (reproducable at least on
Gigabyte MP30-AR0 and APM Mustang systems).
The check for a valid clk resource fails, because DT doesn't provide a
clock for sgenet1. But the driver doesn't use this clk, if the ethernet
port is connected via SGMII. Therefore this patch avoids probing for clk
on SGMII interfaces.
Fixes: 9aea7779b764 ("drivers: net: xgene: Fix crash on DT systems")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kefeng Wang [Thu, 13 Jul 2017 06:27:58 +0000 (14:27 +0800)]
bpf: fix return in bpf_skb_adjust_net
The bpf_skb_adjust_net() ignores the return value of bpf_skb_net_shrink/grow,
and always return 0, fix it by return 'ret'.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 13 Jul 2017 02:30:57 +0000 (19:30 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix 64-bit division in mlx5 IPSEC offload support, from Ilan Tayari
and Arnd Bergmann.
2) Fix race in statistics gathering in bnxt_en driver, from Michael
Chan.
3) Can't use a mutex in RCU reader protected section on tap driver, from
Cong WANG.
4) Fix mdb leak in bridging code, from Eduardo Valentin.
5) Fix free of wrong pointer variable in nfp driver, from Dan Carpenter.
6) Buffer overflow in brcmfmac driver, from Arend van SPriel.
7) ioremap_nocache() return value needs to be checked in smsc911x
driver, from Alexey Khoroshilov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
net: stmmac: revert "support future possible different internal phy mode"
sfc: don't read beyond unicast address list
datagram: fix kernel-doc comments
socket: add documentation for missing elements
smsc911x: Add check for ioremap_nocache() return code
brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
net: hns: Bugfix for Tx timeout handling in hns driver
net: ipmr: ipmr_get_table() returns NULL
nfp: freeing the wrong variable
mlxsw: spectrum_switchdev: Check status of memory allocation
mlxsw: spectrum_switchdev: Remove unused variable
mlxsw: spectrum_router: Fix use-after-free in route replace
mlxsw: spectrum_router: Add missing rollback
samples/bpf: fix a build issue
bridge: mdb: fix leak on complete_info ptr on fail path
tap: convert a mutex to a spinlock
cxgb4: fix BUG() on interrupt deallocating path of ULD
qed: Fix printk option passed when printing ipv6 addresses
net: Fix minor code bug in timestamping.txt
net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer
...
Linus Torvalds [Thu, 13 Jul 2017 02:25:47 +0000 (19:25 -0700)]
disable new gcc-7.1.1 warnings for now
I made the mistake of upgrading my desktop to the new Fedora 26 that
comes with gcc-7.1.1.
There's nothing wrong per se that I've noticed, but I now have 1500
lines of warnings, mostly from the new format-truncation warning
triggering all over the tree.
We use 'snprintf()' and friends in a lot of places, and often know that
the numbers are fairly small (ie a controller index or similar), but gcc
doesn't know that, and sees an 'int', and thinks that it could be some
huge number. And then complains when our buffers are not able to fit
the name for the ten millionth controller.
These warnings aren't necessarily bad per se, and we probably want to
look through them subsystem by subsystem, but at least during the merge
window they just mean that I can't even see if somebody is introducing
any *real* problems when I pull.
So warnings disabled for now.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 13 Jul 2017 00:22:01 +0000 (17:22 -0700)]
Merge tag 'modules-for-v4.13' of git://git./linux/kernel/git/jeyu/linux
Pull modules updates from Jessica Yu:
"Summary of modules changes for the 4.13 merge window:
- Minor code cleanups
- Avoid accessing mod struct prior to checking module struct version,
from Kees
- Fix racy atomic inc/dec logic of kmod_concurrent_max in kmod, from
Luis"
* tag 'modules-for-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
module: make the modinfo name const
kmod: reduce atomic operations on kmod_concurrent and simplify
module: use list_for_each_entry_rcu() on find_module_all()
kernel/module.c: suppress warning about unused nowarn variable
module: Add module name to modinfo
module: Pass struct load_info into symbol checks
LABBE Corentin [Wed, 12 Jul 2017 07:32:34 +0000 (09:32 +0200)]
net: stmmac: revert "support future possible different internal phy mode"
Since internal phy-mode is reserved for non-xMII protocol we cannot use
it with dwmac-sun8i.
Furthermore, all DT patchs which comes with this patch were cleaned, so
the current state is broken.
This reverts commit
1c2fa5f84683 ("net: stmmac: support future possible different internal phy mode")
Fixes: 1c2fa5f84683 ("net: stmmac: support future possible different internal phy mode")
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bert Kenward [Wed, 12 Jul 2017 16:19:41 +0000 (17:19 +0100)]
sfc: don't read beyond unicast address list
If we have more than 32 unicast MAC addresses assigned to an interface
we will read beyond the end of the address table in the driver when
adding filters. The next 256 entries store multicast addresses, so we
will end up attempting to insert duplicate filters, which is mostly
harmless. If we add more than 288 unicast addresses we will then read
past the multicast address table, which is likely to be more exciting.
Fixes: 12fb0da45c9a ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 12 Jul 2017 21:39:44 +0000 (14:39 -0700)]
Merge branch 'net-doc-fixes'
Stephen Hemminger says:
====================
minor net kernel-doc fixes
Fix a couple of small errors in kernel-doc for networking
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 12 Jul 2017 16:29:07 +0000 (09:29 -0700)]
datagram: fix kernel-doc comments
An underscore in the kernel-doc comment section has special meaning
and mis-use generates an errors.
./net/core/datagram.c:207: ERROR: Unknown target name: "msg".
./net/core/datagram.c:379: ERROR: Unknown target name: "msg".
./net/core/datagram.c:816: ERROR: Unknown target name: "t".
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
stephen hemminger [Wed, 12 Jul 2017 16:29:06 +0000 (09:29 -0700)]
socket: add documentation for missing elements
Fill in missing kernel-doc for missing elements in struct sock.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Khoroshilov [Wed, 12 Jul 2017 20:58:56 +0000 (23:58 +0300)]
smsc911x: Add check for ioremap_nocache() return code
There is no check for return code of smsc911x_drv_probe()
in smsc911x_drv_probe(). The patch adds one.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 12 Jul 2017 17:04:56 +0000 (10:04 -0700)]
Merge branch 'i2c/for-4.13' of git://git./linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"This pull request contains:
- i2c core reorganization. One source file became too monolithic. It
is now split up, yet we still have the same named object as the
final output. This should ease maintenance.
- new drivers: ZTE ZX2967 family, ASPEED 24XX/25XX
- designware driver gained slave mode support
- xgene-slimpro driver gained ACPI support
- bigger overhaul for pca-platform driver
- the algo-bit module now supports messages with enforced STOP
- slightly bigger than usual set of driver updates and improvements
and with much appreciated quality assurance from Andy Shevchenko"
* 'i2c/for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (51 commits)
i2c: Provide a stub for i2c_detect_slave_mode()
i2c: designware: Let slave adapter support be optional
i2c: designware: Make HW init functions static
i2c: designware: fix spelling mistakes
i2c: pca-platform: propagate error from i2c_pca_add_numbered_bus
i2c: pca-platform: correctly set algo_data.reset_chip
i2c: acpi: Do not create i2c-clients for LNXVIDEO ACPI devices
i2c: designware: enable SLAVE in platform module
i2c: designware: add SLAVE mode functions
i2c: zx2967: drop COMPILE_TEST dependency
i2c: zx2967: always use the same device when printing errors
i2c: pca-platform: use dev_warn/dev_info instead of printk
i2c: pca-platform: use device managed allocations
i2c: pca-platform: add devicetree awareness
i2c: pca-platform: switch to struct gpio_desc
dt-bindings: add bindings for i2c-pca-platform
i2c: cadance: fix ctrl/addr reg write order
i2c: zx2967: add i2c controller driver for ZTE's zx2967 family
dt: bindings: add documentation for zx2967 family i2c controller
i2c: algo-bit: add support for I2C_M_STOP
...
Linus Torvalds [Wed, 12 Jul 2017 17:00:04 +0000 (10:00 -0700)]
Merge tag 'iommu-updates-v4.13' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This update comes with:
- Support for lockless operation in the ARM io-pgtable code.
This is an important step to solve the scalability problems in the
common dma-iommu code for ARM
- Some Errata workarounds for ARM SMMU implemenations
- Rewrite of the deferred IO/TLB flush code in the AMD IOMMU driver.
The code suffered from very high flush rates, with the new
implementation the flush rate is down to ~1% of what it was before
- Support for amd_iommu=off when booting with kexec.
The problem here was that the IOMMU driver bailed out early without
disabling the iommu hardware, if it was enabled in the old kernel
- The Rockchip IOMMU driver is now available on ARM64
- Align the return value of the iommu_ops->device_group call-backs to
not miss error values
- Preempt-disable optimizations in the Intel VT-d and common IOVA
code to help Linux-RT
- Various other small cleanups and fixes"
* tag 'iommu-updates-v4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (60 commits)
iommu/vt-d: Constify intel_dma_ops
iommu: Warn once when device_group callback returns NULL
iommu/omap: Return ERR_PTR in device_group call-back
iommu: Return ERR_PTR() values from device_group call-backs
iommu/s390: Use iommu_group_get_for_dev() in s390_iommu_add_device()
iommu/vt-d: Don't disable preemption while accessing deferred_flush()
iommu/iova: Don't disable preempt around this_cpu_ptr()
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #126
iommu/arm-smmu-v3: Enable ACPI based HiSilicon CMD_PREFETCH quirk(erratum
161010701)
iommu/arm-smmu-v3: Add workaround for Cavium ThunderX2 erratum #74
ACPI/IORT: Fixup SMMUv3 resource size for Cavium ThunderX2 SMMUv3 model
iommu/arm-smmu-v3, acpi: Add temporary Cavium SMMU-V3 IORT model number definitions
iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
iommu/io-pgtable: depend on !GENERIC_ATOMIC64 when using COMPILE_TEST with LPAE
iommu/arm-smmu-v3: Remove io-pgtable spinlock
iommu/arm-smmu: Remove io-pgtable spinlock
iommu/io-pgtable-arm-v7s: Support lockless operation
iommu/io-pgtable-arm: Support lockless operation
iommu/io-pgtable: Introduce explicit coherency
iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
...
Linus Torvalds [Wed, 12 Jul 2017 16:28:55 +0000 (09:28 -0700)]
Merge branch 'overlayfs-linus' of git://git./linux/kernel/git/mszeredi/vfs
Pull overlayfs updates from Miklos Szeredi:
"This work from Amir introduces the inodes index feature, which
provides:
- hardlinks are not broken on copy up
- infrastructure for overlayfs NFS export
This also fixes constant st_ino for samefs case for lower hardlinks"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (33 commits)
ovl: mark parent impure and restore timestamp on ovl_link_up()
ovl: document copying layers restrictions with inodes index
ovl: cleanup orphan index entries
ovl: persistent overlay inode nlink for indexed inodes
ovl: implement index dir copy up
ovl: move copy up lock out
ovl: rearrange copy up
ovl: add flag for upper in ovl_entry
ovl: use struct copy_up_ctx as function argument
ovl: base tmpfile in workdir too
ovl: factor out ovl_copy_up_inode() helper
ovl: extract helper to get temp file in copy up
ovl: defer upper dir lock to tempfile link
ovl: hash overlay non-dir inodes by copy up origin
ovl: cleanup bad and stale index entries on mount
ovl: lookup index entry for copy up origin
ovl: verify index dir matches upper dir
ovl: verify upper root dir matches lower root dir
ovl: introduce the inodes index dir feature
ovl: generalize ovl_create_workdir()
...
Al Viro [Wed, 12 Jul 2017 03:59:45 +0000 (04:59 +0100)]
fix a braino in compat_sys_getrlimit()
Reported-and-tested-by: Meelis Roos <mroos@linux.ee>
Fixes: commit d9e968cb9f84 "getrlimit()/setrlimit(): move compat to native"
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arend van Spriel [Fri, 7 Jul 2017 20:09:06 +0000 (21:09 +0100)]
brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()
The lower level nl80211 code in cfg80211 ensures that "len" is between
25 and NL80211_ATTR_FRAME (2304). We subtract DOT11_MGMT_HDR_LEN (24) from
"len" so thats's max of 2280. However, the action_frame->data[] buffer is
only BRCMF_FIL_ACTION_FRAME_SIZE (1800) bytes long so this memcpy() can
overflow.
memcpy(action_frame->data, &buf[DOT11_MGMT_HDR_LEN],
le16_to_cpu(action_frame->len));
Cc: stable@vger.kernel.org # 3.9.x
Fixes: 18e2f61db3b70 ("brcmfmac: P2P action frame tx.")
Reported-by: "freenerguo(郭大兴)" <freenerguo@tencent.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lin Yun Sheng [Wed, 12 Jul 2017 11:09:59 +0000 (19:09 +0800)]
net: hns: Bugfix for Tx timeout handling in hns driver
When hns port type is not debug mode, netif_tx_disable is called
when there is a tx timeout, which requires system reboot to return
to normal state. This patch fix this problem by resetting the net
dev.
Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support")
Signed-off-by: Lin Yun Sheng <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 12 Jul 2017 07:56:47 +0000 (10:56 +0300)]
net: ipmr: ipmr_get_table() returns NULL
The ipmr_get_table() function doesn't return error pointers it returns
NULL on error.
Fixes: 4f75ba6982bc ("net: ipmr: Add ipmr_rtm_getroute")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 12 Jul 2017 07:42:06 +0000 (10:42 +0300)]
nfp: freeing the wrong variable
We accidentally free a NULL pointer and leak the pointer we want to
free. Also you can tell from the label name what was intended. :)
Fixes: abfcdc1de9bf ("nfp: add a stats handler for flower offloads")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 12 Jul 2017 15:15:52 +0000 (08:15 -0700)]
Merge branch 'mlxsw-spectrum-Various-fixes'
Jiri Pirko says:
====================
mlxsw: spectrum: Various fixes
First patch adds a missing rollback in error path. Second patch prevents
a use-after-free during IPv4 route replace. Last two patches fix warnings
from static checkers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 12 Jul 2017 07:12:55 +0000 (09:12 +0200)]
mlxsw: spectrum_switchdev: Check status of memory allocation
We can't rely on kzalloc() always succeeding, so check its return value.
Suppresses the following smatch error:
mlxsw_sp_switchdev_event() error: potential null dereference
'switchdev_work->fdb_info.addr'. (kzalloc returns
null)
Fixes: af061378924f ("mlxsw: spectrum_switchdev: Add support for learning FDB through notification")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 12 Jul 2017 07:12:54 +0000 (09:12 +0200)]
mlxsw: spectrum_switchdev: Remove unused variable
Commit
10e23eb299fa ("mlxsw: spectrum: Remove support for bypass bridge
port attributes/vlan set") removed statements that used 'bridge_vlan',
but didn't remove the variable itself resulting in the following warning
with W=1:
warning: variable ‘bridge_vlan’ set but not used
[-Wunused-but-set-variable]
Remove the variable and suppress the warning.
Fixes: 10e23eb299fa ("mlxsw: spectrum: Remove support for bypass bridge port attributes/vlan set")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 12 Jul 2017 07:12:53 +0000 (09:12 +0200)]
mlxsw: spectrum_router: Fix use-after-free in route replace
While working on IPv6 route replace I realized we can have a
use-after-free in IPv4 in case the replaced route is offloaded and the
only one using its FIB info.
The problem is that fib_table_insert() drops the reference on the FIB
info of the replaced routes which is eventually freed via call_rcu().
Since the driver doesn't hold a reference on this FIB info it can cause
a use-after-free when it tries to clear the RTNH_F_OFFLOAD flag stored
in fi->fib_flags.
After running the following commands in a loop for enough time with a
KASAN enabled kernel I finally got the below trace.
$ ip route add 192.168.50.0/24 via 192.168.200.1 dev enp3s0np3
$ ip route replace 192.168.50.0/24 dev enp3s0np5
$ ip route del 192.168.50.0/24 dev enp3s0np5
BUG: KASAN: use-after-free in mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum]
Read of size 4 at addr
ffff8803717d9820 by task kworker/u4:2/55
[...]
? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum]
? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum]
? mlxsw_sp_router_neighs_update_work+0x1cd0/0x1ce0 [mlxsw_spectrum]
? mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum]
__asan_load4+0x61/0x80
mlxsw_sp_fib_entry_offload_unset+0xa7/0x120 [mlxsw_spectrum]
mlxsw_sp_fib_entry_offload_refresh+0xb6/0x370 [mlxsw_spectrum]
mlxsw_sp_router_fib_event_work+0xd1c/0x2780 [mlxsw_spectrum]
[...]
Freed by task 5131:
save_stack_trace+0x16/0x20
save_stack+0x46/0xd0
kasan_slab_free+0x70/0xc0
kfree+0x144/0x570
free_fib_info_rcu+0x2e7/0x410
rcu_process_callbacks+0x4f8/0xe30
__do_softirq+0x1d3/0x9e2
Fix this by taking a reference on the FIB info when creating the nexthop
group it represents and drop it when the group is destroyed.
Fixes: 599cf8f95f22 ("mlxsw: spectrum_router: Add support for route replace")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Wed, 12 Jul 2017 07:12:52 +0000 (09:12 +0200)]
mlxsw: spectrum_router: Add missing rollback
With this patch the error path of mlxsw_sp_nexthop_init() is symmetric
with mlxsw_sp_nexthop_fini(). Noticed during code review.
Fixes: a8c970142798 ("mlxsw: spectrum_router: Refactor nexthop init routine")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 12 Jul 2017 04:34:24 +0000 (21:34 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
- Fix symbol version generation for assembler on sparc, from
Nagarathnam Muthusamy.
- Fix compound page handling in gup_huge_pmd(), from Nitin Gupta.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix gup_huge_pmd
Adding the type of exported symbols
sed regex in Makefile.build requires line break between exported symbols
Adding asm-prototypes.h for genksyms to generate crc
Yonghong Song [Mon, 10 Jul 2017 21:04:28 +0000 (14:04 -0700)]
samples/bpf: fix a build issue
With latest net-next:
====
clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -Isamples/bpf \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-Wno-unknown-warning-option \
-O2 -emit-llvm -c samples/bpf/tcp_synrto_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/tcp_synrto_kern.o
samples/bpf/tcp_synrto_kern.c:20:10: fatal error: 'bpf_endian.h' file not found
^~~~~~~~~~~~~~
1 error generated.
====
net has the same issue.
Add support for ntohl and htonl in tools/testing/selftests/bpf/bpf_endian.h.
Also move bpf_helpers.h from samples/bpf to selftests/bpf and change
compiler include logic so that programs in samples/bpf can access the headers
in selftests/bpf, but not the other way around.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eduardo Valentin [Tue, 11 Jul 2017 21:55:12 +0000 (14:55 -0700)]
bridge: mdb: fix leak on complete_info ptr on fail path
We currently get the following kmemleak report:
unreferenced object 0xffff8800039d9820 (size 32):
comm "softirq", pid 0, jiffies
4295212383 (age 792.416s)
hex dump (first 32 bytes):
00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00 ................
00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff ................
backtrace:
[<
ffffffff8152b4aa>] kmemleak_alloc+0x4a/0xa0
[<
ffffffff811d8ec8>] kmem_cache_alloc_trace+0xb8/0x1c0
[<
ffffffffa0389683>] __br_mdb_notify+0x2a3/0x300 [bridge]
[<
ffffffffa038a0ce>] br_mdb_notify+0x6e/0x70 [bridge]
[<
ffffffffa0386479>] br_multicast_add_group+0x109/0x150 [bridge]
[<
ffffffffa0386518>] br_ip6_multicast_add_group+0x58/0x60 [bridge]
[<
ffffffffa0387fb5>] br_multicast_rcv+0x1d5/0xdb0 [bridge]
[<
ffffffffa037d7cf>] br_handle_frame_finish+0xcf/0x510 [bridge]
[<
ffffffffa03a236b>] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter]
[<
ffffffffa03a3738>] br_nf_hook_thresh+0x48/0xb0 [br_netfilter]
[<
ffffffffa03a3fb9>] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 [br_netfilter]
[<
ffffffffa03a4400>] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter]
[<
ffffffffa03a3c27>] br_nf_pre_routing+0x197/0x3d0 [br_netfilter]
[<
ffffffff814a2952>] nf_iterate+0x52/0x60
[<
ffffffff814a29bc>] nf_hook_slow+0x5c/0xb0
[<
ffffffffa037ddf4>] br_handle_frame+0x1a4/0x2c0 [bridge]
This happens when switchdev_port_obj_add() fails. This patch
frees complete_info object in the fail path.
Reviewed-by: Vallish Vaidyeshwara <vallish@amazon.com>
Signed-off-by: Eduardo Valentin <eduval@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 11 Jul 2017 22:36:52 +0000 (15:36 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"This is a followup for block changes, that didn't make the initial
pull request. It's a bit of a mixed bag, this contains:
- A followup pull request from Sagi for NVMe. Outside of fixups for
NVMe, it also includes a series for ensuring that we properly
quiesce hardware queues when browsing live tags.
- Set of integrity fixes from Dmitry (mostly), fixing various issues
for folks using DIF/DIX.
- Fix for a bug introduced in cciss, with the req init changes. From
Christoph.
- Fix for a bug in BFQ, from Paolo.
- Two followup fixes for lightnvm/pblk from Javier.
- Depth fix from Ming for blk-mq-sched.
- Also from Ming, performance fix for mtip32xx that was introduced
with the dynamic initialization of commands"
* 'for-linus' of git://git.kernel.dk/linux-block: (44 commits)
block: call bio_uninit in bio_endio
nvmet: avoid unneeded assignment of submit_bio return value
nvme-pci: add module parameter for io queue depth
nvme-pci: compile warnings in nvme_alloc_host_mem()
nvmet_fc: Accept variable pad lengths on Create Association LS
nvme_fc/nvmet_fc: revise Create Association descriptor length
lightnvm: pblk: remove unnecessary checks
lightnvm: pblk: control I/O flow also on tear down
cciss: initialize struct scsi_req
null_blk: fix error flow for shared tags during module_init
block: Fix __blkdev_issue_zeroout loop
nvme-rdma: unconditionally recycle the request mr
nvme: split nvme_uninit_ctrl into stop and uninit
virtio_blk: quiesce/unquiesce live IO when entering PM states
mtip32xx: quiesce request queues to make sure no submissions are inflight
nbd: quiesce request queues to make sure no submissions are inflight
nvme: kick requeue list when requeueing a request instead of when starting the queues
nvme-pci: quiesce/unquiesce admin_q instead of start/stop its hw queues
nvme-loop: quiesce/unquiesce admin_q instead of start/stop its hw queues
nvme-fc: quiesce/unquiesce admin_q instead of start/stop its hw queues
...
Linus Torvalds [Tue, 11 Jul 2017 21:04:48 +0000 (14:04 -0700)]
Merge tag 'smb3-security-fixes-for-4.13' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes and sane default from Steve French:
"Upgrade default dialect to more secure SMB3 from older cifs dialect"
* tag 'smb3-security-fixes-for-4.13' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Clean up unused variables in smb2pdu.c
[SMB3] Improve security, move default dialect to SMB3 from old CIFS
[SMB3] Remove ifdef since SMB3 (and later) now STRONGLY preferred
CIFS: Reconnect expired SMB sessions
CIFS: Display SMB2 error codes in the hex format
cifs: Use smb 2 - 3 and cifsacl mount options setacl function
cifs: prototype declaration and definition to set acl for smb 2 - 3 and cifsacl mount options
WANG Cong [Mon, 10 Jul 2017 17:05:50 +0000 (10:05 -0700)]
tap: convert a mutex to a spinlock
We are not allowed to block on the RCU reader side, so can't
just hold the mutex as before. As a quick fix, convert it to
a spinlock.
Fixes: d9f1f61c0801 ("tap: Extending tap device create/destroy APIs")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Sainath Grandhi <sainath.grandhi@intel.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guilherme G. Piccoli [Mon, 10 Jul 2017 13:55:46 +0000 (10:55 -0300)]
cxgb4: fix BUG() on interrupt deallocating path of ULD
Since the introduction of ULD (Upper-Layer Drivers), the MSI-X
deallocating path changed in cxgb4: the driver frees the interrupts
of ULD when unregistering it or on shutdown PCI handler.
Problem is that if a MSI-X is not freed before deallocated in the PCI
layer, it will trigger a BUG() due to still "alive" interrupt being
tentatively quiesced.
The below trace was observed when doing a simple unbind of Chelsio's
adapter PCI function, like:
"echo 001e:80:00.4 > /sys/bus/pci/drivers/cxgb4/unbind"
Trace:
kernel BUG at drivers/pci/msi.c:352!
Oops: Exception in kernel mode, sig: 5 [#1]
...
NIP [
c0000000005a5e60] free_msi_irqs+0xa0/0x250
LR [
c0000000005a5e50] free_msi_irqs+0x90/0x250
Call Trace:
[
c0000000005a5e50] free_msi_irqs+0x90/0x250 (unreliable)
[
c0000000005a72c4] pci_disable_msix+0x124/0x180
[
d000000011e06708] disable_msi+0x88/0xb0 [cxgb4]
[
d000000011e06948] free_some_resources+0xa8/0x160 [cxgb4]
[
d000000011e06d60] remove_one+0x170/0x3c0 [cxgb4]
[
c00000000058a910] pci_device_remove+0x70/0x110
[
c00000000064ef04] device_release_driver_internal+0x1f4/0x2c0
...
This patch fixes the issue by refactoring the shutdown path of ULD on
cxgb4 driver, by properly freeing and disabling interrupts on PCI
remove handler too.
Fixes: 0fbc81b3ad51 ("Allocate resources dynamically for all cxgb4 ULD's")
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalderon, Michal [Sun, 9 Jul 2017 10:00:16 +0000 (13:00 +0300)]
qed: Fix printk option passed when printing ipv6 addresses
The option "h" (host order ) exists for ipv4 only.
Remove the h when printing ipv6 addresses.
Lead to the following smatch warning:
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:585 qed_iwarp_print_tcp_ramrod()
warn: '%pI6' can only be followed by c
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1521 qed_iwarp_print_cm_info()
warn: '%pI6' can only be followed by c
Fixes commit
456a584947d5 ("qed: iWARP CM add passive side connect")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ahmad Fatoum [Sat, 8 Jul 2017 19:28:44 +0000 (21:28 +0200)]
net: Fix minor code bug in timestamping.txt
Passing (void*)val instead of &val would make a pointer out of an integer
and cause sock_setsockopt to -EFAULT.
See tools/testing/selftests/networking/timestamping/timestamping.c
for a working example.
Cc: David S. Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Ahmad Fatoum <ahmad@a3f.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 11 Jul 2017 20:33:54 +0000 (13:33 -0700)]
Merge branch 'stmmac-dma-resources-fixes'
Christophe JAILLET says:
====================
net: stmmac: Fixes and cleanups in 'alloc_dma_[rt]x_desc_resources()'
These patchs are all related to 'alloc_dma_[rt]x_desc_resources()' functions.
The 2 first fix an error path where some resources are leaking. I've
separated them into 2 patches because the issues have been introduced by
2 deferent commits.
The 3rd patch is just a clean-up.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Sat, 8 Jul 2017 07:46:54 +0000 (09:46 +0200)]
net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer
'alloc_dma_[rt]x_desc_resources()' functions look very close.
Remove a useless initialization and use the same label name for error
handling path in order to get them even closer.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Sat, 8 Jul 2017 07:46:43 +0000 (09:46 +0200)]
net: stmmac: Fix error handling path in 'alloc_dma_tx_desc_resources()'
If the first 'kmalloc_array' within the loop fails, we should free what
as already been allocated, as done in all other error handling path.
Fixes: ce736788e8a9 ("net: stmmac: adding multiple buffers for TX")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Jaillet [Sat, 8 Jul 2017 07:46:33 +0000 (09:46 +0200)]
net: stmmac: Fix error handling path in 'alloc_dma_rx_desc_resources()'
If the first 'kmalloc_array' within the loop fails, we should free what
as already been allocated, as done in all other error handling path.
Fixes: 54139cf3bb33 ("net: stmmac: adding multiple buffers for rx")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 11 Jul 2017 19:12:28 +0000 (12:12 -0700)]
Merge tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The main item here is support for v12.y.z ("Luminous") clusters:
RESEND_ON_SPLIT, RADOS_BACKOFF, OSDMAP_PG_UPMAP and CRUSH_CHOOSE_ARGS
feature bits, and various other changes in the RADOS client protocol.
On top of that we have a new fsc mount option to allow supplying
fscache uniquifier (similar to NFS) and the usual pile of filesystem
fixes from Zheng"
* tag 'ceph-for-4.13-rc1' of git://github.com/ceph/ceph-client: (44 commits)
libceph: advertise support for NEW_OSDOP_ENCODING and SERVER_LUMINOUS
libceph: osd_state is 32 bits wide in luminous
crush: remove an obsolete comment
crush: crush_init_workspace starts with struct crush_work
libceph, crush: per-pool crush_choose_arg_map for crush_do_rule()
crush: implement weight and id overrides for straw2
libceph: apply_upmap()
libceph: compute actual pgid in ceph_pg_to_up_acting_osds()
libceph: pg_upmap[_items] infrastructure
libceph: ceph_decode_skip_* helpers
libceph: kill __{insert,lookup,remove}_pg_mapping()
libceph: introduce and switch to decode_pg_mapping()
libceph: don't pass pgid by value
libceph: respect RADOS_BACKOFF backoffs
libceph: make DEFINE_RB_* helpers more general
libceph: avoid unnecessary pi lookups in calc_target()
libceph: use target pi for calc_target() calculations
libceph: always populate t->target_{oid,oloc} in calc_target()
libceph: make sure need_resend targets reflect latest map
libceph: delete from need_resend_linger before check_linger_pool_dne()
...
Christophe Jaillet [Sat, 8 Jul 2017 04:51:35 +0000 (06:51 +0200)]
cisco: enic: Fic an error handling path in 'vnic_dev_init_devcmd2()'
if 'ioread32()' returns 0xFFFFFFF, we have to go through the error
handling path as done everywhere else in this function.
Move the 'err_free_wq' label to better match its name and its location
and add a new label 'err_disable_wq'.
Update the code accordingly.
Fixes: 373fb0873d43 ("enic: add devcmd2")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 11 Jul 2017 17:32:12 +0000 (10:32 -0700)]
Merge branch 'bnxt_en-Bug-fixes'
Michael Chan says:
====================
bnxt_en: Bug fixes.
3 bug fixes in this series. Fix a crash in bnxt_get_stats64() that can
happen if the device is closing and freeing the statistics block at the
same time. The 2nd one fixes ethtool -L failing when changing from
combined to non-combined mode or vice versa. The last one fixes SRIOV
failure on big-endian systems because we were setting a bitmap wrong in
a firmware message.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 11 Jul 2017 17:05:36 +0000 (13:05 -0400)]
bnxt_en: Fix SRIOV on big-endian architecture.
The PF driver sets up a list of firmware commands from the VF driver that
needs to be forwarded to the PF for approval. This list is a 256-bit
bitmap. The code that sets up the bitmap falls apart on big-endian
architecture. __set_bit() does not work because it operates on long types
whereas the firmware interface is defined in u32 types, causing bits in
the wrong 32-bit word to be set.
Fix it by setting the proper bits on an array of u32.
Fixes: de68f5de5651 ("bnxt_en: Fix bitmap declaration to work on 32-bit arches.")
Reported-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 11 Jul 2017 17:05:35 +0000 (13:05 -0400)]
bnxt_en: Fix bug in ethtool -L.
When changing channels from combined to rx/tx or vice versa, the code
uses the wrong "sh" parameter to determine if we are reserving rings
for shared or non-shared mode. It should be using the ethtool requested
"sh" parameter instead of the current "sh" parameter.
Fix it by passing the "sh" parameter to bnxt_reserve_rings(). For
ethtool, we will pass in the requested "sh" parameter.
Fixes: 391be5c27364 ("bnxt_en: Implement new scheme to reserve tx rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Tue, 11 Jul 2017 17:05:34 +0000 (13:05 -0400)]
bnxt_en: Fix race conditions in .ndo_get_stats64().
.ndo_get_stats64() may not be protected by RTNL and can race with
.ndo_stop() or other ethtool operations that can free the statistics
memory. Fix it by setting a new flag BNXT_STATE_READ_STATS and then
proceeding to read statistics memory only if the state is OPEN. The
close path that frees the memory clears the OPEN state and then waits
for the BNXT_STATE_READ_STATS to clear before proceeding to free the
statistics memory.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 11 Jul 2017 16:59:37 +0000 (09:59 -0700)]
Merge git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
- Add Renesas RZ/A WDT Watchdog driver
- STM32 Independent WatchDoG (IWDG) support
- UniPhier watchdog support
- Add F71868 support
- Add support for NCT6793D and NCT6795D
- dw_wdt: add reset lines support
- core: add option to avoid early handling of watchdog
- core: introduce watchdog_worker_should_ping helper
- Cleanups and improvements for sama5d4, intel-mid_wdt, s3c2410_wdt,
orion_wdt, gpio_wdt, it87_wdt, meson_wdt, davinci_wdt, bcm47xx_wdt,
zx2967_wdt, cadence_wdt
* git://www.linux-watchdog.org/linux-watchdog: (32 commits)
watchdog: introduce watchdog_worker_should_ping helper
watchdog: uniphier: add UniPhier watchdog driver
dt-bindings: watchdog: add description for UniPhier WDT controller
watchdog: cadence_wdt: make of_device_ids const.
watchdog: zx2967: constify zx2967_wdt_ops.
watchdog: bcm47xx_wdt: constify bcm47xx_wdt_hard_ops and bcm47xx_wdt_soft_ops
watchdog: davinci: Add missing clk_disable_unprepare().
watchdog: davinci: Handle return value of clk_prepare_enable
watchdog: meson: Handle return value of clk_prepare_enable
watchdog: it87: Add support for various Super-IO chips
watchdog: it87: Use infrastructure to stop watchdog on reboot
watchdog: it87: Drop support for resetting watchdog though CIR and Game port
watchdog: it87: Convert to use watchdog core infrastructure
watchdog: it87: Drop FSF mailing address
watchdog: dw_wdt: get reset lines from dt
watchdog: bindings: dw_wdt: add reset lines
watchdog: w83627hf: Add support for NCT6793D and NCT6795D
watchdog: core: add option to avoid early handling of watchdog
watchdog: f71808e_wdt: Add F71868 support
watchdog: Add STM32 IWDG driver
...
Linus Torvalds [Tue, 11 Jul 2017 16:55:47 +0000 (09:55 -0700)]
Merge tag 'chrome-platform-for-linus-4.13' of git://git./linux/kernel/git/bleung/chrome-platform
Pull chrome platform updates from Benson Leung:
"Changes in this pull request are around catching up cros_ec with the
internal chromeos-kernel versions of cros_ec, cros_ec_lpc, and
cros_ec_lightbar.
Also, switching maintainership from olof to bleung"
* tag 'chrome-platform-for-linus-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
platform/chrome : Add myself as Maintainer
platform/chrome: cros_ec_lightbar - hide unused PM functions
cros_ec: Don't signal wake event for non-wake host events
cros_ec: Fix deadlock when EC is not responsive at probe
cros_ec: Don't return error when checking command version
platform/chrome: cros_ec_lightbar - Avoid I2C xfer to EC during suspend
platform/chrome: cros_ec_lightbar - Add userspace lightbar control bit to EC
platform/chrome: cros_ec_lightbar - Control of suspend/resume lightbar sequence
platform/chrome: cros_ec_lightbar - Add lightbar program feature to sysfs
platform/chrome: cros_ec_lpc: Add MKBP events support over ACPI
platform/chrome: cros_ec_lpc: Add power management ops
platform/chrome: cros_ec_lpc: Add support for GOOG004 ACPI device
platform/chrome: cros_ec_lpc: Add support for mec1322 EC
platform/chrome: cros_ec_lpc: Add R/W helpers to LPC protocol variants
mfd: cros_ec: Add support for dumping panic information
cros_ec_debugfs: Pass proper struct sizes to cros_ec_cmd_xfer()
mfd: cros_ec: add debugfs, console log file
mfd: cros_ec: Add EC console read structures definitions
mfd: cros_ec: Add helper for event notifier.
Linus Torvalds [Tue, 11 Jul 2017 16:52:56 +0000 (09:52 -0700)]
Merge branch 'for-next' of git://git./linux/kernel/git/gerg/m68knommu
Pull x86nommu update from Greg Ungerer:
"Only a single change, to remove old Kconfig options from defconfigs"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: defconfig: Cleanup from old Kconfig options
Linus Torvalds [Mon, 10 Jul 2017 23:58:42 +0000 (16:58 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
- most of the rest of MM
- KASAN updates
- lib/ updates
- checkpatch updates
- some binfmt_elf changes
- various misc bits
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (115 commits)
kernel/exit.c: avoid undefined behaviour when calling wait4()
kernel/signal.c: avoid undefined behaviour in kill_something_info
binfmt_elf: safely increment argv pointers
s390: reduce ELF_ET_DYN_BASE
powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB
arm64: move ELF_ET_DYN_BASE to 4GB / 4MB
arm: move ELF_ET_DYN_BASE to 4MB
binfmt_elf: use ELF_ET_DYN_BASE only for PIE
fs, epoll: short circuit fetching events if thread has been killed
checkpatch: improve multi-line alignment test
checkpatch: improve macro reuse test
checkpatch: change format of --color argument to --color[=WHEN]
checkpatch: silence perl 5.26.0 unescaped left brace warnings
checkpatch: improve tests for multiple line function definitions
checkpatch: remove false warning for commit reference
checkpatch: fix stepping through statements with $stat and ctx_statement_block
checkpatch: [HLP]LIST_HEAD is also declaration
checkpatch: warn when a MAINTAINERS entry isn't [A-Z]:\t
checkpatch: improve the unnecessary OOM message test
lib/bsearch.c: micro-optimize pivot position calculation
...
zhongjiang [Mon, 10 Jul 2017 22:53:01 +0000 (15:53 -0700)]
kernel/exit.c: avoid undefined behaviour when calling wait4()
wait4(-
2147483648, 0x20, 0, 0xdd0000) triggers:
UBSAN: Undefined behaviour in kernel/exit.c:1651:9
The related calltrace is as follows:
negation of -
2147483648 cannot be represented in type 'int':
CPU: 9 PID: 16482 Comm: zj Tainted: G B ---- ------- 3.10.0-327.53.58.71.x86_64+ #66
Hardware name: Huawei Technologies Co., Ltd. Tecal RH2285 /BC11BTSA , BIOS CTSAV036 04/27/2011
Call Trace:
dump_stack+0x19/0x1b
ubsan_epilogue+0xd/0x50
__ubsan_handle_negate_overflow+0x109/0x14e
SyS_wait4+0x1cb/0x1e0
system_call_fastpath+0x16/0x1b
Exclude the overflow to avoid the UBSAN warning.
Link: http://lkml.kernel.org/r/1497264618-20212-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang <zhongjiang@huawei.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
zhongjiang [Mon, 10 Jul 2017 22:52:57 +0000 (15:52 -0700)]
kernel/signal.c: avoid undefined behaviour in kill_something_info
When running kill(
72057458746458112, 0) in userspace I hit the following
issue.
UBSAN: Undefined behaviour in kernel/signal.c:1462:11
negation of -
2147483648 cannot be represented in type 'int':
CPU: 226 PID: 9849 Comm: test Tainted: G B ---- ------- 3.10.0-327.53.58.70.x86_64_ubsan+ #116
Hardware name: Huawei Technologies Co., Ltd. RH8100 V3/BC61PBIA, BIOS BLHSV028 11/11/2014
Call Trace:
dump_stack+0x19/0x1b
ubsan_epilogue+0xd/0x50
__ubsan_handle_negate_overflow+0x109/0x14e
SYSC_kill+0x43e/0x4d0
SyS_kill+0xe/0x10
system_call_fastpath+0x16/0x1b
Add code to avoid the UBSAN detection.
[akpm@linux-foundation.org: tweak comment]
Link: http://lkml.kernel.org/r/1496670008-59084-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhongjiang <zhongjiang@huawei.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Mon, 10 Jul 2017 22:52:54 +0000 (15:52 -0700)]
binfmt_elf: safely increment argv pointers
When building the argv/envp pointers, the envp is needlessly
pre-incremented instead of just continuing after the argv pointers are
finished. In some (likely impossible) race where the strings could be
changed from userspace between copy_strings() and here, it might be
possible to confuse the envp position. Instead, just use sp like
everything else.
Link: http://lkml.kernel.org/r/20170622173838.GA43308@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Mon, 10 Jul 2017 22:52:51 +0000 (15:52 -0700)]
s390: reduce ELF_ET_DYN_BASE
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided). For s390 the
position could be 0x10000, but that is needlessly close to the NULL
address.
Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Mon, 10 Jul 2017 22:52:47 +0000 (15:52 -0700)]
powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, which is the
traditional x86 minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Mon, 10 Jul 2017 22:52:44 +0000 (15:52 -0700)]
arm64: move ELF_ET_DYN_BASE to 4GB / 4MB
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit
address space for 32-bit pointers. On 32-bit use 4MB, to match ARM.
This could be 0x8000, the standard ET_EXEC load address, but that is
needlessly close to the NULL address, and anyone running arm compat PIE
will have an MMU, so the tight mapping is not needed.
Link: http://lkml.kernel.org/r/1498251600-132458-4-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Mon, 10 Jul 2017 22:52:40 +0000 (15:52 -0700)]
arm: move ELF_ET_DYN_BASE to 4MB
Now that explicitly executed loaders are loaded in the mmap region, we
have more freedom to decide where we position PIE binaries in the
address space to avoid possible collisions with mmap or stack regions.
4MB is chosen here mainly to have parity with x86, where this is the
traditional minimum load location, likely to avoid historically
requiring a 4MB page table entry when only a portion of the first 4MB
would be used (since the NULL address is avoided).
For ARM the position could be 0x8000, the standard ET_EXEC load address,
but that is needlessly close to the NULL address, and anyone running PIE
on 32-bit ARM will have an MMU, so the tight mapping is not needed.
Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Mon, 10 Jul 2017 22:52:37 +0000 (15:52 -0700)]
binfmt_elf: use ELF_ET_DYN_BASE only for PIE
The ELF_ET_DYN_BASE position was originally intended to keep loaders
away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
/bin/cat" might cause the subsequent load of /bin/cat into where the
loader had been loaded.)
With the advent of PIE (ET_DYN binaries with an INTERP Program Header),
ELF_ET_DYN_BASE continued to be used since the kernel was only looking
at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the
top 1/3rd of the TASK_SIZE, a substantial portion of the address space
is unused.
For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are
loaded above the mmap region. This means they can be made to collide
(CVE-2017-
1000370) or nearly collide (CVE-2017-
1000371) with
pathological stack regions.
Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap
region in all cases, and will now additionally avoid programs falling
back to the mmap region by enforcing MAP_FIXED for program loads (i.e.
if it would have collided with the stack, now it will fail to load
instead of falling back to the mmap region).
To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP)
are loaded into the mmap region, leaving space available for either an
ET_EXEC binary with a fixed location or PIE being loaded into mmap by
the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE,
which means architectures can now safely lower their values without risk
of loaders colliding with their subsequently loaded programs.
For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use
the entire 32-bit address space for 32-bit pointers.
Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and
suggestions on how to implement this solution.
Fixes: d1fd836dcf00 ("mm: split ET_DYN ASLR from mmap ASLR")
Link: http://lkml.kernel.org/r/20170621173201.GA114489@beast
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Daniel Micay <danielmicay@gmail.com>
Cc: Qualys Security Advisory <qsa@qualys.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Mon, 10 Jul 2017 22:52:33 +0000 (15:52 -0700)]
fs, epoll: short circuit fetching events if thread has been killed
We've encountered zombies that are waiting for a thread to exit that are
looping in ep_poll() almost endlessly although there is a pending
SIGKILL as a result of a group exit.
This happens because we always find ep_events_available() and fetch more
events and never are able to check for signal_pending() that would break
from the loop and return -EINTR.
Special case fatal signals and break immediately to guarantee that we
loop to fetch more events and delay making a timely exit.
It would also be possible to simply move the check for signal_pending()
higher than checking for ep_events_available(), but there have been no
reports of delayed signal handling other than SIGKILL preventing zombies
from exiting that would be fixed by this.
It fixes an issue for us where we have witnessed zombies sticking around
for at least O(minutes), but considering the code has been like this
forever and nobody else has complained that I have found, I would simply
queue it up for 4.12.
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1705031722350.76784@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Jan Kara <jack@suse.cz>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 10 Jul 2017 22:52:30 +0000 (15:52 -0700)]
checkpatch: improve multi-line alignment test
The current test fails to warn about improper alignment with code like
foo->bar = func(arg1,
arg2);
because foo->bar is not a single identifier.
Convert the $Ident to $Lval which allows for multiple dereferences.
Link: http://lkml.kernel.org/r/01c35b9b6a12a415e57746d45d589bfaad39952a.1498841563.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 10 Jul 2017 22:52:27 +0000 (15:52 -0700)]
checkpatch: improve macro reuse test
checkpatch reports a false positive when using token pasting argument
multiple times in a macro.
Fix it.
Miscellanea:
o Make the $tmp variable name used in the macro argument tests
a bit more descriptive
Link: http://lkml.kernel.org/r/cf434ae7602838388c7cb49d42bca93ab88527e7.1498483044.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Brooks [Mon, 10 Jul 2017 22:52:24 +0000 (15:52 -0700)]
checkpatch: change format of --color argument to --color[=WHEN]
The boolean --color argument did not offer the ability to force
colourized output even if stdout is not a terminal. Change the format
of the argument to the familiar --color[=WHEN] construct as seen in
common Linux utilities such as git, ls and dmesg, which allows the user
to specify whether to colourize output "always", "never", or "auto" when
the output is a terminal. The default is "auto".
The old command-line uses of --color and --no-color are unchanged.
Link: http://lkml.kernel.org/r/efe43bdbad400f39ba691ae663044462493b0773.1496799721.git.joe@perches.com
Signed-off-by: John Brooks <john@fastquake.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cyril Bur [Mon, 10 Jul 2017 22:52:21 +0000 (15:52 -0700)]
checkpatch: silence perl 5.26.0 unescaped left brace warnings
As of perl 5, version 26, subversion 0 (v5.26.0) some new warnings have
occurred when running checkpatch.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3544.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s*){
<-- HERE \s*/ at scripts/checkpatch.pl line 3885.
Unescaped left brace in regex is deprecated here (and will be fatal in
Perl 5.30), passed through in regex; marked by <-- HERE in
m/^(\+.*(?:do|\))){ <-- HERE / at scripts/checkpatch.pl line 4374.
It seems perfectly reasonable to do as the warning suggests and simply
escape the left brace in these three locations.
Link: http://lkml.kernel.org/r/20170607060135.17384-1-cyrilbur@gmail.com
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Joe Perches <joe@perches.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 10 Jul 2017 22:52:19 +0000 (15:52 -0700)]
checkpatch: improve tests for multiple line function definitions
Add a block that identifies multiple line function definitions.
Save the function name into $context_function to improve the embedded
function name test.
Look for misplaced open brace on the function definition.
Emit an OPEN_BRACE error when the function definition is similar to
void foo(int arg1,
int arg2) {
Miscellanea:
o Remove the $realfile test in function declaration w/o named arguments test
o Comment the function declaration w/o named arguments test
Link: http://lkml.kernel.org/r/de620ed6ebab75fdfa323741ada2134a0f545892.1496835238.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Tested-by: David Kershner <david.kershner@unisys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Heinrich Schuchardt [Mon, 10 Jul 2017 22:52:16 +0000 (15:52 -0700)]
checkpatch: remove false warning for commit reference
Checkpatch warns of an incorrect commit reference style for any
hexadecimal number of 12 digits and more.
Numbers of 12 digits are not necessarily commit ids.
For an example provoking the problem see
https://patchwork.kernel.org/patch/
9170897/
Checkpatch should only warn if the number refers to an existing commit.
Link: http://lkml.kernel.org/r/20170607184008.5869-1-xypron.glpk@gmx.de
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Mon, 10 Jul 2017 22:52:13 +0000 (15:52 -0700)]
checkpatch: fix stepping through statements with $stat and ctx_statement_block
Fix the off-by-one in the suppression of lines in a statement block.
This means that for multiple line statements like
foo(bar,
baz,
qux);
$stat has been inspected first correctly for the entire statement,
and subsequently incorrectly just for
qux);
This fix will help make tracking appropriate indentation a little easier.
Link: http://lkml.kernel.org/r/71b25979c90412133c717084036c9851cd2b7bcb.1496862585.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>