David S. Miller [Tue, 22 May 2018 19:43:16 +0000 (15:43 -0400)]
Merge branch 'tcp-ECN-quickack'
Eric Dumazet says:
====================
tcp: reduce quickack pressure for ECN
Small patch series changing TCP behavior vs quickack and ECN.
First patch is a refactoring, adding a parameter to tcp_incr_quickack()
and tcp_enter_quickack_mode() helpers.
Second patch implements the change, lowering the number of ACK packets
sent after an ECN event.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 21 May 2018 22:08:57 +0000 (15:08 -0700)]
tcp: do not aggressively quick ack after ECN events
ECN signals currently force TCP to enter quickack mode for
up to 16 (TCP_MAX_QUICKACKS) following incoming packets.
We believe this is not needed, and only sending one immediate ack
for the current packet should be enough.
This should reduce the extra load noticed in DCTCP environments,
after congestion events.
This is part 2 of our effort to reduce pure ACK packets.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 21 May 2018 22:08:56 +0000 (15:08 -0700)]
tcp: add max_quickacks param to tcp_incr_quickack and tcp_enter_quickack_mode
We want to add finer control of the number of ACK packets sent after
ECN events.
This patch does not change current behavior; it only enables the following
change.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
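The effect of the new max_quickacks parameter can be pictured with a small user-space model. This is only a sketch under stated assumptions: struct ack_state, incr_quickack() and MAX_QUICKACKS are hypothetical stand-ins, not the kernel's definitions, and the window-based estimate is an assumed heuristic. It shows a quick-ACK budget that is derived from the receive window and then clamped by the caller.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-socket ACK state; not the kernel struct. */
struct ack_state {
    uint32_t rcv_wnd;  /* receive window, in bytes */
    uint32_t rcv_mss;  /* estimated peer MSS, in bytes (must be nonzero) */
    uint8_t  quick;    /* remaining quick ACKs */
};

/* Mirrors the value 16 quoted for TCP_MAX_QUICKACKS in the commit text. */
#define MAX_QUICKACKS 16u

/*
 * Raise the quick-ACK budget, but never above max_quickacks.
 * The budget is only ever raised, never lowered, by this helper.
 */
static void incr_quickack(struct ack_state *st, unsigned int max_quickacks)
{
    unsigned int quickacks = st->rcv_wnd / (2 * st->rcv_mss);

    if (quickacks == 0)
        quickacks = 2;
    if (quickacks > max_quickacks)
        quickacks = max_quickacks;
    if (quickacks > st->quick)
        st->quick = (uint8_t)quickacks;
}
```

In this model, passing MAX_QUICKACKS reproduces the old upper bound, while passing 1 corresponds to "only sending one immediate ack for the current packet" after an ECN event.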
Vlad Buslov [Mon, 21 May 2018 20:03:04 +0000 (23:03 +0300)]
net: sched: don't disable bh when accessing action idr
Initial net_device implementation used ingress_lock spinlock to synchronize
ingress path of device. This lock was used in both process and bh context.
In some code paths action map lock was obtained while holding ingress_lock.
Commit
e1e992e52faa ("[NET_SCHED] protect action config/dump from irqs")
modified actions to always disable bh, while using action map lock, in
order to prevent deadlock on ingress_lock in softirq. This lock was removed
from net_device, so disabling bh, while accessing action map, is no longer
necessary.
Replace all action idr spinlock usage with regular calls that do not
disable bh.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 22 May 2018 18:44:20 +0000 (14:44 -0400)]
Merge branch 'net-ipv6-Fix-route-append-and-replace-use-cases'
David Ahern says:
====================
net/ipv6: Fix route append and replace use cases
This patch set fixes a few append and replace use cases for IPv6 and
adds test cases that codify the expectations of how append and replace
are expected to work. In particular it allows a multipath route to have
a dev-only nexthop, something Thomas tried to accomplish with commit
edd7ceb78296 ("ipv6: Allow non-gateway ECMP for IPv6") which had to be
reverted because of breakage, and to replace an existing FIB entry
with a reject route.
There are a number of inconsistent and surprising aspects to the Linux
API for adding, deleting, replacing and changing FIB entries. For example,
with IPv4 NLM_F_APPEND means insert the route after any existing entries
with the same key (prefix + priority + TOS for IPv4) and NLM_F_CREATE
without the append flag inserts the new route before any existing entries.
IPv6 on the other hand attempts to guess whether a new route should be
appended to an existing one, possibly creating a multipath route, or to
add a new entry after any existing ones. This applies to both the 'append'
(NLM_F_CREATE + NLM_F_APPEND) and 'prepend' (NLM_F_CREATE only) cases
meaning for IPv6 the NLM_F_APPEND is basically ignored. This guessing
whether the route should be added to a multipath route (gateway routes)
or inserted after existing entries (non-gateway based routes) means a
multipath route can not have a dev only nexthop (potentially required in
some cases - tunnels or VRF route leaking for example) and route 'replace'
is a bit ad hoc, treating gateway based routes and dev-only / reject routes
differently.
This has led to frustration with developers working on routing suites
such as FRR where workarounds such as delete and add are used instead of
replace.
After this patch set there are 2 differences between IPv4 and IPv6:
1. 'ip ro prepend' = NLM_F_CREATE only
IPv4 adds the new route before any existing ones
IPv6 adds new route after any existing ones
2. 'ip ro append' = NLM_F_CREATE|NLM_F_APPEND
IPv4 adds the new route after any existing ones
IPv6 adds the nexthop to existing routes converting to multipath
For the former, there are cases where we want same prefix routes added
after existing ones (e.g., multicast, prefix routes for macvlan when used
for virtual router redundancy). Requiring the APPEND flag to add a new
route to an existing one helps here but is a slight change in behavior
since prepend with gateway routes now creates a separate entry.
For the latter IPv6 behavior is preferred - appending a route for the same
prefix and metric to make a multipath route, so really IPv4 not allowing an
existing route to be updated is the limiter. This will be fixed when
nexthops become separate objects - a future patch set.
Thank you to Thomas and Ido for testing earlier versions of this set, and
to Ido for providing an update to the mlxsw driver.
Changes since RFC
- cleanup wording in test script; add comments about expected failures
and why
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
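The two remaining IPv4/IPv6 differences listed in the cover letter can be written down as a small decision function. This is a sketch, not kernel code: the enum names and route_placement() are hypothetical, and the two flag values are repeated from <linux/netlink.h> so the example builds without kernel headers. It only models the 'prepend' and 'append' cases; NLM_F_EXCL and NLM_F_REPLACE are deliberately left out.

```c
/* Flag values as found in <linux/netlink.h>, repeated for portability. */
#define NLM_F_CREATE 0x400
#define NLM_F_APPEND 0x800

enum family { FAM_IPV4, FAM_IPV6 };

/* Hypothetical labels for where the new route ends up. */
enum placement {
    PLACE_BEFORE_EXISTING,  /* new entry ahead of same-key routes */
    PLACE_AFTER_EXISTING,   /* new entry behind same-key routes */
    PLACE_MERGE_MULTIPATH,  /* nexthop added to the existing route */
    PLACE_NOT_AN_ADD        /* NLM_F_CREATE missing: not modeled here */
};

/* Map (family, netlink flags) to the behavior described after the patch set. */
static enum placement route_placement(enum family fam, unsigned int nlm_flags)
{
    if (!(nlm_flags & NLM_F_CREATE))
        return PLACE_NOT_AN_ADD;

    if (nlm_flags & NLM_F_APPEND)   /* 'ip ro append' */
        return fam == FAM_IPV4 ? PLACE_AFTER_EXISTING
                               : PLACE_MERGE_MULTIPATH;

    /* 'ip ro prepend': NLM_F_CREATE only */
    return fam == FAM_IPV4 ? PLACE_BEFORE_EXISTING
                           : PLACE_AFTER_EXISTING;
}
```

The point of the table form is that the IPv6 result now depends only on the flags, not on a guess about whether the route has a gateway.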
David Ahern [Mon, 21 May 2018 17:26:58 +0000 (10:26 -0700)]
selftests: fib_tests: Add ipv4 route add append replace tests
Add IPv4 route tests covering add, append and replace permutations.
Assumes the ability to add a basic single path route works; this is
required for example when adding an address to an interface.
$ fib_tests.sh -t ipv4_rt
IPv4 route add / append tests
TEST: Attempt to add duplicate route - gw [ OK ]
TEST: Attempt to add duplicate route - dev only [ OK ]
TEST: Attempt to add duplicate route - reject route [ OK ]
TEST: Add new nexthop for existing prefix [ OK ]
TEST: Append nexthop to existing route - gw [ OK ]
TEST: Append nexthop to existing route - dev only [ OK ]
TEST: Append nexthop to existing route - reject route [ OK ]
TEST: Append nexthop to existing reject route - gw [ OK ]
TEST: Append nexthop to existing reject route - dev only [ OK ]
TEST: add multipath route [ OK ]
TEST: Attempt to add duplicate multipath route [ OK ]
TEST: Route add with different metrics [ OK ]
TEST: Route delete with metric [ OK ]
IPv4 route replace tests
TEST: Single path with single path [ OK ]
TEST: Single path with multipath [ OK ]
TEST: Single path with reject route [ OK ]
TEST: Single path with single path via multipath attribute [ OK ]
TEST: Invalid nexthop [ OK ]
TEST: Single path - replace of non-existent route [ OK ]
TEST: Multipath with multipath [ OK ]
TEST: Multipath with single path [ OK ]
TEST: Multipath with single path via multipath attribute [ OK ]
TEST: Multipath with reject route [ OK ]
TEST: Multipath - invalid first nexthop [ OK ]
TEST: Multipath - invalid second nexthop [ OK ]
TEST: Multipath - replace of non-existent route [ OK ]
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 21 May 2018 17:26:57 +0000 (10:26 -0700)]
selftests: fib_tests: Add ipv6 route add append replace tests
Add IPv6 route tests covering add, append and replace permutations.
Assumes the ability to add a basic single path route works; this is
required for example when adding an address to an interface.
$ fib_tests.sh -t ipv6_rt
IPv6 route add / append tests
TEST: Attempt to add duplicate route - gw [ OK ]
TEST: Attempt to add duplicate route - dev only [ OK ]
TEST: Attempt to add duplicate route - reject route [ OK ]
TEST: Add new route for existing prefix (w/o NLM_F_EXCL) [ OK ]
TEST: Append nexthop to existing route - gw [ OK ]
TEST: Append nexthop to existing route - dev only [ OK ]
TEST: Append nexthop to existing route - reject route [ OK ]
TEST: Append nexthop to existing reject route - gw [ OK ]
TEST: Append nexthop to existing reject route - dev only [ OK ]
TEST: Add multipath route [ OK ]
TEST: Attempt to add duplicate multipath route [ OK ]
TEST: Route add with different metrics [ OK ]
TEST: Route delete with metric [ OK ]
IPv6 route replace tests
TEST: Single path with single path [ OK ]
TEST: Single path with multipath [ OK ]
TEST: Single path with reject route [ OK ]
TEST: Single path with single path via multipath attribute [ OK ]
TEST: Invalid nexthop [ OK ]
TEST: Single path - replace of non-existent route [ OK ]
TEST: Multipath with multipath [ OK ]
TEST: Multipath with single path [ OK ]
TEST: Multipath with single path via multipath attribute [ OK ]
TEST: Multipath with reject route [ OK ]
TEST: Multipath - invalid first nexthop [ OK ]
TEST: Multipath - invalid second nexthop [ OK ]
TEST: Multipath - replace of non-existent route [ OK ]
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 21 May 2018 17:26:56 +0000 (10:26 -0700)]
selftests: fib_tests: Add option to pause after each test
Add option to pause after each test before cleanup is done. Allows
user to do manual inspection or more ad-hoc testing after each test
with the setup intact.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 21 May 2018 17:26:55 +0000 (10:26 -0700)]
selftests: fib_tests: Add command line options
Add command line options for controlling pause on fail, controlling
specific tests to run and verbose mode rather than relying on environment
variables.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 21 May 2018 17:26:54 +0000 (10:26 -0700)]
selftests: fib_tests: Add success-fail counts
As more tests are added, it is convenient to have a tally at the end.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 21 May 2018 17:26:53 +0000 (10:26 -0700)]
net/ipv6: Simplify route replace and appending into multipath route
Bring consistency to ipv6 route replace and append semantics.
Remove rt6_qualify_for_ecmp which is just guesswork. It fails in 2 cases:
1. can not replace a route with a reject route. Existing code appends
a new route instead of replacing the existing one.
2. can not have a multipath route where a leg uses a dev only nexthop
Existing use cases affected by this change:
1. adding a route with existing prefix and metric using NLM_F_CREATE
without NLM_F_APPEND or NLM_F_EXCL (ie., what iproute2 calls
'prepend'). Existing code auto-determines that the new nexthop can
be appended to an existing route to create a multipath route. This
change breaks that by requiring the APPEND flag for the new route
to be added to an existing one. Instead the prepend just adds another
route entry.
2. route replace. Existing code replaces the first matching multipath route
if the new route is multipath capable and falls back to the first matching
non-ECMP route (reject or dev only route) in case one isn't available.
New behavior replaces first matching route. (Thanks to Ido for spotting
this one)
Note: Newer iproute2 is needed to display multipath routes with a dev-only
nexthop. This is due to a bug in iproute2 and parsing nexthops.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 21 May 2018 17:26:52 +0000 (10:26 -0700)]
mlxsw: spectrum_router: Add support for route append
Handle append for gateway based routes. Dev-only multipath routes will
be handled by a follow on patch.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 May 2018 20:17:11 +0000 (16:17 -0400)]
Merge branch 'TI-Ethernet-driver-warnings-fixes'
Florian Fainelli says:
====================
TI Ethernet driver warnings fixes
This patch series attempts to properly fix the warnings observed when turning
on COMPILE_TEST for TI Ethernet drivers on 64-bit hosts.
Since I don't have any of this hardware, please review carefully for possible
breakage!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 21 May 2018 18:45:55 +0000 (11:45 -0700)]
ti: ethernet: davinci: Fix cast to int warnings
Now that we can compile test this driver on 64-bit hosts, we get some
warnings about how a pointer/address is written/read to/from a register
(sw_token). Fix this by doing the appropriate conversions. The driver cannot
possibly work on 64-bit hosts the way the tokens are managed, though, since
the registers being written to are 32-bit only.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
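One common way to make a pointer-to-register conversion explicit is to go through uintptr_t. The helpers below are hypothetical and are not taken from the driver; they are a sketch of the general technique, and they also make visible why a 32-bit token cannot carry a full 64-bit pointer.

```c
#include <stdint.h>

/*
 * Hypothetical helpers: squeeze a pointer into a 32-bit token and back.
 * The intermediate uintptr_t cast avoids "cast to integer of different
 * size" warnings; on 64-bit hosts the upper 32 bits are simply lost.
 */
static uint32_t token_from_ptr(const void *p)
{
    return (uint32_t)(uintptr_t)p;
}

static void *ptr_from_token(uint32_t token)
{
    return (void *)(uintptr_t)token;
}
```

The round trip is only lossless when the pointer value fits in 32 bits, which matches the caveat in the commit message.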
Florian Fainelli [Mon, 21 May 2018 18:45:54 +0000 (11:45 -0700)]
net: ethernet: davinci_emac: Fix printing of base address
Use %pa which is the correct formatter to print a physical address,
instead of %p which is just a pointer.
Fixes: a6286ee630f6 ("net: Add TI DaVinci EMAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 21 May 2018 18:45:53 +0000 (11:45 -0700)]
net: ethernet: ti: cpsw: Fix cpsw_add_ch_strings() printk format
When building on a 64-bit host we will get the following warning:
drivers/net/ethernet/ti/cpsw.c: In function 'cpsw_add_ch_strings':
drivers/net/ethernet/ti/cpsw.c:1284:19: warning: format '%d' expects
argument of type 'int', but argument 5 has type 'long unsigned int'
[-Wformat=]
"%s DMA chan %d: %s", rx_dir ? "Rx" : "Tx",
~^
%ld
Fix this by using an %ld format and casting to long.
Fixes: e05107e6b747 ("net: ethernet: ti: cpsw: add multi queue support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
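The same class of warning can be reproduced and fixed in user space. The function below is a hypothetical sketch, not the driver code: format_chan() and its parameters are made up for illustration. It pairs an %ld conversion with an explicit cast to long, as the commit describes.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build a per-channel label. The channel number
 * arrives as unsigned long, so it is cast to long to match %ld.
 */
static int format_chan(char *buf, size_t len, int rx_dir,
                       unsigned long chan, const char *stat)
{
    return snprintf(buf, len, "%s DMA chan %ld: %s",
                    rx_dir ? "Rx" : "Tx", (long)chan, stat);
}
```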
Florian Fainelli [Mon, 21 May 2018 18:45:52 +0000 (11:45 -0700)]
net: ethernet: ti: cpts: Fix timestamp print
On 64-bit hosts we will get the following warning:
drivers/net/ethernet/ti/cpts.c: In function 'cpts_overflow_check':
drivers/net/ethernet/ti/cpts.c:297:11: warning: format '%lld' expects
argument of type 'long long int', but argument 3 has type
'__kernel_time_t {aka long int}' [-Wformat=]
pr_debug("cpts overflow check at %lld.%09lu\n", ts.tv_sec,
ts.tv_nsec);
Fix this by using an appropriate casting that works on all bit sizes.
Fixes: a5c79c26e168 ("ptp: cpts: convert to the 64 bit get/set time methods.")
Fixes: 87c0e764d43a ("cpts: introduce time stamping code and a PTP hardware clock.")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
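A cast that "works on all bit sizes" usually means widening the seconds field to long long before printing it with %lld. The sketch below uses the user-space struct timespec and a hypothetical format_ts() helper; it is an illustration of the technique, not the driver's actual change.

```c
#include <stddef.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: tv_sec may be 32 or 64 bits wide depending on
 * the platform, so it is widened to long long to match %lld everywhere.
 */
static int format_ts(char *buf, size_t len, const struct timespec *ts)
{
    return snprintf(buf, len, "cpts overflow check at %lld.%09ld",
                    (long long)ts->tv_sec, (long)ts->tv_nsec);
}
```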
Florian Fainelli [Mon, 21 May 2018 18:45:51 +0000 (11:45 -0700)]
ti: ethernet: cpdma: Use correct format for genpool_*
Now that we can compile davinci_cpdma.c on 64-bit hosts, we can see that
the format used for printing a size_t type is incorrect, use %zd
accordingly.
Fixes: aeec3021043b ("net: ethernet: ti: cpdma: remove used_desc counter")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 May 2018 20:01:54 +0000 (16:01 -0400)]
Merge git://git./linux/kernel/git/davem/net
S390 bpf_jit.S is removed in net-next and had changes in 'net',
since that code isn't used any more take the removal.
TLS data structures split the TX and RX components in 'net-next',
put the new struct members from the bug fix in 'net' into the RX
part.
The 'net-next' tree had some reworking of how the ERSPAN code works in
the GRE tunneling code, overlapping with a one-line headroom
calculation fix in 'net'.
Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
that read the prog members via READ_ONCE() into local variables
before using them.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rahul Lakkireddy [Mon, 21 May 2018 13:37:50 +0000 (19:07 +0530)]
vmcore: move get_vmcore_size out of __init
Fix below build warning:
WARNING: vmlinux.o(.text+0x422bb8): Section mismatch in reference from
the function vmcore_add_device_dump() to the function
.init.text:get_vmcore_size.constprop.5()
The function vmcore_add_device_dump() references
the function __init get_vmcore_size.constprop.5().
This is often because vmcore_add_device_dump lacks a __init
annotation or the annotation of get_vmcore_size.constprop.5 is wrong.
Fixes: 7efe48df8a3d ("vmcore: append device dumps to vmcore as elf notes")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Mon, 21 May 2018 06:56:36 +0000 (12:26 +0530)]
cxgb4: copy the length of cpl_tx_pkt_core to fw_wr
The immdlen field of FW_ETH_TX_PKT_WR is filled incorrectly:
we must copy the length of all the cpls encapsulated in the fw
work request. In the xmit path we missed adding the length
of CPL_TX_PKT_CORE but added the length of WR_HDR, and it
worked because WR_HDR and CPL_TX_PKT_CORE are of the same length.
Add the length of cpl_tx_pkt_core, not WR_HDR's. This also
fixes the lso cpl errors for udp tunnels.
Fixes: d0a1299c6bf7 ("cxgb4: add support for vxlan segmentation offload")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 21 May 2018 03:58:28 +0000 (20:58 -0700)]
net: ethernet: Sort Kconfig sourcing alphabetically
A number of entries were not alphabetically sorted, remedy that.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Mon, 21 May 2018 03:49:47 +0000 (20:49 -0700)]
net: phy: phylink: Don't release NULL GPIO
If CONFIG_GPIOLIB is disabled, gpiod_put() becomes a stub that produces a
warning. This helped identify that we could be attempting to release a NULL
pl->link_gpio GPIO descriptor, so guard against that.
Fixes: daab3349ad1a ("net: phy: phylink: Release link GPIO")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Mon, 21 May 2018 15:58:00 +0000 (08:58 -0700)]
Merge tag 'mips_fixes_4.17_2' of git://git./linux/kernel/git/jhogan/mips
Pull MIPS fixes from James Hogan:
- fix build with DEBUG_ZBOOT and MACH_JZ4770 (4.16)
- include xilfpga FDT in fitImage and stop generating dtb.o (4.15)
- fix software IO coherence on CM SMP systems (4.8)
- ptrace: Fix PEEKUSR/POKEUSR to o32 FGRs (3.14)
- ptrace: Expose FIR register through FP regset (3.13)
- fix typo in KVM debugfs file name (3.10)
* tag 'mips_fixes_4.17_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
MIPS: xilfpga: Actually include FDT in fitImage
MIPS: xilfpga: Stop generating useless dtb.o
KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
MIPS: ptrace: Expose FIR register through FP regset
MIPS: Fix build with DEBUG_ZBOOT and MACH_JZ4770
MIPS: c-r4k: Fix data corruption related to cache coherence
Linus Torvalds [Mon, 21 May 2018 15:37:48 +0000 (08:37 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix refcounting bug for connections in on-packet scheduling mode of
IPVS, from Julian Anastasov.
2) Set network header properly in AF_PACKET's packet_snd, from Willem
de Bruijn.
3) Fix regressions in 3c59x by converting to generic DMA API. It was
relying upon the hack that the PCI DMA interfaces would accept NULL
for EISA devices. From Christoph Hellwig.
4) Remove RDMA devices before unregistering netdev in QEDE driver, from
Michal Kalderon.
5) Use after free in TUN driver ptr_ring usage, from Jason Wang.
6) Properly check for missing netlink attributes in SMC_PNETID
requests, from Eric Biggers.
7) Set DMA mask before performing any DMA operations in vmxnet3
driver, from Regis Duchesne.
8) Fix mlx5 build with SMP=n, from Saeed Mahameed.
9) Classifier fixes in bcm_sf2 driver from Florian Fainelli.
10) Tuntap use after free during release, from Jason Wang.
11) Don't use stack memory in scatterlists in tls code, from Matt
Mullins.
12) Not fully initialized flow key object in ipv4 routing code, from
David Ahern.
13) Various packet headroom bug fixes in ip6_gre driver, from Petr
Machata.
14) Remove queues from XPS maps using correct index, from Amritha
Nambiar.
15) Fix use after free in sock_diag, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (64 commits)
net: ip6_gre: fix tunnel metadata device sharing.
cxgb4: fix offset in collecting TX rate limit info
net: sched: red: avoid hashing NULL child
sock_diag: fix use-after-free read in __sk_free
sh_eth: Change platform check to CONFIG_ARCH_RENESAS
net: dsa: Do not register devlink for unused ports
net: Fix a bug in removing queues from XPS map
bpf: fix truncated jump targets on heavy expansions
bpf: parse and verdict prog attach may race with bpf map update
bpf: sockmap update rollback on error can incorrectly dec prog refcnt
net: test tailroom before appending to linear skb
net: ip6_gre: Fix ip6erspan hlen calculation
net: ip6_gre: Split up ip6gre_changelink()
net: ip6_gre: Split up ip6gre_newlink()
net: ip6_gre: Split up ip6gre_tnl_change()
net: ip6_gre: Split up ip6gre_tnl_link_config()
net: ip6_gre: Fix headroom request in ip6erspan_tunnel_xmit()
net: ip6_gre: Request headroom in __gre6_xmit()
selftests/bpf: check return value of fopen in test_verifier.c
erspan: fix invalid erspan version.
...
David S. Miller [Sun, 20 May 2018 23:04:24 +0000 (19:04 -0400)]
mv88e6xxx: Fix uninitialized variable warning.
In mv88e6xxx_probe(), ("np" or "pdata") might be an invariant
but GCC can't see that, therefore:
drivers/net/dsa/mv88e6xxx/chip.c: In function ‘mv88e6xxx_probe’:
drivers/net/dsa/mv88e6xxx/chip.c:4420:13: warning: ‘compat_info’ may be used uninitialized in this function [-Wmaybe-uninitialized]
chip->info = compat_info;
Actually, it should have warned on the "if (!compat_info)" test, but
whatever.
Explicitly initialize to NULL in the variable declaration to
deal with this.
Fixes: 877b7cb0b6f2 ("net: dsa: mv88e6xxx: Add minimal platform_data support")
Signed-off-by: David S. Miller <davem@davemloft.net>
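The shape of the problem can be sketched outside the kernel. Everything below is hypothetical (struct chip_info, pick_info() and the two static tables are made up); it only mirrors the pattern where a pointer is assigned on two separate branches and the compiler cannot prove that at least one of them runs.

```c
#include <stddef.h>

struct chip_info { int num_ports; };  /* hypothetical */

static const struct chip_info of_info = { 7 };      /* device-tree match */
static const struct chip_info pdata_info = { 11 };  /* platform-data match */

/* Return the info from whichever source is present, or NULL if none. */
static const struct chip_info *pick_info(int have_np, int have_pdata)
{
    /* Explicit NULL keeps -Wmaybe-uninitialized quiet and makes the
     * final check below well defined on every path. */
    const struct chip_info *compat_info = NULL;

    if (!have_np && !have_pdata)
        return NULL;

    if (have_np)
        compat_info = &of_info;

    if (have_pdata)
        compat_info = &pdata_info;

    return compat_info;
}
```

Here the early return already guarantees one branch is taken, yet that is exactly the kind of invariant a compiler may fail to see.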
Florian Fainelli [Sun, 20 May 2018 15:56:30 +0000 (08:56 -0700)]
net: dsa: b53: Extend platform data to include DSA ports
The b53 driver already defines and internally uses platform data to let the
glue drivers specify parameters such as the chip id. What we were missing was
a way to tell the core DSA layer about the ports and their type.
Place a dsa_chip_data structure at the beginning of b53_platform_data for
dsa_register_switch() to access it. This does not require modifications to
b53_common.c which will pass platform_data through.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
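Placing one structure at the beginning of another relies on a C guarantee: a pointer to a struct, suitably converted, points to its first member. The sketch below uses hypothetical types (struct chip_data stands in for dsa_chip_data, and its field is invented) to show how a generic layer can read its own data through an opaque platform_data pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the core layer's view of the data. */
struct chip_data {
    unsigned int enabled_ports;  /* invented field, for illustration */
};

/* Hypothetical driver-specific platform data. */
struct b53_like_pdata {
    struct chip_data cd;   /* must remain the first member */
    unsigned int chip_id;
};

/* Fail the build if someone moves cd away from offset 0. */
_Static_assert(offsetof(struct b53_like_pdata, cd) == 0,
               "chip_data must be the first member");

/* The core layer only knows about struct chip_data. */
static const struct chip_data *core_view(const void *platform_data)
{
    return platform_data;
}
```

The static assertion is a design choice of this sketch: it turns the layout requirement stated in the commit message into a compile-time check.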
David S. Miller [Sun, 20 May 2018 22:58:28 +0000 (18:58 -0400)]
Merge branch 'mv88exxx-pdata'
Andrew Lunn says:
====================
Platform data support for mv88exxx
There are a few Intel based platforms making use of the mv88exxx.
These don't easily have access to device tree in order to instantiate
the switch driver. These patches allow the use of platform data to
hold the configuration.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 19 May 2018 20:31:35 +0000 (22:31 +0200)]
net: dsa: mv88e6xxx: Add support for EEPROM via platform data
Add the size of the EEPROM to the platform data, so it can also be
instantiated by a platform device.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 19 May 2018 20:31:34 +0000 (22:31 +0200)]
net: dsa: mv88e6xxx: Add minimal platform_data support
Not all the world uses device tree. Some parts of the world still use
platform devices and platform data. Add basic support for probing a
Marvell switch via platform data.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 19 May 2018 20:31:33 +0000 (22:31 +0200)]
net: dsa: mv88e6xxx: Remove OF check for IRQ domain
An IRQ domain will work without an OF node. It is not possible to
reference interrupts via a phandle, but C code can still use
irq_find_mapping() to get an interrupt from the domain.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 20 May 2018 22:56:43 +0000 (18:56 -0400)]
Merge branch 'sh_eth-typos'
Sergei Shtylyov says:
====================
sh_eth: fix typos/grammar
Here's a set of 3 patches against DaveM's 'net-next.git' repo plus the R8A77980
support patches posted earlier. They fix the comments typos/grammar and another
typo in the EESR bit...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 19 May 2018 21:05:02 +0000 (00:05 +0300)]
sh_eth: fix typo in comment to BCULR write
Simon has noticed a typo in the comment accompanying the BCULR write --
fix it and move the comment before the write (following the style of
the other comments), while at it...
Reported-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 19 May 2018 21:03:42 +0000 (00:03 +0300)]
sh_eth: fix comment grammar in 'struct sh_eth_cpu_data'
All the verbs in the comments to the 'struct sh_eth_cpu_data' declaration
should be in the 3rd person singular, to match the nouns.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 19 May 2018 21:02:36 +0000 (00:02 +0300)]
sh_eth: fix typo in EESR.TRO bit name
The correct name of the EESR bit 8 is TRO (transmit retry over), not RTO.
Note that EESIPR bit 8, TROIP remained correct...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 20 May 2018 22:53:59 +0000 (18:53 -0400)]
Merge branch 'hns3-next'
Salil Mehta says:
====================
Misc. bug fixes and cleanup for HNS3 driver
This patch-set presents miscellaneous bug fixes and cleanups found
during internal review, system testing and cleanup.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Sat, 19 May 2018 15:53:23 +0000 (16:53 +0100)]
net: hns3: Fix for CMDQ and Misc. interrupt init order problem
When vf module is loading, the cmd queue initialization should
happen before misc interrupt initialization, otherwise the misc
interrupt handler may end up using an uninitialized cmd queue.
There is also the same issue when vf module is unloading.
This patch fixes it by adjusting the location of some function.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xi Wang [Sat, 19 May 2018 15:53:22 +0000 (16:53 +0100)]
net: hns3: Fixes kernel panic issue during rmmod hns3 driver
If CONFIG_ARM_SMMU_V3 is enabled, arm64's dma_ops will replace
arm64_swiotlb_dma_ops with iommu_dma_ops. When releasing contiguous
dma memory, the new ops will call the vunmap function which cannot
be run in interrupt context.
Currently, spin_lock_bh is called before vunmap is executed. This
disables BH, so vunmap detects interrupt context and triggers a
kernel panic like the one below:
[ 2831.573400] kernel BUG at mm/vmalloc.c:1621!
[ 2831.577659] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
...
[ 2831.699907] Process rmmod (pid: 1893, stack limit = 0x0000000055103ee2)
[ 2831.706507] Call trace:
[ 2831.708941] vunmap+0x48/0x50
[ 2831.711897] dma_common_free_remap+0x78/0x88
[ 2831.716155] __iommu_free_attrs+0xa8/0x1c0
[ 2831.720255] hclge_free_cmd_desc+0xc8/0x118 [hclge]
[ 2831.725128] hclge_destroy_cmd_queue+0x34/0x68 [hclge]
[ 2831.730261] hclge_uninit_ae_dev+0x90/0x100 [hclge]
[ 2831.735127] hnae3_unregister_ae_dev+0xb0/0x868 [hnae3]
[ 2831.740345] hns3_remove+0x3c/0x90 [hns3]
[ 2831.744344] pci_device_remove+0x48/0x108
[ 2831.748342] device_release_driver_internal+0x164/0x200
[ 2831.753553] driver_detach+0x4c/0x88
[ 2831.757116] bus_remove_driver+0x60/0xc0
[ 2831.761026] driver_unregister+0x34/0x60
[ 2831.764935] pci_unregister_driver+0x30/0xb0
[ 2831.769197] hns3_exit_module+0x10/0x978 [hns3]
[ 2831.773715] SyS_delete_module+0x1f8/0x248
[ 2831.777799] el0_svc_naked+0x30/0x34
This patch fixes it by using spin_lock instead of spin_lock_bh.
Fixes: 68c0a5c70614 ("net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fuyun Liang [Sat, 19 May 2018 15:53:21 +0000 (16:53 +0100)]
net: hns3: Fix for netdev not running problem after calling net_stop and net_open
The link status update function is called by a timer every second, but
net_stop and net_open may be called within a much shorter interval. In
that case the link status update function cannot detect that the link
state has changed, which leaves the netdev not running.
This patch fixes it by updating the link state in ae_stop function.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Sat, 19 May 2018 15:53:20 +0000 (16:53 +0100)]
net: hns3: Use enums instead of magic number in hclge_is_special_opcode
This patch does a bit of clean-up by using already defined enums for
certain values in function hclge_is_special_opcode(). The enums below
have been used as replacements for magic values:
enum hclge_opcode_type{
<snip>
HCLGE_OPC_STATS_64_BIT = 0x0030,
HCLGE_OPC_STATS_32_BIT = 0x0031,
HCLGE_OPC_STATS_MAC = 0x0032,
<snip>
};
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
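A lookup written against the enum quoted in the commit message might look like the sketch below. The three enumerator values come from the commit text; the function body and its name is_special_opcode() are assumptions made for illustration, not the driver's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values as quoted in the commit message. */
enum hclge_opcode_type {
    HCLGE_OPC_STATS_64_BIT = 0x0030,
    HCLGE_OPC_STATS_32_BIT = 0x0031,
    HCLGE_OPC_STATS_MAC    = 0x0032
};

/* Sketch: report whether opcode is one of the named special opcodes. */
static bool is_special_opcode(uint16_t opcode)
{
    static const uint16_t spec_opcode[] = {
        HCLGE_OPC_STATS_64_BIT,
        HCLGE_OPC_STATS_32_BIT,
        HCLGE_OPC_STATS_MAC
    };
    size_t i;

    for (i = 0; i < sizeof(spec_opcode) / sizeof(spec_opcode[0]); i++) {
        if (spec_opcode[i] == opcode)
            return true;
    }
    return false;
}
```

Compared with literal 0x0030 to 0x0032 in the table, the named constants keep the list in sync with the enum if values are ever renumbered.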
Xi Wang [Sat, 19 May 2018 15:53:19 +0000 (16:53 +0100)]
net: hns3: Fix for hns3 module is loaded multiple times problem
If the hns3 driver has been built into the kernel and the same driver,
built as a KLM, is then loaded, it may trigger an error like the one
below:
[ 20.009555] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
[ 20.016789] hns3: Copyright (c) 2017 Huawei Corporation.
[ 20.022100] Error: Driver 'hns3' is already registered, aborting...
[ 23.517397] Unable to handle kernel NULL pointer dereference at virtual address
00000000
...
[ 23.691583] Process insmod (pid: 1982, stack limit = 0x00000000cd5f21cb)
[ 23.698270] Call trace:
[ 23.700705] __list_del_entry_valid+0x2c/0xd8
[ 23.705049] hnae3_unregister_client+0x68/0xa8
[ 23.709487] hns3_init_module+0x98/0x1000 [hns3]
[ 23.714093] do_one_initcall+0x5c/0x170
[ 23.717918] do_init_module+0x64/0x1f4
[ 23.721654] load_module+0x1d14/0x24b0
[ 23.725390] SyS_init_module+0x158/0x208
[ 23.729300] el0_svc_naked+0x30/0x34
This patch fixes it by adding module version info.
Fixes: 38caee9d3ee8 ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xi Wang [Sat, 19 May 2018 15:53:18 +0000 (16:53 +0100)]
net: hns3: Fix the missing client list node initialization
This patch fixes the missing initialization of the client list node
in the hnae3_register_client() function.
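Why an uninitialized list node matters can be shown with a small userspace
sketch. The types and helpers below (struct list_node, list_node_init() and
friends) are stand-ins written for this example, not the kernel's list API;
the NULL check in list_node_del() is likewise an assumption that only makes
the failure observable without crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a circular doubly linked list node. */
struct list_node {
	struct list_node *next, *prev;
};

/* Make the node point at itself, i.e. an empty, self-consistent list. */
static void list_node_init(struct list_node *n)
{
	assert(n);
	n->next = n;
	n->prev = n;
}

/* Insert n right after head. */
static void list_node_add(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/*
 * Unlink n. Returns false instead of touching memory when the neighbour
 * pointers are NULL, which is what a zeroed, never-initialized node holds.
 */
static bool list_node_del(struct list_node *n)
{
	if (!n->next || !n->prev)
		return false;
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_node_init(n);
	return true;
}
```

Once the node is initialized at registration time, unlinking it is safe even
if it was never added to a list, because it only points at itself.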
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Sat, 19 May 2018 15:53:17 +0000 (16:53 +0100)]
net: hns3: cleanup of return values in hclge_init_client_instance()
Removes the goto and directly returns in case of errors as part of the
cleanup.
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peng Li [Sat, 19 May 2018 15:53:16 +0000 (16:53 +0100)]
net: hns3: Fixes API to fetch ethernet header length with kernel default
On the RX path the driver needs to fetch the Ethernet header length
from the received Buffer Descriptor. Currently the proprietary
hns3_nic_get_headlen is used for this. It relies on the l234info
present in the Buffer Descriptor, which might not be valid for the
first Buffer Descriptor if the packet spans multiple descriptors.
The kernel default eth_get_headlen API does the job correctly.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Salil Mehta [Sat, 19 May 2018 15:53:15 +0000 (16:53 +0100)]
net: hns3: Fixes error reported by Kbuild and internal review
This patch fixes the error reported by Intel's kbuild and fixes a
return value in one of the legs, caught during review of the original
patch sent by kbuild.
Fixes: fdb793670a00 ("net: hns3: Add support of .sriov_configure in HNS3 driver")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sat, 19 May 2018 08:29:33 +0000 (10:29 +0200)]
r8169: fix network error on resume from suspend
Commit 82d3ff6dd199 ("r8169: remove calls to rtl_set_rx_mode") removed
calls to rtl_set_rx_mode(). This is ok for the standard path if the link
is brought up, however it breaks system resume from suspend: the link
comes up but there is no network traffic.
Meanwhile common code from rtl_hw_start_8169/8101/8168() was moved
to rtl_hw_start(), therefore re-add the call to rtl_set_rx_mode()
there.
Due to adding this call we have to move definition of rtl_hw_start()
after definition of rtl_set_rx_mode().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Fixes: 82d3ff6dd199 ("r8169: remove calls to rtl_set_rx_mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
William Tu [Sat, 19 May 2018 02:41:01 +0000 (19:41 -0700)]
erspan: set bso bit based on mirrored packet's len
Before the patch, the erspan BSO bit (Bad/Short/Oversized) is not
handled. BSO has 4 possible values:
00 --> Good frame with no error, or unknown integrity
11 --> Payload is a Bad Frame with CRC or Alignment Error
01 --> Payload is a Short Frame
10 --> Payload is an Oversized Frame
Based on the short/oversized definitions in RFC1757, the patch sets
the bso bit based on the mirrored packet's size.
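A minimal sketch of such a length-based classification follows. The enum
labels, the helper name bso_from_len_sketch() and the 64/1518 octet
thresholds are assumptions for this example (the thresholds follow the usual
reading of the RFC 1757 undersize/oversize wording); the two-bit encodings are
the ones listed above. The "bad frame" value is never returned here because
packet length alone cannot reveal a CRC or alignment error.

```c
#include <assert.h>
#include <stddef.h>

/* BSO encodings as listed in the commit message. */
enum erspan_bso_sketch {
	BSO_NOERROR = 0x0, /* 00: good frame, or unknown integrity */
	BSO_SHORT   = 0x1, /* 01: short frame */
	BSO_LONG    = 0x2, /* 10: oversized frame */
	BSO_BAD     = 0x3, /* 11: CRC or alignment error */
};

/* Assumed thresholds in octets; not taken from the patch itself. */
#define SKETCH_MIN_FRAME_LEN 64
#define SKETCH_MAX_FRAME_LEN 1518

/* Classify a mirrored frame purely by its length. */
static enum erspan_bso_sketch bso_from_len_sketch(size_t frame_len)
{
	if (frame_len < SKETCH_MIN_FRAME_LEN)
		return BSO_SHORT;
	if (frame_len > SKETCH_MAX_FRAME_LEN)
		return BSO_LONG;
	return BSO_NOERROR;
}
```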
Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 20 May 2018 22:31:38 +0000 (15:31 -0700)]
Linux 4.17-rc6
David S. Miller [Sun, 20 May 2018 22:24:22 +0000 (18:24 -0400)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
pull request: bluetooth-next 2018-05-18
Here's the first bluetooth-next pull request for the 4.18 kernel:
- Refactoring of the btbcm driver
- New USB IDs for QCA_ROME and LiteOn controllers
- Buffer overflow fix if the controller sends invalid advertising data length
- Various cleanups & fixes for Qualcomm controllers
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher [Fri, 18 May 2018 18:58:30 +0000 (11:58 -0700)]
Revert "ixgbe: release lock for the duration of ixgbe_suspend_close()"
This reverts commit 6710f970d9979d8f03f6e292bb729b2ee1526d0e.
Gotta love when developers have offline discussions, thinking everyone
is reading their responses/dialog.
The change had the potential for a number of race conditions on
shutdown, which is why we are reverting the change.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hemanth Puranik [Fri, 18 May 2018 03:29:29 +0000 (08:59 +0530)]
net: qcom/emac: Allocate buffers from local node
Currently we use non-NUMA-aware allocation for TPD and RRD buffers;
this patch switches to NUMA-friendly allocation.
Signed-off-by: Hemanth Puranik <hpuranik@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 20 May 2018 19:44:07 +0000 (12:44 -0700)]
Merge branch 'parisc-4.17-5' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixlets from Helge Deller:
"Three small section mismatch fixes, one of them was found by 0-day
test infrastructure"
* 'parisc-4.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Move ccio_cujo20_fixup() into init section
parisc: Move setup_profiling_timer() out of init section
parisc: Move find_pa_parent_type() out of init section
Linus Torvalds [Sun, 20 May 2018 19:04:27 +0000 (12:04 -0700)]
Merge tag 'for-4.17-rc5-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"We've accumulated some fixes during the last week, some of them were
in the works for a longer time but there are some newer ones too.
Most of the fixes have a reproducer and fix user visible problems,
also candidates for stable kernels. They IMHO qualify for a late rc,
though I did not expect that many"
* tag 'for-4.17-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix crash when trying to resume balance without the resume flag
btrfs: Fix delalloc inodes invalidation during transaction abort
btrfs: Split btrfs_del_delalloc_inode into 2 functions
btrfs: fix reading stale metadata blocks after degraded raid1 mounts
btrfs: property: Set incompat flag if lzo/zstd compression is set
Btrfs: fix duplicate extents after fsync of file with prealloc extents
Btrfs: fix xattr loss after power failure
Btrfs: send, fix invalid access to commit roots due to concurrent snapshotting
Linus Torvalds [Sun, 20 May 2018 18:50:27 +0000 (11:50 -0700)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
- Łukasz Stelmach spotted a couple of issues with the decompressor.
- a couple of kdump fixes found while testing kdump
- replace some perl with shell code
- resolve SIGFPE breakage
- kprobes fixes
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: fix kill( ,SIGFPE) breakage
ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
ARM: replace unnecessary perl with sed and the shell $(( )) operator
ARM: kexec: record parent context registers for non-crash CPUs
ARM: kexec: fix kdump register saving on panic()
ARM: 8758/1: decompressor: restore r1 and r2 just before jumping to the kernel
ARM: 8753/1: decompressor: add a missing parameter to the addruart macro
Linus Torvalds [Sun, 20 May 2018 18:28:32 +0000 (11:28 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"An unfortunately larger set of fixes, but a large portion is
selftests:
- Fix the missing clusterid initialization for x2apic cluster
management which caused boot failures due to IPIs being sent to the
wrong cluster
- Drop TS_COMPAT when a 64bit executable is exec()'ed from a compat
task
- Wrap access to __supported_pte_mask in __startup_64() where clang
compile fails due to a non PC relative access being generated.
- Two fixes for 5 level paging fallout in the decompressor:
- Handle GOT correctly for paging_prepare() and
cleanup_trampoline()
- Fix the page table handling in cleanup_trampoline() to avoid
page table corruption.
- Stop special casing protection key 0 as this is inconsistent with
the manpage and also inconsistent with the allocation map handling.
- Override the protection key when moving away from PROT_EXEC to
prevent inaccessible memory.
- Fix and update the protection key selftests to address breakage and
to cover the above issue
- Add a MOV SS self test"
[ Part of the x86 fixes were in the earlier core pull due to dependencies ]
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
x86/apic/x2apic: Initialize cluster ID properly
x86/boot/compressed/64: Fix moving page table out of trampoline memory
x86/boot/compressed/64: Set up GOT for paging_prepare() and cleanup_trampoline()
x86/pkeys: Do not special case protection key 0
x86/pkeys/selftests: Add a test for pkey 0
x86/pkeys/selftests: Save off 'prot' for allocations
x86/pkeys/selftests: Fix pointer math
x86/pkeys: Override pkey when moving away from PROT_EXEC
x86/pkeys/selftests: Fix pkey exhaustion test off-by-one
x86/pkeys/selftests: Add PROT_EXEC test
x86/pkeys/selftests: Factor out "instruction page"
x86/pkeys/selftests: Allow faults on unknown keys
x86/pkeys/selftests: Avoid printf-in-signal deadlocks
x86/pkeys/selftests: Remove dead debugging code, fix dprint_in_signal
x86/pkeys/selftests: Stop using assert()
x86/pkeys/selftests: Give better unexpected fault error messages
x86/selftests: Add mov_to_ss test
x86/mpx/selftests: Adjust the self-test to fresh distros that export the MPX ABI
x86/pkeys/selftests: Adjust the self-test to fresh distros that export the pkeys ABI
...
Linus Torvalds [Sun, 20 May 2018 18:25:54 +0000 (11:25 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull UP timer fix from Thomas Gleixner:
"Work around the for_each_cpu() oddity on UP kernels in the tick
broadcast code which causes boot failures because the CPU0 bit is
always reported as set independent of the cpumask content"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/broadcast: Use for_each_cpu() specially on UP kernels
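The oddity described in the pull message can be modelled in a few lines. The
macro and helper names below are made up for this sketch; the macro mimics an
iterator that evaluates its mask argument but never consults it, which is the
behaviour the message attributes to for_each_cpu() on UP kernels.

```c
#include <assert.h>

/*
 * Sketch of a uniprocessor-style iterator: the mask argument is evaluated
 * but never consulted, so the body always runs once with cpu == 0.
 */
#define for_each_cpu_up_sketch(cpu, mask) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask))

/* Count how many CPUs the UP-style iterator visits for a given mask. */
static int visited_cpus_up(unsigned long mask)
{
	int cpu, n = 0;

	for_each_cpu_up_sketch(cpu, mask)
		n++;
	return n;
}

/* Mask-honouring variant for a single possible CPU: test bit 0 explicitly. */
static int visited_cpus_checked(unsigned long mask)
{
	return (mask & 1UL) ? 1 : 0;
}
```

For an empty mask the first helper still reports one visited CPU, while the
second reports none; code that relies on an empty mask meaning "nothing to do"
therefore needs the explicit check.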
Linus Torvalds [Sun, 20 May 2018 18:23:34 +0000 (11:23 -0700)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixlets from Thomas Gleixner:
"Three trivial fixlets for the scheduler:
- move print_rt_rq() and print_dl_rq() declarations to the right
place
- make grub_reclaim() static
- fix the bogus documentation reference in Kconfig"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix documentation file path
sched/deadline: Make the grub_reclaim() function static
sched/debug: Move the print_rt_rq() and print_dl_rq() declarations to kernel/sched/sched.h
Linus Torvalds [Sun, 20 May 2018 18:20:40 +0000 (11:20 -0700)]
Merge branch 'ras-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RAS fix from Thomas Gleixner:
"Fix a regression in the new AMD SMCA code which issues an SMP function
call from the early interrupt disabled region of CPU hotplug. To avoid
that, use cached block addresses which can be used directly"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/MCE/AMD: Cache SMCA MISC block addresses
Linus Torvalds [Sun, 20 May 2018 18:18:42 +0000 (11:18 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:
- fix segfault when processing unknown threads in cs-etm
- fix "perf test inet_pton" on s390 failing due to missing inline
- display all available events on 'perf annotate --stdio'
- add missing newline when parsing an empty BPF program
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Add missing newline when parsing empty BPF proggie
perf cs-etm: Remove redundant space
perf cs-etm: Support unknown_thread in cs_etm_auxtrace
perf annotate: Display all available events on --stdio
perf test: "probe libc's inet_pton" fails on s390 due to missing inline
Linus Torvalds [Sun, 20 May 2018 17:43:27 +0000 (10:43 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"Two fixes to address shortcomings of the rwsem/percpu-rwsem lock
debugging code which emits false positive warnings when the rwsem is
anonymously locked and unlocked"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/percpu-rwsem: Annotate rwsem ownership transfer by setting RWSEM_OWNER_UNKNOWN
locking/rwsem: Add a new RWSEM_ANONYMOUSLY_OWNED flag
Linus Torvalds [Sun, 20 May 2018 17:36:52 +0000 (10:36 -0700)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Thomas Gleixner:
- Use explicitly sized type for the romimage pointer in the 32bit EFI
protocol struct so a 64bit kernel does not expand it to 64bit. Ditto
for the 64bit struct to avoid the reverse issue on 32bit kernels.
- Handle randomized text offset correctly in the ARM64 EFI stub to avoid
unaligned data resulting in stack corruption and other hard to
diagnose wreckage.
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub/arm64: Handle randomized TEXT_OFFSET
efi: Avoid potential crashes, fix the 'struct efi_pci_io_protocol_32' definition for mixed mode
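The first item can be illustrated with a sketch of how a native pointer
changes a structure layout depending on the build. Both structures below are
hypothetical and heavily trimmed; they are not the real EFI protocol
definitions, and only the field name romimage is borrowed from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit firmware table declared with a native pointer. */
struct proto32_native_ptr {
	uint32_t attributes;
	void *romimage;     /* 8 bytes on an LP64 build, 4 on a 32-bit one */
	uint32_t trailer;
};

/* Same table with an explicitly sized field in place of the pointer. */
struct proto32_fixed {
	uint32_t attributes;
	uint32_t romimage;  /* always 4 bytes, matching 32-bit firmware */
	uint32_t trailer;
};

/* Offset of the field that follows romimage in the fixed-width layout. */
static size_t fixed_trailer_offset(void)
{
	return offsetof(struct proto32_fixed, trailer);
}

/* Same offset when romimage takes the width of the build's own pointers. */
static size_t native_trailer_offset(void)
{
	return offsetof(struct proto32_native_ptr, trailer);
}
```

On a 64-bit build the native-pointer variant moves every later field, so the
kernel would read them from the wrong offsets in a table laid out by 32-bit
firmware.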
Linus Torvalds [Sun, 20 May 2018 17:01:38 +0000 (10:01 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
- Unbreak the BPF compilation which got broken by the unconditional
requirement of asm-goto, which is not supported by clang.
- Prevent probing on exception masking instructions in uprobes and
kprobes to avoid the issues of the delayed exceptions instead of
having an ugly workaround.
- Prevent a double free_page() in the error path of do_kexec_load()
- A set of objtool updates addressing various issues mostly related to
switch tables and the noreturn detection for recursive sibling calls
- Header sync for tools.
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Detect RIP-relative switch table references, part 2
objtool: Detect RIP-relative switch table references
objtool: Support GCC 8 switch tables
objtool: Support GCC 8's cold subfunctions
objtool: Fix "noreturn" detection for recursive sibling calls
objtool, kprobes/x86: Sync the latest <asm/insn.h> header with tools/objtool/arch/x86/include/asm/insn.h
x86/cpufeature: Guard asm_volatile_goto usage for BPF compilation
uprobes/x86: Prohibit probing on MOV SS instruction
kprobes/x86: Prohibit probing on exception masking instructions
x86/kexec: Avoid double free_page() upon do_kexec_load() failure
William Tu [Sat, 19 May 2018 02:22:28 +0000 (19:22 -0700)]
net: ip6_gre: fix tunnel metadata device sharing.
Currently ip6gre and ip6erspan share single metadata mode device,
using 'collect_md_tun'. Thus, when doing:
ip link add dev ip6gre11 type ip6gretap external
ip link add dev ip6erspan12 type ip6erspan external
RTNETLINK answers: File exists
the second command simply fails because it tries to create the same
collect_md_tun.
The patch fixes it by adding a separate collect md tunnel device
for the ip6erspan, 'collect_md_tun_erspan'. As a result, a couple
of places need to be refactored/split up in order to distinguish ip6gre
and ip6erspan.
First, move the collect_md check at ip6gre_tunnel_{unlink,link} and
create separate functions {ip6gre,ip6erspan}_tunnel_{link_md,unlink_md}.
Then before link/unlink, make sure the link_md/unlink_md is called.
Finally, a separate ndo_uninit is created for ip6erspan. Tested it
using the samples/bpf/test_tunnel_bpf.sh.
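The idea of one metadata-mode slot per tunnel family can be sketched as
follows. All type and function names here are hypothetical, and returning
-EEXIST from the link helper is an assumption chosen to mirror the "File
exists" error quoted above; only the slot names collect_md_tun and
collect_md_tun_erspan come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct tunnel_sketch { const char *name; };

/* Per-namespace state: one metadata-mode slot per tunnel family. */
struct ip6gre_net_sketch {
	struct tunnel_sketch *collect_md_tun;        /* ip6gre / ip6gretap */
	struct tunnel_sketch *collect_md_tun_erspan; /* ip6erspan */
};

/* Claim a slot; fail with -EEXIST if that family already has a device. */
static int link_md_slot(struct tunnel_sketch **slot, struct tunnel_sketch *t)
{
	assert(slot && t);
	if (*slot)
		return -EEXIST;
	*slot = t;
	return 0;
}

static int ip6gre_link_md_sketch(struct ip6gre_net_sketch *n,
				 struct tunnel_sketch *t)
{
	return link_md_slot(&n->collect_md_tun, t);
}

static int ip6erspan_link_md_sketch(struct ip6gre_net_sketch *n,
				    struct tunnel_sketch *t)
{
	return link_md_slot(&n->collect_md_tun_erspan, t);
}
```

With separate slots, one ip6gretap and one ip6erspan metadata device can
coexist, while a second device of the same family is still refused.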
Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 20 May 2018 03:24:47 +0000 (23:24 -0400)]
Merge branch 'sh_eth-R8A77980-GEther-support'
Sergei Shtylyov says:
====================
Add Renesas R8A77980 GEther support
Here's a set of 3 patches against DaveM's 'net-next.git' repo. They (gradually)
add R8A77980 GEther support to the 'sh_eth' driver, starting with couple new
register bits/values introduced with this chip, and ending with adding a new
'struct sh_eth_cpu_data' instance connected to the new DT "compatible" prop
value...
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 18 May 2018 18:32:46 +0000 (21:32 +0300)]
sh_eth: add R8A77980 support
Finally, add support for the DT probing of the R-Car V3H (AKA R8A77980) --
it's the only R-Car gen3 SoC having the GEther controller -- others have
only EtherAVB...
Based on the original (and large) patch by Vladimir Barinov.
Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 18 May 2018 18:31:28 +0000 (21:31 +0300)]
sh_eth: add EDMR.NBST support
The R-Car V3H (AKA R8A77980) GEther controller adds the DMA burst mode bit
(NBST) in EDMR and the manual says to always set it before doing any DMA.
Based on the original (and large) patch by Vladimir Barinov.
Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 18 May 2018 18:30:18 +0000 (21:30 +0300)]
sh_eth: add RGMII support
The R-Car V3H (AKA R8A77980) GEther controller adds support for the RGMII
PHY interface mode as a new value for the RMII_MII register.
Based on the original (and large) patch by Vladimir Barinov.
Signed-off-by: Vladimir Barinov <vladimir.barinov@cogentembedded.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 20 May 2018 02:56:15 +0000 (19:56 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A handful of fixes. I've been queuing them up a bit too long so the
list is longer than it otherwise would have been spread out across a
few -rcs.
In general, it's a scattering of fixes across several platforms,
nothing truly serious enough to point out.
There's a slightly larger batch of them for the Davinci platforms due
to work to bring them back to life after some time, so there's a
handful of regressions, some of them going back very far, others more
recent.
There's also a few patches fixing DT on Renesas platforms since they
changed some bindings without remaining backwards compatible,
splitting up describing LVDS as a proper bridge instead of having it
as part of the display unit.
We could push for them to be backwards compatible with old device
trees, but it's likely to regress eventually if nobody's actually
using said compatibility"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (36 commits)
ARM: davinci: board-dm646x-evm: set VPIF capture card name
ARM: davinci: board-dm646x-evm: pass correct I2C adapter id for VPIF
ARM: davinci: dm646x: fix timer interrupt generation
ARM: keystone: fix platform_domain_notifier array overrun
arm64: dts: exynos: Fix interrupt type for I2S1 device on Exynos5433
ARM: dts: imx51-zii-rdu1: fix touchscreen bindings
firmware: arm_scmi: Use after free in scmi_create_protocol_device()
ARM: dts: cygnus: fix irq type for arm global timer
Revert "ARM: dts: logicpd-som-lv: Fix pinmux controller references"
tee: check shm references are consistent in offset/size
tee: shm: fix use-after-free via temporarily dropped reference
ARM: dts: imx7s: Pass the 'fsl,sec-era' property
ARM: dts: tegra20: Revert "Fix ULPI regression on Tegra20"
ARM: dts: correct missing "compatible" entry for ti81xx SoCs
ARM: OMAP1: ams-delta: fix deferred_fiq handler
arm64: tegra: Make BCM89610 PHY interrupt as active low
ARM: davinci: fix GPIO lookup for I2C
ARM: dts: logicpd-som-lv: Fix pinmux controller references
ARM: dts: logicpd-som-lv: Fix Audio Mute
ARM: dts: logicpd-som-lv: Fix WL127x Startup Issues
...
Maxime Chevallier [Fri, 18 May 2018 07:33:39 +0000 (09:33 +0200)]
net: mvpp2: Add missing VLAN tag detection
Marvell PPv2 Header Parser sets some bits in the 'result_info' field in
each lookup iteration, to identify different packet attributes such as
DSA / VLAN tag, protocol infos, etc. This is used in further
classification stages in the controller.
It's the DSA tag detection entry that is in charge of detecting when there
is a single VLAN tag.
This commit adds the missing update of the result_info in this case.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Olof Johansson [Sun, 20 May 2018 00:58:32 +0000 (17:58 -0700)]
Merge tag 'tegra-for-4.17-fixes-2' of git://git./linux/kernel/git/tegra/linux into fixes
arm64: tegra: Device tree fixes for v4.17
This contains a one-line update to the device tree of the Tegra186 P3310
processor module, fixing the polarity of the PHY interrupt. Originally,
this was queued to go into v4.18, but the PHY ID matching patch has now
found its way into v4.17-rc5, which means that the PHY driver will know
how to identify the PHY on this board and try to use the interrupt. This
will unfortunately cause networking to break on P3310, hence why I think
this should go into v4.17.
* tag 'tegra-for-4.17-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
arm64: tegra: Make BCM89610 PHY interrupt as active low
Signed-off-by: Olof Johansson <olof@lixom.net>
David S. Miller [Sat, 19 May 2018 20:30:39 +0000 (16:30 -0400)]
Merge branch 'devlink-port-flavours-and-phys_port_name'
Jiri Pirko says:
====================
devlink: introduce port flavours and common phys_port_name generation
This patchset resolves 2 issues we have right now:
1) There are many netdevices / ports in the system, for port, pf, vf
representation but the user has no way to see which is which
2) The ndo_get_phys_port_name is implemented in each driver separately,
which may lead to inconsistent names between drivers.
This patchset introduces port flavours which should address the first
problem. In this initial patchset, I focus on DSA and their port
flavours. As a follow-up, I plan to add PF and VF representor flavours.
However, that needs additional dependencies in drivers (nfp, mlx5).
The common phys_port_name generation is used by mlxsw. An example output
for mlxsw looks like this:
...
pci/0000:03:00.0/59: type eth netdev enp3s0np4 flavour physical number 4
pci/0000:03:00.0/61: type eth netdev enp3s0np1 flavour physical number 1
pci/0000:03:00.0/63: type eth netdev enp3s0np2 flavour physical number 2
pci/0000:03:00.0/49: type eth netdev enp3s0np8s0 flavour physical number 8 split_group 8 subport 0
pci/0000:03:00.0/50: type eth netdev enp3s0np8s1 flavour physical number 8 split_group 8 subport 1
pci/0000:03:00.0/51: type eth netdev enp3s0np8s2 flavour physical number 8 split_group 8 subport 2
pci/0000:03:00.0/52: type eth netdev enp3s0np8s3 flavour physical number 8 split_group 8 subport 3
As you can see, the netdev names are generated according to the flavour
and port number. In case the port is split, the split subnumber is also
included.
An example output for dsa_loop testing module looks like this:
mdio_bus/fixed-0:1f/0: type eth netdev lan1 flavour physical number 0
mdio_bus/fixed-0:1f/1: type eth netdev lan2 flavour physical number 1
mdio_bus/fixed-0:1f/2: type eth netdev lan3 flavour physical number 2
mdio_bus/fixed-0:1f/3: type eth netdev lan4 flavour physical number 3
mdio_bus/fixed-0:1f/4: type notset
mdio_bus/fixed-0:1f/5: type notset flavour cpu number 5
mdio_bus/fixed-0:1f/6: type notset
mdio_bus/fixed-0:1f/7: type notset
mdio_bus/fixed-0:1f/8: type notset
mdio_bus/fixed-0:1f/9: type notset
mdio_bus/fixed-0:1f/10: type notset
mdio_bus/fixed-0:1f/11: type notset
---
RFC->v1:
-removed nfp patches, removed DSA patch that used name generation helper
-patch 1:
- Reduced the nfp change just to simply use newly created attr_set func
-patch 2:
- rebased
- removed pf/vf reps flavours
-patch 3:
- rebased
-patch 4:
- added missing break pointed out by Andrew
====================
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 18 May 2018 07:29:04 +0000 (09:29 +0200)]
mlxsw: use devlink helper to generate physical port name
Since devlink knows the info needed to generate the physical port name
in a generic way for all devlink users, use the helper to do the job.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 18 May 2018 07:29:03 +0000 (09:29 +0200)]
dsa: set devlink port attrs for dsa ports
Set the attrs so that the port flavour can be exposed to the user via devlink.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 18 May 2018 07:29:02 +0000 (09:29 +0200)]
devlink: introduce a helper to generate physical port names
Each driver implements physical port name generation by itself. However
as devlink has all needed info, it can easily do the job for all its
users. So implement this helper in devlink.
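A minimal sketch of what such a generic helper could look like is given
below. The struct and function names are hypothetical and the "p<N>" /
"p<N>s<M>" format is inferred from the netdev names in the cover letter output
(for example enp3s0np4 and enp3s0np8s1); it is not copied from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, trimmed-down port attributes for this sketch. */
struct port_attrs_sketch {
	unsigned int port_number;
	bool split;
	unsigned int split_subport_number;
};

/*
 * Write "p<N>" or, for a split port, "p<N>s<M>" into name.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int phys_port_name_sketch(const struct port_attrs_sketch *a,
				 char *name, size_t len)
{
	int n;

	assert(a && name && len > 0);

	if (a->split)
		n = snprintf(name, len, "p%us%u", a->port_number,
			     a->split_subport_number);
	else
		n = snprintf(name, len, "p%u", a->port_number);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Keeping the format in one place means every driver that fills in the port
attributes gets the same naming scheme.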
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 18 May 2018 07:29:01 +0000 (09:29 +0200)]
devlink: extend attrs_set for setting port flavours
Devlink ports can have a specific flavour according to their purpose.
This patch extends attrs_set so the driver can say which flavour a port
has. Initial flavours are:
physical, cpu, dsa
The user can query this to see right away what the purpose of each port is.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 18 May 2018 07:29:00 +0000 (09:29 +0200)]
devlink: introduce devlink_port_attrs_set
Change the existing setter for split port information into a more generic
attrs setter. Alongside that, allow setting the port number and subport
number for split ports.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Thu, 12 Apr 2018 23:22:47 +0000 (00:22 +0100)]
ARM: fix kill( ,SIGFPE) breakage
Commit 7771c6645700 ("signal/arm: Document conflicts with SI_USER and
SIGFPE") broke the siginfo structure for userspace triggered signals,
causing the strace testsuite to regress. Fix this by eliminating
the FPE_FIXME definition (which is at the root of the breakage) and
use FPE_FLTINV instead for the case where the hardware appears to be
reporting nonsense.
Fixes: 7771c6645700 ("signal/arm: Document conflicts with SI_USER and SIGFPE")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Linus Torvalds [Sat, 19 May 2018 16:54:02 +0000 (09:54 -0700)]
Merge tag 'dmaengine-fix-4.17-rc6' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fix from Vinod Koul:
- qcom bam runtime_pm fix
- email update for Vinod
* tag 'dmaengine-fix-4.17-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: qcom: bam_dma: check if the runtime pm enabled
dmaengine: Update email address for Vinod
Linus Torvalds [Sat, 19 May 2018 16:29:11 +0000 (09:29 -0700)]
mmap: relax file size limit for regular files
Commit be83bbf80682 ("mmap: introduce sane default mmap limits") was
introduced to catch problems in various ad-hoc character device drivers
doing mmap and getting the size limits wrong. In the process, it used
"known good" limits for the normal cases of mapping regular files and
block device drivers.
It turns out that the "s_maxbytes" limit was less "known good" than I
thought. In particular, /proc doesn't set it, but exposes one regular
file to mmap: /proc/vmcore. As a result, that file got limited to the
default MAX_INT s_maxbytes value.
This went unnoticed for a while, because apparently the only thing that
needs it is the s390 kernel zfcpdump, but there might be other tools
that use this too.
Vasily suggested just changing s_maxbytes for all of /proc, which isn't
wrong, but makes me nervous at this stage. So instead, just make the
new mmap limit always be MAX_LFS_FILESIZE for regular files, which won't
affect anything else. It wasn't the regular file case I was worried
about.
I'd really prefer for maxsize to have been per-inode, but that is not
how things are today.
Fixes: be83bbf80682 ("mmap: introduce sane default mmap limits")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Borislav Petkov [Thu, 17 May 2018 08:46:26 +0000 (10:46 +0200)]
x86/MCE/AMD: Cache SMCA MISC block addresses
... into a global, two-dimensional array and service subsequent reads from
that cache to avoid rdmsr_on_cpu() calls during CPU hotplug (IPIs with IRQs
disabled).
In addition, this fixes a KASAN slab-out-of-bounds read due to wrong usage
of the bank->blocks pointer.
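The caching scheme can be sketched as a small lookup table. Everything below
is hypothetical: the array dimensions, the fake address formula and the helper
names are chosen for the example, and slow_read_block_addr() merely stands in
for an expensive cross-CPU register read.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_BANKS  4
#define SKETCH_NR_BLOCKS 3

/* Cached block addresses, indexed [bank][block]; 0 means "not cached yet". */
static uint32_t block_addr_cache[SKETCH_NR_BANKS][SKETCH_NR_BLOCKS];

/* Counts calls to the slow path so the caching effect is observable. */
static unsigned int slow_reads;

/* Hypothetical stand-in for an expensive cross-CPU register read. */
static uint32_t slow_read_block_addr(unsigned int bank, unsigned int block)
{
	slow_reads++;
	return 0xc0002000u + bank * 0x10u + block;
}

/* Return the address from the cache, filling the slot on first use. */
static uint32_t get_block_addr_sketch(unsigned int bank, unsigned int block)
{
	/* Bounds check: indices outside the table must never be used. */
	assert(bank < SKETCH_NR_BANKS && block < SKETCH_NR_BLOCKS);

	if (!block_addr_cache[bank][block])
		block_addr_cache[bank][block] = slow_read_block_addr(bank, block);
	return block_addr_cache[bank][block];
}
```

After the first lookup for a given bank and block, later reads are served from
the table and the slow path is not taken again.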
Fixes: 27bd59502702 ("x86/mce/AMD: Get address from already initialized block")
Reported-by: Johannes Hirte <johannes.hirte@datenkhaos.de>
Tested-by: Johannes Hirte <johannes.hirte@datenkhaos.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Link: http://lkml.kernel.org/r/20180414004230.GA2033@probook
Masami Hiramatsu [Sun, 13 May 2018 04:04:29 +0000 (05:04 +0100)]
ARM: 8772/1: kprobes: Prohibit kprobes on get_user functions
Since do_undefinstr() uses get_user to get the undefined
instruction, it can be called before kprobes performs its
recursion check. This can cause an infinite recursive
exception.
Prohibit probing on get_user functions.
Fixes: 24ba613c9d6c ("ARM kprobes: core code")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Masami Hiramatsu [Sun, 13 May 2018 04:04:16 +0000 (05:04 +0100)]
ARM: 8771/1: kprobes: Prohibit kprobes on do_undefinstr
Prohibit kprobes on do_undefinstr because kprobes on
arm is implemented with an undefined instruction. This means
that if we probe do_undefinstr(), it can cause an infinite
recursive exception.
Fixes: 24ba613c9d6c ("ARM kprobes: core code")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Masami Hiramatsu [Sun, 13 May 2018 04:04:10 +0000 (05:04 +0100)]
ARM: 8770/1: kprobes: Prohibit probing on optimized_callback
Prohibit probing on optimized_callback() because
it is called from kprobes itself. If we put a kprobe
on it, that will cause a recursive call loop.
Mark it NOKPROBE_SYMBOL.
Fixes: 0dc016dbd820 ("ARM: kprobes: enable OPTPROBES for ARM 32")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Masami Hiramatsu [Sun, 13 May 2018 04:03:54 +0000 (05:03 +0100)]
ARM: 8769/1: kprobes: Fix to use get_kprobe_ctlblk after irq-disabed
Since get_kprobe_ctlblk() uses smp_processor_id() to access
per-cpu variable, it hits smp_processor_id sanity check as below.
[ 7.006928] BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
[ 7.007859] caller is debug_smp_processor_id+0x20/0x24
[ 7.008438] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1-00192-g4eb17253e4b5 #1
[ 7.008890] Hardware name: Generic DT based system
[ 7.009917] [<c0313f0c>] (unwind_backtrace) from [<c030e6d8>] (show_stack+0x20/0x24)
[ 7.010473] [<c030e6d8>] (show_stack) from [<c0c64694>] (dump_stack+0x84/0x98)
[ 7.010990] [<c0c64694>] (dump_stack) from [<c071ca5c>] (check_preemption_disabled+0x138/0x13c)
[ 7.011592] [<c071ca5c>] (check_preemption_disabled) from [<c071ca80>] (debug_smp_processor_id+0x20/0x24)
[ 7.012214] [<c071ca80>] (debug_smp_processor_id) from [<c03335e0>] (optimized_callback+0x2c/0xe4)
[ 7.013077] [<c03335e0>] (optimized_callback) from [<bf0021b0>] (0xbf0021b0)
To fix this issue, call get_kprobe_ctlblk() right after
IRQs are disabled, since that also disables preemption.
Fixes: 0dc016dbd820 ("ARM: kprobes: enable OPTPROBES for ARM 32")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Mon, 16 Apr 2018 12:21:54 +0000 (13:21 +0100)]
ARM: replace unnecessary perl with sed and the shell $(( )) operator
You can build a kernel in a cross compiling environment that doesn't
have perl in the $PATH. Commit 429f7a062e3b broke that for 32 bit
ARM. Fix it.
As reported by Stephen Rothwell, it appears that the symbols can be
either part of the BSS section or absolute symbols depending on the
binutils version. When they're an absolute symbol, the $(( ))
operator errors out and the build fails. Fix this as well.
Fixes: 429f7a062e3b ("ARM: decompressor: fix BSS size calculation")
Reported-by: Rob Landley <rob@landley.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 11 Apr 2018 18:35:19 +0000 (19:35 +0100)]
ARM: kexec: record parent context registers for non-crash CPUs
How we got to machine_crash_nonpanic_core() (iow, from an IPI, etc) is
not interesting for debugging a crash. The more interesting context
is the parent context prior to the IPI being received.
Record the parent context register state rather than the register state
in machine_crash_nonpanic_core(), which is more relevant to the failing
condition.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Wed, 11 Apr 2018 17:24:01 +0000 (18:24 +0100)]
ARM: kexec: fix kdump register saving on panic()
When a panic() occurs, the kexec code uses smp_send_stop() to stop
the other CPUs, but this results in the CPU register state not being
saved, and gdb is unable to inspect the state of other CPUs.
Commit 0ee59413c967 ("x86/panic: replace smp_send_stop() with kdump
friendly version in panic path") addressed the issue on x86, but
ignored other architectures. Address the issue on ARM by splitting
out the crash stop implementation to crash_smp_send_stop() and
adding the necessary protection.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Łukasz Stelmach [Wed, 4 Apr 2018 07:46:58 +0000 (08:46 +0100)]
ARM: 8758/1: decompressor: restore r1 and r2 just before jumping to the kernel
The hypervisor setup before __enter_kernel destroys the value
stored in r1. The value needs to be restored just before the jump.
Fixes: 6b52f7bdb888 ("ARM: hyp-stub: Use r1 for the soft-restart address")
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Łukasz Stelmach [Tue, 3 Apr 2018 08:04:57 +0000 (09:04 +0100)]
ARM: 8753/1: decompressor: add a missing parameter to the addruart macro
In commit 639da5ee374b ("ARM: add an extra temp register to the low
level debugging addruart macro") an additional temporary register was
added to the addruart macro, but the decompressor code wasn't updated.
Fixes: 639da5ee374b ("ARM: add an extra temp register to the low level debugging addruart macro")
Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Dmitry Safonov [Thu, 17 May 2018 23:35:10 +0000 (00:35 +0100)]
x86/mm: Drop TS_COMPAT on 64-bit exec() syscall
The x86 mmap() code selects the mmap base for an allocation depending on
the bitness of the syscall. For 64bit syscalls it selects mm->mmap_base and
for 32bit mm->mmap_compat_base.
exec() calls mmap() which in turn uses in_compat_syscall() to check whether
the mapping is for a 32bit or a 64bit task. The decision is made on the
following criteria:
ia32 child->thread.status & TS_COMPAT
x32 child->pt_regs.orig_ax & __X32_SYSCALL_BIT
ia64 !ia32 && !x32
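The criteria above can be sketched in user-space C as follows. This is an illustration only, not the kernel's in_compat_syscall(): the enum, the function names and the two flag macros are stand-ins chosen for this sketch, and "ia64" in the table is read as "native 64-bit".

```c
#include <stdint.h>

/* Illustrative stand-ins for TS_COMPAT and __X32_SYSCALL_BIT. */
#define TS_COMPAT_FLAG	0x0002u
#define X32_SYSCALL_BIT	0x40000000u

enum abi { ABI_IA32, ABI_X32, ABI_64 };

/* Apply the three rules in the order the table lists them. */
static enum abi classify_syscall(uint32_t thread_status, uint64_t orig_ax)
{
	if (thread_status & TS_COMPAT_FLAG)
		return ABI_IA32;
	if (orig_ax & X32_SYSCALL_BIT)
		return ABI_X32;
	return ABI_64;
}

/* A syscall is "compat" when it is anything other than native 64-bit. */
static int is_compat_syscall(uint32_t thread_status, uint64_t orig_ax)
{
	return classify_syscall(thread_status, orig_ax) != ABI_64;
}
```

In this model, a stale TS_COMPAT_FLAG left in thread_status makes an otherwise 64-bit call classify as ia32, which is the situation the rest of this message describes.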
__set_personality_x32() dropped the TS_COMPAT flag, but
set_personality_64bit() kept the compat syscall flag, making
in_compat_syscall() return true during the first exec() syscall.
This has user-visible effects, as reported by Alexey:
1) It breaks ASAN
$ gcc -fsanitize=address wrap.c -o wrap-asan
$ ./wrap32 ./wrap-asan true
==1217==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==1217==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==1217==Process memory map follows:
0x000000400000-0x000000401000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x000000600000-0x000000601000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x000000601000-0x000000602000 /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
0x0000f7dbd000-0x0000f7de2000 /lib64/ld-2.27.so
0x0000f7fe2000-0x0000f7fe3000 /lib64/ld-2.27.so
0x0000f7fe3000-0x0000f7fe4000 /lib64/ld-2.27.so
0x0000f7fe4000-0x0000f7fe5000
0x7fed9abff000-0x7fed9af54000
0x7fed9af54000-0x7fed9af6b000 /lib64/libgcc_s.so.1
[snip]
2) It doesn't seem to be great for security if an attacker always knows
that ld.so is going to be mapped into the first 4GB in this case
(the same thing happens for PIEs as well).
The testcase:
$ cat wrap.c
#include <unistd.h>
int main(int argc, char *argv[]) {
	execvp(argv[1], &argv[1]);
	return 127;
}
$ gcc wrap.c -o wrap
$ LD_SHOW_AUXV=1 ./wrap ./wrap true |& grep AT_BASE
AT_BASE: 0x7f63b8309000
AT_BASE: 0x7faec143c000
AT_BASE: 0x7fbdb25fa000
$ gcc -m32 wrap.c -o wrap32
$ LD_SHOW_AUXV=1 ./wrap32 ./wrap true |& grep AT_BASE
AT_BASE: 0xf7eff000
AT_BASE: 0xf7cee000
AT_BASE: 0x7f8b9774e000
Fixes: 1b028f784e8c ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
Fixes: ada26481dfe6 ("x86/mm: Make in_compat_syscall() work during exec")
Reported-by: Alexey Izbyshev <izbyshev@ispras.ru>
Bisected-by: Alexander Monakov <amonakov@ispras.ru>
Investigated-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Link: https://lkml.kernel.org/r/20180517233510.24996-1-dima@arista.com
Josh Poimboeuf [Fri, 18 May 2018 20:10:34 +0000 (15:10 -0500)]
objtool: Detect RIP-relative switch table references, part 2
With the following commit:
fd35c88b7417 ("objtool: Support GCC 8 switch tables")
I added a "can't find switch jump table" warning, to stop covering up
silent failures if add_switch_table() can't find anything.
That warning found yet another bug in the objtool switch table detection
logic. For cases 1 and 2 (as described in the comments of
find_switch_table()), the find_symbol_containing() check doesn't adjust
the offset for RIP-relative switch jumps.
Incidentally, this bug was already fixed for case 3 with:
6f5ec2993b1f ("objtool: Detect RIP-relative switch table references")
However, that commit missed the fix for cases 1 and 2.
The different cases are now starting to look more and more alike. So
fix the bug by consolidating them into a single case, by checking the
original dynamic jump instruction in the case 3 loop.
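As a rough sketch of the offset adjustment discussed above (not objtool's actual code): for a RIP-relative reference whose 32-bit displacement is the last field of the instruction, the relocation addend is biased by -4, so 4 has to be added back to get the offset of the table within its section. The struct, enum and function names below are hypothetical.

```c
/* Hypothetical, simplified relocation record for illustration. */
enum reloc_kind {
	RELOC_ABS32S,	/* absolute reference: addend is the table offset */
	RELOC_PC32	/* RIP-relative reference: addend is offset - 4 */
};

struct reloc {
	enum reloc_kind kind;
	long addend;
};

/* Return the offset of the switch table inside its section. */
static long switch_table_offset(const struct reloc *r)
{
	long offset = r->addend;

	/* Undo the -4 bias of a RIP-relative displacement. */
	if (r->kind == RELOC_PC32)
		offset += 4;

	return offset;
}
```

With this adjustment, an absolute reference with addend 0x40 and a RIP-relative reference with addend 0x3c resolve to the same table offset.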
This also simplifies the code and makes it more robust against future
switch table detection issues -- of which I'm sure there will be many...
Switch table detection has been the most fragile area of objtool, by
far. I long for the day when we'll have a GCC plugin for annotating
switch tables. Linus asked me to delay such a plugin due to the
flakiness of the plugin infrastructure in older versions of GCC, so this
rickety code is what we're stuck with for now. At least the code is now
a little simpler than it was.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f400541613d45689086329432f3095119ffbc328.1526674218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Mark Rutland [Fri, 18 May 2018 14:08:41 +0000 (16:08 +0200)]
efi/libstub/arm64: Handle randomized TEXT_OFFSET
When CONFIG_RANDOMIZE_TEXT_OFFSET=y, TEXT_OFFSET is an arbitrary
multiple of PAGE_SIZE in the interval [0, 2MB).
The EFI stub does not account for the potential misalignment of
TEXT_OFFSET relative to EFI_KIMG_ALIGN, and produces a randomized
physical offset which is always a round multiple of EFI_KIMG_ALIGN.
This may result in statically allocated objects whose alignment exceeds
PAGE_SIZE to appear misaligned in memory. This has been observed to
result in spurious stack overflow reports and failure to make use of
the IRQ stacks, and theoretically could result in a number of other
issues.
We can OR in the low bits of TEXT_OFFSET to ensure that we have the
necessary offset (and hence preserve the misalignment of TEXT_OFFSET
relative to EFI_KIMG_ALIGN), so let's do that.
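The arithmetic can be sketched as below. The alignment value and the function name are assumptions for illustration (the real EFI_KIMG_ALIGN depends on the kernel configuration), and random_offset is assumed to already be a multiple of the alignment, as described above.

```c
#include <stdint.h>

/* Hypothetical stand-in for EFI_KIMG_ALIGN: 64 KiB. */
#define KIMG_ALIGN 0x10000u

/*
 * random_offset is a multiple of KIMG_ALIGN, so its low bits are zero
 * and OR-ing in (text_offset % KIMG_ALIGN) is the same as adding it.
 */
static uint32_t apply_text_offset(uint32_t random_offset, uint32_t text_offset)
{
	return random_offset | (text_offset % KIMG_ALIGN);
}
```

For example, with random_offset = 0x30000 and text_offset = 0x5000, the result is 0x35000, which keeps the same remainder modulo KIMG_ALIGN as text_offset itself.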
Reported-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Kim Phillips <kim.phillips@arm.com>
[ardb: clarify comment and commit log, drop unneeded parens]
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 6f26b3671184c36d ("arm64: kaslr: increase randomization granularity")
Link: http://lkml.kernel.org/r/20180518140841.9731-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sat, 19 May 2018 04:24:26 +0000 (21:24 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"10 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
hfsplus: stop workqueue when fill_super() failed
mm: don't allow deferred pages with NEED_PER_CPU_KM
MAINTAINERS: add Q: entry to kselftest for patchwork project
radix tree: fix multi-order iteration race
radix tree test suite: multi-order iteration race
radix tree test suite: add item_delete_rcu()
radix tree test suite: fix compilation issue
radix tree test suite: fix mapshift build target
include/linux/mm.h: add new inline function vmf_error()
lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly
Linus Torvalds [Sat, 19 May 2018 04:22:16 +0000 (21:22 -0700)]
Merge tag 'platform-drivers-x86-v4.17-3' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver fix from Darren Hart:
"Remove the last of the "select DELL_SMBIOS" references in the Kconfig"
* tag 'platform-drivers-x86-v4.17-3' of git://git.infradead.org/linux-platform-drivers-x86:
platform/x86: DELL_WMI use depends on instead of select for DELL_SMBIOS
Linus Torvalds [Sat, 19 May 2018 04:19:02 +0000 (21:19 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
- a modified revert of a patch that made new choices come out for a
couple stm32 clk drivers that really always need to be there when
that particular machine is compiled in
- boot fix on i.MX for Stefan who noticed odd behavior from the
critical flag patch that came in during the merge window
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: stm32: fix: stm32 clock drivers are not compiled by default
clk: imx6ull: use OSC clock during AXI rate change
Linus Torvalds [Sat, 19 May 2018 01:02:01 +0000 (18:02 -0700)]
Merge branch 'i2c/for-current-fixed' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A bunch of driver bugfixes and a MAINTAINERS addition"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
MAINTAINERS: add entry for STM32 I2C driver
i2c: viperboard: return message count on master_xfer success
i2c: pmcmsp: fix error return from master_xfer
i2c: pmcmsp: return message count on master_xfer success
i2c: designware: fix poll-after-enable regression
eeprom: at24: fix retrieving the at24_chip_data structure
i2c: core: ACPI: Log device not acking errors at dbg loglevel
i2c: core: ACPI: Improve OpRegion read errors
Tetsuo Handa [Fri, 18 May 2018 23:09:16 +0000 (16:09 -0700)]
hfsplus: stop workqueue when fill_super() failed
syzbot is reporting ODEBUG messages at hfsplus_fill_super() [1]. This
is because hfsplus_fill_super() forgot to call cancel_delayed_work_sync().
As far as I can see, it is hfsplus_mark_mdb_dirty() from
hfsplus_new_inode() in hfsplus_fill_super() that calls
queue_delayed_work(). Therefore, I assume that hfsplus_new_inode() does
not fail if queue_delayed_work() was called, and the out_put_hidden_dir
label is the appropriate location to call cancel_delayed_work_sync().
[1] https://syzkaller.appspot.com/bug?id=a66f45e96fdbeb76b796bf46eb25ea878c42a6c9
Link: http://lkml.kernel.org/r/964a8b27-cd69-357c-fe78-76b066056201@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+4f2e5f086147d543ab03@syzkaller.appspotmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Ernesto A. Fernandez <ernesto.mnd.fernandez@gmail.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pavel Tatashin [Fri, 18 May 2018 23:09:13 +0000 (16:09 -0700)]
mm: don't allow deferred pages with NEED_PER_CPU_KM
It is unsafe to do virtual to physical translations before mm_init() is
called if struct page is needed in order to determine the memory section
number (see SECTION_IN_PAGE_FLAGS). This is because only in mm_init()
we initialize struct pages for all the allocated memory when deferred
struct pages are used.
My recent fix in commit c9e97a1997 ("mm: initialize pages on demand
during boot") exposed this problem, because it greatly reduced the number of
pages that are initialized before mm_init(), but the problem existed
even before my fix, as Fengguang Wu found.
Below is a more detailed explanation of the problem.
We initialize struct pages in four places:
1. Early in boot a small set of struct pages is initialized to fill the
first section, and lower zones.
2. During mm_init() we initialize "struct pages" for all the memory that
is allocated, i.e. reserved in memblock.
3. Using on-demand logic when pages are allocated after the mm_init() call
(when memblock is finished).
4. After smp_init(), when the remaining free deferred pages are initialized.
The problem occurs if we try to do a va to phys translation of memory
between steps 1 and 2. Because we have not yet initialized struct pages
for all the reserved pages, it is inherently unsafe to do va to phys if
the translation itself requires access to "struct page", as in the case of
this combination: CONFIG_SPARSE && !CONFIG_SPARSE_VMEMMAP
The following path exposes the problem:
start_kernel()
trap_init()
setup_cpu_entry_areas()
setup_cpu_entry_area(cpu)
get_cpu_gdt_paddr(cpu)
per_cpu_ptr_to_phys(addr)
pcpu_addr_to_page(addr)
virt_to_page(addr)
pfn_to_page(__pa(addr) >> PAGE_SHIFT)
We disable this path by not allowing NEED_PER_CPU_KM with deferred
struct pages feature.
The problems are discussed in these threads:
http://lkml.kernel.org/r/20180418135300.inazvpxjxowogyge@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180419013128.iurzouiqxvcnpbvz@wfg-t540p.sh.intel.com
http://lkml.kernel.org/r/20180426202619.2768-1-pasha.tatashin@oracle.com
Link: http://lkml.kernel.org/r/20180515175124.1770-1-pasha.tatashin@oracle.com
Fixes: 3a80a7fa7989 ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Dennis Zhou <dennisszhou@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Shuah Khan (Samsung OSG) [Fri, 18 May 2018 23:09:09 +0000 (16:09 -0700)]
MAINTAINERS: add Q: entry to kselftest for patchwork project
A new patchwork project is created to track kselftest patches. Update
the kselftest entry in the MAINTAINERS file adding 'Q:' entry:
https://patchwork.kernel.org/project/linux-kselftest/list/
Link: http://lkml.kernel.org/r/20180515164427.12201-1-shuah@kernel.org
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Fri, 18 May 2018 23:09:06 +0000 (16:09 -0700)]
radix tree: fix multi-order iteration race
Fix a race in the multi-order iteration code which causes the kernel to
hit a GP fault. This was first seen with a production v4.15 based
kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
order 9 PMD DAX entries.
The race has to do with how we tear down multi-order sibling entries
when we are removing an item from the tree. Remember for example that
an order 2 entry looks like this:
struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
where 'entry' is in some slot in the struct radix_tree_node, and the
three slots following 'entry' contain sibling pointers which point back
to 'entry.'
When we delete 'entry' from the tree, we call:
radix_tree_delete()
radix_tree_delete_item()
__radix_tree_delete()
replace_slot()
replace_slot() first removes the siblings in order from the first to the
last, and then replaces 'entry' with NULL. This means that for a
brief period of time we end up with one or more of the siblings removed,
so:
struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
This causes an issue if you have a reader iterating over the slots in
the tree via radix_tree_for_each_slot() while only under
rcu_read_lock()/rcu_read_unlock() protection. This is a common case in
mm/filemap.c.
The issue is that when __radix_tree_next_slot() => skip_siblings() tries
to skip over the sibling entries in the slots, it currently does so with
an exact match on the slot directly preceding our current slot.
Normally this works:
                                    V preceding slot
struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
                                            ^ current slot
This lets you find the first sibling, and you skip them all in order.
But in the case where one of the siblings is NULL, that slot is skipped
and then our sibling detection is interrupted:
                                           V preceding slot
struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
                                                  ^ current slot
This means that the sibling pointers aren't recognized since they point
all the way back to 'entry', so we think that they are normal internal
radix tree pointers. This causes us to think we need to walk down to a
struct radix_tree_node starting at the address of 'entry'.
In a real running kernel this will crash the thread with a GP fault when
you try and dereference the slots in your broken node starting at
'entry'.
We fix this race by fixing the way that skip_siblings() detects sibling
nodes. Instead of testing against the preceding slot we instead look
for siblings via is_sibling_entry() which compares against the position
of the struct radix_tree_node.slots[] array. This ensures that sibling
entries are properly identified, even if they are no longer contiguous
with the 'entry' they point to.
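The difference between the two detection strategies can be modelled in user space as below. This is a simplified sketch under stated assumptions, not the kernel's skip_siblings(): the four-slot node, the function names and the untagged pointers are all hypothetical, and a sibling is modelled as a plain pointer back to the slot holding 'entry'.

```c
#include <stddef.h>
#include <stdint.h>

#define NSLOTS 4	/* hypothetical node size */

struct node {
	void *slots[NSLOTS];
};

/* New-style check: a sibling is any pointer that lands inside slots[]. */
static int is_sibling(const struct node *n, const void *p)
{
	uintptr_t a = (uintptr_t)p;

	return a >= (uintptr_t)&n->slots[0] &&
	       a < (uintptr_t)&n->slots[NSLOTS];
}

/*
 * Old-style skip: only an exact match on the slot preceding 'start'
 * counts as a sibling. 'start' must be at least 1.
 */
static size_t skip_old(const struct node *n, size_t start)
{
	const void *sib = &n->slots[start - 1];
	size_t i;

	for (i = start; i < NSLOTS; i++)
		if (n->slots[i] && n->slots[i] != sib)
			break;
	return i;
}

/* New-style skip: pass over NULLs and anything pointing into slots[]. */
static size_t skip_new(const struct node *n, size_t start)
{
	size_t i;

	for (i = start; i < NSLOTS; i++)
		if (n->slots[i] && !is_sibling(n, n->slots[i]))
			break;
	return i;
}

/* Build the half-deleted node [entry][NULL][sibling][sibling]. */
static void make_torn_node(struct node *n, void *item)
{
	n->slots[0] = item;
	n->slots[1] = NULL;
	n->slots[2] = &n->slots[0];
	n->slots[3] = &n->slots[0];
}
```

On the half-deleted node, skip_old() started at slot 2 stops immediately and reports the sibling as if it were a real entry, while skip_new() runs to the end of the node. On an intact node both functions skip all three siblings.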
Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
Fixes: 148deab223b2 ("radix-tree: improve multiorder iterators")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: CR, Sapthagirish <sapthagirish.cr@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>