Igor Druzhinin [Tue, 27 Nov 2018 20:58:21 +0000 (20:58 +0000)]
Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"
This reverts commit
b3cf8528bb21febb650a7ecbf080d0647be40b9f.
That commit unintentionally broke Xen balloon memory hotplug with
"hotplug_unpopulated" set to 1. As long as "System RAM" resource
got assigned under a new "Unusable memory" resource in IO/Mem tree
any attempt to online this memory would fail due to general kernel
restrictions on having "System RAM" resources as 1st level only.
The original issue that commit has tried to workaround
fa564ad96366
("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
60-7f)") also got amended by the following
03a551734 ("x86/PCI: Move
and shrink AMD 64-bit window to avoid conflict") which made the
original fix to Xen ballooning unnecessary.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Srikanth Boddepalli [Tue, 27 Nov 2018 14:23:27 +0000 (19:53 +0530)]
xen: xlate_mmu: add missing header to fix 'W=1' warning
Add a missing header otherwise compiler warns about missed prototype:
drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes]
int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
^~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Srikanth Boddepalli <boddepalli.srikanth@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Fri, 23 Nov 2018 16:24:51 +0000 (17:24 +0100)]
xen/x86: add diagnostic printout to xen_mc_flush() in case of error
Failure of an element of a Xen multicall is signalled via a WARN()
only if the kernel is compiled with MC_DEBUG. It is impossible to
know which element failed and why it did so.
Change that by printing the related information even without MC_DEBUG,
even if maybe in some limited form (e.g. without information which
caller produced the failing element).
Move the printing out of the switch statement in order to have the
same information for a single call.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Mon, 19 Nov 2018 13:59:45 +0000 (14:59 +0100)]
x86/xen: cleanup includes in arch/x86/xen/spinlock.c
arch/x86/xen/spinlock.c includes several headers which are not needed.
Remove the #includes.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Thu, 1 Nov 2018 12:33:07 +0000 (13:33 +0100)]
xen: remove size limit of privcmd-buf mapping interface
Currently the size of hypercall buffers allocated via
/dev/xen/hypercall is limited to a default of 64 memory pages. For live
migration of guests this might be too small as the page dirty bitmask
needs to be sized according to the size of the guest. This means
migrating a 8GB sized guest is already exhausting the default buffer
size for the dirty bitmap.
There is no sensible way to set a sane limit, so just remove it
completely. The device node's usage is limited to root anyway, so there
is no additional DOS scenario added by allowing unlimited buffers.
While at it make the error path for the -ENOMEM case a little bit
cleaner by setting n_pages to the number of successfully allocated
pages instead of the target size.
Fixes: c51b3c639e01f2 ("xen: add new hypercall buffer mapping device")
Cc: <stable@vger.kernel.org> #4.18
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Thu, 8 Nov 2018 07:35:06 +0000 (08:35 +0100)]
xen: fix xen_qlock_wait()
Commit
a856531951dc80 ("xen: make xen_qlock_wait() nestable")
introduced a regression for Xen guests running fully virtualized
(HVM or PVH mode). The Xen hypervisor wouldn't return from the poll
hypercall with interrupts disabled in case of an interrupt (for PV
guests it does).
So instead of disabling interrupts in xen_qlock_wait() use a nesting
counter to avoid calling xen_clear_irq_pending() in case
xen_qlock_wait() is nested.
Fixes: a856531951dc80 ("xen: make xen_qlock_wait() nestable")
Cc: stable@vger.kernel.org
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Wed, 7 Nov 2018 17:01:00 +0000 (18:01 +0100)]
x86/xen: fix pv boot
Commit
9da3f2b7405440 ("x86/fault: BUG() when uaccess helpers fault on
kernel addresses") introduced a regression for booting Xen PV guests.
Xen PV guests are using __put_user() and __get_user() for accessing the
p2m map (physical to machine frame number map) as accesses might fail
in case of not populated areas of the map.
With above commit using __put_user() and __get_user() for accessing
kernel pages is no longer valid. So replace the Xen hack by adding
appropriate p2m access functions using the default fixup handler.
Fixes: 9da3f2b7405440 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Manjunath Patil [Tue, 30 Oct 2018 16:49:21 +0000 (09:49 -0700)]
xen-blkfront: fix kernel panic with negotiate_mq error path
info->nr_rings isn't adjusted in case of ENOMEM error from
negotiate_mq(). This leads to kernel panic in error path.
Typical call stack involving panic -
#8 page_fault at
ffffffff8175936f
[exception RIP: blkif_free_ring+33]
RIP:
ffffffffa0149491 RSP:
ffff8804f7673c08 RFLAGS:
00010292
...
#9 blkif_free at
ffffffffa0149aaa [xen_blkfront]
#10 talk_to_blkback at
ffffffffa014c8cd [xen_blkfront]
#11 blkback_changed at
ffffffffa014ea8b [xen_blkfront]
#12 xenbus_otherend_changed at
ffffffff81424670
#13 backend_changed at
ffffffff81426dc3
#14 xenwatch_thread at
ffffffff81422f29
#15 kthread at
ffffffff810abe6a
#16 ret_from_fork at
ffffffff81754078
Cc: stable@vger.kernel.org
Fixes: 7ed8ce1c5fc7 ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs")
Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Liam Merwick [Fri, 2 Nov 2018 14:04:23 +0000 (14:04 +0000)]
xen/grant-table: Fix incorrect gnttab_dma_free_pages() pr_debug message
If a call to xenmem_reservation_increase() in gnttab_dma_free_pages()
fails it triggers a message "Failed to decrease reservation..." which
should be "Failed to increase reservation..."
Fixes: 9bdc7304f536 ('xen/grant-table: Allow allocating buffers suitable for DMA')
Reported-by: Ross Philipson <ross.philipson@oracle.com>
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Stefano Stabellini [Wed, 31 Oct 2018 23:11:49 +0000 (16:11 -0700)]
CONFIG_XEN_PV breaks xen_create_contiguous_region on ARM
xen_create_contiguous_region has now only an implementation if
CONFIG_XEN_PV is defined. However, on ARM we never set CONFIG_XEN_PV but
we do have an implementation of xen_create_contiguous_region which is
required for swiotlb-xen to work correctly (although it just sets
*dma_handle).
Cc: <stable@vger.kernel.org> # 4.12
Fixes: 16624390816c ("xen: create xen_create/destroy_contiguous_region() stubs for PVHVM only builds")
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
CC: Jeff.Kubascik@dornerworks.com
CC: Jarvis.Roach@dornerworks.com
CC: Nathan.Studer@dornerworks.com
CC: vkuznets@redhat.com
CC: boris.ostrovsky@oracle.com
CC: jgross@suse.com
CC: julien.grall@arm.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Tue, 9 Oct 2018 16:09:59 +0000 (18:09 +0200)]
xen: drop writing error messages to xenstore
xenbus_va_dev_error() will try to write error messages to Xenstore
under the error/<dev-name>/error node (with <dev-name> something like
"device/vbd/51872"). This will fail normally and another message
about this failure is added to dmesg.
I believe this is a remnant from very ancient times, as it was added
in the first pvops rush of commits in 2007.
So remove the additional message when writing to Xenstore failed as
a minimum step.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracel.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Thu, 25 Oct 2018 07:54:15 +0000 (09:54 +0200)]
xen/pvh: don't try to unplug emulated devices
A Xen PVH guest has no associated qemu device model, so trying to
unplug any emulated devices is making no sense at all.
Bail out early from xen_unplug_emulated_devices() when running as PVH
guest. This will avoid issuing the boot message:
[ 0.000000] Xen Platform PCI: unrecognised magic value
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Stefano Stabellini [Wed, 17 Oct 2018 13:44:45 +0000 (21:44 +0800)]
add myself as reviewer for Xen support in Linux
It would be good for me to keep an eye on the patches that touch Xen
support in Linux to try to spot changes that break Xen on ARM early on.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Bartlomiej Zolnierkiewicz [Tue, 16 Oct 2018 14:33:58 +0000 (16:33 +0200)]
xen: remove redundant 'default n' from Kconfig
'default n' is the default value for any bool or tristate Kconfig
setting so there is no need to write it explicitly.
Also since commit
f467c5640c29 ("kconfig: only write '# CONFIG_FOO
is not set' for visible symbols") the Kconfig behavior is the same
regardless of 'default n' being present or not:
...
One side effect of (and the main motivation for) this change is making
the following two definitions behave exactly the same:
config FOO
bool
config FOO
bool
default n
With this change, neither of these will generate a
'# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
That might make it clearer to people that a bare 'default n' is
redundant.
...
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Boris Ostrovsky [Sun, 7 Oct 2018 20:05:38 +0000 (16:05 -0400)]
xen/balloon: Support xend-based toolstack
Xend-based toolstacks don't have static-max entry in xenstore. The
equivalent node for those toolstacks is memory_static_max.
Fixes: 5266b8e4445c (xen: fix booting ballooned down hvm guest)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <stable@vger.kernel.org> # 4.13
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Roger Pau Monne [Tue, 9 Oct 2018 10:32:37 +0000 (12:32 +0200)]
xen/pvh: increase early stack size
While booting on an AMD EPYC box the stack canary would detect stack
overflows when using the current PVH early stack size (256). Switch to
using the value defined by BOOT_STACK_SIZE, which prevents the stack
overflow.
Cc: <stable@vger.kernel.org> # 4.11
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Mon, 1 Oct 2018 05:57:42 +0000 (07:57 +0200)]
xen: make xen_qlock_wait() nestable
xen_qlock_wait() isn't safe for nested calls due to interrupts. A call
of xen_qlock_kick() might be ignored in case a deeper nesting level
was active right before the call of xen_poll_irq():
CPU 1: CPU 2:
spin_lock(lock1)
spin_lock(lock1)
-> xen_qlock_wait()
-> xen_clear_irq_pending()
Interrupt happens
spin_unlock(lock1)
-> xen_qlock_kick(CPU 2)
spin_lock_irqsave(lock2)
spin_lock_irqsave(lock2)
-> xen_qlock_wait()
-> xen_clear_irq_pending()
clears kick for lock1
-> xen_poll_irq()
spin_unlock_irq_restore(lock2)
-> xen_qlock_kick(CPU 2)
wakes up
spin_unlock_irq_restore(lock2)
IRET
resumes in xen_qlock_wait()
-> xen_poll_irq()
never wakes up
The solution is to disable interrupts in xen_qlock_wait() and not to
poll for the irq in case xen_qlock_wait() is called in nmi context.
Cc: stable@vger.kernel.org
Cc: Waiman.Long@hp.com
Cc: peterz@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Juergen Gross [Mon, 1 Oct 2018 05:57:42 +0000 (07:57 +0200)]
xen: fix race in xen_qlock_wait()
In the following situation a vcpu waiting for a lock might not be
woken up from xen_poll_irq():
CPU 1: CPU 2: CPU 3:
takes a spinlock
tries to get lock
-> xen_qlock_wait()
frees the lock
-> xen_qlock_kick(cpu2)
-> xen_clear_irq_pending()
takes lock again
tries to get lock
-> *lock = _Q_SLOW_VAL
-> *lock == _Q_SLOW_VAL ?
-> xen_poll_irq()
frees the lock
-> xen_qlock_kick(cpu3)
And cpu 2 will sleep forever.
This can be avoided easily by modifying xen_qlock_wait() to call
xen_poll_irq() only if the related irq was not pending and to call
xen_clear_irq_pending() only if it was pending.
Cc: stable@vger.kernel.org
Cc: Waiman.Long@hp.com
Cc: peterz@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Geert Uytterhoeven [Wed, 26 Sep 2018 08:43:13 +0000 (10:43 +0200)]
xen/balloon: Grammar s/Is it/It is/
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Juergen Gross <jgross@suse.com>
Jason Andryuk [Tue, 25 Sep 2018 11:36:55 +0000 (07:36 -0400)]
xen: Make XEN_BACKEND selectable by DomU
XEN_BACKEND doesn't actually depend on XEN_DOM0. DomUs can serve
backends to other DomUs. One example is a service VM providing network
backends.
The original Kconfig defaulted Dom0 to y and it could be disabled. DomU
could not select the option. With the new Kconfig, we default y for
Dom0 and n for DomU. Either can then toggle the selection.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Linus Torvalds [Wed, 24 Oct 2018 07:11:35 +0000 (08:11 +0100)]
net/kconfig: Make QCOM_QMI_HELPERS available when COMPILE_TEST
The networking merge brought in the experimental support for the
Qualcomm ath10k system NOC, which selects QCOM_QMI_HELPERS as a
dependency.
But the ATH10K_SNOC option (which selects QCOM_QMI_HELPERS) depends on
ARCH_QCOM || COMPILE_TEST in order to get wider build testing than just
the unusual QCOM architecture build, while the QCOM_QMI_HELPERS option
doesn't have that COMPILE_TEST option and is limited to only ARCH_QCOM.
As a result, a "make allmodconfig" complains
WARNING: unmet direct dependencies detected for QCOM_QMI_HELPERS
Depends on [n]: ARCH_QCOM && NET [=y]
Selected by [m]:
- ATH10K_SNOC [=m] && NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_ATH [=y] && ATH10K [=m] && (ARCH_QCOM || COMPILE_TEST [=y])
Fix the config-time warning by making QCOM_QMI_HELPERS available when
COMPILE_TEST, since the result seems to build fine.
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Govind Singh <govinds@codeaurora.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 24 Oct 2018 05:47:44 +0000 (06:47 +0100)]
Merge git://git./linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
1) Add VF IPSEC offload support in ixgbe, from Shannon Nelson.
2) Add zero-copy AF_XDP support to i40e, from Björn Töpel.
3) All in-tree drivers are converted to {g,s}et_link_ksettings() so we
can get rid of the {g,s}et_settings ethtool callbacks, from Michal
Kubecek.
4) Add software timestamping to veth driver, from Michael Walle.
5) More work to make packet classifiers and actions lockless, from Vlad
Buslov.
6) Support sticky FDB entries in bridge, from Nikolay Aleksandrov.
7) Add ipv6 version of IP_MULTICAST_ALL sockopt, from Andre Naujoks.
8) Support batching of XDP buffers in vhost_net, from Jason Wang.
9) Add flow dissector BPF hook, from Petar Penkov.
10) i40e vf --> generic iavf conversion, from Jesse Brandeburg.
11) Add NLA_REJECT netlink attribute policy type, to signal when users
provide attributes in situations which don't make sense. From
Johannes Berg.
12) Switch TCP and fair-queue scheduler over to earliest departure time
model. From Eric Dumazet.
13) Improve guest receive performance by doing rx busy polling in tx
path of vhost networking driver, from Tonghao Zhang.
14) Add per-cgroup local storage to bpf
15) Add reference tracking to BPF, from Joe Stringer. The verifier can
now make sure that references taken to objects are properly released
by the program.
16) Support in-place encryption in TLS, from Vakul Garg.
17) Add new taprio packet scheduler, from Vinicius Costa Gomes.
18) Lots of selftests additions, too numerous to mention one by one here
but all of which are very much appreciated.
19) Support offloading of eBPF programs containing BPF to BPF calls in
nfp driver, frm Quentin Monnet.
20) Move dpaa2_ptp driver out of staging, from Yangbo Lu.
21) Lots of u32 classifier cleanups and simplifications, from Al Viro.
22) Add new strict versions of netlink message parsers, and enable them
for some situations. From David Ahern.
23) Evict neighbour entries on carrier down, also from David Ahern.
24) Support BPF sk_msg verdict programs with kTLS, from Daniel Borkmann
and John Fastabend.
25) Add support for filtering route dumps, from David Ahern.
26) New igc Intel driver for 2.5G parts, from Sasha Neftin et al.
27) Allow vxlan enslavement to bridges in mlxsw driver, from Ido
Schimmel.
28) Add queue and stack map types to eBPF, from Mauricio Vasquez B.
29) Add back byte-queue-limit support to r8169, with all the bug fixes
in other areas of the driver it works now! From Florian Westphal and
Heiner Kallweit.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2147 commits)
tcp: add tcp_reset_xmit_timer() helper
qed: Fix static checker warning
Revert "be2net: remove desc field from be_eq_obj"
Revert "net: simplify sock_poll_wait"
net: socionext: Reset tx queue in ndo_stop
net: socionext: Add dummy PHY register read in phy_write()
net: socionext: Stop PHY before resetting netsec
net: stmmac: Set OWN bit for jumbo frames
arm64: dts: stratix10: Support Ethernet Jumbo frame
tls: Add maintainers
net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
octeontx2-af: Support for setting MAC address
octeontx2-af: Support for changing RSS algorithm
octeontx2-af: NIX Rx flowkey configuration for RSS
octeontx2-af: Install ucast and bcast pkt forwarding rules
octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response
octeontx2-af: NPC MCAM and LDATA extract minimal configuration
octeontx2-af: Enable packet length and csum validation
octeontx2-af: Support for VTAG strip and capture
...
Linus Torvalds [Wed, 24 Oct 2018 05:42:00 +0000 (06:42 +0100)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc updates from David Miller:
"Mostly VDSO cleanups and optimizations"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Several small VDSO vclock_gettime.c improvements.
sparc: Validate VDSO for undefined symbols.
sparc: Really use linker with LDFLAGS.
sparc: Improve VDSO CFLAGS.
sparc: Set DISABLE_BRANCH_PROFILING in VDSO CFLAGS.
sparc: Don't bother masking out TICK_PRIV_BIT in VDSO code.
sparc: Inline VDSO gettime code aggressively.
sparc: Improve VDSO instruction patching.
sparc: Fix parport build warnings.
Eric Dumazet [Tue, 23 Oct 2018 18:54:16 +0000 (11:54 -0700)]
tcp: add tcp_reset_xmit_timer() helper
With EDT model, SRTT no longer is inflated by pacing delays.
This means that RTO and some other xmit timers might be setup
incorrectly. This is particularly visible with either :
- Very small enforced pacing rates (SO_MAX_PACING_RATE)
- Reduced rto (from the default 200 ms)
This can lead to TCP flows aborts in the worst case,
or spurious retransmits in other cases.
For example, this session gets far more throughput
than the requested 80kbit :
$ netperf -H 127.0.0.2 -l 100 -- -q 10000
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.2 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
540000 262144 262144 104.00 2.66
With the fix :
$ netperf -H 127.0.0.2 -l 100 -- -q 10000
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 127.0.0.2 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec
540000 262144 262144 104.00 0.12
EDT allows for better control of rtx timers, since TCP has
a better idea of the earliest departure time of each skb
in the rtx queue. We only have to eventually add to the
timer the difference of the EDT time with current time.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 23 Oct 2018 19:02:03 +0000 (20:02 +0100)]
Merge branch 'parisc-4.20-1' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Lots of small fixes and enhancements, most noteably:
- Many TLB and cache flush optimizations (Dave)
- Fixed HPMC/crash handler on 64-bit kernel (Dave and myself)
- Added alternative infrastructre. The kernel now live-patches itself
for various situations, e.g. replace SMP code when running on one
CPU only or drop cache flushes when system has no cache installed.
- vmlinuz now contains a full copy of the compressed vmlinux file.
This simplifies debugging the currently booted kernel.
- Unused driver removal (Christoph)
- Reduced warnings of Dino PCI bridge when running in qemu
- Removed gcc version check (Masahiro)"
* 'parisc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (23 commits)
parisc: Retrieve and display the PDC PAT capabilities
parisc: Optimze cache flush algorithms
parisc: Remove pte_inserted define
parisc: Add PDC PAT cell_info() and pd_get_pdc_revisions() functions
parisc: Drop two instructions from pte lookup code
parisc: Use zdep for shlw macro on PA1.1 and PA2.0
parisc: Add alternative coding infrastructure
parisc: Include compressed vmlinux file in vmlinuz boot kernel
extract-vmlinux: Check for uncompressed image as fallback
parisc: Fix address in HPMC IVA
parisc: Fix exported address of os_hpmc handler
parisc: Fix map_pages() to not overwrite existing pte entries
parisc: Purge TLB entries after updating page table entry and set page accessed flag in TLB handler
parisc: Release spinlocks using ordered store
parisc: Ratelimit dino stuck interrupt warnings
parisc: dino: Utilize DINO_MASK_IRQ() macro
parisc: Clean up crash header output
parisc: Add SYSTEM_INFO and REGISTER TOC PAT functions
parisc: Remove PTE load and fault check from L2_ptep macro
parisc: Reorder TLB flush timing calculation
...
Linus Torvalds [Tue, 23 Oct 2018 18:32:10 +0000 (19:32 +0100)]
Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:
"The main item in this pull request are the Spectre variant 1.1 fixes
from Julien Thierry.
A few other patches to improve various areas, and removal of some
obsolete mcount bits and a redundant kbuild conditional"
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8802/1: Call syscall_trace_exit even when system call skipped
ARM: 8797/1: spectre-v1.1: harden __copy_to_user
ARM: 8796/1: spectre-v1,v1.1: provide helpers for address sanitization
ARM: 8795/1: spectre-v1.1: use put_user() for __put_user()
ARM: 8794/1: uaccess: Prevent speculative use of the current addr_limit
ARM: 8793/1: signal: replace __put_user_error with __put_user
ARM: 8792/1: oabi-compat: copy oabi events using __copy_to_user()
ARM: 8791/1: vfp: use __copy_to_user() when saving VFP state
ARM: 8790/1: signal: always use __copy_to_user to save iwmmxt context
ARM: 8789/1: signal: copy registers using __copy_to_user()
ARM: 8801/1: makefile: use ARMv3M mode for RiscPC
ARM: 8800/1: use choice for kernel unwinders
ARM: 8798/1: remove unnecessary KBUILD_SRC ifeq conditional
ARM: 8788/1: ftrace: remove old mcount support
ARM: 8786/1: Debug kernel copy by printing
Linus Torvalds [Tue, 23 Oct 2018 18:07:25 +0000 (19:07 +0100)]
Merge branch 'x86-vdso-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
"Two main changes:
- Cleanups, simplifications and CLOCK_TAI support (Thomas Gleixner)
- Improve code generation (Andy Lutomirski)"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Rearrange do_hres() to improve code generation
x86/vdso: Document vgtod_ts better
x86/vdso: Remove "memory" clobbers in the vDSO syscall fallbacks
x66/vdso: Add CLOCK_TAI support
x86/vdso: Move cycle_last handling into the caller
x86/vdso: Simplify the invalid vclock case
x86/vdso: Replace the clockid switch case
x86/vdso: Collapse coarse functions
x86/vdso: Collapse high resolution functions
x86/vdso: Introduce and use vgtod_ts
x86/vdso: Use unsigned int consistently for vsyscall_gtod_data:: Seq
x86/vdso: Enforce 64bit clocksource
x86/time: Implement clocksource_arch_init()
clocksource: Provide clocksource_arch_init()
Rahul Verma [Tue, 23 Oct 2018 15:04:24 +0000 (08:04 -0700)]
qed: Fix static checker warning
Static Checker Warnings:
drivers/net/ethernet/qlogic/qed/qed_main.c:1510 qed_fill_link_capability()
error: uninitialized symbol 'tcvr_state'.
drivers/net/ethernet/qlogic/qed/qed_mcp.c:1951 qed_mcp_trans_speed_mask()
error: uninitialized symbol 'transceiver_state'.
drivers/net/ethernet/qlogic/qed/qed_mcp.c:1951 qed_mcp_trans_speed_mask()
error: uninitialized symbol 'transceiver_type'.
Symbols tcvr_state, transceiver_state and transceiver_type
are initialized with respective default state.
Fixes: c56a8be7e7aa ("qed: Add supported link and advertise link to display in ethtool.")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rahul Verma <Rahul.Verma@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Tue, 23 Oct 2018 14:40:26 +0000 (16:40 +0200)]
Revert "be2net: remove desc field from be_eq_obj"
The mentioned commit needs to be reverted because we cannot pass
string allocated on stack to request_irq(). This function stores
uses this pointer for later use (e.g. /proc/interrupts) so we need
to keep this string persistently.
Fixes: d6d9704af8f4 ("be2net: remove desc field from be_eq_obj")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Tue, 23 Oct 2018 11:40:39 +0000 (13:40 +0200)]
Revert "net: simplify sock_poll_wait"
This reverts commit
dd979b4df817e9976f18fb6f9d134d6bc4a3c317.
This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Oct 2018 17:55:35 +0000 (10:55 -0700)]
Merge branch 'netsec-fixes'
Masahisa Kojima says:
====================
Bugfix for the netsec driver
This patch series include bugfix for the netsec ethernet
controller driver, fix the problem in interface down/up.
changes in v2:
- change the place to perform the PHY power down
- use the MACROs defiend in include/uapi/linux/mii.h
- update commit comment
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Masahisa Kojima [Tue, 23 Oct 2018 11:24:28 +0000 (20:24 +0900)]
net: socionext: Reset tx queue in ndo_stop
We observed that packets and bytes count are not reset
when user performs interface down. Eventually, tx queue is
exhausted and packets will not be sent out.
To avoid this problem, resets tx queue in ndo_stop.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Masahisa Kojima [Tue, 23 Oct 2018 11:24:27 +0000 (20:24 +0900)]
net: socionext: Add dummy PHY register read in phy_write()
There is a compatibility issue between RTL8211E implemented
in Developerbox and netsec ethernet controller IP.
Our MDIO controller stops MDC clock right after the write
access, but RTL8211E expects MDC clock must be kept toggling
for several clock cycle with MDIO high before entering
the IDLE state. Without keeping clock after write access,
write access is not correctly handled and register is not
updated.
To meet this requirement, netsec driver needs to issue dummy
read(e.g. read PHYID1(offset 0x2) register) right after write
access, to keep MDC clock.
We think this compatibility issue is a problem specific to
our MDIO controller and RTL8211E.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Masahisa Kojima [Tue, 23 Oct 2018 11:24:26 +0000 (20:24 +0900)]
net: socionext: Stop PHY before resetting netsec
In ndo_stop, driver resets the netsec ethernet controller IP.
When the netsec IP is reset, HW running mode turns to NRM mode
and driver has to wait until this mode transition completes.
But mode transition to NRM will not complete if the PHY is
in normal operation state. Netsec IP requires PHY is in
power down state when it is reset.
This modification stops the PHY before resetting netsec.
Together with this modification, phy_addr is stored in netsec_priv
structure because ndev->phydev is not yet ready in ndo_init.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Masahisa Kojima <masahisa.kojima@linaro.org>
Signed-off-by: Yoshitoyo Osaki <osaki.yoshitoyo@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 23 Oct 2018 17:43:04 +0000 (18:43 +0100)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 pti updates from Ingo Molnar:
"The main changes:
- Make the IBPB barrier more strict and add STIBP support (Jiri
Kosina)
- Micro-optimize and clean up the entry code (Andy Lutomirski)
- ... plus misc other fixes"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation: Propagate information about RSB filling mitigation to sysfs
x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
x86/speculation: Add RETPOLINE_AMD support to the inline asm CALL_NOSPEC variant
x86/CPU: Fix unused variable warning when !CONFIG_IA32_EMULATION
x86/pti/64: Remove the SYSCALL64 entry trampoline
x86/entry/64: Use the TSS sp2 slot for SYSCALL/SYSRET scratch space
x86/entry/64: Document idtentry
Linus Torvalds [Tue, 23 Oct 2018 17:17:11 +0000 (18:17 +0100)]
Merge branch 'x86-platform-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
"Two minor OLPC changes: a build fix and a new quirk"
* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/olpc: Fix build error with CONFIG_MFD_CS5535=m
x86/olpc: Indicate that legacy PC XO-1 platform should not register RTC
Linus Torvalds [Tue, 23 Oct 2018 16:54:58 +0000 (17:54 +0100)]
Merge branch 'x86-paravirt-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 paravirt updates from Ingo Molnar:
"Two main changes:
- Remove no longer used parts of the paravirt infrastructure and put
large quantities of paravirt ops under a new config option
PARAVIRT_XXL=y, which is selected by XEN_PV only. (Joergen Gross)
- Enable PV spinlocks on Hyperv (Yi Sun)"
* 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyperv: Enable PV qspinlock for Hyper-V
x86/hyperv: Add GUEST_IDLE_MSR support
x86/paravirt: Clean up native_patch()
x86/paravirt: Prevent redefinition of SAVE_FLAGS macro
x86/xen: Make xen_reservation_lock static
x86/paravirt: Remove unneeded mmu related paravirt ops bits
x86/paravirt: Move the Xen-only pv_mmu_ops under the PARAVIRT_XXL umbrella
x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrella
x86/paravirt: Move the Xen-only pv_cpu_ops under the PARAVIRT_XXL umbrella
x86/paravirt: Move items in pv_info under PARAVIRT_XXL umbrella
x86/paravirt: Introduce new config option PARAVIRT_XXL
x86/paravirt: Remove unused paravirt bits
x86/paravirt: Use a single ops structure
x86/paravirt: Remove clobbers from struct paravirt_patch_site
x86/paravirt: Remove clobbers parameter from paravirt patch functions
x86/paravirt: Make paravirt_patch_call() and paravirt_patch_jmp() static
x86/xen: Add SPDX identifier in arch/x86/xen files
x86/xen: Link platform-pci-unplug.o only if CONFIG_XEN_PVHVM
x86/xen: Move pv specific parts of arch/x86/xen/mmu.c to mmu_pv.c
x86/xen: Move pv irq related functions under CONFIG_XEN_PV umbrella
Linus Torvalds [Tue, 23 Oct 2018 16:05:28 +0000 (17:05 +0100)]
Merge branch 'x86-mm-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
"Lots of changes in this cycle:
- Lots of CPA (change page attribute) optimizations and related
cleanups (Thomas Gleixner, Peter Zijstra)
- Make lazy TLB mode even lazier (Rik van Riel)
- Fault handler cleanups and improvements (Dave Hansen)
- kdump, vmcore: Enable kdumping encrypted memory with AMD SME
enabled (Lianbo Jiang)
- Clean up VM layout documentation (Baoquan He, Ingo Molnar)
- ... plus misc other fixes and enhancements"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
x86/stackprotector: Remove the call to boot_init_stack_canary() from cpu_startup_entry()
x86/mm: Kill stray kernel fault handling comment
x86/mm: Do not warn about PCI BIOS W+X mappings
resource: Clean it up a bit
resource: Fix find_next_iomem_res() iteration issue
resource: Include resource end in walk_*() interfaces
x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
x86/mm: Remove spurious fault pkey check
x86/mm/vsyscall: Consider vsyscall page part of user address space
x86/mm: Add vsyscall address helper
x86/mm: Fix exception table comments
x86/mm: Add clarifying comments for user addr space
x86/mm: Break out user address space handling
x86/mm: Break out kernel address space handling
x86/mm: Clarify hardware vs. software "error_code"
x86/mm/tlb: Make lazy TLB mode lazier
x86/mm/tlb: Add freed_tables element to flush_tlb_info
x86/mm/tlb: Add freed_tables argument to flush_tlb_mm_range
smp,cpumask: introduce on_each_cpu_cond_mask
smp: use __cpumask_set_cpu in on_each_cpu_cond
...
Linus Torvalds [Tue, 23 Oct 2018 15:47:41 +0000 (16:47 +0100)]
Merge branch 'x86-hyperv-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 hyperv updates from Ingo Molnar:
"Two small changes: a boot warning removal and a minor cleanup"
* 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hyperv: Remove unused include
x86/hyperv: Suppress "PCI: Fatal: No config space access function found"
Linus Torvalds [Tue, 23 Oct 2018 15:31:33 +0000 (16:31 +0100)]
Merge branch 'x86-grub2-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 grub2 updates from Ingo Molnar:
"This extends the x86 boot protocol to include an address for the RSDP
table - utilized by Xen currently.
Matching Grub2 patches are pending as well. (Juergen Gross)"
* 'x86-grub2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi, x86/boot: Take RSDP address for boot params if available
x86/boot: Add ACPI RSDP address to setup_header
x86/xen: Fix boot loader version reported for PVH guests
Linus Torvalds [Tue, 23 Oct 2018 15:16:40 +0000 (16:16 +0100)]
Merge branch 'x86-cpu-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 cpu updates from Ingo Molnar:
"The main changes in this cycle were:
- Add support for the "Dhyana" x86 CPUs by Hygon: these are licensed
based on the AMD Zen architecture, and are built and sold in China,
for domestic datacenter use. The code is pretty close to AMD
support, mostly with a few quirks and enumeration differences. (Pu
Wen)
- Enable CPUID support on Cyrix 6x86/6x86L processors"
* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools/cpupower: Add Hygon Dhyana support
cpufreq: Add Hygon Dhyana support
ACPI: Add Hygon Dhyana support
x86/xen: Add Hygon Dhyana support to Xen
x86/kvm: Add Hygon Dhyana support to KVM
x86/mce: Add Hygon Dhyana support to the MCA infrastructure
x86/bugs: Add Hygon Dhyana to the respective mitigation machinery
x86/apic: Add Hygon Dhyana support
x86/pci, x86/amd_nb: Add Hygon Dhyana support to PCI and northbridge
x86/amd_nb: Check vendor in AMD-only functions
x86/alternative: Init ideal_nops for Hygon Dhyana
x86/events: Add Hygon Dhyana support to PMU infrastructure
x86/smpboot: Do not use BSP INIT delay and MWAIT to idle on Dhyana
x86/cpu/mtrr: Support TOP_MEM2 and get MTRR number
x86/cpu: Get cache info and setup cache cpumap for Hygon Dhyana
x86/cpu: Create Hygon Dhyana architecture support file
x86/CPU: Change query logic so CPUID is enabled before testing
x86/CPU: Use correct macros for Cyrix calls
Linus Torvalds [Tue, 23 Oct 2018 15:04:22 +0000 (16:04 +0100)]
Merge branch 'x86-build-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 build update from Ingo Molnar:
"A small cleanup to x86 Kconfigs"
* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kconfig: Remove redundant 'default n' lines from all x86 Kconfig's
Linus Torvalds [Tue, 23 Oct 2018 14:54:42 +0000 (15:54 +0100)]
Merge branch 'x86-boot-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
"Two cleanups and a bugfix for a rare boot option combination"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/KASLR: Remove return value from handle_mem_options()
x86/corruption-check: Use pr_*() instead of printk()
x86/corruption-check: Fix panic in memory_corruption_check() when boot option without value is provided
Linus Torvalds [Tue, 23 Oct 2018 14:24:22 +0000 (15:24 +0100)]
Merge branch 'x86-asm-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were the fsgsbase related preparatory
patches from Chang S. Bae - but there's also an optimized
memcpy_flushcache() and a cleanup for the __cmpxchg_double() assembly
glue"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/fsgsbase/64: Clean up various details
x86/segments: Introduce the 'CPUNODE' naming to better document the segment limit CPU/node NR trick
x86/vdso: Initialize the CPU/node NR segment descriptor earlier
x86/vdso: Introduce helper functions for CPU and node number
x86/segments/64: Rename the GDT PER_CPU entry to CPU_NUMBER
x86/fsgsbase/64: Factor out FS/GS segment loading from __switch_to()
x86/fsgsbase/64: Convert the ELF core dump code to the new FSGSBASE helpers
x86/fsgsbase/64: Make ptrace use the new FS/GS base helpers
x86/fsgsbase/64: Introduce FS/GS base helper functions
x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately
x86/asm: Use CC_SET()/CC_OUT() in __cmpxchg_double()
x86/asm: Optimize memcpy_flushcache()
Linus Torvalds [Tue, 23 Oct 2018 14:15:20 +0000 (15:15 +0100)]
Merge branch 'x86-apic-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 apic updates from Ingo Molnar:
"Improve the spreading of managed IRQs at allocation time"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq/matrix: Spread managed interrupts on allocation
irq/matrix: Split out the CPU selection code into a helper
Linus Torvalds [Tue, 23 Oct 2018 14:00:03 +0000 (15:00 +0100)]
Merge branch 'sched-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
"The main changes are:
- Migrate CPU-intense 'misfit' tasks on asymmetric capacity systems,
to better utilize (much) faster 'big core' CPUs. (Morten Rasmussen,
Valentin Schneider)
- Topology handling improvements, in particular when CPU capacity
changes and related load-balancing fixes/improvements (Morten
Rasmussen)
- ... plus misc other improvements, fixes and updates"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions
sched/completions/Documentation: Clean up the document some more
sched/completions/Documentation: Fix a couple of punctuation nits
cpu/SMT: State SMT is disabled even with nosmt and without "=force"
sched/core: Fix comment regarding nr_iowait_cpu() and get_iowait_load()
sched/fair: Remove setting task's se->runnable_weight during PELT update
sched/fair: Disable LB_BIAS by default
sched/pelt: Fix warning and clean up IRQ PELT config
sched/topology: Make local variables static
sched/debug: Use symbolic names for task state constants
sched/numa: Remove unused numa_stats::nr_running field
sched/numa: Remove unused code from update_numa_stats()
sched/debug: Explicitly cast sched_feat() to bool
sched/core: Disable SD_PREFER_SIBLING on asymmetric CPU capacity domains
sched/fair: Don't move tasks to lower capacity CPUs unless necessary
sched/fair: Set rq->rd->overload when misfit
sched/fair: Wrap rq->rd->overload accesses with READ/WRITE_ONCE()
sched/core: Change root_domain->overload type to int
sched/fair: Change 'prefer_sibling' type to bool
sched/fair: Kick nohz balance if rq->misfit_task_load
...
Linus Torvalds [Tue, 23 Oct 2018 12:46:36 +0000 (13:46 +0100)]
Merge branch 'ras-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
"Misc smaller fixes and cleanups"
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mcelog: Remove one mce_helper definition
x86/mce: Add macros for the corrected error count bit field
x86/mce: Use BIT_ULL(x) for bit mask definitions
x86/mce-inject: Reset injection struct after injection
Linus Torvalds [Tue, 23 Oct 2018 12:32:18 +0000 (13:32 +0100)]
Merge branch 'perf-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The main updates in this cycle were:
- Lots of perf tooling changes too voluminous to list (big perf trace
and perf stat improvements, lots of libtraceevent reorganization,
etc.), so I'll list the authors and refer to the changelog for
details:
Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter
Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven
Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas
Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir
Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa.
... with the bulk of the changes written by Jiri Olsa, Tzvetomir
Stoyanov and Arnaldo Carvalho de Melo.
- Continued intel_rdt work with a focus on playing well with perf
events. This also imported some non-perf RDT work due to
dependencies. (Reinette Chatre)
- Implement counter freezing for Arch Perfmon v4 (Skylake and newer).
This allows to speed up the PMI handler by avoiding unnecessary MSR
writes and make it more accurate. (Andi Kleen)
- kprobes cleanups and simplification (Masami Hiramatsu)
- Intel Goldmont PMU updates (Kan Liang)
- ... plus misc other fixes and updates"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits)
kprobes/x86: Use preempt_enable() in optimized_callback()
x86/intel_rdt: Prevent pseudo-locking from using stale pointers
kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
perf/x86/intel: Export mem events only if there's PEBS support
x86/cpu: Drop pointless static qualifier in punit_dev_state_show()
x86/intel_rdt: Fix initial allocation to consider CDP
x86/intel_rdt: CBM overlap should also check for overlap with CDP peer
x86/intel_rdt: Introduce utility to obtain CDP peer
tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file
tools lib traceevent: Separate out tep_strerror() for strerror_r() issues
perf python: More portable way to make CFLAGS work with clang
perf python: Make clang_has_option() work on Python 3
perf tools: Free temporary 'sys' string in read_event_files()
perf tools: Avoid double free in read_event_file()
perf tools: Free 'printk' string in parse_ftrace_printk()
perf tools: Cleanup trace-event-info 'tdata' leak
perf strbuf: Match va_{add,copy} with va_end
perf test: S390 does not support watchpoints in test 22
perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG
tools include: Adopt linux/bits.h
...
Linus Torvalds [Tue, 23 Oct 2018 12:08:53 +0000 (13:08 +0100)]
Merge branch 'locking-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking and misc x86 updates from Ingo Molnar:
"Lots of changes in this cycle - in part because locking/core attracted
a number of related x86 low level work which was easier to handle in a
single tree:
- Linux Kernel Memory Consistency Model updates (Alan Stern, Paul E.
McKenney, Andrea Parri)
- lockdep scalability improvements and micro-optimizations (Waiman
Long)
- rwsem improvements (Waiman Long)
- spinlock micro-optimization (Matthew Wilcox)
- qspinlocks: Provide a liveness guarantee (more fairness) on x86.
(Peter Zijlstra)
- Add support for relative references in jump tables on arm64, x86
and s390 to optimize jump labels (Ard Biesheuvel, Heiko Carstens)
- Be a lot less permissive on weird (kernel address) uaccess faults
on x86: BUG() when uaccess helpers fault on kernel addresses (Jann
Horn)
- macrofy x86 asm statements to un-confuse the GCC inliner. (Nadav
Amit)
- ... and a handful of other smaller changes as well"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
locking/lockdep: Make global debug_locks* variables read-mostly
locking/lockdep: Fix debug_locks off performance problem
locking/pvqspinlock: Extend node size when pvqspinlock is configured
locking/qspinlock_stat: Count instances of nested lock slowpaths
locking/qspinlock, x86: Provide liveness guarantee
x86/asm: 'Simplify' GEN_*_RMWcc() macros
locking/qspinlock: Rework some comments
locking/qspinlock: Re-order code
locking/lockdep: Remove duplicated 'lock_class_ops' percpu array
x86/defconfig: Enable CONFIG_USB_XHCI_HCD=y
futex: Replace spin_is_locked() with lockdep
locking/lockdep: Make class->ops a percpu counter and move it under CONFIG_DEBUG_LOCKDEP=y
x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs
x86/cpufeature: Macrofy inline assembly code to work around GCC inlining bugs
x86/extable: Macrofy inline assembly code to work around GCC inlining bugs
x86/paravirt: Work around GCC inlining bugs when compiling paravirt ops
x86/bug: Macrofy the BUG table section handling, to work around GCC inlining bugs
x86/alternatives: Macrofy lock prefixes to work around GCC inlining bugs
x86/refcount: Work around GCC inlining bug
x86/objtool: Use asm macros to work around GCC inlining bugs
...
Linus Torvalds [Tue, 23 Oct 2018 12:04:03 +0000 (13:04 +0100)]
Merge branch 'efi-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes are:
- Add support for enlisting the help of the EFI firmware to create
memory reservations that persist across kexec.
- Add page fault handling to the runtime services support code on x86
so we can more gracefully recover from buggy EFI firmware.
- Fix command line handling on x86 for the boot path that omits the
stub's PE/COFF entry point.
- Other assorted fixes and updates"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86: boot: Fix EFI stub alignment
efi/x86: Call efi_parse_options() from efi_main()
efi/x86: earlyprintk - Add 64bit efi fb address support
efi/x86: drop task_lock() from efi_switch_mm()
efi/x86: Handle page faults occurring while running EFI runtime services
efi: Make efi_rts_work accessible to efi page fault handler
efi/efi_test: add exporting ResetSystem runtime service
efi/libstub: arm: support building with clang
efi: add API to reserve memory persistently across kexec reboot
efi/arm: libstub: add a root memreserve config table
efi: honour memory reservations passed via a linux specific config table
Linus Torvalds [Tue, 23 Oct 2018 11:31:17 +0000 (12:31 +0100)]
Merge branch 'core-rcu-for-linus' of git://git./linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
"The biggest change in this cycle is the conclusion of the big
'simplify RCU to two primary flavors' consolidation work - i.e.
there's a single RCU flavor for any kernel variant (PREEMPT and
!PREEMPT):
- Consolidate the RCU-bh, RCU-preempt, and RCU-sched flavors into a
single flavor similar to RCU-sched in !PREEMPT kernels and into a
single flavor similar to RCU-preempt (but also waiting on
preempt-disabled sequences of code) in PREEMPT kernels.
This branch also includes a refactoring of
rcu_{nmi,irq}_{enter,exit}() from Byungchul Park.
- Now that there is only one RCU flavor in any given running kernel,
the many "rsp" pointers are no longer required, and this cleanup
series removes them.
- This branch carries out additional cleanups made possible by the
RCU flavor consolidation, including inlining now-trivial
functions, updating comments and definitions, and removing
now-unneeded rcutorture scenarios.
- Now that there is only one flavor of RCU in any running kernel,
there is also only on rcu_data structure per CPU. This means that
the rcu_dynticks structure can be merged into the rcu_data
structure, a task taken on by this branch. This branch also
contains a -rt-related fix from Mike Galbraith.
There were also other updates:
- Documentation updates, including some good-eye catches from Joel
Fernandes.
- SRCU updates, most notably changes enabling call_srcu() to be
invoked very early in the boot sequence.
- Torture-test updates, including some preliminary work towards
making rcutorture better able to find problems that result in
insufficient grace-period forward progress.
- Initial changes to RCU to better promote forward progress of grace
periods, including fixing a bug found by Marius Hillenbrand and
David Woodhouse, with the fix suggested by Peter Zijlstra"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits)
srcu: Make early-boot call_srcu() reuse workqueue lists
rcutorture: Test early boot call_srcu()
srcu: Make call_srcu() available during very early boot
rcu: Convert rcu_state.ofl_lock to raw_spinlock_t
rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed
rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks
rcu: Switch dyntick nesting counters to rcu_data structure
rcu: Switch urgent quiescent-state requests to rcu_data structure
rcu: Switch lazy counts to rcu_data structure
rcu: Switch last accelerate/advance to rcu_data structure
rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure
rcu: Merge rcu_dynticks structure into rcu_data structure
rcu: Remove unused rcu_dynticks_snap() from Tiny RCU
rcu: Convert "1UL << x" to "BIT(x)"
rcu: Avoid resched_cpu() when rescheduling the current CPU
rcu: More aggressively enlist scheduler aid for nohz_full CPUs
rcu: Compute jiffies_till_sched_qs from other kernel parameters
rcu: Provide functions for determining if call_rcu() has been invoked
rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure
rcu: Motivate Tiny RCU forward progress
...
Ingo Molnar [Tue, 23 Oct 2018 10:30:19 +0000 (12:30 +0200)]
Merge branch 'x86/cache' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Tue, 23 Oct 2018 10:14:47 +0000 (11:14 +0100)]
Merge tag 's390-4.20-1' of git://git./linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- Improved access control for the zcrypt driver, multiple device nodes
can now be created with different access control lists
- Extend the pkey API to provide random protected keys, this is useful
for encrypted swap device with ephemeral protected keys
- Add support for virtually mapped kernel stacks
- Rework the early boot code, this moves the memory detection into the
boot code that runs prior to decompression.
- Add KASAN support
- Bug fixes and cleanups
* tag 's390-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (83 commits)
s390/pkey: move pckmo subfunction available checks away from module init
s390/kasan: support preemptible kernel build
s390/pkey: Load pkey kernel module automatically
s390/perf: Return error when debug_register fails
s390/sthyi: Fix machine name validity indication
s390/zcrypt: fix broken zcrypt_send_cprb in-kernel api function
s390/vmalloc: fix VMALLOC_START calculation
s390/mem_detect: add missing include
s390/dumpstack: print psw mask and address again
s390/crypto: Enhance paes cipher to accept variable length key material
s390/pkey: Introduce new API for transforming key blobs
s390/pkey: Introduce new API for random protected key verification
s390/pkey: Add sysfs attributes to emit secure key blobs
s390/pkey: Add sysfs attributes to emit protected key blobs
s390/pkey: Define protected key blob format
s390/pkey: Introduce new API for random protected key generation
s390/zcrypt: add ap_adapter_mask sysfs attribute
s390/zcrypt: provide apfs failure code on type 86 error reply
s390/zcrypt: zcrypt device driver cleanup
s390/kasan: add support for mem= kernel parameter
...
Linus Torvalds [Tue, 23 Oct 2018 10:06:43 +0000 (11:06 +0100)]
Merge tag 'please-pull-next' of git://git./linux/kernel/git/aegl/linux
Pull ia64 updates from Tony Luck:
"Miscellaneous ia64 fixes from Christoph"
* tag 'please-pull-next' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
intel-iommu: mark intel_dma_ops static
ia64: remove machvec_dma_sync_{single,sg}
ia64/sn2: remove no-ops dma sync methods
ia64: remove the unused iommu_dma_init function
ia64: remove the unused pci_iommu_shutdown function
ia64: remove the unused bad_dma_address symbol
ia64: remove iommu_dma_supported
ia64: remove the dead iommu_sac_force variable
ia64: remove the kern_mem_attribute export
Linus Torvalds [Tue, 23 Oct 2018 10:02:05 +0000 (11:02 +0100)]
Merge branch 'stable/for-linus-4.20' of git://git./linux/kernel/git/konrad/swiotlb
Pull xen swiotlb fix from Konrad Rzeszutek Wilk:
"One tiny fix for the Xen SWIOTLB mechanism that occasionally happened
with devices that didn't allocate size in power of two but rather some
odd sizes.
We neglected to make the memory coherent leading to all kinds of fun
crashes"
* 'stable/for-linus-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
xen-swiotlb: use actually allocated size on check physical continuous
Linus Torvalds [Tue, 23 Oct 2018 09:33:16 +0000 (10:33 +0100)]
Merge tag 'acpi-4.20-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These fix ACPICA issues related to the handling of module-level AML,
fix an ordering issue during ACPI initialization, update ACPICA to
upstream revision
20181003 (including fixes mostly), fix issues with
system-wide suspend/resume related to the ACPI driver for Intel SoCs
(LPSS), fix device enumeration issues on boards with Dollar Cove or
Whiskey Cove Intel PMICs, prevent ACPICA from calling ktime_get() in
unsuitable conditions, update a few drivers and clean up some code in
several places.
Specifics:
- Fix ACPICA issues related to the handling of module-level AML and
make the ACPI initialization code parse ECDT before loading the
definition block tables (Erik Schmauss).
- Update ACPICA to upstream revision
20181003 including fixes related
to the ill-defined "generic serial bus" and the handling of the
_REG object (Bob Moore).
- Fix some issues with system-wide suspend/resume on Intel BYT/CHT
related to the handling of I2C controllers in the ACPI LPSS driver
for Intel SoCs (Hans de Goede).
- Modify the ACPI namespace scanning code to enumerate INT33FE HID
devices as platform devices with I2C resources to avoid device
enumeration problems on boards with Dollar Cove or Whiskey Cove
Intel PMICs (Hans de Goede).
- Prevent ACPICA from using ktime_get() during early resume from
system-wide suspend before resuming the timekeeping which generally
is unsafe and triggers a warning from the timekeeping code (Bart
Van Assche).
- Add low-level real time clock support to the ACPI Time and Aalarm
Device (TAD) driver (Rafael Wysocki).
- Fix the ACPI SBS driver to avoid GPE storms on MacBook Pro and
Oopses when removing modules (Ronald Tschalär).
- Fix the ACPI PPTT parsing code to handle architecturally unknown
cache types properly (Jeffrey Hugo).
- Fix initialization issue in the ACPI processor driver (Dou Liyang).
- Clean up the code in several places (Andy Shevchenko, Bartlomiej
Zolnierkiewicz, David Arcari, zhong jiang)"
* tag 'acpi-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (33 commits)
ACPI / scan: Create platform device for INT33FE ACPI nodes
ACPI / OSL: Use 'jiffies' as the time bassis for acpi_os_get_timer()
ACPI: probe ECDT before loading AML tables regardless of module-level code flag
ACPICA: Remove acpi_gbl_group_module_level_code and only use acpi_gbl_execute_tables_as_methods instead
ACPICA: AML Parser: fix parse loop to correctly skip erroneous extended opcodes
ACPICA: AML interpreter: add region addresses in global list during initialization
ACPI: TAD: Add low-level support for real time capability
ACPI: remove redundant 'default n' from Kconfig
ACPI / SBS: Fix rare oops when removing modules
ACPI / SBS: Fix GPE storm on recent MacBookPro's
ACPI/PPTT: Handle architecturally unknown cache types
drivers: base: cacheinfo: Do not populate sysfs for unknown cache types
ACPICA: Update version to
20181003
ACPICA: Never run _REG on system_memory and system_IO
ACPICA: Split large interpreter file
ACPICA: Update for field unit access
ACPICA: Rename some of the Field Attribute defines
ACPICA: Update for generic_serial_bus and attrib_raw_process_bytes protocol
ACPI / processor: Fix the return value of acpi_processor_ids_walk()
ACPI / LPSS: Resume BYT/CHT I2C controllers from resume_noirq
...
Linus Torvalds [Tue, 23 Oct 2018 09:28:21 +0000 (10:28 +0100)]
Merge tag 'pm-4.20-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These make hibernation on 32-bit x86 systems work in all of the cases
in which it works on 64-bit x86 ones, update the menu cpuidle governor
and the "polling" state to make them more efficient, add more hardware
support to cpufreq drivers and fix issues with some of them, fix a bug
in the conservative cpufreq governor, fix the operating performance
points (OPP) framework and make it more stable, update the devfreq
subsystem to take changes in the APIs used by into account and clean
up some things all over.
Specifics:
- Backport hibernation bug fixes from x86-64 to x86-32 and
consolidate hibernation handling on x86 to allow 32-bit systems to
work in all of the cases in which 64-bit ones work (Zhimin Gu, Chen
Yu).
- Fix hibernation documentation (Vladimir D. Seleznev).
- Update the menu cpuidle governor to fix a couple of issues with it,
make it more efficient in some cases and clean it up (Rafael
Wysocki).
- Rework the cpuidle polling state implementation to make it more
efficient (Rafael Wysocki).
- Clean up the cpuidle core somewhat (Fieah Lim).
- Fix the cpufreq conservative governor to take policy limits into
account properly in some cases (Rafael Wysocki).
- Add support for retrieving guaranteed performance information to
the ACPI CPPC library and make the intel_pstate driver use it to
expose the CPU base frequency via sysfs on systems with the
hardware-managed P-states (HWP) feature enabled (Srinivas
Pandruvada).
- Fix clang warning in the CPPC cpufreq driver (Nathan Chancellor).
- Get rid of device_node.name printing from cpufreq (Rob Herring).
- Remove unnecessary unlikely() from the cpufreq core (Igor Stoppa).
- Add support for the r8a7744 SoC to the cpufreq-dt driver (Biju
Das).
- Update the dt-platdev cpufreq driver to allow RK3399 to have
separate tunables per cluster (Dmitry Torokhov).
- Fix the dma_alloc_coherent() usage in the tegra186 cpufreq driver
(Christoph Hellwig).
- Make the imx6q cpufreq driver read OCOTP through nvmem for
imx6ul/imx6ull (Anson Huang).
- Fix several bugs in the operating performance points (OPP)
framework and make it more stable (Viresh Kumar, Dave Gerlach).
- Update the devfreq subsystem to take changes in the APIs used by
into account, fix some issues with it and make it stop print
device_node.name directly (Bjorn Andersson, Enric Balletbo i Serra,
Matthias Kaehlcke, Rob Herring, Vincent Donnefort, zhong jiang).
- Prepare the generic power domains (genpd) framework for dealing
with domains containing CPUs (Ulf Hansson).
- Prevent sysfs attributes representing low-power S0 residency
counters from being exposed if low-power S0 support is not
indicated in ACPI FADT (Rajneesh Bhardwaj).
- Get rid of custom CPU features macros for Intel CPUs from the
intel_idle and RAPL drivers (Andy Shevchenko).
- Update the tasks freezer to list tasks that refused to freeze and
caused a system transition to a sleep state to be aborted (Todd
Brandt).
- Update the pm-graph set of tools to v5.2 (Todd Brandt).
- Fix some issues in the cpupower utility (Anders Roxell, Prarit
Bhargava)"
* tag 'pm-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits)
PM / Domains: Document flags for genpd
PM / Domains: Deal with multiple states but no governor in genpd
PM / Domains: Don't treat zero found compatible idle states as an error
cpuidle: menu: Avoid computations when result will be discarded
cpuidle: menu: Drop redundant comparison
cpufreq: tegra186: don't pass GFP_DMA32 to dma_alloc_coherent()
cpufreq: conservative: Take limits changes into account properly
Documentation: intel_pstate: Add base_frequency information
cpufreq: intel_pstate: Add base_frequency attribute
ACPI / CPPC: Add support for guaranteed performance
cpuidle: menu: Simplify checks related to the polling state
PM / tools: sleepgraph and bootgraph: upgrade to v5.2
PM / tools: sleepgraph: first batch of v5.2 changes
cpupower: Fix coredump on VMWare
cpupower: Fix AMD Family 0x17 msr_pstate size
cpufreq: imx6q: read OCOTP through nvmem for imx6ul/imx6ull
cpufreq: dt-platdev: allow RK3399 to have separate tunables per cluster
cpuidle: poll_state: Revise loop termination condition
cpuidle: menu: Move the latency_req == 0 special case check
cpuidle: menu: Avoid computations for very close timers
...
Linus Torvalds [Tue, 23 Oct 2018 09:22:33 +0000 (10:22 +0100)]
Merge branch 'pcmcia-next' of git://git./linux/kernel/git/brodo/linux
Pull pcmcia fixes from Dominik Brodowski:
"These are just a few odd fixes and improvements to the PCMCIA core and
to a few PCMCIA device drivers"
* 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
pcmcia: Implement CLKRUN protocol disabling for Ricoh bridges
pcmcia: pcmcia_resource: Replace mdelay() with msleep()
pcmcia: add error handling for pcmcia_enable_device in qlogic_stub
char: pcmcia: cm4000_cs: Replace mdelay with usleep_range in set_protocol
pcmcia: Use module_pcmcia_driver for scsi drivers
pcmcia: remove KERN_INFO level from debug message
Linus Torvalds [Tue, 23 Oct 2018 08:42:05 +0000 (09:42 +0100)]
Merge tag 'for-linus-4.20' of https://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
"Lots of small changes to the IPMI driver. Most of the changes are
logging cleanup and style fixes. There are a few smaller bug fixes"
* tag 'for-linus-4.20' of https://github.com/cminyard/linux-ipmi: (21 commits)
ipmi: Fix timer race with module unload
ipmi:ssif: Add support for multi-part transmit messages > 2 parts
MAINTAINERS: Add file patterns for ipmi device tree bindings
ipmi: Remove platform driver overrides and use the id_table
ipmi: Free the address list on module cleanup
ipmi: Don't leave holes in the I2C address list in the ssif driver
ipmi: fix return value of ipmi_set_my_LUN
ipmi: Convert pr_xxx() to dev_xxx() in the BT code
ipmi:dmi: Ignore IPMI SMBIOS entries with a zero base address
ipmi:dmi: Use pr_fmt in the IPMI DMI code
ipmi: Change to ktime_get_ts64()
ipmi_si: fix potential integer overflow on large shift
ipmi_si_pci: fix NULL device in ipmi_si error message
ipmi: Convert printk(KERN_<level> to pr_<level>(
ipmi: Use more common logging styles
ipmi: msghandler: Add and use pr_fmt and dev_fmt, remove PFX
pci:ipmi: Move IPMI PCI class id defines to pci_ids.h
ipmi: Finally get rid of ipmi_user_t and ipmi_smi_t
ipmi:powernv: Convert ipmi_smi_t to struct ipmi_smi
hwmon:ibm: Change ipmi_user_t to struct ipmi_user *
...
Linus Torvalds [Tue, 23 Oct 2018 07:55:03 +0000 (08:55 +0100)]
Merge tag 'leds-for-4.20-rc1' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED updates from Jacek Anaszewski:
"We introduce LED pattern trigger - the idea that was proposed three
years ago now received enough attention and determination to drive it
to the successful end.
There is also one new LED class driver and couple of improvements to
the existing ones.
New LED class driver:
- add support for Panasonic AN30259A with related DT bindings
New LED trigger:
- introduce LED pattern trigger
leds-sc27xx-bltc:
- implement pattern_set/clear ops to enable support for pattern
trigger's hw_pattern sysfs file
Improvements to existing LED class drivers:
- leds-pwm: don't print error message on -EPROBE_DEFER
- leds-gpio: try to lookup gpiod from device
- leds-as3645a: convert to using %pOFn instead of device_node.name"
* tag 'leds-for-4.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: sc27xx: Add pattern_set/clear interfaces for LED controller
leds: core: Introduce LED pattern trigger
leds: add Panasonic AN30259A support
dt-bindings: leds: document Panasonic AN30259A bindings
leds: gpio: Try to lookup gpiod from device
leds: pwm: silently error out on EPROBE_DEFER
leds: Convert to using %pOFn instead of device_node.name
Linus Torvalds [Tue, 23 Oct 2018 07:45:05 +0000 (08:45 +0100)]
Merge tag 'gpio-v4.20-1' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.20 series:
Core changes:
- A patch series from Hans Verkuil to make it possible to
enable/disable IRQs on a GPIO line at runtime and drive GPIO lines
as output without having to put/get them from scratch.
The irqchip callbacks have been improved so that they can use only
the fastpatch callbacks to enable/disable irqs like any normal
irqchip, especially the gpiod_lock_as_irq() has been improved to be
callable in fastpath context.
A bunch of rework had to be done to achieve this but it is a big
win since I never liked to restrict this to slowpath. The only call
requireing slowpath was try_module_get() and this is kept at the
.request_resources() slowpath callback. In the GPIO CEC driver this
is a big win sine a single line is used for both outgoing and
incoming traffic, and this needs to use IRQs for incoming traffic
while actively driving the line for outgoing traffic.
- Janusz Krzysztofik improved the GPIO array API to pass a "cookie"
(struct gpio_array) and a bitmap for setting or getting multiple
GPIO lines at once.
This improvement orginated in a specific need to speed up an OMAP1
driver and has led to a much better API and real performance gains
when the state of the array can be used to bypass a lot of checks
and code when we want things to go really fast.
The previous code would minimize the number of calls down to the
driver callbacks assuming the CPU speed was orders of magnitude
faster than the I/O latency, but this assumption was wrong on
several platforms: what we needed to do was to profile and improve
the speed on the hot path of the array functions and this change is
now completed.
- Clean out the painful and hard to grasp BNF experiments from the
device tree bindings. Future approaches are looking into using JSON
schema for this purpose. (Rob Herring is floating a patch series.)
New drivers:
- The RCAR driver now supports r8a774a1 (RZ/G2M).
- Synopsys GPIO via CREGs driver.
Major improvements:
- Modernization of the EP93xx driver to use irqdomain and other
contemporary concepts.
- The ingenic driver has been merged into the Ingenic pin control
driver and removed from the GPIO subsystem.
- Debounce support in the ftgpio010 driver"
* tag 'gpio-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (116 commits)
gpio: Clarify kerneldoc on gpiochip_set_chained_irqchip()
gpio: Remove unused 'irqchip' argument to gpiochip_set_cascaded_irqchip()
gpio: Drop parent irq assignment during cascade setup
mmc: pwrseq_simple: Fix incorrect handling of GPIO bitmap
gpio: fix SNPS_CREG kconfig dependency warning
gpiolib: Initialize gdev field before is used
gpio: fix kernel-doc after devres.c file rename
gpio: fix doc string for devm_gpiochip_add_data() to not talk about irq_chip
gpio: syscon: Fix possible NULL ptr usage
gpiolib: Show correct direction from the beginning
pinctrl: msm: Use init_valid_mask exported function
gpiolib: Add init_valid_mask exported function
GPIO: add single-register GPIO via CREG driver
dt-bindings: Document the Synopsys GPIO via CREG bindings
gpio: mockup: use device properties instead of platform_data
gpio: Slightly more helpful debugfs
gpio: omap: Remove set but not used variable 'dev'
gpio: omap: drop omap_gpio_list
Accept partial 'gpio-line-names' property.
gpio: omap: get rid of the conditional PM runtime calls
...
Linus Torvalds [Tue, 23 Oct 2018 07:40:16 +0000 (08:40 +0100)]
Merge tag 'pinctrl-v4.20-1' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"This is the bulk of pin control changes for the v4.20 series:
There were no significant changes to the core this time! Bur the new
Qualcomm, Mediatek and Broadcom drivers are quite interesting as they
will be used in a few million embedded devices the coming years as it
seems.
New drivers:
- Broadcom Northstar pin control driver.
- Mediatek MT8183 subdriver.
- Mediatek MT7623 subdriver.
- Mediatek MT6765 subdriver.
- Meson g12a subdriver.
- Nuvoton NPCM7xx pin control and GPIO driver.
- Qualcomm QCS404 pin control and GPIO subdriver.
- Qualcomm SDM660 pin control and GPIO subdriver.
- Renesas R8A7744 PFC subdriver.
- Renesas R8A774C0 PFC subdriver.
- Renesas RZ/N1 pinctrl driver
Major improvements:
- Pulled the GPIO support for Ingenic over from the GPIO subsystem
and consolidated it all in the Ingenic pin control driver.
- Major cleanups and consolidation work in all Intel drivers.
- Major cleanups and consolidation work in all Mediatek drivers.
- Lots of incremental improvements to the Renesas PFC pin controller
family.
- All drivers doing GPIO now include <linux/gpio/driver.h> and
nothing else"
* tag 'pinctrl-v4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (153 commits)
pinctrl: sunxi: Fix a memory leak in 'sunxi_pinctrl_build_state()'
gpio: uniphier: include <linux/bits.h> instead of <linux/bitops.h>
pinctrl: uniphier: include <linux/bits.h> instead of <linux/bitops.h>
dt-bindings: pinctrl: bcm4708-pinmux: improve example binding
pinctrl: geminilake: Sort register offsets by value
pinctrl: geminilake: Get rid of unneeded ->probe() stub
pinctrl: geminilake: Update pin list for B0 stepping
pinctrl: renesas: Fix platform_no_drv_owner.cocci warnings
pinctrl: mediatek: Make eint_m u16
pinctrl: bcm: ns: Use uintptr_t for casting data
pinctrl: madera: Fix uninitialized variable bug in madera_mux_set_mux
pinctrl: gemini: Fix up TVC clock group
pinctrl: gemini: Drop noisy debug prints
pinctrl: gemini: Mask and set properly
pinctrl: mediatek: select GPIOLIB
pinctrl: rza1: don't manually release devm managed resources
MAINTAINERS: update entry for Mediatek pin controller
pinctrl: bcm: add Northstar driver
dt-bindings: pinctrl: document Broadcom Northstar pin mux controller
pinctrl: qcom: fix 'const' pointer handling
...
Linus Torvalds [Tue, 23 Oct 2018 07:36:15 +0000 (08:36 +0100)]
Merge tag 'mmc-v4.20' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Introduce a host helper function to share re-tuning progress
MMC host:
- sdhci: Add support for v4 host mode
- sdhci-of-arasan: Add Support for AM654 MMC and PHY
- sdhci-sprd: Add support for Spreadtrum's host controller
- sdhci-tegra: Add support for HS400 enhanced strobe
- sdhci-tegra: Enable UHS/HS200 modes for Tegra186/210
- sdhci-tegra: Add support for HS400 delay line calibration
- sdhci-tegra: Add support for pad calibration
- sdhci-of-dwcmshc: Address 128MB DMA boundary limitation
- sdhci-of-esdhc: Add support for tuning erratum
A008171
- sdhci-iproc: Add ACPI support
- mediatek: Add support for MT8183
- mediatek: Improve the support for tuning
- mediatek: Add bus clock control for MT2712
- jz4740: Add support for the JZ4725B
- mmci: Add support for the stm32 sdmmc variant
- mmci: Add support for an optional reset control
- mmci: Add some new variant specific properties/callbacks
- mmci: Re-structure DMA code to prepare for new variants
- renesas_sdhi: Add support for r8a77470, r8a7744 and r8a774a1
- renesas_sdhi_internal_dmac: Whitelist r8a77970 and r8a774a1
- tmio/uniphier-sd: Add new UniPhier SD/eMMC controller driver
- tmio/renesas_sdhi: Deal properly with SCC detection during re-tune
- tmio/renesas_sdhi: Refactor/consolidate clock management
- omap_hsmmc: Drop cover detection and some unused platform data
- dw_mmc-exynos: Enable tuning for more speed modes
- sunxi: Clarify the new timing mode and enable it for the A64 controller
- various: Convert to slot GPIO descriptors"
* tag 'mmc-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (129 commits)
mmc: mediatek: drop too much code of tuning method
mmc: mediatek: add MT8183 MMC driver support
mmc: mediatek: tune CMD/DATA together
mmc: mediatek: fix cannot receive new request when msdc_cmd_is_ready fail
mmc: mediatek: fill the actual clock for mmc debugfs
mmc: dt-bindings: add support for MT8183 SoC
mmc: uniphier-sd: avoid using broken DMA RX channel
mmc: uniphier-sd: fix DMA disabling
mmc: tmio: simplify the DMA mode test
mmc: tmio: remove TMIO_MMC_HAVE_HIGH_REG flag
mmc: tmio: move MFD variant reset to a platform hook
mmc: renesas_sdhi: Add r8a77470 SDHI1 support
dt-bindings: mmc: renesas_sdhi: Add r8a77470 support
mmc: mmci: add stm32 sdmmc variant
dt-bindings: mmci: add stm32 sdmmc variant
mmc: mmci: add stm32 sdmmc registers
mmc: mmci: add clock divider for stm32 sdmmc
mmc: mmci: add optional reset property
dt-bindings: mmci: add optional reset property
mmc: mmci: add variant property to not read datacnt
...
Thor Thayer [Mon, 22 Oct 2018 22:22:26 +0000 (17:22 -0500)]
net: stmmac: Set OWN bit for jumbo frames
Ping with Jumbo packet does not reply and get a watchdog timeout
[ 46.059616] ------------[ cut here ]------------
[ 46.064268] NETDEV WATCHDOG: eth0 (socfpga-dwmac): transmit queue 0 timed out
[ 46.071471] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x2cc/0x2d8
[ 46.079708] Modules linked in:
[ 46.082761] CPU: 1 PID: 0 Comm: swapper/1 Not tainted
4.18.0-00115-gc262be665854-dirty #264
[ 46.091082] Hardware name: SoCFPGA Stratix 10 SoCDK (DT)
[ 46.096377] pstate:
20000005 (nzCv daif -PAN -UAO)
[ 46.101152] pc : dev_watchdog+0x2cc/0x2d8
[ 46.105149] lr : dev_watchdog+0x2cc/0x2d8
[ 46.109144] sp :
ffff00000800bd80
[ 46.112447] x29:
ffff00000800bd80 x28:
ffff80007a9b4940
[ 46.117744] x27:
00000000ffffffff x26:
ffff80007aa183b0
[ 46.123040] x25:
0000000000000001 x24:
0000000000000140
[ 46.128336] x23:
ffff80007aa1839c x22:
ffff80007aa17fb0
[ 46.133632] x21:
ffff80007aa18000 x20:
ffff0000091a7000
[ 46.138927] x19:
0000000000000000 x18:
ffffffffffffffff
[ 46.144223] x17:
0000000000000000 x16:
0000000000000000
[ 46.149519] x15:
ffff0000091a96c8 x14:
07740775076f0720
[ 46.154814] x13:
07640765076d0769 x12:
0774072007300720
[ 46.160110] x11:
0765077507650775 x10:
0771072007740769
[ 46.165406] x9 :
076d0773076e0761 x8 :
077207740720073a
[ 46.170702] x7 :
072907630761076d x6 :
ffff80007ff9a0c0
[ 46.175997] x5 :
ffff80007ff9a0c0 x4 :
0000000000000002
[ 46.181293] x3 :
0000000000000000 x2 :
ffff0000091ac180
[ 46.186589] x1 :
e6a742ebe628e800 x0 :
0000000000000000
[ 46.191885] Call trace:
[ 46.194326] dev_watchdog+0x2cc/0x2d8
[ 46.197980] call_timer_fn+0x20/0x78
[ 46.201544] expire_timers+0xa4/0xb0
[ 46.205108] run_timer_softirq+0xe4/0x198
[ 46.209107] __do_softirq+0x114/0x210
[ 46.212760] irq_exit+0xd0/0xd8
[ 46.215895] __handle_domain_irq+0x60/0xb0
[ 46.219977] gic_handle_irq+0x58/0xa8
[ 46.223628] el1_irq+0xb0/0x128
[ 46.226761] arch_cpu_idle+0x10/0x18
[ 46.230326] do_idle+0x1d4/0x288
[ 46.233544] cpu_startup_entry+0x24/0x28
[ 46.237457] secondary_start_kernel+0x17c/0x1c0
[ 46.241971] ---[ end trace
57048cd1372cd828 ]---
Inspection of queue showed Jumbo packets were not sent out.
The ring Jumbo packet function needs to set the OWN bit so
the packet is sent.
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thor Thayer [Mon, 22 Oct 2018 22:22:25 +0000 (17:22 -0500)]
arm64: dts: stratix10: Support Ethernet Jumbo frame
Properly specify the RX and TX FIFO size which is important
for Jumbo frames.
Update the max-frame-size to support Jumbo frames.
Signed-off-by: Thor Thayer <thor.thayer@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Oct 2018 03:21:30 +0000 (20:21 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree:
1) rbtree lookup from control plane returns the left-hand side element
of the range when the interval end flag is set on.
2) osf extension is not supported from the input path, reject this from
the control plane, from Fernando Fernandez Mancera.
3) xt_TEE is leaving output interface unset due to a recent incorrect
netns rework, from Taehee Yoo.
4) xt_TEE allows to select an interface which does not belong to this
netnamespace, from Taehee Yoo.
5) Zero private extension area in nft_compat, just like we do in x_tables,
otherwise we leak kernel memory to userspace.
6) Missing .checkentry and .destroy entries in new DNAT extensions breaks
it since we never load nf_conntrack dependencies, from Paolo Abeni.
7) Do not remove flowtable hook from netns exit path, the netdevice handler
already deals with this, also from Taehee Yoo.
8) Only cleanup flowtable entries that reside in this netnamespace, also
from Taehee Yoo.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Watson [Mon, 22 Oct 2018 19:36:37 +0000 (19:36 +0000)]
tls: Add maintainers
Add John and Daniel as additional tls co-maintainers to help review
patches and fix syzbot reports.
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Mon, 22 Oct 2018 18:51:36 +0000 (21:51 +0300)]
net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
After flushing all mcast entries from the table, the ones contained in
mc list of ndev are not restored when promisc mode is toggled off,
because they are considered as synched with ALE, thus, in order to
restore them after promisc mode - reset syncing info. This fix
touches only switch mode devices, including single port boards
like Beagle Bone.
Fixes: commit 5da1948969bc
("net: ethernet: ti: cpsw: fix lost of mcast packets while rx_mode update")
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Oct 2018 03:15:39 +0000 (20:15 -0700)]
Merge branch 'octeontx2-af-NPC-parser-and-NIX-blocks-initialization'
Sunil Goutham says:
====================
octeontx2-af: NPC parser and NIX blocks initialization
This patchset is a continuation to earlier submitted two patch
series to add a new driver for Marvell's OcteonTX2 SOC's
Resource virtualization unit (RVU) admin function driver.
1. octeontx2-af: Add RVU Admin Function driver
https://www.spinics.net/lists/netdev/msg528272.html
2. octeontx2-af: NPA and NIX blocks initialization
https://www.spinics.net/lists/netdev/msg529163.html
This patch series adds more NIX block configuration logic
and additionally adds NPC block parser profile configuration.
In brief below is what this series adds.
NIX block:
- Support for PF/VF to allocate/free transmit scheduler queues,
maintenance and their configuration.
- Adds support for packet replication lists, only broadcast
packets is covered for now.
- Defines few RSS flow algorithms for HW to distribute packets.
This is not the hash algorithsm (i.e toeplitz or crc32), here SW
defines what fields in packet should HW take and calculate the hash.
- Support for PF/VF to configure VTAG strip and capture capabilities.
- Reset NIXLF statastics.
NPC block:
This block has multiple parser engines which support packet parsing
at multiple layers and generates a parse result which is further used
to generate a key. Based on packet field offsets in the key, SW can
install packet forwarding rules.
This patch series adds
- Initial parser profile to be programmed into parser engines.
- Default forwarding rules to forward packets to different logical
interfaces having a NIXLF attached.
- Support for promiscuous and multicast modes.
Changes from v1:
1 Fixed kernel build failure when compiled with BIG_ENDIAN enabled.
- Reported by Kbuild test robot
2 Fixed a warning observed when kernel is built with -Wunused-but-set-variable
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:56:04 +0000 (23:26 +0530)]
octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
By default NIXLF is set in UCAST mode. This patch adds a new
mailbox message which when sent by a RVU PF changes this default
mode. When promiscuous mode is needed, the reserved promisc entry
for each of RVU PF is setup to match against ingress channel number
only, so that all pkts on that channel are accepted and forwarded
to the mode change requesting PF_FUNC's NIXLF.
PROMISC and ALLMULTI modes are supported only for PFs, for VFs only
UCAST mode is supported.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:56:03 +0000 (23:26 +0530)]
octeontx2-af: Support for setting MAC address
Added a new mailbox message for a PF/VF to set/update
it's NIXLF's MAC address. Also updates unicast NPC
MCAM entry with this address as matching DMAC.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:56:02 +0000 (23:26 +0530)]
octeontx2-af: Support for changing RSS algorithm
This patch adds support for a RVU PF/VF to change
NIX Rx flowkey algorithm index in NPC RX RSS_ACTION.
eg: a ethtool command changing RSS algorithm for a netdev
interface would trigger this change in NPC.
If PF/VF doesn't specify any MCAM entry index then default
UCAST entry of the NIXLF attached to PF/VF will be updated
with RSS_ACTION and flowkey index.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:56:01 +0000 (23:26 +0530)]
octeontx2-af: NIX Rx flowkey configuration for RSS
Configure NIX RX flowkey algorithm configuration to support
RSS (receive side scaling). Currently support for only L3/L4
2-tuple and 4-tuple hash of IPv4/v6/TCP/UDP/SCTP is added.
HW supports upto 32 different flowkey algorithms which SW
can define, this patch defines 9. NPC RX ACTION has to point
to one of these flowkey indices for RSS to work.
The configuration is dependent on NPC parse result's layer
info. So if NPC KPU profile changes suchthat LID/LTYPE values
of above said protocols change then this configuration will
most likely be effected.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:56:00 +0000 (23:26 +0530)]
octeontx2-af: Install ucast and bcast pkt forwarding rules
Upon NIXLF_ALLOC install a unicast forwarding rule in NPC MCAM
like below
- Match pkt DMAC with NIXLF attached PF/VF's MAC address.
- Ingress channel
- Action is UCAST
- Forward to PF_FUNC of this NIXLF
And broadcast pkt forwarding rule as
- Match L2B bit in MCAM search key
- Ingress channel
- Action is UCAST, for now, later it will be changed to MCAST.
Only PFs can install this rule
Upon NIXLF_FREE disable all MCAM entries in use by that NIXLF.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stanislaw Kardach [Mon, 22 Oct 2018 17:55:59 +0000 (23:25 +0530)]
octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response
Add LMAC channel info like Rx/Tx channel base and count to
NIXLF_ALLOC mailbox message response. This info is used by
NIXLF attached RVU PF/VF to configure SQ's default channel,
TL3_TL2_LINKX_CFG and to install MCAM rules in NPC based
on matching ingress channel number.
Signed-off-by: Stanislaw Kardach <skardach@marvell.com>
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:58 +0000 (23:25 +0530)]
octeontx2-af: NPC MCAM and LDATA extract minimal configuration
This patch adds some minimal configuration for NPC MCAM and
LDATA extraction which is sufficient enough to install
ucast/bcast/promiscuous forwarding rules. Below is the
config done
- LDATA extraction config to extract DMAC from pkt
to offset 64bit in MCAM search key.
- Set MCAM lookup keysize to 224bits
- Set MCAM TX miss action to UCAST_DEFAULT
- Set MCAM RX miss action to DROP
Also inorder to have guaranteed space in MCAM to install
ucast forwarding rule for each of RVU PF/VF, reserved
one MCAM entry for each of NIXLF for ucast rule. And two
entries for each of RVU PF. One for bcast pkt replication
and other for promiscuous mode which allows all pkts
received on a HW CGX/LBK channel.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:57 +0000 (23:25 +0530)]
octeontx2-af: Enable packet length and csum validation
Config NPC layer info from KPU profile into protocol
checker to identify outer L2/IPv4/TCP/UDP headers in a
packet. And enable IPv4 checksum validation.
L3/L4 and L4 CSUM validation will be enabled by PF/VF
drivers by configuring NIX_AF_LF(0..127)_RX_CFG via mbox
i.e 'nix_lf_alloc_req->rx_cfg'
Also enable setting of NPC_RESULT_S[L2B] when an outer
L2 broadcast address is detected. This will help in
installing NPC MCAM rules for broadcast packets.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vamsi Attunuru [Mon, 22 Oct 2018 17:55:56 +0000 (23:25 +0530)]
octeontx2-af: Support for VTAG strip and capture
Added support for PF/VF drivers to configure NIX to
capture and/or strip VLAN tag from ingress packets.
Signed-off-by: Vamsi Attunuru <vamsi.attunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:55 +0000 (23:25 +0530)]
octeontx2-af: Update bcast list upon NIXLF alloc/free
Upon NIXLF ALLOC/FREE, add or remove corresponding PF_FUNC from
the broadcast packet replication list of the CGX LMAC mapped
RVU PF.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:54 +0000 (23:25 +0530)]
octeontx2-af: Broadcast packet replication support
Allocate memory for mcast/bcast/mirror replication entry
contexts, replication buffers (used by HW) and config HW
with corresponding memory bases. Added support for installing
MCEs via NIX AQ mbox.
For now support is restricted to broadcast pkt replication,
hence MCE table size and number of replication buffers
allocated are less. Each CGX LMAC mapped RVU PF is assigned
a MCE table of size 'num VFs of that PF + PF'.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geetha sowjanya [Mon, 22 Oct 2018 17:55:53 +0000 (23:25 +0530)]
octeontx2-af: Config pkind for CGX mapped PFs
For each CGX LMAC that is mapped to a RVU PF, allocate
a pkind and config the same in CGX. For a received packet
at CGX LMAC interface this pkind is used by NPC block
to start parsing of packet.
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:52 +0000 (23:25 +0530)]
octeontx2-af: Config NPC KPU engines with parser profile
This patch configures all 16 KPUs and iKPU (pkinds) with
the KPU parser profile defined in npc_profile.h. Each KPU
engine has a 128 entry CAM, only CAM entries which are listed
in the profile are enabled and rest are left disabled.
Also
- Memory is allocated for pkind's bitmap and PFFUNC, interface
channel mapping.
- Added all CSR offsets of NPC HW block.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hao Zheng [Mon, 22 Oct 2018 17:55:51 +0000 (23:25 +0530)]
octeontx2-af: Add NPC KPU profile
NPC block is responsible for parsing and forwarding
packets to different NIXLFs. NPC has 16 KPU engines
(Kangaroo parse engine) and one iKPU which represents
pkinds. Each physical port either CGX/LBK is assigned
a pkind and upon receiving a packet HW takes that port's
pkind and starts parsing as per the KPU engines config.
This patch adds header files which contain configuration
profile/array for each of the iKPU and 16 KPU engines.
Signed-off-by: Hao Zheng <hao.zheng@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vamsi Attunuru [Mon, 22 Oct 2018 17:55:50 +0000 (23:25 +0530)]
octeontx2-af: Reset NIXLF's Rx/Tx stats
This patch adds a new mailbox message to reset
a NIXLF's receive and transmit HW stats.
Signed-off-by: Vamsi Attunuru <vamsi.attunuru@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:49 +0000 (23:25 +0530)]
octeontx2-af: NIX Tx scheduler queue config support
This patch adds support for a PF/VF driver to configure
NIX transmit scheduler queues via mbox. Since PF/VF doesn't
know the absolute HW index of the NIXLF attached to it, AF
traps the register config and overwrites with the correct
NIXLF index.
HW supports shaping, colouring and policing of packets with
these multilevel traffic scheduler queues. Instead of
introducing different mbox message formats for different
configurations and making both AF & PF/VF driver implementation
cumbersome, access to the scheduler queue's CSRs is provided
via mbox. AF checks whether the sender PF/VF has the
corresponding queue allocated or not and dumps the config
to HW. With a single mbox msg 20 registers can be configured.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Mon, 22 Oct 2018 17:55:48 +0000 (23:25 +0530)]
octeontx2-af: NIX Tx scheduler queues alloc/free
Added support for a PF/VF to allocate or free NIX transmit
scheduler queues via mbox. For setting up pkt transmission
priorities between queues, the scheduler queues have to be
contiguous w.r.t their HW indices. So both contiguous and
non-contiguous allocations are supported.
Upon receiving NIX_TXSCH_FREE mbox msg all scheduler queues
allocated to sending PFFUNC (PF/VF) will be freed. Selective
free is not supported.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 22 Oct 2018 16:24:27 +0000 (09:24 -0700)]
llc: do not use sk_eat_skb()
syzkaller triggered a use-after-free [1], caused by a combination of
skb_get() in llc_conn_state_process() and usage of sk_eat_skb()
sk_eat_skb() is assuming the skb about to be freed is only used by
the current thread. TCP/DCCP stacks enforce this because current
thread holds the socket lock.
llc_conn_state_process() wants to make sure skb does not disappear,
and holds a reference on the skb it manipulates. But as soon as this
skb is added to socket receive queue, another thread can consume it.
This means that llc must use regular skb_unlink() and kfree_skb()
so that both producer and consumer can safely work on the same skb.
[1]
BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
BUG: KASAN: use-after-free in refcount_read include/linux/refcount.h:43 [inline]
BUG: KASAN: use-after-free in skb_unref include/linux/skbuff.h:967 [inline]
BUG: KASAN: use-after-free in kfree_skb+0xb7/0x580 net/core/skbuff.c:655
Read of size 4 at addr
ffff8801d1f6fba4 by task ksoftirqd/1/18
CPU: 1 PID: 18 Comm: ksoftirqd/1 Not tainted 4.19.0-rc8+ #295
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1c4/0x2b6 lib/dump_stack.c:113
print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
refcount_read include/linux/refcount.h:43 [inline]
skb_unref include/linux/skbuff.h:967 [inline]
kfree_skb+0xb7/0x580 net/core/skbuff.c:655
llc_sap_state_process+0x9b/0x550 net/llc/llc_sap.c:224
llc_sap_rcv+0x156/0x1f0 net/llc/llc_sap.c:297
llc_sap_handler+0x65e/0xf80 net/llc/llc_sap.c:438
llc_rcv+0x79e/0xe20 net/llc/llc_input.c:208
__netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913
__netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023
process_backlog+0x218/0x6f0 net/core/dev.c:5829
napi_poll net/core/dev.c:6249 [inline]
net_rx_action+0x7c5/0x1950 net/core/dev.c:6315
__do_softirq+0x30c/0xb03 kernel/softirq.c:292
run_ksoftirqd+0x94/0x100 kernel/softirq.c:653
smpboot_thread_fn+0x68b/0xa00 kernel/smpboot.c:164
kthread+0x35a/0x420 kernel/kthread.c:246
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
Allocated by task 18:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
kmem_cache_alloc_node+0x144/0x730 mm/slab.c:3644
__alloc_skb+0x119/0x770 net/core/skbuff.c:193
alloc_skb include/linux/skbuff.h:995 [inline]
llc_alloc_frame+0xbc/0x370 net/llc/llc_sap.c:54
llc_station_ac_send_xid_r net/llc/llc_station.c:52 [inline]
llc_station_rcv+0x1dc/0x1420 net/llc/llc_station.c:111
llc_rcv+0xc32/0xe20 net/llc/llc_input.c:220
__netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913
__netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023
process_backlog+0x218/0x6f0 net/core/dev.c:5829
napi_poll net/core/dev.c:6249 [inline]
net_rx_action+0x7c5/0x1950 net/core/dev.c:6315
__do_softirq+0x30c/0xb03 kernel/softirq.c:292
Freed by task 16383:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kmem_cache_free+0x83/0x290 mm/slab.c:3756
kfree_skbmem+0x154/0x230 net/core/skbuff.c:582
__kfree_skb+0x1d/0x20 net/core/skbuff.c:642
sk_eat_skb include/net/sock.h:2366 [inline]
llc_ui_recvmsg+0xec2/0x1610 net/llc/af_llc.c:882
sock_recvmsg_nosec net/socket.c:794 [inline]
sock_recvmsg+0xd0/0x110 net/socket.c:801
___sys_recvmsg+0x2b6/0x680 net/socket.c:2278
__sys_recvmmsg+0x303/0xb90 net/socket.c:2390
do_sys_recvmmsg+0x181/0x1a0 net/socket.c:2466
__do_sys_recvmmsg net/socket.c:2484 [inline]
__se_sys_recvmmsg net/socket.c:2480 [inline]
__x64_sys_recvmmsg+0xbe/0x150 net/socket.c:2480
do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The buggy address belongs to the object at
ffff8801d1f6fac0
which belongs to the cache skbuff_head_cache of size 232
The buggy address is located 228 bytes inside of
232-byte region [
ffff8801d1f6fac0,
ffff8801d1f6fba8)
The buggy address belongs to the page:
page:
ffffea000747dbc0 count:1 mapcount:0 mapping:
ffff8801d9be7680 index:0xffff8801d1f6fe80
flags: 0x2fffc0000000100(slab)
raw:
02fffc0000000100 ffffea0007346e88 ffffea000705b108 ffff8801d9be7680
raw:
ffff8801d1f6fe80 ffff8801d1f6f0c0 000000010000000b 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801d1f6fa80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
ffff8801d1f6fb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>
ffff8801d1f6fb80: fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc
^
ffff8801d1f6fc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801d1f6fc80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mathias Thore [Mon, 22 Oct 2018 12:55:50 +0000 (14:55 +0200)]
net/wan/fsl_ucc_hdlc: error counters
Extract error information from rx and tx buffer descriptors,
and update error counters.
Signed-off-by: Mathias Thore <mathias.thore@infinera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfram Sang [Sun, 21 Oct 2018 20:01:05 +0000 (22:01 +0200)]
net: dsa: legacy: simplify getting .driver_data
We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wolfram Sang [Sun, 21 Oct 2018 20:00:39 +0000 (22:00 +0200)]
ptp: ptp_dte: simplify getting .driver_data
We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arthur Kiyanovski [Sun, 21 Oct 2018 15:07:14 +0000 (18:07 +0300)]
net: ena: fix compilation error in xtensa architecture
linux/prefetch.h is never explicitly included in ena_com, although
functions from it, such as prefetchw(), are used throughout ena_com.
This is an inclusion bug, and we fix it here by explicitly including
linux/prefetch.h. The bug was exposed when the driver was compiled
for the xtensa architecture.
Fixes: 689b2bdaaa14 ("net: ena: add functions for handling Low Latency Queues in ena_com")
Fixes: 8c590f977638 ("ena: Fix Kconfig dependency on X86")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vito Caputo [Sun, 21 Oct 2018 11:33:03 +0000 (04:33 -0700)]
af_unix.h: trivial whitespace cleanup
Replace spurious spaces with a tab and remove superfluous tab from
unix_sock struct.
Signed-off-by: Vito Caputo <vcaputo@pengaru.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Sun, 21 Oct 2018 11:05:49 +0000 (14:05 +0300)]
net/mlx5: Allocate enough space for the FDB sub-namespaces
FDB_MAX_CHAIN is three. We wanted to allocate enough memory to hold four
structs but there are missing parentheses so we only allocate enough
memory for three structs and the first byte of the fourth one.
Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Oct 2018 02:42:58 +0000 (19:42 -0700)]
Merge branch 'forbid-goto_chain-fallback'
Davide Caratti says:
====================
net/sched: forbid 'goto_chain' on fallback actions
the following command:
# tc actions add action police rate 1mbit burst 1k conform-exceed \
> pass / goto chain 42
generates a NULL pointer dereference when packets exceed the configured
rate. Similarly, the following command:
# tc actions add action pass random determ goto chain 42 2
makes the kernel crash with NULL dereference when the first packet does
not match the 'pass' action.
gact and police allow users to specify a fallback control action, that is
stored in the action private data. 'goto chain x' never worked for these
cases, since a->goto_chain handle was never initialized. There is only one
goto_chain handle per TC action, and it is designed to be non-NULL only if
tcf_action contains a 'goto chain' command. So, let's forbid 'goto chain'
on fallback actions.
Patch 1/4 and 2/4 change the .init() functions of police and gact, to let
them return an error when users try to set 'goto chain x' in the fallback
action. Patch 3/4 and 4/4 add TDC selftest coverage to this new behavior.
====================
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 20 Oct 2018 21:33:10 +0000 (23:33 +0200)]
tc-tests: test denial of 'goto chain' for exceed traffic in police.json
add test to verify if act_police forbids 'goto chain' control actions for
'exceed' traffic.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 20 Oct 2018 21:33:09 +0000 (23:33 +0200)]
tc-tests: test denial of 'goto chain' on 'random' traffic in gact.json
add test to verify if act_gact forbids 'goto chain' control actions on
'random' traffic in gact.json.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 20 Oct 2018 21:33:08 +0000 (23:33 +0200)]
net/sched: act_police: disallow 'goto chain' on fallback control action
in the following command:
# tc action add action police rate <r> burst <b> conform-exceed <c1>/<c2>
'goto chain x' is allowed only for c1: setting it for c2 makes the kernel
crash with NULL pointer dereference, since TC core doesn't initialize the
chain handle.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 20 Oct 2018 21:33:07 +0000 (23:33 +0200)]
net/sched: act_gact: disallow 'goto chain' on fallback control action
in the following command:
# tc action add action <c1> random <rand_type> <c2> <rand_param>
'goto chain x' is allowed only for c1: setting it for c2 makes the kernel
crash with NULL pointer dereference, since TC core doesn't initialize the
chain handle.
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 20 Oct 2018 20:41:28 +0000 (22:41 +0200)]
net: phy: phy_support_sym_pause: Clear Asym Pause
When indicating the MAC supports Symmetric Pause, clear the Asymmetric
Pause bit, which could of been already set is the PHY supports it.
Reported-by: Labbe Corentin <clabbe@baylibre.com>
Fixes: c306ad36184f ("net: ethernet: Add helper for MACs which support pause")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Olivier Brunel [Sat, 20 Oct 2018 17:39:57 +0000 (19:39 +0200)]
net: bpfilter: Set user mode helper's command line
Signed-off-by: Olivier Brunel <jjk@jjacky.com>
Signed-off-by: David S. Miller <davem@davemloft.net>