Johannes Berg [Fri, 16 Jan 2015 10:37:14 +0000 (11:37 +0100)]
genetlink: synchronize socket closing and family removal
In addition to the problem Jeff Layton reported, I looked at the code
and reproduced the same warning by subscribing and removing the genl
family with a socket still open. This is a fairly tricky race which
originates in the fact that generic netlink allows the family to go
away while sockets are still open - unlike regular netlink which has
a module refcount for every open socket so in general this cannot be
triggered.
Trying to resolve this issue by the obvious locking isn't possible as
it will result in deadlocks between unregistration and group unbind
notification (which incidentally lockdep doesn't find due to the home
grown locking in the netlink table.)
To really resolve this, introduce a "closing socket" reference counter
(for generic netlink only, as it's the only affected family) in the
core netlink code and use that in generic netlink to wait for all the
sockets that are being closed at the same time as a generic netlink
family is removed.
This fixes the race that when a socket is closed, it will should call
the unbind, but if the family is removed at the same time the unbind
will not find it, leading to the warning. The real problem though is
that in this case the unbind could actually find a new family that is
registered to have a multicast group with the same ID, and call its
mcast_unbind() leading to confusing.
Also remove the warning since it would still trigger, but is now no
longer a problem.
This also moves the code in af_netlink.c to before unreferencing the
module to avoid having the same problem in the normal non-genl case.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Fri, 16 Jan 2015 10:37:13 +0000 (11:37 +0100)]
genetlink: disallow subscribing to unknown mcast groups
Jeff Layton reported that he could trigger the multicast unbind warning
in generic netlink using trinity. I originally thought it was a race
condition between unregistering the generic netlink family and closing
the socket, but there's a far simpler explanation: genetlink currently
allows subscribing to groups that don't (yet) exist, and the warning is
triggered when unsubscribing again while the group still doesn't exist.
Originally, I had a warning in the subscribe case and accepted it out of
userspace API concerns, but the warning was of course wrong and removed
later.
However, I now think that allowing userspace to subscribe to groups that
don't exist is wrong and could possibly become a security problem:
Consider a (new) genetlink family implementing a permission check in
the mcast_bind() function similar to the like the audit code does today;
it would be possible to bypass the permission check by guessing the ID
and subscribing to the group it exists. This is only possible in case a
family like that would be dynamically loaded, but it doesn't seem like a
huge stretch, for example wireless may be loaded when you plug in a USB
device.
To avoid this reject such subscription attempts.
If this ends up causing userspace issues we may need to add a workaround
in af_netlink to deny such requests but not return an error.
Reported-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Fri, 16 Jan 2015 10:37:12 +0000 (11:37 +0100)]
genetlink: document parallel_ops
The kernel-doc for the parallel_ops family struct member is
missing, add it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 16 Jan 2015 01:04:22 +0000 (17:04 -0800)]
net: rps: fix cpu unplug
softnet_data.input_pkt_queue is protected by a spinlock that
we must hold when transferring packets from victim queue to an active
one. This is because other cpus could still be trying to enqueue packets
into victim queue.
A second problem is that when we transfert the NAPI poll_list from
victim to current cpu, we absolutely need to special case the percpu
backlog, because we do not want to add complex locking to protect
process_queue : Only owner cpu is allowed to manipulate it, unless cpu
is offline.
Based on initial patch from Prasad Sodagudi & Subash Abhinov
Kasiviswanathan.
This version is better because we do not slow down packet processing,
only make migration safer.
Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
Reported-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 16 Jan 2015 06:00:17 +0000 (01:00 -0500)]
Merge branch 'davinci_emac'
Tony Lindgren says:
====================
Fixes for davinci_emac
Here's a repost of the fixes for davinci_emac with patches
updated for comments and acks collected.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Thu, 15 Jan 2015 22:45:14 +0000 (14:45 -0800)]
net: davinci_emac: Add support for emac on dm816x
On dm816x we have two emac controllers with separate memory
areas.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Thu, 15 Jan 2015 22:45:13 +0000 (14:45 -0800)]
net: davinci_emac: Fix ioremap for devices with MDIO within the EMAC address space
Some devices like dm816x have the MDIO registers within the first EMAC
instance address space. Let's fix the issue by allowing to pass an
optional second IO range for the EMAC control register area.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Thu, 15 Jan 2015 22:45:12 +0000 (14:45 -0800)]
net: davinci_emac: Fix incomplete code for getting the phy from device tree
Looks like the phy_id is never set up beyond getting the phandle.
Note that we can remove the ifdef for phy_node as there is a stub
for of_phy_connec() if CONFIG_OF is not set.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Thu, 15 Jan 2015 22:45:11 +0000 (14:45 -0800)]
net: davinci_emac: Free clock after checking the frequency
We only use clk_get() to get the frequency, the rest is done by
the runtime PM calls. Let's free the clock too.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Thu, 15 Jan 2015 22:45:10 +0000 (14:45 -0800)]
net: davinci_emac: Fix runtime pm calls for davinci_emac
Commit
3ba97381343b ("net: ethernet: davinci_emac: add pm_runtime support")
added support for runtime PM, but it causes issues on omap3 related devices
that actually gate the clocks:
Unhandled fault: external abort on non-linefetch (0x1008)
...
[<
c04160f0>] (emac_dev_getnetstats) from [<
c04d6a3c>] (dev_get_stats+0x78/0xc8)
[<
c04d6a3c>] (dev_get_stats) from [<
c04e9ccc>] (rtnl_fill_ifinfo+0x3b8/0x938)
[<
c04e9ccc>] (rtnl_fill_ifinfo) from [<
c04eade4>] (rtmsg_ifinfo+0x68/0xd8)
[<
c04eade4>] (rtmsg_ifinfo) from [<
c04dd35c>] (register_netdevice+0x3a0/0x4ec)
[<
c04dd35c>] (register_netdevice) from [<
c04dd4bc>] (register_netdev+0x14/0x24)
[<
c04dd4bc>] (register_netdev) from [<
c041755c>] (davinci_emac_probe+0x408/0x5c8)
[<
c041755c>] (davinci_emac_probe) from [<
c0396d78>] (platform_drv_probe+0x48/0xa4)
Let's fix it by moving the pm_runtime_get() call earlier, and also add it to
the emac_dev_getnetstats(). Also note that we want to use pm_runtime_get_sync()
as we don't want to have deferred_resume happen. And let's also check the
return value for pm_runtime_get_sync() as noted by Felipe Balbi <balbi@ti.com>.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Acked-by: Mark A. Greer <mgreer@animalcreek.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Thu, 15 Jan 2015 22:45:09 +0000 (14:45 -0800)]
net: davinci_emac: Fix hangs with interrupts
On davinci_emac, we have pulse interrupts. This means that we need to
clear the EOI bits when disabling interrupts as otherwise the interrupts
keep happening. And we also need to not clear the EOI bits again when
enabling the interrupts as otherwise we will get tons of:
unexpected IRQ trap at vector 00
These errors almost certainly mean that the omap-intc.c is signaling
a spurious interrupt with the reserved irq 127 as we've seen earlier
on omap3.
Let's fix the issue by clearing the EOI bits when disabling the
interrupts. Let's also keep the comment for "Rx Threshold and Misc
interrupts are not enabled" for both enable and disable so people
are aware of this when potentially adding more support.
Note that eventually we should handle the RX and TX interrupts
separately like cpsw is now doing. However, so far I have not seen
any issues with this based on my testing, so it seems to behave a
little different compared to the cpsw that had a similar issue.
Cc: Brian Hutchinson <b.hutchman@gmail.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Thu, 15 Jan 2015 18:18:40 +0000 (13:18 -0500)]
ip: zero sockaddr returned on error queue
The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That
structure is defined and allocated on the stack as
struct {
struct sock_extended_err ee;
struct sockaddr_in(6) offender;
} errhdr;
The second part is only initialized for certain SO_EE_ORIGIN values.
Always initialize it completely.
An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that
would return uninitialized bytes.
Signed-off-by: Willem de Bruijn <willemb@google.com>
----
Also verified that there is no padding between errhdr.ee and
errhdr.offender that could leak additional kernel data.
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 16 Jan 2015 00:38:49 +0000 (19:38 -0500)]
Merge tag 'linux-can-fixes-for-3.19-
20150115' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2015-01-15
this is a pull request of 8 patches.
Ahmed S. Darwish contributes 4 fixes for the kvaser_usb driver. The two patches
by Oliver Hartkopp mark the m_can driver as non-ISO, as the CANFD standard was
updated. Roger Quadros's patch for the c_can driver fixes the register access
during RAMINIT. And one patch by my, which updates the MAINTAINERS file, as we
moved the git repos to the kernel.org infrastructure.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 15 Jan 2015 13:28:54 +0000 (15:28 +0200)]
net/mlx4: Don't disable vxlan offloads under DMFS-A0 optimized steering
Except for VXLAN steering rules, all offloads should work as they were
under plain DMFS mode. Fix that by enabling all the offloads under
DMFS-A0 mode, except for VXLAN steering rules.
Fixes: d57febe1a478 "net/mlx4: Add A0 hybrid steering"
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 16 Jan 2015 00:28:36 +0000 (19:28 -0500)]
Merge tag 'mac80211-for-davem-2015-01-15' of git://git./linux/kernel/git/jberg/mac80211
Just two fixes - one for an uninialized variable and
one for a deadlock in regulatory processing.
Signed-off-by: David S. Miller <davem@davemloft.net>
Byungho An [Thu, 15 Jan 2015 01:43:11 +0000 (10:43 +0900)]
net: sxgbe: Fix waring for double kfree()
This patch fixes double kfree() calls at init_rx_ring() because
it causes static checker warning.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Girish K.S [Thu, 15 Jan 2015 01:41:47 +0000 (10:41 +0900)]
net: sxgbe: Fix NULL dereferece when using DT
When the MAC address is provided in the device tree file, the
condition is true and kernel crashes due to NULL dereference.
Signed-off-by: Girish K.S <ks.giri@samsung.com>
Signed-off-by: Byungho An <bh74.an@samsung.com>
Signed-off-by: Kukjin Kim <kgene@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Thu, 15 Jan 2015 10:52:19 +0000 (11:52 +0100)]
sh_eth: Fix addition of .trscer_err_mask to wrong SoC data
commit
b284fbe3b3ef9cf8 ("sh_eth: Fix access to TRSCER register") wanted
to add a .trscer_err_mask value to the R-Car Gen2 family-specific data
structure (r8a779x_data), but it was accidentally added to the
SH7724-specific data structure (sh7724_data).
Presumably this happened due to a patch conflict with commit
d407bc0203539031 ("sh-eth: Set fdr_value of R-Car SoCs"), which added
another field at the same position.
Move the field setting to fix this.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: b284fbe3b3ef9cf8 ("sh_eth: Fix access to TRSCER register")
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Thu, 15 Jan 2015 09:29:28 +0000 (14:59 +0530)]
drivers: net: cpsw: fix cpsw hung with add vlan using vconfig
while adding vlan in dual EMAC mode, only specific ports should be
subscribed for the vlan, else it will lead to switching mode and
if both ports connected to same switch cpsw will hung as it creates
a network loop. Fixing this by adding only specific ports in case
of dual EMAC.
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ahmed S. Darwish [Sun, 11 Jan 2015 20:49:52 +0000 (15:49 -0500)]
can: kvaser_usb: Don't dereference skb after a netif_rx()
We should not touch the packet after a netif_rx: it might
get freed behind our back.
Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Ahmed S. Darwish [Mon, 5 Jan 2015 17:57:13 +0000 (12:57 -0500)]
can: kvaser_usb: Don't send a RESET_CHIP for non-existing channels
Recent Leaf firmware versions (>= 3.1.557) do not allow to send
commands for non-existing channels. If a command is sent for a
non-existing channel, the firmware crashes.
Reported-by: Christopher Storah <Christopher.Storah@invetech.com.au>
Signed-off-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Ahmed S. Darwish [Mon, 5 Jan 2015 17:52:06 +0000 (12:52 -0500)]
can: kvaser_usb: Reset all URB tx contexts upon channel close
Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in very high frequency (*), closing the CAN channel while
all the transmissions are on (#), opening the device again (@),
then sending a small number of packets would make the driver
enter an almost infinite loop of:
[....]
[15959.853988] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853990] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853991] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853993] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853994] kvaser_usb 4-3:1.0 can0: cannot find free context
[15959.853995] kvaser_usb 4-3:1.0 can0: cannot find free context
[....]
_dragging the whole system down_ in the process due to the
excessive logging output.
Initially, this has caused random panics in the kernel due to a
buggy error recovery path. That got fixed in an earlier commit.(%)
This patch aims at solving the root cause. -->
16 tx URBs and contexts are allocated per CAN channel per USB
device. Such URBs are protected by:
a) A simple atomic counter, up to a value of MAX_TX_URBS (16)
b) A flag in each URB context, stating if it's free
c) The fact that ndo_start_xmit calls are themselves protected
by the networking layers higher above
After grabbing one of the tx URBs, if the driver noticed that all
of them are now taken, it stops the netif transmission queue.
Such queue is worken up again only if an acknowedgment was received
from the firmware on one of our earlier-sent frames.
Meanwhile, upon channel close (#), the driver sends a CMD_STOP_CHIP
to the firmware, effectively closing all further communication. In
the high traffic case, the atomic counter remains at MAX_TX_URBS,
and all the URB contexts remain marked as active. While opening
the channel again (@), it cannot send any further frames since no
more free tx URB contexts are available.
Reset all tx URB contexts upon CAN channel close.
(*) 50 parallel instances of `cangen0 -g 0 -ix`
(#) `ifconfig can0 down`
(@) `ifconfig can0 up`
(%) "can: kvaser_usb: Don't free packets when tight on URBs"
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Ahmed S. Darwish [Mon, 5 Jan 2015 17:49:10 +0000 (12:49 -0500)]
can: kvaser_usb: Don't free packets when tight on URBs
Flooding the Kvaser CAN to USB dongle with multiple reads and
writes in high frequency caused seemingly-random panics in the
kernel.
On further inspection, it seems the driver erroneously freed the
to-be-transmitted packet upon getting tight on URBs and returning
NETDEV_TX_BUSY, leading to invalid memory writes and double frees
at a later point in time.
Note:
Finding no more URBs/transmit-contexts and returning NETDEV_TX_BUSY
is a driver bug in and out of itself: it means that our start/stop
queue flow control is broken.
This patch only fixes the (buggy) error handling code; the root
cause shall be fixed in a later commit.
Acked-by: Olivier Sobrie <olivier@sobrie.be>
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Roger Quadros [Tue, 13 Jan 2015 14:23:11 +0000 (16:23 +0200)]
can: c_can: use regmap_update_bits() to modify RAMINIT register
use of regmap_read() and regmap_write() in c_can_hw_raminit_syscon()
is not safe as the RAMINIT register can be shared between different drivers
at least for TI SoCs.
To make the modification atomic we switch to using regmap_update_bits().
regmap_update_bits() skips writing to the register if it's read content is the
same as what is going to be written. This causes an issue for us when we
need to clear the DONE bit with the initial condition START:0, DONE:1 as
DONE bit must be written with 1 to clear it.
So we defer the clearing of DONE bit to later when we set the START bit.
There we are sure that START bit is changed from 0 to 1 so the write of
1 to already set DONE bit will happen.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Oliver Hartkopp [Mon, 5 Jan 2015 18:47:43 +0000 (19:47 +0100)]
can: m_can: tag current CAN FD controllers as non-ISO
During the CAN FD standardization process within the ISO it turned out that
the failure detection capability has to be improved.
The CAN in Automation organization (CiA) defined the already implemented CAN
FD controllers as 'non-ISO' and the upcoming improved CAN FD controllers as
'ISO' compliant. See at http://www.can-cia.com/index.php?id=1937
Finally there will be three types of CAN FD controllers in the future:
1. ISO compliant (fixed)
2. non-ISO compliant (fixed, like the M_CAN IP v3.0.1 in m_can.c)
3. ISO/non-ISO CAN FD controllers (switchable, like the PEAK USB FD)
So the current M_CAN driver for the M_CAN IP v3.0.1 has to expose its non-ISO
implementation by setting the CAN_CTRLMODE_FD_NON_ISO ctrlmode at startup.
As this bit cannot be switched at configuration time CAN_CTRLMODE_FD_NON_ISO
must not be set in ctrlmode_supported of the current M_CAN driver.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Oliver Hartkopp [Mon, 5 Jan 2015 17:40:15 +0000 (18:40 +0100)]
can: dev: fix crtlmode_supported check
When changing flags in the CAN drivers ctrlmode the provided new content has to
be checked whether the bits are allowed to be changed. The bits that are to be
changed are given as a bitfield in cm->mask. Therefore checking against
cm->flags is wrong as the content can hold any kind of values.
The iproute2 tool sets the bits in cm->mask and cm->flags depending on the
detected command line options. To be robust against bogus user space
applications additionally sanitize the provided flags with the provided mask.
Cc: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Thu, 15 Jan 2015 15:56:26 +0000 (16:56 +0100)]
MAINTAINERS: update linux-can git repositories
The linux-can upstream git repositories are now hosted on kernel.org, update
MAINTAINERS accordingly.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Sriharsha Basavapatna [Thu, 15 Jan 2015 10:38:43 +0000 (16:08 +0530)]
be2net: Allow GRE to work concurrently while a VxLAN tunnel is configured
Other tunnels like GRE break while VxLAN offloads are enabled in Skyhawk-R. To
avoid this, we should restrict offload features on a per-packet basis in such
conditions.
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 14 Jan 2015 22:17:37 +0000 (11:17 +1300)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Don't use uninitialized data in IPVS, from Dan Carpenter.
2) conntrack race fixes from Pablo Neira Ayuso.
3) Fix TX hangs with i40e, from Jesse Brandeburg.
4) Fix budget return from poll calls in dnet and alx, from Eric
Dumazet.
5) Fix bugus "if (unlikely(x) < 0)" test in AF_PACKET, from Christoph
Jaeger.
6) Fix bug introduced by conversion to list_head in TIPC retransmit
code, from Jon Paul Maloy.
7) Don't use GFP_NOIO under spinlock in USB kaweth driver, from Alexey
Khoroshilov.
8) Fix bridge build with INET disabled, from Arnd Bergmann.
9) Fix netlink array overrun for PROBE attributes in openvswitch, from
Thomas Graf.
10) Don't hold spinlock across synchronize_irq() in tg3 driver, from
Prashant Sreedharan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
tg3: Release tp->lock before invoking synchronize_irq()
tg3: tg3_reset_task() needs to use rtnl_lock to synchronize
tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync
team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin
openvswitch: packet messages need their own probe attribtue
i40e: adds FCoE configure option
cxgb4vf: Fix queue allocation for 40G adapter
netdevice: Add missing parentheses in macro
bridge: only provide proxy ARP when CONFIG_INET is enabled
neighbour: fix base_reachable_time(_ms) not effective immediatly when changed
net: fec: fix MDIO bus assignement for dual fec SoC's
xen-netfront: use different locks for Rx and Tx stats
drivers: net: cpsw: fix multicast flush in dual emac mode
cxgb4vf: Initialize mdio_addr before using it
net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations
usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb()
MAINTAINERS: add me as ibmveth maintainer
tipc: fix bug in broadcast retransmit code
update ip-sysctl.txt documentation (v2)
net/at91_ether: prepare and unprepare clock
...
David S. Miller [Wed, 14 Jan 2015 22:05:55 +0000 (17:05 -0500)]
Merge branch 'tg3-net'
Prashant Sreedharan says:
====================
tg3: synchronize_irq() should be called without taking locks
v2: Added Reported-by, Tested-by fields and reference to the thread that
reported the problem
This series addresses the problem reported by Peter Hurley in mail thread
https://lkml.org/lkml/2015/1/12/1082
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Wed, 14 Jan 2015 19:34:44 +0000 (11:34 -0800)]
tg3: Release tp->lock before invoking synchronize_irq()
synchronize_irq() can sleep waiting, for pending IRQ handlers so driver
should release the tp->lock spin lock before invoking synchronize_irq()
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Wed, 14 Jan 2015 19:34:17 +0000 (11:34 -0800)]
tg3: tg3_reset_task() needs to use rtnl_lock to synchronize
Currently tg3_reset_task() uses only tp->lock for synchronizing with code
paths like tg3_open() etc. But since tp->lock is released before doing
synchronize_irq(), rtnl_lock should be taken in tg3_reset_task() to
synchronize it with other code paths.
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Wed, 14 Jan 2015 19:33:49 +0000 (11:33 -0800)]
tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync
This is to avoid the race between tg3_timer() and the execution paths
which does not invoke tg3_timer_stop() and releases tp->lock before
calling synchronize_irq()
Reported-by: Peter Hurley <peter@hurleysoftware.com>
Tested-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 14 Jan 2015 21:54:30 +0000 (10:54 +1300)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Two bugfixes for arm64. I will have another pull request next week,
but otherwise things are calm"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm64: KVM: Fix HCR setting for 32bit guests
arm64: KVM: Fix TLB invalidation by IPA/VMID
Jiri Pirko [Wed, 14 Jan 2015 17:15:30 +0000 (18:15 +0100)]
team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin
This patch is fixing a race condition that may cause setting
count_pending to -1, which results in unwanted big bulk of arp messages
(in case of "notify peers").
Consider following scenario:
count_pending == 2
CPU0 CPU1
team_notify_peers_work
atomic_dec_and_test (dec count_pending to 1)
schedule_delayed_work
team_notify_peers
atomic_add (adding 1 to count_pending)
team_notify_peers_work
atomic_dec_and_test (dec count_pending to 1)
schedule_delayed_work
team_notify_peers_work
atomic_dec_and_test (dec count_pending to 0)
schedule_delayed_work
team_notify_peers_work
atomic_dec_and_test (dec count_pending to -1)
Fix this race by using atomic_dec_if_positive - that will prevent
count_pending running under 0.
Fixes: fc423ff00df3a1955441 ("team: add peer notification")
Fixes: 492b200efdd20b8fcfd ("team: add support for sending multicast rejoins")
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 14 Jan 2015 21:50:29 +0000 (10:50 +1300)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Two small performance tweaks, the plumbing for the execveat system
call and a couple of bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/uprobes: fix user space PER events
s390/bpf: Fix JMP_JGE_X (A > X) and JMP_JGT_X (A >= X)
s390/bpf: Fix ALU_NEG (A = -A)
s390/mm: avoid using pmd_to_page for !USE_SPLIT_PMD_PTLOCKS
s390/timex: fix get_tod_clock_ext() inline assembly
s390: wire up execveat syscall
s390/kernel: use stnsm 255 instead of stosm 0
s390/vtime: Get rid of redundant WARN_ON
s390/zcrypt: kernel oops at insmod of the z90crypt device driver
Thomas Graf [Wed, 14 Jan 2015 13:56:19 +0000 (13:56 +0000)]
openvswitch: packet messages need their own probe attribtue
User space is currently sending a OVS_FLOW_ATTR_PROBE for both flow
and packet messages. This leads to an out-of-bounds access in
ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE >
OVS_PACKET_ATTR_MAX.
Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value
as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes
while maintaining to be binary compatible with existing OVS binaries.
Fixes: 05da589 ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.")
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Tracked-down-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Jesse Gross <jesse@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasu Dev [Wed, 14 Jan 2015 13:14:07 +0000 (05:14 -0800)]
i40e: adds FCoE configure option
Adds FCoE config option I40E_FCOE, so that FCoE can be enabled
as needed but otherwise have it disabled by default.
This also eliminate multiple FCoE config checks, instead now just
one config check for CONFIG_I40E_FCOE.
The I40E FCoE was added with 3.17 kernel and therefore this patch
shall be applied to stable 3.17 kernel also.
CC: <stable@vger.kernel.org>
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Wed, 14 Jan 2015 12:28:10 +0000 (17:58 +0530)]
cxgb4vf: Fix queue allocation for 40G adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 14 Jan 2015 21:38:07 +0000 (10:38 +1300)]
Merge tag 'locks-v3.19-1' of git://git.samba.org/jlayton/linux
Pull file locking fix from Jeff Layton:
"Just a simple bugfix for a regression that I introduced into v3.18
with the internal lease API overhaul -- mea culpa. Kudos to Linda and
Neil for tracking this down and fixing it"
* tag 'locks-v3.19-1' of git://git.samba.org/jlayton/linux:
locks: fix NULL-deref in generic_delete_lease
Benjamin Poirier [Wed, 14 Jan 2015 07:52:35 +0000 (16:52 +0900)]
netdevice: Add missing parentheses in macro
For example, one could conceivably call
for_each_netdev_in_bond_rcu(condition ? bond1 : bond2, slave)
and get an unexpected result.
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 14 Jan 2015 21:27:56 +0000 (10:27 +1300)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block layer fixes from Jens Axboe:
"The major part is an update to the NVMe driver, fixing various issues
around surprise removal and hung controllers. Most of that is from
Keith, and parts are simple blk-mq fixes or exports/additions of minor
functions to aid this effort, and parts are changes directly to the
NVMe driver.
Apart from the above, this contains:
- Small blk-mq change from me, killing an unused member of the
hardware queue structure.
- Small fix from Ming Lei, fixing up a few drivers that didn't
properly check for ERR_PTR() returns from blk_mq_init_queue()"
* 'for-linus' of git://git.kernel.dk/linux-block:
NVMe: Fix locking on abort handling
NVMe: Start and stop h/w queues on reset
NVMe: Command abort handling fixes
NVMe: Admin queue removal handling
NVMe: Reference count admin queue usage
NVMe: Start all requests
blk-mq: End unstarted requests on a dying queue
blk-mq: Allow requests to never expire
blk-mq: Add helper to abort requeued requests
blk-mq: Let drivers cancel requeue_work
blk-mq: Export if requests were started
blk-mq: Wake tasks entering queue on dying
blk-mq: get rid of ->cmd_size in the hardware queue
block: fix checking return value of blk_mq_init_queue
block: wake up waiters when a queue is marked dying
NVMe: Fix double free irq
blk-mq: Export freeze/unfreeze functions
blk-mq: Exit queue on alloc failure
Arnd Bergmann [Tue, 13 Jan 2015 14:10:27 +0000 (15:10 +0100)]
bridge: only provide proxy ARP when CONFIG_INET is enabled
When IPV4 support is disabled, we cannot call arp_send from
the bridge code, which would result in a kernel link error:
net/built-in.o: In function `br_handle_frame_finish':
:(.text+0x59914): undefined reference to `arp_send'
:(.text+0x59a50): undefined reference to `arp_tbl'
This makes the newly added proxy ARP support in the bridge
code depend on the CONFIG_INET symbol and lets the compiler
optimize the code out to avoid the link error.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP")
Cc: Kyeyoon Park <kyeyoonp@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean-Francois Remy [Wed, 14 Jan 2015 03:22:39 +0000 (04:22 +0100)]
neighbour: fix base_reachable_time(_ms) not effective immediatly when changed
When setting base_reachable_time or base_reachable_time_ms on a
specific interface through sysctl or netlink, the reachable_time
value is not updated.
This means that neighbour entries will continue to be updated using the
old value until it is recomputed in neigh_period_work (which
recomputes the value every 300*HZ).
On systems with HZ equal to 1000 for instance, it means 5mins before
the change is effective.
This patch changes this behavior by recomputing reachable_time after
each set on base_reachable_time or base_reachable_time_ms.
The new value will become effective the next time the neighbour's timer
is triggered.
Changes are made in two places: the netlink code for set and the sysctl
handling code. For sysctl, I use a proc_handler. The ipv6 network
code does provide its own handler but it already refreshes
reachable_time correctly so it's not an issue.
Any other user of neighbour which provide its own handlers must
refresh reachable_time.
Signed-off-by: Jean-Francois Remy <jeff@melix.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Agner [Tue, 13 Jan 2015 23:20:21 +0000 (00:20 +0100)]
net: fec: fix MDIO bus assignement for dual fec SoC's
On i.MX28, the MDIO bus is shared between the two FEC instances.
The driver makes sure that the second FEC uses the MDIO bus of the
first FEC. This is done conditionally if FEC_QUIRK_ENET_MAC is set.
However, in newer designs, such as Vybrid or i.MX6SX, each FEC MAC
has its own MDIO bus. Simply removing the quirk FEC_QUIRK_ENET_MAC
is not an option since other logic, triggered by this quirk, is
still needed.
Furthermore, there are board designs which use the same MDIO bus
for both PHY's even though the second bus would be available on the
SoC side. Such layout are popular since it saves pins on SoC side.
Due to the above quirk, those boards currently do work fine. The
boards in the mainline tree with such a layout are:
- Freescale Vybrid Tower with TWR-SER2 (vf610-twr.dts)
- Freescale i.MX6 SoloX SDB Board (imx6sx-sdb.dts)
This patch adds a new quirk FEC_QUIRK_SINGLE_MDIO for i.MX28, which
makes sure that the MDIO bus of the first FEC is used in any case.
However, the boards above do have a SoC with a MDIO bus for each FEC
instance. But the PHY's are not connected in a 1:1 configuration. A
proper device tree description is needed to allow the driver to
figure out where to find its PHY. This patch fixes that shortcoming
by adding a MDIO bus child node to the first FEC instance, along
with the two PHY's on that bus, and making use of the phy-handle
property to add a reference to the PHY's.
Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 13 Jan 2015 23:22:56 +0000 (12:22 +1300)]
Merge branch 'leds-fixes-for-3.19' of git://git./linux/kernel/git/cooloney/linux-leds
Pull LED fix from Bryan Wu.
* 'leds-fixes-for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/linux-leds:
leds: netxbig: fix oops at probe time
Linus Torvalds [Tue, 13 Jan 2015 22:54:12 +0000 (11:54 +1300)]
Merge tag 'for-linus' of git://git./linux/kernel/git/borntraeger/linux
Pull WRITE_ONCE argument order change from Christian Borntraeger:
"As discussed on LKML[1] it was agreed that WRITE_ONCE(x, val) is
better than ASSIGN_ONCE(val, x)
Lets change that for 3.19 as 3.19 has no user yet, but the first users
will hit linux-next soon"
[1] http://marc.info/?l=linux-kernel&m=
142081181707596
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux:
kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)
David Vrabel [Tue, 13 Jan 2015 16:42:42 +0000 (16:42 +0000)]
xen-netfront: use different locks for Rx and Tx stats
In netfront the Rx and Tx path are independent and use different
locks. The Tx lock is held with hard irqs disabled, but Rx lock is
held with only BH disabled. Since both sides use the same stats lock,
a deadlock may occur.
[ INFO: possible irq lock inversion dependency detected ]
3.16.2 #16 Not tainted
---------------------------------------------------------
swapper/0/0 just changed the state of lock:
(&(&queue->tx_lock)->rlock){-.....}, at: [<
c03adec8>]
xennet_tx_interrupt+0x14/0x34
but this lock took another, HARDIRQ-unsafe lock in the past:
(&stat->syncp.seq#2){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&stat->syncp.seq#2);
local_irq_disable();
lock(&(&queue->tx_lock)->rlock);
lock(&stat->syncp.seq#2);
<Interrupt>
lock(&(&queue->tx_lock)->rlock);
Using separate locks for the Rx and Tx stats fixes this deadlock.
Reported-by: Dmitry Piotrovsky <piotrovskydmitry@gmail.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Tue, 13 Jan 2015 12:05:49 +0000 (17:35 +0530)]
drivers: net: cpsw: fix multicast flush in dual emac mode
Since ALE table is a common resource for both the interfaces in Dual EMAC
mode and while bringing up the second interface in cpsw_ndo_set_rx_mode()
all the multicast entries added by the first interface is flushed out and
only second interface multicast addresses are added. Fixing this by
flushing multicast addresses based on dual EMAC port vlans which will not
affect the other emac port multicast addresses.
Fixes: d9ba8f9 (driver: net: ethernet: cpsw: dual emac interface implementation)
Cc: <stable@vger.kernel.org> # v3.9+
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Guinot [Tue, 2 Dec 2014 15:32:10 +0000 (07:32 -0800)]
leds: netxbig: fix oops at probe time
This patch fixes a NULL pointer dereference on led_dat->mode_val. Due to
this bug, a kernel oops can be observed at probe time on the LaCie 2Big
and 5Big v2 boards:
Unable to handle kernel NULL pointer dereference at virtual address
00000008
[...]
[<
c03f244c>] (netxbig_led_probe) from [<
c02c8c6c>] (platform_drv_probe+0x4c/0x9c)
[<
c02c8c6c>] (platform_drv_probe) from [<
c02c72d0>] (driver_probe_device+0x98/0x25c)
[<
c02c72d0>] (driver_probe_device) from [<
c02c7520>] (__driver_attach+0x8c/0x90)
[<
c02c7520>] (__driver_attach) from [<
c02c5c24>] (bus_for_each_dev+0x68/0x94)
[<
c02c5c24>] (bus_for_each_dev) from [<
c02c6408>] (bus_add_driver+0x124/0x1dc)
[<
c02c6408>] (bus_add_driver) from [<
c02c7ac0>] (driver_register+0x78/0xf8)
[<
c02c7ac0>] (driver_register) from [<
c000888c>] (do_one_initcall+0x80/0x1cc)
[<
c000888c>] (do_one_initcall) from [<
c0733618>] (kernel_init_freeable+0xe4/0x1b4)
[<
c0733618>] (kernel_init_freeable) from [<
c058db9c>] (kernel_init+0xc/0xec)
[<
c058db9c>] (kernel_init) from [<
c0009850>] (ret_from_fork+0x14/0x24)
[...]
This bug was introduced by commit
588a6a99286ae30afb1339d8bc2163517b1b7dd1
("leds: netxbig: fix attribute-creation race").
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Cc: <stable@vger.kernel.org> # 3.17+
Acked-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
Hariprasad Shenai [Mon, 12 Jan 2015 16:36:16 +0000 (22:06 +0530)]
cxgb4vf: Initialize mdio_addr before using it
In commit
5ad24def21b205a8 ("cxgb4vf: Fix ethtool get_settings for VF driver")
mdio_addr of port_info structure was used unininitialzed. Fixing it.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christian Borntraeger [Tue, 13 Jan 2015 09:46:42 +0000 (10:46 +0100)]
kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)
Feedback has shown that WRITE_ONCE(x, val) is easier to use than
ASSIGN_ONCE(val,x).
There are no in-tree users yet, so lets change it for 3.19.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Linus Torvalds [Tue, 13 Jan 2015 19:09:14 +0000 (08:09 +1300)]
Merge tag 'linux-kselftest-3.19-rc-5' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"This update contains three patches to fix one compile error, and two
run-time bugs. One of them fixes infinite loop on ARM"
* tag 'linux-kselftest-3.19-rc-5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/vm: fix link error for transhuge-stress test
tools: testing: selftests: mq_perf_tests: Fix infinite loop on ARM
selftests/exec: allow shell return code of 126
Linus Torvalds [Tue, 13 Jan 2015 19:07:42 +0000 (08:07 +1300)]
Merge tag 'stable/for-linus-3.19-rc4-tag' of git://git./linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
"Several critical linear p2m fixes that prevented some hosts from
booting"
* tag 'stable/for-linus-3.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: properly retrieve NMI reason
xen: check for zero sized area when invalidating memory
xen: use correct type for physical addresses
xen: correct race in alloc_p2m_pmd()
xen: correct error for building p2m list on 32 bits
x86/xen: avoid freeing static 'name' when kasprintf() fails
x86/xen: add extra memory for remapped frames during setup
x86/xen: don't count how many PFNs are identity mapped
x86/xen: Free bootmem in free_p2m_page() during early boot
x86/xen: Remove unnecessary BUG_ON(preemptible()) in xen_setup_timer()
B Viswanath [Mon, 12 Jan 2015 09:16:25 +0000 (14:46 +0530)]
net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations
Corrected the comment describing the ndo operations to
reflect the actual prototype for couple of operations
Signed-off-by: B Viswanath <marichika4@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 13 Jan 2015 18:53:51 +0000 (07:53 +1300)]
Merge branch 'for-rc' of git://git./linux/kernel/git/rzhang/linux
Pull thermal management fixes from Zhang Rui:
"Specifics:
- Fix a problem that Intel SoC DTS thermal driver does not work when
CONFIG_THERMAL_INT340X is not set.
- Fix a NULL pointer dereference when processor_thermal_device driver
is loaded on a platform without ACPI support"
* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
int340x_thermal/processor_thermal_device: return failure when
ACPI/int340x_thermal: enumerate INT3401 for Intel SoC DTS thermal driver
ACPI/int340x_thermal: enumerate INT340X devices even if they're not in _ART/_TRT
NeilBrown [Tue, 13 Jan 2015 02:17:43 +0000 (15:17 +1300)]
locks: fix NULL-deref in generic_delete_lease
commit
0efaa7e82f02fe69c05ad28e905f31fc86e6f08e
locks: generic_delete_lease doesn't need a file_lock at all
moves the call to fl->fl_lmops->lm_change() to a place in the
code where fl might be a non-lease lock.
When that happens, fl_lmops is NULL and an Oops ensures.
So add an extra test to restore correct functioning.
Reported-by: Linda Walsh <suse@tlinx.org>
Link: https://bugzilla.suse.com/show_bug.cgi?id=912569
Cc: stable@vger.kernel.org (v3.18)
Fixes: 0efaa7e82f02fe69c05ad28e905f31fc86e6f08e
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Jeff Layton <jlayton@primarydata.com>
Jan Beulich [Tue, 13 Jan 2015 07:40:05 +0000 (07:40 +0000)]
x86/xen: properly retrieve NMI reason
Using the native code here can't work properly, as the hypervisor would
normally have cleared the two reason bits by the time Dom0 gets to see
the NMI (if passed to it at all). There's a shared info field for this,
and there's an existing hook to use - just fit the two together. This
is particularly relevant so that NMIs intended to be handled by APEI /
GHES actually make it to the respective handler.
Note that the hook can (and should) be used irrespective of whether
being in Dom0, as accessing port 0x61 in a DomU would be even worse,
while the shared info field would just hold zero all the time. Note
further that hardware NMI handling for PVH doesn't currently work
anyway due to missing code in the hypervisor (but it is expected to
work the native rather than the PV way).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Linus Torvalds [Tue, 13 Jan 2015 02:29:42 +0000 (15:29 +1300)]
Merge tag 'gpio-v3.19-3' of git://git./linux/kernel/git/linusw/linux-gpio
Pull gpio fixes from Linus Walleij:
"Here are some GPIO fixes, mainly affecting the DLN2 IRQ handling.
Nothing special about them, just fixes:
- Three patches fixing IRQ handling for the DLN2
- Null pointer handling for grgpio"
* tag 'gpio-v3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: dln2: use bus_sync_unlock instead of scheduling work
gpio: grgpio: Avoid potential NULL pointer dereference
gpio: dln2: Fix gpio output value in dln2_gpio_direction_output()
gpio: dln2: fix issue when an IRQ is unmasked then enabled
Linus Torvalds [Tue, 13 Jan 2015 02:25:23 +0000 (15:25 +1300)]
Merge tag 'mmc-v3.19-3' of git://git.linaro.org/people/ulf.hansson/mmc
Pull MMC fixes from Ulf Hansson:
"MMC host:
- sdhci-pci|acpi: Support some new IDs
- sdhci: Fix sleep from atomic context
- sdhci-pxav3: Prevent hang during ->probe()
- sdhci: Disable re-tuning for HS400"
* tag 'mmc-v3.19-3' of git://git.linaro.org/people/ulf.hansson/mmc:
mmc: sdhci-pci: Add support for Intel SPT
mmc: sdhci-acpi: Add ACPI HID INT344D
mmc: sdhci: Fix sleep in atomic after inserting SD card
mmc: sdhci-pxav3: do the mbus window configuration after enabling clocks
mmc: sdhci: Disable re-tuning for HS400
mmc: sdhci: Simplify use of tuning timer
mmc: sdhci: Add out_unlock to sdhci_execute_tuning
mmc: sdhci: Tuning should not change max_blk_count
Linus Torvalds [Tue, 13 Jan 2015 02:23:26 +0000 (15:23 +1300)]
Merge git://git./linux/kernel/git/nab/target-pending
Pull scsi target fixes from Nicholas Bellinger:
"Mostly minor fixes this time, including:
- Add missing virtio-scsi -> TCM attribute conversion in vhost-scsi.
- Fix persistent reservations write exclusive handling to allow
readers for all registered I_T nexuses.
- Drop arbitrary maximum I/O size limit in order to process I/Os
larger than 4 MB, required for initiators that don't honor block
limits EVPD.
- Drop the now left-over fabric_max_sectors attribute"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
iscsi-target: Fix typos in enum cmd_flags_table
MAINTAINERS: Add entry for iSER target driver
target: Allow Write Exclusive non-reservation holders to READ
target: Drop left-over fabric_max_sectors attribute
target: Drop arbitrary maximum I/O size limit
Documentation/target: Update fabric_ops to latest code
vhost-scsi: Add missing virtio-scsi -> TCM attribute conversion
Will Deacon [Mon, 12 Jan 2015 19:10:55 +0000 (19:10 +0000)]
mm: mmu_gather: use tlb->end != 0 only for TLB invalidation
When batching up address ranges for TLB invalidation, we check tlb->end
!= 0 to indicate that some pages have actually been unmapped.
As of commit
f045bbb9fa1b ("mmu_gather: fix over-eager
tlb_flush_mmu_free() calling"), we use the same check for freeing these
pages in order to avoid a performance regression where we call
free_pages_and_swap_cache even when no pages are actually queued up.
Unfortunately, the range could have been reset (tlb->end = 0) by
tlb_end_vma, which has been shown to cause memory leaks on arm64.
Furthermore, investigation into these leaks revealed that the fullmm
case on task exit no longer invalidates the TLB, by virtue of tlb->end
== 0 (in 3.18, need_flush would have been set).
This patch resolves the problem by reverting commit
f045bbb9fa1b, using
instead tlb->local.nr as the predicate for page freeing in
tlb_flush_mmu_free and ensuring that tlb->end is initialised to a
non-zero value in the fullmm case.
Tested-by: Mark Langsdorf <mlangsdo@redhat.com>
Tested-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Khoroshilov [Fri, 9 Jan 2015 23:16:22 +0000 (02:16 +0300)]
usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb()
Commit
e4c7f259c5be ("USB: kaweth.c: use GFP_ATOMIC under spin_lock")
makes sure that kaweth_internal_control_msg() allocates memory with GFP_ATOMIC,
but kaweth_internal_control_msg() also calls usb_start_wait_urb()
that still allocates memory with GFP_NOIO.
The patch fixes usb_start_wait_urb() as well.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Fri, 9 Jan 2015 20:29:40 +0000 (14:29 -0600)]
MAINTAINERS: add me as ibmveth maintainer
Adding myself as the ibmveth maintainer and replacing
Santiago Leon.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Cc: Santiago Leon <santi_leon@yahoo.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Thu, 8 Jan 2015 17:27:27 +0000 (12:27 -0500)]
tipc: fix bug in broadcast retransmit code
In commit
58dc55f25631178ee74cd27185956a8f7dcb3e32 ("tipc: use generic
SKB list APIs to manage link transmission queue") we replace all list
traversal loops with the macros skb_queue_walk() or
skb_queue_walk_safe(). While the previous loops were based on the
assumption that the list was NULL-terminated, the standard macros
stop when the iterator reaches the list head, which is non-NULL.
In the function bclink_retransmit_pkt() this macro replacement has
lead to a bug. When we receive a BCAST STATE_MSG we unconditionally
call the function bclink_retransmit_pkt(), whether there really is
anything to retransmit or not, assuming that the sequence number
comparisons will lead to the correct behavior. However, if the
transmission queue is empty, or if there are no eligible buffers in
the transmission queue, we will by mistake pass the list head pointer
to the function tipc_link_retransmit(). Since the list head is not a
valid sk_buff, this leads to a crash.
In this commit we fix this by only calling tipc_link_retransmit()
if we actually found eligible buffers in the transmission queue.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ani Sinha [Wed, 7 Jan 2015 23:45:56 +0000 (15:45 -0800)]
update ip-sysctl.txt documentation (v2)
Update documentation to reflect the fact that
/proc/sys/net/ipv4/route/max_size is no longer used for ipv4.
Signed-off-by: Ani Sinha <ani@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandre Belloni [Wed, 7 Jan 2015 22:59:26 +0000 (23:59 +0100)]
net/at91_ether: prepare and unprepare clock
The clock is enabled without being prepared, this leads to:
WARNING: CPU: 0 PID: 1 at drivers/clk/clk.c:889 __clk_enable+0x24/0xa8()
and a non working ethernet interface.
Use clk_prepare_enable() and clk_disable_unprepare() to handle the clock.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giel van Schijndel [Wed, 7 Jan 2015 19:10:12 +0000 (20:10 +0100)]
isdn: fix NUL (\0 or \x00) specification in string
In C one can either use '\0' or '\x00' (or '\000') to add a NUL byte to
a string. '\0x00' isn't part of these and will in fact result in a
single NUL followed by "x00". This fixes that.
Signed-off-by: Giel van Schijndel <me@mortis.eu>
Reported-at: http://www.viva64.com/en/b/0299/
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Zyngier [Sun, 11 Jan 2015 13:10:11 +0000 (14:10 +0100)]
arm64: KVM: Fix HCR setting for 32bit guests
Commit
b856a59141b1 (arm/arm64: KVM: Reset the HCR on each vcpu
when resetting the vcpu) moved the init of the HCR register to
happen later in the init of a vcpu, but left out the fixup
done in kvm_reset_vcpu when preparing for a 32bit guest.
As a result, the 32bit guest is run as a 64bit guest, but the
rest of the kernel still manages it as a 32bit. Fun follows.
Moving the fixup to vcpu_reset_hcr solves the problem for good.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Marc Zyngier [Sun, 11 Jan 2015 13:10:10 +0000 (14:10 +0100)]
arm64: KVM: Fix TLB invalidation by IPA/VMID
It took about two years for someone to notice that the IPA passed
to TLBI IPAS2E1IS must be shifted by 12 bits. Clearly our reviewing
is not as good as it should be...
Paper bag time for me.
Reported-by: Mario Smarduch <m.smarduch@samsung.com>
Tested-by: Mario Smarduch <m.smarduch@samsung.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Juergen Gross [Mon, 12 Jan 2015 05:05:10 +0000 (06:05 +0100)]
xen: check for zero sized area when invalidating memory
With the introduction of the linear mapped p2m list setting memory
areas to "invalid" had to be delayed. When doing the invalidation
make sure no zero sized areas are processed.
Signed-off-by: Juegren Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Juergen Gross [Mon, 12 Jan 2015 05:05:09 +0000 (06:05 +0100)]
xen: use correct type for physical addresses
When converting a pfn to a physical address be sure to use 64 bit
wide types or convert the physical address to a pfn if possible.
Signed-off-by: Juergen Gross <jgross@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Juergen Gross [Mon, 12 Jan 2015 05:05:08 +0000 (06:05 +0100)]
xen: correct race in alloc_p2m_pmd()
When allocating a new pmd for the linear mapped p2m list a check is
done for not introducing another pmd when this just happened on
another cpu. In this case the old pte pointer was returned which
points to the p2m_missing or p2m_identity page. The correct value
would be the pointer to the found new page.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Juergen Gross [Mon, 12 Jan 2015 05:05:07 +0000 (06:05 +0100)]
xen: correct error for building p2m list on 32 bits
In xen_rebuild_p2m_list() for large areas of invalid or identity
mapped memory the pmd entries on 32 bit systems are initialized
wrong. Correct this error.
Suggested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Jan Willeke [Thu, 8 Jan 2015 15:56:01 +0000 (16:56 +0100)]
s390/uprobes: fix user space PER events
If uprobes are single stepped for example with gdb, the behavior should
now be correct. Before this patch, when gdb was single stepping a uprobe,
the result was a SIGILL.
When PER is active for any storage alteration and a uprobe is hit, a storage
alteration event is indicated. These over indications are filterd out by gdb,
if no change has happened within the observed area.
Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Adrian Hunter [Mon, 5 Jan 2015 12:47:58 +0000 (14:47 +0200)]
mmc: sdhci-pci: Add support for Intel SPT
Add PCI IDs for SPT eMMC, SDIO and SD card.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Adrian Hunter [Mon, 5 Jan 2015 12:47:57 +0000 (14:47 +0200)]
mmc: sdhci-acpi: Add ACPI HID INT344D
Add ACPI HID INT344D for an Intel SDIO host controller.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Krzysztof Kozlowski [Mon, 5 Jan 2015 09:50:15 +0000 (10:50 +0100)]
mmc: sdhci: Fix sleep in atomic after inserting SD card
Sleep in atomic context happened on Trats2 board after inserting or
removing SD card because mmc_gpio_get_cd() was called under spin lock.
Fix this by moving card detection earlier, before acquiring spin lock.
The mmc_gpio_get_cd() call does not have to be protected by spin lock
because it does not access any sdhci internal data.
The sdhci_do_get_cd() call access host flags (SDHCI_DEVICE_DEAD). After
moving it out side of spin lock it could theoretically race with driver
removal but still there is no actual protection against manual card
eject.
Dmesg after inserting SD card:
[ 41.663414] BUG: sleeping function called from invalid context at drivers/gpio/gpiolib.c:1511
[ 41.670469] in_atomic(): 1, irqs_disabled(): 128, pid: 30, name: kworker/u8:1
[ 41.677580] INFO: lockdep is turned off.
[ 41.681486] irq event stamp: 61972
[ 41.684872] hardirqs last enabled at (61971): [<
c0490ee0>] _raw_spin_unlock_irq+0x24/0x5c
[ 41.693118] hardirqs last disabled at (61972): [<
c04907ac>] _raw_spin_lock_irq+0x18/0x54
[ 41.701190] softirqs last enabled at (61648): [<
c0026fd4>] __do_softirq+0x234/0x2c8
[ 41.708914] softirqs last disabled at (61631): [<
c00273a0>] irq_exit+0xd0/0x114
[ 41.716206] Preemption disabled at:[< (null)>] (null)
[ 41.721500]
[ 41.722985] CPU: 3 PID: 30 Comm: kworker/u8:1 Tainted: G W 3.18.0-rc5-next-
20141121 #883
[ 41.732111] Workqueue: kmmcd mmc_rescan
[ 41.735945] [<
c0014d2c>] (unwind_backtrace) from [<
c0011c80>] (show_stack+0x10/0x14)
[ 41.743661] [<
c0011c80>] (show_stack) from [<
c0489d14>] (dump_stack+0x70/0xbc)
[ 41.750867] [<
c0489d14>] (dump_stack) from [<
c0228b74>] (gpiod_get_raw_value_cansleep+0x18/0x30)
[ 41.759628] [<
c0228b74>] (gpiod_get_raw_value_cansleep) from [<
c03646e8>] (mmc_gpio_get_cd+0x38/0x58)
[ 41.768821] [<
c03646e8>] (mmc_gpio_get_cd) from [<
c036d378>] (sdhci_request+0x50/0x1a4)
[ 41.776808] [<
c036d378>] (sdhci_request) from [<
c0357934>] (mmc_start_request+0x138/0x268)
[ 41.785051] [<
c0357934>] (mmc_start_request) from [<
c0357cc8>] (mmc_wait_for_req+0x58/0x1a0)
[ 41.793469] [<
c0357cc8>] (mmc_wait_for_req) from [<
c0357e68>] (mmc_wait_for_cmd+0x58/0x78)
[ 41.801714] [<
c0357e68>] (mmc_wait_for_cmd) from [<
c0361c00>] (mmc_io_rw_direct_host+0x98/0x124)
[ 41.810480] [<
c0361c00>] (mmc_io_rw_direct_host) from [<
c03620f8>] (sdio_reset+0x2c/0x64)
[ 41.818641] [<
c03620f8>] (sdio_reset) from [<
c035a3d8>] (mmc_rescan+0x254/0x2e4)
[ 41.826028] [<
c035a3d8>] (mmc_rescan) from [<
c003a0e0>] (process_one_work+0x180/0x3f4)
[ 41.833920] [<
c003a0e0>] (process_one_work) from [<
c003a3bc>] (worker_thread+0x34/0x4b0)
[ 41.841991] [<
c003a3bc>] (worker_thread) from [<
c003fed8>] (kthread+0xe4/0x104)
[ 41.849285] [<
c003fed8>] (kthread) from [<
c000f268>] (ret_from_fork+0x14/0x2c)
[ 42.038276] mmc0: new high speed SDHC card at address 1234
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: 94144a465dd0 ("mmc: sdhci: add get_cd() implementation")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Thomas Petazzoni [Wed, 31 Dec 2014 10:54:10 +0000 (11:54 +0100)]
mmc: sdhci-pxav3: do the mbus window configuration after enabling clocks
In commit
5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada
38x SDHCI controller"), the sdhci-pxav3 driver was extended to include
support for the SDHCI controller found in the Armada 38x
processor. This mainly involved adding some MBus window related
configuration.
However, this configuration is currently done too early in ->probe():
it is done before clocks are enabled, while this configuration
involves touching the registers of the controller, which will hang the
SoC if the clock is disabled. It wasn't noticed until now because the
bootloader typically leaves gatable clocks enabled, but in situations
where we have a deferred probe (due to a CD GPIO that cannot be taken,
for example), then the probe will be re-tried later, after a clock
disable has been done in the exit path of the failed probe attempt of
the device. This second probe() will hang the system due to the clock
being disabled.
This can for example be produced on Armada 385 GP, which has a CD GPIO
connected to an I2C PCA9555. If the driver for the PCA9555 is not
compiled into the kernel, then we will have the following sequence of
events:
1. The SDHCI probes
2. It does the MBus configuration (which works, because the clock is
left enabled by the bootloader)
3. It enables the clock
4. It tries to get the CD GPIO, which fails due to the driver being
missing, so -EPROBE_DEFER is returned.
5. Before returning -EPROBE_DEFER, the driver cleans up what was
done, which includes disabling the clock.
6. Later on, the SDHCI probe is tried again.
7. It does the MBus configuration, which hangs because the clock is
no longer enabled.
This commit does the obvious fix of doing the MBus configuration after
the clock has been enabled by the driver.
Fixes: 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Adrian Hunter [Fri, 5 Dec 2014 17:25:31 +0000 (19:25 +0200)]
mmc: sdhci: Disable re-tuning for HS400
Re-tuning for HS400 mode must be done in HS200
mode. Currently there is no support for that.
That needs to be reflected in the code.
Specifically, if tuning is executed in HS400 mode
then return an error, and do not start the
tuning timer if HS200 tuning is being done prior
to switching to HS400.
Note that periodic re-tuning is not expected
to be needed for HS400 but re-tuning is still
needed after the host controller has lost power.
In the case of suspend/resume that is not necessary
because the card is fully re-initialised. That
just leaves runtime suspend/resume with no support
for HS400 re-tuning.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Adrian Hunter [Fri, 5 Dec 2014 17:25:30 +0000 (19:25 +0200)]
mmc: sdhci: Simplify use of tuning timer
The tuning timer is always used if the tuning mode
is 1 and there is a tuning count, irrespective of
whether this is the first call, or any subsequent
call. Consequently the logic to start the timer
can be simplified.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Adrian Hunter [Fri, 5 Dec 2014 17:25:29 +0000 (19:25 +0200)]
mmc: sdhci: Add out_unlock to sdhci_execute_tuning
A 'goto' can be used to save duplicating unlocking
and returning.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Adrian Hunter [Fri, 5 Dec 2014 17:25:28 +0000 (19:25 +0200)]
mmc: sdhci: Tuning should not change max_blk_count
Re-tuning requires that the maximum data length
is limited to 4MiB. The code currently changes
max_blk_count in an attempt to achieve that.
This is wrong because max_blk_count is a different
limit, but it is also un-necessary because
max_req_size is 512KiB anyway. Consequently, the
changes to max_blk_count are removed and the
comment for max_req_size adjusted accordingly.
The comment is also tweaked to show that the 512KiB
limit is a SDMA limit not an ADMA limit.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
David S. Miller [Mon, 12 Jan 2015 05:23:45 +0000 (00:23 -0500)]
Merge tag 'wireless-drivers-for-davem-2015-01-09' of git://git./linux/kernel/git/kvalo/wireless-drivers
* rtlwifi: fix a regression in large skb allocation failure
iwlwifi:
* fix for 7265D NVM check
* fixes for scan: fix long scanning times and network discovery
* new firmware API for iwlmvm supported devices
* fixes in rate control
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 12 Jan 2015 05:14:49 +0000 (00:14 -0500)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
netfilter/ipvs fixes for net
The following patchset contains netfilter/ipvs fixes, they are:
1) Small fix for the FTP helper in IPVS, a diff variable may be left
unset when CONFIG_IP_VS_IPV6 is set. Patch from Dan Carpenter.
2) Fix nf_tables port NAT in little endian archs, patch from leroy
christophe.
3) Fix race condition between conntrack confirmation and flush from
userspace. This is the second reincarnation to resolve this problem.
4) Make sure inner messages in the batch come with the nfnetlink header.
5) Relax strict check from nfnetlink_bind() that may break old userspace
applications using all 1s group mask.
6) Schedule removal of chains once no sets and rules refer to them in
the new nf_tables ruleset flush command. Reported by Asbjoern Sloth
Toennesen.
Note that this batch comes later than usual because of the short
winter holidays.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Jaeger [Sun, 11 Jan 2015 18:01:16 +0000 (13:01 -0500)]
packet: bail out of packet_snd() if L2 header creation fails
Due to a misplaced parenthesis, the expression
(unlikely(offset) < 0),
which expands to
(__builtin_expect(!!(offset), 0) < 0),
never evaluates to true. Therefore, when sending packets with
PF_PACKET/SOCK_DGRAM, packet_snd() does not abort as intended
if the creation of the layer 2 header fails.
Spotted by Coverity - CID
1259975 ("Operands don't affect result").
Fixes: 9c7077622dd9 ("packet: make packet_snd fail on len smaller than l2 header")
Signed-off-by: Christoph Jaeger <cj@linux.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Jan 2015 18:32:18 +0000 (10:32 -0800)]
alx: fix alx_poll()
Commit
d75b1ade567f ("net: less interrupt masking in NAPI") uncovered
wrong alx_poll() behavior.
A NAPI poll() handler is supposed to return exactly the budget when/if
napi_complete() has not been called.
It is also supposed to return number of frames that were received, so
that netdev_budget can have a meaning.
Also, in case of TX pressure, we still have to dequeue received
packets : alx_clean_rx_irq() has to be called even if
alx_clean_tx_irq(alx) returns false, otherwise device is half duplex.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: d75b1ade567f ("net: less interrupt masking in NAPI")
Reported-by: Oded Gabbay <oded.gabbay@amd.com>
Bisected-by: Oded Gabbay <oded.gabbay@amd.com>
Tested-by: Oded Gabbay <oded.gabbay@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 11 Jan 2015 19:02:32 +0000 (11:02 -0800)]
net: dnet: fix dnet_poll()
A NAPI poll() handler is supposed to return exactly the budget when/if
napi_complete() has not been called.
It is also supposed to return number of frames that were received, so
that netdev_budget can have a meaning.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 11 Jan 2015 20:44:53 +0000 (12:44 -0800)]
linux 3.19-rc4
Linus Torvalds [Sun, 11 Jan 2015 20:44:10 +0000 (12:44 -0800)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Three small fixes from over the Christmas period, and wiring up the
new execveat syscall for ARM"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8275/1: mm: fix PMD_SECT_RDONLY undeclared compile error
ARM: 8253/1: mm: use phys_addr_t type in map_lowmem() for kernel mem region
ARM: 8249/1: mm: dump: don't skip regions
ARM: wire up execveat syscall
Linus Torvalds [Sun, 11 Jan 2015 19:53:46 +0000 (11:53 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes: two vdso fixes, two kbuild fixes and a boot failure fix
with certain odd memory mappings"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, vdso: Use asm volatile in __getcpu
x86/build: Clean auto-generated processor feature files
x86: Fix mkcapflags.sh bash-ism
x86: Fix step size adjustment during initial memory mapping
x86_64, vdso: Fix the vdso address randomization algorithm
Linus Torvalds [Sun, 11 Jan 2015 19:51:49 +0000 (11:51 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Misc fixes: group scheduling corner case fix, two deadline scheduler
fixes, effective_load() overflow fix, nested sleep fix, 6144 CPUs
system fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group()
sched/deadline: Avoid double-accounting in case of missed deadlines
sched/deadline: Fix migration of SCHED_DEADLINE tasks
sched: Fix odd values in effective_load() calculations
sched, fanotify: Deal with nested sleeps
sched: Fix KMALLOC_MAX_SIZE overflow during cpumask allocation
Linus Torvalds [Sun, 11 Jan 2015 19:47:45 +0000 (11:47 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also some kernel side fixes: uncore PMU
driver fix, user regs sampling fix and an instruction decoder fix that
unbreaks PEBS precise sampling"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes
perf/x86_64: Improve user regs sampling
perf: Move task_pt_regs sampling into arch code
x86: Fix off-by-one in instruction decoder
perf hists browser: Fix segfault when showing callchain
perf callchain: Free callchains when hist entries are deleted
perf hists: Fix children sort key behavior
perf diff: Fix to sort by baseline field by default
perf list: Fix --raw-dump option
perf probe: Fix crash in dwarf_getcfi_elf
perf probe: Fix to fall back to find probe point in symbols
perf callchain: Append callchains only when requested
perf ui/tui: Print backtrace symbols when segfault occurs
perf report: Show progress bar for output resorting
Linus Torvalds [Sun, 11 Jan 2015 19:46:31 +0000 (11:46 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
"A liblockdep fix and a mutex_unlock() mutex-debugging fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mutex: Always clear owner field upon mutex_unlock()
tools/liblockdep: Fix debug_check thinko in mutex destroy
Konstantin Khlebnikov [Sun, 11 Jan 2015 13:54:06 +0000 (16:54 +0300)]
mm: fix corner case in anon_vma endless growing prevention
Fix for BUG_ON(anon_vma->degree) splashes in unlink_anon_vmas() ("kernel
BUG at mm/rmap.c:399!") caused by commit
7a3ef208e662 ("mm: prevent
endless growth of anon_vma hierarchy")
Anon_vma_clone() is usually called for a copy of source vma in
destination argument. If source vma has anon_vma it should be already
in dst->anon_vma. NULL in dst->anon_vma is used as a sign that it's
called from anon_vma_fork(). In this case anon_vma_clone() finds
anon_vma for reusing.
Vma_adjust() calls it differently and this breaks anon_vma reusing
logic: anon_vma_clone() links vma to old anon_vma and updates degree
counters but vma_adjust() overrides vma->anon_vma right after that. As
a result final unlink_anon_vmas() decrements degree for wrong anon_vma.
This patch assigns ->anon_vma before calling anon_vma_clone().
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-and-tested-by: Chris Clayton <chris2553@googlemail.com>
Reported-and-tested-by: Oded Gabbay <oded.gabbay@amd.com>
Reported-and-tested-by: Chih-Wei Huang <cwhuang@android-x86.org>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Daniel Forrest <dan.forrest@ssec.wisc.edu>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org # to match back-porting of 7a3ef208e662
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 11 Jan 2015 19:33:57 +0000 (11:33 -0800)]
mm: Don't count the stack guard page towards RLIMIT_STACK
Commit
fee7e49d4514 ("mm: propagate error from stack expansion even for
guard page") made sure that we return the error properly for stack
growth conditions. It also theorized that counting the guard page
towards the stack limit might break something, but also said "Let's see
if anybody notices".
Somebody did notice. Apparently android-x86 sets the stack limit very
close to the limit indeed, and including the guard page in the rlimit
check causes the android 'zygote' process problems.
So this adds the (fairly trivial) code to make the stack rlimit check be
against the actual real stack size, rather than the size of the vma that
includes the guard page.
Reported-and-tested-by: Chih-Wei Huang <cwhuang@android-x86.org>
Cc: Jay Foad <jay.foad@gmail.com>
Cc: stable@kernel.org # to match back-porting of fee7e49d4514
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Molnar [Sun, 11 Jan 2015 08:18:05 +0000 (09:18 +0100)]
Merge branch 'core/urgent' into locking/urgent, to collect all pending locking fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Sat, 10 Jan 2015 20:23:03 +0000 (12:23 -0800)]
Merge tag 'vfio-v3.19-rc4' of git://github.com/awilliam/linux-vfio
Pull VFIO fix from Alex Williamson:
"Fix PCI header check in vfio_pci_probe() (Wei Yang)"
* tag 'vfio-v3.19-rc4' of git://github.com/awilliam/linux-vfio:
vfio-pci: Fix the check on pci device type in vfio_pci_probe()
Linus Torvalds [Sat, 10 Jan 2015 19:59:25 +0000 (11:59 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"Just one fix: a qlogic busy wait regression"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
qla2xxx: fix busy wait regression
Linus Torvalds [Sat, 10 Jan 2015 05:23:27 +0000 (21:23 -0800)]
Merge tag 'sound-3.19-rc4' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"All a few small regression or stable fixes: a Nvidia HDMI ID addition,
a regression fix for CAIAQ stream count, a typo fix for GPIO setup
with STAC/IDT HD-audio codecs, and a Fireworks big-endian fix"
* tag 'sound-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: fireworks: fix an endianness bug for transaction length
ALSA: hda - Add new GPU codec ID 0x10de0072 to snd-hda
ALSA: hda - Fix wrong gpio_dir & gpio_mask hint setups for IDT/STAC codecs
ALSA: snd-usb-caiaq: fix stream count check