Xin Long [Thu, 18 Feb 2016 13:21:19 +0000 (21:21 +0800)]
route: check and remove route cache when we get route
Since the gc of ipv4 route was removed, the route cached would has
no chance to be removed, and even it has been timeout, it still could
be used, cause no code to check it's expires.
Fix this issue by checking and removing route cache when we get route.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Thu, 18 Feb 2016 12:38:04 +0000 (07:38 -0500)]
net_sched fix: reclassification needs to consider ether protocol changes
actions could change the etherproto in particular with ethernet
tunnelled data. Typically such actions, after peeling the outer header,
will ask for the packet to be reclassified. We then need to restart
the classification with the new proto header.
Example setup used to catch this:
sudo tc qdisc add dev $ETH ingress
sudo $TC filter add dev $ETH parent ffff: pref 1 protocol 802.1Q \
u32 match u32 0 0 flowid 1:1 \
action vlan pop reclassify
Fixes: 3b3ae880266d ("net: sched: consolidate tc_classify{,_compat}")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 18 Feb 2016 15:44:27 +0000 (10:44 -0500)]
Merge branch 'mlxsw-fixes'
Jiri Pirko says:
====================
mlxsw fixes
Another bulk of fixes from Ido.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 18 Feb 2016 10:30:02 +0000 (11:30 +0100)]
mlxsw: spectrum: Allow for PVID deletion
When PVID is toggled off on a port member in a VLAN filtering bridge or
the PVID VLAN is deleted, make the port drop untagged packets. Reverse
the operation when PVID is toggled back on.
Set the PVID back to the default (1), when leaving the bridge so that
untagged traffic will be directed to the CPU.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 18 Feb 2016 10:30:01 +0000 (11:30 +0100)]
mlxsw: reg: Add the Switch Port Acceptable Frame Types register
When VLAN filtering is enabled on a bridge and PVID is deleted from a
bridge port, then untagged frames are not allowed to ingress into the
bridge from this port.
Add the Switch Port Acceptable Frame Types (SPAFT) register, which
configures the frame admittance of the port.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Insu Yun [Tue, 16 Feb 2016 02:23:47 +0000 (21:23 -0500)]
et131x: check return value of dma_alloc_coherent
For error handling, dma_alloc_coherent's return value
needs to be checked, not argument.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 18 Feb 2016 03:24:57 +0000 (22:24 -0500)]
Merge branch 'thunderx-fixes'
Sunil Goutham says:
====================
net: thunderx: Miscellaneous fixes
This patch series fixes couple of issues w.r.t multiqset mode
and receive packet statastics.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Tue, 16 Feb 2016 10:59:51 +0000 (16:29 +0530)]
net: thunderx: Fix receive packet stats
Counting rx packets for every CQE_RX in CQ irq handler is incorrect.
Synchronization is missing when multiple queues are receiving packets
simultaneously. Like transmit packet stats use HW stats here.
Also removed unused 'cqe_type' parameter in nicvf_rcv_pkt_handler().
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Tue, 16 Feb 2016 10:59:50 +0000 (16:29 +0530)]
net: thunderx: Fix for HW TSO not enabled for secondary qsets
For secondary Qsets 'hw_tso' is not getting set as probe() returns
much earlier. Fixed it by moving silicon revision check.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sunil Goutham [Tue, 16 Feb 2016 10:59:49 +0000 (16:29 +0530)]
net: thunderx: Fix for multiqset not configured upon interface toggle
When a interface is assigned morethan 8 queues and the logical interface
is toggled i.e down & up, additional queues or qsets are not initialized
as secondary qset count is being set to zero while tearing down.
Signed-off-by: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Insu Yun [Tue, 16 Feb 2016 02:30:33 +0000 (21:30 -0500)]
tcp: correctly crypto_alloc_hash return check
crypto_alloc_hash never returns NULL
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 18 Feb 2016 02:43:22 +0000 (18:43 -0800)]
net: dsa: Unregister slave_dev in error path
With commit
0071f56e46da ("dsa: Register netdev before phy"), we are now trying
to free a network device that has been previously registered, and in case of
errors, this will make us hit the BUG_ON(dev->reg_state != NETREG_UNREGISTERED)
condition.
Fix this by adding a missing unregister_netdev() before free_netdev().
Fixes: 0071f56e46da ("dsa: Register netdev before phy")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clemens Gruber [Mon, 15 Feb 2016 22:46:45 +0000 (23:46 +0100)]
phy: marvell: Fix and unify reg-init behavior
For the Marvell
88E1510, marvell_of_reg_init was called too late, in the
config_aneg function.
Since commit
113c74d83eef ("net: phy: turn carrier off on phy attach"),
this lead to the link not coming up at boot anymore, due to the phy
state machine being stuck at waiting for interrupts (off by default on
the
88E1510).
For seven other Marvell PHYs, marvell_of_reg_init was not called at all.
Add a generic marvell_config_init function, which in turn calls
marvell_of_reg_init.
PHYs, which already have a specific config_init function with a call to
marvell_of_reg_init, are left untouched. The generic marvell_config_init
function is called for all the others, to get consistent behavior across
all Marvell PHYs.
Fixes: 113c74d83eef ("net: phy: turn carrier off on phy attach")
Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Mon, 15 Feb 2016 16:01:10 +0000 (17:01 +0100)]
pppoe: fix reference counting in PPPoE proxy
Drop reference on the relay_po socket when __pppoe_xmit() succeeds.
This is already handled correctly in the error path.
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Mon, 15 Feb 2016 12:41:31 +0000 (13:41 +0100)]
ravb: Update DT binding example for final CPG/MSSR bindings
The example in the DT binding documentation uses the preliminary DT
bindings for the r8a7795 MSTP clocks, which never went upstream.
Update the example to use the DT bindings for the upstream Clock Pulse
Generator / Module Standby and Software Reset hardware block.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Feb 2016 20:52:59 +0000 (15:52 -0500)]
Merge branch 'mlxsw-fixes'
Jiri Pirko says:
====================
mlxsw fixes
Just a couple of fixes from Ido.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 15 Feb 2016 12:19:54 +0000 (13:19 +0100)]
mlxsw: spectrum: Set STP state when leaving 802.1D bridge
When a VLAN device leaves a bridge its STP state is set to DISABLED,
which causes the hardware to discard any packets coming through the port
with this VLAN.
Fix that by setting STP state to FORWARDING when the device leaves its
bridge and allow traffic to be directed to CPU.
Fixes: 26f0e7fb15de ("mlxsw: spectrum: Add support for VLAN devices bridging")
Reported-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 15 Feb 2016 12:19:53 +0000 (13:19 +0100)]
mlxsw: Treat local port 64 as valid
MLXSW_PORT_MAX_PORTS represents the maximum number of local ports, which
is 65 for both ASICs (SwitchX-2 and Spectrum) supported by this driver.
Fixes: 93c1edb27f9e ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Tomlinson [Mon, 15 Feb 2016 03:24:44 +0000 (16:24 +1300)]
l2tp: Fix error creating L2TP tunnels
A previous commit (
33f72e6) added notification via netlink for tunnels
when created/modified/deleted. If the notification returned an error,
this error was returned from the tunnel function. If there were no
listeners, the error code ESRCH was returned, even though having no
listeners is not an error. Other calls to this and other similar
notification functions either ignore the error code, or filter ESRCH.
This patch checks for ESRCH and does not flag this as an error.
Reviewed-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Rivshin [Sat, 13 Feb 2016 00:45:36 +0000 (19:45 -0500)]
drivers: net: cpsw-phy-sel: add dev_warn() for unsupported PHY mode
The cpsw-phy-sel driver supports only MII, RMII, and RGMII PHY modes,
and silently handled any other values as if MII was specified. In a
case where the PHY mode was incorrectly specified, or a bug elsewhere,
there would be no indication of a problem. If MII was the correct mode,
then this will go unnoticed, otherwise the symptom will be a failure
to transmit/receive data over the RMII/RGMII link.
Add a dev_warn() to make this condition obvious and provide a
breadcrumb to follow.
Cc: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Woojung.Huh@microchip.com [Thu, 11 Feb 2016 17:29:47 +0000 (17:29 +0000)]
phy: keep pause flags in phy driver features
genphy_config_init() masked out pause flags set in phy driver structure.
Pause flags needs to be preserved in phydev->supported &
phydev->advertising.
Signed-off-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 17 Feb 2016 15:29:27 +0000 (10:29 -0500)]
Merge branch 'mlx4-fixes'
Or Gerlitz says:
====================
Mellanox 10/40G mlx4 driver fixes for 4.5-rc
Bunch of fixes from the team to the mlx4 Eth and core drivers.
Series generated against net commit
aac8d3c "qmi_wwan: add "4G LTE usb-modem U901""
Please push patches 1,2 and 6 to -stable as well
changes from v0:
- handled another wrongly accounted HW counter in patch #1 (Rick)
- fixed coding style issues in patch #4 (Sergei)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eugenia Emantayev [Wed, 17 Feb 2016 15:24:27 +0000 (17:24 +0200)]
net/mlx4_en: Avoid changing dev->features directly in run-time
It's forbidden to manually change dev->features in run-time. Currently, this is
done in the driver to make sure that GSO_UDP_TUNNEL is advertized only when
VXLAN tunnel is set. However, since the stack actually does features intersection
with hw_enc_features, we can safely revert to advertizing features early when
registering the netdevice.
Fixes: f4a1edd56120 ('net/mlx4_en: Advertize encapsulation offloads [...]')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huy Nguyen [Wed, 17 Feb 2016 15:24:26 +0000 (17:24 +0200)]
net/mlx4_core: Set UAR page size to 4KB regardless of system page size
problem description:
The current code sets UAR page size equal to system page size.
The ConnectX-3 and ConnectX-3 Pro HWs require minimum 128 UAR pages.
The mlx4 kernel drivers are not loaded if there is less than 128 UAR pages.
solution:
Always set UAR page to 4KB. This allows more UAR pages if the OS
has PAGE_SIZE larger than 4KB. For example, PowerPC kernel use 64KB
system page size, with 4MB uar region, there are 4MB/2/64KB = 32
uars (half for uar, half for blueflame). This does not meet minimum 128
UAR pages requirement. With 4KB UAR page, there are 4MB/2/4KB = 512 uars
which meet the minimum requirement.
Note that only codes in mlx4_core that deal with firmware know that uar
page size is 4KB. Codes that deal with usr page in cq and qp context
(mlx4_ib, mlx4_en and part of mlx4_core) still have the same assumption
that uar page size equals to system page size.
Note that with this implementation, on 64KB system page size kernel, there
are 16 uars per system page but only one uars is used. The other 15
uars are ignored because of the above assumption.
Regarding SR-IOV, mlx4_core in hypervisor will set the uar page size
to 4KB and mlx4_core code in virtual OS will obtain the uar page size from
firmware.
Regarding backward compatibility in SR-IOV, if hypervisor has this new code,
the virtual OS must be updated. If hypervisor has old code, and the virtual
OS has this new code, the new code will be backward compatible with the
old code. If the uar size is big enough, this new code in VF continues to
work with 64 KB uar page size (on PowerPc kernel). If the uar size does not
meet 128 uars requirement, this new code not loaded in VF and print the same
error message as the old code in Hypervisor.
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Jurgens [Wed, 17 Feb 2016 15:24:25 +0000 (17:24 +0200)]
net/mlx4_core: Do not BUG_ON during reset when PCI is offline
The PCI channel could go offline during reset due to EEH. Don't bug on in
this case, the error is recoverable.
Fixes: f6bc11e42646 ('net/mlx4_core: Enhance the catas flow to support device reset')
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eran Ben Elisha [Wed, 17 Feb 2016 15:24:24 +0000 (17:24 +0200)]
net/mlx4_core: Fix potential corruption in counters database
The error flow in procedure handle_existing_counter() is wrong.
The procedure should exit after encountering the error, not continue
as if everything is OK.
Fixes: 68230242cdbc ('net/mlx4_core: Add port attribute when tracking counters')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eugenia Emantayev [Wed, 17 Feb 2016 15:24:23 +0000 (17:24 +0200)]
net/mlx4_en: Choose time-stamping shift value according to HW frequency
Previously, the shift value used for time-stamping was constant and didn't
depend on the HW chip frequency. Change that to take the frequency into account
and calculate the maximal value in cycles per wraparound of ten seconds. This
time slot was chosen since it gives a good accuracy in time synchronization.
Algorithm for shift value calculation:
* Round up the maximal value in cycles to nearest power of two
* Calculate maximal multiplier by division of all 64 bits set
to above result
* Then, invert the function clocksource_khz2mult() to get the shift from
maximal mult value
Fixes: ec693d47010e ('net/mlx4_en: Add HW timestamping (TS) support')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amir Vadai [Wed, 17 Feb 2016 15:24:22 +0000 (17:24 +0200)]
net/mlx4_en: Count HW buffer overrun only once
RdropOvflw counts overrun of HW buffer, therefore should
be used for rx_fifo_errors only.
Currently RdropOvflw counter is mistakenly also set into
rx_missed_errors and rx_over_errors too, which makes the
device total dropped packets accounting to show wrong results.
Fix that. Use it for rx_fifo_errors only.
Fixes: c27a02cd94d6 ('mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC')
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Fri, 12 Feb 2016 15:42:14 +0000 (16:42 +0100)]
qmi_wwan: add "4G LTE usb-modem U901"
Thomas reports:
T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=05c6 ProdID=6001 Rev=00.00
S: Manufacturer=USB Modem
S: Product=USB Modem
S: SerialNumber=
1234567890ABCDEF
C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 12 Feb 2016 06:50:29 +0000 (22:50 -0800)]
tcp: md5: release request socket instead of listener
If tcp_v4_inbound_md5_hash() returns an error, we must release
the refcount on the request socket, not on the listener.
The bug was added for IPv4 only.
Fixes: 079096f103fac ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 11 Feb 2016 16:58:18 +0000 (08:58 -0800)]
tcp: do not set rtt_min to 1
There are some cases where rtt_us derives from deltas of jiffies,
instead of using usec timestamps.
Since we want to track minimal rtt, better to assume a delta of 0 jiffie
might be in fact be very close to 1 jiffie.
It is kind of sad jiffies_to_usecs(1) calls a function instead of simply
using a constant.
Fixes: f672258391b42 ("tcp: track min RTT using windowed min-filter")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ken Kawasaki [Thu, 11 Feb 2016 11:27:04 +0000 (20:27 +0900)]
pcnet_cs: add new id
add new id (CONTEC C-NET(PC)C-100TX2)
Signed-off-by: Ken Kawasaki <ken_kawasaki@nifty.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sascha Hauer [Thu, 11 Feb 2016 10:44:49 +0000 (11:44 +0100)]
net: dsa: remove phy_disconnect from error path
The phy has not been initialized, disconnecting it in the error
path results in a NULL pointer exception. Drop the phy_disconnect
from the error path.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sascha Hauer [Thu, 11 Feb 2016 10:44:48 +0000 (11:44 +0100)]
net: dsa: mv88e6xxx: Add support for Marvell
88E6240
The Marvell
88E6240 has been tested successfully without further
changes. Add entry to the table of supported devices.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 10 Feb 2016 21:14:57 +0000 (16:14 -0500)]
tipc: fix premature addition of node to lookup table
In commit
5266698661401a ("tipc: let broadcast packet reception
use new link receive function") we introduced a new per-node
broadcast reception link instance. This link is created at the
moment the node itself is created. Unfortunately, the allocation
is done after the node instance has already been added to the node
lookup hash table. This creates a potential race condition, where
arriving broadcast packets are able to find and access the node
before it has been fully initialized, and before the above mentioned
link has been created. The result is occasional crashes in the function
tipc_bcast_rcv(), which is trying to access the not-yet existing link.
We fix this by deferring the addition of the node instance until after
it has been fully initialized in the function tipc_node_create().
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Feb 2016 20:50:03 +0000 (15:50 -0500)]
Merge branch 'bnxt_en-fixes'
Michael Chan says:
====================
bnxt_en: Bug fixes.
Fixed autoneg logic and some related cleanups, fixed tx push operation,
and reduced default ring sizes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Feb 2016 22:33:50 +0000 (17:33 -0500)]
bnxt_en: Reduce default ring sizes.
The current default tx ring size of 512 causes an extra page to be
allocated for the tx ring with only 1 entry in it. Reduce it to
511. The default rx ring size is also reduced to 511 to use less
memory by default.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Feb 2016 22:33:49 +0000 (17:33 -0500)]
bnxt_en: Fix implementation of tx push operation.
tx push is supported for small packets to reduce DMA latency. The
following bugs are fixed in this patch:
1. Fix the definition of the push BD which is different from the DMA BD.
2. The push buffer has to be zero padded to the next 64-bit word boundary
or tx checksum won't be correct.
3. Increase the tx push packet threshold to 164 bytes (192 bytes with the BD)
so that small tunneled packets are within the threshold.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Feb 2016 22:33:48 +0000 (17:33 -0500)]
bnxt_en: Remove 20G support and advertise only 40GbaseCR4.
20G is not supported by production hardware and only the 40GbaseCR4 standard
is supported.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Feb 2016 22:33:47 +0000 (17:33 -0500)]
bnxt_en: Cleanup and Fix flow control setup logic
Cleanup bnxt_probe_phy() to cleanly separate 2 code blocks for autoneg
on and off. Autoneg flow control is possible only if autoneg is enabled.
In bnxt_get_settings(), Pause and Asym_Pause are always supported.
Only the advertisement bits change depending on the ethtool -A setting
in auto mode.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Wed, 10 Feb 2016 22:33:46 +0000 (17:33 -0500)]
bnxt_en: Fix ethtool autoneg logic.
1. Determine autoneg on|off setting from link_info->autoneg. Using the
firmware returned setting can be misleading if autoneg is changed and
there hasn't been a phy update from the firmware.
2. If autoneg is disabled, link_info->autoneg should be set to 0 to
indicate both speed and flow control autoneg are disabled.
3. To enable autoneg flow control, speed autoneg must be enabled.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 10 Feb 2016 15:09:02 +0000 (16:09 +0100)]
bridge: mdb: avoid uninitialized variable warning
A recent change to the mdb code confused the compiler to the point
where it did not realize that the port-group returned from
br_mdb_add_group() is always valid when the function returns a nonzero
return value, so we get a spurious warning:
net/bridge/br_mdb.c: In function 'br_mdb_add':
net/bridge/br_mdb.c:542:4: error: 'pg' may be used uninitialized in this function [-Werror=maybe-uninitialized]
__br_mdb_notify(dev, entry, RTM_NEWMDB, pg);
Slightly rearranging the code in br_mdb_add_group() makes the problem
go away, as gcc is clever enough to see that both functions check
for 'ret != 0'.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 9e8430f8d60d ("bridge: mdb: Passing the port-group pointer to br_mdb module")
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Wed, 10 Feb 2016 06:58:49 +0000 (12:28 +0530)]
cxgb4: Add pci device id for chelsio t540 lom adapter
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Feb 2016 20:26:31 +0000 (15:26 -0500)]
Merge branch 'arc_emac-fixes'
Alexander Kochetkov says:
====================
Fixes for rockchip EMAC
Here is a set of 3 patches what fix koops, memory leak and
rockchip EMAC hang. Tested on radxarock lite.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Kochetkov [Tue, 9 Feb 2016 15:20:40 +0000 (18:20 +0300)]
net: arc_emac: fix sk_buff leak
EMAC could be disabled, while there is some sb_buff
in use. That buffers got lost for linux.
In order to reproduce run on device during active ethernet work:
ifconfig eth0 down
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Kochetkov [Tue, 9 Feb 2016 15:20:39 +0000 (18:20 +0300)]
net: arc_emac: reset txbd_curr and txbd_dirty pointers to zero
EMAC reset internal tx ring pointer to zero at statup.
txbd_curr and txbd_dirty can be different from zero.
That cause ethernet transfer hang (no packets transmitted).
In order to reproduce, run on device:
ifconfig eth0 down
ifconfig eth0 up
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Kochetkov [Tue, 9 Feb 2016 15:20:38 +0000 (18:20 +0300)]
net: arc_emac: fix koops caused by sk_buff free
There is a race between arc_emac_tx() and arc_emac_tx_clean().
sk_buff got freed by arc_emac_tx_clean() while arc_emac_tx()
submitting sk_buff.
In order to free sk_buff arc_emac_tx_clean() checks:
if ((info & FOR_EMAC) || !txbd->data)
break;
...
dev_kfree_skb_irq(skb);
If condition false, arc_emac_tx_clean() free sk_buff.
In order to submit txbd, arc_emac_tx() do:
priv->tx_buff[*txbd_curr].skb = skb;
...
priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
...
... <== arc_emac_tx_clean() check condition here
... <== (info & FOR_EMAC) is false
... <== !txbd->data is false
...
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
In order to reproduce the situation,
run device:
# iperf -s
run on host:
# iperf -t 600 -c <device-ip-addr>
[ 28.396284] ------------[ cut here ]------------
[ 28.400912] kernel BUG at .../net/core/skbuff.c:1355!
[ 28.414019] Internal error: Oops - BUG: 0 [#1] SMP ARM
[ 28.419150] Modules linked in:
[ 28.422219] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 4.4.0+ #120
[ 28.429516] Hardware name: Rockchip (Device Tree)
[ 28.434216] task:
c0665070 ti:
c0660000 task.ti:
c0660000
[ 28.439622] PC is at skb_put+0x10/0x54
[ 28.443381] LR is at arc_emac_poll+0x260/0x474
[ 28.447821] pc : [<
c03af580>] lr : [<
c028fec4>] psr:
a0070113
[ 28.447821] sp :
c0661e58 ip :
eea68502 fp :
ef377000
[ 28.459280] r10:
0000012c r9 :
f08b2000 r8 :
eeb57100
[ 28.464498] r7 :
00000000 r6 :
ef376594 r5 :
00000077 r4 :
ef376000
[ 28.471015] r3 :
0030488b r2 :
ef13e880 r1 :
000005ee r0 :
eeb57100
[ 28.477534] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 28.484658] Control:
10c5387d Table:
8eaf004a DAC:
00000051
[ 28.490396] Process swapper/0 (pid: 0, stack limit = 0xc0660210)
[ 28.496393] Stack: (0xc0661e58 to 0xc0662000)
[ 28.500745] 1e40:
00000002 00000000
[ 28.508913] 1e60:
00000000 ef376520 00000028 f08b23b8 00000000 ef376520 ef7b6900 c028fc64
[ 28.517082] 1e80:
2f158000 c0661ea8 c0661eb0 0000012c c065e900 c03bdeac ffff95e9 c0662100
[ 28.525250] 1ea0:
c0663924 00000028 c0661ea8 c0661ea8 c0661eb0 c0661eb0 0000001e c0660000
[ 28.533417] 1ec0:
40000003 00000008 c0695a00 0000000a c066208c 00000100 c0661ee0 c0027410
[ 28.541584] 1ee0:
ef0fb700 2f158000 00200000 ffff95e8 00000004 c0662100 c0662080 00000003
[ 28.549751] 1f00:
00000000 00000000 00000000 c065b45c 0000001e ef005000 c0647a30 00000000
[ 28.557919] 1f20:
00000000 c0027798 00000000 c005cf40 f0802100 c0662ffc c0661f60 f0803100
[ 28.566088] 1f40:
c0661fb8 c00093bc c000ffb4 60070013 ffffffff c0661f94 c0661fb8 c00137d4
[ 28.574267] 1f60:
00000001 00000000 00000000 c001ffa0 00000000 c0660000 00000000 c065a364
[ 28.582441] 1f80:
c0661fb8 c0647a30 00000000 00000000 00000000 c0661fb0 c000ffb0 c000ffb4
[ 28.590608] 1fa0:
60070013 ffffffff 00000051 00000000 00000000 c005496c c0662400 c061bc40
[ 28.598776] 1fc0:
ffffffff ffffffff 00000000 c061b680 00000000 c0647a30 00000000 c0695294
[ 28.606943] 1fe0:
c0662488 c0647a2c c066619c 6000406a 413fc090 6000807c 00000000 00000000
[ 28.615127] [<
c03af580>] (skb_put) from [<
ef376520>] (0xef376520)
[ 28.621218] Code:
e5902054 e590c090 e3520000 0a000000 (
e7f001f2)
[ 28.627307] ---[ end trace
4824734e2243fdb6 ]---
[ 34.377068] Internal error: Oops: 17 [#1] SMP ARM
[ 34.382854] Modules linked in:
[ 34.385947] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.4.0+ #120
[ 34.392219] Hardware name: Rockchip (Device Tree)
[ 34.396937] task:
ef02d040 ti:
ef05c000 task.ti:
ef05c000
[ 34.402376] PC is at __dev_kfree_skb_irq+0x4/0x80
[ 34.407121] LR is at arc_emac_poll+0x130/0x474
[ 34.411583] pc : [<
c03bb640>] lr : [<
c028fd94>] psr:
60030013
[ 34.411583] sp :
ef05de68 ip :
0008e83c fp :
ef377000
[ 34.423062] r10:
c001bec4 r9 :
00000000 r8 :
f08b24c8
[ 34.428296] r7 :
f08b2400 r6 :
00000075 r5 :
00000019 r4 :
ef376000
[ 34.434827] r3 :
00060000 r2 :
00000042 r1 :
00000001 r0 :
00000000
[ 34.441365] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 34.448507] Control:
10c5387d Table:
8f25c04a DAC:
00000051
[ 34.454262] Process ksoftirqd/0 (pid: 3, stack limit = 0xef05c210)
[ 34.460449] Stack: (0xef05de68 to 0xef05e000)
[ 34.464827] de60:
ef376000 c028fd94 00000000 c0669480 c0669480 ef376520
[ 34.473022] de80:
00000028 00000001 00002ae4 ef376520 ef7b6900 c028fc64 2f158000 ef05dec0
[ 34.481215] dea0:
ef05dec8 0000012c c065e900 c03bdeac ffff983f c0662100 c0663924 00000028
[ 34.489409] dec0:
ef05dec0 ef05dec0 ef05dec8 ef05dec8 ef7b6000 ef05c000 40000003 00000008
[ 34.497600] dee0:
c0695a00 0000000a c066208c 00000100 ef05def8 c0027410 ef7b6000 40000000
[ 34.505795] df00:
04208040 ffff983e 00000004 c0662100 c0662080 00000003 ef05c000 ef027340
[ 34.513985] df20:
ef05c000 c0666c2c 00000000 00000001 00000002 00000000 00000000 c0027568
[ 34.522176] df40:
ef027340 c003ef48 ef027300 00000000 ef027340 c003edd4 00000000 00000000
[ 34.530367] df60:
00000000 c003c37c ffffff7f 00000001 00000000 ef027340 00000000 00030003
[ 34.538559] df80:
ef05df80 ef05df80 00000000 00000000 ef05df90 ef05df90 ef05dfac ef027300
[ 34.546750] dfa0:
c003c2a4 00000000 00000000 c000f578 00000000 00000000 00000000 00000000
[ 34.554939] dfc0:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 34.563129] dfe0:
00000000 00000000 00000000 00000000 00000013 00000000 ffffffff dfff7fff
[ 34.571360] [<
c03bb640>] (__dev_kfree_skb_irq) from [<
c028fd94>] (arc_emac_poll+0x130/0x474)
[ 34.579840] [<
c028fd94>] (arc_emac_poll) from [<
c03bdeac>] (net_rx_action+0xdc/0x28c)
[ 34.587712] [<
c03bdeac>] (net_rx_action) from [<
c0027410>] (__do_softirq+0xcc/0x1f8)
[ 34.595482] [<
c0027410>] (__do_softirq) from [<
c0027568>] (run_ksoftirqd+0x2c/0x50)
[ 34.603168] [<
c0027568>] (run_ksoftirqd) from [<
c003ef48>] (smpboot_thread_fn+0x174/0x18c)
[ 34.611466] [<
c003ef48>] (smpboot_thread_fn) from [<
c003c37c>] (kthread+0xd8/0xec)
[ 34.619075] [<
c003c37c>] (kthread) from [<
c000f578>] (ret_from_fork+0x14/0x3c)
[ 34.626317] Code:
e8bd8010 e3a00000 e12fff1e e92d4010 (
e59030a4)
[ 34.632572] ---[ end trace
cca5a3d86a82249a ]---
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 9 Feb 2016 14:14:43 +0000 (06:14 -0800)]
net: Copy inner L3 and L4 headers as unaligned on GRE TEB
This patch corrects the unaligned accesses seen on GRE TEB tunnels when
generating hash keys. Specifically what this patch does is make it so that
we force the use of skb_copy_bits when the GRE inner headers will be
unaligned due to NET_IP_ALIGNED being a non-zero value.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Feb 2016 20:21:48 +0000 (15:21 -0500)]
Merge branch 'mlx5-fixes'
Saeed Mahameed says:
====================
mlx5 driver fixes for 4.5-rc2
We added here a patch from Matan and Alaa for addressing Linus comments on
the mess w.r.t reserved field names in the driver/firmware auto-generated file.
Once the patch hits linus tree, we'll ask Doug to rebase his tree on that
rc so both net-next and rdma-next development for 4.6 will be done under
the fixed robust form.
Also provided two patches that addresses the dynamic ndo initialization
issue of mlx5e netdevice.
Or and Saeed.
changes from V1: (Only first patch was changed)
In this V we fixed the issues addressed in Or's previous e-mail.
1. Offsets took into account two dimensional u8 arrays
2. Offsets took into account nesting unions and structs
3. Offsets for unions
4. Offsets for any reserved field
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Tue, 9 Feb 2016 12:57:44 +0000 (14:57 +0200)]
net/mlx5e: Use static constant netdevice ndos
Currently our netdevice ops is a one static global variable which
is referenced by all mlx5e netdevice instances. This can be
problematic when different driver instances do not share same
HW capabilities (e.g SRIOV PF and VFs probed to the host).
Now we have two constant global netdevice ops variables, one
for basic netdevice ops and the other with extended SRIOV ops,
on netdevice construction we choose the one suitable for
current device capabilities.
Fixes: 66e49dedada6 ("net/mlx5e: Add support for SR-IOV ndos")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Tue, 9 Feb 2016 12:57:43 +0000 (14:57 +0200)]
net/mlx5e: Remove select queue ndo initialization
Currently mlx5e_select_queue is redundant since num_tc is always 1.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matan Barak [Tue, 9 Feb 2016 12:57:42 +0000 (14:57 +0200)]
net/mlx5: Use offset based reserved field names in the IFC header file
mlx5_ifc.h is a header file representing the API and ABI between
the driver to the firmware and hardware. This file is used from
both the mlx5_ib and mlx5_core drivers.
Previously, this file used incrementing counter to indicate
reserved fields, for example:
struct mlx5_ifc_odp_per_transport_service_cap_bits {
u8 send[0x1];
u8 receive[0x1];
u8 write[0x1];
u8 read[0x1];
u8 reserved_0[0x1];
u8 srq_receive[0x1];
u8 reserved_1[0x1a];
};
If one developer implements through net-next feature A that uses
reserved_0, they replace it with featureA and renames reserved_1 to
reserved_0. In the same kernel cycle, a 2nd developer could implement
feature B through the rdma tree, that uses reserved_1 and split it to
featureB and a smaller reserved_1 field. This will cause a conflict
when the two trees are merged.
The source of this conflict is that the 1st developer changed *all*
reserved fields.
As Linus suggested, we change the layout of structs to:
struct mlx5_ifc_odp_per_transport_service_cap_bits {
u8 send[0x1];
u8 receive[0x1];
u8 write[0x1];
u8 read[0x1];
u8 reserved_at_4[0x1];
u8 srq_receive[0x1];
u8 reserved_at_6[0x1a];
};
This makes the conflicts much more rare and preserves the locality of
changes.
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jay Vosburgh [Mon, 8 Feb 2016 20:10:02 +0000 (12:10 -0800)]
bonding: don't use stale speed and duplex information
There is presently a race condition between the bonding periodic
link monitor and the updating of a slave's speed and duplex. The former
occurs on a periodic basis, and the latter in response to a driver's
calling of netif_carrier_on.
It is possible for the periodic monitor to run between the
driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
event that causes bonding to update the slave's speed and duplex. This
manifests most notably as a report that a slave is up and "0 Mbps full
duplex" after enslavement, but in principle could report an incorrect
speed and duplex after any link up event if the device comes up with a
different speed or duplex. This affects the 802.3ad aggregator
selection, as the speed and duplex are selection criteria.
This is fixed by updating the speed and duplex in the periodic
monitor, prior to using that information.
This was done historically in bonding, but the call to
bond_update_speed_duplex was removed in commit
876254ae2758 ("bonding:
don't call update_speed_duplex() under spinlocks"), as it might sleep
under lock. Later, the locking was changed to only hold RTNL, and so
after commit
876254ae2758 ("bonding: don't call update_speed_duplex()
under spinlocks") this call is again safe.
Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com>
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Acked-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 8 Feb 2016 14:33:42 +0000 (15:33 +0100)]
net: am79c961a: avoid %? in inline assembly
The am79c961a.c driver fails to build with clang because of an
unusual inline assembly construct:
drivers/net/ethernet/amd/am79c961a.c:53:7: error: invalid % escape in inline assembly string
"str%?h %1, [%2] @ NET_RAP\n\t"
The same change has been done a decade ago in arch/arm as of
6a39dd6222dd ("[ARM] 3759/2: Remove uses of %?"), but apparently
some drivers were missed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Robert Jarzmik [Sat, 6 Feb 2016 21:23:20 +0000 (22:23 +0100)]
net: smc91x: propagate irq return code
The smc91x driver doesn't honor the probe deferral mechanism when the
interrupt source is not yet available, such as one provided by a gpio
controller not probed.
Fix this by propagating the platform_get_irq() error code as the probe
return value.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Feb 2016 19:55:36 +0000 (14:55 -0500)]
Merge branch 'bcm7xxx-fixes'
Florian Fainelli says:
====================
Subject: [PATCH net v2 0/4] net: phy: bcm7xxx 40nm PHY fixes
Here is a collection of fixes for the 40nm Ethernet PHY supported
by the 7xxx PHY driver, please also queue these fixes for stable.
Changes in v2:
- dropped the cleanup patch, not appropriate
- added another patch removing bogus wildcard entries
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sat, 6 Feb 2016 20:58:51 +0000 (12:58 -0800)]
net: phy: bcm7xxx: Remove wildcard entries
Remove the two wildcard entries, they serve no purpose and will match way too
many devices, some of them being covered by the driver in
drivers/net/phy/broadcom.c. Remove the now unused bcm7xxx_dummy_config_init()
function which would produce a warning.
Fixes: b560a58c45c6 ("net: phy: add Broadcom BCM7xxx internal PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sat, 6 Feb 2016 20:58:50 +0000 (12:58 -0800)]
net: phy: bcm7xxx: Fix bcm7xxx_config_init() check
Since we were wrongly advertising gigabit features for these 10/100 only
Ethernet PHYs, bcm7xxx_config_init() which is supposed to apply workaround
would have not run since the check would be true, now that we have fixed the
PHY features, remove that check since it has no reasoning to be there anymore.
Fixes: e18556ee3bd83 ("net: phy: bcm7xxx: do not use PHY_BRCM_100MBPS_WAR")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sat, 6 Feb 2016 20:58:49 +0000 (12:58 -0800)]
net: phy: bcm7xxx: Fix 40nm EPHY features
The PHY entries for BCM7425/29/35 declare the 40nm Ethernet PHY as being
10/100/1000 capable, while this is just a 10/100 capable PHY device, fix that.
Fixes: d068b02cfdfc2 ("net: phy: add BCM7425 and BCM7429 PHYs")
Fixes: 9458ceab4917 ("net: phy: bcm7xxx: Add entry for BCM7435")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sat, 6 Feb 2016 20:58:48 +0000 (12:58 -0800)]
net: phy: bcm7xxx: Fix shadow mode 2 disabling
The clear and set masks in the call to phy_set_clr_bits() called from
bcm7xxx_config_init() are inverted. We need to fix this by swapping the two
arguments, that is, set 0 bits, but clear the shade mode 2 enable bit.
Fixes: b560a58c45c66 ("net: phy: add Broadcom BCM7xxx internal PHY driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Feb 2016 19:53:00 +0000 (14:53 -0500)]
Merge branch 'ravb-fixes'
Sergei Shtylyov says:
====================
ravb: fix the fallout of R-Car gen3 gPTP support
Here's a set of 2 patches against DaveM's 'net.git' repo fixing up the
incomplete commit
f5d7837f96e5 ("ravb: ptp: Add CONFIG mode support").
I'm proposing these as fixes but they can be merged as cleanups as well...
[1/2] ravb: kill duplicate setting of CCC.CSEL
[2/2] ravb: skip gPTP start/stop on R-Car gen3
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 6 Feb 2016 14:47:22 +0000 (17:47 +0300)]
ravb: skip gPTP start/stop on R-Car gen3
When adding support for the R-Car gen3 gPTP active in configuration mode,
some call sites of ravb_ptp_{init|stop}() were missed due to an oversight.
Add checks for the R-Car gen2 SoCs around these...
Fixes: f5d7837f96e5 ("ravb: ptp: Add CONFIG mode support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sat, 6 Feb 2016 14:46:35 +0000 (17:46 +0300)]
ravb: kill duplicate setting of CCC.CSEL
When adding support for the R-Car gen3 gPTP active in configuration mode,
the code setting the CCC.CSEL field was duplicated due to an oversight.
For R-Car gen 2 it's just redundant and for R-Car gen3 the write at this
time is probably ignored due to CCC.GAC bit being already set...
Fixes: f5d7837f96e5 ("ravb: ptp: Add CONFIG mode support")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 16 Feb 2016 17:56:00 +0000 (12:56 -0500)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contain a rather large batch for your net that
includes accumulated bugfixes, they are:
1) Run conntrack cleanup from workqueue process context to avoid hitting
soft lockup via watchdog for large tables. This is required by the
IPv6 masquerading extension. From Florian Westphal.
2) Use original skbuff from nfnetlink batch when calling netlink_ack()
on error since this needs to access the skb->sk pointer.
3) Incremental fix on top of recent Sasha Levin's lock fix for conntrack
resizing.
4) Fix several problems in nfnetlink batch message header sanitization
and error handling, from Phil Turnbull.
5) Select NF_DUP_IPV6 based on CONFIG_IPV6, from Arnd Bergmann.
6) Fix wrong signess in return values on nf_tables counter expression,
from Anton Protopopov.
Due to the NetDev 1.1 organization burden, I had no chance to pass up
this to you any sooner in this release cycle, sorry about that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rainer Weikusat [Thu, 11 Feb 2016 19:37:27 +0000 (19:37 +0000)]
af_unix: Guard against other == sk in unix_dgram_sendmsg
The unix_dgram_sendmsg routine use the following test
if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
to determine if sk and other are in an n:1 association (either
established via connect or by using sendto to send messages to an
unrelated socket identified by address). This isn't correct as the
specified address could have been bound to the sending socket itself or
because this socket could have been connected to itself by the time of
the unix_peer_get but disconnected before the unix_state_lock(other). In
both cases, the if-block would be entered despite other == sk which
might either block the sender unintentionally or lead to trying to unlock
the same spin lock twice for a non-blocking send. Add a other != sk
check to guard against this.
Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue")
Reported-By: Philipp Hahn <pmhahn@pmhahn.de>
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Tested-by: Philipp Hahn <pmhahn@pmhahn.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rainer Weikusat [Mon, 8 Feb 2016 18:47:19 +0000 (18:47 +0000)]
af_unix: Don't set err in unix_stream_read_generic unless there was an error
The present unix_stream_read_generic contains various code sequences of
the form
err = -EDISASTER;
if (<test>)
goto out;
This has the unfortunate side effect of possibly causing the error code
to bleed through to the final
out:
return copied ? : err;
and then to be wrongly returned if no data was copied because the caller
didn't supply a data buffer, as demonstrated by the program available at
http://pad.lv/
1540731
Change it such that err is only set if an error condition was detected.
Fixes: 3822b5c2fc62 ("af_unix: Revert 'lock_interruptible' in stream receive code")
Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael McConville [Sat, 6 Feb 2016 01:46:25 +0000 (20:46 -0500)]
dscc4: Undefined signed int shift
My analysis in the below mail applies, although the second part is
unnecessary because i isn't used in arithmetic operations here:
https://marc.info/?l=openbsd-tech&m=
145377854103866&w=2
Thanks for your time.
Signed-off-by: Michael McConville <mmcco@mykolab.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 5 Feb 2016 19:07:14 +0000 (14:07 -0500)]
net: dsa: mv88e6xxx: do not leave reserved VLANs
BRIDGE_VLAN_FILTERING automatically adds a newly bridged port to the
VLAN with the bridge's default_pvid.
The mv88e6xxx driver currently reserves VLANs 4000+ for unbridged ports
isolation. When a port joins a bridge, it leaves its reserved VLAN. When
a port leaves a bridge, it joins again its reserved VLAN.
But if the VLAN filtering is disabled, or if this hardware VLAN is
already in use, the bridged port ends up with no default VLAN, and the
communication with the CPU is thus broken.
To fix this, make a port join its reserved VLAN once on setup, never
leave it, and restore its PVID after another one was eventually used.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 5 Feb 2016 19:04:39 +0000 (14:04 -0500)]
net: dsa: mv88e6xxx: fix software VLAN deletion
The current bridge code calls switchdev_port_obj_del on a VLAN port even
if the corresponding switchdev_port_obj_add call returned -EOPNOTSUPP.
If the DSA driver doesn't return -EOPNOTSUPP for a software port VLAN in
its port_vlan_del function, the VLAN is not deleted. Unbridging the port
also generates a stack trace for the same reason.
This can be quickly tested on a VLAN filtering enabled system with:
# brctl addbr br0
# brctl addif br0 lan0
# brctl addbr br1
# brctl addif br1 lan1
# brctl delif br1 lan1
Both bridges have a default default_pvid set to 1. lan0 uses the
hardware VLAN 1 while lan1 falls back to the software VLAN 1.
Unbridging lan1 does not delete its software VLAN, and thus generates
the following stack trace:
[ 2991.681705] device lan1 left promiscuous mode
[ 2991.686237] br1: port 1(lan1) entered disabled state
[ 2991.725094] ------------[ cut here ]------------
[ 2991.729761] WARNING: CPU: 0 PID: 869 at net/bridge/br_vlan.c:314 __vlan_group_free+0x4c/0x50()
[ 2991.738437] Modules linked in:
[ 2991.741546] CPU: 0 PID: 869 Comm: ip Not tainted 4.4.0 #16
[ 2991.747039] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
[ 2991.753511] Backtrace:
[ 2991.756008] [<
80014450>] (dump_backtrace) from [<
8001469c>] (show_stack+0x20/0x24)
[ 2991.763604] r6:
80512644 r5:
00000009 r4:
00000000 r3:
00000000
[ 2991.769343] [<
8001467c>] (show_stack) from [<
80268e44>] (dump_stack+0x24/0x28)
[ 2991.776618] [<
80268e20>] (dump_stack) from [<
80025568>] (warn_slowpath_common+0x98/0xc4)
[ 2991.784750] [<
800254d0>] (warn_slowpath_common) from [<
80025650>] (warn_slowpath_null+0x2c/0x34)
[ 2991.793557] r8:
00000000 r7:
9f786a8c r6:
9f76c440 r5:
9f786a00 r4:
9f68ac00
[ 2991.800366] [<
80025624>] (warn_slowpath_null) from [<
80512644>] (__vlan_group_free+0x4c/0x50)
[ 2991.808946] [<
805125f8>] (__vlan_group_free) from [<
80514488>] (nbp_vlan_flush+0x44/0x68)
[ 2991.817147] r4:
9f68ac00 r3:
9ec70000
[ 2991.820772] [<
80514444>] (nbp_vlan_flush) from [<
80506f08>] (del_nbp+0xac/0x130)
[ 2991.828201] r5:
9f56f800 r4:
9f786a00
[ 2991.831841] [<
80506e5c>] (del_nbp) from [<
8050774c>] (br_del_if+0x40/0xbc)
[ 2991.838724] r7:
80590f68 r6:
00000000 r5:
9ec71c38 r4:
9f76c440
[ 2991.844475] [<
8050770c>] (br_del_if) from [<
80503dc0>] (br_del_slave+0x1c/0x20)
[ 2991.851802] r5:
9ec71c38 r4:
9f56f800
[ 2991.855428] [<
80503da4>] (br_del_slave) from [<
80484a34>] (do_setlink+0x324/0x7b8)
[ 2991.863043] [<
80484710>] (do_setlink) from [<
80485e90>] (rtnl_newlink+0x508/0x6f4)
[ 2991.870616] r10:
00000000 r9:
9ec71ba8 r8:
00000000 r7:
00000000 r6:
9f6b0400 r5:
9f56f800
[ 2991.878548] r4:
8076278c
[ 2991.881110] [<
80485988>] (rtnl_newlink) from [<
80484048>] (rtnetlink_rcv_msg+0x18c/0x22c)
[ 2991.889315] r10:
9f7d4e40 r9:
00000000 r8:
00000000 r7:
00000000 r6:
9f7d4e40 r5:
9f6b0400
[ 2991.897250] r4:
00000000
[ 2991.899814] [<
80483ebc>] (rtnetlink_rcv_msg) from [<
80497c74>] (netlink_rcv_skb+0xb0/0xcc)
[ 2991.908104] r8:
00000000 r7:
9f7d4e40 r6:
9f7d4e40 r5:
80483ebc r4:
9f6b0400
[ 2991.914928] [<
80497bc4>] (netlink_rcv_skb) from [<
80483eb4>] (rtnetlink_rcv+0x34/0x3c)
[ 2991.922874] r6:
9f5ea000 r5:
00000028 r4:
9f7d4e40 r3:
80483e80
[ 2991.928622] [<
80483e80>] (rtnetlink_rcv) from [<
80497604>] (netlink_unicast+0x180/0x200)
[ 2991.936742] r4:
9f4edc00 r3:
80483e80
[ 2991.940362] [<
80497484>] (netlink_unicast) from [<
80497a88>] (netlink_sendmsg+0x33c/0x350)
[ 2991.948648] r8:
00000000 r7:
00000028 r6:
00000000 r5:
9f5ea000 r4:
9ec71f4c
[ 2991.955481] [<
8049774c>] (netlink_sendmsg) from [<
80457ff0>] (sock_sendmsg+0x24/0x34)
[ 2991.963342] r10:
00000000 r9:
9ec71e28 r8:
00000000 r7:
9f1e2140 r6:
00000000 r5:
00000000
[ 2991.971276] r4:
9ec71f4c
[ 2991.973849] [<
80457fcc>] (sock_sendmsg) from [<
80458af0>] (___sys_sendmsg+0x1fc/0x204)
[ 2991.981809] [<
804588f4>] (___sys_sendmsg) from [<
804598d0>] (__sys_sendmsg+0x4c/0x7c)
[ 2991.989640] r10:
00000000 r9:
9ec70000 r8:
80010824 r7:
00000128 r6:
7ee946c4 r5:
00000000
[ 2991.997572] r4:
9f1e2140
[ 2992.000128] [<
80459884>] (__sys_sendmsg) from [<
80459918>] (SyS_sendmsg+0x18/0x1c)
[ 2992.007725] r6:
00000000 r5:
7ee9c7b8 r4:
7ee946e0
[ 2992.012430] [<
80459900>] (SyS_sendmsg) from [<
80010660>] (ret_fast_syscall+0x0/0x3c)
[ 2992.020182] ---[ end trace
5d4bc29f4da04280 ]---
To fix this, return -EOPNOTSUPP in _mv88e6xxx_port_vlan_del instead of
-ENOENT if the hardware VLAN doesn't exist or the port is not a member.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Fri, 5 Feb 2016 16:30:39 +0000 (16:30 +0000)]
net: cavium: liquidio: fix check for in progress flag
smatch detected a suspicious looking bitop condition:
drivers/net/ethernet/cavium/liquidio/lio_main.c:2529
handle_timestamp() warn: suspicious bitop condition
(skb_shinfo(skb)->tx_flags | SKBTX_IN_PROGRESS is always non-zero,
so the logic is definitely not correct. Use & to mask the correct
bit.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vitaly Kuznetsov [Fri, 5 Feb 2016 16:29:08 +0000 (17:29 +0100)]
hv_netvsc: Restore needed_headroom request
Commit
c0eb454034aa ("hv_netvsc: Don't ask for additional head room in the
skb") got rid of needed_headroom setting for the driver. With the change I
hit the following issue trying to use ptkgen module:
[ 57.522021] kernel BUG at net/core/skbuff.c:1128!
[ 57.522021] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
...
[ 58.721068] Call Trace:
[ 58.721068] [<
ffffffffa0144e86>] netvsc_start_xmit+0x4c6/0x8e0 [hv_netvsc]
...
[ 58.721068] [<
ffffffffa02f87fc>] ? pktgen_finalize_skb+0x25c/0x2a0 [pktgen]
[ 58.721068] [<
ffffffff814f5760>] ? __netdev_alloc_skb+0xc0/0x100
[ 58.721068] [<
ffffffffa02f9907>] pktgen_thread_worker+0x257/0x1920 [pktgen]
Basically, we're calling skb_cow_head(skb, RNDIS_AND_PPI_SIZE) and crash on
if (skb_shared(skb))
BUG();
We probably need to restore needed_headroom setting (but shrunk to
RNDIS_AND_PPI_SIZE as we don't need more) to request the required headroom
space. In theory, it should not give us performance penalty.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 13 Feb 2016 11:02:25 +0000 (06:02 -0500)]
Merge branch 'mvneta-fixes'
Gregory CLEMENT says:
====================
mvneta fixes for SMP
Following this bug report:
http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173 and the
suggestions from Russell King, I reviewed all the code involving
multi-CPU. It ended with this series of patches which should improve
the stability of the driver.
During my test I found another bug which is fixed by new patch (the
second one of this new version of the series)
The two first patches fix real bugs, the others fix potential issues
in the driver.
Changelog:
v1 -> v2
Fix spinlock comment. Pointed by David Miller
v2 -> v3
- Fix typos and mistake in the comments. Pointed by Sergei Shtylyov
- Add a new patch fixing the CPU choice in mvneta_percpu_elect
- Use lock in last patch to prevent remaining race condition. Pointed
by Jisheng
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:29 +0000 (22:09 +0100)]
net: mvneta: Fix race condition during stopping
When stopping the port, the CPU notifier are still there whereas the
mvneta_stop_dev function calls mvneta_percpu_disable() on each CPUs.
It was possible to have a new CPU coming at this point which could be
racy.
This patch adds a flag preventing executing the code notifier for a new
CPU when the port is stopping. It also uses the spinlock introduces
previously. To avoid the deadlock, the lock has been moved outside the
mvneta_percpu_elect function.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:28 +0000 (22:09 +0100)]
net: mvneta: The mvneta_percpu_elect function should be atomic
Electing a CPU must be done in an atomic way: it should be done after or
before the removal/insertion of a CPU and this function is not reentrant.
During the loop of mvneta_percpu_elect we associates the queues to the
CPUs, if there is a topology change during this loop, then the mapping
between the CPUs and the queues could be wrong. During this loop the
interrupt mask is also updating for each CPUs, It should not be changed
in the same time by other part of the driver.
This patch adds spinlock to create the needed critical sections.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:27 +0000 (22:09 +0100)]
net: mvneta: Modify the queue related fields from each cpu
In the MVNETA_INTR_* registers, the queues related fields are per cpu,
according to the datasheet (comment in [] are added by me):
"In a multi-CPU system, bits of RX[or TX] queues for which the access by
the reading[or writing] CPU is disabled are read as 0, and cannot be
cleared[or written]."
That means that each time we want to manipulate these bits we had to do
it on each cpu and not only on the current cpu.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:26 +0000 (22:09 +0100)]
net: mvneta: Remove unused code
Since the commit
2dcf75e2793c ("net: mvneta: Associate RX queues with
each CPU") all the percpu irq are used and disabled at initialization, so
there is no point to disable them first.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:25 +0000 (22:09 +0100)]
net: mvneta: Use on_each_cpu when possible
Instead of using a for_each_* loop in which we just call the
smp_call_function_single macro, it is more simple to directly use the
on_each_cpu macro. Moreover, this macro ensures that the calls will be
done all at once.
Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:24 +0000 (22:09 +0100)]
net: mvneta: Fix the CPU choice in mvneta_percpu_elect
When passing to the management of multiple RX queue, the
mvneta_percpu_elect function was broken. The use of the modulo can lead
to elect the wrong cpu. For example with rxq_def=2, if the CPU 2 goes
offline and then online, we ended with the third RX queue activated in
the same time on CPU 0 and CPU2, which lead to a kernel crash.
With this fix, we don't try to get "the closer" CPU if the default CPU is
gone, now we just use CPU 0 which always be there. Thanks to this, the
code becomes more readable, easier to maintain and more predicable.
Cc: stable@vger.kernel.org
Fixes: 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU")
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gregory CLEMENT [Thu, 4 Feb 2016 21:09:23 +0000 (22:09 +0100)]
net: mvneta: Fix for_each_present_cpu usage
This patch convert the for_each_present in on_each_cpu, instead of
applying on the present cpus it will be applied only on the online cpus.
This fix a bug reported on
http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173.
Using the macro on_each_cpu (instead of a for_each_* loop) also ensures
that all the calls will be done all at once.
Fixes: f86428854480 ("net: mvneta: Statically assign queues to CPUs")
Reported-by: Stefan Roese <stefan.roese@gmail.com>
Suggested-by: Jisheng Zhang <jszhang@marvell.com>
Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Laura Abbott [Thu, 4 Feb 2016 18:50:45 +0000 (10:50 -0800)]
vsock: Fix blocking ops call in prepare_to_wait
We receoved a bug report from someone using vmware:
WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389
__might_sleep+0x7d/0x90()
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<
ffffffff810fa68d>] prepare_to_wait+0x2d/0x90
Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi
snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi
snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm
snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc
parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs
xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx
drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic
mptbase pata_acpi
CPU: 3 PID: 660 Comm: vmtoolsd Not tainted
4.2.0-0.rc1.git3.1.fc23.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop
Reference Platform, BIOS 6.00 05/20/2014
0000000000000000 0000000049e617f3 ffff88006ac37ac8 ffffffff818641f5
0000000000000000 ffff88006ac37b20 ffff88006ac37b08 ffffffff810ab446
ffff880068009f40 ffffffff81c63bc0 0000000000000061 0000000000000000
Call Trace:
[<
ffffffff818641f5>] dump_stack+0x4c/0x65
[<
ffffffff810ab446>] warn_slowpath_common+0x86/0xc0
[<
ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70
[<
ffffffff8112551d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[<
ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90
[<
ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90
[<
ffffffff810da2bd>] __might_sleep+0x7d/0x90
[<
ffffffff812163b3>] __might_fault+0x43/0xa0
[<
ffffffff81430477>] copy_from_iter+0x87/0x2a0
[<
ffffffffa039460a>] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci]
[<
ffffffffa0394740>] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci]
[<
ffffffffa0394757>] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci]
[<
ffffffffa0394d50>] qp_enqueue_locked+0xa0/0x140 [vmw_vmci]
[<
ffffffffa039593f>] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci]
[<
ffffffffa04847bb>] vmci_transport_stream_enqueue+0x1b/0x20
[vmw_vsock_vmci_transport]
[<
ffffffffa047ae05>] vsock_stream_sendmsg+0x2c5/0x320 [vsock]
[<
ffffffff810fabd0>] ? wake_atomic_t_function+0x70/0x70
[<
ffffffff81702af8>] sock_sendmsg+0x38/0x50
[<
ffffffff81702ff4>] SYSC_sendto+0x104/0x190
[<
ffffffff8126e25a>] ? vfs_read+0x8a/0x140
[<
ffffffff817042ee>] SyS_sendto+0xe/0x10
[<
ffffffff8186d9ae>] entry_SYSCALL_64_fastpath+0x12/0x76
transport->stream_enqueue may call copy_to_user so it should
not be called inside a prepare_to_wait. Narrow the scope of
the prepare_to_wait to avoid the bad call. This also applies
to vsock_stream_recvmsg as well.
Reported-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chun-Hao Lin [Thu, 4 Feb 2016 18:28:00 +0000 (02:28 +0800)]
r8169:fix system hange problem.
There are typos in setting RTL8168H hardware parameters. If system install
another version driver that may cuase system hang.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 4 Feb 2016 14:23:28 +0000 (06:23 -0800)]
ipv4: fix memory leaks in ip_cmsg_send() callers
Dmitry reported memory leaks of IP options allocated in
ip_cmsg_send() when/if this function returns an error.
Callers are responsible for the freeing.
Many thanks to Dmitry for the report and diagnostic.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amitoj Kaur Chawla [Thu, 4 Feb 2016 13:55:26 +0000 (19:25 +0530)]
net: mvpp2: Return correct error codes
The return value of kzalloc on failure of allocation of memory should
be -ENOMEM and not -1.
Found using Coccinelle. A simplified version of the semantic patch
used is:
//<smpl>
@@
expression *e;
position p,q;
@@
e@q = kzalloc(...);
if@p (e == NULL) {
...
return
- -1
+ -ENOMEM
;
}
//</smpl>
This function may also return -1 after calling mpp2_prs_tcam_port_map_get.
So that the function consistently returns meaningful error values on
failure, the -1 is changed to -EINVAL.
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amitoj Kaur Chawla [Thu, 4 Feb 2016 13:55:13 +0000 (19:25 +0530)]
net: cavium: liquidio: Return correct error code
The return value of vmalloc on failure of allocation of memory should
be -ENOMEM and not -1.
Found using Coccinelle. A simplified version of the semantic patch
used is:
//<smpl>
@@
expression *e;
identifier l1;
position p,q;
@@
e@q = vmalloc(...);
if@p (e == NULL) {
...
goto l1;
}
l1:
...
return -1
+ -ENOMEM
;
//</smpl
The single call site of the containing function checks whether the
returned value is -1, so this check is changed as well. The single call
site of this call site, however, only checks whether the value is not 0,
so no further change was required.
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jay Vosburgh [Tue, 2 Feb 2016 21:35:56 +0000 (13:35 -0800)]
bonding: Fix ARP monitor validation
The current logic in bond_arp_rcv will accept an incoming ARP for
validation if (a) the receiving slave is either "active" (which includes
the currently active slave, or the current ARP slave) or, (b) there is a
currently active slave, and it has received an ARP since it became active.
For case (b), the receiving slave isn't the currently active slave, and is
receiving the original broadcast ARP request, not an ARP reply from the
target.
This logic can fail if there is no currently active slave. In
this situation, the ARP probe logic cycles through all slaves, assigning
each in turn as the "current_arp_slave" for one arp_interval, then setting
that one as "active," and sending an ARP probe from that slave. The
current logic expects the ARP reply to arrive on the sending
current_arp_slave, however, due to switch FDB updating delays, the reply
may be directed to another slave.
This can arise if the bonding slaves and switch are working, but
the ARP target is not responding. When the ARP target recovers, a
condition may result wherein the ARP target host replies faster than the
switch can update its forwarding table, causing each ARP reply to be sent
to the previous current_arp_slave. This will never pass the logic in
bond_arp_rcv, as neither of the above conditions (a) or (b) are met.
Some experimentation on a LAN shows ARP reply round trips in the
200 usec range, but my available switches never update their FDB in less
than 4000 usec.
This patch changes the logic in bond_arp_rcv to additionally
accept an ARP reply for validation on any slave if there is a current ARP
slave and it sent an ARP probe during the previous arp_interval.
Fixes: aeea64ac717a ("bonding: don't trust arp requests unless active slave really works")
Cc: Veaceslav Falico <vfalico@gmail.com>
Cc: Andy Gospodarek <gospo@cumulusnetworks.com>
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 11 Feb 2016 19:25:55 +0000 (11:25 -0800)]
Merge tag 'gpio-v4.5-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
- Probe errorpath fix for the Altera
- irqchip ofnode pointer added to the DaVinci driver
- controller instance number correction for DaVinci
* tag 'gpio-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: davinci: Fix the number of controllers allocated
gpio: davinci: Add the missing of-node pointer
gpio: gpio-altera: Remove gpiochip on probe failure.
Linus Torvalds [Thu, 11 Feb 2016 19:17:19 +0000 (11:17 -0800)]
Merge tag 'platform-drivers-x86-v4.5-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86
Pull x86 platform driver fixes from Darren Hart:
"Just two small fixes for the 4.5-rc cycle:
intel_scu_ipcutil:
- underflow in scu_reg_access()
intel-hid:
- fix incorrect entries in intel_hid_keymap"
* tag 'platform-drivers-x86-v4.5-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
intel_scu_ipcutil: underflow in scu_reg_access()
intel-hid: fix incorrect entries in intel_hid_keymap
Linus Torvalds [Thu, 11 Feb 2016 19:00:34 +0000 (11:00 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix BPF handling of branch offset adjustmnets on backjumps, from
Daniel Borkmann.
2) Make sure selinux knows about SOCK_DESTROY netlink messages, from
Lorenzo Colitti.
3) Fix openvswitch tunnel mtu regression, from David Wragg.
4) Fix ICMP handling of TCP sockets in syn_recv state, from Eric
Dumazet.
5) Fix SCTP user hmacid byte ordering bug, from Xin Long.
6) Fix recursive locking in ipv6 addrconf, from Subash Abhinov
Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
bpf: fix branch offset adjustment on backjumps after patching ctx expansion
vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices
geneve: Relax MTU constraints
vxlan: Relax MTU constraints
flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen
of: of_mdio: Add marvell,
88e1145 to whitelist of PHY compatibilities.
selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables
sctp: translate network order to host order when users get a hmacid
enic: increment devcmd2 result ring in case of timeout
tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs
net:Add sysctl_max_skb_frags
tcp: do not drop syn_recv on all icmp reports
ipv6: fix a lockdep splat
unix: correctly track in-flight fds in sending process user_struct
update be2net maintainers' email addresses
dwc_eth_qos: Reset hardware before PHY start
ipv6: addrconf: Fix recursive spin lock call
Linus Torvalds [Wed, 10 Feb 2016 23:11:08 +0000 (15:11 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"A few more minor fixes for rc3:
- One fix to ipoib
- One fix to core sysfs code
- Four patches that resolve an oops found in testing of ocrdma and a
couple other ocrdma issues"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
RDMA/ocrdma: Fixing ocrdma debugfs directory remove
RDMA/ocrdma: Fix pkey_index returned by driver in rq work completion
RDMA/ocrdma: populate max_sge_rd in device attributes
RDMA/ocrdma: Initialize stats resources in the driver before ib device registration.
IB/sysfs: remove unused va_list args
IB/IPoIB: Do not set skb truesize since using one linearskb
Daniel Borkmann [Wed, 10 Feb 2016 15:47:11 +0000 (16:47 +0100)]
bpf: fix branch offset adjustment on backjumps after patching ctx expansion
When ctx access is used, the kernel often needs to expand/rewrite
instructions, so after that patching, branch offsets have to be
adjusted for both forward and backward jumps in the new eBPF program,
but for backward jumps it fails to account the delta. Meaning, for
example, if the expansion happens exactly on the insn that sits at
the jump target, it doesn't fix up the back jump offset.
Analysis on what the check in adjust_branches() is currently doing:
/* adjust offset of jmps if necessary */
if (i < pos && i + insn->off + 1 > pos)
insn->off += delta;
else if (i > pos && i + insn->off + 1 < pos)
insn->off -= delta;
First condition (forward jumps):
Before: After:
insns[0] insns[0]
insns[1] <--- i/insn insns[1] <--- i/insn
insns[2] <--- pos insns[P] <--- pos
insns[3] insns[P] `------| delta
insns[4] <--- target_X insns[P] `-----|
insns[5] insns[3]
insns[4] <--- target_X
insns[5]
First case is if we cross pos-boundary and the jump instruction was
before pos. This is handeled correctly. I.e. if i == pos, then this
would mean our jump that we currently check was the patchlet itself
that we just injected. Since such patchlets are self-contained and
have no awareness of any insns before or after the patched one, the
delta is correctly not adjusted. Also, for the second condition in
case of i + insn->off + 1 == pos, means we jump to that newly patched
instruction, so no offset adjustment are needed. That part is correct.
Second condition (backward jumps):
Before: After:
insns[0] insns[0]
insns[1] <--- target_X insns[1] <--- target_X
insns[2] <--- pos <-- target_Y insns[P] <--- pos <-- target_Y
insns[3] insns[P] `------| delta
insns[4] <--- i/insn insns[P] `-----|
insns[5] insns[3]
insns[4] <--- i/insn
insns[5]
Second interesting case is where we cross pos-boundary and the jump
instruction was after pos. Backward jump with i == pos would be
impossible and pose a bug somewhere in the patchlet, so the first
condition checking i > pos is okay only by itself. However, i +
insn->off + 1 < pos does not always work as intended to trigger the
adjustment. It works when jump targets would be far off where the
delta wouldn't matter. But, for example, where the fixed insn->off
before pointed to pos (target_Y), it now points to pos + delta, so
that additional room needs to be taken into account for the check.
This means that i) both tests here need to be adjusted into pos + delta,
and ii) for the second condition, the test needs to be <= as pos
itself can be a target in the backjump, too.
Fixes: 9bac3d6d548e ("bpf: allow extended BPF programs access skb fields")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 10 Feb 2016 20:21:57 +0000 (12:21 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
"Just small driver fixups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: colibri-vf50-ts - add missing #include <linux/of.h>
Input: adp5589 - fix row 5 handling for adp5589
Input: edt-ft5x06 - fix setting gain, offset, and threshold via device tree
Input: vmmouse - fix absolute device registration
Input: serio - drop warnings in case of EPROBE_DEFER from serio_find_driver()
Input: cap11xx - add missing of_node_put
Input: sirfsoc-onkey - allow modular build
Input: xpad - remove unused function
Linus Torvalds [Wed, 10 Feb 2016 20:04:59 +0000 (12:04 -0800)]
Merge branch 'for-4.5-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
- PORTS_IMPL workaround for very early ahci controllers is misbehaving
on new systems. Disabled on recent ahci versions.
- Old-style PIO state machine had a horrible locking problem. Don't
know how we've been getting away this far. Fixed.
- Other device specific updates.
* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: Intel DNV device IDs SATA
libata: fix sff host state machine locking while polling
libata-sff: use WARN instead of BUG on illegal host state machine state
libata: disable forced PORTS_IMPL for >= AHCI 1.3
libata: blacklist a Viking flash model for MWDMA corruption
drivers: ata: wake port before DMA stop for ALPM
Linus Torvalds [Wed, 10 Feb 2016 19:36:19 +0000 (11:36 -0800)]
Merge branch 'for-4.5-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- The destruction path of cgroup objects are asynchronous and
multi-staged and some of them ended up destroying parents before
children leading to failures in cpu and memory controllers. Ensure
that parents are always destroyed after children.
- cpuset mm node migration was performed synchronously while holding
threadgroup and cgroup mutexes and the recent threadgroup locking
update resulted in a possible deadlock. The migration is best effort
and shouldn't have been performed under those locks to begin with.
Made asynchronous.
- Minor documentation fix.
* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
Documentation: cgroup: Fix 'cgroup-legacy' -> 'cgroup-v1'
cgroup: make sure a parent css isn't freed before its children
cgroup: make sure a parent css isn't offlined before its children
cpuset: make mm migration asynchronous
Linus Torvalds [Wed, 10 Feb 2016 19:04:05 +0000 (11:04 -0800)]
Merge branch 'for-4.5-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"Workqueue fixes for v4.5-rc3.
- Remove a spurious triggering of flush dependency warning.
- Officially break local execution guarantee of unbound work items
and add a debug feature to flush out usages which depend on it.
- Work around CPU -> NODE mapping becoming invalid on CPU offline.
The branch is young but pushing out early as stable kernels are being
affected"
* 'for-4.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
workqueue: implement "workqueue.debug_force_rr_cpu" debug feature
workqueue: schedule WORK_CPU_UNBOUND work on wq_unbound_cpumask CPUs
Revert "workqueue: make sure delayed work run in local cpu"
workqueue: skip flush dependency checks for legacy workqueues
Tejun Heo [Wed, 3 Feb 2016 18:54:25 +0000 (13:54 -0500)]
workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
When looking up the pool_workqueue to use for an unbound workqueue,
workqueue assumes that the target CPU is always bound to a valid NUMA
node. However, currently, when a CPU goes offline, the mapping is
destroyed and cpu_to_node() returns NUMA_NO_NODE.
This has always been broken but hasn't triggered often enough before
874bbfe600a6 ("workqueue: make sure delayed work run in local cpu").
After the commit, workqueue forcifully assigns the local CPU for
delayed work items without explicit target CPU to fix a different
issue. This widens the window where CPU can go offline while a
delayed work item is pending causing delayed work items dispatched
with target CPU set to an already offlined CPU. The resulting
NUMA_NO_NODE mapping makes workqueue try to queue the work item on a
NULL pool_workqueue and thus crash.
While
874bbfe600a6 has been reverted for a different reason making the
bug less visible again, it can still happen. Fix it by mapping
NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node().
This is a temporary workaround. The long term solution is keeping CPU
-> NODE mapping stable across CPU off/online cycles which is being
worked on.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com
Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.com
Alexandra Yates [Fri, 5 Feb 2016 23:27:49 +0000 (15:27 -0800)]
ahci: Intel DNV device IDs SATA
Adding Intel codename DNV platform device IDs for SATA.
Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
David S. Miller [Wed, 10 Feb 2016 10:50:16 +0000 (05:50 -0500)]
Merge branch 'ovs-tunnel-mtu'
David Wragg says:
====================
Set a large MTU on ovs-created tunnel devices
Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
transmit vxlan packets of any size, constrained only by the ability to
send out the resulting packets. 4.3 introduced netdevs corresponding
to tunnel vports. These netdevs have an MTU, which limits the size of
a packet that can be successfully encapsulated. The default MTU
values are low (1500 or less), which is awkwardly small in the context
of physical networks supporting jumbo frames, and leads to a
conspicuous change in behaviour for userspace.
This patch series sets the MTU on openvswitch-created netdevs to be
the relevant maximum (i.e. the maximum IP packet size minus any
relevant overhead), effectively restoring the behaviour prior to 4.3.
Where relevant, the limits on MTU values that can be directly set on
the netdevs are also relaxed.
Changes in v2:
* Extend to all openvswitch tunnel types, i.e. gre and geneve as well
* Use IP_MAX_MTU
Changes in v3:
* Fix block comment style
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Wragg [Wed, 10 Feb 2016 00:05:58 +0000 (00:05 +0000)]
vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices
Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could
transmit vxlan packets of any size, constrained only by the ability to
send out the resulting packets. 4.3 introduced netdevs corresponding
to tunnel vports. These netdevs have an MTU, which limits the size of
a packet that can be successfully encapsulated. The default MTU
values are low (1500 or less), which is awkwardly small in the context
of physical networks supporting jumbo frames, and leads to a
conspicuous change in behaviour for userspace.
Instead, set the MTU on openvswitch-created netdevs to be the relevant
maximum (i.e. the maximum IP packet size minus any relevant overhead),
effectively restoring the behaviour prior to 4.3.
Signed-off-by: David Wragg <david@weave.works>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Wragg [Wed, 10 Feb 2016 00:05:57 +0000 (00:05 +0000)]
geneve: Relax MTU constraints
Allow the MTU of geneve devices to be set to large values, in order to
exploit underlying networks with larger frame sizes.
GENEVE does not have a fixed encapsulation overhead (an openvswitch
rule can add variable length options), so there is no relevant maximum
MTU to enforce. A maximum of IP_MAX_MTU is used instead.
Encapsulated packets that are too big for the underlying network will
get dropped on the floor.
Signed-off-by: David Wragg <david@weave.works>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Wragg [Wed, 10 Feb 2016 00:05:55 +0000 (00:05 +0000)]
vxlan: Relax MTU constraints
Allow the MTU of vxlan devices without an underlying device to be set
to larger values (up to a maximum based on IP packet limits and vxlan
overhead).
Previously, their MTUs could not be set to higher than the
conventional ethernet value of 1500. This is a very arbitrary value
in the context of vxlan, and prevented vxlan devices from being able
to take advantage of jumbo frames etc.
The default MTU remains 1500, for compatibility.
Signed-off-by: David Wragg <david@weave.works>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>