Sergei Shtylyov [Tue, 3 Nov 2015 19:36:04 +0000 (22:36 +0300)]
sh_eth: use DMA barriers
Commit
7d7355f58ba4 ("sh_eth: Ensure proper ordering of descriptor active
bit write/read") did the right thing but used too "heavy" barriers while
there were already "lighter" DMA barriers exactly for this case...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Nov 2015 18:41:45 +0000 (13:41 -0500)]
Merge git://git./linux/kernel/git/davem/net
Minor overlapping changes in net/ipv4/ipmr.c, in 'net' we were
fixing the "BH-ness" of the counter bumps whilst in 'net-next'
the functions were modified to take an explicit 'net' parameter.
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 3 Nov 2015 16:40:53 +0000 (17:40 +0100)]
switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion
Caller passing down the SKIP_EOPNOTSUPP switchdev flag expects that
-EOPNOTSUPP cannot be returned. But in case of direct op call without
recurtion, this may happen. So fix this by checking it always on the
end of __switchdev_port_attr_set function.
Fixes: 464314ea6c11 ("switchdev: skip over ports returning -EOPNOTSUPP when recursing ports")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Tue, 3 Nov 2015 18:01:41 +0000 (19:01 +0100)]
net: sched: kill dead code in sch_choke.c
It looks like this has never been used at all.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Markus Elfring [Tue, 3 Nov 2015 17:18:37 +0000 (18:18 +0100)]
irda: Delete an unnecessary check before the function call "irlmp_unregister_service"
The irlmp_unregister_service() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Nov 2015 16:30:57 +0000 (11:30 -0500)]
Merge tag 'mac80211-for-davem-2015-11-03' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Another set of fixes:
* remove a warning on a check that can trigger without any
errors having happened (Andrei)
* correctly handle deauth request while in the process of
associating (Andrei)
* fix TDLS HT operation (Arik)
* allow changing AID/listen interval during client setup (Ayala)
* be more forgiving with WMM parameters to get HT/VHT in case of
broken APs with bad WMM settings (Emmanuel, myself)
* a number of other fixes (some in documentation)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 3 Nov 2015 15:52:52 +0000 (10:52 -0500)]
net: dsa: mv88e6xxx: include DSA ports in VLANs
DSA ports must be members of a VLAN in order to ensure frame bridging
between chained switch chips.
Thus tag them in addition to the CPU port when adding a VLAN, and skip
them when deleting a VLAN and reporting VLAN members.
Also use the UNMODIFIED egress policy, so that frames egress on these
ports as they ingress, tagged or untagged.
Fixes: 0d3b33e60206 ("net: dsa: mv88e6xxx: add VLAN Load support")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 3 Nov 2015 15:52:36 +0000 (10:52 -0500)]
net: dsa: mv88e6xxx: disable SA learning for DSA and CPU ports
Frames with DSA headers passing to/from the CPU were taking place in the
MAC learning on these ports, resulting in incorrect ATU entries. Disable
learning on these ports.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarod Wilson [Tue, 3 Nov 2015 15:15:59 +0000 (10:15 -0500)]
net/core: fix for_each_netdev_feature
As pointed out by Nikolay and further explained by Geert, the initial
for_each_netdev_feature macro was broken, as feature would get set outside
of the block of code it was intended to run in, thus only ever working for
the first feature bit in the mask. While less pretty this way, this is
tested and confirmed functional with multiple feature bits set in
NETIF_F_UPPER_DISABLES.
[root@dell-per730-01 ~]# ethtool -K bond0 lro off
...
[ 242.761394] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2.
[ 243.552178] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83
[ 244.353978] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1.
[ 245.147420] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71
[root@dell-per730-01 ~]# ethtool -K bond0 gro off
...
[ 251.925645] bond0: Disabling feature 0x0000000000004000 on lower dev p5p2.
[ 252.713693] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83
[ 253.499085] bond0: Disabling feature 0x0000000000004000 on lower dev p5p1.
[ 254.290922] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71
Fixes: fd867d51f ("net/core: generic support for disabling netdev features down stack")
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Nikolay Aleksandrov <razor@blackwall.org>
CC: Michal Kubecek <mkubecek@suse.cz>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Padmanabh Ratnakar [Tue, 3 Nov 2015 14:55:59 +0000 (20:25 +0530)]
vlan: Invoke driver vlan hooks only if device is present
NIC drivers mark device as detached during error recovery.
It expects no manangement hooks to be invoked in this state.
Invoke driver vlan hooks only if device is present.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 3 Nov 2015 13:51:29 +0000 (14:51 +0100)]
arcnet/com20020: add LEDS_CLASS dependency
The newly added led trigger support in the com20020-pci driver causes
build errors when CONFIG_LEDS_CLASS is disabled:
drivers/built-in.o: In function `com20020pci_probe':
(.text+0x185dc4): undefined reference to `devm_led_classdev_register'
(.text+0x185dd8): undefined reference to `devm_led_classdev_register'
This adds a Kconfig dependency to prevent the invalid configurations.
Other drivers appear to be split 50:50 between 'select' and 'depends on'
for this symbol, I picked 'depends on' as I could not find a common
policy and it generally causes fewer problems.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 8890624a4e8c ("arcnet: com20020-pci: add led trigger support")
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 3 Nov 2015 10:39:20 +0000 (11:39 +0100)]
bpf, verifier: annotate verbose printer with __printf
The verbose() printer dumps the verifier state to user space, so let gcc
take care to check calls to verbose() for (future) errors. make with W=1
correctly suggests: function might be possible candidate for 'gnu_printf'
format attribute [-Wsuggest-attribute=format].
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Nov 2015 16:08:22 +0000 (11:08 -0500)]
Merge branch 'dp83640-fixes'
Stefan Sørensen says:
====================
dp83640 driver fixes
This series fixes a number of minor bugs in the dp83640 driver.
====================
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Tue, 3 Nov 2015 08:34:08 +0000 (09:34 +0100)]
dp83640: Only wait for timestamps for packets with timestamping enabled.
In the packet timestamping function, check that the ptp version and
protocol of the packet matches what we have configured the hardware to
actually generate timestamps for, before looking/waiting for a timestamp.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Tue, 3 Nov 2015 08:34:07 +0000 (09:34 +0100)]
ptp: Change ptp_class to a proper bitmask
Change the definition of PTP_CLASS_L2 to not have any bits overlapping with
the other defined protocol values, allowing the PTP_CLASS_* definitions to
be for simple filtering on packet type.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Tue, 3 Nov 2015 08:34:06 +0000 (09:34 +0100)]
dp83640: Prune rx timestamp list before reading from it
The list of rx timestamps are currently only pruned of old entries when a
new entry is inserted. If no new entries are added, old timestamps may
survive beyond their lifetime, possible causing them to be attached to
packets with the same sequence number after a rollover.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Tue, 3 Nov 2015 08:34:05 +0000 (09:34 +0100)]
dp83640: Delay scheduled work.
Currently rx_timestamp_work reschedules itself as a regular workqueue item,
effectively causing it run constantly as long as there are packets left in
the queue. Fix by using delayed workqueue items, limiting it to run only
every two jiffies.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Sørensen [Tue, 3 Nov 2015 08:34:04 +0000 (09:34 +0100)]
dp83640: Include hash in timestamp/packet matching
Only using the message type and sequence id for matching timestamps
with packets is error prone, as multiple clients may very well be
sending packets with the same messagetype and timestamp at the same
time. Fix by extending the check to include the hash of bytes 20-29
(source id in PTPv2) that is provided with the timestamps.
Signed-off-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kubeček [Tue, 3 Nov 2015 07:51:07 +0000 (08:51 +0100)]
ipv6: fix tunnel error handling
Both tunnel6_protocol and tunnel46_protocol share the same error
handler, tunnel6_err(), which traverses through tunnel6_handlers list.
For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g.
in tunnel46_rcv(). Current code can generate an ICMPv6 error message
with an IPv4 packet embedded in it.
Fixes: 73d605d1abbd ("[IPSEC]: changing API of xfrm6_tunnel_register")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Nov 2015 15:41:51 +0000 (10:41 -0500)]
Merge branch 'mlx5-fixes'
Or Gerlitz says:
====================
Mellanox mlx5e driver update, Nov 3 2015
This series contains bunch of small fixes to the mlx5e driver from Achiad.
Changes from V0:
- removed the driver patch that dealt with IRQ affinity changes during
NAPI poll, as this is a generic problem which needs generic solution.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:24 +0000 (08:07 +0200)]
net/mlx5e: Fix LSO vlan insertion
Consider vlan insertion impact on headers copy size also for LSO
packets.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:23 +0000 (08:07 +0200)]
net/mlx5e: Re-eanble client vlan TX acceleration
This reverts commit
cd58c714acb9 "net/mlx5e: Disable client vlan TX acceleration".
Bring back client vlan insertion offload, the original
performance issue was found and fixed in the next patch.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:22 +0000 (08:07 +0200)]
net/mlx5e: Return error in case mlx5e_set_features() fails
In case mlx5e_set_features() fails, return the failure status rather
than 0.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:21 +0000 (08:07 +0200)]
net/mlx5e: Don't allow more than max supported channels
Consider MLX5E_MAX_NUM_CHANNELS @ethtool set/get_channels
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:20 +0000 (08:07 +0200)]
net/mlx5_core: Use the the real irqn in eq->irqn
Instead of storing the msix array index in eq->irqn (vecidx),
store the real irq number.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:19 +0000 (08:07 +0200)]
net/mlx5e: Wait for RX buffers initialization in a more proper manner
Use jiffies rather than wait loop with msleep().
The wait loop didn't take into consideration time when the
process was not executing.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Tue, 3 Nov 2015 06:07:18 +0000 (08:07 +0200)]
net/mlx5e: Avoid NULL pointer access in case of configuration failure
In case a configuration operation that involves closing and re-opening
resources (e.g RX/TX queue size change) fails at the re-opening stage
these resources will remain closed.
So when executing (following) configuration operations (e.g ifconfig
down) we cannot assume that these resources are available.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ayala Beker [Fri, 23 Oct 2015 08:20:06 +0000 (11:20 +0300)]
cfg80211: allow AID/listen interval changes for unassociated station
Currently, cfg80211 rejects updates of AID and listen interval parameters
for existing entries. This information is known only at association stage
and as a result it's impossible to update entries that were added
unassociated.
Fix this by allowing updates of these properies for stations that the
driver (or mac80211) assigned unassociated state.
This then fixes mac80211's use of NL80211_FEATURE_FULL_AP_CLIENT_STATE.
Signed-off-by: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Chaitanya T K [Fri, 30 Oct 2015 17:46:15 +0000 (23:16 +0530)]
mac80211: document sleep requirements for channel context ops
Channel context driver operations can sleep, so add might_sleep()
and document this.
Signed-off-by: Chaitanya T K <chaitanya.mgit@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Sun, 25 Oct 2015 08:59:42 +0000 (10:59 +0200)]
mac80211: further improve "no supported rates" warning
Allow distinguishing the non-station case from the case of a
station without rates, by using -1 for the non-station case.
This value cannot be reached with a station since that many
legacy rates don't exist.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 22 Oct 2015 15:46:06 +0000 (17:46 +0200)]
mac80211: treat bad WMM parameters more gracefully
As WMM is required for HT/VHT operation, treat bad WMM parameters
more gracefully by falling back to default parameters instead of
not using WMM assocation. This makes it possible to still use HT
or VHT, although potentially with reduced quality of service due
to unintended WMM parameters.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Emmanuel Grumbach [Thu, 22 Oct 2015 15:46:05 +0000 (17:46 +0200)]
mac80211: fixup AIFSN instead of disabling WMM
Disabling WMM has a huge impact these days. It implies that
HT and VHT will be disabled which means that the throughput
will be drammatically reduced.
Since the AIFSN is a transmission parameter, we can play a
bit and fix it up to make it compliant with the 802.11
specification which requires it to be at least 2.
Increasing it from 1 to 2 will slightly reduce the
likelyhood to get a transmission opportunity compared to
other clients that would accept to set AIFSN=1, but at
least it will allow HT and VHT which is a huge gain.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 22 Oct 2015 15:46:04 +0000 (17:46 +0200)]
mac80211: make enable_qos parameter to ieee80211_set_wmm_default()
The function currently determines this value, for use in bss_info.qos,
based on the interface type itself. Make it a parameter instead and
set it with the same logic for now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Thu, 22 Oct 2015 15:35:14 +0000 (17:35 +0200)]
cfg80211/mac80211: clarify RSSI CQM reporting requirements
The previous patch changed mac80211 to always report an event
after a CQM RSSI reconfiguration. Document that as expected
behaviour in both the cfg80211 and mac80211 API.
Currently, iwlmvm already implements that behaviour; the other
drivers implementing CQM RSSI events may have to be changed.
This behaviour lets userspace know what the current state is
without relying on querying the data which is racy.
Reviewed-by: Sharon, Sara <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Matthias Schiffer [Sat, 24 Oct 2015 19:25:51 +0000 (21:25 +0200)]
mac80211: fix crash on mesh local link ID generation with VIFs
llid_in_use needs to be limited to stations of the same VIF, otherwise it
will cause a NULL deref as the sta_info of non-mesh-VIFs don't have
sta->mesh set.
Steps to reproduce:
modprobe mac80211_hwsim channels=2
iw phy phy0 interface add ibss0 type ibss
iw phy phy0 interface add mesh0 type mp
iw phy phy1 interface add ibss1 type ibss
iw phy phy1 interface add mesh1 type mp
ip link set ibss0 up
ip link set mesh0 up
ip link set ibss1 up
ip link set mesh1 up
iw dev ibss0 ibss join foo 2412
iw dev ibss1 ibss join foo 2412
# Ensure that ibss0 and ibss1 are actually associated; I often need to
# leave and join the cell on ibss1 a second time.
iw dev mesh0 mesh join bar
iw dev mesh1 mesh join bar # crash
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Sun, 25 Oct 2015 08:59:34 +0000 (10:59 +0200)]
mac80211: TDLS: add proper HT-oper IE
When 11n peers performs a TDLS connection on a legacy BSS, the HT
operation IE must be specified according to IEEE802.11-2012 section
9.23.3.2. Otherwise HT-protection is compromised and the medium becomes
noisy for both the TDLS and the BSS links.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Eliad Peller [Sun, 25 Oct 2015 08:59:33 +0000 (10:59 +0200)]
mac80211: don't reconfigure sched scan in case of wowlan
Scheduled scan has to be reconfigured only if wowlan wasn't
configured, since otherwise it should continue to run (with
the 'any' trigger) or be aborted.
The current code will end up asking the driver to start a new
scheduled scan without stopping the previous one, and leaking
some memory (from the previous request.)
Fix this by doing the abort/restart under the proper conditions.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Eliad Peller [Sun, 25 Oct 2015 08:59:36 +0000 (10:59 +0200)]
mac80211: call drv_stop only if driver is started
If drv_start() fails during hw_restart, all the running
interfaces are being closed/stopped, which results in
drv_stop() being called, although the driver was never
started successfully.
This might cause drivers to perform operations on uninitialized
memory (as they assume it was initialized on drv_start)
Consider the local->started flag, and call the driver's stop()
op only if drv_start() succeeded before.
Move drv_start() and drv_stop() to driver-ops.c, as they are no
longer simple wrappers.
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Andrei Otcheretianski [Sun, 25 Oct 2015 08:59:40 +0000 (10:59 +0200)]
mac80211: Remove WARN_ON_ONCE in ieee80211_recalc_smps
The recalc_smps work can run after the station disassociates.
At this stage we already released the channel, but the work
will be cancelled only when the interface stops.
In this scenario we can hit the warning in ieee80211_recalc_smps, so
just remove it.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Eliad Peller [Sun, 25 Oct 2015 08:59:35 +0000 (10:59 +0200)]
mac80211: use freezable workqueue for restart work
Requesting hw restart during suspend might result
in the restart work being executed after mac80211
and the hw are suspended.
Solve the race by simply scheduling the restart
work on a freezable workqueue.
Note that there can be some cases of reconfiguration
on resume (besides the hardware restart):
* wowlan is not configured -
All the interfaces removed were removed on suspend,
and drv_stop() was called. At this point the driver
shouldn't expect for hw_restart anyway, so we can
simply cancel it (on resume).
* wowlan is configured, drv_resume() == 1
There is no definitive expected behavior in this case,
as each driver might have different expectations (e.g.
setting some flags on suspend/restart vs. not handling
spurious recovery).
For now, simply let the hw_restart work run again after
resume, and hope the driver will handle it well (or at
least initiate another hw restart).
Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Andrei Otcheretianski [Sun, 25 Oct 2015 08:59:38 +0000 (10:59 +0200)]
mac80211: Fix local deauth while associating
Local request to deauthenticate wasn't handled while associating, thus
the association could continue even when the user space required to
disconnect.
Cc: stable@vger.kernel.org
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Sun, 25 Oct 2015 08:59:41 +0000 (10:59 +0200)]
mac80211: allow null chandef in tracing
In TDLS channel-switch operations the chandef can sometimes be NULL.
Avoid an oops in the trace code for these cases and just print a
chandef full of zeros.
Cc: stable@vger.kernel.org
Fixes: a7a6bdd0670fe ("mac80211: introduce TDLS channel switch ops")
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ola Olsson [Thu, 29 Oct 2015 06:04:58 +0000 (07:04 +0100)]
nl80211: Fix potential memory leak from parse_acl_data
If parse_acl_data succeeds but the subsequent parsing of smps
attributes fails, there will be a memory leak due to early returns.
Fix that by moving the ACL parsing later.
Cc: stable@vger.kernel.org
Fixes: 18998c381b19b ("cfg80211: allow requesting SMPS mode on ap start")
Signed-off-by: Ola Olsson <ola.olsson@sonymobile.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Janusz.Dziedzic@tieto.com [Tue, 27 Oct 2015 07:35:11 +0000 (08:35 +0100)]
mac80211: fix divide by zero when NOA update
In case of one shot NOA the interval can be 0, catch that
instead of potentially (depending on the driver) crashing
like this:
divide error: 0000 [#1] SMP
[...]
Call Trace:
<IRQ>
[<
ffffffffc08e891c>] ieee80211_extend_absent_time+0x6c/0xb0 [mac80211]
[<
ffffffffc08e8a17>] ieee80211_update_p2p_noa+0xb7/0xe0 [mac80211]
[<
ffffffffc069cc30>] ath9k_p2p_ps_timer+0x170/0x190 [ath9k]
[<
ffffffffc070adf8>] ath_gen_timer_isr+0xc8/0xf0 [ath9k_hw]
[<
ffffffffc0691156>] ath9k_tasklet+0x296/0x2f0 [ath9k]
[<
ffffffff8107ad65>] tasklet_action+0xe5/0xf0
[...]
Cc: stable@vger.kernel.org [3.16+, due to d463af4a1c34 using it]
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Shaohui Xie [Tue, 3 Nov 2015 04:27:33 +0000 (12:27 +0800)]
net: phy: fix a bug in get_phy_c45_ids
When probing devices-in-package for a c45 phy, device zero is the last
device to probe, however, if driver reads 0 from device zero,
c45_ids->devices_in_package is set to '0', the loop condition of probing
will be matched again, see codes below:
for (i = 1;i < num_ids && c45_ids->devices_in_package == 0;i++)
driver will run in a dead loop.
This patch restructures the bug and confusing loop, it provides a helper
function get_phy_c45_devs_in_pkg which to read devices-in-package registers
of a MMD, and rewrites the loop with using the helper function.
Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jarod Wilson [Tue, 3 Nov 2015 02:55:59 +0000 (21:55 -0500)]
net/core: generic support for disabling netdev features down stack
There are some netdev features, which when disabled on an upper device,
such as a bonding master or a bridge, must be disabled and cannot be
re-enabled on underlying devices.
This is a rework of an earlier more heavy-handed appraoch, which simply
disables and prevents re-enabling of netdev features listed in a new
define in include/net/netdev_features.h, NETIF_F_UPPER_DISABLES. Any upper
device that disables a flag in that feature mask, the disabling will
propagate down the stack, and any lower device that has any upper device
with one of those flags disabled should not be able to enable said flag.
Initially, only LRO is included for proof of concept, and because this
code effectively does the same thing as dev_disable_lro(), though it will
also activate from the ethtool path, which was one of the goals here.
[root@dell-per730-01 ~]# ethtool -k bond0 |grep large
large-receive-offload: on
[root@dell-per730-01 ~]# ethtool -k p5p1 |grep large
large-receive-offload: on
[root@dell-per730-01 ~]# ethtool -K bond0 lro off
[root@dell-per730-01 ~]# ethtool -k bond0 |grep large
large-receive-offload: off
[root@dell-per730-01 ~]# ethtool -k p5p1 |grep large
large-receive-offload: off
dmesg dump:
[ 1033.277986] bond0: Disabling feature 0x0000000000008000 on lower dev p5p2.
[ 1034.067949] bnx2x 0000:06:00.1 p5p2: using MSI-X IRQs: sp 74 fp[0] 76 ... fp[7] 83
[ 1034.753612] bond0: Disabling feature 0x0000000000008000 on lower dev p5p1.
[ 1035.591019] bnx2x 0000:06:00.0 p5p1: using MSI-X IRQs: sp 62 fp[0] 64 ... fp[7] 71
This has been successfully tested with bnx2x, qlcnic and netxen network
cards as slaves in a bond interface. Turning LRO on or off on the master
also turns it on or off on each of the slaves, new slaves are added with
LRO in the same state as the master, and LRO can't be toggled on the
slaves.
Also, this should largely remove the need for dev_disable_lro(), and most,
if not all, of its call sites can be replaced by simply making sure
NETIF_F_LRO isn't included in the relevant device's feature flags.
Note that this patch is driven by bug reports from users saying it was
confusing that bonds and slaves had different settings for the same
features, and while it won't be 100% in sync if a lower device doesn't
support a feature like LRO, I think this is a good step in the right
direction.
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <gospo@cumulusnetworks.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Nikolay Aleksandrov <razor@blackwall.org>
CC: Michal Kubecek <mkubecek@suse.cz>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Mon, 2 Nov 2015 12:51:31 +0000 (12:51 +0000)]
sfc: push partner queue for skb->xmit_more
When the IP stack passes SKBs the sfc driver puts them in 2 different TX
queues (called partners), one for checksummed and one for not checksummed.
If the SKB has xmit_more set the driver will delay pushing the work to the
NIC.
When later it does decide to push the buffers this patch ensures it also
pushes the partner queue, if that also has any delayed work. Before this
fix the work in the partner queue would be left for a long time and cause
a netdev watchdog.
Fixes: 70b33fb ("sfc: add support for skb->xmit_more")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Mon, 2 Nov 2015 22:28:07 +0000 (01:28 +0300)]
sh_eth: fix typo in RX descriptor bit name
The correct name of the RX descriptor 0 bit 30 is RDLE (receive descriptor
list end), not RDEL.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 3 Nov 2015 01:08:19 +0000 (17:08 -0800)]
sit: fix sit0 percpu double allocations
sit0 device allocates its percpu storage twice :
- One time in ipip6_tunnel_init()
- One time in ipip6_fb_tunnel_init()
Thus we leak 48 bytes per possible cpu per network namespace dismantle.
ipip6_fb_tunnel_init() can be much simpler and does not
return an error, and should be called after register_netdev()
Note that ipip6_tunnel_clone_6rd() also needs to be called
after register_netdev() (calling ipip6_tunnel_init())
Fixes: ebe084aafb7e ("sit: Use ipip6_tunnel_init as the ndo_init function.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Nov 2015 03:52:24 +0000 (22:52 -0500)]
Merge branch 'bonding-actor-updates'
Mahesh Bandewar says:
====================
re-org actor admin/oper key updates
I was observing machines entering into weird LACP state when the
partner is in passive mode. This issue is not because of the partners
in passive state but probably because of some operational key update
which is pushing the state-machine is that weird state. This was
happening randomly on about 1% of the machine (when the sample size
is a large set of machines with a variety of NICs/ports bonded).
In this patch-series I'm attempting to unify the logic of actor-key
/ operational-key changes to one place to avoid possible errors in
update. Also this eliminates the need for the event-handler to decide
if the key needs update.
After this patch-set none of the machines (from same sample set) were
exhibiting LACP-weirdness that was observed earlier.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 31 Oct 2015 19:45:11 +0000 (12:45 -0700)]
bonding: simplify / unify event handling code for 3ad mode.
Old logic of updating state-machine is not required since
ad_update_actor_keys() does it implicitly. The only loss is
the notification differentiation between speed vs. duplex
change. Now only one unified notification is printed.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 31 Oct 2015 19:45:06 +0000 (12:45 -0700)]
bonding: unify all places where actor-oper key needs to be updated.
actor_admin, and actor_oper key is changed at multiple locations in
the code. This patch brings all those updates into one location in
an attempt to avoid possible inconsistent updates causing LACP state
machine to go in weird state.
The unified place is ad_update_actor_key() with simple state-machine
logic -
(a) If port is "duplex" then only it can participate in LACP
(b) Speed change reinitializes the LACP state-machine.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mahesh Bandewar [Sat, 31 Oct 2015 19:45:00 +0000 (12:45 -0700)]
bonding: Simplify __get_duplex function.
Eliminate 'else' clause by simply initializing variable
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 3 Nov 2015 03:48:39 +0000 (22:48 -0500)]
Merge branch 'bpf-persistent'
Daniel Borkmann says:
====================
BPF updates
This set adds support for persistent maps/progs. Please see
individual patches for further details. A man-page update
to bpf(2) will be sent later on, also a iproute2 patch for
support in tc.
v1 -> v2:
- Reworked most of patch 4 and 5
- Rebased to latest net-next
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 29 Oct 2015 13:58:10 +0000 (14:58 +0100)]
bpf: add sample usages for persistent maps/progs
This patch adds a couple of stand-alone examples on how BPF_OBJ_PIN
and BPF_OBJ_GET commands can be used.
Example with maps:
# ./fds_example -F /sys/fs/bpf/m -P -m -k 1 -v 42
bpf: map fd:3 (Success)
bpf: pin ret:(0,Success)
bpf: fd:3 u->(1:42) ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):42 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1 -v 24
bpf: get fd:3 (Success)
bpf: fd:3 u->(1:24) ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):24 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -P -m
bpf: map fd:3 (Success)
bpf: pin ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):0 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -G -m
bpf: get fd:3 (Success)
Example with progs:
# ./fds_example -F /sys/fs/bpf/p -P -p
bpf: prog fd:3 (Success)
bpf: pin ret:(0,Success)
bpf sock:4 <- fd:3 attached ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/p -G -p
bpf: get fd:3 (Success)
bpf: sock:4 <- fd:3 attached ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/p2 -P -p -o ./sockex1_kern.o
bpf: prog fd:5 (Success)
bpf: pin ret:(0,Success)
bpf: sock:3 <- fd:5 attached ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/p2 -G -p
bpf: get fd:3 (Success)
bpf: sock:4 <- fd:3 attached ret:(0,Success)
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 29 Oct 2015 13:58:09 +0000 (14:58 +0100)]
bpf: add support for persistent maps/progs
This work adds support for "persistent" eBPF maps/programs. The term
"persistent" is to be understood that maps/programs have a facility
that lets them survive process termination. This is desired by various
eBPF subsystem users.
Just to name one example: tc classifier/action. Whenever tc parses
the ELF object, extracts and loads maps/progs into the kernel, these
file descriptors will be out of reach after the tc instance exits.
So a subsequent tc invocation won't be able to access/relocate on this
resource, and therefore maps cannot easily be shared, f.e. between the
ingress and egress networking data path.
The current workaround is that Unix domain sockets (UDS) need to be
instrumented in order to pass the created eBPF map/program file
descriptors to a third party management daemon through UDS' socket
passing facility. This makes it a bit complicated to deploy shared
eBPF maps or programs (programs f.e. for tail calls) among various
processes.
We've been brainstorming on how we could tackle this issue and various
approches have been tried out so far, which can be read up further in
the below reference.
The architecture we eventually ended up with is a minimal file system
that can hold map/prog objects. The file system is a per mount namespace
singleton, and the default mount point is /sys/fs/bpf/. Any subsequent
mounts within a given namespace will point to the same instance. The
file system allows for creating a user-defined directory structure.
The objects for maps/progs are created/fetched through bpf(2) with
two new commands (BPF_OBJ_PIN/BPF_OBJ_GET). I.e. a bpf file descriptor
along with a pathname is being passed to bpf(2) that in turn creates
(we call it eBPF object pinning) the file system nodes. Only the pathname
is being passed to bpf(2) for getting a new BPF file descriptor to an
existing node. The user can use that to access maps and progs later on,
through bpf(2). Removal of file system nodes is being managed through
normal VFS functions such as unlink(2), etc. The file system code is
kept to a very minimum and can be further extended later on.
The next step I'm working on is to add dump eBPF map/prog commands
to bpf(2), so that a specification from a given file descriptor can
be retrieved. This can be used by things like CRIU but also applications
can inspect the meta data after calling BPF_OBJ_GET.
Big thanks also to Alexei and Hannes who significantly contributed
in the design discussion that eventually let us end up with this
architecture here.
Reference: https://lkml.org/lkml/2015/10/15/925
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 29 Oct 2015 13:58:08 +0000 (14:58 +0100)]
bpf: consolidate bpf_prog_put{, _rcu} dismantle paths
We currently have duplicated cleanup code in bpf_prog_put() and
bpf_prog_put_rcu() cleanup paths. Back then we decided that it was
not worth it to make it a common helper called by both, but with
the recent addition of resource charging, we could have avoided
the fix in commit
ac00737f4e81 ("bpf: Need to call bpf_prog_uncharge_memlock
from bpf_prog_put") if we would have had only a single, common path.
We can simplify it further by assigning aux->prog only once during
allocation time.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 29 Oct 2015 13:58:07 +0000 (14:58 +0100)]
bpf: align and clean bpf_{map,prog}_get helpers
Add a bpf_map_get() function that we're going to use later on and
align/clean the remaining helpers a bit so that we have them a bit
more consistent:
- __bpf_map_get() and __bpf_prog_get() that both work on the fd
struct, check whether the descriptor is eBPF and return the
pointer to the map/prog stored in the private data.
Also, we can return f.file->private_data directly, the function
signature is enough of a documentation already.
- bpf_map_get() and bpf_prog_get() that both work on u32 user fd,
call their respective __bpf_map_get()/__bpf_prog_get() variants,
and take a reference.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 29 Oct 2015 13:58:06 +0000 (14:58 +0100)]
bpf: abstract anon_inode_getfd invocations
Since we're going to use anon_inode_getfd() invocations in more than just
the current places, make a helper function for both, so that we only need
to pass a map/prog pointer to the helper itself in order to get a fd. The
new helpers are called bpf_map_new_fd() and bpf_prog_new_fd().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 2 Nov 2015 17:03:11 +0000 (09:03 -0800)]
net: fix percpu memory leaks
This patch fixes following problems :
1) percpu_counter_init() can return an error, therefore
init_frag_mem_limit() must propagate this error so that
inet_frags_init_net() can do the same up to its callers.
2) If ip[46]_frags_ns_ctl_register() fail, we must unwind
properly and free the percpu_counter.
Without this fix, we leave freed object in percpu_counters
global list (if CONFIG_HOTPLUG_CPU) leading to crashes.
This bug was detected by KASAN and syzkaller tool
(http://github.com/google/syzkaller)
Fixes: 6d7b857d541e ("net: use lib/percpu_counter API for fragmentation mem accounting")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 2 Nov 2015 15:50:07 +0000 (07:50 -0800)]
net: avoid NULL deref in inet_ctl_sock_destroy()
Under low memory conditions, tcp_sk_init() and icmp_sk_init()
can both iterate on all possible cpus and call inet_ctl_sock_destroy(),
with eventual NULL pointer.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Mon, 2 Nov 2015 01:40:17 +0000 (10:40 +0900)]
ravb: use pdev rather than ndev for error messages
This corrects what appear to be typos, making the code consistent with
itself, and allowing meaningful prefixes to be displayed with the errors in
question.
Before:
(null): failed to initialize MDIO
(null): Cannot allocate desc base address table (size 176 bytes)
After:
ravb
e6800000.ethernet: failed to initialize MDIO
ravb
e6800000.ethernet: Cannot allocate desc base address table (size 176 bytes)
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthias Schiffer [Mon, 2 Nov 2015 00:24:38 +0000 (01:24 +0100)]
ipv6: fix crash on ICMPv6 redirects with prohibited/blackholed source
There are other error values besides ip6_null_entry that can be returned by
ip6_route_redirect(): fib6_rule_action() can also result in
ip6_blk_hole_entry and ip6_prohibit_entry if such ip rules are installed.
Only checking for ip6_null_entry in rt6_do_redirect() causes ip6_ins_rt()
to be called with rt->rt6i_table == NULL in these cases, making the kernel
crash.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 1 Nov 2015 23:36:55 +0000 (15:36 -0800)]
net: make skb_set_owner_w() more robust
skb_set_owner_w() is called from various places that assume
skb->sk always point to a full blown socket (as it changes
sk->sk_wmem_alloc)
We'd like to attach skb to request sockets, and in the future
to timewait sockets as well. For these kind of pseudo sockets,
we need to take a traditional refcount and use sock_edemux()
as the destructor.
It is now time to un-inline skb_set_owner_w(), being too big.
Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Bisected-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 1 Nov 2015 16:31:45 +0000 (18:31 +0200)]
bridge: vlan: Use rcu_dereference instead of rtnl_dereference
br_should_learn() is protected by RCU and not by RTNL, so use correct
flavor of nbp_vlan_group().
Fixes: 907b1e6e83ed ("bridge: vlan: use proper rcu for the vlgrp
member")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sun, 1 Nov 2015 16:22:53 +0000 (16:22 +0000)]
ppp, slip: Validate VJ compression slot parameters completely
Currently slhc_init() treats out-of-range values of rslots and tslots
as equivalent to 0, except that if tslots is too large it will
dereference a null pointer (CVE-2015-7799).
Add a range-check at the top of the function and make it return an
ERR_PTR() on error instead of NULL. Change the callers accordingly.
Compile-tested only.
Reported-by: 郭永刚 <guoyonggang@360.cn>
References: http://article.gmane.org/gmane.comp.security.oss.general/17908
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Sun, 1 Nov 2015 16:21:24 +0000 (16:21 +0000)]
isdn_ppp: Add checks for allocation failure in isdn_ppp_open()
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjørn Mork [Sun, 1 Nov 2015 00:34:50 +0000 (01:34 +0100)]
qmi_wwan: fix entry for HP lt4112 LTE/HSPA+ Gobi 4G Module
The lt4112 is a HP branded Huawei me906e modem. Like other Huawei
modems, it does not have a fixed interface to function mapping.
Instead it uses a Huawei specific scheme: functions are mapped by
subclass and protocol.
However, the HP vendor ID is used for modems from many different
manufacturers using different schemes, so we cannot apply a generic
vendor rule like we do for the Huawei vendor ID.
Replace the previous lt4112 entry pointing to an arbitrary interface
number with a device specific subclass + protocol match.
Reported-and-tested-by: Muri Nicanor <muri+libqmi@immerda.ch>
Tested-by: Martin Hauke <mardnh@gmx.de>
Fixes: bb2bdeb83fb1 ("qmi_wwan: Add support for HP lt4112 LTE/HSPA+ Gobi 4G Modem")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ani Sinha [Fri, 30 Oct 2015 23:54:31 +0000 (16:54 -0700)]
ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context.
Fixes the following kernel BUG :
BUG: using __this_cpu_add() in preemptible [
00000000] code: bash/2758
caller is __this_cpu_preempt_check+0x13/0x15
CPU: 0 PID: 2758 Comm: bash Tainted: P O 3.18.19 #2
ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
Call Trace:
[<
ffffffff81482b2a>] dump_stack+0x52/0x80
[<
ffffffff812010ae>] check_preemption_disabled+0xce/0xe1
[<
ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15
[<
ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c
[<
ffffffff8141a154>] ip_mr_forward+0x32f/0x34e
[<
ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c
[<
ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<
ffffffff810e6974>] ? pollwake+0x4d/0x51
[<
ffffffff81058ac0>] ? default_wake_function+0x0/0xf
[<
ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<
ffffffff810613d9>] ? __wake_up_common+0x45/0x77
[<
ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[<
ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53
[<
ffffffff8139a519>] ? sock_def_readable+0x71/0x75
[<
ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55
[<
ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41
[<
ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86
[<
ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d
[<
ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b
[<
ffffffff810d5738>] ? new_sync_read+0x82/0xaa
[<
ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99
[<
ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32
[<
ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f
[<
ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf
[<
ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3
[<
ffffffff81488ea1>] cstar_dispatch+0x7/0x1e
Signed-off-by: Ani Sinha <ani@arista.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 2 Nov 2015 20:56:12 +0000 (15:56 -0500)]
Merge branch 'sh_eth-fixes'
Sergei Shtylyov says:
====================
sh_eth: fix bugs in sh_eth_ring_init()
Here's a set of 2 patches against DaveM's 'net.git' repo which fix couple of
bugs in the sh_eth_ring_init() function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 30 Oct 2015 23:06:29 +0000 (02:06 +0300)]
sh_eth: fix WARNING in dma_common_free_remap()
Iff the first dma_alloc_coherent() call fails in sh_eth_ring_init(), the
following is printed to the kernel console:
WARNING: CPU: 0 PID: 1 at drivers/base/dma-mapping.c:334 dma_common_free_remap+0x48/0x6c()
trying to free invalid coherent area: (null)
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc7-dirty #969
Hardware name: Generic R8A7791 (Flattened Device Tree)
Backtrace:
[<
c0013820>] (dump_backtrace) from [<
c00139bc>] (show_stack+0x18/0x1c)
r6:
c0662856 r5:
00000009 r4:
00000000 r3:
00204140
[<
c00139a4>] (show_stack) from [<
c0227510>] (dump_stack+0x74/0x90)
[<
c022749c>] (dump_stack) from [<
c0026ef4>] (warn_slowpath_common+0x8c/0xb8)
r4:
ee84dce0 r3:
c0712774
[<
c0026e68>] (warn_slowpath_common) from [<
c0026fc4>] (warn_slowpath_fmt+0x38/0x40)
r8:
ee7f8000 r7:
c0734520 r6:
00001000 r5:
20000008 r4:
00000000
[<
c0026f90>] (warn_slowpath_fmt) from [<
c02df404>] (dma_common_free_remap+0x48/0x6c)
r3:
00000000 r2:
c0662871
[<
c02df3bc>] (dma_common_free_remap) from [<
c001b9fc>] (__arm_dma_free+0xb8/0xd4)
r6:
00000001 r5:
00000000 r4:
00001000 r3:
ee8c5584
[<
c001b944>] (__arm_dma_free) from [<
c001ba68>] (arm_dma_free+0x24/0x2c)
r10:
0000016b r8:
00000000 r7:
ee9bc830 r6:
00000000 r5:
00000400 r4:
ee9bc800
[<
c001ba44>] (arm_dma_free) from [<
c032ebf0>] (sh_eth_ring_init+0x110/0x138)
[<
c032eae0>] (sh_eth_ring_init) from [<
c033179c>] (sh_eth_open+0x94/0x1f4)
r6:
00000000 r5:
ee9bcd18 r4:
ee9bc800
[<
c0331708>] (sh_eth_open) from [<
c041bf7c>] (__dev_open+0x84/0x104)
r6:
c0565c50 r5:
00000000 r4:
ee9bc800
[<
c041bef8>] (__dev_open) from [<
c041c208>] (__dev_change_flags+0x94/0x13c)
r7:
00001002 r6:
00000001 r5:
00001003 r4:
ee9bc800
[<
c041c174>] (__dev_change_flags) from [<
c041c2e8>] (dev_change_flags+0x20/0x50)
r7:
c072c8a0 r6:
00000138 r5:
00001002 r4:
ee9bc800
[<
c041c2c8>] (dev_change_flags) from [<
c06e8d4c>] (ip_auto_config+0x174/0xf7c)
r8:
00001002 r7:
c072c8a0 r6:
c0700040 r5:
00000001 r4:
ee9bc800 r3:
00000101
[<
c06e8bd8>] (ip_auto_config) from [<
c000a810>] (do_one_initcall+0x100/0x1c8)
r10:
c06f883c r9:
00000000 r8:
c06e8bd8 r7:
c0734000 r6:
c070e918 r5:
c070e918
r4:
ee083640
[<
c000a710>] (do_one_initcall) from [<
c06c9ddc>] (kernel_init_freeable+0x11c/0x1ec)
r10:
c06f883c r9:
00000000 r8:
00000099 r7:
c0734000 r6:
c070372c r5:
c06f8834
r4:
00000007
[<
c06c9cc0>] (kernel_init_freeable) from [<
c0514d78>] (kernel_init+0x14/0xec)
r10:
00000000 r8:
00000000 r7:
00000000 r6:
00000000 r5:
c0514d64 r4:
c0734000
[<
c0514d64>] (kernel_init) from [<
c0010458>] (ret_from_fork+0x14/0x3c)
r4:
00000000 r3:
ee84c000
This is because the code jumps to a wrong label and so tries to free yet
unallocated coherent memory. Fix the *goto* in question.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Fri, 30 Oct 2015 23:05:56 +0000 (02:05 +0300)]
sh_eth: fix uninitialized arrays in sh_eth_ring_init()
sh_eth_ring_free() called in the sh_eth_ring_init()'s error path expects
the arrays pointed to by 'sh_eth_private::[rt]x_skbuff' to be initialized
with NULLs but they are allocated with just kmalloc_array() and so are left
filled with random data. Use kcalloc() instead.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 30 Oct 2015 23:39:48 +0000 (19:39 -0400)]
net: dsa: mv88e6xxx: lookup switch name
All the mv88e6xxx drivers use the exact same code in their probe
function to lookup the switch name given its ID. Thus introduce a
mv88e6xxx_switch_id structure and a mv88e6xxx_lookup_name function in
the common mv88e6xxx code.
In the meantime make __mv88e6xxx_reg_{read,write} static since we do not
need to expose these low-level r/w routines anymore.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 30 Oct 2015 22:56:45 +0000 (18:56 -0400)]
net: dsa: mv88e6xxx: assert SMI lock
It's easy to forget to lock the smi_mutex before calling the low-level
_mv88e6xxx_reg_{read,write}, so add a assert_smi_lock function in them.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Shi [Fri, 30 Oct 2015 22:16:26 +0000 (15:16 -0700)]
bpf: convert hashtab lock to raw lock
When running bpf samples on rt kernel, it reports the below warning:
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 477, name: ping
Preemption disabled at:[<
ffff80000017db58>] kprobe_perf_func+0x30/0x228
CPU: 3 PID: 477 Comm: ping Not tainted 4.1.10-rt8 #4
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[<
ffff80000008a5b0>] dump_backtrace+0x0/0x128
[<
ffff80000008a6f8>] show_stack+0x20/0x30
[<
ffff8000007da90c>] dump_stack+0x7c/0xa0
[<
ffff8000000e4830>] ___might_sleep+0x188/0x1a0
[<
ffff8000007e2200>] rt_spin_lock+0x28/0x40
[<
ffff80000018bf9c>] htab_map_update_elem+0x124/0x320
[<
ffff80000018c718>] bpf_map_update_elem+0x40/0x58
[<
ffff800000187658>] __bpf_prog_run+0xd48/0x1640
[<
ffff80000017ca6c>] trace_call_bpf+0x8c/0x100
[<
ffff80000017db58>] kprobe_perf_func+0x30/0x228
[<
ffff80000017dd84>] kprobe_dispatcher+0x34/0x58
[<
ffff8000007e399c>] kprobe_handler+0x114/0x250
[<
ffff8000007e3bf4>] kprobe_breakpoint_handler+0x1c/0x30
[<
ffff800000085b80>] brk_handler+0x88/0x98
[<
ffff8000000822f0>] do_debug_exception+0x50/0xb8
Exception stack(0xffff808349687460 to 0xffff808349687580)
7460:
4ca2b600 ffff8083 4a3a7000 ffff8083 49687620 ffff8083 0069c5f8 ffff8000
7480:
00000001 00000000 007e0628 ffff8000 496874b0 ffff8083 007e1de8 ffff8000
74a0:
496874d0 ffff8083 0008e04c ffff8000 00000001 00000000 4ca2b600 ffff8083
74c0:
00ba2e80 ffff8000 49687528 ffff8083 49687510 ffff8083 000e5c70 ffff8000
74e0:
00c22348 ffff8000 00000000 ffff8083 49687510 ffff8083 000e5c74 ffff8000
7500:
4ca2b600 ffff8083 49401800 ffff8083 00000001 00000000 00000000 00000000
7520:
496874d0 ffff8083 00000000 00000000 00000000 00000000 00000000 00000000
7540:
2f2e2d2c 33323130 00000000 00000000 4c944500 ffff8083 00000000 00000000
7560:
00000000 00000000 008751e0 ffff8000 00000001 00000000 124e2d1d 00107b77
Convert hashtab lock to raw lock to avoid such warning.
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 2 Nov 2015 20:40:11 +0000 (15:40 -0500)]
Merge branch 'bridge_vlan_fixes'
Nikolay Aleksandrov says:
====================
bridge: vlan: failure path and comment fixes
This is a set from Ido which takes care of one failure path error in
nbp_vlan_init (patch 1) and a few comment errors (patch 2).
I must admit I didn't expect the port init continues after a vlan init
failure but should've checked to make sure. Thanks to Ido for catching
these!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 30 Oct 2015 16:46:20 +0000 (17:46 +0100)]
bridge: vlan: Use correct flag name in comment
The flag used to indicate if a VLAN should be used for filtering - as
opposed to context only - on the bridge itself (e.g. br0) is called
'brentry' and not 'brvlan'.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 30 Oct 2015 16:46:19 +0000 (17:46 +0100)]
bridge: vlan: Prevent possible use-after-free
When adding a port to a bridge we initialize VLAN filtering on it. We do
not bail out in case an error occurred in nbp_vlan_init, as it can be
used as a non VLAN filtering bridge.
However, if VLAN filtering is required and an error occurred in
nbp_vlan_init, we should set vlgrp to NULL, so that VLAN filtering
functions (e.g. br_vlan_find, br_get_pvid) will know the struct is
invalid and will not try to access it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 30 Oct 2015 16:46:12 +0000 (09:46 -0700)]
tcp/dccp: fix ireq->pktopts race
IPv6 request sockets store a pointer to skb containing the SYN packet
to be able to transfer it to full blown socket when 3WHS is done
(ireq->pktopts -> np->pktoptions)
As explained in commit
5e0724d027f0 ("tcp/dccp: fix hashdance race for
passive sessions"), we must transfer the skb only if we won the
hashdance race, if multiple cpus receive the 'ack' packet completing
3WHS at the same time.
Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
santosh.shilimkar@oracle.com [Fri, 30 Oct 2015 15:49:10 +0000 (08:49 -0700)]
RDS: convert bind hash table to re-sizable hashtable
To further improve the RDS connection scalabilty on massive systems
where number of sockets grows into tens of thousands of sockets, there
is a need of larger bind hashtable. Pre-allocated 8K or 16K table is
not very flexible in terms of memory utilisation. The rhashtable
infrastructure gives us the flexibility to grow the hashtbable based
on use and also comes up with inbuilt efficient bucket(chain) handling.
Reviewed-by: David Miller <davem@davemloft.net>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saurabh Sengar [Fri, 30 Oct 2015 14:16:44 +0000 (19:46 +0530)]
net: rds: changing the return type from int to void
as result of function rds_iw_flush_mr_pool is nowhere checked,
changing its return type from int to void.
also removing the unused variable rc as there is nothing to return
Signed-off-by: Saurabh Sengar <saurabh.truth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 2 Nov 2015 20:33:38 +0000 (15:33 -0500)]
Merge tag 'linux-can-fixes-for-4.3-
20151030' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2015-10-30
this is a pull request for the upcoming v4.3 release.
Marek Vasut provides a patch to use the correct attrlen in the nla_put() in the
can_fill_info() function.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 2 Nov 2015 20:28:57 +0000 (15:28 -0500)]
Merge branch 'encx24j600-fixes'
Javier Martinez Canillas says:
====================
net: encx24j600: Fix SPI driver module autoload
Recently I've been trying to fix module autoloading for all SPI drivers and
found that the encx24j600 driver does not fill module alias information due
missing a MODULE_DEVICE_TABLE() so module autload won't work and the driver
Kconfig symbol is tristate which means the driver can be built as a module.
But also the SPI id table is not correctly defined so this series fixes both
issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Fri, 30 Oct 2015 12:49:18 +0000 (13:49 +0100)]
net: encx24j600: Export missing SPI module alias information
The driver Kconfig symbol is tristate which means that it can be built as
a module but the module alias information is not added to the module info
so module autoload won't work since user-space won't have the information.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Javier Martinez Canillas [Fri, 30 Oct 2015 12:49:17 +0000 (13:49 +0100)]
net: encx24j600: Fix SPI id table definition
A driver's SPI id table is expected to be an array of struct spi_device_id
that ends with a zero-initialized sentinel entry. But this driver defines
the table as a single struct spi_device_id and sets .id_table to a pointer
to this struct.
But spi_match_id() has a loop that iterates while the struct spi_device_id
.name[0] is not NULL, so not having a sentinel can cause a NULL pointer
deference error.
This patch defines the SPI id table correctly as all other SPI drivers do.
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Fri, 30 Oct 2015 11:22:51 +0000 (16:52 +0530)]
enic: assign affinity hint to interrupts
The affinity hint is used by the user space daemon, irqbalancer, to
indicate a preferred CPU mask for irqs. This patch sets the irq affinity
hint to local numa core first, when exausted we try non-local numa cores.
Also set tx xps cpus mask bassed on affinity hint.
v2: remove the global affinity policy.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 29 Oct 2015 21:20:40 +0000 (22:20 +0100)]
ipv4: use l4 hash for locally generated multipath flows
This patch changes how the multipath hash is computed for locally
generated flows: now the hash comprises l4 information.
This allows better utilization of the available paths when the existing
flows have the same source IP and the same destination IP: with l3 hash,
even when multiple connections are in place simultaneously, a single path
will be used, while with l4 hash we can use all the available paths.
v2 changes:
- use get_hash_from_flowi4() instead of implementing just another l4 hash
function
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Reid [Fri, 30 Oct 2015 08:43:55 +0000 (16:43 +0800)]
stmmac: Correctly report PTP capabilities.
priv->hwts_*_en indicate if timestamping is enabled/disabled at run
time. But priv->dma_cap.time_stamp and priv->dma_cap.atime_stamp
indicates HW is support for PTPv1/PTPv2.
Signed-off-by: Phil Reid <preid@electromag.com.au>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tina Ruchandani [Fri, 30 Oct 2015 08:24:56 +0000 (01:24 -0700)]
Use 64-bit timekeeping
This patch changes the use of struct timespec in
dccp_probe to use struct timespec64 instead. timespec uses a 32-bit
seconds field which will overflow in the year 2038 and beyond. timespec64
uses a 64-bit seconds field. Note that the correctness of the code isn't
changed, since the original code only uses the timestamps to compute a
small elapsed interval. This patch is part of a larger attempt to remove
instances of 32-bit timekeeping structures (timespec, timeval, time_t)
from the kernel so it is easier to identify where the real 2038 issues
are.
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 1 Nov 2015 21:57:44 +0000 (16:57 -0500)]
Merge branch 'ipv4_link_down'
Julian Anastasov says:
====================
ipv4: fix problems from the RTNH_F_LINKDOWN introduction
Fix two problems from the change that introduced RTNH_F_LINKDOWN
flag. The first patch deals with the removal of local route on
DOWN event. The second patch makes sure the RTNH_F_LINKDOWN
flag is properly updated on UP event because the DOWN event
sets it in all cases.
v2->v3:
- use bool for force var
v1->v2:
- forgot to add ifconfig dummy0 down in the test case
- split to 2 patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Fri, 30 Oct 2015 08:23:34 +0000 (10:23 +0200)]
ipv4: update RTNH_F_LINKDOWN flag on UP event
When nexthop is part of multipath route we should clear the
LINKDOWN flag when link goes UP or when first address is added.
This is needed because we always set LINKDOWN flag when DEAD flag
was set but now on UP the nexthop is not dead anymore. Examples when
LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered:
- link goes down (LINKDOWN is set), then link goes UP and device
shows carrier OK but LINKDOWN remains set
- last address is deleted (LINKDOWN is set), then address is
added and device shows carrier OK but LINKDOWN remains set
Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
here add a multipath route where one nexthop is for dummy0:
ip route add 1.2.3.4 nexthop dummy0 nexthop SOME_OTHER_DEVICE
ifconfig dummy0 down
ifconfig dummy0 up
now ip route shows nexthop that is not dead. Now set the sysctl var:
echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown
now ip route will show a dead nexthop because the forgotten
RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD.
Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Fri, 30 Oct 2015 08:23:33 +0000 (10:23 +0200)]
ipv4: fix to not remove local route on link down
When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by returning back the variable 'force'.
Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0 proto kernel scope host src 192.168.168.1
Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
huangdaode [Fri, 30 Oct 2015 02:50:54 +0000 (10:50 +0800)]
net: hisilicon: Remove .owner assignment from platform_driver
platform_driver doesn't need to set .owner, because
platform_driver_register() will set it.
Signed-off-by: huangdaode <huangdaode@hisilicon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 1 Nov 2015 17:33:55 +0000 (12:33 -0500)]
net: dsa: use switchdev obj for VLAN add/del ops
Simplify DSA by pushing the switchdev objects for VLAN add and delete
operations down to its drivers. Currently only mv88e6xxx is affected.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 30 Oct 2015 01:11:35 +0000 (18:11 -0700)]
net: bcmgenet: Software reset EPHY after power on
The EPHY on GENET v1->v3 is extremely finicky, and will show occasional
failures based on the timing and reset sequence, ranging from duplicate
packets, to extremely high latencies.
Perform an additional software reset, and re-configuration to make sure it is
in a consistent and working state.
Fixes: 6ac3ce8295e6 ("net: bcmgenet: Remove excessive PHY reset")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Hajnoczi [Thu, 29 Oct 2015 11:57:42 +0000 (11:57 +0000)]
VSOCK: define VSOCK_SS_LISTEN once only
The SS_LISTEN socket state is defined by both af_vsock.c and
vmci_transport.c. This is risky since the value could be changed in one
file and the other would be out of sync.
Rename from SS_LISTEN to VSOCK_SS_LISTEN since the constant is not part
of enum socket_state (SS_CONNECTED, ...). This way it is clear that the
constant is vsock-specific.
The big text reflow in af_vsock.c was necessary to keep to the maximum
line length. Text is unchanged except for s/SS_LISTEN/VSOCK_SS_LISTEN/.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Fedin [Thu, 29 Oct 2015 06:45:22 +0000 (09:45 +0300)]
net: smsc911x: Fix crash if loopback test fails
On certain hardware in certain situations loopback test fails and the
driver gets removed. During mdiobus_unregister() instance of PHY driver
gets disposed. But by this time it has already been started using
phy_connect_direct().
PHY driver uses DELAYED_WORK in order to maintain its state. Attempting
to dispose the driver without calling phy_disconnect() causes deallocation
of DELAYED_WORK being active. This shortly causes a bad crash in timer
code.
The problem can be discovered by enabling CONFIG_DEBUG_OBJECTS_TIMERS and
CONFIG_DEBUG_OBJECTS_FREE
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 28 Oct 2015 17:09:53 +0000 (13:09 -0400)]
tipc: linearize arriving NAME_DISTR and LINK_PROTO buffers
Testing of the new UDP bearer has revealed that reception of
NAME_DISTRIBUTOR, LINK_PROTOCOL/RESET and LINK_PROTOCOL/ACTIVATE
message buffers is not prepared for the case that those may be
non-linear.
We now linearize all such buffers before they are delivered up to the
generic reception layer.
In order for the commit to apply cleanly to 'net' and 'stable', we do
the change in the function tipc_udp_recv() for now. Later, we will post
a commit to 'net-next' moving the linearization to generic code, in
tipc_named_rcv() and tipc_link_proto_rcv().
Fixes: commit d0f91938bede ("tipc: add ip/udp media type")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Wed, 28 Oct 2015 12:20:30 +0000 (10:20 -0200)]
fec: Use gpio_set_value_cansleep()
We are in a context where we can sleep, and the FEC PHY reset gpio
may be on an I2C expander. Use the cansleep() variant when
setting the GPIO value.
Based on a patch from Russell King for pci-mvebu.c.
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 1 Nov 2015 17:01:37 +0000 (12:01 -0500)]
Merge branch 'csum_partial_frags'
Hannes Frederic Sowa says:
====================
net: clean up interactions of CHECKSUM_PARTIAL and fragmentation
This series fixes wrong checksums on the wire for IPv4 and IPv6. Large
send buffers and especially NFS lead to wrong checksums in both IPv4
and IPv6.
CHECKSUM_PARTIAL skbs should not receive the respective fragmentations
functions, so we add WARN_ON_ONCE to those functions to fix up those as
soon as they get reported.
Changelog:
v2: added v4 checks
v3: removed WARN_ON_ONCES (advice by Tom Herbert)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>