David Ahern [Wed, 24 Feb 2016 19:47:02 +0000 (11:47 -0800)]
net: l3mdev: address selection should only consider devices in L3 domain
David Lamparter noted a use case where the source address selection fails
to pick an address from a VRF interface - unnumbered interfaces.
Relevant commands from his script:
ip addr add 9.9.9.9/32 dev lo
ip link set lo up
ip link add name vrf0 type vrf table 101
ip rule add oif vrf0 table 101
ip rule add iif vrf0 table 101
ip link set vrf0 up
ip addr add 10.0.0.3/32 dev vrf0
ip link add name dummy2 type dummy
ip link set dummy2 master vrf0 up
--> note dummy2 has no address - unnumbered device
ip route add 10.2.2.2/32 dev dummy2 table 101
ip neigh add 10.2.2.2 dev dummy2 lladdr 02:00:00:00:00:02
tcpdump -ni dummy2 &
And using ping instead of his socat example:
$ ping -I vrf0 -c1 10.2.2.2
ping: Warning: source address might be selected on device other than vrf0.
PING 10.2.2.2 (10.2.2.2) from 9.9.9.9 vrf0: 56(84) bytes of data.
From tcpdump:
12:57:29.449128 IP 9.9.9.9 > 10.2.2.2: ICMP echo request, id 2491, seq 1, length 64
Note the source address is from lo and is not a VRF local address. With
this patch:
$ ping -I vrf0 -c1 10.2.2.2
PING 10.2.2.2 (10.2.2.2) from 10.0.0.3 vrf0: 56(84) bytes of data.
From tcpdump:
12:59:25.096426 IP 10.0.0.3 > 10.2.2.2: ICMP echo request, id 2113, seq 1, length 64
Now the source address comes from vrf0.
The ipv4 function for selecting source address takes a const argument.
Removing the const requires touching a lot of places, so instead
l3mdev_master_ifindex_rcu is changed to take a const argument and then
do the typecast to non-const as required by netdev_master_upper_dev_get_rcu.
This is similar to what l3mdev_fib_table_rcu does.
IPv6 for unnumbered interfaces appears to be selecting the addresses
properly.
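The selection rule described in this commit can be modeled in a few lines of userspace Python. This is only an illustrative sketch, not the kernel code: the `Device` class and the `l3_domain` and `select_source` helpers are hypothetical names, and the real IPv4 selection also weighs address scope and other criteria that are left out here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Device:
    name: str
    ifindex: int
    master: Optional[int] = None   # ifindex of the VRF master, if enslaved
    is_l3_master: bool = False     # True for the VRF device itself
    addrs: List[str] = field(default_factory=list)


def l3_domain(dev: Device) -> int:
    """Return the ifindex identifying the device's L3 domain (0 = default)."""
    if dev.is_l3_master:
        return dev.ifindex
    return dev.master or 0


def select_source(out_dev: Device, devices: List[Device]) -> Optional[str]:
    """Pick a source address, considering only devices in out_dev's L3 domain."""
    if out_dev.addrs:
        return out_dev.addrs[0]
    domain = l3_domain(out_dev)
    for dev in devices:
        if l3_domain(dev) == domain and dev.addrs:
            return dev.addrs[0]
    return None


# Topology from the script above: lo is in the default domain,
# dummy2 is an unnumbered device enslaved to vrf0.
lo = Device("lo", 1, addrs=["9.9.9.9"])
vrf0 = Device("vrf0", 2, is_l3_master=True, addrs=["10.0.0.3"])
dummy2 = Device("dummy2", 3, master=2)
devices = [lo, vrf0, dummy2]

print(select_source(dummy2, devices))
```

With the domain check in place the unnumbered dummy2 borrows 10.0.0.3 from vrf0 rather than 9.9.9.9 from lo, which matches the tcpdump output shown after the patch.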
Cc: David Lamparter <david@opensourcerouting.org>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Feb 2016 03:07:00 +0000 (22:07 -0500)]
Merge branch 'ethtool-ksettings'
David Decotigny says:
====================
new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API
History:
v9
- add 'link' in macro, struct and function names
- rename ethtool_link_ksettings::parent -> ::base
- remove un-needed mlx4 en_dbg_enabled() companion patch
- note: bitmap u32[] API patches were merged separately by Kan Liang
v8
- bitmap u32 API returns number of bits copied, unit tests updated
v7
- module_exit in test_bitmap
v6
- fix copy_from_user in user/kernel handshake
v5
note: please see v4 bullets for a question regarding bitmap.c
- minor fix to make allyesconfig/allmodconfig
v4
- removed typedef for link mode bitmaps
- moved bitmap<->u32[] conversion routines to bitmap.c . This is the
naive implementation. I have an endian-aware version that uses
memcpy/memset as much as possible, but I find it harder to follow
(see http://paste.ubuntu.com/13863722/). Please let me know if I
should use it instead.
- fixes suggested by Ben Hutchings
v3
- rebased v2 on top of latest net-next, minor checkpatch/printf %*pb
updates
v2
- keep return 0 in get_settings when successful, instead of
propagating positive result from driver's get_settings callback.
v1
- original submission
The main goal of this series is to support ethtool link mode masks
larger than 32 bits. It implements a new ioctl pair
(ETHTOOL_GLINKSETTINGS/SLINKSETTINGS), its associated callbacks
(get/set_link_ksettings) and a new struct ethtool_link_settings, which
should eventually replace legacy ethtool_cmd. Internally, the kernel
uses fixed length link mode masks defined at compilation time in
ethtool.h (for now: 31 bits), that can be increased by changing
__ETHTOOL_LINK_MODE_LAST in ethtool.h (absolute max is 4064 bits,
checked at compile time), and the user/kernel interface allows this
length to be arbitrary within 1..4064. This should allow some
flexibility without using too much heap/stack space, at the cost of a
small kernel/user handshake for the user to determine the sizes of
those bitmaps.
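To illustrate how a link mode mask wider than 32 bits can be carried as an array of 32-bit words, here is a small userspace sketch in Python. It is not the kernel's bitmap code; the function names are hypothetical and the word order (bit 0 in the first word) is an assumption of this example.

```python
from typing import Iterable, List, Set


def bitmap_to_u32_words(bits: Iterable[int], nbits: int) -> List[int]:
    """Pack set bit positions into a list of 32-bit words (bit 0 in word 0)."""
    nwords = (nbits + 31) // 32
    words = [0] * nwords
    for b in bits:
        if not 0 <= b < nbits:
            raise ValueError(f"bit {b} outside 0..{nbits - 1}")
        words[b // 32] |= 1 << (b % 32)
    return words


def u32_words_to_bitmap(words: List[int], nbits: int) -> Set[int]:
    """Inverse conversion: return the set of bit positions that are set."""
    return {
        i * 32 + j
        for i, w in enumerate(words)
        for j in range(32)
        if (w >> j) & 1 and i * 32 + j < nbits
    }


words = bitmap_to_u32_words({0, 5, 40}, nbits=64)
print([hex(w) for w in words])
```

Bit 40 lands in the second word, which is exactly the case a single 32-bit mask cannot express. The 4064-bit ceiling mentioned above corresponds to 127 such words (127 * 32 = 4064).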
Along the way, I chose to drop in the new structure the 3 ethtool_cmd
fields marked "deprecated" (transceiver/maxrxpkt/maxtxpkt). They are
still available for old drivers via the (old) ETHTOOL_GSET/SSET API,
but are not available to drivers that switch to new API. Of those 3
fields, ethtool_cmd::transceiver seems to be still actively used by
several drivers, maybe we should not consider this field deprecated?
The 2 other fields are basically not used. This transition requires
some care in the way old and new ethtool talk to the kernel.
More technical details provided in the description for main patch. In
particular details about backward compatibility properties.
Some open questions:
- the kernel/interface multiplexes the "tell me the bitmap length"
handshake and the "give me the settings" inside the new
ETHTOOL_GLINKSETTINGS cmd. I was thinking of making this into 2
separate cmds: 1 cmd ETHTOOL_GKERNELPROPERTIES which would be
kernel-wide rather than device-specific, would return properties
like "length of the link mode bitmaps", and possibly others. And
ETHTOOL_GLINKSETTINGS would expect the proper bitmaps
- the link mode bitmaps are piggybacked at tail of the new struct
ethtool_link_settings. Since its user-visible definition does not
assume specific bitmap width, I am using a 0-length array as the
publicly visible placeholder. But then, the kernel needs to
specialize it (struct ethtool_link_ksettings) to specify its
current link mode masks. This means that kernel code is "littered"
with "ksettings->base.field" to access "field" inside
ethtool_settings:
+ I could use ethtool_link_settings everywhere (instead of a new
ethtool_ksettings) and a container_of accessor (or a plain cast)
to retrieve the link mode masks?
+ or: we could decide to make the link mode masks statically
bounded again, ie. make their width public, but larger than
current 32, and unchangeable forever. This would make everything
straightforward, but we might hit limits later, or have an
unneeded memory/stack usage for unused bits.
any preference?
- I foresee bugs where people use the legacy/deprecated SUPPORTED_x
macros instead of the new ETHTOOL_LINK_MODE_x_BIT enums in the new
get/set_link_ksettings callbacks. Not sure how to prevent problems
with this.
The only driver which was converted for now is mlx4. I am not
considering fcoe as fully converted, but I updated it a minima to be
able to remove __ethtool_get_settings, now known as
__ethtool_get_link_ksettings.
Tested with legacy and "future" ethtool on 64b x86 kernel and 32+64b
ethtool, and on a 32b x86 kernel + 32b ethtool.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:12 +0000 (10:58 -0800)]
net: mlx4: use new ETHTOOL_G/SSETTINGS API
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:11 +0000 (10:58 -0800)]
net: ethtool: remove unused __ethtool_get_settings
replaced by __ethtool_get_link_ksettings.
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:10 +0000 (10:58 -0800)]
net: core: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:09 +0000 (10:58 -0800)]
net: bridge: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:08 +0000 (10:58 -0800)]
net: 8021q: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:07 +0000 (10:58 -0800)]
net: rdma: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:06 +0000 (10:58 -0800)]
net: fcoe: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:05 +0000 (10:58 -0800)]
net: team: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:04 +0000 (10:58 -0800)]
net: macvlan: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:03 +0000 (10:58 -0800)]
net: ipvlan: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:02 +0000 (10:58 -0800)]
net: bonding: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:01 +0000 (10:58 -0800)]
net: usnic: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:58:00 +0000 (10:58 -0800)]
tx4939: use __ethtool_get_ksettings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:57:59 +0000 (10:57 -0800)]
net: ethtool: add new ETHTOOL_xLINKSETTINGS API
This patch defines a new ETHTOOL_GLINKSETTINGS/SLINKSETTINGS API,
handled by the new get_link_ksettings/set_link_ksettings callbacks.
This API provides support for most legacy ethtool_cmd fields, adds
support for larger link mode masks (up to 4064 bits, variable length),
and removes the deprecated ethtool_cmd fields
(transceiver/maxrxpkt/maxtxpkt).
This API is deprecating the legacy ETHTOOL_GSET/SSET API and provides
the following backward compatibility properties:
- legacy ethtool with legacy drivers: no change, still using the
get_settings/set_settings callbacks.
- legacy ethtool with new get/set_link_ksettings drivers: the new
driver callbacks are used, data internally converted to legacy
ethtool_cmd. ETHTOOL_GSET will return only the 1st 32b of each link
mode mask. ETHTOOL_SSET will fail if user tries to set the
deprecated ethtool_cmd fields (transceiver/maxrxpkt/maxtxpkt) to
non-0. A kernel warning is logged if driver sets higher bits.
- future ethtool with legacy drivers: no change, still using the
get_settings/set_settings callbacks, internally converted to new data
structure. Deprecated fields (transceiver/maxrxpkt/maxtxpkt) will be
ignored and seen as 0 from user space. Note that the "future"
ethtool tool will not allow changes to these deprecated fields.
- future ethtool with new drivers: direct call to the new callbacks.
By "future" ethtool, what is meant is:
- query: first try ETHTOOL_GLINKSETTINGS, and revert to ETHTOOL_GSET if
fails
- set: query first and remember which of ETHTOOL_GLINKSETTINGS or
ETHTOOL_GSET was successful
+ if ETHTOOL_GLINKSETTINGS was successful, then change config with
ETHTOOL_SLINKSETTINGS. A failure there is final (do not try
ETHTOOL_SSET).
+ otherwise ETHTOOL_GSET was successful, change config with
ETHTOOL_SSET. A failure there is final (do not try
ETHTOOL_SLINKSETTINGS).
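The query/set fallback rules for a "future" ethtool can be sketched as follows. This is a userspace model in Python, not ethtool source: `FakeKernel`, `FutureEthtool` and `Unsupported` are hypothetical stand-ins for the ioctl interface, and the command names are used as plain strings.

```python
class Unsupported(Exception):
    """Raised by the stand-in kernel when a command is not implemented."""


class FakeKernel:
    """Hypothetical stand-in for the ethtool ioctl interface."""

    def __init__(self, supports_new: bool):
        self.supports_new = supports_new
        self.state = {"speed": 1000, "autoneg": 1}

    def call(self, cmd, payload=None):
        new_cmd = cmd in ("ETHTOOL_GLINKSETTINGS", "ETHTOOL_SLINKSETTINGS")
        if new_cmd and not self.supports_new:
            raise Unsupported(cmd)
        if cmd in ("ETHTOOL_GLINKSETTINGS", "ETHTOOL_GSET"):
            return dict(self.state)
        self.state = dict(payload)  # one of the two "set" commands
        return None


class FutureEthtool:
    def __init__(self, kernel):
        self.kernel = kernel
        self.api = None  # remembers which query command succeeded

    def query(self):
        try:
            settings = self.kernel.call("ETHTOOL_GLINKSETTINGS")
            self.api = "new"
        except Unsupported:
            settings = self.kernel.call("ETHTOOL_GSET")
            self.api = "legacy"
        return settings

    def set(self, changes):
        settings = self.query()
        settings.update(changes)
        cmd = "ETHTOOL_SLINKSETTINGS" if self.api == "new" else "ETHTOOL_SSET"
        # A failure here is final: the other API is never tried.
        self.kernel.call(cmd, settings)
        return cmd


print(FutureEthtool(FakeKernel(supports_new=False)).set({"speed": 100}))
print(FutureEthtool(FakeKernel(supports_new=True)).set({"speed": 100}))
```

The point of remembering the successful query command is that the set path never mixes the two APIs on one device.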
The interaction user/kernel via the new API requires a small
ETHTOOL_GLINKSETTINGS handshake first to agree on the length of the link
mode bitmaps. If kernel doesn't agree with user, it returns the bitmap
length it is expecting from user as a negative length (and cmd field is
0). When kernel and user agree, kernel returns valid info in all
fields (ie. link mode length > 0 and cmd is ETHTOOL_GLINKSETTINGS).
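A minimal model of this handshake, written as a userspace Python sketch. The dictionary fields, the `KERNEL_MASK_LEN` value and the choice to start the handshake with a length of 0 are assumptions made for illustration; only the negative-length reply with cmd set to 0 and the positive-length reply with cmd set to ETHTOOL_GLINKSETTINGS come from the description above.

```python
GLINKSETTINGS = "ETHTOOL_GLINKSETTINGS"  # symbolic name, not the ioctl number
KERNEL_MASK_LEN = 1  # hypothetical length the modeled kernel expects


def kernel_glinksettings(req: dict) -> dict:
    """Modeled kernel side: disagree with a negative length, else answer."""
    if req["mask_len"] != KERNEL_MASK_LEN:
        return {"cmd": 0, "mask_len": -KERNEL_MASK_LEN}
    return {"cmd": GLINKSETTINGS, "mask_len": KERNEL_MASK_LEN, "speed": 10000}


def user_get_link_settings() -> dict:
    """Modeled user side: probe first, then retry with the advertised length."""
    reply = kernel_glinksettings({"cmd": GLINKSETTINGS, "mask_len": 0})
    if reply["mask_len"] < 0:
        wanted = -reply["mask_len"]
        reply = kernel_glinksettings({"cmd": GLINKSETTINGS, "mask_len": wanted})
    if reply["cmd"] != GLINKSETTINGS or reply["mask_len"] <= 0:
        raise RuntimeError("handshake failed")
    return reply


print(user_get_link_settings())
```

Two round trips are enough in this model: the first learns the length, the second fetches the settings with bitmaps of the agreed size.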
Data structure crossing user/kernel boundary is 32/64-bit
agnostic. Converted internally to a legal kernel bitmap.
The internal __ethtool_get_settings kernel helper will gradually be
replaced by __ethtool_get_link_ksettings by the time the first
"link_settings" drivers start to appear. So this patch doesn't change
it, it will be removed before it needs to be changed.
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:57:58 +0000 (10:57 -0800)]
net: usnic: use __ethtool_get_settings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Wed, 24 Feb 2016 18:57:57 +0000 (10:57 -0800)]
net: usnic: remove unused call to ethtool_ops::get_settings
Signed-off-by: David Decotigny <decot@googlers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 24 Feb 2016 18:02:52 +0000 (10:02 -0800)]
net: Facility to report route quality of connected sockets
This patch adds the SO_CNX_ADVICE socket option (setsockopt only). The
purpose is to allow an application to give feedback to the kernel about
the quality of the network path for a connected socket. The value
argument indicates the type of quality report. For this initial patch
the only supported advice is a value of 1 which indicates "bad path,
please reroute"-- the action taken by the kernel is to call
dst_negative_advice which will attempt to choose a different ECMP route,
reset the TX hash for flow label and UDP source port in encapsulation,
etc.
This facility should be useful for connected UDP sockets where only the
application can provide any feedback about path quality. It could also
be useful for TCP applications that have additional knowledge about the
path outside of the normal TCP control loop.
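From user space, giving this advice could look roughly like the Python sketch below. Python's socket module does not export the option, so the numeric value 53 is an assumption taken from the generic Linux socket header (some architectures use a different number), and `advise_bad_path` is a hypothetical helper name. The sketch only makes sense on Linux.

```python
import socket

# Assumption: value from asm-generic/socket.h; differs on some architectures.
SO_CNX_ADVICE = 53


def advise_bad_path(sock: socket.socket) -> bool:
    """Report "bad path, please reroute" (value 1) for this socket.

    Returns True if the kernel accepted the advice, False if the option
    is rejected (for example on a kernel without this patch).
    """
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_CNX_ADVICE, 1)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect(("127.0.0.1", 9))  # connected UDP socket
        print("advice accepted:", advise_bad_path(s))
```

An application would typically call this after its own loss or latency detection decides the current path is bad, since for connected UDP the kernel has no such signal itself.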
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 24 Feb 2016 17:25:37 +0000 (09:25 -0800)]
net: ipv6: Make address flushing on ifdown optional
Currently, all ipv6 addresses are flushed when the interface is configured
down, including global, static addresses:
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
<< nothing; all addresses have been flushed>>
Add a new sysctl to make this behavior optional. The new setting defaults to
flush all addresses to maintain backwards compatibility. When set, global
addresses with no expire times are not flushed on an admin down. The sysctl
is per-interface or system-wide for all interfaces:
$ sysctl -w net.ipv6.conf.eth1.keep_addr_on_down=1
or
$ sysctl -w net.ipv6.conf.all.keep_addr_on_down=1
Will keep addresses on eth1 on an admin down.
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
inet6 2100:1::2/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link
valid_lft forever preferred_lft forever
$ ip link set dev eth1 down
$ ip -6 addr show dev eth1
3: eth1: <BROADCAST,MULTICAST> mtu 1500 state DOWN qlen 1000
inet6 2100:1::2/120 scope global tentative
valid_lft forever preferred_lft forever
inet6 fe80::e0:f9ff:fe79:34bd/64 scope link tentative
valid_lft forever preferred_lft forever
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Wed, 24 Feb 2016 16:20:17 +0000 (17:20 +0100)]
tipc: fix null deref crash in compat config path
msg.dst_sk needs to be set up with a valid socket because some callbacks
later derive the netns from it.
Fixes: 263ea09084d172d ("Revert "genl: Add genlmsg_new_unicast() for unicast message allocation"")
Reported-by: Jon Maloy <maloy@donjonn.com>
Bisected-by: Jon Maloy <maloy@donjonn.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 24 Feb 2016 16:10:48 +0000 (11:10 -0500)]
tipc: fix crash during node removal
When the TIPC module is unloaded, we have identified a race condition
that allows a node reference counter to go to zero and the node instance
being freed before the node timer is finished with accessing it. This
leads to occasional crashes, especially in multi-namespace environments.
The scenario goes as follows:
CPU0:(node_stop)                       CPU1:(node_timeout)  // ref == 2
                                       1: if(!mod_timer())
2: if (del_timer())
3:   tipc_node_put()                                        // ref -> 1
4: tipc_node_put()                                          // ref -> 0
5:   kfree_rcu(node);
                                       6: tipc_node_get(node)
                                       7: // BOOM!
We now clean up this functionality as follows:
1) We remove the node pointer from the node lookup table before we
attempt deactivating the timer. This way, we reduce the risk that
tipc_node_find() may obtain a valid pointer to an instance marked
for deletion; a harmless but undesirable situation.
2) We use del_timer_sync() instead of del_timer() to safely deactivate
the node timer without any risk that it might be reactivated by the
timeout handler. There is no risk of deadlock here, since the two
functions never touch the same spinlocks.
3) We remove a pointless tipc_node_get() + tipc_node_put() from the
timeout handler.
Reported-by: Zhijiang Hu <huzhijiang@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Wed, 24 Feb 2016 16:00:19 +0000 (11:00 -0500)]
tipc: eliminate risk of finding to-be-deleted node instance
Although we have never seen it happen, we have identified the
following problematic scenario when nodes are stopped and deleted:
CPU0:                            CPU1:
tipc_node_xxx()                                   //ref == 1
   tipc_node_put()                                //ref -> 0
                                 tipc_node_find() // node still in table
       tipc_node_delete()
         list_del_rcu(n. list)
                                 tipc_node_get()  //ref -> 1, bad
         kfree_rcu()
                                 tipc_node_put()  //ref to 0 again.
                                 kfree_rcu()      // BOOM!
We fix this by introducing use of the conditional kref_get_unless_zero()
instead of kref_get() in the function tipc_node_find(). This eliminates
any risk of post-mortem access.
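The conditional reference grab that this fix relies on can be modeled in userspace Python. This is a sketch of the idea only: `Node`, `get_unless_zero` and `node_find` are hypothetical names, and a lock stands in for the kernel's atomic operations.

```python
import threading


class Node:
    def __init__(self):
        self._ref = 1
        self._lock = threading.Lock()
        self.freed = False

    def get_unless_zero(self) -> bool:
        """Take a reference only if the count has not already dropped to 0."""
        with self._lock:
            if self._ref == 0:
                return False
            self._ref += 1
            return True

    def put(self) -> None:
        with self._lock:
            self._ref -= 1
            if self._ref == 0:
                self.freed = True  # stands in for kfree_rcu()


def node_find(table: dict, addr: int):
    """Look up a node; refuse instances whose last reference is gone."""
    node = table.get(addr)
    if node is not None and node.get_unless_zero():
        return node
    return None


table = {0x1001: Node()}
dying = table[0x1001]
dying.put()  # last reference dropped, node not yet removed from the table
print(node_find(table, 0x1001))
```

An unconditional get in `node_find` would have revived the count from 0 to 1 here, and the matching put would then have freed the node a second time.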
Reported-by: Zhijiang Hu <huzhijiang@gmail.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Feb 2016 21:54:45 +0000 (16:54 -0500)]
Merge branch 'qed-misc'
Yuval Mintz says:
====================
qed*: Driver updates
Usually I try to provide a sensible description of the patch set even if
it lacks a general 'motif', but this simply contains several small,
unrelated and self-explanatory tweaks and additions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 24 Feb 2016 14:52:50 +0000 (16:52 +0200)]
qed, qede: rebrand module description
Drop the `QL4xxx 40G/100G' and use `FastLinQ 4xxxx' instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 24 Feb 2016 14:52:49 +0000 (16:52 +0200)]
qed: Prevent probe on previous error
Don't allow the driver to probe an adapter in a failed state;
gracefully block the probe instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 24 Feb 2016 14:52:48 +0000 (16:52 +0200)]
qed: add MODULE_FIRMWARE()
Module is using a binary firmware file and so should be marked as such.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 24 Feb 2016 14:52:47 +0000 (16:52 +0200)]
qede: Don't report link change needlessly
There are several corner cases where driver might get a 2nd notification
about the same link change. Don't log any additional changes if the
physical carrier is already reported as it should.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 24 Feb 2016 14:52:46 +0000 (16:52 +0200)]
qede: Linearize SKBs when needed
There's a corner-case in HW where an SKB queued for transmission that
contains too many frags will cause FW to assert.
This patch solves this by linearizing the SKB if necessary.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Wed, 24 Feb 2016 14:52:45 +0000 (16:52 +0200)]
qede: Change pci DID for 10g device
The device ID for the 10g module has changed. Populate the pci_ids table
accordingly.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amitoj Kaur Chawla [Wed, 24 Feb 2016 14:39:38 +0000 (20:09 +0530)]
netxen: Use kobj_to_dev()
Introduce the use of kobj_to_dev() helper function instead of open
coding it with container_of()
The Coccinelle semantic patch used to make this change is as follows:
//<smpl>
@@
expression a;
symbol kobj;
@@
- container_of(a, struct device, kobj)
+ kobj_to_dev(a)
//</smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amitoj Kaur Chawla [Wed, 24 Feb 2016 13:58:19 +0000 (19:28 +0530)]
3c59x: Use setup_timer()
Convert a call to init_timer and accompanying initializations of
the timer's data and function fields to a call to setup_timer.
The Coccinelle semantic patch that fixes this problem is
as follows:
// <smpl>
@@
expression t,f,d;
@@
-init_timer(&t);
+setup_timer(&t,f,d);
...
-t.data = d;
-t.function = f;
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amitoj Kaur Chawla [Wed, 24 Feb 2016 13:58:01 +0000 (19:28 +0530)]
forcedeth: Use setup_timer()
Convert a call to init_timer and accompanying initializations of
the timer's data and function fields to a call to setup_timer.
The Coccinelle semantic patch that fixes this problem is
as follows:
// <smpl>
@@
expression t,f,d;
@@
-init_timer(&t);
+setup_timer(&t,f,d);
-t.data = d;
-t.function = f;
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amitoj Kaur Chawla [Wed, 24 Feb 2016 13:57:49 +0000 (19:27 +0530)]
net: tulip: Use setup_timer()
Convert a call to init_timer and accompanying initializations of
the timer's data and function fields to a call to setup_timer.
The Coccinelle semantic patch that fixes this problem is
as follows:
// <smpl>
@@
expression t,f,d;
@@
-init_timer(&t);
+setup_timer(&t,f,d);
-t.data = d;
-t.function = f;
// </smpl>
Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Feb 2016 21:22:03 +0000 (16:22 -0500)]
Merge branch 'gianfar-ls1021a-ptp'
Yangbo Lu says:
====================
gianfar: Add PTP support for ls1021a platform
This patchset is to enable ptp support for ls1021a platform. The endianness
issue in gianfar driver and gianfar ptp driver must be fixed, and a 1588
timer node must be added into dts.
Changes for v2:
- Modified commit message
- Added more reviewers
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Wed, 24 Feb 2016 09:26:56 +0000 (17:26 +0800)]
gianfar: fix endianness for hardware timestamp
Fix endianness for the 64-bit hardware timestamp value with
be64_to_cpu to support both PowerPC platforms and ARM platforms.
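What a big-endian to CPU conversion does to a 64-bit timestamp can be shown with a short Python sketch. The `be64_to_cpu` function here is a userspace stand-in built on `struct`, not the kernel macro, and the sample bytes are made up.

```python
import struct


def be64_to_cpu(raw: bytes) -> int:
    """Interpret 8 bytes as a big-endian unsigned 64-bit value."""
    (value,) = struct.unpack(">Q", raw)
    return value


raw = bytes.fromhex("0000000000000102")  # made-up timestamp as stored by hardware

correct = be64_to_cpu(raw)
misread = struct.unpack("<Q", raw)[0]    # what a little-endian CPU sees without conversion

print(correct, hex(misread))
```

On a big-endian CPU the conversion is a no-op, while on a little-endian one it swaps the bytes, which is why one call can serve both platform families.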
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Wed, 24 Feb 2016 09:26:55 +0000 (17:26 +0800)]
gianfar_ptp: replace get_of_u32 with of_property_read_u32
Replace get_of_u32 with standard helper function of_property_read_u32
since the latter can process cpu endianness.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yangbo Lu [Wed, 24 Feb 2016 09:26:54 +0000 (17:26 +0800)]
ARM: dts: ls1021a: add 1588 timer node
Add the 1588 timer node for ls1021a platform to
support gianfar ptp driver.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clemens Gruber [Tue, 23 Feb 2016 19:16:58 +0000 (20:16 +0100)]
phy: marvell: Fix 88E1510 initialization
A bug was introduced in the merge commit b633353115e3 ("Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
The generic marvell_config_init (and therefore marvell_of_reg_init) is
not called anymore for the Marvell 88E1510 (in net-next).
This patch calls marvell_config_init and moves the specific init
function for the 88E1510 below the marvell_config_init function to avoid
adding a function predeclaration.
Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Feb 2016 20:20:22 +0000 (15:20 -0500)]
Merge branch 'dsa-port-vlan-dump'
Vivien Didelot says:
====================
net: dsa: add port VLAN dump operation
The VLAN GetNext approach is specific to some switches and thus hard to
implement for others. This patchset replaces it with a simpler port VLAN dump
operation, similar to the corresponding FDB operation.
The mv88e6xxx driver is the only one currently affected by the change.
The documentation is updated accordingly.
Note: this patchset uses http://www.spinics.net/lists/kernel/msg2186705.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 23 Feb 2016 17:13:56 +0000 (12:13 -0500)]
net: dsa: drop vlan_getnext
The VLAN GetNext operation is specific to some switches, and thus can be
complicated to implement for some drivers.
Remove the support for the vlan_getnext/port_pvid_get approach in favor
of the generic and simpler port_vlan_dump function.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 23 Feb 2016 17:13:55 +0000 (12:13 -0500)]
net: dsa: mv88e6xxx: implement port_vlan_dump
Remove the port_pvid_get and vlan_getnext functions in favor of a
simpler mv88e6xxx_port_vlan_dump function.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Tue, 23 Feb 2016 17:13:54 +0000 (12:13 -0500)]
net: dsa: add port_vlan_dump routine
Similar to port_fdb_dump, add a port_vlan_dump function to DSA drivers
which gets passed the switchdev VLAN object and callback.
This function, if implemented, takes precedence over the soon-to-be legacy
vlan_getnext/port_pvid_get approach.
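The shape of a dump-with-callback operation, as opposed to a GetNext iteration, can be sketched as below. This is a userspace Python illustration under stated assumptions: the table layout, the dictionary passed to the callback and the error convention (non-zero aborts the dump) are all hypothetical and do not reflect the actual DSA or switchdev structures.

```python
from typing import Callable, Dict


def port_vlan_dump(vlan_table: Dict[int, Dict[int, bool]], port: int,
                   cb: Callable[[dict], int]) -> int:
    """Call cb once for every VLAN that the given port is a member of.

    vlan_table maps vid -> {port: untagged_flag}. A non-zero return
    value from cb stops the dump and is passed back to the caller.
    """
    for vid in sorted(vlan_table):
        members = vlan_table[vid]
        if port not in members:
            continue
        err = cb({"vid": vid, "untagged": members[port]})
        if err:
            return err
    return 0


table = {1: {0: True, 1: True}, 10: {1: False}, 20: {0: False}}
seen = []


def collect(vlan: dict) -> int:
    seen.append((vlan["vid"], vlan["untagged"]))
    return 0


rc = port_vlan_dump(table, 0, collect)
print(rc, seen)
```

The driver decides how to walk its own hardware table; the caller only supplies the callback, which is what makes this easier to implement than a GetNext primitive on switches that lack one.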
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Feb 2016 20:17:13 +0000 (15:17 -0500)]
Merge branch 'vxlan-rx-cleanups'
Jiri Benc says:
====================
vxlan: consolidate rx handling
Currently, vxlan_rcv is just called at the end of vxlan_udp_encap_recv,
continuing the rx processing where vxlan_udp_encap_recv left it. There's no
clear border between those two functions. This patchset moves
vxlan_udp_encap_recv and vxlan_rcv into a single function.
This also allows some simplification in the error path.
The VXLAN-GPE implementation that will follow up this set can be seen at:
https://github.com/jbenc/linux-vxlan/commits/master
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 23 Feb 2016 17:02:59 +0000 (18:02 +0100)]
vxlan: simplify metadata_dst usage in vxlan_rcv
Now that the packet is scrubbed early, the metadata_dst can be assigned to
the skb as soon as it is allocated. This simplifies the error cleanup path,
as the dst will be freed by kfree_skb. It is also not necessary to pass it
as a parameter to functions anymore.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 23 Feb 2016 17:02:58 +0000 (18:02 +0100)]
vxlan: consolidate rx handling to a single function
Now that both vxlan_udp_encap_recv and vxlan_rcv are much shorter, combine
them into a single function.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 23 Feb 2016 17:02:57 +0000 (18:02 +0100)]
vxlan: move ECN decapsulation to a separate function
It simplifies the vxlan_rcv function.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 23 Feb 2016 17:02:56 +0000 (18:02 +0100)]
vxlan: move inner L2 header processing to a separate function
This code will be different for VXLAN-GPE, so move it to a separate
function. It will also make the rx path less spaghetti-like.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Tue, 23 Feb 2016 17:02:55 +0000 (18:02 +0100)]
vxlan: consolidate GBP handling even more
Now that the packet is scrubbed early, skb->mark can be set in the GBP
handling code.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 25 Feb 2016 19:16:22 +0000 (14:16 -0500)]
Merge branch 'tc_action-ns'
Cong Wang says:
====================
net_sched: add network namespace support for tc actions
This patchset adds network namespace support for tc actions.
v2:
* pull the first patch into net-next
* reduce code duplication by introducing more helper functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 22 Feb 2016 23:57:53 +0000 (15:57 -0800)]
net_sched: add network namespace support for tc actions
Currently tc actions are stored in a per-module hashtable,
therefore are visible to all network namespaces. This is
probably the last part of the tc subsystem which is not
aware of netns now. This patch makes them per-netns;
several tc action APIs need to be adjusted for this.
The tc action API code is ugly due to historical reasons,
we need to refactor that code in the future.
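The difference between a per-module table and per-netns tables can be pictured with a small Python sketch. `ActionStore` and its methods are hypothetical names for illustration and say nothing about the actual tc action data structures.

```python
class ActionStore:
    """One action table per network namespace instead of one per module."""

    def __init__(self):
        self._tables = {}  # netns -> {index: action}

    def add(self, netns: str, index: int, action: str) -> None:
        self._tables.setdefault(netns, {})[index] = action

    def lookup(self, netns: str, index: int):
        return self._tables.get(netns, {}).get(index)

    def netns_exit(self, netns: str) -> None:
        """Drop the table together with all entries of a dying namespace."""
        self._tables.pop(netns, None)


store = ActionStore()
store.add("ns1", 1, "gact drop")
print(store.lookup("ns1", 1), store.lookup("ns2", 1))
```

The same index is invisible from another namespace, and tearing down a namespace has to release the entries as well as the table.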
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 22 Feb 2016 23:57:52 +0000 (15:57 -0800)]
net_sched: prepare tcf_hashinfo_destroy() for netns support
We only release the memory of the hashtable itself, not its
entries inside. This is not a problem yet since we only call
it in module release path, and module is refcount'ed by
actions. This would be a problem after we move the per module
hinfo into per netns in a later patch.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Tue, 23 Feb 2016 12:59:43 +0000 (13:59 +0100)]
ppp: clarify parsing of user supplied data in ppp_set_compress()
* Split big conditional statement.
* Check (data.length <= CCP_MAX_OPTION_LENGTH) only once.
* Don't read ccp_option[1] if not initialised.
Reading uninitialised ccp_option[1] was harmless, because this could
only happen when data.length was 0 or 1. So even then, we couldn't pass
the (ccp_option[1] < 2 || ccp_option[1] > data.length) test anyway.
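The order of checks described above can be sketched in Python. This is a userspace model, not the driver code: `ccp_option_valid` is a hypothetical name, the buffer length stands in for data.length, and the value 32 for CCP_MAX_OPTION_LENGTH is an assumption based on the kernel's PPP compression header.

```python
CCP_MAX_OPTION_LENGTH = 32  # assumption: value from the kernel's ppp-comp.h


def ccp_option_valid(data: bytes) -> bool:
    """Validate a user-supplied CCP option buffer.

    The length bound is checked once, and byte 1 (the option's own
    length field) is read only when at least two bytes are present.
    """
    length = len(data)
    if length > CCP_MAX_OPTION_LENGTH:
        return False
    if length < 2:
        return False  # byte 1 would be uninitialised
    opt_len = data[1]
    return 2 <= opt_len <= length


print(ccp_option_valid(bytes([26, 4, 0, 15])))   # well-formed 4-byte option
print(ccp_option_valid(bytes([26])))             # too short to carry a length
print(ccp_option_valid(bytes([26, 5, 0, 15])))   # claims more bytes than given
```

Splitting the conditional this way makes it explicit that a 0- or 1-byte buffer is rejected before its second byte is ever inspected.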
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 23 Feb 2016 10:36:02 +0000 (11:36 +0100)]
bnx2x: add a separate GENEVE Kconfig symbol
When CONFIG_GENEVE is built as a loadable module, and bnx2x is built-in,
we get this link error:
drivers/net/built-in.o: In function `bnx2x_open':
:(.text+0x33322): undefined reference to `geneve_get_rx_port'
drivers/net/built-in.o: In function `bnx2x_sp_rtnl_task':
:(.text+0x3e632): undefined reference to `geneve_get_rx_port'
This avoids the problem by adding a separate Kconfig symbol named
CONFIG_BNX2X_GENEVE that is only enabled when the code is
reachable from the driver.
This is the same trick that BNX2X does for VXLAN support, and
is similar to how I40E handles both.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 883ce97d25b0 ("bnx2x: Add Geneve inner-RSS support")
Acked-By: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Feb 2016 21:55:58 +0000 (16:55 -0500)]
Merge branch 'gianfar-xmit-improvements'
Claudiu Manoil says:
====================
gianfar: xmit() improvements
Remove redundant operations, improve code locality and maintainability.
Thanks.
V2: Updated first patch to not touch existing wmb().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Tue, 23 Feb 2016 09:48:39 +0000 (11:48 +0200)]
gianfar: Remove redundant ops for do_tstamp from xmit()
Timestamp BD status updates can be merged into the
same "do_tstamp" block, so there is no need for an extra
save/restore to the BD area. The code is more readable too.
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Tue, 23 Feb 2016 09:48:38 +0000 (11:48 +0200)]
gianfar: Use skb_frag_t pointers inside xmit()
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Tue, 23 Feb 2016 09:48:37 +0000 (11:48 +0200)]
gianfar: Map head TxBD first
Move the mapping of the head BD before the mapping of fragments.
The TOE (h/w offload) decision logic block can also be moved up
(as the TOE flag belongs to the head BD), resulting in more
localized code (TOE logic vs BD mapping code blocks).
Note that, for this h/w, the R (status) bit for the head BD of a S/G
frame needs to be written last for a reliable transmission.
For the fragmented skb case, a local variable is used to temporarily
store the status info of the first BD, replacing a BD status read.
A merge of 2 "if(do_tstamp)" blocks was also possible.
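The ordering constraint can be sketched as below. This is a simplified model, not the gianfar code: struct txbd, map_frame and the two bit values are assumptions made for illustration, and a real driver needs a write barrier before the final store.

```c
#include <stdint.h>

#define TXBD_READY 0x8000u /* assumed "R" (ready) status bit */
#define TXBD_LAST  0x0800u /* assumed "last BD of frame" bit */

struct txbd {
	uint16_t status;
	uint16_t length;
};

/*
 * Describe a frame of nr_bds buffers (nr_bds >= 1) starting at ring[0].
 * The head BD is filled in first, but its READY bit is stored last.
 */
static void map_frame(struct txbd *ring, unsigned int nr_bds,
		      const uint16_t *lens)
{
	/* Local copy of the head status, so the BD is never read back. */
	uint16_t head_status = TXBD_READY;
	unsigned int i;

	ring[0].length = lens[0];
	if (nr_bds == 1)
		head_status |= TXBD_LAST;

	for (i = 1; i < nr_bds; i++) {
		uint16_t status = TXBD_READY;

		if (i == nr_bds - 1)
			status |= TXBD_LAST;
		ring[i].length = lens[i];
		ring[i].status = status;
	}

	/* A real driver would issue a write barrier before this store. */
	ring[0].status = head_status;
}
```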
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rafał Miłecki [Mon, 22 Feb 2016 21:51:13 +0000 (22:51 +0100)]
bgmac: support Ethernet device on BCM47094 SoC
It needs workarounds very similar to those on BCM4707. It was tested
on a D-Link DIR-885L home router.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Feb 2016 20:25:19 +0000 (15:25 -0500)]
Merge branch 'be2net-fixes'
Ajit Khaparde says:
====================
be2net patches
Please consider applying to net-next
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ajit.khaparde@broadcom.com [Mon, 22 Feb 2016 19:05:01 +0000 (00:35 +0530)]
be2net: Fix a UE caused by passing large frames to the ASIC
In QnQ configurations like Flex-10 where the VLANs are inserted by the
ASIC, on rare occasions the HW encounters a scenario where the
final frame length ends up being greater than what the ASIC can support.
This is because when the TXULP pulls the TX WRB to check the length
of the frame to be transmitted it also adds the size of VLANs to be
inserted by the HW to the length of the frame indicated in the WRB,
which in some cases fails the range check. This causes a UE.
Avoid this by trimming the skb length to accommodate the VLAN insertion.
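The arithmetic behind the trim can be sketched as follows. The limit of 65535 bytes, the allowance for two 4-byte VLAN tags (QnQ), and all names below are assumptions for illustration rather than values taken from the driver.

```c
#define VLAN_HLEN        4u      /* bytes per inserted VLAN tag */
#define HW_MAX_FRAME_LEN 65535u  /* assumed ASIC length limit */

/* Leave room for two HW-inserted tags (assumed QnQ worst case). */
#define MAX_TX_LEN (HW_MAX_FRAME_LEN - 2 * VLAN_HLEN)

/* Length the frame should be trimmed to before it is handed to HW. */
static unsigned int clamp_tx_len(unsigned int len)
{
	return len > MAX_TX_LEN ? MAX_TX_LEN : len;
}
```

With these assumed values, a frame trimmed this way still passes the range check after the hardware adds both tags to the length.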
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ajit.khaparde@broadcom.com [Mon, 22 Feb 2016 19:03:48 +0000 (00:33 +0530)]
be2net: Declare some u16 fields as u32 to improve performance
When 16-bit integers are loaded on CPUs with wider native
registers, the CPU may need some extra ops before using them.
And currently some of the frequently used fields in the driver like
the producer and consumer indices of the queues are declared as u16.
This patch declares such fields as u32. With this change we see the
64-byte packets per second numbers improve by about 4%.
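A sketch of the kind of field involved, with hypothetical names (queue_info, index_inc and friends are illustrative, not copied from the driver). The indices are declared as 32-bit values so that no narrowing is needed on each update.

```c
#include <stdint.h>

/* Hypothetical queue bookkeeping; the indices were 16-bit before. */
struct queue_info {
	uint32_t len;  /* number of ring entries */
	uint32_t head; /* producer index */
	uint32_t tail; /* consumer index */
};

/* Advance an index by one slot, wrapping at the ring size. */
static void index_inc(uint32_t *index, uint32_t limit)
{
	*index = (*index + 1) % limit;
}

static void queue_head_inc(struct queue_info *q)
{
	index_inc(&q->head, q->len);
}

static void queue_tail_inc(struct queue_info *q)
{
	index_inc(&q->tail, q->len);
}
```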
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Feb 2016 18:58:05 +0000 (13:58 -0500)]
Merge branch 'flow_dissector-fixes-and-improvements'
Alexander Duyck says:
====================
Flow dissector fixes and improvements
This patch series is meant to fix and/or improve a number of items within
the flow dissector code. The main change out of all of this is that IPv4
and IPv6 fragmentation should now be handled better than it was. As a
result we should see an improvement when handling things like IP fragment
reassembly as the skbs should now only have header data in the linear
portion of the buffer while the fragments will only hold payload data.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 24 Feb 2016 17:30:04 +0000 (09:30 -0800)]
eth: Pull header from first fragment via eth_get_headlen
We want to try to pull the L4 header in if it is available in the first
fragment. As such, add the flag indicating that we want to pull in the
headers from the first fragment.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 24 Feb 2016 17:29:57 +0000 (09:29 -0800)]
flow_dissector: Use same pointer for IPv4 and IPv6 addresses
The IPv6 parsing was using a local pointer when it could use the same
pointer as the IPv4 portion of the code since the key_addrs can support
both IPv4 and IPv6 as it is just a pointer.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 24 Feb 2016 17:29:51 +0000 (09:29 -0800)]
flow_dissector: Correctly handle parsing FCoE
The flow dissector bits handling FCoE didn't bother to actually validate
that the space there was enough for the FCoE header. So we need to update
things so that if there is room we add the header and report a good result,
otherwise we do not add the header, and report the bad result.
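The check amounts to something like the sketch below. dissect_fcoe is a hypothetical helper, hlen and nhoff stand for the available header length and the current network header offset, and the FCoE header length of 38 bytes is assumed here.

```c
#include <stdbool.h>
#include <stddef.h>

#define FCOE_HEADER_LEN 38 /* assumed FCoE header size in bytes */

/*
 * Advance *nhoff past the FCoE header only if the buffer has room
 * for it. Returns true (good result) or false (bad result, with
 * *nhoff left untouched).
 */
static bool dissect_fcoe(size_t hlen, size_t *nhoff)
{
	if (*nhoff > hlen || hlen - *nhoff < FCOE_HEADER_LEN)
		return false;

	*nhoff += FCOE_HEADER_LEN;
	return true;
}
```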
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 24 Feb 2016 17:29:44 +0000 (09:29 -0800)]
flow_dissector: Fix fragment handling for header length computation
It turns out that for IPv4 we were reporting the ip_proto of the fragment,
and for IPv6 we were not. This patch updates that behavior so that we
always report the IP protocol of the fragment. In addition it takes the
steps of updating the payload offset code so that we will determine the
start of the payload not including the L4 header for any fragment after the
first.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 24 Feb 2016 17:29:38 +0000 (09:29 -0800)]
flow_dissector: Check for IP fragmentation even if not using IPv4 address
This patch corrects the logic for the IPv4 parsing so that it is consistent
with how we handle IPv6. Specifically, if we do not have the flow key
indicating we want the addresses, we may still need to take a look at the IP
fragmentation bits to see if we should stop after we have recognized
the L3 header.
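The fragmentation decision can be sketched on its own, independent of whether the addresses are wanted. The helper names are hypothetical, frag_off is taken in host byte order for simplicity, and parse_first_frag stands in for a "parse the first fragment" dissector flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define IP_MF     0x2000 /* "more fragments" flag */
#define IP_OFFSET 0x1FFF /* fragment offset mask */

/* True for any fragment: MF set or a non-zero offset. */
static bool ipv4_is_fragment(uint16_t frag_off)
{
	return (frag_off & (IP_MF | IP_OFFSET)) != 0;
}

/* Should dissection stop once the L3 header has been recognized? */
static bool stop_after_l3(uint16_t frag_off, bool parse_first_frag)
{
	if (!ipv4_is_fragment(frag_off))
		return false;

	/* Later fragments carry no L4 header at all. */
	if (frag_off & IP_OFFSET)
		return true;

	/* First fragment: continue only if the caller asked for it. */
	return !parse_first_frag;
}
```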
Fixes: 807e165dc44f ("flow_dissector: Add control/reporting of fragmentation")
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 24 Feb 2016 18:50:23 +0000 (13:50 -0500)]
Merge branch 'mlx5-next'
Saeed Mahameed says:
====================
QoS and VxLAN offloads support for Mellanox 100G mlx5 driver
This patch series introduces QoS IEEE dcbnl support for
PFC, ETS and max rate.
In addition we added VxLAN support and introduced a patch
that modifies the driver to report checksum complete in RX path
for all IP (tunneled and non-tunneled) traffic which is non HW LRO.
This series is applied on top of the latest mlx5_ifc and NDO fixes
we sent to the net tree:
net/mlx5e: Use static constant netdevice ndos
net/mlx5e: Remove select queue ndo initialization
net/mlx5: Use offset based reserved field names in the IFC header file
The QoS patches depend on the IFC change since they expose new fields in
the driver/firmware API. Both QoS and VxLAN patches depend on the NDO changes,
since they add new ndo entries.
Changes from V1:
- Fixed the S.O.B from "Matt" to "Matthew" to be aligned with the committer title.
- Don't populate VxLAN/dcbnl ndos for virtual functions.
- Addressed John's comment on mlx5_setup_tc to be aligned with the latest API changes.
- Added a device ETS capability check prior to querying/modifying the ETS configuration.
- Call mlx5e_dcbnl_ieee_setets_core at the end of mlx5e_create_netdev and don't
fail netdev creation in case it failed or ETS was not supported.
The series was applied on top of:
5270c4dade09 ("Merge branch 'vxlan-cleanups'") +
latest mlx5 ifc and ndo fixes from net tree.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Finlay [Mon, 22 Feb 2016 16:17:34 +0000 (18:17 +0200)]
net/mlx5e: Add TX inner packet counters
Add TSO and TX checksum counters for tunneled, inner packets
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Finlay [Mon, 22 Feb 2016 16:17:33 +0000 (18:17 +0200)]
net/mlx5e: Add TX stateless offloads for tunneling
Add support for TSO and TX checksum when using hw assisted,
tunneled offloads.
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Finlay [Mon, 22 Feb 2016 16:17:32 +0000 (18:17 +0200)]
net/mlx5e: Add netdev support for VXLAN tunneling
If a VXLAN udp dport is added to device it will:
- Configure the hardware to offload the port (up to the max
supported).
- Advertise NETIF_F_GSO_UDP_TUNNEL and supported hw_enc_features.
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Finlay [Mon, 22 Feb 2016 16:17:31 +0000 (18:17 +0200)]
net/mlx5e: Protect en header file from redefinitions
Add an #ifndef include guard to en.h. Needed for the upcoming VXLAN patchset.
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Finlay [Mon, 22 Feb 2016 16:17:30 +0000 (18:17 +0200)]
net/mlx5e: Move to checksum complete
Use checksum complete for all IP packets, unless they are HW LRO,
in which case, use checksum unnecessary.
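The selection rule can be sketched as a small decision function. The enum, the function name and the rxcsum_enabled input are assumptions added for illustration; only the LRO versus non-LRO split comes from the description above.

```c
#include <stdbool.h>

enum csum_mode {
	CSUM_NONE,        /* stack must verify the checksum itself */
	CSUM_UNNECESSARY, /* HW already validated the checksum */
	CSUM_COMPLETE     /* HW supplies the full packet checksum */
};

/* Pick the RX checksum mode to report for one received packet. */
static enum csum_mode rx_csum_mode(bool rxcsum_enabled, bool is_ip,
				   bool is_hw_lro)
{
	if (!rxcsum_enabled)
		return CSUM_NONE;
	if (is_hw_lro)
		return CSUM_UNNECESSARY;
	if (is_ip)
		return CSUM_COMPLETE;
	return CSUM_NONE;
}
```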
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Mon, 22 Feb 2016 16:17:29 +0000 (18:17 +0200)]
net/mlx5e: Wake On LAN support
Implement set/get WOL via ethtool and add the needed
device commands and structures to mlx5_ifc.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Mon, 22 Feb 2016 16:17:28 +0000 (18:17 +0200)]
net/mlx5e: Implement DCBNL IEEE max rate
Add support for DCBNL IEEE get/set max rate.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Mon, 22 Feb 2016 16:17:27 +0000 (18:17 +0200)]
net/mlx5e: Support DCBNL IEEE PFC
Implement the set/get DCBNL IEEE PFC callbacks.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Mon, 22 Feb 2016 16:17:26 +0000 (18:17 +0200)]
net/mlx5e: Support DCBNL IEEE ETS
Support the ndo_setup_tc callback and the needed methods
for multi TC/UP support. Remove default_vlan_prio from
mlx5e_priv, which was always 0; it is replaced with a
hardcoded "0" in the new select queue method.
For that we now create MAX_NUM_TC TISs (one per prio)
on netdevice creation instead of priv->params.num_tc, which
was always 1.
So far each channel had a single TXQ; now each channel has a
TXQ per TC (Traffic Class).
Added en_dcbnl.c which implements the set/get DCBNL IEEE ETS,
set/get dcbx and registers the mlx5e dcbnl ops.
We still use the kernel's default TXQ selection method to select the
channel to transmit through but now we use our own method to select
the TXQ inside the channel based on VLAN priority.
In mlx5, as opposed to mlx4, tc group N gets lower priority than
tc group N+1.
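One way to picture the two-step selection is the sketch below. It is an illustration under assumptions: the channel is taken as already chosen by the default method, prio_tc is a hypothetical priority-to-TC map, and the "channel + tc * num_channels" TXQ layout is assumed, not quoted from the driver.

```c
#include <stdint.h>

#define VLAN_PRIO_SHIFT 13 /* PCP occupies the top 3 bits of the TCI */

/* Extract the 3-bit VLAN priority from a tag control field. */
static unsigned int vlan_prio(uint16_t tci)
{
	return (unsigned int)tci >> VLAN_PRIO_SHIFT;
}

/*
 * Pick the TXQ inside an already selected channel. Untagged packets
 * use priority 0. The TXQ numbering is an assumed layout in which
 * each TC owns a contiguous block of num_channels queues.
 */
static unsigned int select_txq(unsigned int channel,
			       unsigned int num_channels,
			       const uint8_t prio_tc[8],
			       int has_vlan, uint16_t tci)
{
	unsigned int up = has_vlan ? vlan_prio(tci) : 0;
	unsigned int tc = prio_tc[up];

	return channel + tc * num_channels;
}
```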
CC: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Saeed Mahameed [Mon, 22 Feb 2016 16:17:25 +0000 (18:17 +0200)]
net/mlx5: Introduce physical port TC/prio access functions
Add access functions to set and query a physical port's TC groups
and prio parameters.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Mon, 22 Feb 2016 16:17:24 +0000 (18:17 +0200)]
net/mlx5: Introduce physical port PFC access functions
Add access functions to set and query a physical port's PFC
(Priority Flow Control) parameters.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Achiad Shochat [Mon, 22 Feb 2016 16:17:23 +0000 (18:17 +0200)]
net/mlx5: Introduce a new header file for physical port functions
All the device physical port access functions are implemented in the
port.c file.
We just extract the exposure of these functions from driver.h into a
dedicated header file called port.h.
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Craig Gallek [Mon, 22 Feb 2016 15:45:29 +0000 (10:45 -0500)]
soreuseport: fix merge conflict in tcp bind
One of the validation checks for the new array-based TCP SO_REUSEPORT
validation was unintentionally dropped in
ea8add2b1903. This adds it back.
Lack of this check allows the user to allocate multiple sock_reuseport
structures (leaking all but the first).
Fixes: ea8add2b1903 ("tcp/dccp: better use of ephemeral ports in bind()")
Signed-off-by: Craig Gallek <kraig@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Feb 2016 19:52:46 +0000 (14:52 -0500)]
Merge branch 'dsa-pass-bridge-to-drivers'
Vivien Didelot says:
====================
net: dsa: pass bridge device to drivers
This patchset simplifies the DSA layer.
A switch may support multiple bridges with the same hardware VLAN. Thus a check
such as dsa_bridge_check_vlan_range must be moved from the DSA layer to the
concerned driver.
The first purpose of this patchset is to help move this check to the
mv88e6xxx driver, which is the only one affected at the moment.
To do that, pass the bridge net_device structure directly down to the DSA
drivers, instead of calculating a bitmask of bridge members.
The second purpose is to prepare the replacement of the complex
port_vlan_getnext approach. A second patchset is ready to follow, implementing
port_vlan_dump and thus simplifying the DSA slave code one more time.
Note that this patchset applies on top of https://lkml.org/lkml/2016/2/5/532.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 12 Feb 2016 17:09:41 +0000 (12:09 -0500)]
net: dsa: remove dsa_bridge_check_vlan_range
DSA drivers may support multiple bridge groups with the same hardware
VLAN. The mv88e6xxx driver, which cannot yet, already has its own check
for overlapping bridges. Thus remove the check from the DSA layer.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 12 Feb 2016 17:09:40 +0000 (12:09 -0500)]
net: dsa: mv88e6xxx: check hardware VLAN in use
The DSA drivers now have access to the VLAN prepare phase and the bridge
net_device. It is easier to check for overlapping bridges from within
the driver. Thus add such a check in mv88e6xxx_port_vlan_prepare.
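A rough model of such an overlap check, assuming each port records the bridge it belongs to. The types and vlan_overlaps are hypothetical, and VLAN membership is reduced to a 64-bit bitmap (VIDs 0..63) to keep the sketch short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the bridge's net_device. */
struct bridge {
	const char *name;
};

struct port {
	const struct bridge *bridge; /* NULL when not bridged */
	uint64_t vids;               /* bit N set: member of VID N */
};

/*
 * Would adding 'vid' to 'port' share a hardware VLAN with a port
 * that belongs to a different bridge?
 */
static bool vlan_overlaps(const struct port *ports, size_t nports,
			  size_t port, unsigned int vid)
{
	size_t i;

	if (vid >= 64)
		return false; /* outside this sketch's bitmap */

	for (i = 0; i < nports; i++) {
		if (i == port)
			continue;
		if (!(ports[i].vids & (UINT64_C(1) << vid)))
			continue; /* not a member of this VLAN */
		if (ports[i].bridge != ports[port].bridge)
			return true;
	}
	return false;
}
```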
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 12 Feb 2016 17:09:39 +0000 (12:09 -0500)]
net: dsa: pass bridge down to drivers
Some DSA drivers may or may not support multiple software bridges on top
of a hardware switch.
It is more convenient for them to access the bridge's net_device for
finer configuration.
Removing the need to craft and access a bitmask also simplifies the
code.
This patch changes the signature of bridge related functions, updates DSA
drivers, and removes dsa_slave_br_port_mask.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 12 Feb 2016 17:09:38 +0000 (12:09 -0500)]
net: dsa: mv88e6xxx: add port private structure
Add a per-port mv88e6xxx_priv_port structure to store per-port related
data, instead of adding several arrays of DSA_MAX_PORTS elements in the
mv88e6xxx_priv_state structure.
It currently only contains the port STP state.
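The layout change can be pictured as follows. The struct and function names are simplified placeholders rather than the driver's definitions, and the value of DSA_MAX_PORTS is assumed.

```c
#include <stdint.h>

#define DSA_MAX_PORTS 12 /* assumed value for this sketch */

/* Everything that is tracked per port lives in one structure. */
struct priv_port {
	uint8_t stp_state;
};

/* One array of per-port structures instead of one array per field. */
struct priv_state {
	struct priv_port ports[DSA_MAX_PORTS];
};

/* Record the STP state of a port; returns -1 for a bad port number. */
static int port_set_stp_state(struct priv_state *ps, unsigned int port,
			      uint8_t state)
{
	if (port >= DSA_MAX_PORTS)
		return -1;

	ps->ports[port].stp_state = state;
	return 0;
}
```

Adding another per-port field later then only touches the per-port structure.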
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 23 Feb 2016 19:49:17 +0000 (14:49 -0500)]
Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge
Antonio Quartulli says:
====================
pull request [net-next]: batman-adv 20160223
This is a cleanup patchset: first the BATADV_BONDING_TQ_THRESHOLD
constant gets removed as it was defined but not used anywhere,
then all our *_free_ref functions are renamed to *_put in order
to follow the kernel naming convention by Sven Eckelmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 23 Feb 2016 09:37:52 +0000 (12:37 +0300)]
rocker: fix rocker_world_port_obj_vlan_add()
We were changing return values and accidentally made
rocker_world_port_obj_vlan_add() into a no-op.
Fixes: fccd84d44912 ('rocker: return -EOPNOTSUPP for undefined world ops')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:27 +0000 (11:01 +0100)]
batman-adv: Rename batadv_tt_orig_list_entry *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
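For readers unfamiliar with the convention, a generic *_get/*_put pair looks roughly like this. It is a simplified, non-atomic userspace sketch with hypothetical names, whereas kernel code uses atomic reference counters.

```c
#include <stdlib.h>

struct node {
	unsigned int refcount;
	int *released; /* set to 1 on release; only for demonstration */
};

/* Allocate a node holding one reference for the caller. */
static struct node *node_new(int *released)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->refcount = 1;
	n->released = released;
	return n;
}

/* Take an additional reference. */
static struct node *node_get(struct node *n)
{
	n->refcount++;
	return n;
}

/* Drop a reference; the last put frees the object. */
static void node_put(struct node *n)
{
	if (--n->refcount > 0)
		return;
	if (n->released)
		*n->released = 1;
	free(n);
}
```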
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:26 +0000 (11:01 +0100)]
batman-adv: Rename batadv_tt_global_entry *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:25 +0000 (11:01 +0100)]
batman-adv: Rename batadv_tt_local_entry *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:24 +0000 (11:01 +0100)]
batman-adv: Rename batadv_orig_node_vlan *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:23 +0000 (11:01 +0100)]
batman-adv: Rename batadv_nc_path *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:22 +0000 (11:01 +0100)]
batman-adv: Rename batadv_nc_node *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:21 +0000 (11:01 +0100)]
batman-adv: Rename batadv_softif_vlan *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:20 +0000 (11:01 +0100)]
batman-adv: Rename batadv_tvlv_container *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:19 +0000 (11:01 +0100)]
batman-adv: Rename batadv_tvlv_handler *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:18 +0000 (11:01 +0100)]
batman-adv: Rename batadv_gw_node *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>
Sven Eckelmann [Sun, 17 Jan 2016 10:01:17 +0000 (11:01 +0100)]
batman-adv: Rename batadv_dat_entry *_free_ref function to *_put
The batman-adv source code is the only place in the kernel which uses the
*_free_ref naming scheme for the *_put functions. Changing it to *_put
makes it more consistent and makes it easier to understand the connection
to the *_get functions.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <a@unstable.cc>