Tariq Toukan [Wed, 20 Apr 2016 19:02:12 +0000 (22:02 +0300)]
net/mlx5e: Use function pointers for RX data path handling
In preparation for Striding RQ feature, which will need its own
RX handlers.
This patch does not change any functionality.
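A minimal user-space sketch of the idea, assuming a per-queue handler pointer chosen at setup time. All type and function names below are hypothetical stand-ins and are not taken from the mlx5e driver:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the driver's real types. */
struct rx_queue;
typedef int (*rx_handler_fn)(struct rx_queue *rq, const void *cqe);

struct rx_queue {
	rx_handler_fn handle_rx_cqe;	/* selected once at queue setup */
	unsigned long packets;
};

/* Handler for the existing receive queue type. */
static int handle_rx_cqe_default(struct rx_queue *rq, const void *cqe)
{
	(void)cqe;
	rq->packets++;
	return 0;
}

/* Setup installs the handler; a new queue type would install its own. */
static void rx_queue_init(struct rx_queue *rq)
{
	rq->handle_rx_cqe = handle_rx_cqe_default;
	rq->packets = 0;
}

/* The poll loop only calls through the pointer, so it needs no changes
 * when another handler is added later. */
static int rx_poll(struct rx_queue *rq, const void *const *cqes, size_t n)
{
	size_t i;
	int done = 0;

	for (i = 0; i < n; i++) {
		if (rq->handle_rx_cqe(rq, cqes[i]) == 0)
			done++;
	}
	return done;
}
```

With only one handler installed the behaviour is unchanged, which matches the "no functional change" statement above.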
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Achiad Shochat <achiad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Wed, 20 Apr 2016 19:02:11 +0000 (22:02 +0300)]
net/mlx5e: Use only close NUMA node for default RSS
Distribute default RSS table uniformly over the rings of the
close NUMA node, instead of all available channels.
This way we enforce the preference of close rings over far ones.
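As a rough illustration, a uniform spread over a subset of rings can be sketched as below. The sketch assumes the close-node rings are numbered 0..n_close-1, which is an assumption of this example and not something stated in the patch; the function name is hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Fill an indirection table so that its entries cycle over the first
 * n_close rings only (illustrative helper, not driver code). */
static void fill_default_rss(uint32_t *indir, size_t size, uint32_t n_close)
{
	size_t i;

	if (n_close == 0)
		n_close = 1;	/* always keep at least one ring */

	for (i = 0; i < size; i++)
		indir[i] = (uint32_t)(i % n_close);
}
```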
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rana Shahout [Wed, 20 Apr 2016 19:02:10 +0000 (22:02 +0300)]
net/mlx5e: Allocate set of queue counters per netdev
Connect all netdev RQs to this set of queue counters.
Also, add an "rx_out_of_buffer" counter to ethtool,
which indicates RX packet drops due to lack of receive
buffers.
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Wed, 20 Apr 2016 19:02:09 +0000 (22:02 +0300)]
net/mlx5: Introduce device queue counters
A queue counter can collect several statistics for one or more
hardware queues (QPs, RQs, etc ..) that the counter is attached to.
For Ethernet it will provide an "out of buffer" counter which
collects the number of all packets that are dropped due to lack
of software buffers.
Here we add device commands to alloc/query/dealloc queue counters.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 21 Apr 2016 19:06:05 +0000 (15:06 -0400)]
Merge branch 'bcmsysport-napi-updates'
Florian Fainelli says:
====================
net: bcmsysport: utilize newer NAPI APIs
These two patches are very analogous to what was already submitted for
BCMGENET and switch the SYSTEMPORT driver to utilizing __napi_schedule_irqoff()
and napi_complete_done for the RX NAPI context.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 20 Apr 2016 18:37:09 +0000 (11:37 -0700)]
net: bcmsysport: use napi_complete_done()
By using napi_complete_done(), we allow fine tuning of
/sys/class/net/ethX/gro_flush_timeout for higher GRO aggregation
efficiency for a Gbit NIC.
Check commit 24d2e4a50737 ("tg3: use napi_complete_done()") for details.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Wed, 20 Apr 2016 18:37:08 +0000 (11:37 -0700)]
net: bcmsysport: use __napi_schedule_irqoff()
Both bcm_sysport_tx_isr() and bcm_sysport_rx_isr() run in hard irq
context, so we do not need to block irq again.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 21 Apr 2016 18:22:14 +0000 (14:22 -0400)]
Merge branch 'nlattr_align'
Nicolas Dichtel says:
====================
libnl: enhance API to ease 64bit alignment for attribute
Here is a proposal to add more helpers in the libnetlink to manage 64-bit
alignment issues.
Note that this series was only tested on x86 by tweaking
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS and adding some traces.
The first patch adds helpers for 64bit alignment and other patches
use them.
We could also add helpers for nla_put_u64() and its variants if needed.
v1 -> v2:
- remove patch #1
- split patch #2 (now #1 and #2)
- add nla_need_padding_for_64bit()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 21 Apr 2016 16:58:27 +0000 (18:58 +0200)]
ip6mr: align RTA_MFC_STATS on 64-bit
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 21 Apr 2016 16:58:26 +0000 (18:58 +0200)]
ipmr: align RTA_MFC_STATS on 64-bit
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 21 Apr 2016 16:58:25 +0000 (18:58 +0200)]
rtnl: use the new API to align IFLA_STATS*
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 21 Apr 2016 16:58:24 +0000 (18:58 +0200)]
libnl: add more helpers to align attributes on 64-bit
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 19 Apr 2016 18:02:26 +0000 (14:02 -0400)]
veth: Update features to include all tunnel GSO types
This patch adds support for the checksum enabled versions of UDP and GRE
tunnels. With this change we should be able to send and receive GSO frames
of these types over the veth pair without needing to segment the packets.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 19 Apr 2016 18:02:19 +0000 (14:02 -0400)]
netdev_features: Fold NETIF_F_ALL_TSO into NETIF_F_GSO_SOFTWARE
This patch folds NETIF_F_ALL_TSO into the bitmask for NETIF_F_GSO_SOFTWARE.
The idea is to avoid duplication of defines since the only difference
between the two was the GSO_UDP bit.
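The folding can be pictured with plain bitmasks. The bit positions and macro names below are made up for the example and do not match the kernel's feature bits; only the relationship between the two masks is the point:

```c
/* Hypothetical feature bits, for illustration only. */
#define SK_F_TSO		(1u << 0)
#define SK_F_TSO6		(1u << 1)
#define SK_F_TSO_ECN		(1u << 2)
#define SK_F_UFO		(1u << 3)	/* the single differing bit */

#define SK_F_ALL_TSO		(SK_F_TSO | SK_F_TSO6 | SK_F_TSO_ECN)

/* Before: the software GSO mask would repeat every TSO bit.
 * After: it is expressed in terms of SK_F_ALL_TSO plus the extra bit. */
#define SK_F_GSO_SOFTWARE	(SK_F_ALL_TSO | SK_F_UFO)
```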
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Tue, 19 Apr 2016 14:30:56 +0000 (17:30 +0300)]
geneve: testing the wrong variable in geneve6_build_skb()
We intended to test "err" and not "skb".
Fixes: aed069df099c ('ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Heise [Tue, 19 Apr 2016 11:34:28 +0000 (13:34 +0200)]
NLA_BINARY misuse bug in HSR
Removed .type field from NLA to do proper length checking.
Reported by Daniel Borkmann and Julia Lawall.
Signed-off-by: Peter Heise <peter.heise@airbus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 19 Apr 2016 07:10:01 +0000 (15:10 +0800)]
net: use jiffies_to_msecs to replace EXPIRES_IN_MS in inet/sctp_diag
EXPIRES_IN_MS macro comes from net/ipv4/inet_diag.c and dates
back to before jiffies_to_msecs() was introduced.
Now we can remove it and use jiffies_to_msecs().
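A user-space approximation of the two approaches, assuming a tick rate of 250 Hz. The helper names are hypothetical and the conversion below only covers the case where HZ divides 1000 evenly; the kernel's real helper is more general:

```c
#define SK_HZ 250UL	/* assumed tick rate for this sketch */

/* Open-coded conversion in the style of the removed macro: remaining
 * ticks scaled to milliseconds, rounded up. */
static unsigned long expires_in_ms_open_coded(unsigned long expires,
					      unsigned long now)
{
	return ((expires - now) * 1000UL + SK_HZ - 1) / SK_HZ;
}

/* Simplified stand-in for a generic ticks-to-milliseconds helper. */
static unsigned long ticks_to_msecs(unsigned long ticks)
{
	return (1000UL / SK_HZ) * ticks;
}
```

For a timer that expires 100 ticks from now, both paths give 400 ms under the assumed tick rate.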
Suggested-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jakub Sitnicki <jkbs@redhat.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Tue, 19 Apr 2016 03:11:50 +0000 (20:11 -0700)]
perf, bpf: minimize the size of perf_trace_() tracepoint handler
move trace_call_bpf() into helper function to minimize the size
of perf_trace_*() tracepoint handlers.
    text    data     bss      dec     hex filename
10541679 5526646 2945024 19013349 1221ee5 vmlinux_before
10509422 5526646 2945024 18981092 121a0e4 vmlinux_after
It may seem that perf_fetch_caller_regs() can also be moved,
but that is incorrect, since ip/sp will be wrong.
bpf+tracepoint performance is not affected, since
perf_swevent_put_recursion_context() is now inlined.
export_symbol_gpl can also be dropped.
No measurable change in normal perf tracepoints.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 18 Apr 2016 22:24:04 +0000 (18:24 -0400)]
net: dsa: remove tag_protocol from dsa_switch
Having the tag protocol in dsa_switch_driver for setup time and in
dsa_switch_tree for runtime is enough. Remove dsa_switch's one.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 21 Apr 2016 15:52:05 +0000 (11:52 -0400)]
Merge branch '100GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
100GbE Intel Wired LAN Driver Updates 2016-04-20
This series contains updates to fm10k only.
Jacob provides the majority of the changes in this series, starting with the
addition of helper functions to reduce code duplication and the amount
of code indentation. Fixed the use or should we say abuse of the ethtool
stats API, which could result in corrupt memory or misleading statistic
output. Added the appropriate rtnl_lock() and rtnl_unlock() to avoid
RCU warnings during AER events. Come to find out, the PTP/1588 support
is not working with the current version of switch management software
and possibly never worked, so just remove support for PTP/1588 for now.
Fixed how error responses from the switch manager after a LPORT_MAP
request are handled, which originally were silently being ignored.
Fixed up code documentation to hopefully ease the code and comment
comprehension. Fixed a possible NULL pointer dereference after a
kcalloc(): when writing a new default redirection table, we
needed to populate a new RSS table using ethtool_rxfh_indir_default().
We populated this table into a region of memory allocated using kcalloc()
but never checked it for NULL.
Alex adds support for bulk transmit cleanup for fm10k, like he did for
all of our other drivers.
Ngai-Mint fixes a number of issues with the unicast and multicast address
syncs. An issue would occur when the netdev is pre-configured to
either multicast mode and is enabled for the first time.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Thu, 7 Apr 2016 15:52:53 +0000 (08:52 -0700)]
fm10k: fix incorrect IPv6 extended header checksum
Check for and handle IPv6 extended headers so that Tx checksum offload
can be done. Also use skb_checksum_help for unexpected cases. This was
originally discovered in ixgbe.
Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Thu, 7 Apr 2016 15:21:21 +0000 (08:21 -0700)]
fm10k: consistently use Intel(R) for driver names
Update every header file and other locations to consistently use
Intel(R) instead of just Intel. Also update copyright year of files
which we modified.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Thu, 7 Apr 2016 15:21:20 +0000 (08:21 -0700)]
fm10k: fix possible null pointer deref after kcalloc
When writing a new default redirection table, we needed to populate
a new RSS table using ethtool_rxfh_indir_default. We populated this
table into a region of memory allocated using kcalloc, but never checked
this for NULL. Fix this by moving the default table generation into
fm10k_write_reta. If this function is passed a table, use it. Otherwise,
generate the default table using ethtool_rxfh_indir_default, 4 at a
time.
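The shape of the fix can be sketched in user space as follows. The table size, the packing of four 8-bit entries per 32-bit word, and all names are assumptions made for this example; the default-entry helper simply mirrors the "index modulo ring count" behaviour:

```c
#include <stddef.h>
#include <stdint.h>

#define SK_RETA_WORDS 4	/* hypothetical: 4 words, 4 entries per word */

/* Default entry for a slot: spread slots round-robin over the rings. */
static uint32_t indir_default(uint32_t index, uint32_t n_rx_rings)
{
	return index % n_rx_rings;
}

/* Write the table. If the caller passes packed words, use them as-is;
 * otherwise build each word from four default entries, so no temporary
 * allocation (and no NULL check on one) is needed. */
static void write_reta(uint32_t regs[SK_RETA_WORDS], const uint32_t *indir,
		       uint32_t n_rx_rings)
{
	uint32_t i;

	for (i = 0; i < SK_RETA_WORDS; i++) {
		uint32_t word;

		if (indir) {
			word = indir[i];
		} else {
			uint32_t n = i * 4;

			word  = indir_default(n, n_rx_rings);
			word |= indir_default(n + 1, n_rx_rings) << 8;
			word |= indir_default(n + 2, n_rx_rings) << 16;
			word |= indir_default(n + 3, n_rx_rings) << 24;
		}
		regs[i] = word;
	}
}
```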
Fixes: 0ea7fae44094 ("fm10k: use ethtool_rxfh_indir_default for default redirection table")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ngai-Mint Kwan [Fri, 1 Apr 2016 23:17:39 +0000 (16:17 -0700)]
fm10k: Reset multicast mode when deleting lport
Deleting lport when multicast mode is configured to
FM10K_XCAST_MODE_ALLMULTI or FM10K_XCAST_MODE_PROMISC will result in
generating orphaned multicast-group entries in the switch manager.
Before deleting the lport, reset multicast mode to FM10K_XCAST_MODE_NONE
to flush out these multicast-group entries.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:38 +0000 (16:17 -0700)]
fm10k: update comment regarding reserved bits check
The original comment may be read incorrectly as referring to checking
the *entire* length is zero. However, it merely checks only the reserved
bits of both length and reserved in a small amount of code. Update the
comment to indicate this is a clever trick and clearly spell out that it
only checks the reserved bits.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:37 +0000 (16:17 -0700)]
fm10k: use different name than FM10K_VLAN_CLEAR for override bit
Use a new #define FM10K_VLAN_OVERRIDE even though we're using the exact
same bit. The reason for this is clarity in the code, otherwise you can
read FM10K_VLAN_CLEAR and think it should be removed. Also add a comment
explaining why the FM10K_VLAN_OVERRIDE bit is set.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:36 +0000 (16:17 -0700)]
fm10k: use 8bit notation instead of 10bit notation for diagram
The diagram represents bit layout of the multi-bit VLAN update message
format. Typically these diagrams are drawn using some power of 2 as the
base, to more easily grasp where fields split. Although the numbers
above can make it somewhat easy to understand which bit you're looking
at, it makes the break points not line up. Re-draw the numbers using
base 8, and mark the bit values every 8 bits at the top. This should
make it easier to grasp the table quickly.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:35 +0000 (16:17 -0700)]
fm10k: fix documentation of fm10k_tlv_parse_attr
fm10k_tlv_parse_attr is supposed to return FM10K_NOT_IMPLEMENTED for any
TLV whose attribute id lies outside the range of results. It does not do
this today. In addition, the documentation does not indicate that other
attributes which are not implemented for a given TLV will be silently
ignored. Fix this. Clean up the logic so that we don't rely on the fact
that FM10K_NOT_IMPLEMENTED is greater than zero, as this can easily
cause confusion.
A future extension could look into some way of reporting unknown TLVs
in order to make issues more easily discoverable. We can't just return
FM10K_NOT_IMPLEMENTED here because we don't want to drop the entire
message if it has an unknown TLV.
While here, update the copyright year.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:34 +0000 (16:17 -0700)]
fm10k: do not disable PCI device in fm10k_io_error_detected
fm10k_io_error_detected() does not need to call pci_disable_device(). In
the cases where the reset needs to occur, the stack flow will result in
calling fm10k_remove() which already disables the PCI device. If we
leave the pci_disable_device(), we get a warning about disabling
an already disabled device.
Many PCI drivers do call pci_disable_device() in their .error_detected()
routines, but it does not appear to be required. In addition, these
drivers have a check "is_pci_enabled()" call in their remove routines,
which is how they chose to handle the duplicate device disable.
This seems incorrect, since the PCI device structure is reference
counted. It is very possible that the reference count for the PCI device
could be greater than 1. In this case, you would remove the PCI device
within the error_detected routine, reducing count to 1, then remove it
again in the remove function, reducing it to zero. This would result in
yet another disable somewhere else failing. Thus, we shouldn't be using
is_pci_enabled() to check for this issue. Instead, just remove the
extraneous pci_disable_device() found within the error_detected routine.
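The reference-count argument can be followed with a toy model. The structure and function names here are hypothetical and only imitate a counted enable/disable pair; they are not the PCI core's implementation:

```c
#include <stdbool.h>

/* Toy model of a reference-counted enable state. */
struct toy_dev {
	int enable_cnt;
};

static void toy_enable(struct toy_dev *d)
{
	d->enable_cnt++;
}

/* Returns false when asked to disable an already disabled device,
 * which is the situation that would produce a warning. */
static bool toy_disable(struct toy_dev *d)
{
	if (d->enable_cnt <= 0)
		return false;
	d->enable_cnt--;
	return true;
}

static bool toy_is_enabled(const struct toy_dev *d)
{
	return d->enable_cnt > 0;
}
```

With two holders, a disable in the error handler leaves the device "enabled", so an is-enabled check in the remove path passes and disables it a second time; the other holder's later disable is then the one that fails.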
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:33 +0000 (16:17 -0700)]
fm10k: correctly handle LPORT_MAP error
Currently, any error responses from the switch manager after an
LPORT_MAP request are silently ignored. At most the mailbox message will
be reported as an error. This can result in unexpected behavior when the
switch manager has configured a port with zero bandwidth. Add support
for reading the fm10k_swapi_error structure from LPORT_MAP responses.
If the message contains the necessary TLV and has a non-zero error code,
report link down, clear the dglort_map, and delay the next
get_host_state call by a reasonable delay. Also log an error message
indicating that the LPORT_MAP request failed.
The delay prevents an interrupt storm on the switch manager and
drastically reduces the number of mailbox messages we send in this
scenario.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Ngai-Mint Kwan [Fri, 1 Apr 2016 23:17:32 +0000 (16:17 -0700)]
fm10k: Fix multicast mode sync issues
Multicast mode checking is no longer a requirement to perform unicast
and multicast address syncs. Specifically, a device operating in
promiscuous and/or all multicast mode is not excluded. The issue occurs
when the netdev is pre-configured to either multicast mode and is
enabled for the first time. The multicast-group table in the Switch
Manager will be missing obvious multicast entries associated to this
netdev.
Changes were also made to disallow unicast and multicast syncs with
VLAN 0. The Switch Manager considers VLAN 0 to be an invalid entry.
Requests with VLAN 0 by the netdev are only generated when the driver is
freshly installed and the default VID is not set.
Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 23:17:31 +0000 (16:17 -0700)]
fm10k: drop 1588 support
The 1588 support within fm10k does not work correctly with the current
version of the switch management software, and likely never worked
correctly to begin with. Remove support for PTP/1588. Update copyright
year for all these files while we're touching them.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 11 Mar 2016 17:52:32 +0000 (09:52 -0800)]
fm10k: prevent RCU issues during AER events
During an AER action response, we were calling fm10k_close without
holding the rtnl_lock() which could lead to possible RCU warnings being
produced due to 64bit stat updates among other causes. Similarly, we
need rtnl_lock() around fm10k_open during fm10k_io_resume. Follow the
same pattern elsewhere in the driver and protect the entire open/close
sequence.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Thu, 10 Mar 2016 00:36:08 +0000 (16:36 -0800)]
fm10k: use DRV_SUMMARY to reduce code duplication
Use DRV_SUMMARY, similar to DRV_VERSION so that we don't have to
duplicate the driver summary in multiple places.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alexander Duyck [Mon, 7 Mar 2016 17:30:15 +0000 (09:30 -0800)]
fm10k: Add support for bulk Tx cleanup & cleanup boolean logic
This patch enables bulk free in Tx cleanup for fm10k and cleans up the
boolean logic in the polling routines for fm10k in the hopes of avoiding
any mix-ups similar to what occurred with i40e and i40evf.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 4 Mar 2016 23:37:48 +0000 (15:37 -0800)]
fm10k: remove debug-statistics support
This change fixes an (ab)use of the ethtool stats API, which could
result in corrupt memory or misleading stat output. The ethtool stats
API is not robust enough to handle varying number of statistics due to
how it requests the size and allocates memory. Remove the poorly conceived
support originally added for extra debug statistics. In the future,
a new stats API may open up the ability to display these statistics.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Fri, 1 Apr 2016 18:15:09 +0000 (11:15 -0700)]
fm10k: add helper functions to set strings and data for ethtool stats
Reduce duplicate code and the amount of indentation by adding
fm10k_add_stat_strings and fm10k_add_ethtool_stats functions which help
add fm10k_stat structures to the ethtool stats callbacks. This helps
increase ease of use for future stat additions, and increases code
readability.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Roopa Prabhu [Wed, 20 Apr 2016 15:43:43 +0000 (08:43 -0700)]
rtnetlink: add new RTM_GETSTATS message to dump link stats
This patch adds a new RTM_GETSTATS message to query link stats via netlink
from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK
returns a lot more than just stats and is expensive in some cases when
frequent polling for stats from userspace is a common operation.
RTM_GETSTATS is an attempt to provide a light weight netlink message
to explicitly query only link stats from the kernel on an interface.
The idea is to also keep it extensible so that new kinds of stats can be
added to it in the future.
This patch adds the following attribute for NETDEV stats:
struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = {
        [IFLA_STATS_LINK_64]  = { .len = sizeof(struct rtnl_link_stats64) },
};
Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of
a single interface or all interfaces with NLM_F_DUMP.
Future possible new types of stat attributes:
link af stats:
- IFLA_STATS_LINK_IPV6 (nested. for ipv6 stats)
- IFLA_STATS_LINK_MPLS (nested. for mpls/mdev stats)
extended stats:
- IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge,
vlan, vxlan etc)
- IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are
available via ethtool today)
This patch also declares a filter mask for all stat attributes.
User has to provide a mask of stats attributes to query. filter mask
can be specified in the new hdr 'struct if_stats_msg' for stats messages.
Other important field in the header is the ifindex.
This api can also include attributes for global stats (eg tcp) in the future.
When global stats are included in a stats msg, the ifindex in the header
must be zero. A single stats message cannot contain both global and
netdev specific stats. To easily distinguish them, netdev specific stat
attributes name are prefixed with IFLA_STATS_LINK_
Without any attributes in the filter_mask, no stats will be returned.
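A small sketch of how a filter mask in the message header could gate which stats are returned. The header layout, the "attribute N maps to bit N - 1" encoding, and every name with a sketch suffix are assumptions of this example, not definitions copied from the patch:

```c
#include <stdint.h>

/* Local stand-in for the stats message header (assumed layout). */
struct if_stats_msg_sketch {
	uint8_t  family;
	uint8_t  pad1;
	uint16_t pad2;
	uint32_t ifindex;	/* zero would mean "global stats" */
	uint32_t filter_mask;	/* one bit per requested stats attribute */
};

enum { STATS_UNSPEC_SKETCH, STATS_LINK_64_SKETCH };

/* Attribute N maps to bit N - 1 (an assumed encoding). */
static uint32_t stats_filter_bit(unsigned int attr)
{
	return (uint32_t)1 << (attr - 1);
}

/* An attribute is only reported when its bit is set in the mask. */
static int stats_attr_requested(const struct if_stats_msg_sketch *hdr,
				unsigned int attr)
{
	return (hdr->filter_mask & stats_filter_bit(attr)) != 0;
}
```

An empty mask therefore selects nothing, which is consistent with the rule that no stats are returned without attributes in the filter mask.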
This patch has been tested with modified iproute2 ifstat.
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 20 Apr 2016 19:32:54 +0000 (15:32 -0400)]
net: nla_align_64bit() needs to test the right pointer.
Netlink messages are appended, one object at a time, to the end of
the SKB. Therefore we need to test skb_tail_pointer() not skb->data
for alignment.
Fixes: 35c5845957c7 ("net: Add helpers for 64-bit aligning netlink attributes.")
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 20 Apr 2016 14:31:31 +0000 (07:31 -0700)]
net: fix HAVE_EFFICIENT_UNALIGNED_ACCESS typos
HAVE_EFFICIENT_UNALIGNED_ACCESS needs CONFIG_ prefix.
Also add a comment in nla_align_64bit() explaining we have
to add a padding if current skb->data is aligned, as it
certainly can be confusing.
Fixes: 35c5845957c7 ("net: Add helpers for 64-bit aligning netlink attributes.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Heise [Wed, 20 Apr 2016 07:08:29 +0000 (09:08 +0200)]
net/hsr: Fixed version field in ENUM
New field (IFLA_HSR_VERSION) was added in the middle of an existing
ENUM and would break kernel ABI, therefore moved to the end.
Reported by Stephen Hemminger.
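Why the position matters can be shown with three small enums. The enumerator names are invented for the example; only the numbering behaviour is relevant:

```c
/* Values as seen by user space built before the new field existed. */
enum attr_old {
	OLD_UNSPEC,
	OLD_SLAVE1,
	OLD_SLAVE2,
};

/* New field inserted in the middle: every later value shifts by one,
 * so old binaries now send and parse the wrong attribute numbers. */
enum attr_bad {
	BAD_UNSPEC,
	BAD_SLAVE1,
	BAD_VERSION,
	BAD_SLAVE2,
};

/* New field appended at the end: existing values keep their numbers. */
enum attr_good {
	GOOD_UNSPEC,
	GOOD_SLAVE1,
	GOOD_SLAVE2,
	GOOD_VERSION,
};
```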
Signed-off-by: Peter Heise <peter.heise@airbus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 18 Apr 2016 20:10:24 +0000 (16:10 -0400)]
net: dsa: kill circular reference with slave priv
The dsa_slave_priv structure does not need a pointer to its net_device.
Kill it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 20 Apr 2016 00:26:11 +0000 (20:26 -0400)]
Merge branch 'bpf_event_output'
Daniel Borkmann says:
====================
BPF updates
This minor set adds a new helper bpf_event_output() for eBPF cls/act
program types which allows passing events to user space applications.
For details, please see individual patches.
v1 -> v2:
- Address kbuild bot found compile issue in patch 2
- Rest as is
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 18 Apr 2016 19:01:24 +0000 (21:01 +0200)]
bpf: add event output helper for notifications/sampling/logging
This patch adds a new helper for cls/act programs that can push events
to user space applications. For networking, this can be f.e. for sampling,
debugging, logging purposes or pushing of arbitrary wake-up events. The
idea is similar to a43eec304259 ("bpf: introduce bpf_perf_event_output()
helper") and 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example").
The eBPF program utilizes a perf event array map that user space populates
with fds from perf_event_open(), the eBPF program calls into the helper
f.e. as skb_event_output(skb, &my_map, BPF_F_CURRENT_CPU, raw, sizeof(raw))
so that the raw data is pushed into the fd f.e. at the map index of the
current CPU.
User space can poll/mmap/etc on this and has a data channel for receiving
events that can be post-processed. The nice thing is that since the eBPF
program and user space application making use of it are tightly coupled,
they can define their own arbitrary raw data format and what/when they
want to push.
While f.e. packet headers could be one part of the meta data that is being
pushed, this is not a substitute for things like packet sockets as whole
packet is not being pushed and push is only done in a single direction.
Intention is more of a generically usable, efficient event pipe to applications.
Workflow is that tc can pin the map and applications can attach themselves
e.g. after cls/act setup to one or multiple map slots, demuxing is done by
the eBPF program.
Adding this facility takes minimal effort: it reuses the helper
introduced in a43eec304259 ("bpf: introduce bpf_perf_event_output() helper")
and we get its functionality for free by overloading its BPF_FUNC_ identifier
for cls/act programs. ctx is currently unused, but will be made use of in
future. Example will be added to iproute2's BPF example files.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 18 Apr 2016 19:01:23 +0000 (21:01 +0200)]
bpf, trace: add BPF_F_CURRENT_CPU flag for bpf_perf_event_output
Add a BPF_F_CURRENT_CPU flag to optimize the use-case where user space has
per-CPU ring buffers and the eBPF program pushes the data into the current
CPU's ring buffer which saves us an extra helper function call in eBPF.
Also, make sure to properly reserve the remaining flags which are not used.
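A user-space sketch of how such a flag could be resolved to a map index. The mask value, the choice of "all index bits set" as the current-CPU marker, and the function name are assumptions of this example:

```c
#include <errno.h>
#include <stdint.h>

/* Assumed encoding: the low 32 bits carry an index, and the all-ones
 * index value means "use the current CPU". */
#define SK_F_INDEX_MASK		0xffffffffULL
#define SK_F_CURRENT_CPU	SK_F_INDEX_MASK

/* Returns the resolved index, or a negative errno value. */
static long resolve_index(uint64_t flags, uint32_t current_cpu,
			  uint32_t max_entries)
{
	uint64_t index = flags & SK_F_INDEX_MASK;

	/* Remaining flag bits are reserved and must be zero. */
	if (flags & ~SK_F_INDEX_MASK)
		return -EINVAL;

	/* Saves the program a separate "which CPU am I on" helper call. */
	if (index == SK_F_CURRENT_CPU)
		index = current_cpu;

	if (index >= max_entries)
		return -E2BIG;

	return (long)index;
}
```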
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julia Lawall [Mon, 18 Apr 2016 14:55:35 +0000 (16:55 +0200)]
arcnet: com90xx: add __init attribute
Add __init attribute on a function that is only called from other __init
functions and that is not inlined, at least with gcc version 4.8.4 on an
x86 machine with allyesconfig. Currently, the function is put in the
.text.unlikely segment. Declaring it as __init will cause it to be put in
the .init.text and to disappear after initialization.
The result of objdump -x on the function before the change is as follows:
0000000000000000 l     F .text.unlikely 00000000000000bf check_mirror
And after the change it is as follows:
0000000000000000 l     F .init.text 00000000000000ba check_mirror
Done with the help of Coccinelle. The semantic patch checks for local
static non-init functions that are called from an __init function and are
not called from any other function.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Michael Grzeschik <mgr@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Konstantin Khlebnikov [Mon, 18 Apr 2016 11:41:17 +0000 (14:41 +0300)]
net/ipv6/addrconf: fix sysctl table indentation
Separated from previous patch for readability.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Konstantin Khlebnikov [Mon, 18 Apr 2016 11:41:10 +0000 (14:41 +0300)]
net/ipv6/addrconf: simplify sysctl registration
Struct ctl_table_header holds pointer to sysctl table which could be used
for freeing it after unregistration. IPv4 sysctls already use that.
Remove redundant NULL assignment: ndev allocated using kzalloc.
This also saves some bytes: sysctl table could be shorter than
DEVCONF_MAX+1 if some options are disabled in config.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Apr 2016 23:49:29 +0000 (19:49 -0400)]
net: Add helpers for 64-bit aligning netlink attributes.
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Suggested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 19 Apr 2016 18:30:10 +0000 (14:30 -0400)]
net: Align IFLA_STATS64 attributes properly on architectures that need it.
Since the nlattr header is 4 bytes in size, it can cause the netlink
attribute payload to not be 8-byte aligned.
This is particularly troublesome for IFLA_STATS64 which contains 64-bit
statistic values.
Solve this by creating a dummy IFLA_PAD attribute which has a payload
which is zero bytes in size. When HAVE_EFFICIENT_UNALIGNED_ACCESS is
false, we insert an IFLA_PAD attribute into the netlink response when
necessary such that the IFLA_STATS64 payload will be properly aligned.
With help and suggestions from Eric Dumazet.
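The alignment arithmetic can be checked with a short sketch. It assumes the write position is already 4-byte aligned, and the helper names are hypothetical:

```c
#include <stdint.h>

#define SK_NLATTR_HDR_LEN 4	/* attribute header size, as stated above */

/* Padding is needed exactly when the current write position is already
 * 8-byte aligned: the next 4-byte header would then push the payload
 * to an address that is only 4-byte aligned. */
static int need_pad_for_64bit(uintptr_t tail)
{
	return (tail % 8) == 0;
}

/* Address at which the 64-bit payload ends up after an optional
 * zero-length pad attribute (header only) and the real header. */
static uintptr_t payload_addr(uintptr_t tail)
{
	if (need_pad_for_64bit(tail))
		tail += SK_NLATTR_HDR_LEN;	/* pad attribute, no payload */
	return tail + SK_NLATTR_HDR_LEN;	/* real attribute header */
}
```

Either way the payload lands on an 8-byte boundary: from position 16 the pad moves it to 24, and from position 20 the header alone moves it to 24.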
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Mon, 18 Apr 2016 21:58:30 +0000 (23:58 +0200)]
net: w5100: don't build spi driver without w5100
The w5100-spi driver front-end only makes sense when the w5100
core driver is enabled, not for a configuration that only has w5300:
drivers/net/built-in.o: In function `w5100_spi_remove':
drivers/net/ethernet/wiznet/w5100-spi.c:277: undefined reference to `w5100_remove'
drivers/net/built-in.o: In function `w5100_spi_probe':
drivers/net/ethernet/wiznet/w5100-spi.c:272: undefined reference to `w5100_probe'
drivers/net/built-in.o: In function `w5200_spi_init':
drivers/net/ethernet/wiznet/w5100-spi.c:125: undefined reference to `w5100_ops_priv'
drivers/net/built-in.o: In function `w5200_spi_readbulk':
drivers/net/ethernet/wiznet/w5100-spi.c:125: undefined reference to `w5100_ops_priv'
drivers/net/built-in.o: In function `w5200_spi_writebulk':
drivers/net/ethernet/wiznet/w5100-spi.c:125: undefined reference to `w5100_ops_priv'
drivers/net/built-in.o:(.data+0x3ed1c): undefined reference to `w5100_pm_ops'
This adds an appropriate Kconfig dependency.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 630cf09751fe ("net: w5100: support SPI interface mode")
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Sat, 16 Apr 2016 20:29:33 +0000 (22:29 +0200)]
bpf: avoid warning for wrong pointer cast
Two new functions in bpf contain a cast from a 'u64' to a
pointer. This works on 64-bit architectures but causes a warning
on all 32-bit architectures:
kernel/trace/bpf_trace.c: In function 'bpf_perf_event_output_tp':
kernel/trace/bpf_trace.c:350:13: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
u64 ctx = *(long *)r1;
This changes the cast to first convert the u64 argument into a uintptr_t,
which is guaranteed to be the same size as a pointer.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 9940d67c93b5 ("bpf: support bpf_get_stackid() and bpf_perf_event_output() in tracepoint programs")
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
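The cast pattern described in the bpf commit above can be sketched in plain
C as follows. The helper names are hypothetical and the u64 typedef is a
userspace stand-in for the kernel type; this illustrates the technique, not
the actual code in kernel/trace/bpf_trace.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 */

/* Carry a pointer inside a 64-bit value. */
static u64 ptr_to_u64(const void *p)
{
	return (u64)(uintptr_t)p;
}

/*
 * Convert back. Narrowing to uintptr_t first means the final cast is
 * between an integer and a pointer of matching size, which avoids the
 * int-to-pointer-cast warning on 32-bit targets.
 */
static void *u64_to_ptr(u64 v)
{
	assert(v <= UINTPTR_MAX);	/* the value must fit in a pointer */
	return (void *)(uintptr_t)v;
}
```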
Sergei Shtylyov [Sat, 16 Apr 2016 22:05:19 +0000 (01:05 +0300)]
of_mdio: make of_mdiobus_register_{device|phy}() *void*
The results of of_mdiobus_register_{device|phy}() are never checked, so we
can make both these functions *void*...
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Govindarajulu Varadarajan [Fri, 15 Apr 2016 19:10:43 +0000 (00:40 +0530)]
enic: set netdev->vlan_features
The driver sets netdev->vlan_features to netdev->features, as the hardware
supports all of them on VLAN interfaces.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
KY Srinivasan [Thu, 14 Apr 2016 23:31:54 +0000 (16:31 -0700)]
hv_netvsc: Implement support for VF drivers on Hyper-V
Support VF drivers on Hyper-V. On Hyper-V, each VF instance presented to
the guest has an associated synthetic interface that shares the MAC address
with the VF instance. Typically these are bonded together to support
live migration. By default, the host delivers all the incoming packets
on the synthetic interface. Once the VF is up, we need to explicitly switch
the data path on the host to divert traffic onto the VF interface. Even after
switching the data path, broadcast and multicast packets are always delivered
on the synthetic interface and these will have to be injected back onto the
VF interface (if VF is up).
This patch implements the necessary support in netvsc to support Linux
VF drivers.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 18 Apr 2016 18:45:09 +0000 (14:45 -0400)]
Merge branch 'fec-ksettings'
Philippe Reynes says:
====================
fec: ethtool: move to new api {get|set}_link_ksettings
Ethtool has a new api {get|set}_link_ksettings that deprecates
the old api {get|set}_settings. We update the fec driver to use
this new ethtool api.
For this first version, I've converted old u32 value in phy structure
to link_modes structure. Another way would be to replace u32 in
phy structure to use DECLARE_LINK_MODE_MASK for advertising, ....
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe Reynes [Thu, 14 Apr 2016 22:35:01 +0000 (00:35 +0200)]
fec: move to new ethtool api {get|set}_link_ksettings
The ethtool api {get|set}_settings is deprecated.
We move the fec driver to new api {get|set}_link_ksettings.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe Reynes [Thu, 14 Apr 2016 22:35:00 +0000 (00:35 +0200)]
phy: add generic function to support ksetting support
The old ethtool api (get_settings and set_settings) has
generic phy functions phy_ethtool_sset and phy_ethtool_gset.
To support the new ethtool api (get_link_ksettings and
set_link_ksettings), we add generic phy functions
phy_ethtool_ksettings_get and phy_ethtool_ksettings_set.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Philippe Reynes [Thu, 14 Apr 2016 22:34:59 +0000 (00:34 +0200)]
net: ethtool: export conversion function between u32 and link mode
The functions convert_legacy_u32_to_link_mode and
convert_link_mode_to_legacy_u32 may be used outside
of ethtool.c. We rename them to ethtool_convert_...
and export them, so we can use them in other
drivers and modules.
Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
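To illustrate what a conversion between a legacy u32 mask and a wider link
mode bitmap involves, here is a rough userspace sketch. The names, the
two-word bitmap size, and the return convention are assumptions made for
this example; the exported kernel helpers may differ in detail.

```c
#include <stdint.h>
#include <string.h>

#define EX_LINK_MODE_WORDS 2	/* hypothetical bitmap size, in longs */

/* Widen a legacy 32-bit mask: clear the bitmap, then copy the low bits. */
static void ex_legacy_u32_to_link_mode(unsigned long *dst, uint32_t legacy)
{
	memset(dst, 0, EX_LINK_MODE_WORDS * sizeof(*dst));
	dst[0] = legacy;
}

/*
 * Narrow a bitmap to a legacy 32-bit mask. Returns 1 if the conversion is
 * exact, 0 if bits above bit 31 were set and had to be dropped.
 */
static int ex_link_mode_to_legacy_u32(uint32_t *legacy,
				      const unsigned long *src)
{
	int exact = 1;
	size_t i;

	for (i = 1; i < EX_LINK_MODE_WORDS; i++)
		if (src[i])
			exact = 0;
	/* two 16-bit shifts stay well defined even when long is 32 bits */
	if ((src[0] >> 16) >> 16)
		exact = 0;
	*legacy = (uint32_t)src[0];
	return exact;
}
```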
Paolo Abeni [Thu, 14 Apr 2016 16:39:39 +0000 (18:39 +0200)]
tun: don't require serialization lock on tx
The current tun_net_xmit() implementation doesn't need any external
lock since it relies on rcu protection for the tun data structure
and on socket queue lock for skb queuing.
This patch sets the NETIF_F_LLTX feature bit in the tun device, so
that on xmit, in absence of qdisc, no serialization lock is acquired
by the caller.
The user space can remove the default tun qdisc with:
tc qdisc replace dev <tun device name> root noqueue
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Mon, 18 Apr 2016 08:44:49 +0000 (11:44 +0300)]
udp: fix if statement in SIOCINQ ioctl
We deleted a line of code and accidentally made the "return put_user()"
part of the if statement when it's supposed to be unconditional.
Fixes: 9f9a45beaa96 ('udp: do not expect udp headers on ioctl SIOCINQ')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Sat, 16 Apr 2016 03:36:25 +0000 (20:36 -0700)]
rtnetlink: rtnl_fill_stats: avoid an unnecssary stats copy
This patch passes netlink attr data ptr directly to dev_get_stats
thus eliminating a stats copy.
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 17 Apr 2016 22:54:15 +0000 (18:54 -0400)]
Merge branch 'dsa-mv88e6xxx-switch-factorization'
Vivien Didelot says:
====================
net: dsa: mv88e6xxx: factorize switch info
This patchset factorizes the mv88e6xxx code by sharing a new extendable
info structure to store static data such as switch family, product
number, number of ports, number of databases and the name.
The next step is to add a "flags" bitmap member to the info structure in
order to simplify the shared code with a feature-based logic instead of
checking their family/ID.
This is a step toward having a single mv88e6xxx driver supporting many
similar devices, like any usual Linux driver.
Changes v3 -> v4:
- constify probed name in DSA
- rebase patchset above conflicting commit 48ace4e
Changes v2 -> v3:
- update commit messages and add Andrew's tags
- keep the info lookup code in a separated function
- split the single switch ID reading in probe in a new commit
Changes v1 -> v2:
- define PORT_SWITCH_ID_PROD_NUM_* values
- use plain struct mv88e6xxx_info
- remove non used yet ps->rev
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:24:03 +0000 (13:24 -0400)]
net: dsa: mv88e6xxx: remove switch ID from ps
ps->id is not needed anymore, so remove it as well as the related
defined values.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:24:02 +0000 (13:24 -0400)]
net: dsa: mv88e6xxx: add number of db to info
Add the number of databases to the info structure.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:24:01 +0000 (13:24 -0400)]
net: dsa: mv88e6xxx: add number of ports to info
Drop the ps->num_ports variable in favor of a new member of the info
structure. This removes the need to assign it at setup time.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:24:00 +0000 (13:24 -0400)]
net: dsa: mv88e6xxx: add family to info
Add an mv88e6xxx_family enum to the info structure for better family
identification.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:23:59 +0000 (13:23 -0400)]
net: dsa: mv88e6xxx: add switch info
Add a new switch info structure which is meant to store switch models
static information, such as product number, name, number of ports,
number of databases, etc.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
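A rough sketch of the kind of static info table this series describes, with
a lookup by product number. All names and values here (the ex_ prefix, the
product numbers, port and database counts) are hypothetical and chosen only
for illustration; they are not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

enum ex_family { EX_FAMILY_A, EX_FAMILY_B };	/* hypothetical families */

/* Static, per-model data kept in one place instead of per-driver code. */
struct ex_switch_info {
	uint16_t prod_num;
	enum ex_family family;
	const char *name;
	unsigned int num_ports;
	unsigned int num_databases;
};

static const struct ex_switch_info ex_table[] = {
	{ 0x100, EX_FAMILY_A, "Example Switch A", 6, 16 },
	{ 0x200, EX_FAMILY_B, "Example Switch B", 7, 64 },
};

/* Return the entry matching prod_num, or NULL for an unknown model. */
static const struct ex_switch_info *ex_lookup_info(uint16_t prod_num)
{
	size_t i;

	for (i = 0; i < sizeof(ex_table) / sizeof(ex_table[0]); i++)
		if (ex_table[i].prod_num == prod_num)
			return &ex_table[i];
	return NULL;
}
```

With such a table, fields like the number of ports no longer need to be
assigned at setup time; shared code reads them from the matched entry.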
Vivien Didelot [Sun, 17 Apr 2016 17:23:58 +0000 (13:23 -0400)]
net: dsa: mv88e6xxx: read switch ID in probe
Read the switch ID only once, at probe time, to avoid multiple read
accesses and MII bus checking.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:23:57 +0000 (13:23 -0400)]
net: dsa: mv88e6xxx: drop revision probing
There is no point in having a special case for the revision when probing
a switch model. The code gets cluttered with unnecessary defines, and
leads to errors when code such as mv88e6131_setup compares
PORT_SWITCH_ID_6131_B2 to ps->id which masks the revision.
Drop every revision definition, and lookup only the product number.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
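The class of error mentioned in the revision probing commit above can be
sketched like this: once the stored ID has its revision bits masked off, a
comparison against a constant that includes a revision can never match. The
mask and ID values below are hypothetical, not the driver's definitions.

```c
#include <stdint.h>

#define EX_PROD_NUM_MASK	0xfff0u	/* hypothetical: upper bits hold the product number */
#define EX_ID_6131		0x1060u	/* hypothetical product number */
#define EX_ID_6131_B2		0x1066u	/* hypothetical product number plus revision */

/* What probe code might store: the raw ID with the revision masked off. */
static uint16_t ex_stored_id(uint16_t raw_id)
{
	return (uint16_t)(raw_id & EX_PROD_NUM_MASK);
}

/* Buggy check: the stored value has already lost the revision bits. */
static int ex_is_6131_b2_buggy(uint16_t raw_id)
{
	return ex_stored_id(raw_id) == EX_ID_6131_B2;
}

/* Looking up only the product number behaves as expected. */
static int ex_is_6131(uint16_t raw_id)
{
	return ex_stored_id(raw_id) == EX_ID_6131;
}
```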
Vivien Didelot [Sun, 17 Apr 2016 17:23:56 +0000 (13:23 -0400)]
net: dsa: mv88e6xxx: drop double ds assignment
Every driver assigns ps->ds even though it gets assigned in the shared
mv88e6xxx_setup_common function. Kill redundancy.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 17 Apr 2016 17:23:55 +0000 (13:23 -0400)]
net: dsa: constify probed name
Change the dsa_switch_driver.probe function to return a const char *.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 17 Apr 2016 02:34:40 +0000 (22:34 -0400)]
Merge branch 'nfp-next'
Jakub Kicinski says:
====================
nfp: cleanups and improvements
Main purpose of this set is to get rid of doing potentially long
mdelay()s but it also contains some trivial changes I've accumulated.
First two patches fix harmless copy-paste errors, next two clean up
the documentation and remove unused defines. Patch 5 clarifies the
interpretation of RX descriptor fields. Patch 6, by far the biggest,
adds ability to perform FW reconfig asynchronously thanks to which
we can stop using mdelay().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 16 Apr 2016 10:25:54 +0000 (11:25 +0100)]
nfp: add async reconfiguration mechanism
Some callers of nfp_net_reconfig() are in atomic context so
we used to busy wait for commands to complete. In worst case
scenario that means locking up a core for up to 5 seconds
when a command times out. Let's add a timer-based mechanism
of asynchronously checking whether reconfiguration completed
successfully for atomic callers to use. Non-atomic callers
can now just sleep.
The approach taken is quite simple because (1) synchronous
reconfigurations always happen under RTNL (or before device
is registered); (2) we can coalesce pending reconfigs.
There is no need for request queues; a timer which eventually
checks the reconfiguration result to report errors is
good enough.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 16 Apr 2016 10:25:53 +0000 (11:25 +0100)]
nfp: remove buggy RX buffer length validation
Meaning of data_len and meta_len RX WB descriptor fields is
slightly confusing. Add a comment with a diagram clarifying
the layout. Also remove the buffer length validation:
(a) it's imprecise for static rx-offsets; (b) if firmware
is buggy enough to DMA past the end of the buffer
WARN_ON_ONCE() doesn't seem like a strong enough response.
skb_put() will do the checking for us anyway.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 16 Apr 2016 10:25:52 +0000 (11:25 +0100)]
nfp: remove unused suspicious mask defines
NFP_NET_RXR_MASK sounds like a mask which could be used on
NFP_NET_CFG_RXRS_ENABLE register but its value is quite
strange. In fact there are no users of this define so let's
just remove it. Same for TX rings.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 16 Apr 2016 10:25:51 +0000 (11:25 +0100)]
nfp: correct names of constants in comments
Documentation in comments lacks CFG in some names.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 16 Apr 2016 10:25:50 +0000 (11:25 +0100)]
nfp: remove unnecessary static
There is no reason for those local variables to be static.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 16 Apr 2016 10:25:49 +0000 (11:25 +0100)]
nfp: check the right pointer for errors
Correct checking error condition on wrong pointer -
copy/paste mistake most likely.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 17 Apr 2016 02:02:14 +0000 (22:02 -0400)]
Merge branch 'IFF_NO_QUEUE-followups'
Phil Sutter says:
====================
Minor IFF_NO_QUEUE conversion follow-up
The following series converts two further drivers away from setting
'tx_queue_len = 0' to adding IFF_NO_QUEUE to priv_flags instead.
The first one, rtl8188eu in staging didn't exist back when all drivers
were converted. The second one, openvswitch seems to have slipped through
my grep'ing back then, no idea why.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Fri, 15 Apr 2016 17:14:20 +0000 (19:14 +0200)]
openvswitch: Convert to using IFF_NO_QUEUE
Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Phil Sutter [Fri, 15 Apr 2016 17:14:19 +0000 (19:14 +0200)]
staging: rtl8188eu: Convert to using IFF_NO_QUEUE
Cc: Jakub Sitnicki <jsitnicki@gmail.com>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 17 Apr 2016 01:51:01 +0000 (21:51 -0400)]
Merge branch 'fjes-next'
Taku Izumi says:
====================
FUJITSU Extended Socket driver version 1.1
This patchset updates the FUJITSU Extended Socket network driver to version 1.1.
It mainly includes some improvements and minor bugfixes.
v1->v2:
- Remove ioctl and debugfs facility according to David's comment
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Taku Izumi [Fri, 15 Apr 2016 02:25:52 +0000 (11:25 +0900)]
fjes: Update fjes driver version : 1.1
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taku Izumi [Fri, 15 Apr 2016 02:25:46 +0000 (11:25 +0900)]
fjes: Introduce spinlock for rx_status
This patch introduces a spinlock for rx_status for
proper exclusive control.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taku Izumi [Fri, 15 Apr 2016 02:25:40 +0000 (11:25 +0900)]
fjes: Enhance changing MTU related work
This patch enhances the fjes_change_mtu() method
by introducing a new flag named FJES_RX_MTU_CHANGING_DONE
in rx_status. At the same time, the default MTU value is
changed to 65510 bytes.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taku Izumi [Fri, 15 Apr 2016 02:25:34 +0000 (11:25 +0900)]
fjes: fix bitwise check bug in fjes_raise_intr_rxdata_task
In fjes_raise_intr_rxdata_task(), the bitwise check is wrong
because "& FJES_RX_POLL_WORK" is missing.
This patch fixes this bug.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
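The kind of bug fixed above, a status test that omits the flag mask, can be
illustrated as follows. The exact expression in the driver is not shown in
this log, so the flag values and function names below are hypothetical.

```c
#include <stdint.h>

#define EX_RX_STOP_REQ_DONE	0x1u	/* hypothetical flag bits */
#define EX_RX_POLL_WORK		0x8u

/* Buggy form: without the mask, any set bit looks like "poll work". */
static int ex_is_polling_buggy(uint32_t rx_status)
{
	return rx_status != 0;
}

/* Fixed form: test only the bit of interest. */
static int ex_is_polling(uint32_t rx_status)
{
	return (rx_status & EX_RX_POLL_WORK) != 0;
}
```

The two forms agree only while no other flag is set in rx_status, which is
why this kind of bug can go unnoticed until a second flag comes into use.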
Taku Izumi [Fri, 15 Apr 2016 02:25:27 +0000 (11:25 +0900)]
fjes: fix incorrect statistics information in fjes_xmit_frame()
Statistics are accounted incorrectly in fjes_xmit_frame().
Accounting the driver's own stats is wrong; accounting the stats of
the other EPs being transmitted to is right.
This patch fixes this bug.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taku Izumi [Fri, 15 Apr 2016 02:25:21 +0000 (11:25 +0900)]
fjes: optimize timeout value
This patch optimizes the following timeout values:
- FJES_DEVICE_RESET_TIMEOUT
- FJES_COMMAND_REQ_TIMEOUT
- FJES_COMMAND_REQ_BUFF_TIMEOUT
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dinh Nguyen [Fri, 15 Apr 2016 01:42:29 +0000 (20:42 -0500)]
stmmac: socfpga: remove extra call to socfpga_dwmac_setup
In the socfpga_dwmac_probe function, we have a call to socfpga_dwmac_setup,
which is already called from socfpga_dwmac_init later in the probe function.
Remove this extra call to socfpga_dwmac_setup.
Also we should not be calling socfpga_dwmac_setup() directly without wrapping
it in the proper reset assert/deasserts. That is because
socfpga_dwmac_setup() sets up PHY modes in the system manager, and it
requires the EMACs to be in reset during the PHY setup.
Reported-by: Matthew Gerlach <mgerlach@opensource.altera.com>
Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Thu, 14 Apr 2016 21:47:12 +0000 (23:47 +0200)]
dsa: mv88e6xxx: Kill the REG_READ and REG_WRITE macros
These macros hide a ds variable and a return statement on error, which
can lead to locking issues. Kill them off.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
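Why a macro with a hidden return statement can cause locking issues is
easiest to see in a small sketch. Everything below is a hypothetical
userspace model (a toy lock, a toy register read, and an EX_REG_READ macro
written in the same spirit), not the driver's code.

```c
#include <assert.h>

struct ex_lock { int held; };

static void ex_lock_acquire(struct ex_lock *l) { assert(!l->held); l->held = 1; }
static void ex_lock_release(struct ex_lock *l) { assert(l->held); l->held = 0; }

/* Toy register read: fails for odd register numbers. */
static int ex_reg_read(int reg)
{
	return (reg & 1) ? -1 : reg * 2;
}

/* Macro in the style being removed: it hides a return on error. */
#define EX_REG_READ(var, reg)			\
	do {					\
		(var) = ex_reg_read(reg);	\
		if ((var) < 0)			\
			return (var);		\
	} while (0)

static int ex_op_with_macro(struct ex_lock *l, int reg)
{
	int val;

	ex_lock_acquire(l);
	EX_REG_READ(val, reg);	/* on error this returns with the lock held */
	ex_lock_release(l);
	return val;
}

static int ex_op_explicit(struct ex_lock *l, int reg)
{
	int val;

	ex_lock_acquire(l);
	val = ex_reg_read(reg);
	ex_lock_release(l);	/* reached on both success and error */
	return val;
}
```

With the explicit form the error path is visible at the call site, so a
reviewer can see that the unlock is reached on every path.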
Alexander Duyck [Thu, 14 Apr 2016 21:04:34 +0000 (17:04 -0400)]
netdev_features: Add NETIF_F_TSO_MANGLEID to NETIF_F_ALL_TSO
I realized that when I added NETIF_F_TSO_MANGLEID as a TSO type I forgot to
add it to NETIF_F_ALL_TSO. This patch corrects that so the flag will be
included correctly.
The result should be minor as it was only used by a few drivers and in a
few specific cases such as when NETIF_F_SG was not supported on a device so
the TSO flags were cleared.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 16 Apr 2016 23:09:14 +0000 (19:09 -0400)]
Merge branch 'ipv6-gre-offloads'
Alexander Duyck says:
====================
Add support for offloads with IPv6 GRE tunnels
This patch series enables the use of segmentation and checksum offloads
with IPv6 based GRE tunnels.
In order to enable this series I had to make a change to
iptunnel_handle_offloads so that it would no longer free the skb. This was
necessary as there were multiple paths in the IPv6 GRE code that required
the skb to still be present so it could be freed. As it turned out I
believe this actually fixes a bug that was present in FOU/GUE based tunnels
anyway.
Below is a quick breakdown of the performance gains seen with a simple
netperf test passing traffic through a ip6gretap tunnel and then an i40e
interface:
Throughput  Throughput  Local  Local    Result
            Units       CPU    Service  Tag
                        Util   Demand
                        %
3544.93     10^6bits/s  6.30   4.656    "before"
13081.75    10^6bits/s  3.75   0.752    "after"
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 14 Apr 2016 19:34:04 +0000 (15:34 -0400)]
ip6gre: Add support for GSO
This patch adds code borrowed from bits and pieces of other protocols to
the IPv6 GRE path so that we can support GSO over IPv6 based GRE tunnels.
By adding this support we are able to significantly improve the throughput
for GRE tunnels as we are able to make use of GSO.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 14 Apr 2016 19:33:58 +0000 (15:33 -0400)]
GRE: Add support for GRO/GSO of IPv6 GRE traffic
Since GRE doesn't really care about L3 protocol we can support IPv4 and
IPv6 using the same offloads. With that being the case we can add a call
to register the offloads for IPv6 as a part of our GRE offload
initialization.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 14 Apr 2016 19:33:51 +0000 (15:33 -0400)]
ip6gre: Add support for basic offloads offloads excluding GSO
This patch adds support for the basic offloads we support on most devices.
Specifically with this patch set we can support checksum offload, basic
scatter-gather, and highdma.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 14 Apr 2016 19:33:45 +0000 (15:33 -0400)]
ip6gretap: Fix MTU to allow for Ethernet header
When we were creating an ip6gretap interface the MTU was about 6 bytes
short of what was needed. It turns out we were not taking the Ethernet
header into account and as a result we were eating into the 8 bytes
reserved for the encap limit.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Thu, 14 Apr 2016 19:33:37 +0000 (15:33 -0400)]
ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skb
This patch updates the IP tunnel core function iptunnel_handle_offloads so
that we return an int and do not free the skb inside the function. This
actually allows us to clean up several paths in several tunnels so that we
can free the skb at one point in the path without having to have a
secondary path if we are supporting tunnel offloads.
In addition it should resolve some double-free issues I have found in the
tunnels paths as I believe it is possible for us to end up triggering such
an event in the case of fou or gue.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
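The ownership change described above, a helper that returns an int and
leaves freeing to its caller, can be sketched with a toy buffer type. The
structure, field names, and functions are hypothetical stand-ins and do not
reflect the real skb API.

```c
#include <errno.h>
#include <stdlib.h>

struct ex_skb { int can_offload; int has_route; };

static int ex_free_count;	/* counts frees so the sketch can be checked */

static void ex_kfree_skb(struct ex_skb *skb)
{
	free(skb);
	ex_free_count++;
}

/* Returns 0 or a negative errno; it never frees the buffer. */
static int ex_handle_offloads(struct ex_skb *skb)
{
	return skb->can_offload ? 0 : -EINVAL;
}

static int ex_tunnel_xmit(struct ex_skb *skb)
{
	int err;

	err = ex_handle_offloads(skb);
	if (err)
		goto tx_error;
	if (!skb->has_route) {
		err = -EHOSTUNREACH;
		goto tx_error;
	}
	/* a real path would hand the buffer on; freed here to avoid a leak */
	ex_kfree_skb(skb);
	return 0;

tx_error:
	ex_kfree_skb(skb);	/* single free point for every failure */
	return err;
}
```

Because the helper no longer frees on failure, every error in the transmit
path can share one label, and a double free on the offload error path is
ruled out by construction.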
David S. Miller [Sat, 16 Apr 2016 22:30:27 +0000 (18:30 -0400)]
Merge branch 'w5100-spi-and-w5200-support'
Akinobu Mita says:
====================
net: w5100: add support W5100/W5200 for SPI interface
This series adds support for Wiznet W5100 and W5200 for SPI interface.
We can easily find the ethernet modules and shield for Arduino with
these chips for purchase. I've tested them with BeagleBone.
Wiznet W5100 for mmio access is already supported by the w5100 driver.
In order to share the code between mmio mode and SPI mode, this series
first adds the ability to support another register access interface to
the existing w5100 driver. This groundwork also requires introducing a
workqueue and threaded irq because SPI transfers are callable only from
contexts that can sleep, unlike mmio access.
The latter part of this series adds the w5100-spi driver, which actually
supports W5100 and W5200 for SPI interface. Supporting W5100 is
straightforward because it only requires adding a register access
interface by the SPI transfer. W5100 and W5200 have similar memory
maps, which justifies adding W5200 support to the w5100 driver.
* Changes from v2 to v3
- Add comment for reg_lock
- Add ability to allocate ops specific data structure
- Allocate w5200 ops specific data structure to put DMA-safe buffer
- Add missing chip_id assignment for w5100_*_ops
* Changes from v1 to v2
- Use a plain single pointer instead of SKB queue, spotted by David S. Miller
- Correct timeout period in w5100_command
- Use spi_write_then_read instead of spi_write which needs DMA-safe buffer
- Support W5200
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Akinobu Mita [Thu, 14 Apr 2016 15:11:33 +0000 (00:11 +0900)]
net: w5100: support W5200
This adds support for W5200 chip.
W5100 and W5200 have similar memory maps although some of their offsets
are different. The register access sequences between them are different,
but the w5100 driver has an abstraction layer for different bus interface
modes so it is easy to add W5200 support to the w5100 and w5100-spi drivers.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Sinkovsky <msink@permonline.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>