openwrt/staging/blogic.git
8 years agomlxsw: spectrum: Implement LAG tx enabled lower state change
Jiri Pirko [Thu, 3 Dec 2015 11:12:30 +0000 (12:12 +0100)]
mlxsw: spectrum: Implement LAG tx enabled lower state change

Enabling/disabling TX on a LAG port means enabling/disabling distribution
in our HW.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: spectrum: Implement FDB add/remove/dump for LAG
Jiri Pirko [Thu, 3 Dec 2015 11:12:29 +0000 (12:12 +0100)]
mlxsw: spectrum: Implement FDB add/remove/dump for LAG

Implement FDB offloading for lagged ports, including learning LAG FDB
entries, adding/removing static FDB entries and dumping existing LAG FDB
entries.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: spectrum: Implement LAG port join/leave
Jiri Pirko [Thu, 3 Dec 2015 11:12:28 +0000 (12:12 +0100)]
mlxsw: spectrum: Implement LAG port join/leave

Implement basic procedures for joining/leaving port to/from LAG. That
includes HW setup of collector, core LAG mapping setup.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: reg: Add definition of LAG unicast record for SFN register
Jiri Pirko [Thu, 3 Dec 2015 11:12:27 +0000 (12:12 +0100)]
mlxsw: reg: Add definition of LAG unicast record for SFN register

LAG-related records have specific format in SFN register.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: reg: Add definition of LAG unicast record for SFD register
Jiri Pirko [Thu, 3 Dec 2015 11:12:26 +0000 (12:12 +0100)]
mlxsw: reg: Add definition of LAG unicast record for SFD register

LAG-related records have specific format in SFD register.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: reg: Add link aggregation configuration registers definitions
Jiri Pirko [Thu, 3 Dec 2015 11:12:25 +0000 (12:12 +0100)]
mlxsw: reg: Add link aggregation configuration registers definitions

Add definitions of SLDR, SLCR2, SLCOR registers that are used to
configure LAG.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: pci: Implement LAG processing for received packets
Jiri Pirko [Thu, 3 Dec 2015 11:12:24 +0000 (12:12 +0100)]
mlxsw: pci: Implement LAG processing for received packets

Completion queue element for receive queue provides information if the
packet was received via LAG port. Extract this info and pass it along
to core.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: core: Add support for packets received from LAG port
Jiri Pirko [Thu, 3 Dec 2015 11:12:23 +0000 (12:12 +0100)]
mlxsw: core: Add support for packets received from LAG port

Lower layer (pci) has information if the packet is received via LAG port.
If that is the case, it fills up rx_info accordingly. However upper
layer does not care about lag_id/port_index for received packets so
convert it to local_port before passing it up. For that conversion, lag
mapping array is introduced. Upper layer is responsible for setting up
the mapping according to what is set in HW.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomlxsw: spectrum: Add set_rx_mode ndo stub
Jiri Pirko [Thu, 3 Dec 2015 11:12:22 +0000 (12:12 +0100)]
mlxsw: spectrum: Add set_rx_mode ndo stub

Add just a stub for now. This allows to pass check in dev_ifsioc,
SIOCADDMULTI and SIOCDELMULTI cases. Teamd is using these to add LACP
slow MAC.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobonding: set inactive flags on release
Jiri Pirko [Thu, 3 Dec 2015 11:12:21 +0000 (12:12 +0100)]
bonding: set inactive flags on release

Be correct and symmetric to enslave and set inactive flags during release.
That gives LAG offload drivers - lower state change listeners - possibility
to do proper cleanup.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobonding: implement lower state change propagation
Jiri Pirko [Thu, 3 Dec 2015 11:12:20 +0000 (12:12 +0100)]
bonding: implement lower state change propagation

Let netdev notifier listeners know about link and slave state change.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobonding: allow notifications for bond_set_slave_link_state
Jiri Pirko [Thu, 3 Dec 2015 11:12:19 +0000 (12:12 +0100)]
bonding: allow notifications for bond_set_slave_link_state

Similar to state notifications.

We allow caller to indicate if the notification should happen now or later,
depending on if he holds rtnl mutex or not. Introduce bond_slave_link_notify
function (similar to bond_slave_state_notify) which is later on called
with rtnl mutex and goes over slaves and executes delayed notification.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoteam: implement lower state change propagation
Jiri Pirko [Thu, 3 Dec 2015 11:12:18 +0000 (12:12 +0100)]
team: implement lower state change propagation

Let netdev notifier listeners know about link-up and port-enable state
changes.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoteam: rtnl_lock for options set
Jiri Pirko [Thu, 3 Dec 2015 11:12:17 +0000 (12:12 +0100)]
team: rtnl_lock for options set

During options set, there will be needed to hold rtnl_mutex in order to
safely call netdev notifiers.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: introduce lower state changed info structure for LAG lowers
Jiri Pirko [Thu, 3 Dec 2015 11:12:16 +0000 (12:12 +0100)]
net: introduce lower state changed info structure for LAG lowers

This is shared info structure for bonding and team. Serves to pass down
info about link state and port activity to notification listeners.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: introduce change lower state notifier
Jiri Pirko [Thu, 3 Dec 2015 11:12:15 +0000 (12:12 +0100)]
net: introduce change lower state notifier

When lower device like bonding slave, team/bridge port, etc changes its
state, it is useful for others to notice this change. Currently this is
implemented specificly for bonding as NETDEV_BONDING_INFO notifier. This
patch aims to replace this specific usage and make this more generic to
be used for all upper-lower devices.

Introduce NETDEV_CHANGELOWERSTATE netdev notifier type and
netdev_lower_state_changed() helper.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobonding: fill-up LAG changeupper info struct and pass it along
Jiri Pirko [Thu, 3 Dec 2015 11:12:14 +0000 (12:12 +0100)]
bonding: fill-up LAG changeupper info struct and pass it along

Initialize netdev_lag_upper_info structure by TX type according to
current bonding mode and pass it along via netdev_master_upper_dev_link.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoteam: fill-up LAG changeupper info struct and pass it along
Jiri Pirko [Thu, 3 Dec 2015 11:12:13 +0000 (12:12 +0100)]
team: fill-up LAG changeupper info struct and pass it along

Initialize netdev_lag_upper_info structure by TX type according to
current team mode and pass it along via netdev_master_upper_dev_link.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add info struct for LAG changeupper
Jiri Pirko [Thu, 3 Dec 2015 11:12:12 +0000 (12:12 +0100)]
net: add info struct for LAG changeupper

This struct will be shared by bonding and team to pass internal
information to notifier listeners.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add possibility to pass information about upper device via notifier
Jiri Pirko [Thu, 3 Dec 2015 11:12:11 +0000 (12:12 +0100)]
net: add possibility to pass information about upper device via notifier

Sometimes the drivers and other code would find it handy to know some
internal information about upper device being changed. So allow upper-code
to pass information down to notifier listeners during linking.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: propagate upper priv via netdev_master_upper_dev_link
Jiri Pirko [Thu, 3 Dec 2015 11:12:10 +0000 (12:12 +0100)]
net: propagate upper priv via netdev_master_upper_dev_link

Eliminate netdev_master_upper_dev_link_private and pass priv directly as
a parameter of netdev_master_upper_dev_link.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add netif_is_lag_port helper
Jiri Pirko [Thu, 3 Dec 2015 11:12:09 +0000 (12:12 +0100)]
net: add netif_is_lag_port helper

Some code does not mind if a device is bond slave or team port and treats
them the same, as generic LAG ports.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add netif_is_lag_master helper
Jiri Pirko [Thu, 3 Dec 2015 11:12:08 +0000 (12:12 +0100)]
net: add netif_is_lag_master helper

Some code does not mind if the master is bond or team and treats them
the same, as generic LAG.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add netif_is_team_port helper
Jiri Pirko [Thu, 3 Dec 2015 11:12:07 +0000 (12:12 +0100)]
net: add netif_is_team_port helper

Similar to other helpers, caller can use this to find out if device is
team port.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add netif_is_team_master helper
Jiri Pirko [Thu, 3 Dec 2015 11:12:06 +0000 (12:12 +0100)]
net: add netif_is_team_master helper

Similar to other helpers, caller can use this to find out if device is
team master.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobonding: add 802.3ad support for 100G speeds
Jiri Pirko [Thu, 3 Dec 2015 11:12:05 +0000 (12:12 +0100)]
bonding: add 802.3ad support for 100G speeds

Similar to other speeds, add 100G to bonding 802.3ad code.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: Add support for CHANGEUPPER notifier error injection
Ido Schimmel [Thu, 3 Dec 2015 11:12:04 +0000 (12:12 +0100)]
net: Add support for CHANGEUPPER notifier error injection

Since CHANGEUPPER can now fail, add support for it in the newly
introduced netdev notifier error injection infrastructure.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: Check CHANGEUPPER notifier return value
Ido Schimmel [Thu, 3 Dec 2015 11:12:03 +0000 (12:12 +0100)]
net: Check CHANGEUPPER notifier return value

switchdev drivers reflect the newly requested topology to hardware when
CHANGEUPPER is received, after software links were already formed.
However, the operation can fail and user will not be notified, as the
return value of the notifier is not checked.

Add this check and rollback software links if necessary.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoi40e: Fix i40e_print_features() VEB mode output
Joe Perches [Thu, 3 Dec 2015 12:20:57 +0000 (04:20 -0800)]
i40e: Fix i40e_print_features() VEB mode output

Commit 7fd89545f337 ("i40e: remove BUG_ON from feature string building")
added defective output when I40E_FLAG_VEB_MODE_ENABLED was set in
function i40e_print_features.

Fix it.

Miscellanea:

- Remove unnecessary string variable
- Add space before not after fixed strings
- Use kmalloc not kzalloc
- Don't initialize i to 0, use result of first snprintf

Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Thu, 3 Dec 2015 16:43:32 +0000 (11:43 -0500)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-12-03

This series contains updates to ixgbe and ixgbevf only.

Mark cleans up ixgbe_init_phy_ops_x550em, since this was designed to
initialize function pointers only and moves the KR PHY reset to the
ixgbe_setup_internal_phy_t_x550em which was designed to detect which
mode the PHY operates in and set it up.  Added the new thermal alarm
type support used with newer X550EM_x devices.  Fixed both ixgbe and
ixgbevf to use a private work queue to avoid hangs, which would
possibly occur when creating and destroying many VFS repeatedly.
Updated ixgbe PTP implementation to accommodate X550EM_x devices,
which handle clocking differently.  Fixed specification violations
in the datasheet, which was reported by Dan Streetman.  Fixed ixgbe
to check for and handle IPv6 extended headers so that Tx checksum
offload can be done, which was reported by Tom Herbert.  Fixed ixgbe
link issue for some systems with X540 or X550 by only inhibiting the
turning PHY power off when manageability is present.

Alex Duyck refactors the MAC address configuration code, which in
turns fixes an issue where once 63 entries had been used, you could no
longer add additional filters.  Updated ixgbe to use __dev_uc_sync
which also resolved an issue in which you could not remove an FDB
address without having to reset the port.  Updated the ixgbe driver
to make use of all the free RAR entries for FDB use if needed.

v2: updated patch 13 to "Alex Duyck Approved" version, in the original
    submission, I had grabbed a previous version of the patch and did not
    catch it was superseded by a later version
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoixgbevf: Handle extended IPv6 headers in Tx path
Mark Rustad [Thu, 19 Nov 2015 21:56:30 +0000 (13:56 -0800)]
ixgbevf: Handle extended IPv6 headers in Tx path

Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for seeing how to coalesce the error handling
into one location.

Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Always turn PHY power on when requested
Mark Rustad [Thu, 5 Nov 2015 19:02:14 +0000 (11:02 -0800)]
ixgbe: Always turn PHY power on when requested

Instead of inhibiting PHY power control when manageability is
present, only inhibit turning PHY power off when manageability
is present. Consequently, PHY power will always be turned on when
requested. Without this patch, some systems with X540 or X550
devices in some conditions will never get link.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Handle extended IPv6 headers in Tx path
Mark Rustad [Wed, 18 Nov 2015 17:21:28 +0000 (09:21 -0800)]
ixgbe: Handle extended IPv6 headers in Tx path

Check for and handle IPv6 extended headers so that Tx checksum
offload can be done. Also use skb_checksum_help for unexpected
cases. Thanks to Tom Herbert for noticing these problems. Thanks
to Alexander Duyck for recognizing problems with the first version
of this patch and recognizing how to coalesce error conditions
into a single location.

Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Save VF info and take references
Mark Rustad [Fri, 30 Oct 2015 22:29:34 +0000 (15:29 -0700)]
ixgbe: Save VF info and take references

Save VF device pointers and take references to speed accesses used
to monitor the device behavior to avoid slot resets. The saved
information avoids lock contention during the search used to access
each of the VFs.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Wait for master disable to be set
Mark Rustad [Tue, 27 Oct 2015 20:23:23 +0000 (13:23 -0700)]
ixgbe: Wait for master disable to be set

According to the datasheets, the driver should wait for the master
disable bit to read as being set before checking the status
register for master disable.

Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Correct spec violations by waiting after reset
Mark Rustad [Tue, 27 Oct 2015 20:23:14 +0000 (13:23 -0700)]
ixgbe: Correct spec violations by waiting after reset

The ixgbe driver was violating the specification in the datasheet
by not waiting 1ms before checking for the reset bit clearing. This
is called out for devices supported by ixgbe, so implement the
required delay.

Reported-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Update PTP to support X550EM_x devices
Mark Rustad [Tue, 27 Oct 2015 16:58:07 +0000 (09:58 -0700)]
ixgbe: Update PTP to support X550EM_x devices

The X550EM_x devices handle clocking differently, so update the
PTP implementation to accommodate them. This involves significant
changes to ixgbe's PTP code to accommodate the new range of
behaviors including things like non-power-of-2 clock wrapping.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Allow FDB entries access to more RAR filters
Alexander Duyck [Thu, 22 Oct 2015 23:26:42 +0000 (16:26 -0700)]
ixgbe: Allow FDB entries access to more RAR filters

This change makes it so that we allow the PF to make use of all free RAR
entries for FDB use if needed.

Previously the code limited us to 16 unicast entries, however this was
shared between MACVLAN which wasn't limited and the FDB code which was.  So
instead of treating the FDB code as a second class citizen I have updated
it so that it has access to just as many entries as the MACVLAN filters.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses
Alexander Duyck [Thu, 22 Oct 2015 23:26:36 +0000 (16:26 -0700)]
ixgbe: Use __dev_uc_sync and __dev_uc_unsync for unicast addresses

This change replaces the ixgbe_write_uc_addr_list call in ixgbe_set_rx_mode
with a call to __dev_uc_sync instead.  This works much better with the MAC
addr list code that was already in place and solves an issue in which you
couldn't remove an FDB address without having to reset the port.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Refactor MAC address configuration code
Alexander Duyck [Thu, 22 Oct 2015 23:26:30 +0000 (16:26 -0700)]
ixgbe: Refactor MAC address configuration code

In the process of tracking down a memory leak when adding/removing FDB
entries I had to go through the MAC address configuration code for ixgbe.
In the process of doing so I found a number of issues that impacted
readability and performance.  This change updates the code in general to
clean it up so it becomes clear what each step is doing.  From what I can
tell there a couple of bugs cleaned up in this code.

First is the fact that the MAC addresses were being double counted for the
PF.  As a result once entries up to 63 had been used you could no longer
add additional filters.

A simple test case for this:
  for i in `seq 0 96`
  do
    ip link add link ens8 name mv$i type macvlan
    ip link set dev mv$i up
  done

Test script:
  ethregs -s 0:8.0 | grep -e "RAH" | grep 8000....$

When things are working correctly RAL/H registers 1 - 97 will be consumed.
In the failing case it will stop at 63 and prevent any further filters from
being added.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbevf: Minor cleanups
Mark Rustad [Thu, 22 Oct 2015 00:21:20 +0000 (17:21 -0700)]
ixgbevf: Minor cleanups

Make some minor cleanups, such as simplifying return paths, deleting
unneeded initializations, return values more directly and so forth.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbevf: Use a private workqueue to avoid certain possible hangs
Mark Rustad [Thu, 22 Oct 2015 00:21:15 +0000 (17:21 -0700)]
ixgbevf: Use a private workqueue to avoid certain possible hangs

Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFS repeatedly.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Use private workqueue to avoid certain possible hangs
Mark Rustad [Thu, 22 Oct 2015 00:21:10 +0000 (17:21 -0700)]
ixgbe: Use private workqueue to avoid certain possible hangs

Use a private workqueue to avoid hangs that were otherwise possible
when performing stress tests, such as creating and destroying many
VFS repeatedly.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Add support for newer thermal alarm
Mark Rustad [Mon, 19 Oct 2015 16:22:14 +0000 (09:22 -0700)]
ixgbe: Add support for newer thermal alarm

The newer copper PHY implementation used with newer X550EM_x
devices uses a different thermal alarm type than the earlier
one. Make changes to support both types.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoixgbe: Prevent KR PHY reset in ixgbe_init_phy_ops_x550em
Mark Rustad [Fri, 16 Oct 2015 20:27:49 +0000 (13:27 -0700)]
ixgbe: Prevent KR PHY reset in ixgbe_init_phy_ops_x550em

This patch removes KR PHY reset from ixgbe_init_phy_ops_x550em,
since this function is meant to initialize function pointers for
the detected PHY type. Internal PHY reset was moved to
ixgbe_setup_internal_phy_t_x550em which will now detect which
mode the internal PHY operates in and set it up as required.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Darin Miller <darin.j.miller@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agosfc: use ALIGN macro for aligning frame sizes
Jarod Wilson [Mon, 30 Nov 2015 22:12:21 +0000 (17:12 -0500)]
sfc: use ALIGN macro for aligning frame sizes

Don't open-code it.

CC: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agotcp: suppress too verbose messages in tcp_send_ack()
Eric Dumazet [Mon, 30 Nov 2015 16:57:28 +0000 (08:57 -0800)]
tcp: suppress too verbose messages in tcp_send_ack()

If tcp_send_ack() can not allocate skb, we properly handle this
and setup a timer to try later.

Use __GFP_NOWARN to avoid polluting syslog in the case host is
under memory pressure, so that pertinent messages are not lost under
a flood of useless information.

sk_gfp_atomic() can use its gfp_mask argument (all callers currently
were using GFP_ATOMIC before this patch)

We rename sk_gfp_atomic() to sk_gfp_mask() to clearly express this
function now takes into account its second argument (gfp_mask)

Note that when tcp_transmit_skb() is called with clone_it set to false,
we do not attempt memory allocations, so can pass a 0 gfp_mask, which
most compilers can emit faster than a non zero or constant value.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'hv_netvsc-less-headroom'
David S. Miller [Thu, 3 Dec 2015 04:43:48 +0000 (23:43 -0500)]
Merge branch 'hv_netvsc-less-headroom'

Merge branch 'hv_netvsc-less-headroom'

K. Y. Srinivasan says:

====================
hv_netvsc: Eliminate the additional head room

In an attempt to avoid having to allocate memory on the send path, the netvsc
driver was requesting additional head room so that both rndis header and the
netvsc packet (the state that had to persist) could be placed in the skb.
Since the amount of head room requested was exceeding the default head room
as set in LL_MAX_HEADER, we were forcing a reallocation of skb.

With this patch-set, I have reduced the size of the netvsc packet to less
than 20 bytes and with this reduction we don't need to ask for any additional
headroom. We place the rndis header in the skb head room and we place the
netvsc packet in control buffer area in the skb.

V2:  - Addressed  review comments:
     - Eliminated more fields from netvsc packet structure.

V3:  - Fixed a typo in patch: hv_netvsc: Don't ask for additional head room in the skb.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate vlan_tci from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:19 +0000 (16:43 -0800)]
hv_netvsc: Eliminate vlan_tci from struct hv_netvsc_packet

Eliminate vlan_tci from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate status from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:18 +0000 (16:43 -0800)]
hv_netvsc: Eliminate status from struct hv_netvsc_packet

Eliminate status from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate xmit_more from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:17 +0000 (16:43 -0800)]
hv_netvsc: Eliminate xmit_more from struct hv_netvsc_packet

Eliminate xmit_more from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate completion_func from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:16 +0000 (16:43 -0800)]
hv_netvsc: Eliminate completion_func from struct hv_netvsc_packet

Eliminate completion_func from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate is_data_pkt from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:15 +0000 (16:43 -0800)]
hv_netvsc: Eliminate is_data_pkt from struct hv_netvsc_packet

Eliminate is_data_pkt from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate send_completion_tid from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:14 +0000 (16:43 -0800)]
hv_netvsc: Eliminate send_completion_tid from struct hv_netvsc_packet

Eliminate send_completion_tid from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate page_buf from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:13 +0000 (16:43 -0800)]
hv_netvsc: Eliminate page_buf from struct hv_netvsc_packet

Eliminate page_buf from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: remove locking in netvsc_send()
Vitaly Kuznetsov [Wed, 2 Dec 2015 00:43:12 +0000 (16:43 -0800)]
hv_netvsc: remove locking in netvsc_send()

Packet scheduler guarantees there won't be multiple senders for the same
queue and as we use q_idx for multi_send_data the spinlock is redundant.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: move subchannel existence check to netvsc_select_queue()
Vitaly Kuznetsov [Wed, 2 Dec 2015 00:43:11 +0000 (16:43 -0800)]
hv_netvsc: move subchannel existence check to netvsc_select_queue()

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Don't ask for additional head room in the skb
KY Srinivasan [Wed, 2 Dec 2015 00:43:10 +0000 (16:43 -0800)]
hv_netvsc: Don't ask for additional head room in the skb

The rndis header is 116 bytes big and can be placed in the default
head room that will be available in the skb. Since the netvsc packet
is less than 48 bytes, we can use the skb control buffer
for the netvsc packet. With these changes we don't need to
ask for additional head room.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate send_completion_ctx from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:09 +0000 (16:43 -0800)]
hv_netvsc: Eliminate send_completion_ctx from struct hv_netvsc_packet

Eliminate send_completion_ctx from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate send_completion from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:08 +0000 (16:43 -0800)]
hv_netvsc: Eliminate send_completion from struct hv_netvsc_packet

Eliminate send_completion from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminatte the data field from struct hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:07 +0000 (16:43 -0800)]
hv_netvsc: Eliminatte the data field from struct hv_netvsc_packet

Eliminatte the data field from struct hv_netvsc_packet.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate rndis_msg pointer from hv_netvsc_packet structure
KY Srinivasan [Wed, 2 Dec 2015 00:43:06 +0000 (16:43 -0800)]
hv_netvsc: Eliminate rndis_msg pointer from hv_netvsc_packet structure

Eliminate rndis_msg pointer from hv_netvsc_packet structure.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Eliminate the channel field in hv_netvsc_packet structure
KY Srinivasan [Wed, 2 Dec 2015 00:43:05 +0000 (16:43 -0800)]
hv_netvsc: Eliminate the channel field in hv_netvsc_packet structure

Eliminate the channel field in hv_netvsc_packet structure.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Rearrange the hv_negtvsc_packet to be space efficient
KY Srinivasan [Wed, 2 Dec 2015 00:43:04 +0000 (16:43 -0800)]
hv_netvsc: Rearrange the hv_negtvsc_packet to be space efficient

Rearrange the elements of struct hv_negtvsc_packet for optimal layout -
eliminate unnecessary padding.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: Resize some of the variables in hv_netvsc_packet
KY Srinivasan [Wed, 2 Dec 2015 00:43:03 +0000 (16:43 -0800)]
hv_netvsc: Resize some of the variables in hv_netvsc_packet

As part of reducing the size of the hv_netvsc_packet, resize some of the
variables based on their usage.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 2 Dec 2015 20:32:50 +0000 (15:32 -0500)]
Merge branch '40GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-12-01

This series contains updates to i40e and i40evf only.

Helin adds new fields to i40e_vsi to store user configured RSS config data
and the code to use it.  Also renamed RSS items to clarify functionality
and scope to users.  Fixed a confusing kernel message of enabling RSS size
by reporting it together with the hardware maximum RSS size.

Anjali fixes the issue of forcing writeback too often causing us to not
benefit from NAPI.

Jesse adds a prefetch for data early in the transmit path to help immensely
for pktgen and forwarding workloads.  Fixed the i40e driver that was
possibly sleeping inside critical section of code.

Carolyn fixes an issue where adminq init failures always provided a message
that NVM was newer than expected, when this is not always the case for
init_adminq failures.  Fixed by adding a check for that specific error
condition and a different helpful message otherwise.

Mitch fixes error message by telling the user which VF is being naughty,
rather than making them guess.  Updated the queue_vector array from a
statically-sized member of the adapter structure, to a dynamically-allocated
and -sized array.  This reduces the size of the adapter structure and allows
us to support any number of queue vectors in the future without changing the
code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoi40e: remove unused argument
Jesse Brandeburg [Fri, 6 Nov 2015 01:01:02 +0000 (17:01 -0800)]
i40e: remove unused argument

With the final edition of the patches to remove sleeps from
the driver's entry points, the grab_rtnl argument is no
longer needed, so partially revert the commit that added it.

Change-ID: Ib9778476242586cc9e58b670f5f48d415cb59003
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: fix: do not sleep in netdev_ops
Jesse Brandeburg [Fri, 6 Nov 2015 01:01:01 +0000 (17:01 -0800)]
i40e: fix: do not sleep in netdev_ops

The driver was being called by VLAN, bonding, teaming operations
that expected to be able to hold locks like rcu_read_lock().

This causes the driver to be held to the requirement to not sleep,
and was found by the kernel debug options for checking sleep
inside critical section, and the locking validator.

Change-ID: Ibc68c835f5ffa8ffe0638ffe910a66fc5649a7f7
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e/i40evf: Bump i40e version to 1.4.4 and i40evf to 1.4.1
Catherine Sullivan [Mon, 26 Oct 2015 23:44:41 +0000 (19:44 -0400)]
i40e/i40evf: Bump i40e version to 1.4.4 and i40evf to 1.4.1

Bump.

Change-ID: I00ebbb2e5e5572f947502b8f6db4d94f666d6b14
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: allocate ring structs dynamically
Mitch Williams [Mon, 26 Oct 2015 23:44:40 +0000 (19:44 -0400)]
i40evf: allocate ring structs dynamically

Instead of awkwardly keeping a fixed array of pointers in the adapter
struct and then allocating ring structs individually, just keep a single
pointer and allocate a single blob for the arrays. This simplifies code,
shrinks the adapter structure, and future-proofs the driver by not
limiting the number of rings we can handle.

Change-ID: I31334ff911a6474954232cfe4bc98ccca3c769ff
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: allocate queue vectors dynamically
Mitch Williams [Mon, 26 Oct 2015 23:44:39 +0000 (19:44 -0400)]
i40evf: allocate queue vectors dynamically

Change the queue_vector array from a statically-sized member of the
adapter structure to a dynamically-allocated and -sized array.

This reduces the size of the adapter structure, and allows us to support
any number of queue vectors in the future without changing the code.

Change-ID: I08dc622cb2f2ad01e832e51c1ad9b86524730693
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: quoth the VF driver, Nevermore
Mitch Williams [Mon, 26 Oct 2015 23:44:38 +0000 (19:44 -0400)]
i40evf: quoth the VF driver, Nevermore

If, upon a midnight dreary, the PF returns ERR_PARAM when the VF is
requesting resources, that's fatal. Either the firmware or NVM is badly,
badly misconfigured, or this VF has been disabled due to a previous VF
driver sending a bunch of bogus messages.

Either way, there is no recovery from this. Don't ponder weak and weary,
just quit.

Change-ID: I09d9f16cc4ee7fec3b57646a289d33838c1c5bf5
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: make error message more useful
Mitch Williams [Mon, 26 Oct 2015 23:44:37 +0000 (19:44 -0400)]
i40e: make error message more useful

If we get an invalid message from a VF, we should tell the user which VF
is being naughty, rather than making them guess.

Change-ID: I9252cef7baea3d8584043ed6ff12619a94e2f99c
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: fix confusing message
Helin Zhang [Mon, 26 Oct 2015 23:44:36 +0000 (19:44 -0400)]
i40e: fix confusing message

This patch fixes the confusing kernel message of enabled RSS size,
by reporting it together with the hardware maximum RSS size.

Change-ID: I64864dbfbc13beccc180a7871680def1f3d5a339
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: Update error messaging
Carolyn Wyborny [Mon, 26 Oct 2015 23:44:35 +0000 (19:44 -0400)]
i40e: Update error messaging

This patch fixes an issue where adminq init failures always provided
a message that NVM was newer than expected.  This is not always the
case for init_adminq failures. Without this patch, if adminq init
fails for any reason, newer NVM message would be given.  This
problem is fixed by adding  a check for that specific error
condition and a different hopefully helpful message otherwise.

Change-ID: Iaeaebee4e398989eae40bb70f943ab66a3a521a5
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: add new fields to store user configuration of RSS
Helin Zhang [Mon, 26 Oct 2015 23:44:34 +0000 (19:44 -0400)]
i40evf: add new fields to store user configuration of RSS

This patch adds new fields to i40e_vsi to store user configured
RSS config data and code to use it.

Change-ID: Ic5d3db8d9df52182b560248f8cdca9c5c7546879
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: create a generic get RSS function
Helin Zhang [Mon, 26 Oct 2015 23:44:33 +0000 (19:44 -0400)]
i40evf: create a generic get RSS function

There are two ways to get RSS, this patch implements two functions
with the same input parameters, and creates a more generic function
for getting RSS configuration.

Change-ID: I12d3b712c21455d47dd0a5aae58fc9b7c680db59
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: create a generic config RSS function
Helin Zhang [Tue, 27 Oct 2015 20:15:06 +0000 (16:15 -0400)]
i40evf: create a generic config RSS function

There are two ways to configure RSS, this patch adjusts those two
functions with the same input parameters, and creates a more
generic function for configuring RSS.

Change-ID: Iace73bdeba4831909979bef221011060ab327f71
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40evf: rename VF adapter specific RSS function
Helin Zhang [Mon, 26 Oct 2015 23:44:31 +0000 (19:44 -0400)]
i40evf: rename VF adapter specific RSS function

This patch renames old VF adapter specific RSS function to clarify
its scope.

Change-ID: Ie5253083a44c677ebb7709a8a3a18402ad2dc6a6
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e/i40evf: prefetch skb data on transmit
Jesse Brandeburg [Mon, 26 Oct 2015 23:44:30 +0000 (19:44 -0400)]
i40e/i40evf: prefetch skb data on transmit

Issue a prefetch for data early in the transmit path.
This should not be generally needed for Tx traffic, but
it helps immensely for pktgen workloads and should help
for forwarding workloads as well.

Change-ID: Iefee870c20599e0c4240e1d8637e4f16b625f83a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e/i40evf: Fix RS bit update in Tx path and disable force WB workaround
Anjali Singhai Jain [Mon, 26 Oct 2015 23:44:29 +0000 (19:44 -0400)]
i40e/i40evf: Fix RS bit update in Tx path and disable force WB workaround

This patch fixes the issue of forcing WB too often causing us to not
benefit from NAPI.

Without this patch we were forcing WB/arming interrupt too often taking
away the benefits of NAPI and causing a performance impact.

With this patch we disable force WB in the clean routine for X710
and XL710 adapters. X722 adapters do not enable interrupt to force
a WB and benefit from WB_ON_ITR and hence force WB is left enabled
for those adapters.
For XL710 and X710 adapters if we have less than 4 packets pending
a software Interrupt triggered from service task will force a WB.

This patch also changes the conditions for setting RS bit as described
in code comments. This optimizes when the HW does a tail bump amd when
it does a WB. It also optimizes when we do a wmb.

Change-ID: Id831e1ae7d3e2ec3f52cd0917b41ce1d22d75d9d
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: rename rss_size to alloc_rss_size in i40e_pf
Helin Zhang [Mon, 26 Oct 2015 23:44:28 +0000 (19:44 -0400)]
i40e: rename rss_size to alloc_rss_size in i40e_pf

This patch renames rss_size to alloc_rss_size in i40e_pf, which is
clearer and avoids confusion. It also adds comments to the other
related structure members to help clarify usage.

Change-ID: Ia90090609d006ab589cb639975bb8a0af795d16f
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoi40e: add new fields to store user configuration
Helin Zhang [Mon, 26 Oct 2015 23:44:27 +0000 (19:44 -0400)]
i40e: add new fields to store user configuration

This patch adds new fields to i40e_vsi to store user configured
RSS config data and code to use it.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Change-ID: I73886469dca9e9f6b16d842182a87f3f4009f95d
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
8 years agoisdn: remove spellcaster driver
Arnd Bergmann [Mon, 30 Nov 2015 10:34:09 +0000 (11:34 +0100)]
isdn: remove spellcaster driver

The 'sc' ISDN driver relies on using readl() to access ISA I/O memory.
This has been deprecated and produced warnings since linux-2.3.23,
disabled by default since 2.4.10 and finally removed in 2.6.5.

I found this because the compiling the driver for ARM produces
a warning:

In file included from ../drivers/isdn/sc/includes.h:8:0,
                 from ../drivers/isdn/sc/init.c:13:
../arch/arm/include/asm/io.h:115:21: note: expected 'const volatile void *' but argument is of type 'long unsigned int'

It is pretty clear that this driver has not been used for a long time
and there is no point fixing it now, so let's remove it.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agostmmac: support Reg_9 to get HW level information
Giuseppe CAVALLARO [Mon, 30 Nov 2015 10:33:10 +0000 (11:33 +0100)]
stmmac: support Reg_9 to get HW level information

For GMAC newer than 3.40a there is a new register (Reg_9) that provides the
status of all modules of the transmit and receive paths and FIFO status.
These can be exposed via ethtool.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'qed-ethtool-ops'
David S. Miller [Tue, 1 Dec 2015 21:02:41 +0000 (16:02 -0500)]
Merge branch 'qed-ethtool-ops'

Yuval Mintz says:

====================
qede/qed: Implement various ethtool operations

This series adds several new ethtool operations to qede:
  - {get, set}_channels
  - {get, set}_ringparam
  - set_phys_id
  - nway_reset
  - {get, set}_pauseparam
As well as extending the qed APIs to support these commands.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqede: Add support for {get, set}_pauseparam
Sudarsana Kalluru [Mon, 30 Nov 2015 10:25:06 +0000 (12:25 +0200)]
qede: Add support for {get, set}_pauseparam

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqede: Add support for nway_reset
Sudarsana Kalluru [Mon, 30 Nov 2015 10:25:05 +0000 (12:25 +0200)]
qede: Add support for nway_reset

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqede: Add support for set_phys_id
Sudarsana Kalluru [Mon, 30 Nov 2015 10:25:04 +0000 (12:25 +0200)]
qede: Add support for set_phys_id

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqed: Add support for changing LED state
Sudarsana Kalluru [Mon, 30 Nov 2015 10:25:03 +0000 (12:25 +0200)]
qed: Add support for changing LED state

Physical LEDs are being controlled by the management FW.
This adds the qed functionality required to request management FW to
change the LED configuration, as well as the necessary APIs for this
functionality to later be used by the protocol drivers.

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqede: Add support for {get, set}_ringparam
Sudarsana Kalluru [Mon, 30 Nov 2015 10:25:02 +0000 (12:25 +0200)]
qede: Add support for {get, set}_ringparam

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoqede: Add support for {get, set}_channels
Sudarsana Kalluru [Mon, 30 Nov 2015 10:25:01 +0000 (12:25 +0200)]
qede: Add support for {get, set}_channels

Signed-off-by: Sudarsana Kalluru <Sudarsana.Kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'sfc-8000'
David S. Miller [Tue, 1 Dec 2015 20:46:40 +0000 (15:46 -0500)]
Merge branch 'sfc-8000'

Bert Kenward says:

====================
Basic support for Solarflare 8000 series NICs

The upcoming Solarflare 8000 series 10G/40G network card supports a
similar interface to the current 7000 series cards. This patch series
provides basic support for these cards, making no use of any new
functionality.

v2: fix indenting in ef10.c in patch 1/2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: Add PCI ID for Solarflare 8000 series 10/40G NIC
Bert Kenward [Mon, 30 Nov 2015 09:05:47 +0000 (09:05 +0000)]
sfc: Add PCI ID for Solarflare 8000 series 10/40G NIC

Also add support for 7000 series 40G NIC VF.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agosfc: make TSO version a per-queue parameter
Bert Kenward [Mon, 30 Nov 2015 09:05:35 +0000 (09:05 +0000)]
sfc: make TSO version a per-queue parameter

The Solarflare 8000 series NIC will use a new TSO scheme. The current
driver refuses to load if the current TSO scheme is not found. Remove
that check and instead make the TSO version a per-queue parameter.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: add support for netdev notifier error injection
Nikolay Aleksandrov [Sat, 28 Nov 2015 12:45:28 +0000 (13:45 +0100)]
net: add support for netdev notifier error injection

This module allows to insert errors in some of netdevice's notifier
events. All network drivers use these notifiers to signal various events
and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
afterwards. Until recently I had to run failure tests by injecting
a custom module, but now this infrastructure makes it trivial to test
these failure paths. Some of the recent bugs I fixed were found using
this module.
Here's an example:
 $ cd /sys/kernel/debug/notifier-error-inject/netdev
 $ echo -22 > actions/NETDEV_CHANGEMTU/error
 $ ip link set eth0 mtu 1024
 RTNETLINK answers: Invalid argument

CC: Akinobu Mita <akinobu.mita@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev <netdev@vger.kernel.org>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agohv_netvsc: rework link status change handling
Vitaly Kuznetsov [Fri, 27 Nov 2015 10:39:55 +0000 (11:39 +0100)]
hv_netvsc: rework link status change handling

There are several issues in hv_netvsc driver with regards to link status
change handling:
- RNDIS_STATUS_NETWORK_CHANGE results in calling userspace helper doing
  '/etc/init.d/network restart' and this is inappropriate and broken for
  many reasons.
- link_watch infrastructure only sends one notification per second and
  in case of e.g. paired disconnect/connect events we get only one
  notification with last status. This makes it impossible to handle such
  situations in userspace.

Redo link status changes handling in the following way:
- Create a list of reconfig events in network device context.
- On a reconfig event add it to the list of events and schedule
  netvsc_link_change().
- In netvsc_link_change() ensure 2-second delay between link status
  changes.
- Handle RNDIS_STATUS_NETWORK_CHANGE as a paired disconnect/connect event.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agounix: use wq_has_sleeper in unix_dgram_recvmsg
Rainer Weikusat [Thu, 26 Nov 2015 19:23:15 +0000 (19:23 +0000)]
unix: use wq_has_sleeper in unix_dgram_recvmsg

The current unix_dgram_recvmsg does a wake up for every received
datagram. This seems wasteful as only SOCK_DGRAM client sockets in an
n:1 association with a server socket will ever wait because of the
associated condition. The patch below changes the function such that the
wake up only happens if wq_has_sleeper indicates that someone actually
wants to be notified. Testing with SOCK_SEQPACKET and SOCK_DGRAM socket
seems to confirm that this is an improvment.

Signed-Off-By: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'ipmr-nl'
David S. Miller [Mon, 30 Nov 2015 20:26:23 +0000 (15:26 -0500)]
Merge branch 'ipmr-nl'

Nikolay Aleksandrov says:

====================
net: ipmr: more cleanups and mfc netlink support

This set continues with the minor cleanups in the first 6 patches and
patch 7 adds the first new feature - MFC manipulation via netlink. It
registers NEWROUTE/DELROUTE for that purpose and uses the same semantics
as the already present netlink dump. The only new attribute that is used
is RTA_PREFSRC to denote an MFC_PROXY entry. Currently the table must
exist before adding an entry, and new tables can be created only via
setsockopt, but that will be changed in the future.
This set was tested with modified iproute2 which supports NEWROUTE/DELROUTE
for RTNL_FAMILY_IPMR.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: ipmr: add mfc newroute/delroute netlink support
Nikolay Aleksandrov [Thu, 26 Nov 2015 14:23:50 +0000 (15:23 +0100)]
net: ipmr: add mfc newroute/delroute netlink support

This patch adds support to add and remove MFC entries. It uses the
same attributes like the already present dump support in order to be
consistent. There's one new entry - RTA_PREFSRC, it's used to denote an
MFC_PROXY entry (see MRT_ADD_MFC vs MRT_ADD_MFC_PROXY).
The already existing infrastructure is used to create and delete the
entries, the netlink message gets converted internally to a struct mfcctl
which is used with ipmr_mfc_add/delete.
The other used attributes are:
RTA_IIF - used for mfcc_parent (when adding it's required to be valid)
RTA_SRC - used for mfcc_origin
RTA_DST - used for mfcc_mcastgrp
RTA_TABLE - the MRT table id
RTA_MULTIPATH - the "oifs" ttl array

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>