Tony Nguyen [Wed, 26 Jun 2019 09:20:27 +0000 (02:20 -0700)]
ice: Bump version number
Update driver version to 0.7.5
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Wed, 26 Jun 2019 09:20:26 +0000 (02:20 -0700)]
ice: Remove flag to track VF interrupt status
As a result of refactoring of VF VSIs interrupts code, there is no
need to track its configuration status again with ICE_VF_STATE_CFG_INTR
flag - In fact, it is not being checked anywhere in the code right now, so
this patch removes the dead code as applicable to the flag.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Wed, 26 Jun 2019 09:20:25 +0000 (02:20 -0700)]
ice: Remove unnecessary flag ICE_FLAG_MSIX_ENA
This flag is not needed and is called every time we re-enable interrupts
in the hotpath so remove it. Also remove ice_vsi_req_irq() because it
was a wrapper function for ice_vsi_req_irq_msix() whose sole purpose was
checking the ICE_FLAG_MSIX_ENA flag.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Wed, 26 Jun 2019 09:20:24 +0000 (02:20 -0700)]
ice: Don't return error for disabling LAN Tx queue that does exist
Since Tx rings are being managed by FW/NVM, Tx rings might have not been
set up or driver had already wiped them off - In that case, call to
disable LAN Tx queue is being returned as not in existence. This patch
makes sure we don't return unnecessary error for such scenario.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Wed, 26 Jun 2019 09:20:23 +0000 (02:20 -0700)]
ice: Remove duplicate code in ice_alloc_rx_bufs
Currently if the call to ice_alloc_mapped_page() fails we jump to the
no_buf label, possibly call ice_release_rx_desc(), and return true
indicating that there is more work to do. In the success case we just
fall out of the while loop, possibly call ice_alloc_mapped_page(), and
return false saying we exhausted cleaned_count. This flow can be
improved by breaking if ice_alloc_mapped_page() fails and then the flow
outside of the while loop is the same for the failure and success case.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Wed, 26 Jun 2019 09:20:22 +0000 (02:20 -0700)]
ice: Add stats for Rx drops at the port level
Currently we are not reporting dropped counts at the port level to
ethtool or netlink. This was found when debugging Rx dropped issues
and the total packets sent did not equal the total packets received
minus the rx_dropped, which was very confusing. To determine dropped
counts at the port level we need to read the PRTRPB_RDPC register.
To fix reporting we will store the dropped counts in the PF's
rx_discards. This will be reported to netlink by storing it in the
PF VSI's rx_missed_errors signaling that the receiver missed the
packet. Also, we will report this to ethtool in the rx_dropped.nic
field.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Wed, 26 Jun 2019 09:20:21 +0000 (02:20 -0700)]
ice: Update number of VF queue before setting VSI resources
In case there is a request from a VF to change its number of queues, and
the request was successful, we need to update number of queues
configured on the VF before updating corresponding VSI for that VF,
especially LAN Tx queue tree and TC update, otherwise, we would continued
to use old value of vf->num_vf_qs for allocated Tx/Rx queues...
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Wed, 26 Jun 2019 09:20:20 +0000 (02:20 -0700)]
ice: Set up Tx scheduling tree based on alloc VSI Tx queues
This patch uses allocated number of Tx queues per VSI to set up its
scheduling tree instead of using total number of available Tx queues.
Only PF VSIs have total number of allocated Tx queues equal to number
of available Tx queues, other VSIs have different number of queues
configured.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Wed, 26 Jun 2019 09:20:19 +0000 (02:20 -0700)]
ice: Only bump Rx tail and release buffers once per napi_poll
Currently we bump the Rx tail and release/give buffers to hardware every
16 descriptors. This causes us to bump Rx tail up to 4 times per
napi_poll call. Also we are always bumping tail on an odd index and this
is a problem because hardware ignores the lower 3 bits in the QRX_TAIL
register. This is making it so hardware sees tail bumps only every 8
descriptors. Instead lets only bump Rx tail once per napi_poll if
the value aligns with hardware's expectations of the lower 3 bits being
cleared. Also only release/give Rx buffers once per napi_poll call.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Akeem G Abodunrin [Wed, 26 Jun 2019 09:20:18 +0000 (02:20 -0700)]
ice: Disable VFs until reset is completed
This patch adds code to clear VFs enable status until reset is completed,
and Tx/Rx rings are setup. Without this patch, the code flow request Tx
queues to be disabled after reset, especially PFR - where VF VSI Tx rings
have already been wiped off in the NVM and result to adminq error based on
the call to disable Tx LAN queue in ice_reset_all_vfs function call.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Wed, 26 Jun 2019 09:20:17 +0000 (02:20 -0700)]
ice: Do not configure port with no media
The firmware reports an error when trying to configure a port with no
media. Instead of always configuring the port, check for media before
attempting to configure it. In the absence of media, turn off link and
poll for media to become available before re-enabling link.
Move ice_force_phys_link_state() up to avoid forward declaration.
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 26 Jun 2019 09:20:16 +0000 (02:20 -0700)]
ice: separate out control queue lock creation
The ice_init_all_ctrlq and ice_shutdown_all_ctrlq functions create and
destroy the locks used to protect the send and receive process of each
control queue.
This is problematic, as the driver may use these functions to shutdown
and re-initialize the control queues at run time. For example, it may do
this in response to a device reset.
If the driver failed to recover from a reset, it might leave the control
queues offline. In this case, the locks will no longer be initialized.
A later call to ice_sq_send_cmd will then attempt to acquire a lock that
has been destroyed.
It is incorrect behavior to access a lock that has been destroyed.
Indeed, ice_aq_send_cmd already tries to avoid accessing an offline
control queue, but the check occurs inside the lock.
The root of the problem is that the locks are destroyed at run time.
Modify ice_init_all_ctrlq and ice_shutdown_all_ctrlq such that they no
longer create or destroy the locks.
Introduce new functions, ice_create_all_ctrlq and ice_destroy_all_ctrlq.
Call these functions in ice_init_hw and ice_deinit_hw.
Now, the control queue locks will remain valid for the life of the
driver, and will not be destroyed until the driver unloads.
This also allows removing a duplicate check of the sq.count and
rq.count values when shutting down the controlqs. The ice_shutdown_ctrlq
function already checks this value under the lock. Previously
commit
dec64ff10ed9 ("ice: use [sr]q.count when checking if queue is
initialized") needed this check to happen outside the lock, because it
prevented duplicate attempts at destroying the locks.
The driver may now safely use ice_init_all_ctrlq and
ice_shutdown_all_ctrlq while handling reset events, without causing the
locks to be invalid.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Wed, 26 Jun 2019 09:20:15 +0000 (02:20 -0700)]
ice: Always set prefena when configuring an Rx queue
Currently we are always setting prefena to 0. This is causing the
hardware to only fetch descriptors when there are none free in the cache
for a received packet instead of prefetching when it has used the last
descriptor regardless of incoming packets. Fix this by allowing the
hardware to prefetch Rx descriptors.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tony Nguyen [Wed, 26 Jun 2019 09:20:14 +0000 (02:20 -0700)]
ice: Move vector base setup to PF VSI
When interrupt tracking was refactored, during rebuild, the call to
ice_vsi_setup_vector_base() was inadvertently removed from the PF VSI
instead of being removed from the VF VSI. During reset, the failure to
properly setup the vector base generates a call trace. Correct this so
that resets/rebuilds properly complete.
Fixes: cbe66bfee6a0 ("ice: Refactor interrupt tracking")
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Jacob Keller [Wed, 26 Jun 2019 09:20:13 +0000 (02:20 -0700)]
ice: track hardware stat registers past rollover
Currently, ice_stat_update32 and ice_stat_update40 will limit the
value of the software statistic to 32 or 40 bits wide, depending on
which register is being read.
This means that if a driver is running for a long time, the displayed
software register values will roll over to zero at 40 bits or 32 bits.
This occurs because the functions directly assign the difference between
the previous value and current value of the hardware statistic.
Instead, add this value to the current software statistic, and then
update the previous value.
In this way, each time ice_stat_update40 or ice_stat_update32 are
called, they will increment the software tracking value by the
difference of the hardware register from its last read. The software
tracking value will correctly count up until it overflows a u64.
The only requirement is that the ice_stat_update functions be called at
least once each time the hardware register overflows.
While we're fixing ice_stat_update40, modify it to use rd64 instead of
two calls to rd32. Additionally, drop the now unnecessary hireg
function parameter.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Paul Greenwalt [Wed, 26 Jun 2019 09:20:12 +0000 (02:20 -0700)]
ice: add lp_advertising flow control support
Add support for reporting link partner advertising when
ETHTOOL_GLINKSETTINGS defined. Get pause param reports the Tx/Rx
pause configured, and then ethtool issues ETHTOOL_GSET ioctl and
ice_get_settings_link_up reports the negotiated Tx/Rx pause. Negotiated
pause frame report per IEEE 802.3-2005 table 288-3.
$ ethtool --show-pause ens6f0
Pause parameters for ens6f0:
Autonegotiate: on
RX: on
TX: on
RX negotiated: on
TX negotiated: on
$ ethtool ens6f0
Settings for ens6f0:
Supported ports: [ FIBRE ]
Supported link modes: 25000baseCR/Full
Supported pause frame use: Symmetric
Supports auto-negotiation: Yes
Supported FEC modes: None BaseR RS
Advertised link modes: 25000baseCR/Full
Advertised pause frame use: Symmetric Receive-only
Advertised auto-negotiation: Yes
Advertised FEC modes: None BaseR RS
Link partner advertised link modes: Not reported
Link partner advertised pause frame use: Symmetric
Link partner advertised auto-negotiation: Yes
Link partner advertised FEC modes: Not reported
Speed: 25000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: g
Wake-on: g
Current message level: 0x00000007 (7)
drv probe link
Link detected: yes
When ETHTOOL_GLINKSETTINGS is not defined, get pause param reports the
negotiated Tx/Rx pause.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
YueHaibing [Wed, 31 Jul 2019 16:02:19 +0000 (18:02 +0200)]
staging/octeon: Fix build error without CONFIG_NETDEVICES
While do COMPILE_TEST build without CONFIG_NETDEVICES,
we get Kconfig warning:
WARNING: unmet direct dependencies detected for PHYLIB
Depends on [n]: NETDEVICES [=n]
Selected by [y]:
- OCTEON_ETHERNET [=y] && STAGING [=y] && (CAVIUM_OCTEON_SOC && NETDEVICES [=n] || COMPILE_TEST [=y])
Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: Mark Brown <broonie@kernel.org>
Fixes: 171a9bae68c7 ("staging/octeon: Allow test build on !MIPS")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 31 Jul 2019 15:59:41 +0000 (08:59 -0700)]
Merge tag 'mac80211-next-for-davem-2019-07-31' of git://git./linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
We have a reasonably large number of changes:
* lots more HE (802.11ax) support, particularly things
relevant for the the AP side, but also mesh support
* debugfs cleanups from Greg
* some more work on extended key ID
* start using genl parallel_ops, as preparation for
weaning ourselves off RTNL and getting parallelism
* various other changes all over
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 31 Jul 2019 15:47:14 +0000 (08:47 -0700)]
Merge branch 'mlxsw-Test-coverage-for-DSCP-leftover-fix'
Petr Machata says:
====================
mlxsw: Test coverage for DSCP leftover fix
This patch set fixes some global scope pollution issues in the DSCP tests
(in patch #1), and then proceeds (in patch #2) to add a new test for
checking whether, after DSCP prioritization rules are removed from a port,
DSCP rewrites consistently to zero, instead of the last removed rule still
staying in effect.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Wed, 31 Jul 2019 10:30:27 +0000 (10:30 +0000)]
selftests: mlxsw: Add a test for leftover DSCP rule
Commit
dedfde2fe1c4 ("mlxsw: spectrum_dcb: Configure DSCP map as the last
rule is removed") fixed a problem in mlxsw where last DSCP rule to be
removed remained in effect when DSCP rewrite was applied.
Add a selftest that covers this problem.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Wed, 31 Jul 2019 10:30:26 +0000 (10:30 +0000)]
selftests: mlxsw: Fix local variable declarations in DSCP tests
These two tests have some problems in the global scope pollution and on
contrary, contain unnecessary local declarations. Fix them.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ding Xiang [Wed, 31 Jul 2019 08:53:46 +0000 (16:53 +0800)]
myri10ge: remove unneeded variable
"error" is unneeded,just return 0
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Wed, 31 Jul 2019 08:06:38 +0000 (10:06 +0200)]
net: ag71xx: Slighly simplify code in 'ag71xx_rings_init()'
A few lines above, we have:
tx_size = BIT(tx->order);
So use 'tx_size' directly to be consistent with the way 'rx->descs_cpu' and
'rx->descs_dma' are computed below.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shay Bar [Wed, 3 Jul 2019 13:18:48 +0000 (16:18 +0300)]
mac80211: HE STA disassoc due to QOS NULL not sent
In case of HE AP-STA link, ieee80211_send_nullfunc() will not
send the QOS NULL packet to check if AP is still associated.
In this case, probe_send_count will be non-zero and
ieee80211_sta_work() will later disassociate the AP, even
though no packet was ever sent.
Fix this by decrementing probe_send_count and not calling
ieee80211_send_nullfunc() in case of HE link, so that we
still wait for some time for the AP beacon to reappear and
don't disconnect right away.
Signed-off-by: Shay Bar <shay.bar@celeno.com>
Link: https://lore.kernel.org/r/20190703131848.22879-1-shay.bar@celeno.com
[clarify commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
John Crispin [Tue, 30 Jul 2019 16:37:01 +0000 (18:37 +0200)]
mac80211: allow setting spatial reuse parameters from bss_conf
Store the OBSS PD parameters inside bss_conf when bringing up an AP and/or
when a station connects to an AP. This allows the driver to configure the
HW accordingly.
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190730163701.18836-3-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 31 Jul 2019 08:58:20 +0000 (10:58 +0200)]
nl80211: add strict start type
Add a strict start type so all new attributes starting from
NL80211_ATTR_HE_OBSS_PD are validated strictly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
John Crispin [Tue, 30 Jul 2019 16:37:00 +0000 (18:37 +0200)]
cfg80211: add support for parsing OBBS_PD attributes
Add the data structure, policy and parsing code allowing userland to send
the OBSS PD information into the kernel.
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20190730163701.18836-2-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Karthikeyan Periyasamy [Wed, 24 Jul 2019 09:16:10 +0000 (14:46 +0530)]
mac80211: reject zero MAC address in add station
This came up in fuzz testing, and really we don't consider
all-zeroes to be a valid MAC address in most places, so
also reject it here to avoid confusion later on.
Signed-off-by: Karthikeyan Periyasamy <periyasa@codeaurora.org>
Link: https://lore.kernel.org/r/1563959770-21570-1-git-send-email-periyasa@codeaurora.org
[rewrite commit message]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Mon, 29 Jul 2019 14:31:09 +0000 (16:31 +0200)]
cfg80211: use parallel_ops for genl
Over time, we really need to get rid of all of our global locking.
One of the things needed is to use parallel_ops. This isn't really
the most important (RTNL is much more important) but OTOH we just
keep adding uses of genl_family_attrbuf() now. Use .parallel_ops to
disallow this.
Reviewed-By: Denis Kenzior <denkenz@gmail.com>
Link: https://lore.kernel.org/r/20190729143109.18683-1-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Mon, 29 Jul 2019 16:06:05 +0000 (18:06 +0200)]
mac80211_hwsim: fill boottime_ns in netlink RX path
Give a proper boottime_ns value for netlink RX to avoid scan
issues here.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20190729160605.1074-1-johannes@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Colin Ian King [Tue, 30 Jul 2019 14:32:05 +0000 (15:32 +0100)]
mac80211: add missing null return check from call to ieee80211_get_sband
The return from ieee80211_get_sband can potentially be a null pointer, so
it seems prudent to add a null check to avoid a null pointer dereference
on sband.
Addresses-Coverity: ("Dereference null return")
Fixes: 2ab45876756f ("mac80211: add support for the ADDBA extension element")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20190730143205.14261-1-colin.king@canonical.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Tue, 30 Jul 2019 22:12:50 +0000 (15:12 -0700)]
Merge branch 'net-dsa-ksz-Add-Microchip-KSZ87xx-support'
Marek Vasut says:
====================
net: dsa: ksz: Add Microchip KSZ87xx support
This series adds support for Microchip KSZ87xx switches, which are
slightly simpler compared to KSZ9xxx .
====================
Signed-off-by: Marek Vasut <marex@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tristram Ha [Mon, 29 Jul 2019 17:49:47 +0000 (19:49 +0200)]
net: dsa: ksz: Add Microchip KSZ8795 DSA driver
Add Microchip KSZ8795 DSA driver.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tristram Ha [Mon, 29 Jul 2019 17:49:46 +0000 (19:49 +0200)]
net: dsa: ksz: Add KSZ8795 tag code
Add DSA tag code for Microchip KSZ8795 switch. The switch is simpler
and the tag is only 1 byte, instead of 2 as is the case with KSZ9477.
Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Vasut [Mon, 29 Jul 2019 17:49:45 +0000 (19:49 +0200)]
dt-bindings: net: dsa: ksz: document Microchip KSZ87xx family switches
Document Microchip KSZ87xx family switches. These include
KSZ8765 - 5 port switch
KSZ8794 - 4 port switch
KSZ8795 - 5 port switch
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: David S. Miller <davem@davemloft.net>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: Vivien Didelot <vivien.didelot@gmail.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: devicetree@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Jul 2019 22:00:00 +0000 (15:00 -0700)]
Merge branch 'vsock-virtio-optimizations-to-increase-the-throughput'
Stefano Garzarella says:
====================
vsock/virtio: optimizations to increase the throughput
This series tries to increase the throughput of virtio-vsock with slight
changes.
While I was testing the v2 of this series I discovered an huge use of memory,
so I added patch 1 to mitigate this issue. I put it in this series in order
to better track the performance trends.
v5:
- rebased all patches on net-next
- added Stefan's R-b and Michael's A-b
v4: https://patchwork.kernel.org/cover/
11047717
v3: https://patchwork.kernel.org/cover/
10970145
v2: https://patchwork.kernel.org/cover/
10938743
v1: https://patchwork.kernel.org/cover/
10885431
Below are the benchmarks step by step. I used iperf3 [1] modified with VSOCK
support. As Michael suggested in the v1, I booted host and guest with 'nosmap'.
A brief description of patches:
- Patches 1: limit the memory usage with an extra copy for small packets
- Patches 2+3: reduce the number of credit update messages sent to the
transmitter
- Patches 4+5: allow the host to split packets on multiple buffers and use
VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max packet size allowed
host -> guest [Gbps]
pkt_size before opt p 1 p 2+3 p 4+5
32 0.032 0.030 0.048 0.051
64 0.061 0.059 0.108 0.117
128 0.122 0.112 0.227 0.234
256 0.244 0.241 0.418 0.415
512 0.459 0.466 0.847 0.865
1K 0.927 0.919 1.657 1.641
2K 1.884 1.813 3.262 3.269
4K 3.378 3.326 6.044 6.195
8K 5.637 5.676 10.141 11.287
16K 8.250 8.402 15.976 16.736
32K 13.327 13.204 19.013 20.515
64K 21.241 21.341 20.973 21.879
128K 21.851 22.354 21.816 23.203
256K 21.408 21.693 21.846 24.088
512K 21.600 21.899 21.921 24.106
guest -> host [Gbps]
pkt_size before opt p 1 p 2+3 p 4+5
32 0.045 0.046 0.057 0.057
64 0.089 0.091 0.103 0.104
128 0.170 0.179 0.192 0.200
256 0.364 0.351 0.361 0.379
512 0.709 0.699 0.731 0.790
1K 1.399 1.407 1.395 1.427
2K 2.670 2.684 2.745 2.835
4K 5.171 5.199 5.305 5.451
8K 8.442 8.500 10.083 9.941
16K 12.305 12.259 13.519 15.385
32K 11.418 11.150 11.988 24.680
64K 10.778 10.659 11.589 35.273
128K 10.421 10.339 10.939 40.338
256K 10.300 9.719 10.508 36.562
512K 9.833 9.808 10.612 35.979
As Stefan suggested in the v1, I measured also the efficiency in this way:
efficiency = Mbps / (%CPU_Host + %CPU_Guest)
The '%CPU_Guest' is taken inside the VM. I know that it is not the best way,
but it's provided for free from iperf3 and could be an indication.
host -> guest efficiency [Mbps / (%CPU_Host + %CPU_Guest)]
pkt_size before opt p 1 p 2+3 p 4+5
32 0.35 0.45 0.79 1.02
64 0.56 0.80 1.41 1.54
128 1.11 1.52 3.03 3.12
256 2.20 2.16 5.44 5.58
512 4.17 4.18 10.96 11.46
1K 8.30 8.26 20.99 20.89
2K 16.82 16.31 39.76 39.73
4K 30.89 30.79 74.07 75.73
8K 53.74 54.49 124.24 148.91
16K 80.68 83.63 200.21 232.79
32K 132.27 132.52 260.81 357.07
64K 229.82 230.40 300.19 444.18
128K 332.60 329.78 331.51 492.28
256K 331.06 337.22 339.59 511.59
512K 335.58 328.50 331.56 504.56
guest -> host efficiency [Mbps / (%CPU_Host + %CPU_Guest)]
pkt_size before opt p 1 p 2+3 p 4+5
32 0.43 0.43 0.53 0.56
64 0.85 0.86 1.04 1.10
128 1.63 1.71 2.07 2.13
256 3.48 3.35 4.02 4.22
512 6.80 6.67 7.97 8.63
1K 13.32 13.31 15.72 15.94
2K 25.79 25.92 30.84 30.98
4K 50.37 50.48 58.79 59.69
8K 95.90 96.15 107.04 110.33
16K 145.80 145.43 143.97 174.70
32K 147.06 144.74 146.02 282.48
64K 145.25 143.99 141.62 406.40
128K 149.34 146.96 147.49 489.34
256K 156.35 149.81 152.21 536.37
512K 151.65 150.74 151.52 519.93
[1] https://github.com/stefano-garzarella/iperf/
====================
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 30 Jul 2019 15:43:34 +0000 (17:43 +0200)]
vsock/virtio: change the maximum packet size allowed
Since now we are able to split packets, we can avoid limiting
their sizes to VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE.
Instead, we can use VIRTIO_VSOCK_MAX_PKT_BUF_SIZE as the max
packet size.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 30 Jul 2019 15:43:33 +0000 (17:43 +0200)]
vhost/vsock: split packets to send using multiple buffers
If the packets to sent to the guest are bigger than the buffer
available, we can split them, using multiple buffers and fixing
the length in the packet header.
This is safe since virtio-vsock supports only stream sockets.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 30 Jul 2019 15:43:32 +0000 (17:43 +0200)]
vsock/virtio: fix locking in virtio_transport_inc_tx_pkt()
fwd_cnt and last_fwd_cnt are protected by rx_lock, so we should use
the same spinlock also if we are in the TX path.
Move also buf_alloc under the same lock.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 30 Jul 2019 15:43:31 +0000 (17:43 +0200)]
vsock/virtio: reduce credit update messages
In order to reduce the number of credit update messages,
we send them only when the space available seen by the
transmitter is less than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Garzarella [Tue, 30 Jul 2019 15:43:30 +0000 (17:43 +0200)]
vsock/virtio: limit the memory used per-socket
Since virtio-vsock was introduced, the buffers filled by the host
and pushed to the guest using the vring, are directly queued in
a per-socket list. These buffers are preallocated by the guest
with a fixed size (4 KB).
The maximum amount of memory used by each socket should be
controlled by the credit mechanism.
The default credit available per-socket is 256 KB, but if we use
only 1 byte per packet, the guest can queue up to 262144 of 4 KB
buffers, using up to 1 GB of memory per-socket. In addition, the
guest will continue to fill the vring with new 4 KB free buffers
to avoid starvation of other sockets.
This patch mitigates this issue copying the payload of small
packets (< 128 bytes) into the buffer of last packet queued, in
order to avoid wasting memory.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Boyd [Tue, 30 Jul 2019 18:15:51 +0000 (11:15 -0700)]
net: Remove dev_err() usage after platform_get_irq()
We don't need dev_err() messages when platform_get_irq() fails now that
platform_get_irq() prints an error message itself when something goes
wrong. Let's remove these prints with a simple semantic patch.
// <smpl>
@@
expression ret;
struct platform_device *E;
@@
ret =
(
platform_get_irq(E, ...)
|
platform_get_irq_byname(E, ...)
);
if ( \( ret < 0 \| ret <= 0 \) )
{
(
-if (ret != -EPROBE_DEFER)
-{ ...
-dev_err(...);
-... }
|
...
-dev_err(...);
)
...
}
// </smpl>
While we're here, remove braces on if statements that only have one
statement (manually).
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Felix Fietkau <nbd@nbd.name>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Jul 2019 21:21:32 +0000 (14:21 -0700)]
Merge branch 'Finish-conversion-of-skb_frag_t-to-bio_vec'
Jonathan Lemon says:
====================
Finish conversion of skb_frag_t to bio_vec
The recent conversion of skb_frag_t to bio_vec did not include
skb_frag's page_offset. Add accessor functions for this field,
utilize them, and remove the union, restoring the original structure.
v2:
- rename accessors
- follow kdoc conventions
====================
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathan Lemon [Tue, 30 Jul 2019 14:40:34 +0000 (07:40 -0700)]
linux: Remove bvec page_offset, use bv_offset
Now that page_offset is referenced through accessors, remove
the union, and use bv_offset.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathan Lemon [Tue, 30 Jul 2019 14:40:33 +0000 (07:40 -0700)]
net: Use skb_frag_off accessors
Use accessor functions for skb fragment's page_offset instead
of direct references, in preparation for bvec conversion.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jonathan Lemon [Tue, 30 Jul 2019 14:40:32 +0000 (07:40 -0700)]
linux: Add skb_frag_t page_offset accessors
Add skb_frag_off(), skb_frag_off_add(), skb_frag_off_set(),
and skb_frag_off_copy() accessors for page_offset.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Jul 2019 21:18:14 +0000 (14:18 -0700)]
Merge branch 'sctp-clean-up-sctp_connect-function'
Xin Long says:
====================
sctp: clean up __sctp_connect function
This patchset is to factor out some common code for
sctp_sendmsg_new_asoc() and __sctp_connect() into 2
new functioins.
v1->v2:
- add the patch 1/5 to avoid a slab-out-of-bounds warning.
- add some code comment for the check change in patch 2/5.
- remove unused 'addrcnt' as Marcelo noticed in patch 3/5.
====================
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 30 Jul 2019 12:38:23 +0000 (20:38 +0800)]
sctp: factor out sctp_connect_add_peer
In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it adds a peer with the other addr into the
asoc after this asoc is created with the 1st addr.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 30 Jul 2019 12:38:22 +0000 (20:38 +0800)]
sctp: factor out sctp_connect_new_asoc
In this function factored out from sctp_sendmsg_new_asoc() and
__sctp_connect(), it creates the asoc and adds a peer with the
1st addr.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 30 Jul 2019 12:38:21 +0000 (20:38 +0800)]
sctp: clean up __sctp_connect
__sctp_connect is doing quit similar things as sctp_sendmsg_new_asoc.
To factor out common functions, this patch is to clean up their code
to make them look more similar:
1. create the asoc and add a peer with the 1st addr.
2. add peers with the other addrs into this asoc one by one.
while at it, also remove the unused 'addrcnt'.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 30 Jul 2019 12:38:20 +0000 (20:38 +0800)]
sctp: check addr_size with sa_family_t size in __sctp_setsockopt_connectx
Now __sctp_connect() is called by __sctp_setsockopt_connectx() and
sctp_inet_connect(), the latter has done addr_size check with size
of sa_family_t.
In the next patch to clean up __sctp_connect(), we will remove
addr_size check with size of sa_family_t from __sctp_connect()
for the 1st address.
So before doing that, __sctp_setsockopt_connectx() should do
this check first, as sctp_inet_connect() does.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 30 Jul 2019 12:38:19 +0000 (20:38 +0800)]
sctp: only copy the available addr data in sctp_transport_init
'addr' passed to sctp_transport_init is not always a whole size
of union sctp_addr, like the path:
sctp_sendmsg() ->
sctp_sendmsg_new_asoc() ->
sctp_assoc_add_peer() ->
sctp_transport_new() -> sctp_transport_init()
In the next patches, we will also pass the address length of data
only to sctp_assoc_add_peer().
So sctp_transport_init() should copy the only available data from
addr to peer->ipaddr, instead of 'peer->ipaddr = *addr' which may
cause slab-out-of-bounds.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Tue, 30 Jul 2019 14:56:57 +0000 (15:56 +0100)]
rxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto
rxkad sometimes triggers a warning about oversized stack frames when
building with clang for a 32-bit architecture:
net/rxrpc/rxkad.c:243:12: error: stack frame size of 1088 bytes in function 'rxkad_secure_packet' [-Werror,-Wframe-larger-than=]
net/rxrpc/rxkad.c:501:12: error: stack frame size of 1088 bytes in function 'rxkad_verify_packet' [-Werror,-Wframe-larger-than=]
The problem is the combination of SYNC_SKCIPHER_REQUEST_ON_STACK() in
rxkad_verify_packet()/rxkad_secure_packet() with the relatively large
scatterlist in rxkad_verify_packet_1()/rxkad_secure_packet_encrypt().
The warning does not show up when using gcc, which does not inline the
functions as aggressively, but the problem is still the same.
Allocate the cipher buffers from the slab instead, caching the allocated
packet crypto request memory used for DATA packet crypto in the rxrpc_call
struct.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Jul 2019 21:19:09 +0000 (14:19 -0700)]
Merge branch 'bnxt_en-TPA-57500'
Michael Chan says:
====================
bnxt_en: Add TPA (GRO_HW and LRO) on 57500 chips.
This patchset adds TPA v2 support on the 57500 chips. TPA v2 is
different from the legacy TPA scheme on older chips and requires major
refactoring and restructuring of the existing TPA logic. The main
difference is that the new TPA v2 has on-the-fly aggregation buffer
completions before a TPA packet is completed. The larger aggregation
ID space also requires a new ID mapping logic to make it more
memory efficient.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:33 +0000 (06:10 -0400)]
bnxt_en: Add PCI IDs for 57500 series NPAR devices.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:32 +0000 (06:10 -0400)]
bnxt_en: Support all variants of the 5750X chip family.
Define the 57508, 57504, and 57502 chip IDs that are all part of the
BNXT_CHIP_P5 family of chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:31 +0000 (06:10 -0400)]
bnxt_en: Refactor bnxt_init_one() and turn on TPA support on 57500 chips.
With the new TPA feature in the 57500 chips, we need to discover the
feature first before setting up the netdev features. Refactor the
the firmware probe and init logic more cleanly into 2 functions and
and make these calls before setting up the netdev features.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:30 +0000 (06:10 -0400)]
bnxt_en: Support TPA counters on 57500 chips.
Support the new expanded TPA v2 counters on 57500 B0 chips for
ethtool -S.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:29 +0000 (06:10 -0400)]
bnxt_en: Allocate the larger per-ring statistics block for 57500 chips.
The new TPA implemantation has additional TPA counters that extend the
per-ring statistics block. Allocate the proper size accordingly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:28 +0000 (06:10 -0400)]
bnxt_en: Refactor ethtool ring statistics logic.
The current code assumes that the per ring statistics counters are
fixed. In newer chips that support a newer version of TPA, the
TPA counters are also changed. Refactor the code by defining these
counter names in arrays so that it is easy to add a new array for
a new set of counters supported by the newer chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:27 +0000 (06:10 -0400)]
bnxt_en: Add hardware GRO setup function for 57500 chips.
Add a more optimized hardware GRO function to setup the SKB on 57500
chips. Some workaround code is no longer needed on 57500 chips and
the pseudo checksum is also calculated in hardware, so no need to
do the software pseudo checksum in the driver.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:26 +0000 (06:10 -0400)]
bnxt_en: Add TPA ID mapping logic for 57500 chips.
The new TPA feature on 57500 supports a larger number of concurrent TPAs
(up to 1024) divided among the functions. We need to add some logic to
map the hardware TPA ID to a software index that keeps track of each TPA
in progress. A 1:1 direct mapping without translation would be too
wasteful as we would have to allocate 1024 TPA structures for each RX
ring on each PCI function.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:25 +0000 (06:10 -0400)]
bnxt_en: Add fast path logic for TPA on 57500 chips.
With all the previous refactoring, the TPA fast path can now be
modified slightly to support TPA on the new chips. The main
difference is that the agg completions are retrieved differently using
the bnxt_get_tpa_agg_p5() function on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:24 +0000 (06:10 -0400)]
bnxt_en: Set TPA GRO mode flags on 57500 chips properly.
On 57500 chips, hardware GRO mode cannot be determined from the TPA
end, so we need to check bp->flags to determine if we are in hardware
GRO mode or not. Modify bnxt_set_features so that the TPA flags
in bp->flags don't change until the device is closed. This will ensure
that the fast path can safely rely on bp->flags to determine the
TPA mode.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:23 +0000 (06:10 -0400)]
bnxt_en: Refactor tunneled hardware GRO logic.
The 2 GRO functions to set up the hardware GRO SKB fields for 2
different hardware chips have practically identical logic for
tunneled packets. Refactor the logic into a separate bnxt_gro_tunnel()
function that can be used by both functions.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:22 +0000 (06:10 -0400)]
bnxt_en: Handle standalone RX_AGG completions.
On the new 57500 chips, these new RX_AGG completions are not coalesced
at the TPA_END completion. Handle these by storing them in the
array in the bnxt_tpa_info struct, as they are seen when processing
the CMPL ring.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:21 +0000 (06:10 -0400)]
bnxt_en: Expand bnxt_tpa_info struct to support 57500 chips.
Add an aggregation array to bnxt_tpa_info struct to keep track of the
aggregation completions. The aggregation completions are not
completed at the TPA_END completion on 57500 chips so we need to
keep track of them. The array is only allocated on the new chips
when required. An agg_count field is also added to keep track of the
number of these completions.
The maximum concurrent TPA is now discovered from firmware instead of
the hardcoded 64. Add a new bp->max_tpa to keep track of maximum
configured TPA.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:20 +0000 (06:10 -0400)]
bnxt_en: Refactor TPA logic.
Refactor the TPA logic slightly, so that the code can be more easily
extended to support TPA on the new 57500 chips. In particular, the
logic to get the next aggregation completion is refactored into a
new function bnxt_get_agg() so that this operation is made more
generalized. This operation will be different on the new chip in TPA
mode. The logic to recycle the aggregation buffers has a new start
index parameter added for the same purpose.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:19 +0000 (06:10 -0400)]
bnxt_en: Add TPA structure definitions for BCM57500 chips.
The new chips have a slightly modified TPA interface for LRO/GRO_HW.
Modify the TPA structures so that the same structures can also be
used on the new chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 29 Jul 2019 10:10:18 +0000 (06:10 -0400)]
bnxt_en: Update firmware interface spec. to 1.10.0.89.
Among the changes are new CoS discard counters and new ctx_hw_stats_ext
struct for the latest 5750X B0 chips.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Mon, 29 Jul 2019 20:40:56 +0000 (22:40 +0200)]
can: fix ioctl function removal
Commit
60649d4e0af ("can: remove obsolete empty ioctl() handler") replaced the
almost empty can_ioctl() function with sock_no_ioctl() which always returns
-EOPNOTSUPP.
Even though we don't have any ioctl() functions on socket/network layer we need
to return -ENOIOCTLCMD to be able to forward ioctl commands like SIOCGIFINDEX
to the network driver layer.
This patch fixes the wrong return codes in the CAN network layer protocols.
Reported-by: kernel test robot <rong.a.chen@intel.com>
Fixes: 60649d4e0af ("can: remove obsolete empty ioctl() handler")
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rasmus Villemoes [Mon, 22 Jul 2019 23:37:26 +0000 (23:37 +0000)]
net: dsa: mv88e6xxx: avoid some redundant vtu load/purge operations
We have an ERPS (Ethernet Ring Protection Switching) setup involving
mv88e6250 switches which we're in the process of switching to a BSP
based on the mainline driver. Breaking any link in the ring works as
expected, with the ring reconfiguring itself quickly and traffic
continuing with almost no noticable drops. However, when plugging back
the cable, we see 5+ second stalls.
This has been tracked down to the userspace application in charge of
the protocol missing a few CCM messages on the good link (the one that
was not unplugged), causing it to broadcast a "signal fail". That
message eventually reaches its link partner, which responds by
blocking the port. Meanwhile, the first node has continued to block
the port with the just plugged-in cable, breaking the network. And the
reason for those missing CCM messages has in turn been tracked down to
the VTU apparently being too busy servicing load/purge operations that
the normal lookups are delayed.
Initial state, the link between C and D is blocked in software.
_____________________
/ \
| |
A ----- B ----- C *---- D
Unplug the cable between C and D.
_____________________
/ \
| |
A ----- B ----- C * * D
Reestablish the link between C and D.
_____________________
/ \
| |
A ----- B ----- C *---- D
Somehow, enough VTU/ATU operations happen inside C that prevents
the application from receving the CCM messages from B in a timely
manner, so a Signal Fail message is sent by C. When B receives
that, it responds by blocking its port.
_____________________
/ \
| |
A ----- B *---* C *---- D
Very shortly after this, the signal fail condition clears on the
BC link (some CCM messages finally make it through), so C
unblocks the port. However, a guard timer inside B prevents it
from removing the blocking before 5 seconds have elapsed.
It is not unlikely that our userspace ERPS implementation could be
smarter and/or is simply buggy. However, this patch fixes the symptoms
we see, and is a small optimization that should not break anything
(knock wood). The idea is simply to avoid doing an VTU load of an
entry identical to the one already present. To do that, we need to
know whether mv88e6xxx_vtu_get() actually found an existing entry, or
has just prepared a struct mv88e6xxx_vtu_entry for us to load. To that
end, let vlan->valid be an output parameter. The other two callers of
mv88e6xxx_vtu_get() are not affected by this patch since they pass
new=false.
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Sun, 28 Jul 2019 09:25:19 +0000 (11:25 +0200)]
r8169: make use of xmit_more
There was a previous attempt to use xmit_more, but the change had to be
reverted because under load sometimes a transmit timeout occurred [0].
Maybe this was caused by a missing memory barrier, the new attempt
keeps the memory barrier before the call to netif_stop_queue like it
is used by the driver as of today. The new attempt also changes the
order of some calls as suggested by Eric.
[0] https://lkml.org/lkml/2019/2/10/39
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Wilcox (Oracle) [Fri, 26 Jul 2019 17:44:25 +0000 (10:44 -0700)]
staging/octeon: Allow test build on !MIPS
Add compile test support by moving all includes of files under
asm/octeon into octeon-ethernet.h, and if we're not on MIPS,
stub out all the calls into the octeon support code in octeon-stubs.h
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ding Xiang [Mon, 29 Jul 2019 09:01:22 +0000 (17:01 +0800)]
net: ag71xx: use resource_size for the ioremap size
use resource_size to calcuate ioremap size and make
the code simpler.
Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Jul 2019 15:56:27 +0000 (08:56 -0700)]
Merge branch 'nfc-next'
Andy Shevchenko says:
====================
NFC: nxp-nci: clean up and new device support
Few people reported that some laptops are coming with new ACPI ID for the
devices should be supported by nxp-nci driver.
This series adds new ID (patch 2), cleans up the driver from legacy platform
data and unifies GPIO request for Device Tree and ACPI (patches 3-6), removes
dead or unneeded code (patches 7, 9, 11), constifies ID table (patch 8),
removes comma in terminator line for better maintenance (patch 10) and
rectifies Kconfig entry (patches 12-14).
It also contains a fix for NFC subsystem as suggested by Sedat.
Series has been tested by Sedat.
Changelog v4:
- rebased on top of latest linux-next
- appended cover letter
- elaborated removal of pr_fmt() in the patch 11 (David)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sedat Dilek [Mon, 29 Jul 2019 13:35:14 +0000 (16:35 +0300)]
NFC: nxp-nci: Fix recommendation for NFC_NXP_NCI_I2C Kconfig
This is a simple cleanup to the Kconfig help text as discussed in [1].
[1] https://marc.info/?t=
155774435600001&r=1&w=2
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
Signed-off-by: Sedat Dilek <sedat.dilek@credativ.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sedat Dilek [Mon, 29 Jul 2019 13:35:13 +0000 (16:35 +0300)]
NFC: nxp-nci: Clarify on supported chips
This patch clarifies on the supported NXP NCI chips and families
and lists PN547 and PN548 separately which are known as NPC100
respectively NPC300.
This helps to find informations and identify drivers on vendor's
support websites.
For details see the discussion in [1] and [2].
[1] https://marc.info/?t=
155774435600001&r=1&w=2
[2] https://patchwork.kernel.org/project/linux-wireless/list/?submitter=33142
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
Signed-off-by: Sedat Dilek <sedat.dilek@credativ.de>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Oleg Zhurakivskyy <oleg.zhurakivskyy@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:12 +0000 (16:35 +0300)]
NFC: nxp-nci: Remove 'default n' for the core
It seems contributors follow the style of Kconfig entries where explicit
'default n' is present. The default 'default' is 'n' already, thus, drop
these lines from Kconfig to make it more clear.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:11 +0000 (16:35 +0300)]
NFC: nxp-nci: Remove unused macro pr_fmt()
The macro had never been used.
The driver uses mostly the nfc_err(), which, with other macros in the family,
is backed by corresponding dev_err(). pr_fmt() is not used for dev_err()
macro. Moreover, there is no need to print the module name which is part of the
device instance name anyway.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:10 +0000 (16:35 +0300)]
NFC: nxp-nci: Drop comma in terminator lines
There is no need to have a comma after terminator entry
in the arrays of IDs.
This may prevent the misguided addition behind the terminator
without compiler notice.
Drop the comma in terminator lines for good.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:09 +0000 (16:35 +0300)]
NFC: nxp-nci: Drop of_match_ptr() use
There is no need to guard OF device ID table with of_match_ptr().
Otherwise we would get a defined but not used data.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:08 +0000 (16:35 +0300)]
NFC: nxp-nci: Constify acpi_device_id
The content of acpi_device_id is not supposed to change at runtime.
All functions working with acpi_device_id provided by <linux/acpi.h>
work with const acpi_device_id. So mark the non-const structs as const.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:07 +0000 (16:35 +0300)]
NFC: nxp-nci: Get rid of useless label
Return directly in ->probe() since there no special cleaning is needed.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:06 +0000 (16:35 +0300)]
NFC: nxp-nci: Get rid of code duplication in ->probe()
Since OF and ACPI case almost the same get rid of code duplication
by moving gpiod_get() calls directly to ->probe().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:05 +0000 (16:35 +0300)]
NFC: nxp-nci: Add GPIO ACPI mapping table
In order to unify GPIO resource request prepare gpiod_get_index()
to behave correctly when there is no mapping provided by firmware.
Here we add explicit mapping between _CRS GpioIo() resources and
their names used in the driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:04 +0000 (16:35 +0300)]
NFC: nxp-nci: Convert to use GPIO descriptor
Since we got rid of platform data, the driver may use
GPIO descriptor directly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:03 +0000 (16:35 +0300)]
NFC: nxp-nci: Get rid of platform data
Legacy platform data must go away. We are on the safe side here since
there are no users of it in the kernel.
If anyone by any odd reason needs it the GPIO lookup tables and
built-in device properties at your service.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Mon, 29 Jul 2019 13:35:02 +0000 (16:35 +0300)]
NFC: nxp-nci: Add NXP1001 to the ACPI ID table
It seems a lot of laptops are equipped with NXP NFC300 chip with
the ACPI ID NXP1001 as per DSDT.
Append it to the driver's ACPI ID table.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey Konovalov [Mon, 29 Jul 2019 13:35:01 +0000 (16:35 +0300)]
NFC: fix attrs checks in netlink interface
nfc_genl_deactivate_target() relies on the NFC_ATTR_TARGET_INDEX
attribute being present, but doesn't check whether it is actually
provided by the user. Same goes for nfc_genl_fw_download() and
NFC_ATTR_FIRMWARE_NAME.
This patch adds appropriate checks.
Found with syzkaller.
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 29 Jul 2019 15:23:41 +0000 (08:23 -0700)]
Merge branch 'hns3-next'
Huazhong Tan says:
====================
net: hns3: some code optimizations & bugfixes & features
This patch-set includes code optimizations, bugfixes and features for
the HNS3 ethernet controller driver.
[patch 1/10] checks reset status before setting channel.
[patch 2/10] adds a NULL pointer checking.
[patch 3/10] removes reset level upgrading when current reset fails.
[patch 4/10] fixes a GFP flags errors when holding spin_lock.
[patch 5/10] modifies firmware version format.
[patch 6/10] adds some print information which is off by default.
[patch 7/10 - 8/10] adds two code optimizations about interrupt handler
and work task.
[patch 9/10] adds support for using order 1 pages with a 4K buffer.
[patch 10/10] modifies messages prints with dev_info() instead of
pr_info().
Change log:
V3->V4: replace netif_info with netif_dbg in [patch 6/10]
V2->V3: fixes comments from Saeed Mahameed and Joe Perches.
V1->V2: fixes comments from Saeed Mahameed and
removes previous [patch 4/11] and [patch 11/11]
which needs further discussion, and adds a new
patch [10/10] suggested by Saeed Mahameed.
====================
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 29 Jul 2019 02:53:31 +0000 (10:53 +0800)]
net: hns3: use dev_info() instead of pr_info()
dev_info() is more appropriate for printing messages when driver
initialization done, so switch to dev_info().
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 29 Jul 2019 02:53:30 +0000 (10:53 +0800)]
net: hns3: Add support for using order 1 pages with a 4K buffer
Hardware supports 0.5K, 1K, 2K, 4K RX buffer size, the
RX buffer can not be reused because the hns3_page_order
return 0 when page size and RX buffer size are both 4096.
So this patch changes the hns3_page_order to return 1 when
RX buffer is greater than half of the page size and page size
is less the 8192, and dev_alloc_pages has already been used
to allocate the compound page for RX buffer.
This patch also changes hnae3_* to hns3_* for page order
and RX buffer size calculation because they are used in
hns3 module.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 29 Jul 2019 02:53:29 +0000 (10:53 +0800)]
net: hns3: add interrupt affinity support for misc interrupt
The misc interrupt is used to schedule the reset and mailbox
subtask, and service_task delayed_work is used to do periodic
management work each second.
This patch sets the above three subtask's affinity using the
misc interrupt' affinity.
Also this patch setups a affinity notify for misc interrupt to
allow user to change the above three subtask's affinity.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Mon, 29 Jul 2019 02:53:28 +0000 (10:53 +0800)]
net: hns3: make hclge_service use delayed workqueue
Use delayed work instead of using timers to trigger the
hclge_serive.
Simplify the code with one less middle function and in order
to support misc irq affinity.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yonglong Liu [Mon, 29 Jul 2019 02:53:27 +0000 (10:53 +0800)]
net: hns3: add debug messages to identify eth down cause
Some times just see the eth interface have been down/up via
dmesg, but can not know why the eth down. So adds some debug
messages to identify the cause for this.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Mon, 29 Jul 2019 02:53:26 +0000 (10:53 +0800)]
net: hns3: modify firmware version display format
This patch modifies firmware version display format in
hclge(vf)_cmd_init() and hns3_get_drvinfo(). Also, adds
some optimizations for firmware version display format.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Mon, 29 Jul 2019 02:53:25 +0000 (10:53 +0800)]
net: hns3: change GFP flag during lock period
When allocating memory, the GFP_KERNEL cannot be used during the
spin_lock period. This is because it may cause scheduling when holding
spin_lock. This patch changes GFP flag to GFP_ATOMIC in this case.
Fixes: dd74f815dd41 ("net: hns3: Add support for rule add/delete for flow director")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: lipeng 00277521 <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Mon, 29 Jul 2019 02:53:24 +0000 (10:53 +0800)]
net: hns3: remove upgrade reset level when reset fail
Currently, hclge_reset_err_handle() will assert a global reset
when the failing count is smaller than MAX_RESET_FAIL_CNT, which
will affect other running functions.
So this patch removes this upgrading, and uses re-scheduling reset
task to do it.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guangbin Huang [Mon, 29 Jul 2019 02:53:23 +0000 (10:53 +0800)]
net: hns3: add a check for get_reset_level
For some cases, ops->get_reset_level may not be implemented, so we
should check whether it is NULL before calling get_reset_level.
Signed-off-by: Guangbin Huang <huangguangbin@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>