Karsten Graul [Sun, 3 May 2020 12:38:50 +0000 (14:38 +0200)]
net/smc: enqueue local LLC messages
As SMC server, when a second link was deleted, trigger the setup of an
asymmetric link. Do this by enqueueing a local ADD_LINK message which
is processed by the LLC layer as if it were received from peer. Do the
same when a new IB port became active and a new link could be created.
smc_llc_srv_add_link_local() enqueues a local ADD_LINK message.
And smc_llc_srv_delete_link_local() is used the same way to enqueue a
local DELETE_LINK message. This is used when an IB port is no longer
active.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:49 +0000 (14:38 +0200)]
net/smc: delete link processing as SMC server
Add smc_llc_process_srv_delete_link() to process a DELETE_LINK request
as SMC server. When the request is to delete ALL links then terminate
the whole link group. If not, find the link to delete by its link_id,
send the DELETE_LINK request LLC message and wait for the response.
No matter if a response was received, clear the deleted link and update
the link group state.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:48 +0000 (14:38 +0200)]
net/smc: delete link processing as SMC client
Add smc_llc_process_cli_delete_link() to process a DELETE_LINK request
as SMC client. When the request is to delete ALL links then terminate
the whole link group. If not, find the link to delete by its link_id,
send the DELETE_LINK response LLC message and then clear the deleted
link. Finally determine and update the link group state.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:47 +0000 (14:38 +0200)]
net/smc: llc_del_link_work and use the LLC flow for delete link
Introduce a work that is scheduled when a new DELETE_LINK LLC request is
received. The work will call either the SMC client or SMC server
DELETE_LINK processing.
And use the LLC flow framework to process incoming DELETE_LINK LLC
messages, scheduling the llc_del_link_work for those events.
With these changes smc_lgr_forget() is only called by one function and
can be migrated into smc_lgr_cleanup_early().
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:46 +0000 (14:38 +0200)]
net/smc: delete an asymmetric link as SMC server
When a link group moved from asymmetric to symmetric state then the
dangling asymmetric link can be deleted. Add smc_llc_find_asym_link() to
find the respective link and add smc_llc_delete_asym_link() to delete
it.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:45 +0000 (14:38 +0200)]
net/smc: final part of add link processing as SMC server
This patch finalizes the ADD_LINK processing of new links. Send the
CONFIRM_LINK request to the peer, receive the response and set link
state to ACTIVE.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:44 +0000 (14:38 +0200)]
net/smc: rkey processing for a new link as SMC server
Part of SMC server new link establishment is the exchange of rkeys for
used buffers.
Loop over all used RMB buffers and send ADD_LINK_CONTINUE LLC messages
to the peer.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:43 +0000 (14:38 +0200)]
net/smc: first part of add link processing as SMC server
First set of functions to process an ADD_LINK LLC request as an SMC
server. Find an alternate IB device, determine the new link group type
and get the index for the new link. Then initialize the link and send
the ADD_LINK LLC message to the peer. Save the contents of the response,
ready the link, map all used buffers and register the buffers with the
IB device. If any error occurs, stop the processing and clear the link.
And call smc_llc_srv_add_link() in af_smc.c to start second link
establishment after the initial link of a link group was created.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:42 +0000 (14:38 +0200)]
net/smc: final part of add link processing as SMC client
This patch finalizes the ADD_LINK processing of new links. Receive the
CONFIRM_LINK request from peer, complete the link initialization,
register all used buffers with the IB device and finally send the
CONFIRM_LINK response, which completes the ADD_LINK processing.
And activate smc_llc_cli_add_link() in af_smc.c.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:41 +0000 (14:38 +0200)]
net/smc: rkey processing for a new link as SMC client
Part of the SMC client new link establishment process is the exchange of
rkeys for all used buffers.
Add new LLC message type ADD_LINK_CONTINUE which is used to exchange
rkeys of all current RMB buffers. Add functions to iterate over all
used RMB buffers of the link group, and implement the ADD_LINK_CONTINUE
processing.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Sun, 3 May 2020 12:38:40 +0000 (14:38 +0200)]
net/smc: first part of add link processing as SMC client
First set of functions to process an ADD_LINK LLC request as an SMC
client. Find an alternate IB device, determine the new link group type
and get the index for the new link. Then ready the link, map the buffers
and send an ADD_LINK LLC response. If any error occurs, send a reject
LLC message and terminate the processing.
Add smc_llc_alloc_alt_link() to find a free link index for a new link,
depending on the new link group type.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 3 May 2020 22:59:30 +0000 (15:59 -0700)]
Merge branch 'Enhance-current-features-in-ena-driver'
Sameeh Jubran says:
====================
Enhance current features in ena driver
Difference from v2:
* dropped patch "net: ena: move llq configuration from ena_probe to ena_device_init()"
* reworked patch ""net: ena: implement ena_com_get_admin_polling_mode() to drop the prototype
Difference from v1:
* reodered paches #01 and #02.
* dropped adding Rx/Tx drops to ethtool in patch #08
V1:
This patchset introduces the following:
* minor changes to RSS feature
* add total rx and tx drop counter
* add unmask_interrupt counter for ethtool statistics
* add missing implementation for ena_com_get_admin_polling_mode()
* some minor code clean-up and cosmetics
* use SHUTDOWN as reset reason when closing interface
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arthur Kiyanovski [Sun, 3 May 2020 09:52:21 +0000 (09:52 +0000)]
net: ena: cosmetic: extract code to ena_indirection_table_set()
Extract code to ena_indirection_table_set() to make
the code cleaner.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:20 +0000 (09:52 +0000)]
net: ena: cosmetic: remove unnecessary spaces and tabs in ena_com.h macros
The macros in ena_com.h have inconsistent spaces between
the macro name and it's value.
This commit sets all the macros to have a single space between
the name and value.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:19 +0000 (09:52 +0000)]
net: ena: use SHUTDOWN as reset reason when closing interface
The 'ENA_REGS_RESET_SHUTDOWN' enum indicates a normal driver
shutdown / removal procedure.
Also, a comment is added to one of the reset reason assignments for
code clarity.
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arthur Kiyanovski [Sun, 3 May 2020 09:52:18 +0000 (09:52 +0000)]
net: ena: drop superfluous prototype
Before this commit there was a function prototype named
ena_com_get_ena_admin_polling_mode() that was never implemented.
This patch simply deletes it.
Signed-off-by: Igor Chauskin <igorch@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:17 +0000 (09:52 +0000)]
net: ena: add support for reporting of packet drops
1. Add support for getting tx drops from the device and saving them
in the driver.
2. Report tx via netdev stats.
Signed-off-by: Igor Chauskin <igorch@amazon.com>
Signed-off-by: Guy Tzalik <gtzalik@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:16 +0000 (09:52 +0000)]
net: ena: add unmask interrupts statistics to ethtool
Add unmask interrupts statistics to ethtool.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:15 +0000 (09:52 +0000)]
net: ena: remove code that does nothing
Both key and func parameters are pointers on the stack.
Setting them to NULL does nothing.
The original intent was to leave the key and func unset in this case,
but for this to happen nothing needs to be done as the calling
function ethtool_get_rxfh() already clears key and func.
This commit removes the above described useless code.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:14 +0000 (09:52 +0000)]
net: ena: changes to RSS hash key allocation
This commit contains 2 cosmetic changes:
1. Use ena_com_check_supported_feature_id() in
ena_com_hash_key_fill_default_key() instead of rewriting
its implementation. This also saves us a superfluous admin
command by using the cached value.
2. Change if conditions in ena_com_rss_init() to be clearer.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arthur Kiyanovski [Sun, 3 May 2020 09:52:13 +0000 (09:52 +0000)]
net: ena: change default RSS hash function to Toeplitz
Currently in the driver we are setting the hash function to be CRC32.
Starting with this commit we want to change the default behaviour so that
we set the hash function to be Toeplitz instead.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sameeh Jubran [Sun, 3 May 2020 09:52:12 +0000 (09:52 +0000)]
net: ena: allow setting the hash function without changing the key
Current code does not allow setting the hash function without
changing the key. This commit enables it.
To achieve this we separate ena_com_get_hash_function() to 2 functions:
ena_com_get_hash_function() - which gets only the hash function, and
ena_com_get_hash_key() - which gets only the hash key.
Also return 0 instead of rc at the end of ena_get_rxfh() since all
previous operations succeeded.
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arthur Kiyanovski [Sun, 3 May 2020 09:52:11 +0000 (09:52 +0000)]
net: ena: fix error returning in ena_com_get_hash_function()
In case the "func" parameter is NULL we now return "-EINVAL".
This shouldn't happen in general, but when it does happen, this is the
proper way to handle it.
We also check func for NULL in the beginning of the function, as there
is no reason to do all the work and realize in the end of the function
it was useless.
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arthur Kiyanovski [Sun, 3 May 2020 09:52:10 +0000 (09:52 +0000)]
net: ena: avoid unnecessary admin command when RSS function set fails
Currently when ena_set_hash_function() fails the hash function is
restored to the previous value by calling an admin command to get
the hash function from the device.
In this commit we avoid the admin command, by saving the previous
hash function before calling ena_set_hash_function() and using this
previous value to restore the hash function in case of failure of
ena_set_hash_function().
Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 3 May 2020 22:50:53 +0000 (15:50 -0700)]
Merge branch 'sch_fq-optimizations'
Eric Dumazet says:
====================
net_sched: sch_fq: round of optimizations
This series is focused on better layout of struct fq_flow to
reduce number of cache line misses in fq_enqueue() and dequeue operations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 3 May 2020 02:54:22 +0000 (19:54 -0700)]
net_sched: sch_fq: perform a prefetch() earlier
The prefetch() done in fq_dequeue() can be done a bit earlier
after the refactoring of the code done in the prior patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 3 May 2020 02:54:21 +0000 (19:54 -0700)]
net_sched: sch_fq: do not call fq_peek() twice per packet
This refactors the code to not call fq_peek() from fq_dequeue_head()
since the caller can provide the skb.
Also rename fq_dequeue_head() to fq_dequeue_skb() because 'head' is
a bit vague, given the skb could come from t_root rb-tree.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 3 May 2020 02:54:20 +0000 (19:54 -0700)]
net_sched: sch_fq: use bulk freeing in fq_gc()
fq_gc() already builds a small array of pointers, so using
kmem_cache_free_bulk() needs very little change.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 3 May 2020 02:54:19 +0000 (19:54 -0700)]
net_sched: sch_fq: change fq_flow size/layout
sizeof(struct fq_flow) is 112 bytes on 64bit arches.
This means that half of them use two cache lines, but 50% use
three cache lines.
This patch adds cache line alignment, and makes sure that only
the first cache line is touched by fq_enqueue(), which is more
expensive that fq_dequeue() in general.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 3 May 2020 02:54:18 +0000 (19:54 -0700)]
net_sched: sch_fq: avoid touching f->next from fq_gc()
A significant amount of cpu cycles is spent in fq_gc()
When fq_gc() does its lookup in the rb-tree, it needs the
following fields from struct fq_flow :
f->sk (lookup key in the rb-tree)
f->fq_node (anchor in the rb-tree)
f->next (used to determine if the flow is detached)
f->age (used to determine if the flow is candidate for gc)
This unfortunately spans two cache lines (assuming 64 bytes cache lines)
We can avoid using f->next, if we use the low order bit of f->{age|tail}
This low order bit is 0, if f->tail points to an sk_buff.
We set the low order bit to 1, if the union contains a jiffies value.
Combined with the following patch, this makes sure we only need
to bring into cpu caches one cache line per flow.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Yakunin [Sat, 2 May 2020 15:34:42 +0000 (18:34 +0300)]
inet_diag: bc: read cgroup id only for full sockets
Fix bug introduced by commit
b1f3e43dbfac ("inet_diag: add support for
cgroup filter").
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Reported-by: syzbot+ee80f840d9bf6893223b@syzkaller.appspotmail.com
Reported-by: syzbot+13bef047dbfffa5cd1af@syzkaller.appspotmail.com
Fixes: b1f3e43dbfac ("inet_diag: add support for cgroup filter")
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 2 May 2020 15:25:04 +0000 (17:25 +0200)]
net: ethernet: fec: Replace interrupt driven MDIO with polled IO
Measurements of the MDIO bus have shown that driving the MDIO bus
using interrupts is slow. Back to back MDIO transactions take about
90us, with 25us spent performing the transaction, and the remainder of
the time the bus is idle.
Replacing the completion interrupt with polled IO results in back to
back transactions of 40us. The polling loop waiting for the hardware
to complete the transaction takes around 28us. Which suggests
interrupt handling has an overhead of 50us, and polled IO nearly
halves this overhead, and doubles the MDIO performance.
Care has to be taken when setting the MII_SPEED register, or it can
trigger an MII event> That then upsets the polling, due to an
unexpected pending event.
Suggested-by: Chris Heally <cphealy@gmail.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 2 May 2020 23:41:06 +0000 (16:41 -0700)]
smc: Remove unused function.
net/smc/smc_llc.c:544:12: warning: ‘smc_llc_alloc_alt_link’ defined but not used [-Wunused-function]
static int smc_llc_alloc_alt_link(struct smc_link_group *lgr,
^~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 2 May 2020 23:31:45 +0000 (16:31 -0700)]
Merge branch 'ptp-Add-adjust-phase-to-support-phase-offset'
Vincent Cheng says:
====================
ptp: Add adjust phase to support phase offset.
This series adds adjust phase to the PTP Hardware Clock device interface.
Some PTP hardware clocks have a write phase mode that has
a built-in hardware filtering capability. The write phase mode
utilizes a phase offset control word instead of a frequency offset
control word. Add adjust phase function to take advantage of this
capability.
Changes since v1:
- As suggested by Richard Cochran:
1. ops->adjphase is new so need to check for non-null function pointer.
2. Kernel coding style uses lower_case_underscores.
3. Use existing PTP clock API for delayed worker.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent Cheng [Sat, 2 May 2020 03:35:38 +0000 (23:35 -0400)]
ptp: ptp_clockmatrix: Add adjphase() to support PHC write phase mode.
Add idtcm_adjphase() to support PHC write phase mode.
Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent Cheng [Sat, 2 May 2020 03:35:37 +0000 (23:35 -0400)]
ptp: Add adjust_phase to ptp_clock_caps capability.
Add adjust_phase to ptp_clock_caps capability to allow
user to query if a PHC driver supports adjust phase with
ioctl PTP_CLOCK_GETCAPS command.
Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent Cheng [Sat, 2 May 2020 03:35:36 +0000 (23:35 -0400)]
ptp: Add adjphase function to support phase offset control.
Adds adjust phase function to take advantage of a PHC
clock's hardware filtering capability that uses phase offset
control word instead of frequency offset control word.
Signed-off-by: Vincent Cheng <vincent.cheng.xh@renesas.com>
Reviewed-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 2 May 2020 00:02:27 +0000 (17:02 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:
====================
pull-request: bpf-next 2020-05-01 (v2)
The following pull-request contains BPF updates for your *net-next* tree.
We've added 61 non-merge commits during the last 6 day(s) which contain
a total of 153 files changed, 6739 insertions(+), 3367 deletions(-).
The main changes are:
1) pulled work.sysctl from vfs tree with sysctl bpf changes.
2) bpf_link observability, from Andrii.
3) BTF-defined map in map, from Andrii.
4) asan fixes for selftests, from Andrii.
5) Allow bpf_map_lookup_elem for SOCKMAP and SOCKHASH, from Jakub.
6) production cloudflare classifier as a selftes, from Lorenz.
7) bpf_ktime_get_*_ns() helper improvements, from Maciej.
8) unprivileged bpftool feature probe, from Quentin.
9) BPF_ENABLE_STATS command, from Song.
10) enable bpf_[gs]etsockopt() helpers for sock_ops progs, from Stanislav.
11) enable a bunch of common helpers for cg-device, sysctl, sockopt progs,
from Stanislav.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 May 2020 23:20:05 +0000 (16:20 -0700)]
Merge branch 'net-smc-extent-buffer-mapping-and-port-handling'
Karsten Graul says:
====================
net/smc: extent buffer mapping and port handling
Add functionality to map/unmap and register/unregister memory buffers for
specific SMC-R links and for the whole link group. Prepare LLC layer messages
for the support of multiple links and extent the processing of adapter events.
And add further small preparations needed for the SMC-R failover support.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stanislav Fomichev [Fri, 1 May 2020 22:43:20 +0000 (15:43 -0700)]
selftests/bpf: Use reno instead of dctcp
Andrey pointed out that we can use reno instead of dctcp for CC
tests and drop CONFIG_TCP_CONG_DCTCP=y requirement.
Fixes: beecf11bc218 ("bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr")
Suggested-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200501224320.28441-1-sdf@google.com
Karsten Graul [Fri, 1 May 2020 10:48:13 +0000 (12:48 +0200)]
net/smc: llc_add_link_work to handle ADD_LINK LLC requests
Introduce a work that is scheduled when a new ADD_LINK LLC request is
received. The work will call either the SMC client or SMC server
ADD_LINK processing.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:12 +0000 (12:48 +0200)]
net/smc: allocate index for a new link
Add smc_llc_alloc_alt_link() to find a free link index for a new link,
depending on the new link group type. And update constants for the
maximum number of links to 3 (2 symmetric and 1 dangling asymmetric link).
These maximum numbers are the same as used by other implementations of the
SMC-R protocol.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:11 +0000 (12:48 +0200)]
net/smc: introduce smc_pnet_find_alt_roce()
Introduce a new function in smc_pnet.c that searches for an alternate
IB device, using an existing link group and a primary IB device. The
alternate IB device needs to be active and must have the same PNETID
as the link group.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:10 +0000 (12:48 +0200)]
net/smc: remove DELETE LINK processing from smc_core.c
Support for multiple links makes the former DELETE LINK processing
obsolete which sent one DELETE_LINK LLC message for each single link.
Remove this processing from smc_core.c.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:09 +0000 (12:48 +0200)]
net/smc: take link down instead of terminating the link group
Use the introduced link down processing in all places where the link
group is terminated and take down the affected link only.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:08 +0000 (12:48 +0200)]
net/smc: add smcr_port_err() and smcr_link_down() processing
Call smcr_port_err() when an IB event reports an inactive IB device.
smcr_port_err() calls smcr_link_down() for all affected links.
smcr_link_down() either triggers the local DELETE_LINK processing, or
sends an DELETE_LINK LLC message to the SMC server to initiate the
processing.
The old handler function smc_port_terminate() is removed.
Add helper smcr_link_down_cond() to take a link down conditionally, and
smcr_link_down_cond_sched() to schedule the link_down processing to a
work.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:07 +0000 (12:48 +0200)]
net/smc: add smcr_port_add() and smcr_link_up() processing
Call smcr_port_add() when an IB event reports a new active IB device.
smcr_port_add() will start a work which either triggers the local
ADD_LINK processing, or send an ADD_LINK LLC message to the SMC server
to initiate the processing.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:06 +0000 (12:48 +0200)]
net/smc: remember PNETID of IB device for later device matching
The PNETID is needed to find an alternate link for a link group.
Save the PNETID of the link that is used to create the link group for
later device matching.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:05 +0000 (12:48 +0200)]
net/smc: mutex to protect the lgr against parallel reconfigurations
Introduce llc_conf_mutex in the link group which is used to protect the
buffers and lgr states against parallel link reconfiguration.
This ensures that new connections do not start to register buffers with
the links of a link group when link creation or termination is running.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:04 +0000 (12:48 +0200)]
net/smc: extend smc_llc_send_add_link() and smc_llc_send_delete_link()
All LLC sends are done from worker context only, so remove the prep
functions which were used to build the message before it was sent, and
add the function content into the respective send function
smc_llc_send_add_link() and smc_llc_send_delete_link().
Extend smc_llc_send_add_link() to include the qp_mtu value in the LLC
message, which is needed to establish a link after the initial link was
created. Extend smc_llc_send_delete_link() to contain a link_id and a
reason code for the link deletion in the LLC message, which is needed
when a specific link should be deleted.
And add the list of existing DELETE_LINK reason codes.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:03 +0000 (12:48 +0200)]
net/smc: map and register buffers for a new link
Introduce support to map and register all current buffers for a new
link. smcr_buf_map_lgr() will map used buffers for a new link and
smcr_buf_reg_lgr() can be called to register used buffers on the
IB device of the new link.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:02 +0000 (12:48 +0200)]
net/smc: unmapping of buffers to support multiple links
With the support of multiple links that are created and cleared there
is a need to unmap one link from all current buffers. Add unmapping by
link and by rmb. And make smcr_link_clear() available to be called from
the LLC layer.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Karsten Graul [Fri, 1 May 2020 10:48:01 +0000 (12:48 +0200)]
net/smc: multiple link support for rmb buffer registration
The CONFIRM_RKEY LLC processing handles all links in one LLC message.
Move the call to this processing out of smcr_link_reg_rmb() which does
processing per link, into smcr_lgr_reg_rmbs() which is responsible for
link group level processing. Move smcr_link_reg_rmb() into module
smc_core.c.
>From af_smc.c now call smcr_lgr_reg_rmbs() to register new rmbs on all
available links.
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 May 2020 23:08:20 +0000 (16:08 -0700)]
Merge branch 'Introduce-a-flow-gate-control-action-and-apply-IEEE'
Po Liu says:
====================
Introduce a flow gate control action and apply IEEE
Changes from V4:
----------------
0001:
Fix and modify according to Vlid Buslov suggestions:
- Change spin_lock_bh() to spin_lock() since tcf_gate_act() already in
software irq.
- Remove spin lock protect in the ops->cleanup function.
- Enable the CONFIG_DEBUG_ATOMIC_SLEEP and CONFIG_PROVE_LOCKING checking,
then fix as kzalloc flag type and lock deadlock.
- Change kzalloc() flag type from GFP_KERNEL to GFP_ATOMIC since
function in the spin_lock protect.
- Change hrtimer type from HRTIMER_MODE_ABS to HRTIMER_MODE_ABS_SOFT
to avoid deadlock.
0002:
Fix and modify according to Vlid Buslov suggestions:
- Remove all rcu_read_lock protection since no rcu parameters.
- Enable the CONFIG_DEBUG_ATOMIC_SLEEP and CONFIG_PROVE_LOCKING checking,
then check kzalloc sleeping flag.
- Change kzalloc to kcalloc for array memory alloc and change GFP_KERNEL
flag to GFP_ATOMIC since function holding spin_lock protect.
0003:
- No changes.
0004:
- Commit comments rephrase act by Claudiu Manoil.
Changes from V3:
----------------
0001:
Fix and modify according to Vlid Buslov:
- Remove the struct gate_action and move the parameters to the
struct tcf_gate align with tc_action parameters. This would not need to
alloc rcu type memory with pointer.
- Remove the spin_lock type entry_lock which do not needed anymore, will
use the tcf_lock system provided.
- Provide lockep protect for the status parameters in the tcf_gate_act().
- Remove the cycletime 0 input warning, return error directly.
And:
- Remove Qci related description in the Kconfig for gate action.
0002:
- Fix rcu_read_lock protect range suggested by Vlid Buslov.
0003:
- No changes.
0004:
- Fix bug of gate maxoct wildcard condition not included.
- Fix the pass time basetime calculation report by Vladimir Otlean.
Changes from V2:
0001: No changes.
0002: No changes.
0003: No changes.
0004: Fix the vlan id filter parameter and add reject src mac
FF-FF-FF-FF-FF-FF filter in driver.
Changes from V1:
----------------
0000: Update description make it more clear
0001: Removed 'add update dropped stats' patch, will provide pull
request as standalone patches.
0001: Update commit description make it more clear ack by Jiri Pirko.
0002: No changes
0003: Fix some code style ack by Jiri Pirko.
0004: Fix enetc_psfp_enable/disable parameter type ack by test robot
iprout2 command patches:
Not attach with these serial patches, will provide separate pull
request after kernel accept these patches.
Changes from RFC:
-----------------
0000: Reduce to 5 patches and remove the 4 max frame size offload and
flow metering in the policing offload action, Only keep gate action
offloading implementation.
0001: No changes.
0002:
- fix kfree lead ack by Jakub Kicinski and Cong Wang
- License fix from Jakub Kicinski and Stephen Hemminger
- Update example in commit acked by Vinicius Costa Gomes
- Fix the rcu protect in tcf_gate_act() acked by Vinicius
0003: No changes
0004: No changes
0005:
Acked by Vinicius Costa Gomes
- Use refcount kernel lib
- Update stream gate check code position
- Update reduce ref names more clear
iprout2 command patches:
0000: Update license expression and add gate id
0001: Add tc action gate man page
--------------------------------------------------------------------
These patches add stream gate action policing in IEEE802.1Qci (Per-Stream
Filtering and Policing) software support and hardware offload support in
tc flower, and implement the stream identify, stream filtering and
stream gate filtering action in the NXP ENETC ethernet driver.
Per-Stream Filtering and Policing (PSFP) specifies flow policing and
filtering for ingress flows, and has three main parts:
1. The stream filter instance table consists of an ordered list of
stream filters that determine the filtering and policing actions that
are to be applied to frames received on a specific stream. The main
elements are stream gate id, flow metering id and maximum SDU size.
2. The stream gate function setup a gate list to control ingress traffic
class open/close state. When the gate is running at open state, the flow
could pass but dropped when gate state is running to close. User setup a
bastime to tell gate when start running the entry list, then the hardware
would periodiclly. There is no compare qdisc action support.
3. Flow metering is two rates two buckets and three-color marker to
policing the frames. Flow metering instance are as specified in the
algorithm in MEF10.3. The most likely qdisc action is policing action.
The first patch introduces an ingress frame flow control gate action,
for the point 2. The tc gate action maintains the open/close state gate
list, allowing flows to pass when the gate is open. Each gate action
may policing one or more qdisc filters. When the start time arrived, The
driver would repeat the gate list periodiclly. User can assign a passed
time, the driver would calculate a new future time by the cycletime of
the gate list.
The 0002 patch introduces the gate flow hardware offloading.
The 0003 patch adds support control the on/off for the tc flower
offloading by ethtool.
The 0004 patch implement the stream identify and stream filtering and
stream gate filtering action in the NXP ENETC ethernet driver. Tc filter
command provide filtering keys with MAC address and VLAN id. These keys
would be set to stream identify instance entry. Stream gate instance
entry would refer the gate action parameters. Stream filter instance
entry would refer the stream gate index and assign a stream handle value
matches to the stream identify instance.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Po Liu [Fri, 1 May 2020 00:53:18 +0000 (08:53 +0800)]
net: enetc: add tc flower psfp offload driver
This patch is to add tc flower offload for the enetc IEEE 802.1Qci(PSFP)
function. There are four main feature parts to implement the flow
policing and filtering for ingress flow with IEEE 802.1Qci features.
They are stream identify(this is defined in the P802.1cb exactly but
needed for 802.1Qci), stream filtering, stream gate and flow metering.
Each function block includes many entries by index to assign parameters.
So for one frame would be filtered by stream identify first, then
flow into stream filter block by the same handle between stream identify
and stream filtering. Then flow into stream gate control which assigned
by the stream filtering entry. And then policing by the gate and limited
by the max sdu in the filter block(optional). At last, policing by the
flow metering block, index choosing at the fitering block.
So you can see that each entry of block may link to many upper entries
since they can be assigned same index means more streams want to share
the same feature in the stream filtering or stream gate or flow
metering.
To implement such features, each stream filtered by source/destination
mac address, some stream maybe also plus the vlan id value would be
treated as one flow chain. This would be identified by the chain_index
which already in the tc filter concept. Driver would maintain this chain
and also with gate modules. The stream filter entry create by the gate
index and flow meter(optional) entry id and also one priority value.
Offloading only transfer the gate action and flow filtering parameters.
Driver would create (or search same gate id and flow meter id and
priority) one stream filter entry to set to the hardware. So stream
filtering do not need transfer by the action offloading.
This architecture is same with tc filter and actions relationship. tc
filter maintain the list for each flow feature by keys. And actions
maintain by the action list.
Below showing a example commands by tc:
> tc qdisc add dev eth0 ingress
> ip link set eth0 address 10:00:80:00:00:00
> tc filter add dev eth0 parent ffff: protocol ip chain 11 \
flower skip_sw dst_mac 10:00:80:00:00:00 \
action gate index 10 \
sched-entry open
200000000 1
8000000 \
sched-entry close
100000000 -1 -1
Command means to set the dst_mac 10:00:80:00:00:00 to index 11 of stream
identify module. Then setting the gate index 10 of stream gate module.
Keep the gate open for 200ms and limit the traffic volume to 8MB in this
sched-entry. Then direct the frames to the ingress queue 1.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Po Liu [Fri, 1 May 2020 00:53:17 +0000 (08:53 +0800)]
net: enetc: add hw tc hw offload features for PSPF capability
This patch is to let ethtool enable/disable the tc flower offload
features. Hardware ENETC has the feature of PSFP which is for per-stream
policing. When enable the tc hw offloading feature, driver would enable
the IEEE 802.1Qci feature. It is only set the register enable bit for
this feature not enable for any entry of per stream filtering and stream
gate or stream identify but get how much capabilities for each feature.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Po Liu [Fri, 1 May 2020 00:53:16 +0000 (08:53 +0800)]
net: schedule: add action gate offloading
Add the gate action to the flow action entry. Add the gate parameters to
the tc_setup_flow_action() queueing to the entries of flow_action_entry
array provide to the driver.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Po Liu [Fri, 1 May 2020 00:53:15 +0000 (08:53 +0800)]
net: qos: introduce a gate control flow action
Introduce a ingress frame gate control flow action.
Tc gate action does the work like this:
Assume there is a gate allow specified ingress frames can be passed at
specific time slot, and be dropped at specific time slot. Tc filter
chooses the ingress frames, and tc gate action would specify what slot
does these frames can be passed to device and what time slot would be
dropped.
Tc gate action would provide an entry list to tell how much time gate
keep open and how much time gate keep state close. Gate action also
assign a start time to tell when the entry list start. Then driver would
repeat the gate entry list cyclically.
For the software simulation, gate action requires the user assign a time
clock type.
Below is the setting example in user space. Tc filter a stream source ip
address is 192.168.0.20 and gate action own two time slots. One is last
200ms gate open let frame pass another is last 100ms gate close let
frames dropped. When the ingress frames have reach total frames over
8000000 bytes, the excessive frames will be dropped in that 200000000ns
time slot.
> tc qdisc add dev eth0 ingress
> tc filter add dev eth0 parent ffff: protocol ip \
flower src_ip 192.168.0.20 \
action gate index 2 clockid CLOCK_TAI \
sched-entry open
200000000 -1
8000000 \
sched-entry close
100000000 -1 -1
> tc chain del dev eth0 ingress chain 0
"sched-entry" follow the name taprio style. Gate state is
"open"/"close". Follow with period nanosecond. Then next item is internal
priority value means which ingress queue should put. "-1" means
wildcard. The last value optional specifies the maximum number of
MSDU octets that are permitted to pass the gate during the specified
time interval.
Base-time is not set will be 0 as default, as result start time would
be ((N + 1) * cycletime) which is the minimal of future time.
Below example shows filtering a stream with destination mac address is
10:00:80:00:00:00 and ip type is ICMP, follow the action gate. The gate
action would run with one close time slot which means always keep close.
The time cycle is total 200000000ns. The base-time would calculate by:
1357000000000 + (N + 1) * cycletime
When the total value is the future time, it will be the start time.
The cycletime here would be 200000000ns for this case.
> tc filter add dev eth0 parent ffff: protocol ip \
flower skip_hw ip_proto icmp dst_mac 10:00:80:00:00:00 \
action gate index 12 base-time
1357000000000 \
sched-entry close
200000000 -1 -1 \
clockid CLOCK_TAI
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Berger [Thu, 30 Apr 2020 23:26:51 +0000 (16:26 -0700)]
net: bcmgenet: Move wake-up event out of side band ISR
The side band interrupt service routine is not available on chips
like 7211, or rather, it does not permit the signaling of wake-up
events due to the complex interrupt hierarchy.
Move the wake-up event accounting into a .resume_noirq function,
account for possible wake-up events and clear the MPD/HFB interrupts
from there, while leaving the hardware untouched until the resume
function proceeds with doing its usual business.
Because bcmgenet_wol_power_down_cfg() now enables the MPD and HFB
interrupts, it is invoked by a .suspend_noirq function to prevent
the servicing of interrupts after the clocks have been disabled.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 May 2020 22:53:33 +0000 (15:53 -0700)]
Merge branch 'net-ipa-dont-cache-channel-state'
Alex Elder says:
====================
net: ipa: don't cache channel state
This series removes a field that holds a copy of a channel's state
at the time it was last fetched. In principle the state can change
at any time, so it's better to just fetch it whenever needed. The
first patch is just preparatory, simplifying the arguments to
gsi_channel_state().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 30 Apr 2020 22:13:23 +0000 (17:13 -0500)]
net: ipa: do not cache channel state
It is possible for a GSI channel's state to be changed as a result
of an action by a different execution environment. Specifically,
the modem is able to issue a GSI generic command that causes a state
change on a GSI channel associated with the AP.
A channel's state only needs to be known when a channel is allocated
or deallocaed, started or stopped, or reset. So there is little
value in caching the state anyway.
Stop recording a copy of the channel's last known state, and instead
fetch the true state from hardware whenever it's needed. In such
cases, *do* record the state in a local variable, in case an error
message reports it (so the value reported is the value seen).
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 30 Apr 2020 22:13:22 +0000 (17:13 -0500)]
net: ipa: pass channel pointer to gsi_channel_state()
Pass a channel pointer rather than a GSI pointer and channel ID to
gsi_channel_state().
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 May 2020 22:40:15 +0000 (15:40 -0700)]
Merge branch 'net-dsa-mv88e6xxx-augment-phylink-support-for-10G'
Russell King says:
====================
net: dsa: mv88e6xxx: augment phylink support for 10G
This series adds phylink 10G support for the
88E6390 series switches,
as suggested by Andrew Lunn.
The first patch cleans up the code to use generic definitions for the
registers in a similar way to what was done with the initial conversion
of 1G serdes support.
The second patch adds the necessary bits 10GBASE mode to the
pcs_get_state() method.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Thu, 30 Apr 2020 08:21:39 +0000 (09:21 +0100)]
net: dsa: mv88e6xxx:
88e6390 10G serdes support
Add support for reading and reporting the 10G link status on the
88e6390 in addition to the 1000BASE-X/2500BASE-X/SGMII status.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Thu, 30 Apr 2020 08:21:34 +0000 (09:21 +0100)]
net: dsa: mv88e6xxx: use generic clause 45 definitions
The private MV88E6390_PCS_CONTROL_1 definitions in serdes.h reflects
the IEEE 802.3 standard PCS control register 1 definitions, only
offset by 0x1000 in the PHYXS register space. Rather than inventing
our own, use those that already exist, and name the register
MV88E6390_10G_CTRL1.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 May 2020 22:37:59 +0000 (15:37 -0700)]
Merge branch 'net-atlantic-A2-support'
Igor Russkikh says:
====================
net: atlantic: A2 support
This patchset adds support for the new generation of Atlantic NICs.
Chip generations are mostly compatible register-wise, but there are still
some differences. Therefore we've made some of first generation (A1) code
non-static to re-use it where possible.
Some pieces are A2 specific, in which case we redefine/extend such APIs.
v2:
* removed #pragma pack (2 structures require the packed attribute);
* use defines instead of magic numbers where possible;
v1: https://patchwork.ozlabs.org/cover/
1276220/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:45 +0000 (11:04 +0300)]
net: atlantic: A2 ingress / egress hw configuration
Chip generations are mostly compatible register-wise, but there are still
some differences. Therefore we've made some of first generation (A1) code
non-static to re-use it where possible.
Some pieces are A2 specific, in which case we redefine/extend such APIs.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:44 +0000 (11:04 +0300)]
net: atlantic: basic A2 init/deinit hw_ops
This patch adds basic A2 HW initialization / deinitialization.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Thu, 30 Apr 2020 08:04:43 +0000 (11:04 +0300)]
net: atlantic: common functions needed for basic A2 init/deinit hw_ops
This patch adds common functions (mostly FW-related), which are
needed for basic A2 HW initialization / deinitialization.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Co-developed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Thu, 30 Apr 2020 08:04:42 +0000 (11:04 +0300)]
net: atlantic: HW bindings for basic A2 init/deinit hw_ops
This patch adds A2 register definitions for basic A2 HW
initialization / deinitialization.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Co-developed-by: Egor Pomozov <epomozov@marvell.com>
Signed-off-by: Egor Pomozov <epomozov@marvell.com>
Co-developed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Co-developed-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:41 +0000 (11:04 +0300)]
net: atlantic: add A2 RPF hw_ops
This patch adds RPF-related hw_ops, which are needed for basic
functionality.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:40 +0000 (11:04 +0300)]
net: atlantic: HW bindings for A2 RFP
RPF is one of the modules which has been significantly
changed/extended on A2.
This patch adds the necessary A2 register definitions
for RPF, which are used in follow-up patches.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Co-developed-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:39 +0000 (11:04 +0300)]
net: atlantic: A2 hw_ops skeleton
This patch adds basic hw_ops layout for A2.
Actual implementation will be added in the follow-up patches.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Thu, 30 Apr 2020 08:04:38 +0000 (11:04 +0300)]
net: atlantic: minimal A2 fw_ops
This patch adds the minimum set of FW ops for A2.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Co-developed-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Thu, 30 Apr 2020 08:04:37 +0000 (11:04 +0300)]
net: atlantic: minimal A2 HW bindings required for fw_ops
This patch adds the bare minimum of A2 HW bindings required to
get fw_ops working.
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Bogdanov [Thu, 30 Apr 2020 08:04:36 +0000 (11:04 +0300)]
net: atlantic: A2 driver-firmware interface
This patch adds the driver<->firmware interface for A2
Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Starovoytov [Thu, 30 Apr 2020 08:04:35 +0000 (11:04 +0300)]
net: atlantic: move IS_CHIP_FEATURE to aq_hw.h
IS_CHIP feature will be used to differentiate between A1 and A2,
where necessary. Thus, move it to aq_hw.h, rename it and make
it accept the 'hw' pointer.
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Starovoytov [Thu, 30 Apr 2020 08:04:34 +0000 (11:04 +0300)]
net: atlantic: make hw_get_regs optional
This patch fixes potential crash in case if hw_get_regs is NULL.
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikita Danilov [Thu, 30 Apr 2020 08:04:33 +0000 (11:04 +0300)]
net: atlantic: simplify hw_get_fw_version() usage
hw_get_fw_version() never fails, so this patch simplifies its
usage by utilizing return value instead of output argument.
Signed-off-by: Nikita Danilov <ndanilov@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Starovoytov [Thu, 30 Apr 2020 08:04:32 +0000 (11:04 +0300)]
net: atlantic: add hw_soft_reset, hw_prepare to hw_ops
A2 will have a different implementation of these 2 APIs, so
this patch moves them to hw_ops in preparation for A2.
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Co-developed-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Signed-off-by: Dmitry Bezrukov <dbezrukov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:31 +0000 (11:04 +0300)]
net: atlantic: add defines for 10M and EEE 100M link mode
This patch adds defines for 10M and EEE 100M link modes, which are
supported by A2.
10M support is added in this patch series.
EEE is out of scope, but will be added in a follow-up series.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:30 +0000 (11:04 +0300)]
net: atlantic: add A2 device IDs
Adding device ids for the new generation of atlantic nic.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 30 Apr 2020 08:04:29 +0000 (11:04 +0300)]
net: atlantic: update company name in the driver description
Aquantia is now part of Marvell. Thus, update the driver description.
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 29 Apr 2020 02:52:20 +0000 (02:52 +0000)]
drivers: net: davinci_mdio: fix potential NULL dereference in davinci_mdio_probe()
platform_get_resource() may fail and return NULL, so we should
better check it's return value to avoid a NULL pointer dereference
since devm_ioremap() does not check input parameters for null.
This is detected by Coccinelle semantic patch.
@@
expression pdev, res, n, t, e, e1, e2;
@@
res = \(platform_get_resource\|platform_get_resource_byname\)(pdev, t, n);
+ if (!res)
+ return -EINVAL;
... when != res == NULL
e = devm_ioremap(e1, res->start, e2);
Fixes: 03f66f067560 ("net: ethernet: ti: davinci_mdio: use devm_ioremap()")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jesper Dangaard Brouer [Mon, 27 Apr 2020 16:37:43 +0000 (18:37 +0200)]
net: fix skb_panic to output real address
In skb_panic() the real pointer values are really needed to diagnose
issues, e.g. data and head are related (to calculate headroom). The
hashed versions of the addresses doesn't make much sense here. The
patch use the printk specifier %px to print the actual address.
The printk documentation on %px:
https://www.kernel.org/doc/html/latest/core-api/printk-formats.html#unmodified-addresses
Fixes: ad67b74d2469 ("printk: hash addresses printed with %p")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe Roullier [Mon, 27 Apr 2020 10:00:38 +0000 (12:00 +0200)]
net: ethernet: stmmac: simplify phy modes management for stm32
No new feature, just to simplify stm32 part to be easier to use.
Add by default all Ethernet clocks in DT, and activate or not in function
of phy mode, clock frequency, if property "st,ext-phyclk" is set or not.
Keep backward compatibility
--------------------------------------------------------------------------
|PHY_MODE | Normal | PHY wo crystal| PHY wo crystal | No 125Mhz |
| | | 25MHz | 50MHz | from PHY |
--------------------------------------------------------------------------
| MII | - | eth-ck | n/a | n/a |
| | | st,ext-phyclk | | |
--------------------------------------------------------------------------
| GMII | - | eth-ck | n/a | n/a |
| | | st,ext-phyclk | | |
--------------------------------------------------------------------------
| RGMII | - | eth-ck | n/a | eth-ck |
| | | st,ext-phyclk | |st,eth-clk-sel|
| | | | | or |
| | | | | st,ext-phyclk|
----------------==--------------------------------------------------------
| RMII | - | eth-ck | eth-ck | n/a |
| | | st,ext-phyclk | st,eth-ref-clk-sel | |
| | | | or st,ext-phyclk | |
--------------------------------------------------------------------------
Signed-off-by: Christophe Roullier <christophe.roullier@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrii Nakryiko [Fri, 1 May 2020 18:56:22 +0000 (11:56 -0700)]
bpf: Fix use-after-free of bpf_link when priming half-fails
If bpf_link_prime() succeeds to allocate new anon file, but then fails to
allocate ID for it, link priming is considered to be failed and user is
supposed ot be able to directly kfree() bpf_link, because it was never exposed
to user-space.
But at that point file already keeps a pointer to bpf_link and will eventually
call bpf_link_release(), so if bpf_link was kfree()'d by caller, that would
lead to use-after-free.
Fix this by first allocating ID and only then allocating file. Adding ID to
link_idr is ok, because link at that point still doesn't have its ID set, so
no user-space process can create a new FD for it.
Fixes: a3b80e107894 ("bpf: Allocate ID for bpf_link")
Reported-by: syzbot+39b64425f91b5aab714d@syzkaller.appspotmail.com
Suggested-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20200501185622.3088964-1-andriin@fb.com
Cambda Zhu [Fri, 24 Apr 2020 08:06:16 +0000 (16:06 +0800)]
net: Replace the limit of TCP_LINGER2 with TCP_FIN_TIMEOUT_MAX
This patch changes the behavior of TCP_LINGER2 about its limit. The
sysctl_tcp_fin_timeout used to be the limit of TCP_LINGER2 but now it's
only the default value. A new macro named TCP_FIN_TIMEOUT_MAX is added
as the limit of TCP_LINGER2, which is 2 minutes.
Since TCP_LINGER2 used sysctl_tcp_fin_timeout as the default value
and the limit in the past, the system administrator cannot set the
default value for most of sockets and let some sockets have a greater
timeout. It might be a mistake that let the sysctl to be the limit of
the TCP_LINGER2. Maybe we can add a new sysctl to set the max of
TCP_LINGER2, but FIN-WAIT-2 timeout is usually no need to be too long
and 2 minutes are legal considering TCP specs.
Changes in v3:
- Remove the new socket option and change the TCP_LINGER2 behavior so
that the timeout can be set to value between sysctl_tcp_fin_timeout
and 2 minutes.
Changes in v2:
- Add int overflow check for the new socket option.
Changes in v1:
- Add a new socket option to set timeout greater than
sysctl_tcp_fin_timeout.
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 May 2020 19:53:06 +0000 (12:53 -0700)]
Merge branch 'r8169-improve-user-message-handling'
Heiner Kallweit says:
====================
r8169: improve user message handling
Series improves few aspects of handling messages to users.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 1 May 2020 17:26:22 +0000 (19:26 +0200)]
r8169: switch from netif_xxx message functions to netdev_xxx
Considering the few messages we have in the driver, there's not really
a benefit in being able to control them on a message type level.
Therefore simplify the code and switch to the netdev_xxx message
functions. In addition add net_ratelimit() to messages that can be
printed from a hot path.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 1 May 2020 17:24:47 +0000 (19:24 +0200)]
r8169: remove "out of memory" error message from rtl_request_firmware
When preparing an unrelated change, checkpatch complained about this
redundant out-of-memory message. Therefore remove it.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 1 May 2020 17:23:36 +0000 (19:23 +0200)]
r8169: simplify counter handling
The counter handling functions can only fail if rtl8169_do_counters()
times out. In the poll function we emit an error message in case of
timeout, therefore we don't have to propagate the timeout all the
way up just to print another message basically saying the same.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Heiner Kallweit [Fri, 1 May 2020 17:22:29 +0000 (19:22 +0200)]
r8169: remove redundant driver message when entering promiscuous mode
Net core - __dev_set_promiscuity - prints a message already when
promiscuous mode in entered/left, therefore we don't have to do this
in the driver too. Also the driver message would be misleading
(would be because "link" message level is disabled per default)
because it would print "promisc mode enabled" even if it's being
left. Reason is that __dev_change_flags() calls dev_set_rx_mode()
before touching the promisc flag.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stanislav Fomichev [Thu, 30 Apr 2020 23:31:52 +0000 (16:31 -0700)]
bpf: Bpf_{g,s}etsockopt for struct bpf_sock_addr
Currently, bpf_getsockopt and bpf_setsockopt helpers operate on the
'struct bpf_sock_ops' context in BPF_PROG_TYPE_SOCK_OPS program.
Let's generalize them and make them available for 'struct bpf_sock_addr'.
That way, in the future, we can allow those helpers in more places.
As an example, let's expose those 'struct bpf_sock_addr' based helpers to
BPF_CGROUP_INET{4,6}_CONNECT hooks. That way we can override CC before the
connection is made.
v3:
* Expose custom helpers for bpf_sock_addr context instead of doing
generic bpf_sock argument (as suggested by Daniel). Even with
try_socket_lock that doesn't sleep we have a problem where context sk
is already locked and socket lock is non-nestable.
v2:
* s/BPF_PROG_TYPE_CGROUP_SOCKOPT/BPF_PROG_TYPE_SOCK_OPS/
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20200430233152.199403-1-sdf@google.com
David S. Miller [Fri, 1 May 2020 19:24:43 +0000 (12:24 -0700)]
Merge branch 'net-ReST-part-three'
Mauro Carvalho Chehab says:
====================
net: manually convert files to ReST format - part 3 (final)
That's the third part (and the final one) of my work to convert the networking
text files into ReST. it is based on linux-next next-
20200430 branch.
The full series (including those ones) are at:
https://git.linuxtv.org/mchehab/experimental.git/log/?h=net-docs
The built output documents, on html format is at:
https://www.infradead.org/~mchehab/kernel_docs/networking/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Fri, 1 May 2020 14:44:59 +0000 (16:44 +0200)]
docs: networking: arcnet-hardware.rst: don't duplicate chapter names
Since changeset
58ad30cf91f0 ("docs: fix reference to core-api/namespaces.rst"),
auto-references for chapters are generated. This is a nice feature, but
has a drawback: no chapters can have the same sumber.
So, we need to change two chapter titles, to avoid warnings when
building the docs.
Fixes: 58ad30cf91f0 ("docs: fix reference to core-api/namespaces.rst")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Fri, 1 May 2020 14:44:58 +0000 (16:44 +0200)]
net: docs: add page_pool.rst to index.rst
This file is already in ReST format. Add it to the net
index.rst, in order to make it part of the documentation
body.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Fri, 1 May 2020 14:44:57 +0000 (16:44 +0200)]
docs: networking: device drivers: convert toshiba/spider_net.txt to ReST
- add SPDX header;
- adjust title markup;
- mark code blocks and literals as such;
- adjust identation, whitespaces and blank lines where needed;
- add to networking/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Fri, 1 May 2020 14:44:56 +0000 (16:44 +0200)]
docs: networking: device drivers: convert ti/tlan.txt to ReST
- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark tables as such;
- mark code blocks and literals as such;
- adjust identation, whitespaces and blank lines where needed;
- add to networking/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mauro Carvalho Chehab [Fri, 1 May 2020 14:44:55 +0000 (16:44 +0200)]
docs: networking: device drivers: convert ti/cpsw.txt to ReST
- add SPDX header;
- adjust titles and chapters, adding proper markups;
- mark code blocks and literals as such;
- adjust identation, whitespaces and blank lines where needed;
- add to networking/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>