David S. Miller [Sat, 21 Oct 2017 01:22:19 +0000 (02:22 +0100)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
pull request: bluetooth-next 2017-10-19
Here's the first bluetooth-next pull request targeting the 4.15 kernel
release.
- Multiple fixes & improvements to the hci_bcm driver
- DT improvements, e.g. new local-bd-address property
- Fixes & improvements to ECDH usage. Private key is now generated by
the crypto subsystem.
- gcc-4.9 warning fixes
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 19 Oct 2017 00:02:03 +0000 (17:02 -0700)]
ipv4: ipv4_default_advmss() should use route mtu
ipv4_default_advmss() incorrectly uses the device MTU instead
of the route-provided one. IPv6 already has the proper behavior,
so let's harmonize the two protocols.
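As a rough illustration of why the MTU source matters, the sketch below
models the advertised-MSS arithmetic in userspace Python. It is not the
kernel code: the function name, the 20-byte header sizes (no IP or TCP
options) and the 256-byte floor are assumptions made for this example.

```python
# Userspace model of the default advertised MSS computation.
# All names and constants here are illustrative assumptions.
IPV4_HDR_LEN = 20       # IPv4 header without options
TCP_HDR_LEN = 20        # TCP header without options
IPV4_MAX_PMTU = 65535   # largest possible IPv4 datagram
MIN_ADVMSS = 256        # assumed lower bound for the advertised MSS


def default_advmss(mtu, min_advmss=MIN_ADVMSS):
    """Return the MSS to advertise for a path with the given MTU."""
    header_size = IPV4_HDR_LEN + TCP_HDR_LEN
    # Never advertise less than the configured floor.
    advmss = max(mtu - header_size, min_advmss)
    # Never advertise more than fits in a maximum-size datagram.
    return min(advmss, IPV4_MAX_PMTU - header_size)


device_mtu = 1500   # what the interface reports
route_mtu = 1400    # what the route entry says (e.g. a locked MTU)

mss_from_device = default_advmss(device_mtu)   # behavior before the fix
mss_from_route = default_advmss(route_mtu)     # behavior after the fix
```

With a route MTU smaller than the device MTU, the value derived from the
device is 100 bytes too large in this example, which is the mismatch the
patch removes.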
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 21 Oct 2017 00:45:56 +0000 (01:45 +0100)]
Merge branch 'ieee802154-for-davem-2017-10-18' of git://git./linux/kernel/git/sschmidt/wpan-next
Stefan Schmidt says:
====================
pull-request: ieee802154 2017-10-18
Please find below a pull request from the ieee802154 subsystem for net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 22:01:38 +0000 (15:01 -0700)]
spectrum: Convert fib event handlers to use container_of on info arg
Use container_of to convert the generic fib_notifier_info into
the event specific data structure.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 18 Oct 2017 21:20:30 +0000 (14:20 -0700)]
tcp: fix tcp_send_syn_data()
syn_data was allocated by sk_stream_alloc_skb(), meaning
its destructor and _skb_refdst fields are mangled.
We need to call tcp_skb_tsorted_anchor_cleanup() before
calling kfree_skb() or kernel crashes.
Bug was reported by syzkaller bot.
Fixes: e2080072ed2d ("tcp: new list for sent but unacked skbs for RACK recovery")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 21 Oct 2017 00:39:11 +0000 (01:39 +0100)]
Merge branch 'ipv6-fixes-for-RTF_CACHE-entries'
Paolo Abeni says:
====================
ipv6: fixes for RTF_CACHE entries
This series addresses 2 different but related issues with RTF_CACHE
introduced by the recent refactoring.
Patch 1 restores the gc timer for such routes.
Patch 2 removes the aged-out dst from the fib tree, properly coping with pMTU
routes.
v1 -> v2:
- dropped the for ip route show cache
- avoid touching dst.obsolete when the dst is aged out
v2 -> v3:
- take care of pMTU exceptions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 19 Oct 2017 14:07:11 +0000 (16:07 +0200)]
ipv6: remove from fib tree aged out RTF_CACHE dst
The commit 2b760fcf5cfb ("ipv6: hook up exception table to store
dst cache") partially reverted the commit 1e2ea8ad37be ("ipv6: set
dst.obsolete when a cached route has expired").
As a result, RTF_CACHE dst referenced outside the fib tree will
not be removed until the next sernum change; dst_check() does not
fail on aged-out dst, and dst->__refcnt can't decrease: the aged
out dst will stay valid for a potentially unlimited time after the
timeout expiration.
This change explicitly removes RTF_CACHE dst from the fib tree when
aged out. The rt6_remove_exception() logic will then obsolete the
dst and other entities will drop the related reference on next
dst_check().
pMTU exceptions are not aged-out, and are removed from the exception
table only when the - usually considerably longer - ip6_rt_mtu_expires
timeout expires.
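The aging rule described above can be modeled with a small sketch. This
is a simplified userspace model, not the kernel implementation: the
CachedRoute class, its fields and should_remove() are hypothetical names,
and times are plain seconds rather than jiffies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachedRoute:
    """Hypothetical stand-in for an RTF_CACHE exception entry."""
    last_use: float                  # time the entry was last used
    expires: Optional[float] = None  # set only for pMTU exceptions


def should_remove(route, now, gc_timeout):
    """Decide whether the gc pass should drop this exception entry."""
    if route.expires is None:
        # Plain cached route (e.g. from a redirect): age it out once it
        # has been idle for the gc timeout.
        return now >= route.last_use + gc_timeout
    # pMTU exception: not aged out, removed only after its own
    # (usually much longer) expiry time has passed.
    return now > route.expires


redirect_route = CachedRoute(last_use=0.0)
pmtu_route = CachedRoute(last_use=0.0, expires=600.0)
```

In this model a redirect-created entry idle for 60 seconds is dropped by
a gc pass with a 60 second timeout, while the pMTU entry survives until
its 600 second expiry.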
v1 -> v2:
- do not touch dst.obsolete in rt6_remove_exception(), not needed
v2 -> v3:
- take care of pMTU exceptions, too
Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 19 Oct 2017 14:07:10 +0000 (16:07 +0200)]
ipv6: start fib6 gc on RTF_CACHE dst creation
After the commit 2b760fcf5cfb ("ipv6: hook up exception table
to store dst cache"), the fib6 gc is not started after the
creation of a RTF_CACHE via a redirect or pmtu update, since
fib6_add() isn't invoked anymore for such dsts.
We need the fib6 gc to run periodically to clean the RTF_CACHE,
or the dst will stay there forever.
Fix it by explicitly calling fib6_force_start_gc() on successful
exception creation. gc_args->more accounting will ensure that
the gc timer will run for whatever time needed to properly
clean the table.
v2 -> v3:
- clarified the commit message
Fixes: 2b760fcf5cfb ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Oct 2017 12:33:00 +0000 (13:33 +0100)]
Merge branch 'bpf-lsm-hooks'
Chenbo Feng says:
====================
bpf: security: New file mode and LSM hooks for eBPF object permission control
Much like files and sockets, eBPF objects are accessed, controlled, and
shared via a file descriptor (FD). Unlike files and sockets, the
existing mechanism for eBPF object access control is very limited.
Currently there are two options for granting access to eBPF
operations: grant access to all processes, or only to CAP_SYS_ADMIN
processes. The CAP_SYS_ADMIN-only mode is not ideal because most users
do not have this capability and granting a user CAP_SYS_ADMIN grants too
many other security-sensitive permissions. It also unnecessarily allows
all CAP_SYS_ADMIN processes access to eBPF functionality. Allowing all
processes to access eBPF objects is also undesirable since it has the
potential to allow unprivileged processes to consume kernel memory, and
opens up attack surface to the kernel.
Adding LSM hooks maintains the status quo for systems which do not use
an LSM, preserving compatibility with userspace, while allowing security
modules to choose how best to handle permissions on eBPF objects. Here
is a possible use case for the lsm hooks with selinux module:
The network-control daemon (netd) creates and loads an eBPF object for
network packet filtering and analysis. It passes the object FD to an
unprivileged network monitor app (netmonitor), which is not allowed to
create, modify or load eBPF objects, but is allowed to read the traffic
stats from the map.
Selinux could use these hooks to grant the following permissions:
allow netd self:bpf_map { create read write};
allow netmonitor netd:fd use;
allow netmonitor netd:bpf_map read;
In this patch series, a file mode is added to bpf maps to store the
access mode. With these file mode flags, a map can be obtained read
only, write only, or read and write. With the help of this file mode,
several security hooks can be added to the eBPF syscall implementations
to do permission checks. These LSM hooks are mainly focused on checking
the process privileges before it obtains the fd for a specific bpf
object, whether from a file location or from an eBPF id. Besides that,
a general check hook is also implemented at the start of the bpf syscall
so that each security module can have its own implementation for the rest
of the bpf object related functionality.
In order to store the ownership and security information about eBPF
maps, a security field pointer is added to struct bpf_map. The
last two patches implement the selinux checks on the newly introduced
hooks, plus an additional check when an eBPF object is passed between
processes using a unix socket as well as binder IPC.
Change since V1:
- Whitelist the new bpf flags in the map allocate check.
- Added bpf selftest for the new flags.
- Added two new security hooks for copying the security information from
the bpf object security struct to file security struct
- Simplified the checking action when bpf fd is passed between processes.
Change since V2:
- Fixed the line break problem for map flags check
- Fixed the typo in selinux check of file mode.
- Merge bpf_map and bpf_prog into one selinux class
- Added bpf_type and bpf_sid into file security struct to store the
security information when generating the fd.
- Add the hook to bpf_map_new_fd and bpf_prog_new_fd.
Change since V3:
- Return the actual error from security check instead of -EPERM
- Move the hooks into anon_inode_getfd() to avoid get file again after
bpf object file is installed with fd.
- Removed the bpf_sid field inside file_security_struct to reduce the
cache size.
Change since V4:
- Rename bpf av prog_use to prog_run to distinguish from fd_use.
- Remove the bpf_type field inside file_security_struct and use bpf fops
to identify bpf object instead.
Change since v5:
- Fixed the incorrect selinux class name for SECCLASS_BPF
Change since v7:
- Fixed the build error caused by xt_bpf module.
- Add flags check for bpf_obj_get() and bpf_map_get_fd_by_id() to make it
uapi-wise.
- Add the flags field to the bpf_obj_get_user function when BPF_SYSCALL
is not configured.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Wed, 18 Oct 2017 20:00:26 +0000 (13:00 -0700)]
selinux: bpf: Add addtional check for bpf object file receive
Introduce a bpf object related check when sending and receiving files
through a unix domain socket as well as binder. It checks whether the
receiving process has the privilege to read/write the bpf map or use the
bpf program. This check is necessary because bpf maps and programs use an
anonymous inode as their shared inode, so the normal way of checking
files and sockets when passing them between processes cannot work properly
on eBPF objects. This check only works when BPF_SYSCALL is configured.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Wed, 18 Oct 2017 20:00:25 +0000 (13:00 -0700)]
selinux: bpf: Add selinux check for eBPF syscall operations
Implement the actual checks introduced to eBPF related syscalls. This
implementation uses the security field inside the bpf object to store a sid
that identifies the bpf object. When processes try to access the object,
selinux will check whether they have the right privileges. The creation
of eBPF objects is also checked at the general bpf check hook, and new
cmds introduced to the eBPF domain can also be checked there.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Wed, 18 Oct 2017 20:00:24 +0000 (13:00 -0700)]
security: bpf: Add LSM hooks for bpf object related syscall
Introduce several LSM hooks for the syscalls that allow
userspace to access eBPF objects such as eBPF programs and eBPF maps.
The security check aims to enforce per-object security protection
for eBPF objects so that only processes with the right privileges can
read/write a specific map or use a specific eBPF program. Besides
that, a general security hook is added before the multiplexer of the bpf
syscall to check the cmd and the attribute used for the command. The
actual security module can decide which commands need to be checked and
how each cmd should be checked.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Wed, 18 Oct 2017 20:00:23 +0000 (13:00 -0700)]
bpf: Add tests for eBPF file mode
Two related tests are added to the bpf selftests to test read-only and
write-only maps. The tests verify that the read-only and write-only flags
work on hash maps.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chenbo Feng [Wed, 18 Oct 2017 20:00:22 +0000 (13:00 -0700)]
bpf: Add file mode configuration into bpf maps
Introduce the map read/write flags to the eBPF syscalls that return the
map fd. The flags are used to set up the file mode when constructing a new
file descriptor for bpf maps. To not break backward compatibility, the
f_flags is set to O_RDWR if the flag passed by the syscall is 0. Otherwise
it should be O_RDONLY or O_WRONLY. When userspace wants to modify or
read the map content, the kernel checks the file mode to see if it is
allowed to make the change.
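The flag-to-file-mode mapping described above can be sketched as follows.
This is a userspace model only. The numeric values of BPF_F_RDONLY and
BPF_F_WRONLY are taken from the uapi header as best recalled and should be
treated as assumptions, as should the rejection of both flags being set
together, which the text above does not spell out. The helper names are
hypothetical.

```python
import errno
import os

# Assumed flag values (believed to match include/uapi/linux/bpf.h).
BPF_F_RDONLY = 1 << 3
BPF_F_WRONLY = 1 << 4


def map_fd_open_mode(flags):
    """Translate map access flags into an open(2)-style access mode."""
    if (flags & BPF_F_RDONLY) and (flags & BPF_F_WRONLY):
        # Assumption: asking for both restrictions at once is an error.
        return -errno.EINVAL
    if flags & BPF_F_RDONLY:
        return os.O_RDONLY
    if flags & BPF_F_WRONLY:
        return os.O_WRONLY
    # No flag given: keep the old behavior, full read/write access.
    return os.O_RDWR


def may_read(mode):
    """True if an fd opened with this access mode may read the map."""
    return mode in (os.O_RDONLY, os.O_RDWR)


def may_write(mode):
    """True if an fd opened with this access mode may update the map."""
    return mode in (os.O_WRONLY, os.O_RDWR)
```

Existing callers that pass 0 keep full access, so only programs that opt
in to the new flags see any change in behavior.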
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 18 Oct 2017 19:12:09 +0000 (12:12 -0700)]
net-tun: fix panics at dismantle time
syzkaller got crashes at dismantle time [1]
It is not correct to test (tun->flags & IFF_NAPI) in tun_napi_disable()
and tun_napi_del(): each tun_file can have a different mode, depending
on how it was created.
Similarly I have changed tun_get_user() and tun_poll_controller()
to use the new tfile->napi_enabled boolean.
[ 154.331360] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 154.339220] IP: [<ffffffff9634cad6>] hrtimer_active+0x26/0x60
[ 154.344983] PGD 0
[ 154.347009] Oops: 0000 [#1] SMP
[ 154.350680] gsmi: Log Shutdown Reason 0x03
[ 154.379572] task: ffff994719150dc0 ti: ffff99475c0ae000 task.ti: ffff99475c0ae000
[ 154.387043] RIP: 0010:[<ffffffff9634cad6>] [<ffffffff9634cad6>] hrtimer_active+0x26/0x60
[ 154.395232] RSP: 0018:ffff99475c0afce8 EFLAGS: 00010246
[ 154.400542] RAX: ffff994754850ac0 RBX: ffff994753e65408 RCX: ffff994753e65388
[ 154.407666] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff994753e65408
[ 154.414790] RBP: ffff99475c0afce8 R08: 0000000000000000 R09: 0000000000000000
[ 154.421921] R10: ffff99475f6f5910 R11: 0000000000000001 R12: 0000000000000000
[ 154.429044] R13: ffff99417deab668 R14: ffff99417deaa780 R15: ffff99475f45dde0
[ 154.436174] FS: 0000000000000000(0000) GS:ffff994767a00000(0000) knlGS:0000000000000000
[ 154.444249] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 154.449986] CR2: 0000000000000000 CR3: 00000005a8a0e000 CR4: 0000000000022670
[ 154.457110] Stack:
[ 154.459120] ffff99475c0afd28 ffffffff9634d614 1000000000000000 0000000000000000
[ 154.466598] ffffe54240000000 ffff994753e65408 ffff994753e653a8 ffff99417deab668
[ 154.474067] ffff99475c0afd48 ffffffff9634d6fd ffff99474c2be678 ffff994753e65398
[ 154.481537] Call Trace:
[ 154.483985] [<ffffffff9634d614>] hrtimer_try_to_cancel+0x24/0xf0
[ 154.490074] [<ffffffff9634d6fd>] hrtimer_cancel+0x1d/0x30
[ 154.495563] [<ffffffff96860b3c>] napi_disable+0x3c/0x70
[ 154.500875] [<ffffffff9678ae62>] __tun_detach+0xd2/0x360
[ 154.506272] [<ffffffff9678b117>] tun_chr_close+0x27/0x40
[ 154.511669] [<ffffffff9646ebe6>] __fput+0xd6/0x1e0
[ 154.516548] [<ffffffff9646ed3e>] ____fput+0xe/0x10
[ 154.521429] [<ffffffff963035a2>] task_work_run+0x72/0x90
[ 154.526827] [<ffffffff962e9407>] do_exit+0x317/0xb60
[ 154.531879] [<ffffffff962e9c8f>] do_group_exit+0x3f/0xa0
[ 154.537275] [<ffffffff962e9d07>] SyS_exit_group+0x17/0x20
[ 154.542769] [<ffffffff969784be>] entry_SYSCALL_64_fastpath+0x12/0x17
Fixes: 943170998b20 ("net-tun: enable NAPI for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 18:39:13 +0000 (11:39 -0700)]
net: ipv4: Change fib notifiers to take a fib_alias
All of the notifier data (fib_info, tos, type and table id) are
contained in the fib_alias. Pass it to the notifier instead of
passing each item separately, shortening the argument list by 3.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Wed, 18 Oct 2017 18:22:51 +0000 (11:22 -0700)]
tcp: socket option to set TCP fast open key
Add a new socket option, TCP_FASTOPEN_KEY, to allow different keys per
listener. The listener uses the global key by default until the
socket option is set. The key is 16 bytes of binary data. This
option has no effect on regular non-listener TCP sockets.
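From userspace, setting a per-listener key could look roughly like the
sketch below. The helper names are hypothetical, and the numeric value 33
for TCP_FASTOPEN_KEY is an assumption based on the uapi tcp.h header
(Python's socket module may not export the constant). The option only
works on a kernel that includes this change.

```python
import os
import socket

# Assumed option number; not necessarily exported by the socket module.
TCP_FASTOPEN_KEY = 33
TFO_KEY_LEN = 16  # the key is 16 bytes of binary data


def new_fastopen_key():
    """Generate a random key of the required length."""
    return os.urandom(TFO_KEY_LEN)


def set_fastopen_key(listener, key):
    """Install a per-listener Fast Open key on a listening TCP socket."""
    if len(key) != TFO_KEY_LEN:
        raise ValueError("TCP Fast Open key must be exactly 16 bytes")
    # Until this call is made, the listener keeps using the global key.
    listener.setsockopt(socket.IPPROTO_TCP, TCP_FASTOPEN_KEY, key)
```

A server would call set_fastopen_key(sock, new_fastopen_key()) on its
listening socket. Sharing the same key across a group of servers is one
way to let a cookie issued by one server validate on another.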
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Oct 2017 12:15:08 +0000 (13:15 +0100)]
Merge branch 'mlxsw-extack'
David Ahern says:
====================
mlxsw: spectrum_router: Add extack messages for RIF and VRF overflow
Currently, exceeding the number of VRF instances or the number of router
interfaces either fails with a non-intuitive EBUSY:
$ ip li set swp1s1.6 vrf vrf-1s1-6 up
RTNETLINK answers: Device or resource busy
or fails silently (IPv6) since the checks are done in a work queue. This
set adds support for the address validator notifier to spectrum which
allows ext-ack based messages to be returned on failure.
To make that happen the IPv6 version needs to be converted from atomic
to blocking (patch 2), and then support for extack needs to be added
to the notifier (patch 3). Patch 1 reworks the locking in ipv6_add_addr
to work better in the atomic and non-atomic code paths. Patches 4 and 5
add the validator notifier to spectrum and then plumb the extack argument
through spectrum_router.
With this set, VRF overflows fail with:
$ ip li set swp1s1.6 vrf vrf-1s1-6 up
Error: spectrum: Exceeded number of supported VRF.
and RIF overflows fail with:
$ ip addr add dev swp1s2.191 10.12.191.1/24
Error: spectrum: Exceeded number of supported router interfaces.
v2 -> v3
- fix surrounding context of patch 4 which was altered by c30f5d012edf
v1 -> v2
- fix error path in ipv6_add_addr: reset rt to NULL (Ido comment) and
add in6_dev_put on ifa once the hold has been done
RFC -> v1
- addressed various comments from Ido
- refactored ipv6_add_addr to allow ifa's to be allocated with
GFP_KERNEL as requested by DaveM
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 16:56:56 +0000 (09:56 -0700)]
mlxsw: spectrum_router: Add extack message for RIF and VRF overflow
Add extack argument down to mlxsw_sp_rif_create and mlxsw_sp_vr_create
to set an error message on RIF or VR overflow. Now on overflow of
either resource the user gets an informative message as opposed to
failing with EBUSY.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 16:56:55 +0000 (09:56 -0700)]
mlxsw: spectrum: router: Add support for address validator notifier
Add support for inetaddr_validator and inet6addr_validator. The
notifiers provide a means for validating ipv4 and ipv6 addresses
before the addresses are installed and on failure the error
is propagated back to the user.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 16:56:54 +0000 (09:56 -0700)]
net: Add extack to validator_info structs used for address notifier
Add extack to in_validator_info and in6_validator_info. Update the one
user of each, ipvlan, to return an error message for failures.
Only manual configuration of an address is plumbed in the IPv6 code path.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 16:56:53 +0000 (09:56 -0700)]
net: ipv6: Make inet6addr_validator a blocking notifier
inet6addr_validator chain was added by commit 3ad7d2468f79f ("Ipvlan
should return an error when an address is already in use") to allow
address validation before changes are committed and to be able to
fail the address change with an error back to the user. The address
validation is not done for addresses received from router
advertisements.
Handling RAs in softirq context is the only reason for the notifier
chain to be atomic versus blocking. Since the only current user, ipvlan,
of the validator chain ignores softirq context, the notifier can be made
blocking and simply not invoked for softirq path.
The blocking option is needed by spectrum, for example, to validate
resources for adding an address to an interface.
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 16:56:52 +0000 (09:56 -0700)]
ipv6: addrconf: cleanup locking in ipv6_add_addr
ipv6_add_addr is called in process context with rtnl lock held
(e.g., manual config of an address) or during softirq processing
(e.g., autoconf and address from a router advertisement).
Currently, ipv6_add_addr calls rcu_read_lock_bh shortly after entry
and does not call unlock until exit, minus the call around the address
validator notifier. Similarly, addrconf_hash_lock is taken after the
validator notifier and held until exit. This forces the allocation of
inet6_ifaddr to always be atomic.
Refactor ipv6_add_addr as follows:
1. Add an input boolean to discriminate the call path (process context
or softirq). This new flag controls whether the alloc can be done
with GFP_KERNEL or GFP_ATOMIC.
2. Move the rcu_read_lock_bh and unlock calls only around functions that
do rcu updates.
3. Remove the in6_dev_hold and put added by 3ad7d2468f79f ("Ipvlan should
return an error when an address is already in use."). This was done
presumably because rcu_read_unlock_bh needs to be called before calling
the validator. Since rcu_read_lock is not needed before the validator
runs, revert the hold and put added by 3ad7d2468f79f and only do the
hold when setting ifp->idev.
4. Move the duplicate address check and the insertion of the new address
in the global address hash into a helper. The helper is called after an
ifa is allocated and filled in.
This allows the ifa for manually configured addresses to be allocated with
GFP_KERNEL and reduces the overall amount of time with rcu_read_lock held
and hash table spinlock held.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Oct 2017 12:11:05 +0000 (13:11 +0100)]
Merge branch 's390-next'
Julian Wiedmann says:
====================
s390/net: updates 2017-10-18
please apply some additional robustness fixes and cleanups for 4.15.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:25 +0000 (17:40 +0200)]
s390/qeth: don't dump control cmd twice
A few lines down, qeth_prepare_control_data() makes further changes to
the control cmd buffer, and then also writes a trace entry for it.
So the first entry just pollutes the trace file with intermediate data;
drop it.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:24 +0000 (17:40 +0200)]
s390/qeth: support GRO flush timer
Switch to napi_complete_done(), and thus enable delayed GRO flushing.
The timeout is configured via /sys/class/net/<if>/gro_flush_timeout.
Default timeout is 0, so no change in behaviour.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:23 +0000 (17:40 +0200)]
s390/qeth: try harder to get packets from RX buffer
Current code bails out when two subsequent buffer elements hold
insufficient data to contain a qeth_hdr packet descriptor.
This seems reasonable, but it would be legal for quirky hardware to
leave a few elements empty and then present packets in a subsequent
element. These packets would currently be dropped.
So make sure to check all buffer elements, until we hit the LAST_ENTRY
indication.
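The scanning rule can be illustrated with a small model. This is only a
sketch of the idea, not the driver code: buffer elements are represented
as (length, is_last) tuples, the function name is hypothetical, and the
32-byte header size is an illustrative value.

```python
QETH_HDR_LEN = 32  # illustrative header size, assumed for this example


def next_element_with_header(elements, start=0, hdr_len=QETH_HDR_LEN):
    """Return the index of the next element that can hold a header.

    elements is a list of (length, is_last) tuples. Elements that are
    too short are skipped, and the scan only gives up once the element
    flagged as the last entry has been examined.
    """
    for idx in range(start, len(elements)):
        length, is_last = elements[idx]
        if length >= hdr_len:
            return idx
        if is_last:
            return None
    return None


# Quirky layout: three empty elements, then a packet in the last one.
quirky_buffer = [(0, False), (0, False), (0, False), (64, True)]
found = next_element_with_header(quirky_buffer)
```

A scan that stops after two short elements would miss the packet in this
layout, while scanning up to the last entry finds it at index 3.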
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:22 +0000 (17:40 +0200)]
s390/qeth: consolidate skb allocation
Move the allocation of SG skbs into the main path. This allows for
a little code sharing, and handling ENOMEM from within one place.
As a side effect, L2 SG skbs now get the proper amount of additional
headroom (read: zero) instead of the hard-coded ETH_HLEN.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:21 +0000 (17:40 +0200)]
s390/qeth: clean up page frag creation
Replace the open-coded skb_add_rx_frag(), and use a fall-through
to remove some duplicated code.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:20 +0000 (17:40 +0200)]
s390/qeth: no VLAN support on OSM
Instead of silently discarding VLAN registration requests on OSM,
just indicate that this card type doesn't support VLAN.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:19 +0000 (17:40 +0200)]
s390/qeth: don't verify device when setting MAC address
There's no reason why l2_set_mac_address() should ever be called for
a netdevice that's not owned by qeth. It's certainly not required for
VLAN devices, which have their own netdev_ops.
Also:
1) we don't do such validation for any of the other netdev_ops routines.
2) the code in question clearly has never been actually exercised;
it's broken. After determining that the device is not owned
by qeth, it would still use dev->ml_priv to write a qeth trace entry.
Remove the check, and its helper that walked the global card list.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:18 +0000 (17:40 +0200)]
s390/qeth: clean up initial MTU determination
1. Drop the support for Token Ring,
2. use the ETH_DATA_LEN macro for the default L2 MTU,
3. handle OSM via the default case (as OSM is L2-only), and
4. document why the L3 MTU is reduced.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:17 +0000 (17:40 +0200)]
s390/qeth: fix early exit from error path
When the allocation of the addr buffer fails, we need to free
our refcount on the inetdevice before returning.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Wed, 18 Oct 2017 15:40:16 +0000 (17:40 +0200)]
s390/qeth: use kstrtobool() in qeth_bridgeport_hostnotification_store()
The sysfs enabled value is a boolean, so kstrtobool() is a better fit
for parsing the input string since it does the range checking for us.
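For readers unfamiliar with the helper, the following is a userspace
model of the parsing rules kstrtobool() is understood to apply in kernels
of this period: only the first one or two characters are examined. The
Python function name is hypothetical, and the rules are written from
memory, so treat them as an assumption rather than a specification.

```python
def parse_sysfs_bool(text):
    """Model of kstrtobool()-style parsing of a sysfs input string."""
    if not text:
        raise ValueError("empty input")
    first = text[0]
    if first in ("y", "Y", "1"):
        return True
    if first in ("n", "N", "0"):
        return False
    if first in ("o", "O") and len(text) > 1:
        # "on" / "off" are told apart by the second character.
        if text[1] in ("n", "N"):
            return True
        if text[1] in ("f", "F"):
            return False
    # Anything else is rejected instead of being silently accepted.
    raise ValueError("not a boolean: %r" % text)
```

Input such as "2" is rejected up front, which is the range checking the
commit message refers to. A trailing newline from echo does not matter
because only the leading characters are inspected.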
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:15 +0000 (17:40 +0200)]
s390/qeth: remove duplicated device matching
With commit "s390/ccwgroup: tie a ccwgroup driver to its ccw driver",
the ccwgroup core now ensures that a qeth group device only consists of
ccw devices which are supported by qeth. Therefore remove qeth's
internal device matching, and use .driver_info to determine the card
type.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allen Pais [Wed, 18 Oct 2017 15:40:14 +0000 (17:40 +0200)]
s390/drivers: use setup_timer
Use setup_timer function instead of initializing timer with the
function and data fields.
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Wed, 18 Oct 2017 15:40:13 +0000 (17:40 +0200)]
s390/qeth: rely on kernel for feature recovery
When recovering a device, qeth needs to re-run the IPA commands that
enable all previously active HW features.
Instead of duplicating qeth_set_features(), let netdev_update_features()
recover the missing HW features from dev->wanted_features.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Wed, 18 Oct 2017 15:38:08 +0000 (18:38 +0300)]
net/sched: Set the net-device for egress device instance
Currently the netdevice field is not set and the egdev instance
is not functional; fix that.
Fixes: 3f55bdda8df ('net: sched: introduce per-egress action device callbacks')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Oct 2017 12:06:53 +0000 (13:06 +0100)]
Merge branch 'cxgb4-more-flower-offloads'
Rahul Lakkireddy says:
====================
cxgb4: enable more tc flower offload matches and actions
This patch series enables more matches and actions for TC Flower
Offload support on Chelsio adapters.
Patch 1 enables matching on IP TOS.
Patch 2 enables matching on VLAN TCI.
Patch 3 adds support for action PASS.
Patch 4 adds support for ETH-DMAC rewrite via TC-PEDIT action. Also,
adds a check to assert that vlan/eth-dmac rewrite actions are valid
only in combination with action egress redirect.
Patch 5 introduces SMT ops for adding/removing entries from SMAC Table
in HW in preparation for patch 6.
Patch 6 adds support for ETH-SMAC rewrite via TC-PEDIT action.
Patch 7 introduces fw_filter2_wr to support L3/L4 header rewrites
in preparation for patch 8.
Patch 8 adds support for rewrite on L3/L4 header fields via TC-PEDIT
action. Supported fields for rewrite are:
IPv4 src/dst address, IPv6 src/dst address, TCP/UDP sport/dport.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:14 +0000 (20:49 +0530)]
cxgb4: add tc flower support for L3/L4 rewrite
Adds support to rewrite L3/L4 fields via TC-PEDIT action.
Supported fields for rewrite are:
IPv4 src/dst address, IPv6 src/dst address, TCP/UDP sport/dport.
Also, process match fields first and then process the action items.
Refactor pedit action validation to separate function to avoid
excessive code indentation.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:13 +0000 (20:49 +0530)]
cxgb4: introduce fw_filter2_wr to prepare for L3/L4 rewrite support
Update driver to use new fw_filter2_wr in order to support rewrite of
L3/L4 header fields via filters. Query FW_PARAMS_PARAM_DEV_FILTER2_WR
to check whether FW supports this new wr.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:12 +0000 (20:49 +0530)]
cxgb4: add tc flower support for ETH-SMAC rewrite
Adds support for ETH-SMAC rewrite via TC-PEDIT action.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:11 +0000 (20:49 +0530)]
cxgb4: introduce SMT ops to prepare for SMAC rewrite support
Introduce SMT operations for allocating/removing entries from
SMAC table. Make TCAM filters use the SMT ops whenever SMAC rewrite
is required.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:10 +0000 (20:49 +0530)]
cxgb4: add tc flower support for ETH-DMAC rewrite
Add support for ETH-DMAC Rewrite via TC-PEDIT action. Also, add
check to assert that vlan/eth-dmac rewrite actions are valid only
in combination with action egress redirect.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:09 +0000 (20:49 +0530)]
cxgb4: add tc flower support for action PASS
Add support for tc flower action PASS.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:08 +0000 (20:49 +0530)]
cxgb4: add tc flower match support for vlan
Add support for matching on vlan tci. Construct vlan tci match param
based on vlan-id and vlan-pcp values supplied by tc.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
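As an illustration of the TCI construction described in the vlan match commit above, here is a minimal userspace sketch. It assumes the standard 802.1Q layout (priority in bits 15..13, VLAN ID in bits 11..0). The helper name vlan_tci_build is hypothetical and is not taken from the cxgb4 driver.

```c
#include <stdint.h>

/* 802.1Q TCI layout: PCP in bits 15..13, DEI in bit 12, VID in bits 11..0. */
#define VLAN_VID_MASK   0x0fffu
#define VLAN_PRIO_MASK  0xe000u
#define VLAN_PRIO_SHIFT 13

/* Hypothetical helper: fold a VLAN ID and a priority into one 16-bit TCI. */
static uint16_t vlan_tci_build(uint16_t vlan_id, uint8_t vlan_pcp)
{
	uint16_t vid = vlan_id & VLAN_VID_MASK;
	uint16_t pcp = ((uint16_t)vlan_pcp << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK;

	return (uint16_t)(vid | pcp);
}
```

A match mask can be assembled in the same way from the per-field masks supplied by tc, so that only the requested fields take part in the comparison.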
Kumar Sanghvi [Wed, 18 Oct 2017 15:19:07 +0000 (20:49 +0530)]
cxgb4: add tc flower match support for TOS
Add support for matching on IP TOS. Also check on ethtype value
to be either IPv4 or IPv6.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Wed, 18 Oct 2017 15:17:29 +0000 (08:17 -0700)]
tcp: Remove use of inet6_sk and add IPv6 checks to tracepoint
386fd5da401d ("tcp: Check daddr_cache before use in tracepoint") was the
second version of the tracepoint fixup patch. This patch is the delta
between v2 and v3. Specifically, remove the use of inet6_sk and check
sk_family as requested by Eric and add IS_ENABLED(CONFIG_IPV6) around
the use of sk_v6_rcv_saddr and sk_v6_daddr as done in sock_common (noted
by Cong).
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Donald Sharp [Wed, 18 Oct 2017 14:24:28 +0000 (10:24 -0400)]
doc: Update VRF documentation metric
Two things:
1) Update examples to show usage of metric
2) Discuss reasoning for using such a high metric.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Oct 2017 07:42:09 +0000 (08:42 +0100)]
Merge tag 'rxrpc-next-20171018' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Add bits for kernel services
Here are some patches that add a few things for kernel services to use:
(1) Allow service upgrade to be requested and allow the resultant actual
service ID to be obtained.
(2) Allow the RTT time of a call to be obtained.
(3) Allow a kernel service to find out if a call is still alive on a
server between transmitting a request and getting the reply.
(4) Allow data transmission to ignore signals if transmission progress is
being made in reasonable time. This is also usable by userspace by
passing MSG_WAITALL to sendmsg()[*].
[*] I'm not sure this is the right interface for this or whether a sockopt
should be used instead.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 20 Oct 2017 07:37:28 +0000 (08:37 +0100)]
Merge tag 'wireless-drivers-next-for-davem-2017-10-18' of git://git./linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
wireless-drivers-next patches for 4.15
The first pull request for 4.15, unusually late this time but still
relatively small. Also includes merge from wireless-drivers to fix
conflicts in iwlwifi.
Major changes:
rsi
* add P2P mode support
* sdio suspend and resume support
iwlwifi
* A fix and an addition for PCI devices for the A000 family
* Dump PCI registers when an error occurs, to make it easier to debug
rtlwifi
* add support for 64 bit DMA, enabled with a module parameter
* add module parameter to enable ASPM
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 18 Oct 2017 08:33:37 +0000 (10:33 +0200)]
net: sched: cls_u32: use hash_ptr() for tc_u_hash
After the change to the tp hash, we now get a build warning
on 32-bit architectures:
net/sched/cls_u32.c: In function 'tc_u_hash':
net/sched/cls_u32.c:338:17: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
return hash_64((u64) tp->chain->block, U32_HASH_SHIFT);
Using hash_ptr() instead of hash_64() lets us drop the cast
and fixes the warning while still resulting in the same hash
value.
Fixes: 7fa9d974f3c2 ("net: sched: cls_u32: use block instead of q in tc_u_common")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
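The cls_u32 warning above comes from casting a 32-bit pointer straight to a 64-bit integer. A rough userspace sketch of the pointer-sized alternative follows. The *_sketch names are hypothetical, and the multiplicative constants are modeled on the kernel's hash helpers, so treat them as assumptions rather than the exact kernel code.

```c
#include <stdint.h>

/* Assumed golden-ratio multipliers, modeled on include/linux/hash.h. */
#define GOLDEN_RATIO_32 0x61C88647u
#define GOLDEN_RATIO_64 0x61C8864680B583EBull

/* Multiplicative hashes; bits must be in the range 1..32. */
static uint32_t hash_32_sketch(uint32_t val, unsigned int bits)
{
	return (val * GOLDEN_RATIO_32) >> (32 - bits);
}

static uint32_t hash_64_sketch(uint64_t val, unsigned int bits)
{
	return (uint32_t)((val * GOLDEN_RATIO_64) >> (64 - bits));
}

/*
 * Convert the pointer through uintptr_t, which always has pointer width,
 * and only then pick the hash that matches that width. No cast between a
 * pointer and an integer of a different size is needed.
 */
static uint32_t hash_ptr_sketch(const void *ptr, unsigned int bits)
{
	uintptr_t v = (uintptr_t)ptr;

#if UINTPTR_MAX > 0xffffffffu
	return hash_64_sketch((uint64_t)v, bits);
#else
	return hash_32_sketch((uint32_t)v, bits);
#endif
}
```

On a 64-bit build the pointer variant and the explicit 64-bit variant produce the same bucket, so switching helpers does not move entries there.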
Dan Carpenter [Wed, 18 Oct 2017 07:48:25 +0000 (10:48 +0300)]
tipc: checking for NULL instead of IS_ERR()
The tipc_alloc_conn() function never returns NULL, it returns error
pointers, so I have fixed the check.
Fixes: 14c04493cb77 ("tipc: add ability to order and receive topology events in driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
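The tipc fix above relies on the error-pointer convention. The following is a userspace imitation of that convention, given only as a sketch: the macros mirror the kernel's ERR_PTR/IS_ERR/PTR_ERR in spirit, while struct conn and conn_alloc() are hypothetical stand-ins for the real allocator.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* The top 4095 addresses are reserved to carry negative errno values. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct conn {
	int id;
};

/* Hypothetical allocator that reports failure as an error pointer, never NULL. */
static struct conn *conn_alloc(int simulate_failure)
{
	struct conn *c;

	if (simulate_failure)
		return ERR_PTR(-ENOMEM);

	c = malloc(sizeof(*c));
	if (!c)
		return ERR_PTR(-ENOMEM);
	c->id = 1;
	return c;
}
```

A caller that tests the result with "if (!c)" never sees the failure, because the error pointer is non-NULL; testing with IS_ERR(c) and returning PTR_ERR(c) is the matching check.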
David S. Miller [Fri, 20 Oct 2017 07:32:49 +0000 (08:32 +0100)]
Merge branch 'sh_eth-fallback-compat-strings'
Simon Horman says:
====================
net: sh_eth: add R-Car Gen[12] fallback compatibility strings
Add fallback compatibility strings for R-Car Gen 1 and 2.
In the case of Renesas R-Car hardware we know that there are generations of
SoCs, f.e. Gen 1 and 2. But beyond that it's not clear what the relationship
between IP blocks might be. For example, I believe that r8a7790 is older
than r8a7791 but that doesn't imply that the latter is a descendant of the
former or vice versa.
We can, however, by examining the documentation and behaviour of the
hardware at run-time observe that the current driver implementation appears
to be compatible with the IP blocks on SoCs within a given generation.
For the above reasons and convenience when enabling new SoCs a
per-generation fallback compatibility string scheme is being adopted for
drivers for Renesas SoCs.
Changes since v1:
* Correct typos in changelogs
* Consistently use tabs for indentation in bindings document
* Enhance readability of description of bindings usage
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Wed, 18 Oct 2017 07:21:28 +0000 (09:21 +0200)]
net: sh_eth: implement R-Car Gen[12] fallback compatibility strings
Implement fallback compatibility strings for R-Car Gen 1 and 2.
In the case of Renesas R-Car hardware we know that there are generations of
SoCs, f.e. Gen 1 and 2. But beyond that it's not clear what the relationship
between IP blocks might be. For example, I believe that r8a7790 is older
than r8a7791 but that doesn't imply that the latter is a descendant of the
former or vice versa.
We can, however, by examining the documentation and behaviour of the
hardware at run-time observe that the current driver implementation appears
to be compatible with the IP blocks on SoCs within a given generation.
For the above reasons and convenience when enabling new SoCs a
per-generation fallback compatibility string scheme is being adopted for
drivers for Renesas SoCs.
Note that R-Car Gen2 and RZ/G1 have many compatible IP blocks. The
approach that has been consistently taken for other IP blocks is to name
common code, compatibility strings and so on after R-Car Gen2.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Wed, 18 Oct 2017 07:21:27 +0000 (09:21 +0200)]
net: sh_eth: rename name structures as rcar_gen[12]_*
Rename structures describing R-Car SoCs as rcar_gen[12]_*
rather than r8a77[79]x_*. This seems a little easier on the
eyes. And will make things slightly cleaner in a follow-up
patch that adds fallback-compatibility strings for these SoCs.
Note that R-Car Gen2 and RZ/G1 have many compatible IP blocks. The
approach that has been consistently taken for other IP blocks is to name
common code, compatibility strings and so on after R-Car Gen2.
Also rename sh_eth_set_rate_r8a777x as sh_eth_set_rate_rcar as
it is used by the R-Car generations supported by the driver.
This patch should have no run-time effect and
is compile-tested only.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Wed, 18 Oct 2017 07:21:26 +0000 (09:21 +0200)]
dt-bindings: net: sh_eth: add R-Car Gen[12] fallback compatibility strings
Add fallback compatibility strings for R-Car Gen 1 and 2.
In the case of Renesas R-Car hardware we know that there are generations of
SoCs, f.e. Gen 1 and 2. But beyond that it's not clear what the relationship
between IP blocks might be. For example, I believe that r8a7790 is older
than r8a7791 but that doesn't imply that the latter is a descendant of the
former or vice versa.
We can, however, by examining the documentation and behaviour of the
hardware at run-time observe that the current driver implementation appears
to be compatible with the IP blocks on SoCs within a given generation.
For the above reasons and convenience when enabling new SoCs a
per-generation fallback compatibility string scheme is being adopted for
drivers for Renesas SoCs.
Note that R-Car Gen2 and RZ/G1 have many compatible IP blocks. The
approach that has been consistently taken for other IP blocks is to name
common code, compatibility strings and so on after R-Car Gen2.
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stephen Hemminger [Wed, 18 Oct 2017 00:16:52 +0000 (17:16 -0700)]
dql: make dql_init return void
dql_init always returned 0, and the only place that uses it
in network core code didn't care about the return value anyway.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 17 Oct 2017 22:42:53 +0000 (17:42 -0500)]
net: l2tp: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in this particular case I replaced the "NOBREAK" comment with
a "fall through" comment, which is what GCC is expecting to find.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
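For readers unfamiliar with the annotation, the sketch below shows the shape of a marked fall-through. The enum and function are hypothetical and are not taken from the l2tp code. The comment form is the one the commit above says GCC expects.

```c
/* Hypothetical teardown: an open session needs both steps, an idle one only the last. */
enum sess_state { SESS_IDLE, SESS_OPEN, SESS_CLOSED };

static int teardown_steps(enum sess_state st)
{
	int steps = 0;

	switch (st) {
	case SESS_OPEN:
		steps++;	/* close the session first */
		/* fall through */
	case SESS_IDLE:
		steps++;	/* then release its resources */
		break;
	case SESS_CLOSED:
		break;
	}

	return steps;
}
```

Without the comment, -Wimplicit-fallthrough would flag the SESS_OPEN case, since the compiler cannot tell an intended fall-through from a forgotten break.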
Gustavo A. R. Silva [Tue, 17 Oct 2017 19:01:45 +0000 (14:01 -0500)]
liquidio: mark expected switch fall-through in octeon_destroy_resources
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Tue, 17 Oct 2017 18:59:20 +0000 (13:59 -0500)]
liquidio: remove unnecessary NULL check before kfree in delete_glists
NULL check before freeing functions like kfree is not needed.
This issue was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
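The same rule holds for free() in userspace C, which makes for an easy illustration. The struct and helper below are hypothetical and only mirror the shape of the cleanup, not the liquidio code itself.

```c
#include <stdlib.h>

/* Hypothetical pair of optional buffers; either may be NULL. */
struct glist_bufs {
	void *virt;
	void *meta;
};

static void glist_bufs_release(struct glist_bufs *g)
{
	/* free(NULL) is defined to do nothing, so no "if (g->virt)" guard is needed. */
	free(g->virt);
	g->virt = NULL;
	free(g->meta);
	g->meta = NULL;
}
```

Dropping the guard removes a branch and keeps the cleanup path uniform whether or not an allocation ever happened.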
David S. Miller [Thu, 19 Oct 2017 12:20:32 +0000 (13:20 +0100)]
Merge branch 'ibmvnic-next'
Thomas Falcon says:
====================
ibmvnic: Enable SG and TSO feature support
This patch set is fairly straightforward. The first patch enables
scatter-gather support in the ibmvnic driver. The following patch
then enables the TCP Segmentation offload feature. The final patch
allows users to enable or disable net device features using ethtool.
Enabling SG and TSO grants a large increase in throughput with TX
speed increasing from 1Gb/s to 9Gb/s in our initial test runs.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 17 Oct 2017 17:36:56 +0000 (12:36 -0500)]
ibmvnic: Let users change net device features
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 17 Oct 2017 17:36:55 +0000 (12:36 -0500)]
ibmvnic: Enable TSO support
This patch enables TSO support. It includes additional
buffers reserved exclusively for large packets. Throughput
is greatly increased with TSO enabled, from about 1 Gb/s to
9 Gb/s on our test systems.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 17 Oct 2017 17:36:54 +0000 (12:36 -0500)]
ibmvnic: Enable scatter-gather support
This patch enables scatter gather support. Since there is no
HW/FW scatter-gather support at this time, the driver needs to
loop through each fragment and copy it to a contiguous, pre-mapped
buffer entry.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
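The copy loop described in the scatter-gather commit above can be sketched in plain C as follows. struct frag and linearize() are hypothetical names; the real driver walks skb fragments and writes into its pre-mapped buffer, which this sketch replaces with an ordinary byte array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fragment descriptor: one piece of a scattered packet. */
struct frag {
	const void *data;
	size_t len;
};

/*
 * Copy every fragment back to back into one contiguous buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t linearize(void *dst, size_t dst_len,
			const struct frag *frags, size_t nfrags)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < nfrags; i++) {
		if (frags[i].len > dst_len - off)
			return 0;
		memcpy((char *)dst + off, frags[i].data, frags[i].len);
		off += frags[i].len;
	}

	return off;
}
```

The cost is one extra copy per fragment, which is the trade-off the commit accepts while there is no HW/FW scatter-gather support.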
Eric Dumazet [Tue, 17 Oct 2017 17:07:44 +0000 (10:07 -0700)]
tun: relax check on eth_get_headlen() return value
syzkaller hit the WARN() in tun_get_user(), providing an skb
with payload in fragments only, and nothing in skb->head.
The GRO layer is fine with this, so relax the check.
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 17 Oct 2017 15:01:30 +0000 (16:01 +0100)]
mqprio: fix potential null pointer dereference on opt
The pointer opt has a null check; however, before this check opt is
dereferenced when len is initialized, hence we potentially have a null
pointer dereference on opt. Avoid this by checking for a null opt before
dereferencing it.
Detected by CoverityScan, CID#1458234 ("Dereference before null check")
Fixes: 4e8b86c06269 ("mqprio: Introduce new hardware offload mode and shaper in mqprio")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
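The "dereference before null check" pattern from the mqprio commit above is easy to reproduce in miniature. The sketch below uses a hypothetical struct attr and parse_opt(); it shows the corrected ordering and keeps the buggy ordering in a comment only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the netlink attribute passed to the qdisc. */
struct attr {
	int len;
};

/*
 * Buggy ordering, shown for comparison only:
 *
 *	int len = opt->len;	(opt is dereferenced here ...)
 *	if (!opt)		(... so this check comes too late)
 *		return -EINVAL;
 */
static int parse_opt(const struct attr *opt)
{
	int len;

	if (!opt)
		return -EINVAL;

	len = opt->len;		/* safe: opt is known to be non-NULL */
	return len;
}
```

Moving the initialization below the check is the whole fix; declaring the variable without an initializer makes the ordering explicit.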
Dan Carpenter [Tue, 17 Oct 2017 12:33:01 +0000 (15:33 +0300)]
thunderbolt: Right shifting to zero bug in tbnet_handle_packet()
There is a problem when we do:
sequence = pkg->hdr.length_sn & TBIP_HDR_SN_MASK;
sequence >>= TBIP_HDR_SN_SHIFT;
TBIP_HDR_SN_SHIFT is 27, and right shifting a u8 27 bits is always
going to result in zero. The fix is to declare these variables as u32.
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yehezkel Bernat <yehezkel.bernat@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
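The effect of the u8 declaration described above can be shown with a small userspace sketch. The shift of 27 is taken from the commit text; the two-bit mask width is an assumption made for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

#define SN_SHIFT 27
#define SN_MASK  (0x3u << SN_SHIFT)	/* assumed field width, for illustration */

/* Buggy variant: the temporary is only 8 bits wide. */
static uint32_t sn_extract_u8(uint32_t length_sn)
{
	uint8_t sequence;

	sequence = length_sn & SN_MASK;	/* bits 27 and up are lost in the store */
	sequence >>= SN_SHIFT;		/* shifting 8 bits right by 27 leaves 0 */
	return sequence;
}

/* Fixed variant: a 32-bit temporary keeps the masked field intact. */
static uint32_t sn_extract_u32(uint32_t length_sn)
{
	uint32_t sequence;

	sequence = length_sn & SN_MASK;
	sequence >>= SN_SHIFT;
	return sequence;
}
```

With a header word carrying sequence number 2, the u8 variant returns 0 while the u32 variant returns 2, which matches the commit's description that the result is always zero.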
Dan Carpenter [Tue, 17 Oct 2017 12:32:17 +0000 (15:32 +0300)]
thunderbolt: Fix a couple right shifting to zero bugs
The problematic code looks like this:
res_seq = res_hdr->xd_hdr.length_sn & TB_XDOMAIN_SN_MASK;
res_seq >>= TB_XDOMAIN_SN_SHIFT;
TB_XDOMAIN_SN_SHIFT is 27, and right shifting a u8 27 bits is always
going to result in zero. The fix is to declare these variables as u32.
Fixes: d1ff70241a27 ("thunderbolt: Add support for XDomain discovery protocol")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 19 Oct 2017 11:51:38 +0000 (12:51 +0100)]
Merge branch 'ena-next'
Netanel Belgazal says:
====================
update ENA driver to release 1.3.0
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:34:01 +0000 (07:34 +0000)]
net: ena: increase ena driver version to 1.3.0
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:34:00 +0000 (07:34 +0000)]
net: ena: add new admin define for future support of IPv6 RSS
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:33:59 +0000 (07:33 +0000)]
net: ena: add statistics for missed tx packets
Add a new statistic to ethtool stats that shows the number of packets
without transmit acknowledgement from the ENA device.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:33:58 +0000 (07:33 +0000)]
net: ena: add power management ops to the ENA driver
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:33:57 +0000 (07:33 +0000)]
net: ena: remove legacy suspend/resume support
Remove the ena_device_io_suspend/resume() methods.
Those methods were intended to be used by the device to trigger
suspend/resume, but this was eventually dropped.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:33:56 +0000 (07:33 +0000)]
net: ena: improve ENA driver boot time.
The ENA admin command timeout has a resolution of 100ms.
Therefore, when the driver works in polling mode, it sleeps for 100ms
each time. The overall boot time of the ENA driver is ~1.5 sec.
To reduce the boot time, this change modifies the granularity of
the sleeps to 5ms.
This change improves the boot time to 220ms.
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
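Why a finer sleep granularity shortens boot can be seen from a simplified model. The sketch below assumes the driver checks for completion once per sleep interval and ignores all other costs. The function name is hypothetical, and the model does not attempt to reproduce the 1.5 sec and 220ms figures quoted above.

```c
/*
 * Simplified polling model: a command that completes after done_ms is only
 * noticed at the next poll, i.e. at the next multiple of sleep_ms.
 */
static unsigned int observed_latency_ms(unsigned int done_ms,
					unsigned int sleep_ms)
{
	unsigned int polls = (done_ms + sleep_ms - 1) / sleep_ms;	/* round up */

	return polls * sleep_ms;
}
```

Under this model a command that finishes after 7ms is observed at 100ms with the old interval and at 10ms with the new one, and the difference accumulates over every admin command issued during probe.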
Netanel Belgazal [Tue, 17 Oct 2017 07:30:34 +0000 (07:30 +0000)]
MAINTAINERS: change ENA driver maintainers email domain
ENA driver was developed by developers from Annapurna Labs.
Annapurna Labs was acquired by Amazon and the company's domain
(@annapurnalabs.com) will become deprecated soon.
Update the email addresses of the maintainers to the alternative amazon
emails (@amazon.com)
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kalderon [Tue, 17 Oct 2017 07:23:25 +0000 (10:23 +0300)]
qed: Fix iWARP out of order flow
Out of order flow is not working for iWARP.
This patch got cut out from initial series that added out
of order support for iWARP.
Make out of order code common for iWARP and iSCSI.
Add new configuration option CONFIG_QED_OOO. Set by
qedr and qedi Kconfigs.
Fixes: d1abfd0b4ee2 ("qed: Add iWARP out of order support")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Manish Rangankar <Manish.Rangankar@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Tue, 17 Oct 2017 06:51:30 +0000 (14:51 +0800)]
net: hns3: Add mqprio hardware offload support in hns3 driver
When using tc qdisc, dcb_ops->setup_tc is used to tell hclge_dcb
module to do the tm related setup. Only TC_MQPRIO_MODE_CHANNEL
offload mode is supported.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Tue, 17 Oct 2017 02:44:44 +0000 (22:44 -0400)]
macvlan/macvtap: Add support for L2 forwarding offloads with macvtap
This patch reverts earlier commit b13ba1b83f52 ("macvlan: forbid L2
fowarding offload for macvtap"). The reason for the revert is that
the original patch no longer fixes what it previously did as the
underlying structure has changed for macvtap. Specifically macvtap
originally pulled packets directly off of the lowerdev. However in commit
6acf54f1cf0a ("macvtap: Add support of packet capture on macvtap device.")
that code was changed and instead macvtap would listen directly on the
macvtap device itself instead of the lower device. As such, the L2
forwarding offload should now be able to provide a performance advantage of
skipping the checks on the lower dev while not introducing any sort of
regression.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 19 Oct 2017 10:44:36 +0000 (11:44 +0100)]
Merge branch '40GbE' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
40GbE Intel Wired LAN Driver Updates 2017-10-17
This series contains updates to i40e and ethtool.
Alan provides most of the changes in this series which are mainly fixes
and cleanups. Renamed the ethtool "cmd" variable to "ks", since the new
ethtool API passes us ksettings structs instead of command structs.
Cleaned up an ifdef that was not accomplishing anything. Added function
header comments to provide better documentation. Fixed two issues in
i40e_get_link_ksettings(), by calling
ethtool_link_ksettings_zero_link_mode() to ensure the advertising and
link masks are cleared before we start setting bits. Cleaned up and fixed
code comments which were incorrect. Separated the setting of autoneg in
i40e_phy_types_to_ethtool() into its own conditional to clarify what PHYs
support and advertise autoneg, and makes it easier to add new PHY types in
the future. Added ethtool functionality to intersect two link masks
together to find the common ground between them. Overhauled i40e to
ensure that the new ethtool API macros are being used, instead of the
old ones. Fixed the usage of unsigned 64-bit division which is not
supported on all architectures.
Sudheer adds support for 25G Active Optical Cables (AOC) and Active Copper
Cables (ACC) PHY types.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Schmidt [Wed, 18 Oct 2017 15:40:18 +0000 (17:40 +0200)]
Merge remote-tracking branch 'net-next/master'
Eric Dumazet [Tue, 17 Oct 2017 02:38:35 +0000 (19:38 -0700)]
tcp: fix tcp_xmit_retransmit_queue() after rbtree introduction
I tried too hard to avoid a call to rb_first() (via tcp_rtx_queue_head)
in tcp_xmit_retransmit_queue(). But this was probably too bold.
Quoting Yuchung:
We might miss re-arming the RTO if tp->retransmit_skb_hint is not NULL.
This can happen when RACK marks the first packet lost again and resets
tp->retransmit_skb_hint for example (tcp_rack_mark_skb_lost())
Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Oct 2017 13:17:11 +0000 (14:17 +0100)]
Merge branch 'bpf-ctx-info-out-of-verifier'
Jakub Kicinski says:
====================
bpf: move context info out of the verifier
Daniel pointed out during the review of my previous patchset that
the knowledge about context doesn't really belong directly in the
verifier. This patch set takes a bit of a drastic approach to
move the info out of there. I want to be able to use different
set of verifier_ops for program analysis. To do that, I have
to first move the test_run callback to a separate structure. Then
verifier ops can be declared in the verifier directly and
different sets can be picked for verification vs analysis.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 16 Oct 2017 23:40:56 +0000 (16:40 -0700)]
bpf: allow access to skb->len from offloads
Since we are now doing strict checking of what offloads
may access, make sure skb->len is on that list.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 16 Oct 2017 23:40:55 +0000 (16:40 -0700)]
bpf: move knowledge about post-translation offsets out of verifier
Use the fact that verifier ops are now separate from program
ops to define a separate set of callbacks for verification of
already translated programs.
Since we expect the analyzer ops to be defined only for
a small subset of all program types initialize their array
by hand (don't use linux/bpf_types.h).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 16 Oct 2017 23:40:54 +0000 (16:40 -0700)]
bpf: remove the verifier ops from program structure
Since the verifier ops don't have to be associated with
the program for its entire lifetime we can move it to
verifier's struct bpf_verifier_env.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 16 Oct 2017 23:40:53 +0000 (16:40 -0700)]
bpf: split verifier and program ops
struct bpf_verifier_ops contains both verifier ops and operations
used later during program's lifetime (test_run). Split the runtime
ops into a different structure.
BPF_PROG_TYPE() will now append ## _prog_ops or ## _verifier_ops
to the names.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
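The "## _prog_ops" and "## _verifier_ops" suffixes mentioned above are preprocessor token pasting. The following standalone sketch shows the mechanism with a list macro. All type, enum and table names in it are hypothetical; the kernel expands BPF_PROG_TYPE() by including a header rather than by using a list macro like this one.

```c
#include <stddef.h>

/* Hypothetical split: runtime ops and verifier ops live in separate structs. */
struct prog_ops {
	const char *desc;
};

struct verifier_ops {
	const char *desc;
};

static const struct prog_ops sk_filter_prog_ops = { "sk_filter runtime ops" };
static const struct verifier_ops sk_filter_verifier_ops = { "sk_filter verifier ops" };
static const struct prog_ops xdp_prog_ops = { "xdp runtime ops" };
static const struct verifier_ops xdp_verifier_ops = { "xdp verifier ops" };

enum prog_type { PROG_TYPE_SK_FILTER, PROG_TYPE_XDP, PROG_TYPE_MAX };

/* One list of program types, expanded twice with different PROG_TYPE() bodies. */
#define PROG_TYPE_LIST \
	PROG_TYPE(PROG_TYPE_SK_FILTER, sk_filter) \
	PROG_TYPE(PROG_TYPE_XDP, xdp)

/* First expansion: paste "_prog_ops" onto each base name. */
#define PROG_TYPE(id, name) [id] = &name ## _prog_ops,
static const struct prog_ops *const prog_ops_table[PROG_TYPE_MAX] = {
	PROG_TYPE_LIST
};
#undef PROG_TYPE

/* Second expansion: paste "_verifier_ops" onto the same base names. */
#define PROG_TYPE(id, name) [id] = &name ## _verifier_ops,
static const struct verifier_ops *const verifier_ops_table[PROG_TYPE_MAX] = {
	PROG_TYPE_LIST
};
#undef PROG_TYPE
```

Because both tables are generated from the same list, adding a program type in one place keeps the runtime and verifier tables in step.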
David Ahern [Mon, 16 Oct 2017 22:32:07 +0000 (15:32 -0700)]
tcp: Check daddr_cache before use in tracepoint
Running perf in one window to capture tcp_retransmit_skb tracepoint:
$ perf record -e tcp:tcp_retransmit_skb -a
And causing a retransmission on an active TCP session (e.g., dropping
packets in the receiver, changing MTU on the interface to 500 and back
to 1500) triggers a panic:
[ 58.543144] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 58.545300] IP: perf_trace_tcp_retransmit_skb+0xd0/0x145
[ 58.546770] PGD 0 P4D 0
[ 58.547472] Oops: 0000 [#1] SMP
[ 58.548328] Modules linked in: vrf
[ 58.549262] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc4+ #26
[ 58.551004] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 58.554560] task: ffffffff81a0e540 task.stack: ffffffff81a00000
[ 58.555817] RIP: 0010:perf_trace_tcp_retransmit_skb+0xd0/0x145
[ 58.557137] RSP: 0018:ffff88003fc03d68 EFLAGS: 00010282
[ 58.558292] RAX: 0000000000000000 RBX: ffffe8ffffc0ec80 RCX: ffff880038543098
[ 58.559850] RDX: 0400000000000000 RSI: ffff88003fc03d70 RDI: ffff88003fc14b68
[ 58.561099] RBP: ffff88003fc03da8 R08: 0000000000000000 R09: ffffea0000d3224a
[ 58.562005] R10: ffff88003fc03db8 R11: 0000000000000010 R12: ffff8800385428c0
[ 58.562930] R13: ffffe8ffffc0e478 R14: ffffffff81a93a40 R15: ffff88003d4f0c00
[ 58.563845] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
[ 58.564873] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 58.565613] CR2: 0000000000000008 CR3: 000000003d68f004 CR4: 00000000000606f0
[ 58.566538] Call Trace:
[ 58.566865] <IRQ>
[ 58.567140] __tcp_retransmit_skb+0x4ab/0x4c6
[ 58.567704] ? tcp_set_ca_state+0x22/0x3f
[ 58.568231] tcp_retransmit_skb+0x14/0xa3
[ 58.568754] tcp_retransmit_timer+0x472/0x5e3
[ 58.569324] ? tcp_write_timer_handler+0x1e9/0x1e9
[ 58.569946] tcp_write_timer_handler+0x95/0x1e9
[ 58.570548] tcp_write_timer+0x2a/0x58
Check that daddr_cache is non-NULL before de-referencing.
Fixes: e086101b150a ("tcp: add a tracepoint for tcp retransmission")
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Mon, 16 Oct 2017 21:53:16 +0000 (16:53 -0500)]
net: ipx: mark expected switch fall-through
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Mon, 16 Oct 2017 21:36:52 +0000 (16:36 -0500)]
ipv6: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in some cases I placed the "fall through" comment
on its own line, which is what GCC is expecting to find.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 16 Oct 2017 21:24:02 +0000 (14:24 -0700)]
tcp: Use pI6c in tcp tracepoint
The compact form for IPv6 addresses is more user friendly than the full
version. For example:
compact: 2001:db8:1::1
full: 2001:0db8:0001:0000:0000:0000:0000:0001
Update the tcp tracepoint to show the compact form.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
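The two notations can be compared from userspace as well. The sketch below is not the tracepoint code: it uses the POSIX inet_ntop() call as a stand-in for the kernel's compact format specifier, and the helper names ipv6_full and ipv6_compact are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>

/* Full form: eight zero-padded 16-bit groups. Returns 0 on success, -1 if buf is too small. */
static int ipv6_full(const struct in6_addr *addr, char *buf, size_t len)
{
	const unsigned char *b = addr->s6_addr;
	size_t off = 0;
	int i;

	for (i = 0; i < 8; i++) {
		int n = snprintf(buf + off, len - off, "%s%02x%02x",
				 i ? ":" : "", b[2 * i], b[2 * i + 1]);

		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	return 0;
}

/* Compact form: leading zeros dropped and the longest zero run collapsed to "::". */
static int ipv6_compact(const struct in6_addr *addr, char *buf, size_t len)
{
	return inet_ntop(AF_INET6, addr, buf, (socklen_t)len) ? 0 : -1;
}
```

For the address used in the example above, the compact form is 13 characters against 39 for the full form, which is most of the readability gain in a trace line.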
Gustavo A. R. Silva [Mon, 16 Oct 2017 20:48:55 +0000 (15:48 -0500)]
ipv4: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Notice that in some cases I placed the "fall through" comment
on its own line, which is what GCC is expecting to find.
Addresses-Coverity-ID: 115108
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gustavo A. R. Silva [Mon, 16 Oct 2017 20:11:22 +0000 (15:11 -0500)]
decnet: af_decnet: mark expected switch fall-throughs
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 18 Oct 2017 12:44:47 +0000 (13:44 +0100)]
Merge branch 'DSA-DPAA'
Madalin Bucur says:
====================
adapt DPAA drivers for DSA
Junote Cai reported that he was not able to get a DSA setup involving the
DPAA/FMAN driver to work and narrowed it down to of_find_net_device_by_node()
call in DSA setup. The initial attempt to fix this by adding of_node to the
platform device results in a second, failed, probing of the FMan MAC driver
against the new platform device created for the DPAA Ethernet driver.
Solve these issues by removing the of_node pointer from the platform device
and changing the net_dev dev to the of_device dev to ensure the DSA init
will be able to find the DPAA net_dev using of_find_net_device_by_node().
Several changes were required to enable this solution: refactoring the
adjust_link (which also resulted in less, cleaner code) and renaming the fman
kernel modules to keep the legacy udev rules happy.
Changes in v2:
- fix issue on error path in "dpaa_eth: change device used" patch
- cleanup the dpaa_eth_probe() error paths
Changes in v3:
- remove obsolete comment in moved code
- add explanation for module rename
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 16 Oct 2017 18:36:10 +0000 (21:36 +0300)]
dpaa_eth: remove obsolete comment
The comment has not been valid for a long time now.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 16 Oct 2017 18:36:09 +0000 (21:36 +0300)]
fsl/fman: add dpaa in module names
This change just renames the FMan driver modules, using a common prefix
for the DPAA FMan and DPAA Ethernet drivers. Besides making the names more
aligned, this allows writing udev rules that match on either driver name,
if needed, using the fsl_dpaa_* prefix. The change of netdev dev required
for the DSA probing makes the previous rules written using this prefix
fail; this change makes them work again, ensuring backwards compatibility
for their users.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 16 Oct 2017 18:36:08 +0000 (21:36 +0300)]
dpaa_eth: cleanup dpaa_eth_probe() error paths
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 16 Oct 2017 18:36:07 +0000 (21:36 +0300)]
dpaa_eth: change device used
Change device used for DMA mapping to the MAC device that is an
of_device, with proper DMA ops. Using this device for the netdevice
should also address the issue with DSA scenarios that need the
netdevice to be backed by an of_device.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 16 Oct 2017 18:36:06 +0000 (21:36 +0300)]
dpaa_eth: move of_phy_connect() to the eth driver
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>