Petr Machata [Wed, 5 Sep 2018 10:16:00 +0000 (12:16 +0200)]
mlxsw: spectrum_buffers: Set up a dedicated pool for BUM traffic
MC-aware mode was recently enabled by mlxsw on Spectrum switches in
commit
7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw
ports"). Unfortunately, testing has shown that the fix is incomplete and
in the presented form actually makes the problem even worse, because any
amount of MC traffic causes UC disruption.
The reason for this is that currently, mlxsw configures the MC-specific
TCs (8..15) to map to pool 0. It also configures a maximum buffer size
of 0, but for MC traffic that maximum is disregarded and not part of the
quota. Therefore MC traffic is always admitted to the egress buffer.
Fix the configuration by directing the MC TCs into pool 15, which is
dedicated to MC traffic and recognized as such by the silicon.
Fixes: 7b8195306694 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 4 Sep 2018 19:45:11 +0000 (12:45 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Must perform TXQ teardown before unregistering interfaces in
mac80211, from Toke Høiland-Jørgensen.
2) Don't allow creating mac80211_hwsim with less than one channel, from
Johannes Berg.
3) Division by zero in cfg80211, fix from Johannes Berg.
4) Fix endian issue in tipc, from Haiqing Bai.
5) BPF sockmap use-after-free fixes from Daniel Borkmann.
6) Spectre-v1 in mac80211_hwsim, from Jinbum Park.
7) Missing rhashtable_walk_exit() in tipc, from Cong Wang.
8) Revert kvzalloc() conversion of AF_PACKET, it breaks mmap() when
kvzalloc() tries to use kmalloc() pages. From Eric Dumazet.
9) Fix deadlock in hv_netvsc, from Dexuan Cui.
10) Do not restart timewait timer on RST, from Florian Westphal.
11) Fix double lwstate refcount grab in ipv6, from Alexey Kodanev.
12) Unsolicit report count handling is off-by-one, fix from Hangbin Liu.
13) Sleep-in-atomic in cadence driver, from Jia-Ju Bai.
14) Respect ttl-inherit in ip6 tunnel driver, from Hangbin Liu.
15) Use-after-free in act_ife, fix from Cong Wang.
16) Missing hold to meta module in act_ife, from Vlad Buslov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
net: phy: sfp: Handle unimplemented hwmon limits and alarms
net: sched: action_ife: take reference to meta module
act_ife: fix a potential use-after-free
net/mlx5: Fix SQ offset in QPs with small RQ
tipc: correct spelling errors for tipc_topsrv_queue_evt() comments
tipc: correct spelling errors for struct tipc_bc_base's comment
bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
bnxt_en: Clean up unused functions.
bnxt_en: Fix firmware signaled resource change logic in open.
sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel
sctp: fix invalid reference to the index variable of the iterator
net/ibm/emac: wrong emac_calc_base call was used by typo
net: sched: null actions array pointer before releasing action
vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition
r8169: add support for NCube 8168 network card
ip6_tunnel: respect ttl inherit for ip6tnl
mac80211: shorten the IBSS debug messages
mac80211: don't Tx a deauth frame if the AP forbade Tx
mac80211: Fix station bandwidth setting after channel switch
mac80211: fix a race between restart and CSA flows
...
Andrew Lunn [Tue, 4 Sep 2018 02:23:56 +0000 (04:23 +0200)]
net: phy: sfp: Handle unimplemented hwmon limits and alarms
Not all SFPs implement the registers containing sensor limits and
alarms. Luckily, there is a bit indicating if they are implemented or
not. Add checking for this bit, when deciding if the hwmon attributes
should be visible.
Fixes: 1323061a018a ("net: phy: sfp: Add HWMON support for module sensors")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 3 Sep 2018 21:44:42 +0000 (00:44 +0300)]
net: sched: action_ife: take reference to meta module
Recent refactoring of add_metainfo() caused use_all_metadata() to add
metainfo to ife action metalist without taking reference to module. This
causes warning in module_put called from ife action cleanup function.
Implement add_metainfo_and_get_ops() function that returns with reference
to module taken if metainfo was added successfully, and call it from
use_all_metadata(), instead of calling __add_metainfo() directly.
Example warning:
[ 646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
[ 646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
[ 646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
[ 646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 646.440595] RIP: 0010:module_put+0x1cb/0x230
[ 646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff <0f> 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
[ 646.464997] RSP: 0018:
ffff880354d37068 EFLAGS:
00010286
[ 646.470599] RAX:
0000000000000000 RBX:
ffffffffc0a52518 RCX:
ffffffff8c2668db
[ 646.478118] RDX:
0000000000000003 RSI:
dffffc0000000000 RDI:
ffffffffc0a52518
[ 646.485641] RBP:
ffffffffc0a52180 R08:
fffffbfff814a4a4 R09:
fffffbfff814a4a3
[ 646.493164] R10:
ffffffffc0a5251b R11:
fffffbfff814a4a4 R12:
1ffff1006a9a6e0d
[ 646.500687] R13:
00000000ffffffff R14:
ffff880362bab890 R15:
dead000000000100
[ 646.508213] FS:
00007f4164c99800(0000) GS:
ffff88036fe40000(0000) knlGS:
0000000000000000
[ 646.516961] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 646.523080] CR2:
00007f41638b8420 CR3:
0000000351df0004 CR4:
00000000001606e0
[ 646.530595] Call Trace:
[ 646.533408] ? find_symbol_in_section+0x260/0x260
[ 646.538509] tcf_ife_cleanup+0x11b/0x200 [act_ife]
[ 646.543695] tcf_action_cleanup+0x29/0xa0
[ 646.548078] __tcf_action_put+0x5a/0xb0
[ 646.552289] ? nla_put+0x65/0xe0
[ 646.555889] __tcf_idr_release+0x48/0x60
[ 646.560187] tcf_generic_walker+0x448/0x6b0
[ 646.564764] ? tcf_action_dump_1+0x450/0x450
[ 646.569411] ? __lock_is_held+0x84/0x110
[ 646.573720] ? tcf_ife_walker+0x10c/0x20f [act_ife]
[ 646.578982] tca_action_gd+0x972/0xc40
[ 646.583129] ? tca_get_fill.constprop.17+0x250/0x250
[ 646.588471] ? mark_lock+0xcf/0x980
[ 646.592324] ? check_chain_key+0x140/0x1f0
[ 646.596832] ? debug_show_all_locks+0x240/0x240
[ 646.601839] ? memset+0x1f/0x40
[ 646.605350] ? nla_parse+0xca/0x1a0
[ 646.609217] tc_ctl_action+0x215/0x230
[ 646.613339] ? tcf_action_add+0x220/0x220
[ 646.617748] rtnetlink_rcv_msg+0x56a/0x6d0
[ 646.622227] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.626466] netlink_rcv_skb+0x18d/0x200
[ 646.630752] ? rtnl_fdb_del+0x3f0/0x3f0
[ 646.634959] ? netlink_ack+0x500/0x500
[ 646.639106] netlink_unicast+0x2d0/0x370
[ 646.643409] ? netlink_attachskb+0x340/0x340
[ 646.648050] ? _copy_from_iter_full+0xe9/0x3e0
[ 646.652870] ? import_iovec+0x11e/0x1c0
[ 646.657083] netlink_sendmsg+0x3b9/0x6a0
[ 646.661388] ? netlink_unicast+0x370/0x370
[ 646.665877] ? netlink_unicast+0x370/0x370
[ 646.670351] sock_sendmsg+0x6b/0x80
[ 646.674212] ___sys_sendmsg+0x4a1/0x520
[ 646.678443] ? copy_msghdr_from_user+0x210/0x210
[ 646.683463] ? lock_downgrade+0x320/0x320
[ 646.687849] ? debug_show_all_locks+0x240/0x240
[ 646.692760] ? do_raw_spin_unlock+0xa2/0x130
[ 646.697418] ? _raw_spin_unlock+0x24/0x30
[ 646.701798] ? __handle_mm_fault+0x1819/0x1c10
[ 646.706619] ? __pmd_alloc+0x320/0x320
[ 646.710738] ? debug_show_all_locks+0x240/0x240
[ 646.715649] ? restore_nameidata+0x7b/0xa0
[ 646.720117] ? check_chain_key+0x140/0x1f0
[ 646.724590] ? check_chain_key+0x140/0x1f0
[ 646.729070] ? __fget_light+0xbc/0xd0
[ 646.733121] ? __sys_sendmsg+0xd7/0x150
[ 646.737329] __sys_sendmsg+0xd7/0x150
[ 646.741359] ? __ia32_sys_shutdown+0x30/0x30
[ 646.746003] ? up_read+0x53/0x90
[ 646.749601] ? __do_page_fault+0x484/0x780
[ 646.754105] ? do_syscall_64+0x1e/0x2c0
[ 646.758320] do_syscall_64+0x72/0x2c0
[ 646.762353] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 646.767776] RIP: 0033:0x7f4163872150
[ 646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 646.791474] RSP: 002b:
00007ffdef7d6b58 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
[ 646.799721] RAX:
ffffffffffffffda RBX:
0000000000000024 RCX:
00007f4163872150
[ 646.807240] RDX:
0000000000000000 RSI:
00007ffdef7d6bd0 RDI:
0000000000000003
[ 646.814760] RBP:
000000005b8b9482 R08:
0000000000000001 R09:
0000000000000000
[ 646.822286] R10:
00000000000005e7 R11:
0000000000000246 R12:
00007ffdef7dad20
[ 646.829807] R13:
0000000000000000 R14:
0000000000000000 R15:
0000000000679bc0
[ 646.837360] irq event stamp: 6083
[ 646.841043] hardirqs last enabled at (6081): [<
ffffffff8c220a7d>] __call_rcu+0x17d/0x500
[ 646.849882] hardirqs last disabled at (6083): [<
ffffffff8c004f06>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 646.859775] softirqs last enabled at (5968): [<
ffffffff8d4004a1>] __do_softirq+0x4a1/0x6ee
[ 646.868784] softirqs last disabled at (6082): [<
ffffffffc0a78759>] tcf_ife_cleanup+0x39/0x200 [act_ife]
[ 646.878845] ---[ end trace
b1b8c12ffe51e657 ]---
Fixes: 5ffe57da29b3 ("act_ife: fix a potential deadlock")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Mon, 3 Sep 2018 18:08:15 +0000 (11:08 -0700)]
act_ife: fix a potential use-after-free
Immediately after module_put(), user could delete this
module, so e->ops could be already freed before we call
e->ops->release().
Fix this by moving module_put() after ops->release().
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tariq Toukan [Mon, 3 Sep 2018 15:06:24 +0000 (18:06 +0300)]
net/mlx5: Fix SQ offset in QPs with small RQ
Correct the formula for calculating the RQ page remainder,
which should be in byte granularity. The result will be
non-zero only for RQs smaller than PAGE_SIZE, as an RQ size
is a power of 2.
Divide this by the SQ stride (MLX5_SEND_WQE_BB) to get the
SQ offset in strides granularity.
Fixes: d7037ad73daa ("net/mlx5: Fix QP fragmented buffer allocation")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 4 Sep 2018 05:12:02 +0000 (22:12 -0700)]
Merge tag 'mac80211-for-davem-2018-09-03' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Here are quite a large number of fixes, notably:
* various A-MSDU building fixes (currently only affects mt76)
* syzkaller & spectre fixes in hwsim
* TXQ vs. teardown fix that was causing crashes
* embed WMM info in reg rule, bad code here had been causing crashes
* one compilation issue with fix from Arnd (rfkill-gpio includes)
* fixes for a race and bad data during/after channel switch
* nl80211: a validation fix, attribute type & unit fixes
along with other small fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhenbo Gao [Mon, 3 Sep 2018 08:36:46 +0000 (16:36 +0800)]
tipc: correct spelling errors for tipc_topsrv_queue_evt() comments
tipc_conn_queue_evt -> tipc_topsrv_queue_evt
tipc_send_work -> tipc_conn_send_work
tipc_send_to_sock -> tipc_conn_send_to_sock
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhenbo Gao [Mon, 3 Sep 2018 08:36:45 +0000 (16:36 +0800)]
tipc: correct spelling errors for struct tipc_bc_base's comment
Trivial fix for two spelling mistakes.
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com>
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 4 Sep 2018 04:59:43 +0000 (21:59 -0700)]
Merge branch 'bnxt_en-Bug-fixes'
Michael Chan says:
====================
bnxt_en: Bug fixes.
This short series fixes resource related logic in the driver, mostly
affecting the RDMA driver under corner cases.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 3 Sep 2018 08:23:19 +0000 (04:23 -0400)]
bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
Currently, the driver adjusts the bp->hw_resc.max_cp_rings by the number
of MSIX vectors used by RDMA. There is one code path in open that needs
to check the true max_cp_rings including any used by RDMA. This code
is now checking for the reduced max_cp_rings which will fail when the
number of cp rings is very small.
To fix this in a clean way, we don't adjust max_cp_rings anymore.
Instead, we add a helper bnxt_get_max_func_cp_rings_for_en() to get the
reduced max_cp_rings when appropriate.
Fixes: ec86f14ea506 ("bnxt_en: Add ULP calls to stop and restart IRQs.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 3 Sep 2018 08:23:18 +0000 (04:23 -0400)]
bnxt_en: Clean up unused functions.
Remove unused bnxt_subtract_ulp_resources(). Change
bnxt_get_max_func_irqs() to static since it is only locally used.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 3 Sep 2018 08:23:17 +0000 (04:23 -0400)]
bnxt_en: Fix firmware signaled resource change logic in open.
When the driver detects that resources have changed during open, it
should reset the rx and tx rings to 0. This will properly setup the
init sequence to initialize the default rings again. We also need
to signal the RDMA driver to stop and clear its interrupts. We then
call the RoCE driver to restart if a new set of default rings is
successfully reserved.
Fixes: 25e1acd6b92b ("bnxt_en: Notify firmware about IF state changes.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 4 Sep 2018 04:57:55 +0000 (21:57 -0700)]
Merge branch 'sctp-two-fixes-for-spp_ipv6_flowlabel-and-spp_dscp-sockopts'
Xin Long says:
====================
sctp: two fixes for spp_ipv6_flowlabel and spp_dscp sockopts
This patchset fixes two problems in sctp_apply_peer_addr_params()
when setting spp_ipv6_flowlabel or spp_dscp.
====================
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 3 Sep 2018 07:47:11 +0000 (15:47 +0800)]
sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel
When users set params.spp_address and get a trans, ipv6_flowlabel flag
should be applied into this trans. But even if this one is not an ipv6
trans, it should not go to apply it into all other transes of the asoc
but simply ignore it.
Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 3 Sep 2018 07:47:10 +0000 (15:47 +0800)]
sctp: fix invalid reference to the index variable of the iterator
Now in sctp_apply_peer_addr_params(), if SPP_IPV6_FLOWLABEL flag is set
and trans is NULL, it would use trans as the index variable to traverse
transport_addr_list, then trans is set as the last transport of it.
Later, if SPP_DSCP flag is set, it would enter into the wrong branch as
trans is actually an invalid reference.
So fix it by using a new index variable to traverse transport_addr_list
for both SPP_DSCP and SPP_IPV6_FLOWLABEL flags process.
Fixes: 0b0dce7a36fb ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams")
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Mikhaylov [Mon, 3 Sep 2018 07:26:28 +0000 (10:26 +0300)]
net/ibm/emac: wrong emac_calc_base call was used by typo
__emac_calc_base_mr1 was used instead of __emac4_calc_base_mr1
by copy-paste mistake for emac4syn.
Fixes: 45d6e545505fd32edb812f085be7de45b6a5c0af ("net/ibm/emac: add 8192 rx/tx fifo size")
Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Mon, 3 Sep 2018 07:04:55 +0000 (10:04 +0300)]
net: sched: null actions array pointer before releasing action
Currently, tcf_action_delete() nulls actions array pointer after putting
and deleting it. However, if tcf_idr_delete_index() returns an error,
pointer to action is not set to null. That results it being released second
time in error handling code of tca_action_gd().
Kasan error:
[ 807.367755] ==================================================================
[ 807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250
[ 807.382763] Read of size 8 at addr
ffff88033e636000 by task tc/2732
[ 807.391289] CPU: 0 PID: 2732 Comm: tc Tainted: G W 4.19.0-rc1+ #799
[ 807.399542] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[ 807.407948] Call Trace:
[ 807.410763] dump_stack+0x92/0xeb
[ 807.414456] print_address_description+0x70/0x360
[ 807.419549] kasan_report+0x14d/0x300
[ 807.423582] ? tc_setup_cb_call+0x14e/0x250
[ 807.428150] tc_setup_cb_call+0x14e/0x250
[ 807.432539] ? nla_put+0x65/0xe0
[ 807.436146] fl_dump+0x394/0x3f0 [cls_flower]
[ 807.440890] ? fl_tmplt_dump+0x140/0x140 [cls_flower]
[ 807.446327] ? lock_downgrade+0x320/0x320
[ 807.450702] ? lock_acquire+0xe2/0x220
[ 807.454819] ? is_bpf_text_address+0x5/0x140
[ 807.459475] ? memcpy+0x34/0x50
[ 807.462980] ? nla_put+0x65/0xe0
[ 807.466582] tcf_fill_node+0x341/0x430
[ 807.470717] ? tcf_block_put+0xe0/0xe0
[ 807.474859] tcf_node_dump+0xdb/0xf0
[ 807.478821] fl_walk+0x8e/0x170 [cls_flower]
[ 807.483474] tcf_chain_dump+0x35a/0x4d0
[ 807.487703] ? tfilter_notify+0x170/0x170
[ 807.492091] ? tcf_fill_node+0x430/0x430
[ 807.496411] tc_dump_tfilter+0x362/0x3f0
[ 807.500712] ? tc_del_tfilter+0x850/0x850
[ 807.505104] ? kasan_unpoison_shadow+0x30/0x40
[ 807.509940] ? __mutex_unlock_slowpath+0xcf/0x410
[ 807.515031] netlink_dump+0x263/0x4f0
[ 807.519077] __netlink_dump_start+0x2a0/0x300
[ 807.523817] ? tc_del_tfilter+0x850/0x850
[ 807.528198] rtnetlink_rcv_msg+0x46a/0x6d0
[ 807.532671] ? rtnl_fdb_del+0x3f0/0x3f0
[ 807.536878] ? tc_del_tfilter+0x850/0x850
[ 807.541280] netlink_rcv_skb+0x18d/0x200
[ 807.545570] ? rtnl_fdb_del+0x3f0/0x3f0
[ 807.549773] ? netlink_ack+0x500/0x500
[ 807.553913] netlink_unicast+0x2d0/0x370
[ 807.558212] ? netlink_attachskb+0x340/0x340
[ 807.562855] ? _copy_from_iter_full+0xe9/0x3e0
[ 807.567677] ? import_iovec+0x11e/0x1c0
[ 807.571890] netlink_sendmsg+0x3b9/0x6a0
[ 807.576192] ? netlink_unicast+0x370/0x370
[ 807.580684] ? netlink_unicast+0x370/0x370
[ 807.585154] sock_sendmsg+0x6b/0x80
[ 807.589015] ___sys_sendmsg+0x4a1/0x520
[ 807.593230] ? copy_msghdr_from_user+0x210/0x210
[ 807.598232] ? do_wp_page+0x174/0x880
[ 807.602276] ? __handle_mm_fault+0x749/0x1c10
[ 807.607021] ? __handle_mm_fault+0x1046/0x1c10
[ 807.611849] ? __pmd_alloc+0x320/0x320
[ 807.615973] ? check_chain_key+0x140/0x1f0
[ 807.620450] ? check_chain_key+0x140/0x1f0
[ 807.624929] ? __fget_light+0xbc/0xd0
[ 807.628970] ? __sys_sendmsg+0xd7/0x150
[ 807.633172] __sys_sendmsg+0xd7/0x150
[ 807.637201] ? __ia32_sys_shutdown+0x30/0x30
[ 807.641846] ? up_read+0x53/0x90
[ 807.645442] ? __do_page_fault+0x484/0x780
[ 807.649949] ? do_syscall_64+0x1e/0x2c0
[ 807.654164] do_syscall_64+0x72/0x2c0
[ 807.658198] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.663625] RIP: 0033:0x7f42e9870150
[ 807.667568] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
[ 807.687328] RSP: 002b:
00007ffdbf595b58 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
[ 807.695564] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f42e9870150
[ 807.703083] RDX:
0000000000000000 RSI:
00007ffdbf595b80 RDI:
0000000000000003
[ 807.710605] RBP:
00007ffdbf599d90 R08:
0000000000679bc0 R09:
000000000000000f
[ 807.718127] R10:
00000000000005e7 R11:
0000000000000246 R12:
00007ffdbf599d88
[ 807.725651] R13:
0000000000000000 R14:
0000000000000000 R15:
0000000000000000
[ 807.735048] Allocated by task 2687:
[ 807.738902] kasan_kmalloc+0xa0/0xd0
[ 807.742852] __kmalloc+0x118/0x2d0
[ 807.746615] tcf_idr_create+0x44/0x320
[ 807.750738] tcf_nat_init+0x41e/0x530 [act_nat]
[ 807.755638] tcf_action_init_1+0x4e0/0x650
[ 807.760104] tcf_action_init+0x1ce/0x2d0
[ 807.764395] tcf_exts_validate+0x1d8/0x200
[ 807.768861] fl_change+0x55a/0x26b4 [cls_flower]
[ 807.773845] tc_new_tfilter+0x748/0xa20
[ 807.778051] rtnetlink_rcv_msg+0x56a/0x6d0
[ 807.782517] netlink_rcv_skb+0x18d/0x200
[ 807.786804] netlink_unicast+0x2d0/0x370
[ 807.791095] netlink_sendmsg+0x3b9/0x6a0
[ 807.795387] sock_sendmsg+0x6b/0x80
[ 807.799240] ___sys_sendmsg+0x4a1/0x520
[ 807.803445] __sys_sendmsg+0xd7/0x150
[ 807.807473] do_syscall_64+0x72/0x2c0
[ 807.811506] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.818776] Freed by task 2728:
[ 807.822283] __kasan_slab_free+0x122/0x180
[ 807.826752] kfree+0xf4/0x2f0
[ 807.830080] __tcf_action_put+0x5a/0xb0
[ 807.834281] tcf_action_put_many+0x46/0x70
[ 807.838747] tca_action_gd+0x232/0xc40
[ 807.842862] tc_ctl_action+0x215/0x230
[ 807.846977] rtnetlink_rcv_msg+0x56a/0x6d0
[ 807.851444] netlink_rcv_skb+0x18d/0x200
[ 807.855731] netlink_unicast+0x2d0/0x370
[ 807.860021] netlink_sendmsg+0x3b9/0x6a0
[ 807.864312] sock_sendmsg+0x6b/0x80
[ 807.868166] ___sys_sendmsg+0x4a1/0x520
[ 807.872372] __sys_sendmsg+0xd7/0x150
[ 807.876401] do_syscall_64+0x72/0x2c0
[ 807.880431] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 807.887704] The buggy address belongs to the object at
ffff88033e636000
which belongs to the cache kmalloc-256 of size 256
[ 807.900909] The buggy address is located 0 bytes inside of
256-byte region [
ffff88033e636000,
ffff88033e636100)
[ 807.913155] The buggy address belongs to the page:
[ 807.918322] page:
ffffea000cf98d80 count:1 mapcount:0 mapping:
ffff88036f80ee00 index:0x0 compound_mapcount: 0
[ 807.928831] flags: 0x5fff8000008100(slab|head)
[ 807.933647] raw:
005fff8000008100 ffffea000db44f00 0000000400000004 ffff88036f80ee00
[ 807.942050] raw:
0000000000000000 0000000080190019 00000001ffffffff 0000000000000000
[ 807.950456] page dumped because: kasan: bad access detected
[ 807.958240] Memory state around the buggy address:
[ 807.963405]
ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb
[ 807.971288]
ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[ 807.979166] >
ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 807.994882] ^
[ 807.998477]
ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 808.006352]
ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 808.014230] ==================================================================
[ 808.022108] Disabling lock debugging due to kernel taint
Fixes: edfaf94fa705 ("net_sched: improve and refactor tcf_action_put_many()")
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gleb Fotengauer-Malinovskiy [Mon, 3 Sep 2018 17:59:13 +0000 (20:59 +0300)]
vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition
The _IOC_READ flag fits this ioctl request more because this request
actually only writes to, but doesn't read from userspace.
See NOTEs in include/uapi/asm-generic/ioctl.h for more information.
Fixes: 429711aec282 ("vhost: switch to use new message format")
Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anthony Wong [Fri, 31 Aug 2018 12:06:42 +0000 (20:06 +0800)]
r8169: add support for NCube 8168 network card
This card identifies itself as:
Ethernet controller [0200]: NCube Device [10ff:8168] (rev 06)
Subsystem: TP-LINK Technologies Co., Ltd. Device [7470:3468]
Adding a new entry to rtl8169_pci_tbl makes the card work.
Link: http://launchpad.net/bugs/1788730
Signed-off-by: Anthony Wong <anthony.wong@ubuntu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Fri, 31 Aug 2018 08:52:01 +0000 (16:52 +0800)]
ip6_tunnel: respect ttl inherit for ip6tnl
man ip-tunnel ttl section says:
0 is a special value meaning that packets inherit the TTL value.
IPv4 tunnel respect this in ip_tunnel_xmit(), but IPv6 tunnel has not
implement it yet. To make IPv6 behave consistently with IP tunnel,
add ipv6 tunnel inherit support.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emmanuel Grumbach [Fri, 31 Aug 2018 08:31:13 +0000 (11:31 +0300)]
mac80211: shorten the IBSS debug messages
When tracing is enabled, all the debug messages are recorded and must
not exceed MAX_MSG_LEN (100) columns. Longer debug messages grant the
user with:
WARNING: CPU: 3 PID: 32642 at /tmp/wifi-core-
20180806094828/src/iwlwifi-stack-dev/net/mac80211/./trace_msg.h:32 trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
Workqueue: phy1 ieee80211_iface_work [mac80211]
RIP: 0010:trace_event_raw_event_mac80211_msg_event+0xab/0xc0 [mac80211]
Call Trace:
__sdata_dbg+0xbd/0x120 [mac80211]
ieee80211_ibss_rx_queued_mgmt+0x15f/0x510 [mac80211]
ieee80211_iface_work+0x21d/0x320 [mac80211]
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Emmanuel Grumbach [Fri, 31 Aug 2018 08:31:12 +0000 (11:31 +0300)]
mac80211: don't Tx a deauth frame if the AP forbade Tx
If the driver fails to properly prepare for the channel
switch, mac80211 will disconnect. If the CSA IE had mode
set to 1, it means that the clients are not allowed to send
any Tx on the current channel, and that includes the
deauthentication frame.
Make sure that we don't send the deauthentication frame in
this case.
In iwlwifi, this caused a failure to flush queues since the
firmware already closed the queues after having parsed the
CSA IE. Then mac80211 would wait until the deauthentication
frame would go out (drv_flush(drop=false)) and that would
never happen.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ilan Peer [Fri, 31 Aug 2018 08:31:10 +0000 (11:31 +0300)]
mac80211: Fix station bandwidth setting after channel switch
When performing a channel switch flow for a managed interface, the
flow did not update the bandwidth of the AP station and the rate
scale algorithm. In case of a channel width downgrade, this would
result with the rate scale algorithm using a bandwidth that does not
match the interface channel configuration.
Fix this by updating the AP station bandwidth and rate scaling algorithm
before the actual channel change in case of a bandwidth downgrade, or
after the actual channel change in case of a bandwidth upgrade.
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Emmanuel Grumbach [Fri, 31 Aug 2018 08:31:06 +0000 (11:31 +0300)]
mac80211: fix a race between restart and CSA flows
We hit a problem with iwlwifi that was caused by a bug in
mac80211. A bug in iwlwifi caused the firwmare to crash in
certain cases in channel switch. Because of that bug,
drv_pre_channel_switch would fail and trigger the restart
flow.
Now we had the hw restart worker which runs on the system's
workqueue and the csa_connection_drop_work worker that runs
on mac80211's workqueue that can run together. This is
obviously problematic since the restart work wants to
reconfigure the connection, while the csa_connection_drop_work
worker does the exact opposite: it tries to disconnect.
Fix this by cancelling the csa_connection_drop_work worker
in the restart worker.
Note that this can sound racy: we could have:
driver iface_work CSA_work restart_work
+++++++++++++++++++++++++++++++++++++++++++++
|
<--drv_cs ---|
<FW CRASH!>
-CS FAILED-->
| |
| cancel_work(CSA)
schedule |
CSA work |
| |
Race between those 2
But this is not possible because we flush the workqueue
in the restart worker before we cancel the CSA worker.
That would be bullet proof if we could guarantee that
we schedule the CSA worker only from the iface_work
which runs on the workqueue (and not on the system's
workqueue), but unfortunately we do have an instance
in which we schedule the CSA work outside the context
of the workqueue (ieee80211_chswitch_done).
Note also that we should probably cancel other workers
like beacon_connection_loss_work and possibly others
for different types of interfaces, at the very least,
IBSS should suffer from the exact same problem, but for
now, do the minimum to fix the actual bug that was actually
experienced and reproduced.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Dreyfuss, Haim [Fri, 31 Aug 2018 08:31:04 +0000 (11:31 +0300)]
mac80211: fix WMM TXOP calculation
In commit
9236c4523e5b ("mac80211: limit wmm params to comply
with ETSI requirements"), we have limited the WMM parameters to
comply with 802.11 and ETSI standard. Mistakenly the TXOP value
was caluclated wrong. Fix it by taking the minimum between
802.11 to ETSI to make sure we are not violating both.
Fixes: e552af058148 ("mac80211: limit wmm params to comply with ETSI requirements")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Dan Carpenter [Fri, 31 Aug 2018 08:10:55 +0000 (11:10 +0300)]
cfg80211: fix a type issue in ieee80211_chandef_to_operating_class()
The "chandef->center_freq1" variable is a u32 but "freq" is a u16 so we
are truncating away the high bits. I noticed this bug because in commit
9cf0a0b4b64a ("cfg80211: Add support for 60GHz band channels 5 and 6")
we made "freq <= 56160 + 2160 * 6" a valid requency when before it was
only "freq <= 56160 + 2160 * 4" that was valid. It introduces a static
checker warning:
net/wireless/util.c:1571 ieee80211_chandef_to_operating_class()
warn: always true condition '(freq <= 56160 + 2160 * 6) => (0-u16max <= 69120)'
But really we probably shouldn't have been truncating the high bits
away to begin with.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Lorenzo Bianconi [Thu, 30 Aug 2018 23:04:13 +0000 (01:04 +0200)]
mac80211: fix an off-by-one issue in A-MSDU max_subframe computation
Initialize 'n' to 2 in order to take into account also the first
packet in the estimation of max_subframe limit for a given A-MSDU
since frag_tail pointer is NULL when ieee80211_amsdu_aggregate
routine analyzes the second frame.
Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Linus Torvalds [Mon, 3 Sep 2018 03:09:36 +0000 (20:09 -0700)]
Merge tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
"A few fixes for the fallout of being a little more pedantic about dma
masks"
* tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping:
of/platform: initialise AMBA default DMA masks
sparc: set a default 32-bit dma mask for OF devices
kernel/dma/direct: take DMA offset into account in dma_direct_supported
Vinson Lee [Sat, 1 Sep 2018 21:20:27 +0000 (21:20 +0000)]
uapi: Fix linux/rds.h userspace compilation errors.
Include linux/in6.h for struct in6_addr.
/usr/include/linux/rds.h:156:18: error: field ‘laddr’ has incomplete type
struct in6_addr laddr;
^~~~~
/usr/include/linux/rds.h:157:18: error: field ‘faddr’ has incomplete type
struct in6_addr faddr;
^~~~~
/usr/include/linux/rds.h:178:18: error: field ‘laddr’ has incomplete type
struct in6_addr laddr;
^~~~~
/usr/include/linux/rds.h:179:18: error: field ‘faddr’ has incomplete type
struct in6_addr faddr;
^~~~~
/usr/include/linux/rds.h:198:18: error: field ‘bound_addr’ has incomplete type
struct in6_addr bound_addr;
^~~~~~~~~~
/usr/include/linux/rds.h:199:18: error: field ‘connected_addr’ has incomplete type
struct in6_addr connected_addr;
^~~~~~~~~~~~~~
/usr/include/linux/rds.h:219:18: error: field ‘local_addr’ has incomplete type
struct in6_addr local_addr;
^~~~~~~~~~
/usr/include/linux/rds.h:221:18: error: field ‘peer_addr’ has incomplete type
struct in6_addr peer_addr;
^~~~~~~~~
/usr/include/linux/rds.h:245:18: error: field ‘src_addr’ has incomplete type
struct in6_addr src_addr;
^~~~~~~~
/usr/include/linux/rds.h:246:18: error: field ‘dst_addr’ has incomplete type
struct in6_addr dst_addr;
^~~~~~~~
Fixes: b7ff8b1036f0 ("rds: Extend RDS API for IPv6 support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jia-Ju Bai [Sat, 1 Sep 2018 12:11:05 +0000 (20:11 +0800)]
net: cadence: Fix a sleep-in-atomic-context bug in macb_halt_tx()
The kernel module may sleep with holding a spinlock.
The function call paths (from bottom to top) in Linux-4.16 are:
[FUNC] usleep_range
drivers/net/ethernet/cadence/macb_main.c, 648:
usleep_range in macb_halt_tx
drivers/net/ethernet/cadence/macb_main.c, 730:
macb_halt_tx in macb_tx_error_task
drivers/net/ethernet/cadence/macb_main.c, 721:
_raw_spin_lock_irqsave in macb_tx_error_task
To fix this bug, usleep_range() is replaced with udelay().
This bug is found by my static analysis tool DSAC.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 2 Sep 2018 22:53:38 +0000 (15:53 -0700)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2018-09-02
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data()
when linearizing multiple scatterlist elements, from Tushar.
2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is
found on map update where it would release existing ULP. syzbot found and
triggered this couple of times now, fix from John.
3) Add missing xskmap type to bpftool so it will properly show the type
on map dump, from Prashant.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 2 Sep 2018 21:37:30 +0000 (14:37 -0700)]
Linux 4.19-rc2
David Ahern [Thu, 30 Aug 2018 21:15:43 +0000 (14:15 -0700)]
net/ipv6: Only update MTU metric if it set
Jan reported a regression after an update to 4.18.5. In this case ipv6
default route is setup by systemd-networkd based on data from an RA. The
RA contains an MTU of 1492 which is used when the route is first inserted
but then systemd-networkd pushes down updates to the default route
without the mtu set.
Prior to the change to fib6_info, metrics such as MTU were held in the
dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
the value from the metrics is used if it is set and finally falling
back to the idev value.
After the fib6_info change metrics are contained in the fib6_info struct
and there is no equivalent to rt6i_pmtu. To maintain consistency with
the old behavior the new code should only reset the MTU in the metrics
if the route update has it set.
Fixes: d4ead6b34b67 ("net/ipv6: move metrics from dst to rt6_info")
Reported-by: Jan Janssen <medhefgo@web.de>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Wed, 29 Aug 2018 15:00:24 +0000 (08:00 -0700)]
net: ethernet: cpsw-phy-sel: prefer phandle for phy sel
The cpsw-phy-sel device is not a child of the cpsw interconnect target
module. It lives in the system control module.
Let's fix this issue by trying to use cpsw-phy-sel phandle first if it
exists and if not fall back to current usage of trying to find the
cpsw-phy-sel child. That way the phy sel driver can be a child of the
system control module where it belongs in the device tree.
Without this fix, we cannot have a proper interconnect target module
hierarchy in device tree for things like genpd.
Note that deferred probe is mostly not supported by cpsw and this patch
does not attempt to fix that. In case deferred probe support is needed,
this could be added to cpsw_slave_open() and phy_connect() so they start
handling and returning errors.
For documenting it, looks like the cpsw-phy-sel is used for all cpsw device
tree nodes. It's missing the related binding documentation, so let's also
update the binding documentation accordingly.
Cc: devicetree@vger.kernel.org
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Wed, 29 Aug 2018 15:00:23 +0000 (08:00 -0700)]
dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle
The current cpsw usage for cpsw-phy-sel is undocumented but is used for
all the boards using cpsw. And cpsw-phy-sel is not really a child of
the cpsw device, it lives in the system control module instead.
Let's document the existing usage, and improve it a bit where we prefer
to use a phandle instead of a child device for it. That way we can
properly describe the hardware in dts files for things like genpd.
Cc: devicetree@vger.kernel.org
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 2 Sep 2018 20:39:37 +0000 (13:39 -0700)]
Merge branch 'igmp-fix-two-incorrect-unsolicit-report-count-issues'
Hangbin Liu says:
====================
igmp: fix two incorrect unsolicit report count issues
Just like the subject, fix two minor igmp unsolicit report count issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Wed, 29 Aug 2018 10:06:10 +0000 (18:06 +0800)]
igmp: fix incorrect unsolicit report count after link down and up
After link down and up, i.e. when call ip_mc_up(), we doesn't init
im->unsolicit_count. So after igmp_timer_expire(), we will not start
timer again and only send one unsolicit report at last.
Fix it by initializing im->unsolicit_count in igmp_group_added(), so
we can respect igmp robustness value.
Fixes: 24803f38a5c0b ("igmp: do not remove igmp souce list info when set link down")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Wed, 29 Aug 2018 10:06:08 +0000 (18:06 +0800)]
igmp: fix incorrect unsolicit report count when join group
We should not start timer if im->unsolicit_count equal to 0 after decrease.
Or we will send one more unsolicit report message. i.e. 3 instead of 2 by
default.
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 31 Aug 2018 04:25:02 +0000 (21:25 -0700)]
bpf: avoid misuse of psock when TCP_ULP_BPF collides with another ULP
Currently we check sk_user_data is non NULL to determine if the sk
exists in a map. However, this is not sufficient to ensure the psock
or the ULP ops are not in use by another user, such as kcm or TLS. To
avoid this when adding a sock to a map also verify it is of the
correct ULP type. Additionally, when releasing a psock verify that
it is the TCP_ULP_BPF type before releasing the ULP. The error case
where we abort an update due to ULP collision can cause this error
path.
For example,
__sock_map_ctx_update_elem()
[...]
err = tcp_set_ulp_id(sock, TCP_ULP_BPF) <- collides with TLS
if (err) <- so err out here
goto out_free
[...]
out_free:
smap_release_sock() <- calling tcp_cleanup_ulp releases the
TLS ULP incorrectly.
Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Prashant Bhole [Fri, 31 Aug 2018 06:32:42 +0000 (15:32 +0900)]
tools/bpf: bpftool, add xskmap in map types
When listed all maps, bpftool currently shows (null) for xskmap.
Added xskmap type in map_type_name[] to show correct type.
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tushar Dave [Fri, 31 Aug 2018 21:45:16 +0000 (23:45 +0200)]
bpf: Fix bpf_msg_pull_data()
Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while
linearizing multiple scatterlist elements. Variable 'offset' is used
to find first starting scatterlist element
i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset"
Use different variable name while linearizing multiple scatterlist
elements so that value contained in variable 'offset' won't get
overwritten.
Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Linus Torvalds [Sun, 2 Sep 2018 17:56:01 +0000 (10:56 -0700)]
Merge tag 'devicetree-fixes-for-4.19' of git://git./linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"A couple of new helper functions in preparation for some tree wide
clean-ups.
I'm sending these new helpers now for rc2 in order to simplify the
dependencies on subsequent cleanups across the tree in 4.20"
* tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: Add device_type access helper functions
of: add node name compare helper functions
of: add helper to lookup compatible child node
Linus Torvalds [Sun, 2 Sep 2018 17:44:28 +0000 (10:44 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"First batch of fixes post-merge window:
- A handful of devicetree changes for i.MX2{3,8} to change over to
new panel bindings. The platforms were moved from legacy
framebuffers to DRM and some development board panels hadn't yet
been converted.
- OMAP fixes related to ti-sysc driver conversion fallout, fixing
some register offsets, no_console_suspend fixes, etc.
- Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.
- Fixed 0755->0644 permissions on a newly added file.
- Defconfig changes to make ARM Versatile more useful with QEMU
(helps testing).
- Enable defconfig options for new TI SoC platform that was merged
this window (AM6)"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: defconfig: Enable TI's AM6 SoC platform
ARM: defconfig: Update the ARM Versatile defconfig
ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
ARM: dts: imx23-evk: Convert to the new display bindings
ARM: dts: imx23-evk: Move regulators outside simple-bus
ARM: dts: imx28-evk: Convert to the new display bindings
ARM: dts: imx28-evk: Move regulators outside simple-bus
Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"
arm: dts: am4372: setup rtc as system-power-controller
ARM: dts: omap4-droid4: fix vibrations on Droid 4
bus: ti-sysc: Fix no_console_suspend handling
bus: ti-sysc: Fix module register ioremap for larger offsets
ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
ARM: OMAP2+: Fix null hwmod for ti-sysc debug
Linus Torvalds [Sun, 2 Sep 2018 17:11:30 +0000 (10:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Speculation:
- Make the microcode check more robust
- Make the L1TF memory limit depend on the internal cache physical
address space and not on the CPUID advertised physical address
space, which might be significantly smaller. This avoids disabling
L1TF on machines which utilize the full physical address space.
- Fix the GDT mapping for EFI calls on 32bit PTI
- Fix the MCE nospec implementation to prevent #GP
Fixes and robustness:
- Use the proper operand order for LSL in the VDSO
- Prevent NMI uaccess race against CR3 switching
- Add a lockdep check to verify that text_mutex is held in
text_poke() functions
- Repair the fallout of giving native_restore_fl() a prototype
- Prevent kernel memory dumps based on usermode RIP
- Wipe KASAN shadow stack before rewinding the stack to prevent false
positives
- Move the AMS GOTO enforcement to the actual build stage to allow
user API header extraction without a compiler
- Fix a section mismatch introduced by the on demand VDSO mapping
change
Miscellaneous:
- Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pti: Fix section mismatch warning/error
x86/vdso: Fix lsl operand order
x86/mce: Fix set_mce_nospec() to avoid #GP fault
x86/efi: Load fixmap GDT in efi_call_phys_epilog()
x86/nmi: Fix NMI uaccess race against CR3 switching
x86: Allow generating user-space headers without a compiler
x86/dumpstack: Don't dump kernel memory based on usermode RIP
x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
x86/irqflags: Mark native_restore_fl extern inline
x86/build: Remove jump label quirk for GCC older than 4.5.2
x86/Kconfig: Fix trivial typo
x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
x86/spectre: Add missing family 6 check to microcode check
Linus Torvalds [Sun, 2 Sep 2018 17:09:35 +0000 (10:09 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull CPU hotplug fix from Thomas Gleixner:
"Remove the stale skip_onerr member from the hotplug states"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Remove skip_onerr field from cpuhp_step structure
Linus Torvalds [Sun, 2 Sep 2018 16:41:45 +0000 (09:41 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"A small set of updates for core code:
- Prevent tracing in functions which are called from trace patching
via stop_machine() to prevent executing half patched function trace
entries.
- Remove old GCC workarounds
- Remove pointless includes of notifier.h"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Remove workaround for unreachable warnings from old GCC
notifier: Remove notifier header file wherever not used
watchdog: Mark watchdog touch functions as notrace
Randy Dunlap [Sun, 2 Sep 2018 04:01:28 +0000 (21:01 -0700)]
x86/pti: Fix section mismatch warning/error
Fix the section mismatch warning in arch/x86/mm/pti.c:
WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
The function pti_clone_pgtable() references
the function __init pti_user_pagetable_walk_pte().
This is often because pti_clone_pgtable lacks a __init
annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
FATAL: modpost: Section mismatches detected.
Fixes: 85900ea51577 ("x86/pti: Map the vsyscall page if needed")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/43a6d6a3-d69d-5eda-da09-0b1c88215a2a@infradead.org
Linus Walleij [Fri, 31 Aug 2018 14:13:07 +0000 (16:13 +0200)]
of/platform: initialise AMBA default DMA masks
This addresses a v4.19-rc1 regression in the PL111 DRM driver in
drivers/gpu/pl111/*
The driver uses the CMA KMS helpers and will thus at some point call
down to dma_alloc_attrs() to allocate a chunk of contigous DMA memory
for the framebuffer.
It appears that in v4.18, it was OK that this (and other DMA mastering
AMBA devices) left dev->coherent_dma_mask blank (zero).
In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in
dma_alloc_attrs() in include/linux/dma-mapping.h is triggered. The
allocation later fails when get_coherent_dma_mask() is called from
__dma_alloc() and __dma_alloc() returns NULL:
drm-clcd-pl111 dev:20: coherent DMA mask is unset
drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR*
Failed to set fbdev configuration
It turns out that in commit
4d8bde883bfb ("OF: Don't set default
coherent DMA mask") the OF core stops setting the default DMA mask on
new devices, especially those lines of the patch:
- if (!dev->coherent_dma_mask)
- dev->coherent_dma_mask = DMA_BIT_MASK(32);
Robin Murphy solved a similar problem in
a5516219b102 ("of/platform:
Initialise default DMA masks") by simply assigning dev.coherent_dma_mask
and the dev.dma_mask to point to the same when creating devices from the
device tree, and introducing the same code into the code path creating
AMBA/PrimeCell devices solved my problem, graphics now come up.
The code simply assumes that the device can access all of the system
memory by setting the coherent DMA mask to 0xffffffff when creating a
device from the device tree, which is crude, but seems to be what kernel
v4.18 assumed.
The AMBA PrimeCells do not differ between coherent and streaming DMA so
we can just assign the same to any DMA mask.
Possibly drivers should augment their coherent DMA mask in accordance
with "dma-ranges" from the device tree if more finegranular masking is
needed.
Reported-by: Russell King <linux@armlinux.org.uk>
Fixes: 4d8bde883bfb ("OF: Don't set default coherent DMA mask")
Cc: Russell King <linux@armlinux.org.uk>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig [Tue, 28 Aug 2018 09:25:51 +0000 (11:25 +0200)]
sparc: set a default 32-bit dma mask for OF devices
This keeps the historic default behavior for devices without a DMA mask,
but removes the warning about a lacking DMA mask for doing DMA without
a mask.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Olof Johansson [Sun, 2 Sep 2018 01:22:19 +0000 (18:22 -0700)]
Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omap variants against v4.19-rc1
These are mostly fixes related to using ti-sysc interconnect target module
driver for accessing right register offsets for sgx and cpsw and for
no_console_suspend regression.
There is also a droid4 emmc fix where emmc may not get detected for some
models, and vibrator dts mismerge fix.
And we have a file permission fix for am335x-osd3358-sm-red.dts that
just got added. And we must tag RTC as system-power-controller for
am437x for PMIC to shut down during poweroff.
* tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
arm: dts: am4372: setup rtc as system-power-controller
ARM: dts: omap4-droid4: fix vibrations on Droid 4
bus: ti-sysc: Fix no_console_suspend handling
bus: ti-sysc: Fix module register ioremap for larger offsets
ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
ARM: OMAP2+: Fix null hwmod for ti-sysc debug
Signed-off-by: Olof Johansson <olof@lixom.net>
Alexey Kodanev [Thu, 30 Aug 2018 16:11:24 +0000 (19:11 +0300)]
ipv6: don't get lwtstate twice in ip6_rt_copy_init()
Commit
80f1a0f4e0cd ("net/ipv6: Put lwtstate when destroying fib6_info")
partially fixed the kmemleak [1], lwtstate can be copied from fib6_info,
with ip6_rt_copy_init(), and it should be done only once there.
rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
ip6_rt_copy_init(), so there is no need to get it again at the end.
With this patch, lwtstate also isn't copied from RTF_REJECT routes.
[1]:
unreferenced object 0xffff880b6aaa14e0 (size 64):
comm "ip", pid 10577, jiffies
4295149341 (age 1273.903s)
hex dump (first 32 bytes):
01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<
0000000018664623>] lwtunnel_build_state+0x1bc/0x420
[<
00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0
[<
00000000ee2c5d1f>] ip6_route_add+0x14/0x70
[<
000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0
[<
000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0
[<
000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0
[<
000000004c893c76>] netlink_unicast+0x417/0x5a0
[<
00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30
[<
00000000890ff0aa>] sock_sendmsg+0xb1/0xf0
[<
00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950
[<
000000001e7426c8>] __sys_sendmsg+0xde/0x170
[<
00000000fe411443>] do_syscall_64+0x9f/0x4a0
[<
000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<
000000006d21f353>] 0xffffffffffffffff
Fixes: 6edb3c96a5f0 ("net/ipv6: Defer initialization of dst to data path")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Neves [Sat, 1 Sep 2018 20:14:52 +0000 (21:14 +0100)]
x86/vdso: Fix lsl operand order
In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.
Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt
Linus Torvalds [Sat, 1 Sep 2018 20:17:15 +0000 (13:17 -0700)]
Merge tag 'linux-watchdog-4.19-rc2' of git://linux-watchdog.org/linux-watchdog
Pull watchdog fixlet from Wim Van Sebroeck:
"Document support for r8a774a1"
* tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog:
dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support
Linus Torvalds [Sat, 1 Sep 2018 20:03:32 +0000 (13:03 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Two small fixes, one for the x86 Stoney SoC to get a more accurate clk
frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX
driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: x86: Set default parent to 48Mhz
clk: npcm7xx: fix memory allocation
Christoph Hellwig [Fri, 24 Aug 2018 06:07:52 +0000 (08:07 +0200)]
kernel/dma/direct: take DMA offset into account in dma_direct_supported
When a device has a DMA offset the dma capable result will change due
to the difference between the physical and DMA address. Take that into
account.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
LuckTony [Fri, 31 Aug 2018 16:55:06 +0000 (09:55 -0700)]
x86/mce: Fix set_mce_nospec() to avoid #GP fault
The trick with flipping bit 63 to avoid loading the address of the 1:1
mapping of the poisoned page while the 1:1 map is updated used to work when
unmapping the page. But it falls down horribly when attempting to directly
set the page as uncacheable.
The problem is that when the cache mode is changed to uncachable, the pages
needs to be flushed from the cache first. But the decoy address is
non-canonical due to bit 63 flipped, and the CLFLUSH instruction throws a
#GP fault.
Add code to change_page_attr_set_clr() to fix the address before calling
flush.
Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk
Thomas Falcon [Thu, 30 Aug 2018 18:19:53 +0000 (13:19 -0500)]
ibmvnic: Include missing return code checks in reset function
Check the return codes of these functions and halt reset
in case of failure. The driver will remain in a dormant state
until the next reset event, when device initialization will be
re-attempted.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Thu, 30 Aug 2018 14:01:18 +0000 (16:01 +0200)]
selftests: pmtu: detect correct binary to ping ipv6 addresses
Some systems don't have the ping6 binary anymore, and use ping for
everything. Detect the absence of ping6 and try to use ping instead.
Fixes: d1f1b9cbf34c ("selftests: net: Introduce first PMTU test")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Thu, 30 Aug 2018 14:01:17 +0000 (16:01 +0200)]
selftests: pmtu: maximum MTU for vti4 is 2^16-1-20
Since commit
82612de1c98e ("ip_tunnel: restore binding to ifaces with a
large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of
the mysterious constant 0xFFF8. This makes this selftest fail.
Fixes: 82612de1c98e ("ip_tunnel: restore binding to ifaces with a large mtu")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Thu, 30 Aug 2018 12:24:29 +0000 (14:24 +0200)]
tcp: do not restart timewait timer on rst reception
RFC 1337 says:
''Ignore RST segments in TIME-WAIT state.
If the 2 minute MSL is enforced, this fix avoids all three hazards.''
So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk
expire rather than removing it instantly when a reset is received.
However, Linux will also re-start the TIME-WAIT timer.
This causes connect to fail when tying to re-use ports or very long
delays (until syn retry interval exceeds MSL).
packetdrill test case:
// Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode.
`sysctl net.ipv4.tcp_rfc1337=1`
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
// Receive first segment
0.310 < P. 1:1001(1000) ack 1 win 46
// Send one ACK
0.310 > . 1:1(0) ack 1001
// read 1000 byte
0.310 read(4, ..., 1000) = 1000
// Application writes 100 bytes
0.350 write(4, ..., 100) = 100
0.350 > P. 1:101(100) ack 1001
// ACK
0.500 < . 1001:1001(0) ack 101 win 257
// close the connection
0.600 close(4) = 0
0.600 > F. 101:101(0) ack 1001 win 244
// Our side is in FIN_WAIT_1 & waits for ack to fin
0.7 < . 1001:1001(0) ack 102 win 244
// Our side is in FIN_WAIT_2 with no outstanding data.
0.8 < F. 1001:1001(0) ack 102 win 244
0.8 > . 102:102(0) ack 1002 win 244
// Our side is now in TIME_WAIT state, send ack for fin.
0.9 < F. 1002:1002(0) ack 102 win 244
0.9 > . 102:102(0) ack 1002 win 244
// Peer reopens with in-window SYN:
1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
// Therefore, reply with ACK.
1.000 > . 102:102(0) ack 1002 win 244
// Peer sends RST for this ACK. Normally this RST results
// in tw socket removal, but rfc1337=1 setting prevents this.
1.100 < R 1002:1002(0) win 244
// second syn. Due to rfc1337=1 expect another pure ACK.
31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
31.0 > . 102:102(0) ack 1002 win 244
// .. and another RST from peer.
31.1 < R 1002:1002(0) win 244
31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT`
// third syn after one minute. Time-Wait socket should have expired by now.
63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7>
// so we expect a syn-ack & 3whs to proceed from here on.
63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
Without this patch, 'ss' shows restarts of tw timer and last packet is
thus just another pure ack, more than one minute later.
This restores the original code from commit
283fd6cf0be690a83
("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git .
For some reason the else branch was removed/lost in
1f28b683339f7
("Merge in TCP/UDP optimizations and [..]") and timer restart became
unconditional.
Reported-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Machek [Thu, 30 Aug 2018 11:30:03 +0000 (13:30 +0200)]
net/rds: RDS is not Radio Data System
Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is
easily confused with Radio Data System (which we may want to support
in kernel, too).
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dexuan Cui [Thu, 30 Aug 2018 05:42:13 +0000 (05:42 +0000)]
hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe()
This patch fixes the race between netvsc_probe() and
rndis_set_subchannel(), which can cause a deadlock.
These are the related 3 paths which show the deadlock:
path #1:
Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus]
Call Trace:
schedule
schedule_preempt_disabled
__mutex_lock
__device_attach
bus_probe_device
device_add
vmbus_device_register
vmbus_onoffer
vmbus_onmessage_work
process_one_work
worker_thread
kthread
ret_from_fork
path #2:
schedule
schedule_preempt_disabled
__mutex_lock
netvsc_probe
vmbus_probe
really_probe
__driver_attach
bus_for_each_dev
driver_attach_async
async_run_entry_fn
process_one_work
worker_thread
kthread
ret_from_fork
path #3:
Workqueue: events netvsc_subchan_work [hv_netvsc]
Call Trace:
schedule
rndis_set_subchannel
netvsc_subchan_work
process_one_work
worker_thread
kthread
ret_from_fork
Before path #1 finishes, path #2 can start to run, because just before
the "bus_probe_device(dev);" in device_add() in path #1, there is a line
"object_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can
immediately try to load hv_netvsc and hence path #2 can start to run.
Next, path #2 offloads the subchannal's initialization to a workqueue,
i.e. path #3, so we can end up in a deadlock situation like this:
Path #2 gets the device lock, and is trying to get the rtnl lock;
Path #3 gets the rtnl lock and is waiting for all the subchannel messages
to be processed;
Path #1 is trying to get the device lock, but since #2 is not releasing
the device lock, path #1 has to sleep; since the VMBus messages are
processed one by one, this means the sub-channel messages can't be
procedded, so #3 has to sleep with the rtnl lock held, and finally #2
has to sleep... Now all the 3 paths are sleeping and we hit the deadlock.
With the patch, we can make sure #2 gets both the device lock and the
rtnl lock together, gets its job done, and releases the locks, so #1
and #3 will not be blocked for ever.
Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 29 Aug 2018 19:46:08 +0000 (12:46 -0700)]
nfp: wait for posted reconfigs when disabling the device
To avoid leaking a running timer we need to wait for the
posted reconfigs after netdev is unregistered. In common
case the process of deinitializing the device will perform
synchronous reconfigs which wait for posted requests, but
especially with VXLAN ports being actively added and removed
there can be a race condition leaving a timer running after
adapter structure is freed leading to a crash.
Add an explicit flush after deregistering and for a good
measure a warning to check if timer is running just before
structures are freed.
Fixes: 3d780b926a12 ("nfp: add async reconfiguration mechanism")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 29 Aug 2018 18:50:12 +0000 (11:50 -0700)]
Revert "packet: switch kvzalloc to allocate memory"
This reverts commit
71e41286203c017d24f041a7cd71abea7ca7b1e0.
mmap()/munmap() can not be backed by kmalloced pages :
We fault in :
VM_BUG_ON_PAGE(PageSlab(page), page);
unmap_single_vma+0x8a/0x110
unmap_vmas+0x4b/0x90
unmap_region+0xc9/0x140
do_munmap+0x274/0x360
vm_munmap+0x81/0xc0
SyS_munmap+0x2b/0x40
do_syscall_64+0x13e/0x1c0
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes: 71e41286203c ("packet: switch kvzalloc to allocate memory")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: John Sperbeck <jsperbeck@google.com>
Bisected-by: John Sperbeck <jsperbeck@google.com>
Cc: Zhang Yu <zhangyu31@baidu.com>
Cc: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 31 Aug 2018 16:20:30 +0000 (09:20 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A few arm64 fixes came in this week, specifically fixing some nasty
truncation of return values from firmware calls and resolving a
VM_BUG_ON due to accessing uninitialised struct pages corresponding to
NOMAP pages.
Summary:
- Fix typos in SVE documentation
- Fix type-checking and implicit truncation for SMCCC calls
- Force CONFIG_HOLES_IN_ZONE=y so that SLAB doesn't fall over NOMAP
regions"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: always enable CONFIG_HOLES_IN_ZONE
arm/arm64: smccc-1.1: Handle function result as parameters
arm/arm64: smccc-1.1: Make return values unsigned long
Documentation/arm64/sve: Couple of improvements and typos
Joerg Roedel [Fri, 31 Aug 2018 08:05:38 +0000 (10:05 +0200)]
x86/efi: Load fixmap GDT in efi_call_phys_epilog()
When PTI is enabled on x86-32 the kernel uses the GDT mapped in the fixmap
for the simple reason that this address is also mapped for user-space.
The efi_call_phys_prolog()/efi_call_phys_epilog() wrappers change the GDT
to call EFI runtime services and switch back to the kernel GDT when they
return. But the switch-back uses the writable GDT, not the fixmap GDT.
When that happened and and the CPU returns to user-space it switches to the
user %cr3 and tries to restore user segment registers. This fails because
the writable GDT is not mapped in the user page-table, and without a GDT
the fault handlers also can't be launched. The result is a triple fault and
reboot of the machine.
Fix that by restoring the GDT back to the fixmap GDT which is also mapped
in the user page-table.
Fixes: 7757d607c6b3 x86/pti: ('Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32')
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: hpa@zytor.com
Cc: linux-efi@vger.kernel.org
Link: https://lkml.kernel.org/r/1535702738-10971-1-git-send-email-joro@8bytes.org
Linus Torvalds [Fri, 31 Aug 2018 15:45:16 +0000 (08:45 -0700)]
Merge tag 'for-linus-4.19b-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- minor cleanup avoiding a warning when building with new gcc
- a patch to add a new sysfs node for Xen frontend/backend drivers to
make it easier to obtain the state of a pv device
- two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable
PTEs
* tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: remove redundant variable save_pud
xen: export device state to sysfs
x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
x86/xen: don't write ptes directly in 32-bit PV guests
Linus Torvalds [Fri, 31 Aug 2018 15:42:46 +0000 (08:42 -0700)]
Merge tag 'm68k-for-v4.19-tag2' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fix from Geert Uytterhoeven:
"Just a single fix for a bug introduced during the merge window: fix
wrong date and time on PMU-based Macs"
* tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/mac: Use correct PMU response format
Linus Torvalds [Fri, 31 Aug 2018 15:38:53 +0000 (08:38 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- regression fixes for i801 and designware
- better API and leak fix for releasing DMA safe buffers
- better greppable strings for the bitbang algorithm
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sh_mobile: fix leak when using DMA bounce buffer
i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow
i2c: refactor function to release a DMA safe buffer
i2c: algos: bit: make the error messages grepable
i2c: designware: Re-init controllers with pm_disabled set on resume
i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus
Andy Lutomirski [Wed, 29 Aug 2018 15:47:18 +0000 (08:47 -0700)]
x86/nmi: Fix NMI uaccess race against CR3 switching
A NMI can hit in the middle of context switching or in the middle of
switch_mm_irqs_off(). In either case, CR3 might not match current->mm,
which could cause copy_from_user_nmi() and friends to read the wrong
memory.
Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org
Ben Hutchings [Wed, 29 Aug 2018 19:43:17 +0000 (20:43 +0100)]
x86: Allow generating user-space headers without a compiler
When bootstrapping an architecture, it's usual to generate the kernel's
user-space headers (make headers_install) before building a compiler. Move
the compiler check (for asm goto support) to the archprepare target so that
it is only done when building code for the target.
Fixes: e501ce957a78 ("x86: Force asm-goto")
Reported-by: Helmut Grohne <helmutg@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180829194317.GA4765@decadent.org.uk
Jann Horn [Tue, 28 Aug 2018 15:49:01 +0000 (17:49 +0200)]
x86/dumpstack: Don't dump kernel memory based on usermode RIP
show_opcodes() is used both for dumping kernel instructions and for dumping
user instructions. If userspace causes #PF by jumping to a kernel address,
show_opcodes() can be reached with regs->ip controlled by the user,
pointing to kernel code. Make sure that userspace can't trick us into
dumping kernel memory into dmesg.
Fixes: 7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: security@kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180828154901.112726-1-jannh@google.com
Rob Herring [Tue, 28 Aug 2018 20:10:48 +0000 (15:10 -0500)]
of: Add device_type access helper functions
In preparation to remove direct access to device_node.type, add
of_node_is_type() and of_node_get_device_type() helpers to check and
retrieve the device type.
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Mukesh Ojha [Tue, 28 Aug 2018 06:54:54 +0000 (12:24 +0530)]
cpu/hotplug: Remove skip_onerr field from cpuhp_step structure
When notifiers were there, `skip_onerr` was used to avoid calling
particular step startup/teardown callbacks in the CPU up/down rollback
path, which made the hotplug asymmetric.
As notifiers are gone now after the full state machine conversion, the
`skip_onerr` field is no longer required.
Remove it from the structure and its usage.
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1535439294-31426-1-git-send-email-mojha@codeaurora.org
James Morse [Thu, 30 Aug 2018 15:05:32 +0000 (16:05 +0100)]
arm64: mm: always enable CONFIG_HOLES_IN_ZONE
Commit
6d526ee26ccd ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA")
only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was
choking on the missing zone for nomap pages. This problem doesn't just
apply to NUMA systems.
If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will
return true if the pfn is part of a valid sparsemem section.
When working with multiple pages, the mm code uses pfn_valid_within()
to test each page it uses within the sparsemem section is valid. On
most systems memory comes in MAX_ORDER_NR_PAGES chunks which all
have valid/initialised struct pages. In this case pfn_valid_within()
is optimised out.
Systems where this isn't true (e.g. due to nomap) should set
HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each
page as it works with it.
Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to
a VM_BUG_ON():
| page:
fffffdff802e1780 is uninitialized and poisoned
| raw:
ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| raw:
ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
| ------------[ cut here ]------------
| kernel BUG at include/linux/mm.h:978!
| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
| CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate:
40000085 (nZcv daIf -PAN -UAO)
| pc : move_freepages_block+0x144/0x248
| lr : move_freepages_block+0x144/0x248
| sp :
fffffe0071177680
[...]
| Process dd (pid: 25236, stack limit = 0x0000000094cc07fb)
| Call trace:
| move_freepages_block+0x144/0x248
| steal_suitable_fallback+0x100/0x16c
| get_page_from_freelist+0x440/0xb20
| __alloc_pages_nodemask+0xe8/0x838
| new_slab+0xd4/0x418
| ___slab_alloc.constprop.27+0x380/0x4a8
| __slab_alloc.isra.21.constprop.26+0x24/0x34
| kmem_cache_alloc+0xa8/0x180
| alloc_buffer_head+0x1c/0x90
| alloc_page_buffers+0x68/0xb0
| create_empty_buffers+0x20/0x1ec
| create_page_buffers+0xb0/0xf0
| __block_write_begin_int+0xc4/0x564
| __block_write_begin+0x10/0x18
| block_write_begin+0x48/0xd0
| blkdev_write_begin+0x28/0x30
| generic_perform_write+0x98/0x16c
| __generic_file_write_iter+0x138/0x168
| blkdev_write_iter+0x80/0xf0
| __vfs_write+0xe4/0x10c
| vfs_write+0xb4/0x168
| ksys_write+0x44/0x88
| sys_write+0xc/0x14
| el0_svc_naked+0x30/0x34
| Code:
aa1303e0 90001a01 91296421 94008902 (
d4210000)
| ---[ end trace
1601ba47f6e883fe ]---
Remove the NUMA dependency.
Link: https://www.spinics.net/lists/arm-kernel/msg671851.html
Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Finn Thain [Fri, 24 Aug 2018 02:02:06 +0000 (12:02 +1000)]
m68k/mac: Use correct PMU response format
Now that the 68k Mac port has adopted the via-pmu driver, it must decode
the PMU response accordingly otherwise the date and time will be wrong.
Fixes: ebd722275f9cfc67 ("macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Linus Torvalds [Fri, 31 Aug 2018 04:18:05 +0000 (21:18 -0700)]
Merge tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular fixes pull:
- Mediatek has a bunch of fixes to their RDMA and Overlay engines.
- i915 has some Cannonlake/Geminilake watermark workarounds, LSPCON
fix, HDCP free fix, audio fix and a ppgtt reference counting fix.
- amdgpu has some SRIOV, Kasan, memory leaks and other misc fixes"
* tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm: (35 commits)
drm/i915/audio: Hook up component bindings even if displays are disabled
drm/i915: Increase LSPCON timeout
drm/i915: Stop holding a ref to the ppgtt from each vma
drm/i915: Free write_buf that we allocated with kzalloc.
drm/i915: Fix glk/cnl display w/a #1175
drm/amdgpu: Need to set moved to true when evict bo
drm/amdgpu: Remove duplicated power source update
drm/amd/display: Fix memory leak caused by missed dc_sink_release
drm/amdgpu: fix holding mn_lock while allocating memory
drm/amdgpu: Power on uvd block when hw_fini
drm/amdgpu: Update power state at the end of smu hw_init.
drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins
drm/amdgpu: Enable/disable gfx PG feature in rlc safe mode
drm/amdgpu: Adjust the VM size based on system memory size v2
drm/mediatek: fix connection from RDMA2 to DSI1
drm/mediatek: update some variable name from ovl to comp
drm/mediatek: use layer_nr function to get layer number to init plane
drm/mediatek: add function to return RDMA layer number
drm/mediatek: add function to return OVL layer number
drm/mediatek: add function to get layer number for component
...
Stephen Rothwell [Thu, 30 Aug 2018 21:47:28 +0000 (07:47 +1000)]
disable stringop truncation warnings for now
They are too noisy
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 31 Aug 2018 01:02:02 +0000 (18:02 -0700)]
Merge tag 'pm-4.19-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These address a corner case in the menu cpuidle governor and fix error
handling in the PM core's generic clock management code.
Specifics:
- Make the menu cpuidle governor avoid stopping the scheduler tick if
the predicted idle duration exceeds the tick period length, but the
selected idle state is shallow and deeper idle states with high
target residencies are available (Rafael Wysocki).
- Make the PM core's generic clock management code use a proper data
type for one variable to make error handling work (Dan Carpenter)"
* tag 'pm-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: menu: Retain tick when shallow state is selected
PM / clk: signedness bug in of_pm_clk_add_clks()
Rafael J. Wysocki [Thu, 30 Aug 2018 23:23:31 +0000 (01:23 +0200)]
Merge branch 'pm-core'
Merge a generic clock management fix for 4.19-rc2.
* pm-core:
PM / clk: signedness bug in of_pm_clk_add_clks()
Akshu Agrawal [Tue, 21 Aug 2018 06:51:57 +0000 (12:21 +0530)]
clk: x86: Set default parent to 48Mhz
System clk provided in ST soc can be set to:
48Mhz, non-spread
25Mhz, spread
To get accurate rate, we need it to set it at non-spread
option which is 48Mhz.
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Fixes: 421bf6a1f061 ("clk: x86: Add ST oscout platform clock")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Wolfram Sang [Fri, 24 Aug 2018 14:52:46 +0000 (16:52 +0200)]
i2c: sh_mobile: fix leak when using DMA bounce buffer
We only freed the bounce buffer after successful DMA, missing the cases
where DMA setup may have gone wrong. Use a better location which always
gets called after each message and use 'stop_after_dma' as a flag for a
successful transfer.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Fri, 24 Aug 2018 14:52:45 +0000 (16:52 +0200)]
i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow
After various refactoring over the years, start_ch() doesn't return
errno anymore, so make the function return void. This saves the error
handling when calling it which in turn eases cleanup of resources of a
future patch.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Fri, 24 Aug 2018 14:52:44 +0000 (16:52 +0200)]
i2c: refactor function to release a DMA safe buffer
a) rename to 'put' instead of 'release' to match 'get' when obtaining
the buffer
b) change the argument order to have the buffer as first argument
c) add a new argument telling the function if the message was
transferred. This allows the function to be used also in cases
where setting up DMA failed, so the buffer needs to be freed without
syncing to the message buffer.
Also convert the only user.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Jan Kundrát [Tue, 28 Aug 2018 08:07:40 +0000 (10:07 +0200)]
i2c: algos: bit: make the error messages grepable
Yep, I went looking for one of these, and I wasn't able to find it
easily. That's worse than a line which is 82-chars long, IMHO.
Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Hans de Goede [Wed, 29 Aug 2018 13:06:31 +0000 (15:06 +0200)]
i2c: designware: Re-init controllers with pm_disabled set on resume
On Bay Trail and Cherry Trail devices we set the pm_disabled flag for I2C
busses which the OS shares with the PUNIT as these need special handling.
Until now we called dev_pm_syscore_device(dev, true) for I2C controllers
with this flag set to keep these I2C controllers always on.
After commit
12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and
resume from hibernation"), this no longer works. This commit modifies
lpss_iosf_exit_d3_state() to only run if lpss_iosf_enter_d3_state() has ran
before it, so that it does not run on a resume from hibernate (or from S3).
On these systems the conditions for lpss_iosf_enter_d3_state() to run
never become true, so lpss_iosf_exit_d3_state() never gets called and
the 2 LPSS DMA controllers never get forced into D0 mode, instead they
are left in their default automatic power-on when needed mode.
The not forcing of D0 mode for the DMA controllers enables these systems
to properly enter S0ix modes, which is a good thing.
But after entering S0ix modes the I2C controller connected to the PMIC
no longer works, leading to e.g. broken battery monitoring.
The _PS3 method for this I2C controller looks like this:
Method (_PS3, 0, NotSerialized) // _PS3: Power State 3
{
If ((((PMID == 0x04) || (PMID == 0x05)) || (PMID == 0x06)))
{
Return (Zero)
}
PSAT |= 0x03
Local0 = PSAT /* \_SB_.I2C5.PSAT */
}
Where PMID = 0x05, so we enter the Return (Zero) path on these systems.
So even if we were to not call dev_pm_syscore_device(dev, true) the
I2C controller will be left in D0 rather then be switched to D3.
Yet on other Bay and Cherry Trail devices S0ix is not entered unless *all*
I2C controllers are in D3 mode. This combined with the I2C controller no
longer working now that we reach S0ix states on these systems leads to me
believing that the PUNIT itself puts the I2C controller in D3 when all
other conditions for entering S0ix states are true.
Since now the I2C controller is put in D3 over a suspend/resume we must
re-initialize it afterwards and that does indeed fix it no longer working.
This commit implements this fix by:
1) Making the suspend_late callback a no-op if pm_disabled is set and
making the resume_early callback skip the clock re-enable (since it now was
not disabled) while still doing the necessary I2C controller re-init.
2) Removing the dev_pm_syscore_device(dev, true) call, so that the suspend
and resume callbacks are actually called. Normally this would cause the
ACPI pm code to call _PS3 putting the I2C controller in D3, wreaking havoc
since it is shared with the PUNIT, but in this special case the _PS3 method
is a no-op so we can safely allow a "fake" suspend / resume.
Fixes: 12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and resume ...")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200861
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Mika Westerberg [Thu, 30 Aug 2018 08:50:13 +0000 (11:50 +0300)]
i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus
Commit
7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict
with PCI BAR") made it possible for AML code to access SMBus I/O ports
by installing custom SystemIO OpRegion handler and blocking i80i driver
access upon first AML read/write to this OpRegion.
However, while ThinkPad T560 does have SystemIO OpRegion declared under
the SMBus device, it does not access any of the SMBus registers:
Device (SMBU)
{
...
OperationRegion (SMBP, PCI_Config, 0x50, 0x04)
Field (SMBP, DWordAcc, NoLock, Preserve)
{
, 5,
TCOB, 11,
Offset (0x04)
}
Name (TCBV, 0x00)
Method (TCBS, 0, NotSerialized)
{
If ((TCBV == 0x00))
{
TCBV = (\_SB.PCI0.SMBU.TCOB << 0x05)
}
Return (TCBV) /* \_SB_.PCI0.SMBU.TCBV */
}
OperationRegion (TCBA, SystemIO, TCBS (), 0x10)
Field (TCBA, ByteAcc, NoLock, Preserve)
{
Offset (0x04),
, 9,
CPSC, 1
}
}
Problem with the current approach is that it blocks all I/O port access
and because this system has touchpad connected to the SMBus controller
after first AML access (happens during suspend/resume cycle) the
touchpad fails to work anymore.
Fix this so that we allow ACPI AML I/O port access if it does not touch
the region reserved for the SMBus.
Fixes: 7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200737
Reported-by: Yussuf Khalil <dev@pp3345.net>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Linus Torvalds [Thu, 30 Aug 2018 20:39:04 +0000 (13:39 -0700)]
Merge tag 'for-linus-
20180830' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Small collection of fixes that should go into this series. This pull
contains:
- NVMe pull request with three small fixes (via Christoph)
- Kill useless NULL check before kmem_cache_destroy (Chengguang Xu)
- Xen block driver pull request with persistent grant flushing fixes
(Juergen Gross)
- Final wbt fixes, wrapping up the changes for this series. These
have been heavily tested (me)
- cdrom info leak fix (Scott Bauer)
- ATA dma quirk for SQ201 (Linus Walleij)
- Straight forward bsg refcount_t conversion (John Pittman)"
* tag 'for-linus-
20180830' of git://git.kernel.dk/linux-block:
cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
nvmet: free workqueue object if module init fails
nvme-fcloop: Fix dropped LS's to removed target port
nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
block: bsg: move atomic_t ref_count variable to refcount API
block: remove unnecessary condition check
ata: ftide010: Add a quirk for SQ201
blk-wbt: remove dead code
blk-wbt: improve waking of tasks
blk-wbt: abstract out end IO completion handler
xen/blkback: remove unused pers_gnts_lock from struct xen_blkif_ring
xen/blkback: move persistent grants flags to bool
xen/blkfront: reorder tests in xlblk_init()
xen/blkfront: cleanup stale persistent grants
xen/blkback: don't keep persistent grants too long
Rob Herring [Mon, 27 Aug 2018 12:50:47 +0000 (07:50 -0500)]
of: add node name compare helper functions
In preparation to remove device_node.name pointer, add helper functions
for node name comparisons which are a common pattern throughout the kernel.
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Linus Torvalds [Thu, 30 Aug 2018 17:05:12 +0000 (10:05 -0700)]
Merge tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd
Pull mtd fixes from Boris Brezillon:
"Raw NAND fixes:
- denali: Fix a regression caused by the nand_scan() rework
- docg4: Fix a build error when gcc decides to not iniline some
functions (can be reproduced with gcc 4.1.2):
* tag 'mtd/for-4.19-rc2' of git://git.infradead.org/linux-mtd:
mtd: rawnand: denali: do not pass zero maxchips to nand_scan()
mtd: rawnand: docg4: Remove wrong __init annotations
Linus Torvalds [Thu, 30 Aug 2018 16:50:15 +0000 (09:50 -0700)]
Merge tag 'mmc-v4.19-2' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix unsupported parallel dispatch of requests
MMC host:
- atmel-mci/android-goldfish: Fixup logic of sg_copy_{from,to}_buffer
- renesas_sdhi_internal_dmac: Prevent IRQ-storm due of DMAC IRQs
- renesas_sdhi_internal_dmac: Fixup bad register offset"
* tag 'mmc-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: renesas_sdhi_internal_dmac: mask DMAC interrupts
mmc: renesas_sdhi_internal_dmac: fix #define RST_RESERVED_BITS
mmc: block: Fix unsupported parallel dispatch of requests
mmc: android-goldfish: fix bad logic of sg_copy_{from,to}_buffer conversion
mmc: atmel-mci: fix bad logic of sg_copy_{from,to}_buffer conversion
Marc Zyngier [Fri, 24 Aug 2018 14:08:30 +0000 (15:08 +0100)]
arm/arm64: smccc-1.1: Handle function result as parameters
If someone has the silly idea to write something along those lines:
extern u64 foo(void);
void bar(struct arm_smccc_res *res)
{
arm_smccc_1_1_smc(0xbad, foo(), res);
}
they are in for a surprise, as this gets compiled as:
0000000000000588 <bar>:
588:
a9be7bfd stp x29, x30, [sp, #-32]!
58c:
910003fd mov x29, sp
590:
f9000bf3 str x19, [sp, #16]
594:
aa0003f3 mov x19, x0
598:
aa1e03e0 mov x0, x30
59c:
94000000 bl 0 <_mcount>
5a0:
94000000 bl 0 <foo>
5a4:
aa0003e1 mov x1, x0
5a8:
d4000003 smc #0x0
5ac:
b4000073 cbz x19, 5b8 <bar+0x30>
5b0:
a9000660 stp x0, x1, [x19]
5b4:
a9010e62 stp x2, x3, [x19, #16]
5b8:
f9400bf3 ldr x19, [sp, #16]
5bc:
a8c27bfd ldp x29, x30, [sp], #32
5c0:
d65f03c0 ret
5c4:
d503201f nop
The call to foo "overwrites" the x0 register for the return value,
and we end up calling the wrong secure service.
A solution is to evaluate all the parameters before assigning
anything to specific registers, leading to the expected result:
0000000000000588 <bar>:
588:
a9be7bfd stp x29, x30, [sp, #-32]!
58c:
910003fd mov x29, sp
590:
f9000bf3 str x19, [sp, #16]
594:
aa0003f3 mov x19, x0
598:
aa1e03e0 mov x0, x30
59c:
94000000 bl 0 <_mcount>
5a0:
94000000 bl 0 <foo>
5a4:
aa0003e1 mov x1, x0
5a8:
d28175a0 mov x0, #0xbad
5ac:
d4000003 smc #0x0
5b0:
b4000073 cbz x19, 5bc <bar+0x34>
5b4:
a9000660 stp x0, x1, [x19]
5b8:
a9010e62 stp x2, x3, [x19, #16]
5bc:
f9400bf3 ldr x19, [sp, #16]
5c0:
a8c27bfd ldp x29, x30, [sp], #32
5c4:
d65f03c0 ret
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Uros Bizjak [Tue, 14 Aug 2018 16:59:51 +0000 (18:59 +0200)]
x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
Replace open-coded set instructions with CC_SET()/CC_OUT().
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180814165951.13538-1-ubizjak@gmail.com
Jiri Kosina [Tue, 28 Aug 2018 06:55:14 +0000 (08:55 +0200)]
x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
text_poke() and text_poke_bp() must be called with text_mutex held.
Put proper lockdep anotation in place instead of just mentioning the
requirement in a comment.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1808280853520.25787@cbobk.fhfr.pm
Masahiro Yamada [Mon, 27 Aug 2018 03:39:43 +0000 (12:39 +0900)]
objtool: Remove workaround for unreachable warnings from old GCC
Commit
cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
bumped the minimum GCC version to 4.6 for all architectures.
This effectively reverts commit
da541b20021c ("objtool: Skip unreachable
warnings for GCC 4.4 and older"), which was a workaround for GCC 4.4 or
older.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/1535341183-19994-1-git-send-email-yamada.masahiro@socionext.com
Mukesh Ojha [Fri, 24 Aug 2018 12:33:53 +0000 (18:03 +0530)]
notifier: Remove notifier header file wherever not used
The conversion of the hotplug notifiers to a state machine left the
notifier.h includes around in some places. Remove them.
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1535114033-4605-1-git-send-email-mojha@codeaurora.org
Vincent Whitchurch [Tue, 21 Aug 2018 15:25:07 +0000 (17:25 +0200)]
watchdog: Mark watchdog touch functions as notrace
Some architectures need to use stop_machine() to patch functions for
ftrace, and the assumption is that the stopped CPUs do not make function
calls to traceable functions when they are in the stopped state.
Commit
ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after
MULTI_STOP_PREPARE") added calls to the watchdog touch functions from
the stopped CPUs and those functions lack notrace annotations. This
leads to crashes when enabling/disabling ftrace on ARM kernels built
with the Thumb-2 instruction set.
Fix it by adding the necessary notrace annotations.
Fixes: ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: oleg@redhat.com
Cc: tj@kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180821152507.18313-1-vincent.whitchurch@axis.com
Jann Horn [Tue, 28 Aug 2018 18:40:33 +0000 (20:40 +0200)]
x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
Reset the KASAN shadow state of the task stack before rewinding RSP.
Without this, a kernel oops will leave parts of the stack poisoned, and
code running under do_exit() can trip over such poisoned regions and cause
nonsensical false-positive KASAN reports about stack-out-of-bounds bugs.
This does not wipe the exception stacks; if an oops happens on an exception
stack, it might result in random KASAN false-positives from other tasks
afterwards. This is probably relatively uninteresting, since if the kernel
oopses on an exception stack, there are most likely bigger things to worry
about. It'd be more interesting if vmapped stacks and KASAN were
compatible, since then handle_stack_overflow() would oops from exception
stack context.
Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: kasan-dev@googlegroups.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180828184033.93712-1-jannh@google.com
Nick Desaulniers [Mon, 27 Aug 2018 21:40:09 +0000 (14:40 -0700)]
x86/irqflags: Mark native_restore_fl extern inline
This should have been marked extern inline in order to pick up the out
of line definition in arch/x86/kernel/irqflags.S.
Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180827214011.55428-1-ndesaulniers@google.com