Alexei Starovoitov [Tue, 8 Sep 2015 20:40:01 +0000 (13:40 -0700)]
bpf: fix out of bounds access in verifier log
when the verifier log is enabled the print_bpf_insn() is doing
bpf_alu_string[BPF_OP(insn->code) >> 4]
and
bpf_jmp_string[BPF_OP(insn->code) >> 4]
where BPF_OP is a 4-bit instruction opcode.
Malformed insns can cause out of bounds access.
Fix it by sizing arrays appropriately.
The bug was found by clang address sanitizer with libfuzzer.
Reported-by: Yonghong Song <yhs@plumgrid.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roopa Prabhu [Tue, 8 Sep 2015 17:53:04 +0000 (10:53 -0700)]
ipv6: fix multipath route replace error recovery
Problem:
The ecmp route replace support for ipv6 in the kernel, deletes the
existing ecmp route too early, ie when it installs the first nexthop.
If there is an error in installing the subsequent nexthops, its too late
to recover the already deleted existing route leaving the fib
in an inconsistent state.
This patch reduces the possibility of this by doing the following:
a) Changes the existing multipath route add code to a two stage process:
build rt6_infos + insert them
ip6_route_add rt6_info creation code is moved into
ip6_route_info_create.
b) This ensures that most errors are caught during building rt6_infos
and we fail early
c) Separates multipath add and del code. Because add needs the special
two stage mode in a) and delete essentially does not care.
d) In any event if the code fails during inserting a route again, a
warning is printed (This should be unlikely)
Before the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists
$ip -6 route show
/* previously added ecmp route 3000:1000:1000:1000::2 dissappears from
* kernel */
After the patch:
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
/* Try replacing the route with a duplicate nexthop */
$ip -6 route change 3000:1000:1000:1000::2/128 nexthop via
fe80::202:ff:fe00:b dev swp49s0 nexthop via fe80::202:ff:fe00:d dev
swp49s1 nexthop via fe80::202:ff:fe00:d dev swp49s1
RTNETLINK answers: File exists
$ip -6 route show
3000:1000:1000:1000::2 via fe80::202:ff:fe00:b dev swp49s0 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:d dev swp49s1 metric 1024
3000:1000:1000:1000::2 via fe80::202:ff:fe00:f dev swp49s2 metric 1024
Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 8 Sep 2015 16:00:09 +0000 (18:00 +0200)]
ebpf: fix fd refcount leaks related to maps in bpf syscall
We may already have gotten a proper fd struct through fdget(), so
whenever we return at the end of an map operation, we need to call
fdput(). However, each map operation from syscall side first probes
CHECK_ATTR() to verify that unused fields in the bpf_attr union are
zero.
In case of malformed input, we return with error, but the lookup to
the map_fd was already performed at that time, so that we return
without an corresponding fdput(). Fix it by performing an fdget()
only right before bpf_map_get(). The fdget() invocation on maps in
the verifier is not affected.
Fixes: db20fd2b0108 ("bpf: add lookup/update/delete/iterate methods to BPF maps")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sasha Levin [Tue, 8 Sep 2015 14:53:40 +0000 (10:53 -0400)]
RDS: verify the underlying transport exists before creating a connection
There was no verification that an underlying transport exists when creating
a connection, this would cause dereferencing a NULL ptr.
It might happen on sockets that weren't properly bound before attempting to
send a message, which will cause a NULL ptr deref:
[135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[135546.051270] Modules linked in:
[135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted
4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
[135546.053217] task:
ffff8800835bc000 ti:
ffff8800bc708000 task.ti:
ffff8800bc708000
[135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
[135546.055666] RSP: 0018:
ffff8800bc70fab0 EFLAGS:
00010202
[135546.056457] RAX:
dffffc0000000000 RBX:
0000000000000f2c RCX:
ffff8800835bc000
[135546.057494] RDX:
0000000000000007 RSI:
ffff8800835bccd8 RDI:
0000000000000038
[135546.058530] RBP:
ffff8800bc70fb18 R08:
0000000000000001 R09:
0000000000000000
[135546.059556] R10:
ffffed014d7a3a23 R11:
ffffed014d7a3a21 R12:
0000000000000000
[135546.060614] R13:
0000000000000001 R14:
ffff8801ec3d0000 R15:
0000000000000000
[135546.061668] FS:
00007faad4ffb700(0000) GS:
ffff880252000000(0000) knlGS:
0000000000000000
[135546.062836] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[135546.063682] CR2:
000000000000846a CR3:
000000009d137000 CR4:
00000000000006a0
[135546.064723] Stack:
[135546.065048]
ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
[135546.066247]
0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
[135546.067438]
1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
[135546.068629] Call Trace:
[135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
[135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
[135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
[135546.071981] rds_sendmsg (net/rds/send.c:1058)
[135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
[135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
[135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
[135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
[135546.076349] ? __might_fault (mm/memory.c:3795)
[135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
[135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
[135546.078856] SYSC_sendto (net/socket.c:1657)
[135546.079596] ? SYSC_connect (net/socket.c:1628)
[135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
[135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
[135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
[135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
[135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
[135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
[135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Vrabel [Tue, 8 Sep 2015 13:25:14 +0000 (14:25 +0100)]
xen-netback: require fewer guest Rx slots when not using GSO
Commit
f48da8b14d04ca87ffcffe68829afd45f926ec6a (xen-netback: fix
unlimited guest Rx internal queue and carrier flapping) introduced a
regression.
The PV frontend in IPXE only places 4 requests on the guest Rx ring.
Since netback required at least (MAX_SKB_FRAGS + 1) slots, IPXE could
not receive any packets.
a) If GSO is not enabled on the VIF, fewer guest Rx slots are required
for the largest possible packet. Calculate the required slots
based on the maximum GSO size or the MTU.
This calculation of the number of required slots relies on
1650d5455bd2 (xen-netback: always fully coalesce guest Rx packets)
which present in 4.0-rc1 and later.
b) Reduce the Rx stall detection to checking for at least one
available Rx request. This is fine since we're predominately
concerned with detecting interfaces which are down and thus have
zero available Rx requests.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 9 Sep 2015 19:29:26 +0000 (12:29 -0700)]
Merge branch 'cxgb4-fixes'
Hariprasad Shenai says:
====================
cxgb4: Fix tx flit calculation and wc stat configuration
This patch series fixes the following:
Patch 1/2 fixes tx flit calculation, which if wrong can lead to
stall, hang, data corrpution, write combining failure. Patch 2/2 fixes
PCI-E write combining stats configuration.
This patch series has been created against net tree and includes
patches on cxgb4 driver.
We have included all the maintainers of respective drivers. Kindly review
the change and let us know in case of any review comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 8 Sep 2015 10:55:40 +0000 (16:25 +0530)]
cxgb4: Fix for write-combining stats configuration
The write-combining configuration register SGE_STAT_CFG_A needs to
be configured after FW initializes the adapter, else FW will reset
the configuration
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 8 Sep 2015 10:55:39 +0000 (16:25 +0530)]
cxgb4: Fix tx flit calculation
In commit
0aac3f56d4a63f04 ("cxgb4: Add comment for calculate tx flits
and sge length code") introduced a regression where tx flit calculation
is going wrong, which can lead to data corruption, hang, stall and
write-combining failure. Fixing it.
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atsushi Nemoto [Tue, 8 Sep 2015 09:15:41 +0000 (18:15 +0900)]
net: eth: altera: Fix the initial device operstate
Call netif_carrier_off() prior to register_netdev(), otherwise
userspace can see incorrect link state.
Signed-off-by: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kolmakov Dmitriy [Mon, 7 Sep 2015 09:05:48 +0000 (09:05 +0000)]
net: tipc: fix stall during bclink wakeup procedure
If an attempt to wake up users of broadcast link is made when there is
no enough place in send queue than it may hang up inside the
tipc_sk_rcv() function since the loop breaks only after the wake up
queue becomes empty. This can lead to complete CPU stall with the
following message generated by RCU:
INFO: rcu_sched self-detected stall on CPU { 0} (t=2101 jiffies
g=54225 c=54224 q=11465)
Task dump for CPU 0:
tpch R running task 0 39949 39948 0x0000000a
ffffffff818536c0 ffff88181fa037a0 ffffffff8106a4be 0000000000000000
ffffffff818536c0 ffff88181fa037c0 ffffffff8106d8a8 ffff88181fa03800
0000000000000001 ffff88181fa037f0 ffffffff81094a50 ffff88181fa15680
Call Trace:
<IRQ> [<
ffffffff8106a4be>] sched_show_task+0xae/0x120
[<
ffffffff8106d8a8>] dump_cpu_task+0x38/0x40
[<
ffffffff81094a50>] rcu_dump_cpu_stacks+0x90/0xd0
[<
ffffffff81097c3b>] rcu_check_callbacks+0x3eb/0x6e0
[<
ffffffff8106e53f>] ? account_system_time+0x7f/0x170
[<
ffffffff81099e64>] update_process_times+0x34/0x60
[<
ffffffff810a84d1>] tick_sched_handle.isra.18+0x31/0x40
[<
ffffffff810a851c>] tick_sched_timer+0x3c/0x70
[<
ffffffff8109a43d>] __run_hrtimer.isra.34+0x3d/0xc0
[<
ffffffff8109aa95>] hrtimer_interrupt+0xc5/0x1e0
[<
ffffffff81030d52>] ? native_smp_send_reschedule+0x42/0x60
[<
ffffffff81032f04>] local_apic_timer_interrupt+0x34/0x60
[<
ffffffff810335bc>] smp_apic_timer_interrupt+0x3c/0x60
[<
ffffffff8165a3fb>] apic_timer_interrupt+0x6b/0x70
[<
ffffffff81659129>] ? _raw_spin_unlock_irqrestore+0x9/0x10
[<
ffffffff8107eb9f>] __wake_up_sync_key+0x4f/0x60
[<
ffffffffa313ddd1>] tipc_write_space+0x31/0x40 [tipc]
[<
ffffffffa313dadf>] filter_rcv+0x31f/0x520 [tipc]
[<
ffffffffa313d699>] ? tipc_sk_lookup+0xc9/0x110 [tipc]
[<
ffffffff81659259>] ? _raw_spin_lock_bh+0x19/0x30
[<
ffffffffa314122c>] tipc_sk_rcv+0x2dc/0x3e0 [tipc]
[<
ffffffffa312e7ff>] tipc_bclink_wakeup_users+0x2f/0x40 [tipc]
[<
ffffffffa313ce26>] tipc_node_unlock+0x186/0x190 [tipc]
[<
ffffffff81597c1c>] ? kfree_skb+0x2c/0x40
[<
ffffffffa313475c>] tipc_rcv+0x2ac/0x8c0 [tipc]
[<
ffffffffa312ff58>] tipc_l2_rcv_msg+0x38/0x50 [tipc]
[<
ffffffff815a76d3>] __netif_receive_skb_core+0x5a3/0x950
[<
ffffffff815a98d3>] __netif_receive_skb+0x13/0x60
[<
ffffffff815a993e>] netif_receive_skb_internal+0x1e/0x90
[<
ffffffff815aa138>] napi_gro_receive+0x78/0xa0
[<
ffffffffa07f93f4>] tg3_poll_work+0xc54/0xf40 [tg3]
[<
ffffffff81597c8c>] ? consume_skb+0x2c/0x40
[<
ffffffffa07f9721>] tg3_poll_msix+0x41/0x160 [tg3]
[<
ffffffff815ab0f2>] net_rx_action+0xe2/0x290
[<
ffffffff8104b92a>] __do_softirq+0xda/0x1f0
[<
ffffffff8104bc26>] irq_exit+0x76/0xa0
[<
ffffffff81004355>] do_IRQ+0x55/0xf0
[<
ffffffff8165a12b>] common_interrupt+0x6b/0x6b
<EOI>
The issue occurs only when tipc_sk_rcv() is used to wake up postponed
senders:
tipc_bclink_wakeup_users()
// wakeupq - is a queue which consists of special
// messages with SOCK_WAKEUP type.
tipc_sk_rcv(wakeupq)
...
while (skb_queue_len(inputq)) {
filter_rcv(skb)
// Here the type of message is checked
// and if it is SOCK_WAKEUP then
// it tries to wake up a sender.
tipc_write_space(sk)
wake_up_interruptible_sync_poll()
}
After the sender thread is woke up it can gather control and perform
an attempt to send a message. But if there is no enough place in send
queue it will call link_schedule_user() function which puts a message
of type SOCK_WAKEUP to the wakeup queue and put the sender to sleep.
Thus the size of the queue actually is not changed and the while()
loop never exits.
The approach I proposed is to wake up only senders for which there is
enough place in send queue so the described issue can't occur.
Moreover the same approach is already used to wake up senders on
unicast links.
I have got into the issue on our product code but to reproduce the
issue I changed a benchmark test application (from
tipcutils/demos/benchmark) to perform the following scenario:
1. Run 64 instances of test application (nodes). It can be done
on the one physical machine.
2. Each application connects to all other using TIPC sockets in
RDM mode.
3. When setup is done all nodes start simultaneously send
broadcast messages.
4. Everything hangs up.
The issue is reproducible only when a congestion on broadcast link
occurs. For example, when there are only 8 nodes it works fine since
congestion doesn't occur. Send queue limit is 40 in my case (I use a
critical importance level) and when 64 nodes send a message at the
same moment a congestion occurs every time.
Signed-off-by: Dmitry S Kolmakov <kolmakov.dmitriy@huawei.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Barry Song [Mon, 7 Sep 2015 03:15:20 +0000 (03:15 +0000)]
dm9000: fix a typo
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 6 Sep 2015 01:49:41 +0000 (21:49 -0400)]
net: bridge: remove unnecessary switchdev include
Remove the unnecessary switchdev.h include from br_netlink.c.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sun, 6 Sep 2015 01:27:57 +0000 (21:27 -0400)]
net: bridge: check __vlan_vid_del for error
Since __vlan_del can return an error code, change its inner function
__vlan_vid_del to return an eventual error from switchdev_port_obj_del.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Sat, 5 Sep 2015 20:07:27 +0000 (13:07 -0700)]
net: dsa: bcm_sf2: Fix ageing conditions and operation
The comparison check between cur_hw_state and hw_state is currently
invalid because cur_hw_state is right shifted by G_MISTP_SHIFT, while
hw_state is not, so we end-up comparing bits 2:0 with bits 7:5, which is
going to cause an additional aging to occur. Fix this by not shifting
cur_hw_state while reading it, but instead, mask the value with the
appropriately shitfted bitmask.
The other problem with the fast-ageing process is that we did not set
the EN_AGE_DYNAMIC bit to request the ageing to occur for dynamically
learned MAC addresses. Finally, write back 0 to the FAST_AGE_CTRL
register to avoid leaving spurious bits sets from one operation to the
other.
Fixes: 12f460f23423 ("net: dsa: bcm_sf2: add HW bridging support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julien Grall [Thu, 3 Sep 2015 22:59:50 +0000 (23:59 +0100)]
device property: Don't overwrite addr when failing in device_get_mac_address
The function device_get_mac_address is trying different property names
in order to get the mac address. To check the return value, the variable
addr (which contain the buffer pass by the caller) will be re-used. This
means that if the previous property is not found, the next property will
be read using a NULL buffer.
Therefore it's only possible to retrieve the mac if node contains a
property "mac-address". Fix it by using a temporary buffer for the
return value.
This has been introduced by commit
4c96b7dc0d393f12c17e0d81db15aa4a820a6ab3
"Add a matching set of device_ functions for determining mac/phy"
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Jeremy Linton <jeremy.linton@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eugene Shatokhin [Tue, 1 Sep 2015 14:05:33 +0000 (17:05 +0300)]
usbnet: Fix a race between usbnet_stop() and the BH
The race may happen when a device (e.g. YOTA 4G LTE Modem) is
unplugged while the system is downloading a large file from the Net.
Hardware breakpoints and Kprobes with delays were used to confirm that
the race does actually happen.
The race is on skb_queue ('next' pointer) between usbnet_stop()
and rx_complete(), which, in turn, calls usbnet_bh().
Here is a part of the call stack with the code where the changes to the
queue happen. The line numbers are for the kernel 4.1.0:
*0 __skb_unlink (skbuff.h:1517)
prev->next = next;
*1 defer_bh (usbnet.c:430)
spin_lock_irqsave(&list->lock, flags);
old_state = entry->state;
entry->state = state;
__skb_unlink(skb, list);
spin_unlock(&list->lock);
spin_lock(&dev->done.lock);
__skb_queue_tail(&dev->done, skb);
if (dev->done.qlen == 1)
tasklet_schedule(&dev->bh);
spin_unlock_irqrestore(&dev->done.lock, flags);
*2 rx_complete (usbnet.c:640)
state = defer_bh(dev, skb, &dev->rxq, state);
At the same time, the following code repeatedly checks if the queue is
empty and reads these values concurrently with the above changes:
*0 usbnet_terminate_urbs (usbnet.c:765)
/* maybe wait for deletions to finish. */
while (!skb_queue_empty(&dev->rxq)
&& !skb_queue_empty(&dev->txq)
&& !skb_queue_empty(&dev->done)) {
schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS));
set_current_state(TASK_UNINTERRUPTIBLE);
netif_dbg(dev, ifdown, dev->net,
"waited for %d urb completions\n", temp);
}
*1 usbnet_stop (usbnet.c:806)
if (!(info->flags & FLAG_AVOID_UNLINK_URBS))
usbnet_terminate_urbs(dev);
As a result, it is possible, for example, that the skb is removed from
dev->rxq by __skb_unlink() before the check
"!skb_queue_empty(&dev->rxq)" in usbnet_terminate_urbs() is made. It is
also possible in this case that the skb is added to dev->done queue
after "!skb_queue_empty(&dev->done)" is checked. So
usbnet_terminate_urbs() may stop waiting and return while dev->done
queue still has an item.
Locking in defer_bh() and usbnet_terminate_urbs() was revisited to avoid
this race.
Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Reviewed-by: Bjørn Mork <bjorn@mork.no>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Fri, 4 Sep 2015 21:05:42 +0000 (23:05 +0200)]
cxgb4: fix usage of uninitialized variable
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c: In function ‘init_one’:
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4579:8: warning: ‘chip’ may be used uninitialized in this function [-Wmaybe-uninitialized]
chip |= CHELSIO_CHIP_CODE(CHELSIO_T4, pl_rev);
^
drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c:4571:11: note: ‘chip’ was declared here
int ver, chip;
^
Fixes: d86bd29e0b31 ("cxgb4/cxgb4vf: read the correct bits of PL Who Am I register")
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Thu, 3 Sep 2015 20:22:16 +0000 (23:22 +0300)]
fixed_phy: pass 'irq' to fixed_phy_add()
I've noticed that fixed_phy_register() ignores its 'irq' parameter instead of
passing it to fixed_phy_add(). Luckily, fixed_phy_register() seems to always
be called with PHY_POLL for 'irq'... :-)
Fixes: a75951217472 ("net: phy: extend fixed driver with fixed_phy_register()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Stringer [Fri, 4 Sep 2015 20:07:40 +0000 (13:07 -0700)]
openvswitch: Remove conntrack Kconfig option.
There's no particular desire to have conntrack action support in Open
vSwitch as an independently configurable bit, rather just to ensure
there is not a hard dependency. This exposed option doesn't accurately
reflect the conntrack dependency when enabled, so simplify this by
removing the option. Compile the support if NF_CONNTRACK is enabled.
Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Fri, 4 Sep 2015 15:22:24 +0000 (11:22 -0400)]
net: dsa: mv88e6171: add hardware 802.1Q support
The Marvell
88E6171 switch is in the
88E6351 family, which supports
802.1Q, thus add support from the generic mv88e6xxx functions.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 7 Sep 2015 02:49:55 +0000 (19:49 -0700)]
Merge tag 'mac80211-for-davem-2015-09-04' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
For the first round of fixes, we have this:
* fix for the sizeof() pointer type issue
* a fix for regulatory getting into a restore loop
* a fix for rfkill global 'all' state, it needs to be stored
everywhere to apply correctly to new rfkill instances
* properly refuse CQM RSSI when it cannot actually be used
* protect HT TDLS traffic properly in non-HT networks
* don't incorrectly advertise 80 MHz support when not allowed
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Fri, 4 Sep 2015 12:44:12 +0000 (14:44 +0200)]
ethernet: synopsys: SYNOPSYS_DWC_ETH_QOS should depend on HAS_DMA
If NO_DMA=y:
ERROR: "dma_alloc_coherent" [drivers/net/ethernet/synopsys/dwc_eth_qos.ko] undefined!
ERROR: "dma_free_coherent" [drivers/net/ethernet/synopsys/dwc_eth_qos.ko] undefined!
ERROR: "dma_unmap_single" [drivers/net/ethernet/synopsys/dwc_eth_qos.ko] undefined!
ERROR: "dma_map_page" [drivers/net/ethernet/synopsys/dwc_eth_qos.ko] undefined!
ERROR: "dma_mapping_error" [drivers/net/ethernet/synopsys/dwc_eth_qos.ko] undefined!
ERROR: "dma_map_single" [drivers/net/ethernet/synopsys/dwc_eth_qos.ko] undefined!
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Lars Persson <larper@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Fri, 4 Sep 2015 10:49:32 +0000 (12:49 +0200)]
vxlan: Refactor vxlan_udp_encap_recv() to kill compiler warning
drivers/net/vxlan.c: In function ‘vxlan_udp_encap_recv’:
drivers/net/vxlan.c:1226: warning: ‘info’ may be used uninitialized in this function
While this warning is a false positive, it can be killed easily by
getting rid of the pointer intermediary and referring directly to the
ip_tunnel_info structure.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Fri, 4 Sep 2015 10:47:28 +0000 (12:47 +0200)]
lan78xx: Fix ladv/radv error handling in lan78xx_link_reset()
net/usb/lan78xx.c: In function ‘lan78xx_link_reset’:
net/usb/lan78xx.c:1107: warning: comparison is always false due to limited range of data type
net/usb/lan78xx.c:1111: warning: comparison is always false due to limited range of data type
Assigning return values that can be negative error codes to "u16"
variables makes them positive, ignoring the errors. Hence use "int"
instead.
Drop the "unlikely"s (unlikely considered harmful) and propagate the
actual error values instead of overriding them to -EIO while we're at
it.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Thu, 3 Sep 2015 20:24:52 +0000 (16:24 -0400)]
RDS: rds_conn_lookup() should factor in the struct net for a match
Only return a conn if the rds_conn_net(conn) matches the struct
net passed to rds_conn_lookup().
Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen endpoints,
one per netns.")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Maciej S. Szmigiero [Thu, 3 Sep 2015 19:38:30 +0000 (21:38 +0200)]
net: fec: normalize return value of pm_runtime_get_sync() in MDIO write
If fec MDIO write method succeeds its return value comes from
call to pm_runtime_get_sync().
But pm_runtime_get_sync() can also return 1.
In case of Micrel KSZ9031 PHY this value will then
be returned along the call chain of phy_write() ->
ksz9031_extended_write() -> ksz9031_center_flp_timing() ->
ksz9031_config_init() -> phy_init_hw() -> phy_attach_direct() ->
phy_connect_direct().
Then phy_connect() will cast it into a pointer using ERR_PTR(),
which then fec_enet_mii_probe() will try to dereference
resulting in an oops.
Fix it by normalizing return value of pm_runtime_get_sync()
to be zero if positive in MDIO write method.
Fixes: 8fff755e9f8d ("net: fec: Ensure clocks are enabled while using mdio bus")
Signed-off-by: Maciej Szmigiero <mail@maciej.szmigiero.name>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Thu, 3 Sep 2015 12:04:17 +0000 (14:04 +0200)]
switchdev: fix return value of switchdev_port_fdb_dump in case of error
switchdev_port_fdb_dump is used as .ndo_fdb_dump. Its return value is
idx, so we cannot return errval.
Fixes: 45d4122ca7cd ("switchdev: add support for fdb add/del/dump via switchdev_port_obj ops.")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Acked-by: Scott Feldman<sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Thu, 3 Sep 2015 11:41:53 +0000 (07:41 -0400)]
be2net: Revert "make the RX_FILTER command asynchronous" commit
The be_cmd_rx_filter() routine sends a non-embedded cmd to the FW and used
a pre-allocated dma memory to hold the cmd payload. This worked fine when
this cmd was synchronous. This cmd was changed to asynchronous mode by the
commit
8af65c2f4("make the RX_FILTER command asynchronous"). So now when
there are two quick invocations of this cmd, the 2nd request may end up
overwriting the first request, causing FW cmd corruption.
This patch reverts the offending commit and hence fixes the regression.
Fixes: 8af65c2f4("be2net: make the RX_FILTER command asynchronous")
Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 6 Sep 2015 00:36:30 +0000 (17:36 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Conflicts:
include/net/netfilter/nf_conntrack.h
The conflict was an overlap between changing the type of the zone
argument to nf_ct_tmpl_alloc() whilst exporting nf_ct_tmpl_free.
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net, they are:
1) Oneliner to restore maps in nf_tables since we support addressing registers
at 32 bits level.
2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
oneliner from Bernhard Thaler.
3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
KASan utility, patch from Jozsef Kadlecsik.
4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
unnamed unions, patch from Elad Raz.
5) Add a workaround to address inconsistent endianess in the res_id field of
nfnetlink batch messages, reported by Florian Westphal.
6) Fix error paths of CT/synproxy since the conntrack template was moved to use
kmalloc, patch from Daniel Borkmann.
All of them look good to me to reach 4.2, I can route this to -stable myself
too, just let me know what you prefer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sudip Mukherjee [Thu, 3 Sep 2015 06:00:30 +0000 (11:30 +0530)]
net: wan: sbni: fix device usage count
dev_get_by_name() will increment the usage count if the matching device
is found. But we were not decrementing the count if we have got the
device and the device is non-active.
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Laing [Thu, 3 Sep 2015 01:52:31 +0000 (13:52 +1200)]
net/ipv6: Correct PIM6 mrt_lock handling
In the IPv6 multicast routing code the mrt_lock was not being released
correctly in the MFC iterator, as a result adding or deleting a MIF would
cause a hang because the mrt_lock could not be acquired.
This fix is a copy of the code for the IPv4 case and ensures that the lock
is released correctly.
Signed-off-by: Richard Laing <richard.laing@alliedtelesis.co.nz>
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Fri, 28 Aug 2015 08:44:20 +0000 (10:44 +0200)]
mac80211: reject software RSSI CQM with beacon filtering
When beacon filtering is enabled the mac80211 software implementation
for RSSI CQM cannot work as beacons will not be available. Rather than
accepting such a configuration without proper effect, reject it.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Arik Nemtsov [Sat, 15 Aug 2015 19:39:53 +0000 (22:39 +0300)]
mac80211: avoid VHT usage with no 80MHz chans allowed
Currently if 80MHz channels are not allowed for use, the VHT IE is not
included in the probe request for an AP. This is not good enough if the
AP is configured with the wrong regulatory and supports VHT even where
prohibited or in TDLS scenarios.
Mark the ifmgd with the DISABLE_VHT flag for the misbehaving-AP case, and
unset VHT support from the peer-station entry for the TDLS case.
Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Maciej S. Szmigiero [Wed, 2 Sep 2015 17:00:31 +0000 (19:00 +0200)]
cfg80211: regulatory: restore proper user alpha2
restore_regulatory_settings() should restore alpha2
as computed in restore_alpha2(), not raw user_alpha2 to
behave as described in the comment just above that code.
This fixes endless loop of calling CRDA for "00" and "97"
countries after resume from suspend on my laptop.
Looks like others had the same problem, too:
http://ath9k-devel.ath9k.narkive.com/knY5W6St/ath9k-and-crda-messages-in-logs
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/899335
https://forum.porteus.org/viewtopic.php?t=4975&p=36436
https://forums.opensuse.org/showthread.php/483356-Authentication-Regulatory-Domain-issues-ath5k-12-2
Signed-off-by: Maciej Szmigiero <mail@maciej.szmigiero.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
João Paulo Rechi Vita [Tue, 25 Aug 2015 12:56:43 +0000 (08:56 -0400)]
rfkill: Copy "all" global state to other types
When switching the state of all RFKill switches of type all we need to
replicate the RFKILL_TYPE_ALL global state to all the other types global
state, so it is used to initialize persistent RFKill switches on
register.
Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Avri Altman [Tue, 18 Aug 2015 13:52:07 +0000 (16:52 +0300)]
mac80211: protect non-HT BSS when HT TDLS traffic exists
HT TDLS traffic should be protected in a non-HT BSS to avoid
collisions. Therefore, when TDLS peers join/leave, check if
protection is (now) needed and set the ht_operation_mode of
the virtual interface according to the HT capabilities of the
TDLS peer(s).
This works because a non-HT BSS connection never sets (or
otherwise uses) the ht_operation_mode; it just means that
drivers must be aware that this field applies to all HT
traffic for this virtual interface, not just the traffic
within the BSS. Document that.
Signed-off-by: Avri Altman <avri.altman@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Thierry Reding [Wed, 26 Aug 2015 10:22:14 +0000 (12:22 +0200)]
mac80211: Do not use sizeof() on pointer type
The rate_control_cap_mask() function takes a parameter mcs_mask, which
GCC will take to be u8 * even though it was declared with a fixed size.
This causes the following warning:
net/mac80211/rate.c: In function 'rate_control_cap_mask':
net/mac80211/rate.c:719:25: warning: 'sizeof' on array function parameter 'mcs_mask' will return size of 'u8 * {aka unsigned char *}' [-Wsizeof-array-argument]
for (i = 0; i < sizeof(mcs_mask); i++)
^
net/mac80211/rate.c:684:10: note: declared here
u8 mcs_mask[IEEE80211_HT_MCS_MASK_LEN],
^
This can be easily fixed by using the IEEE80211_HT_MCS_MASK_LEN directly
within the loop condition.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
David S. Miller [Thu, 3 Sep 2015 22:43:06 +0000 (15:43 -0700)]
Merge branch 'sctp-fixes'
Marcelo Ricardo Leitner says:
====================
couple of sctp fixes for
0ca50d12fe46
These are two fixes for sctp after my patch on
0ca50d12fe46 ("sctp: fix
src address selection if using secondary addresses")
The first, fix a dst leak on those it decided to skip.
The second, adds the fallback on src selection that Vlad had asked
about. Unfortunatelly a lot of ipvs setups relies on the old behavior
and I don't see a better fix for it.
Please consider both to -stable tree.
====================
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Wed, 2 Sep 2015 19:20:22 +0000 (16:20 -0300)]
sctp: add routing output fallback
Commit
0ca50d12fe46 added a restriction that the address must belong to
the output interface, so that sctp will use the right interface even
when using secondary addresses.
But it breaks IPVS setups, on which people is used to attach VIP
addresses to loopback interface on real servers. It's preferred to
attach to the interface actually in use, but it's a very common setup
and that used to work.
This patch then saves the first routing good result, even if it would be
going out through an interface that doesn't have that address. If no
better hit found, it's then used. This effectively restores the original
behavior if no better interface could be found.
Fixes: 0ca50d12fe46 ("sctp: fix src address selection if using secondary addresses")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Wed, 2 Sep 2015 19:20:21 +0000 (16:20 -0300)]
sctp: fix dst leak
Commit
0ca50d12fe46 failed to release the reference to dst entries that
it decided to skip.
Fixes: 0ca50d12fe46 ("sctp: fix src address selection if using secondary addresses")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atsushi Nemoto [Wed, 2 Sep 2015 08:49:29 +0000 (17:49 +0900)]
net: eth: altera: fix napi poll_list corruption
tse_poll() calls __napi_complete() with irq enabled. This leads napi
poll_list corruption and may stop all napi drivers working.
Use napi_complete() instead of __napi_complete().
Signed-off-by: Atsushi Nemoto <nemoto@toshiba-tops.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 3 Sep 2015 15:08:17 +0000 (08:08 -0700)]
Merge git://git./linux/kernel/git/davem/net-next
Pull networking updates from David Miller:
"Another merge window, another set of networking changes. I've heard
rumblings that the lightweight tunnels infrastructure has been voted
networking change of the year. But what do I know?
1) Add conntrack support to openvswitch, from Joe Stringer.
2) Initial support for VRF (Virtual Routing and Forwarding), which
allows the segmentation of routing paths without using multiple
devices. There are some semantic kinks to work out still, but
this is a reasonably strong foundation. From David Ahern.
3) Remove spinlock fro act_bpf fast path, from Alexei Starovoitov.
4) Ignore route nexthops with a link down state in ipv6, just like
ipv4. From Andy Gospodarek.
5) Remove spinlock from fast path of act_gact and act_mirred, from
Eric Dumazet.
6) Document the DSA layer, from Florian Fainelli.
7) Add netconsole support to bcmgenet, systemport, and DSA. Also
from Florian Fainelli.
8) Add Mellanox Switch Driver and core infrastructure, from Jiri
Pirko.
9) Add support for "light weight tunnels", which allow for
encapsulation and decapsulation without bearing the overhead of a
full blown netdevice. From Thomas Graf, Jiri Benc, and a cast of
others.
10) Add Identifier Locator Addressing support for ipv6, from Tom
Herbert.
11) Support fragmented SKBs in iwlwifi, from Johannes Berg.
12) Allow perf PMUs to be accessed from eBPF programs, from Kaixu Xia.
13) Add BQL support to 3c59x driver, from Loganaden Velvindron.
14) Stop using a zero TX queue length to mean that a device shouldn't
have a qdisc attached, use an explicit flag instead. From Phil
Sutter.
15) Use generic geneve netdevice infrastructure in openvswitch, from
Pravin B Shelar.
16) Add infrastructure to avoid re-forwarding a packet in software
that was already forwarded by a hardware switch. From Scott
Feldman.
17) Allow AF_PACKET fanout function to be implemented in a bpf
program, from Willem de Bruijn"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1458 commits)
netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
net: fec: clear receive interrupts before processing a packet
ipv6: fix exthdrs offload registration in out_rt path
xen-netback: add support for multicast control
bgmac: Update fixed_phy_register()
sock, diag: fix panic in sock_diag_put_filterinfo
flow_dissector: Use 'const' where possible.
flow_dissector: Fix function argument ordering dependency
ixgbe: Resolve "initialized field overwritten" warnings
ixgbe: Remove bimodal SR-IOV disabling
ixgbe: Add support for reporting 2.5G link speed
ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
ixgbe: support for ethtool set_rxfh
ixgbe: Avoid needless PHY access on copper phys
ixgbe: cleanup to use cached mask value
ixgbe: Remove second instance of lan_id variable
ixgbe: use kzalloc for allocating one thing
flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
ixgbe: Remove unused PCI bus types
...
Linus Torvalds [Wed, 2 Sep 2015 23:35:26 +0000 (16:35 -0700)]
Merge tag 'dm-4.3-changes' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper update from Mike Snitzer:
- a couple small cleanups in dm-cache, dm-verity, persistent-data's
dm-btree, and DM core.
- a 4.1-stable fix for dm-cache that fixes the leaking of deferred bio
prison cells
- a 4.2-stable fix that adds feature reporting for the dm-stats
features added in 4.2
- improve DM-snapshot to not invalidate the on-disk snapshot if
snapshot device write overflow occurs; but a write overflow triggered
through the origin device will still invalidate the snapshot.
- optimize DM-thinp's async discard submission a bit now that late bio
splitting has been included in block core.
- switch DM-cache's SMQ policy lock from using a mutex to a spinlock;
improves performance on very low latency devices (eg. NVMe SSD).
- document DM RAID 4/5/6's discard support
[ I did not pull the slab changes, which weren't appropriate for this
tree, and weren't obviously the right thing to do anyway. At the very
least they need some discussion and explanation before getting merged.
Because not pulling the actual tagged commit but doing a partial pull
instead, this merge commit thus also obviously is missing the git
signature from the original tag ]
* tag 'dm-4.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm cache: fix use after freeing migrations
dm cache: small cleanups related to deferred prison cell cleanup
dm cache: fix leaking of deferred bio prison cells
dm raid: document RAID 4/5/6 discard support
dm stats: report precise_timestamps and histogram in @stats_list output
dm thin: optimize async discard submission
dm snapshot: don't invalidate on-disk image on snapshot write overflow
dm: remove unlikely() before IS_ERR()
dm: do not override error code returned from dm_get_device()
dm: test return value for DM_MAPIO_SUBMITTED
dm verity: remove unused mempool
dm cache: move wake_waker() from free_migrations() to where it is needed
dm btree remove: remove unused function get_nr_entries()
dm btree: remove unused "dm_block_t root" parameter in btree_split_sibling()
dm cache policy smq: change the mutex to a spinlock
Daniel Borkmann [Wed, 2 Sep 2015 23:26:07 +0000 (01:26 +0200)]
netfilter: nf_conntrack: make nf_ct_zone_dflt built-in
Fengguang reported, that some randconfig generated the following linker
issue with nf_ct_zone_dflt object involved:
[...]
CC init/version.o
LD init/built-in.o
net/built-in.o: In function `ipv4_conntrack_defrag':
nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
net/built-in.o: In function `ipv6_defrag':
nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
make: *** [vmlinux] Error 1
Given that configurations exist where we have a built-in part, which is
accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
area when netfilter is configured in general.
Therefore, split the more generic parts into a common header under
include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
section that already holds parts related to CONFIG_NF_CONNTRACK in the
netfilter core. This fixes the issue on my side.
Fixes: 308ac9143ee2 ("netfilter: nf_conntrack: push zone object into functions")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 2 Sep 2015 18:54:02 +0000 (20:54 +0200)]
netfilter: nf_dup{4, 6}: fix build error when nf_conntrack disabled
While testing various Kconfig options on another issue, I found that
the following one triggers as well on allmodconfig and nf_conntrack
disabled:
net/ipv4/netfilter/nf_dup_ipv4.c: In function ‘nf_dup_ipv4’:
net/ipv4/netfilter/nf_dup_ipv4.c:72:20: error: ‘nf_skb_duplicated’ undeclared (first use in this function)
if (this_cpu_read(nf_skb_duplicated))
[...]
net/ipv6/netfilter/nf_dup_ipv6.c: In function ‘nf_dup_ipv6’:
net/ipv6/netfilter/nf_dup_ipv6.c:66:20: error: ‘nf_skb_duplicated’ undeclared (first use in this function)
if (this_cpu_read(nf_skb_duplicated))
Fix it by including directly the header where it is defined.
Fixes: bbde9fc1824a ("netfilter: factor out packet duplication for IPv4/IPv6")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Wed, 2 Sep 2015 09:24:14 +0000 (17:24 +0800)]
net: fec: clear receive interrupts before processing a packet
The patch just to re-submit the patch "
db3421c114cfa6326" because the
patch "
4d494cdc92b3b9a0" remove the change.
Clear any pending receive interrupt before we process a pending packet.
This helps to avoid any spurious interrupts being raised after we have
fully cleaned the receive ring, while still allowing an interrupt to be
raised if we receive another packet.
The position of this is critical: we must do this prior to reading the
next packet status to avoid potentially dropping an interrupt when a
packet is still pending.
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 2 Sep 2015 22:29:07 +0000 (00:29 +0200)]
ipv6: fix exthdrs offload registration in out_rt path
We previously register IPPROTO_ROUTING offload under inet6_add_offload(),
but in error path, we try to unregister it with inet_del_offload(). This
doesn't seem correct, it should actually be inet6_del_offload(), also
ipv6_exthdrs_offload_exit() from that commit seems rather incorrect (it
also uses rthdr_offload twice), but it got removed entirely later on.
Fixes: 3336288a9fea ("ipv6: Switch to using new offload infrastructure.")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 2 Sep 2015 20:22:38 +0000 (13:22 -0700)]
Merge branch 'for-4.3/sg' of git://git.kernel.dk/linux-block
Pull SG updates from Jens Axboe:
"This contains a set of scatter-gather related changes/fixes for 4.3:
- Add support for limited chaining of sg tables even for
architectures that do not set ARCH_HAS_SG_CHAIN. From Christoph.
- Add sg chain support to target_rd. From Christoph.
- Fixup open coded sg->page_link in crypto/omap-sham. From
Christoph.
- Fixup open coded crypto ->page_link manipulation. From Dan.
- Also from Dan, automated fixup of manual sg_unmark_end()
manipulations.
- Also from Dan, automated fixup of open coded sg_phys()
implementations.
- From Robert Jarzmik, addition of an sg table splitting helper that
drivers can use"
* 'for-4.3/sg' of git://git.kernel.dk/linux-block:
lib: scatterlist: add sg splitting function
scatterlist: use sg_phys()
crypto/omap-sham: remove an open coded access to ->page_link
scatterlist: remove open coded sg_unmark_end instances
crypto: replace scatterwalk_sg_chain with sg_chain
target/rd: always chain S/G list
scatterlist: allow limited chaining without ARCH_HAS_SG_CHAIN
Linus Torvalds [Wed, 2 Sep 2015 20:14:58 +0000 (13:14 -0700)]
Merge branch 'for-4.3/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
"On top of the 4.3 core block IO changes, here are the driver related
changes for 4.3. Basically just NVMe and nbd this time around:
- NVMe:
- PRACT PI improvement from Alok Pandey.
- Cleanups and improvements on submission queue doorbell and
writing, using CMB if available. From Jon Derrick.
- From Keith, support for setting queue maximum segments, and
reset support.
- Also from Jon, fixup of u64 division issue on 32-bit archs and
wiring up of the reset support through and ioctl.
- Two small cleanups from Matias and Sunad
- Various code cleanups and fixes from Markus Pargmann"
* 'for-4.3/drivers' of git://git.kernel.dk/linux-block:
NVMe: Using PRACT bit to generate and verify PI by controller
NVMe:Remove unreachable code in nvme_abort_req
NVMe: Add nvme subsystem reset IOCTL
NVMe: Add nvme subsystem reset support
NVMe: removed unused nn var from nvme_dev_add
NVMe: Set queue max segments
nbd: flags is a u32 variable
nbd: Rename functions for clearness of recv/send path
nbd: Change 'disconnect' to be boolean
nbd: Add debugfs entries
nbd: Remove variable 'pid'
nbd: Move clear queue debug message
nbd: Remove 'harderror' and propagate error properly
nbd: restructure sock_shutdown
nbd: sock_shutdown, remove conditional lock
nbd: Fix timeout detection
nvme: Fixes u64 division which breaks i386 builds
NVMe: Use CMB for the IO SQes if available
NVMe: Unify SQ entry writing and doorbell ringing
Linus Torvalds [Wed, 2 Sep 2015 20:10:25 +0000 (13:10 -0700)]
Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block
Pull core block updates from Jens Axboe:
"This first core part of the block IO changes contains:
- Cleanup of the bio IO error signaling from Christoph. We used to
rely on the uptodate bit and passing around of an error, now we
store the error in the bio itself.
- Improvement of the above from myself, by shrinking the bio size
down again to fit in two cachelines on x86-64.
- Revert of the max_hw_sectors cap removal from a revision again,
from Jeff Moyer. This caused performance regressions in various
tests. Reinstate the limit, bump it to a more reasonable size
instead.
- Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
Most devices have huge trim limits, which can cause nasty latencies
when deleting files. Enable the admin to configure the size down.
We will look into having a more sane default instead of UINT_MAX
sectors.
- Improvement of the SGP gaps logic from Keith Busch.
- Enable the block core to handle arbitrarily sized bios, which
enables a nice simplification of bio_add_page() (which is an IO hot
path). From Kent.
- Improvements to the partition io stats accounting, making it
faster. From Ming Lei.
- Also from Ming Lei, a basic fixup for overflow of the sysfs pending
file in blk-mq, as well as a fix for a blk-mq timeout race
condition.
- Ming Lin has been carrying Kents above mentioned patches forward
for a while, and testing them. Ming also did a few fixes around
that.
- Sasha Levin found and fixed a use-after-free problem introduced by
the bio->bi_error changes from Christoph.
- Small blk cgroup cleanup from Viresh Kumar"
* 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
blk: Fix bio_io_vec index when checking bvec gaps
block: Replace SG_GAPS with new queue limits mask
block: bump BLK_DEF_MAX_SECTORS to 2560
Revert "block: remove artifical max_hw_sectors cap"
blk-mq: fix race between timeout and freeing request
blk-mq: fix buffer overflow when reading sysfs file of 'pending'
Documentation: update notes in biovecs about arbitrarily sized bios
block: remove bio_get_nr_vecs()
fs: use helper bio_add_page() instead of open coding on bi_io_vec
block: kill merge_bvec_fn() completely
md/raid5: get rid of bio_fits_rdev()
md/raid5: split bio for chunk_aligned_read
block: remove split code in blkdev_issue_{discard,write_same}
btrfs: remove bio splitting and merge_bvec_fn() calls
bcache: remove driver private bio splitting code
block: simplify bio_add_page()
block: make generic_make_request handle arbitrarily sized bios
blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
block: don't access bio->bi_error after bio_put()
block: shrink struct bio down to 2 cache lines again
...
Linus Torvalds [Wed, 2 Sep 2015 19:22:54 +0000 (12:22 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull first round of SCSI updates from James Bottomley:
"This includes one new driver: cxlflash plus the usual grab bag of
updates for the major drivers: qla2xxx, ipr, storvsc, pm80xx, hptiop,
plus a few assorted fixes.
There's another tranch coming, but I want to incubate it another few
days in the checkers, plus it includes a mpt2sas separated lifetime
fix, which Avago won't get done testing until Friday"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (85 commits)
aic94xx: set an error code on failure
storvsc: Set the error code correctly in failure conditions
storvsc: Allow write_same when host is windows 10
storvsc: use storage protocol version to determine storage capabilities
storvsc: use correct defaults for values determined by protocol negotiation
storvsc: Untangle the storage protocol negotiation from the vmbus protocol negotiation.
storvsc: Use a single value to track protocol versions
storvsc: Rather than look for sets of specific protocol versions, make decisions based on ranges.
cxlflash: Remove unused variable from queuecommand
cxlflash: shift wrapping bug in afu_link_reset()
cxlflash: off by one bug in cxlflash_show_port_status()
cxlflash: Virtual LUN support
cxlflash: Superpipe support
cxlflash: Base error recovery support
qla2xxx: Update driver version to 8.07.00.26-k
qla2xxx: Add pci device id 0x2261.
qla2xxx: Fix missing device login retries.
qla2xxx: do not clear slot in outstanding cmd array
qla2xxx: Remove decrement of sp reference count in abort handler.
qla2xxx: Add support to show MPI and PEP FW version for ISP27xx.
...
Linus Torvalds [Wed, 2 Sep 2015 19:16:24 +0000 (12:16 -0700)]
Merge tag 'for-linus-
20150901' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"SPI NOR:
- reduce virtual address space requirements for fsl-quadspi memory map
- new fsl-quadspi IP support: imx6ul-qspi and imx7d-qspi
- add new NOR flash device support
- add new driver for NXP SPI Flash Interface (SPIFI)
- stop abusing SPI API structs for non-SPI framework
- fixup DT table matching for new "jedec,spi-nor" string
NAND:
- brcmnand: fix big endian MIPS macro usage
- denali: refactor to use devres, dev_*() printing, etc.
- OMAP ELM: change the module alias to actually be usable
- pxa3xx_nand: fixup a few command sequencing issues -- both new and old
- race conditions in the IRQ handler status clearing
- problems when a bootloader left interrupts pending
- config issues when overriding the bootloader configuration
- new flash device support
- sunxi_nand:
- optimize timing configuration by calculation, rather than fixed
fail-safe values
- use EDO setting from ONFI
- r852: fix compiler warnings
- davinci: add 4KB page support
Core:
- oobtest: correct debug print information"
* tag 'for-linus-
20150901' of git://git.infradead.org/linux-mtd: (42 commits)
mtd: mtd_oobtest: Fix the address offset with vary_offset case
mtd: blkdevs: fix switch-bool compilation warning
mtd: spi-nor: stop (ab)using struct spi_device_id
mtd: nand: add Toshiba TC58NVG0S3E to nand_ids table
mtd: dataflash: Export OF module alias information
nand: pxa3xx: Increase READ_ID buffer and make the size static
mtd: nand: pxa3xx-nand: fix random command timeouts
mtd: nand: pxa3xx_nand: fix early spurious interrupt
mtd: pxa3xx_nand: add a default chunk size
mtd: omap_elm: Fix module alias
mtd: physmap_of: fix null pointer deference when kzalloc returns null
mtd: nettel: do not ignore mtd_device_register() failure in nettel_init()
mtd: denali_pci: switch to dev_err()
mtd: denali_pci: refactor driver using devres API
mtd: denali_pci: use module_pci_driver() macro
mtd: denali: hide core part from user in Kconfig
mtd: spi-nor: add Spansion S25FL204K support
mtd: spi-nor: Improve Kconfig help text for SPI_FSL_QUADSPI
mtd: spi-nor: add driver for NXP SPI Flash Interface (SPIFI)
doc: dt: add documentation for nxp,lpc1773-spifi
...
Paul Durrant [Wed, 2 Sep 2015 16:58:36 +0000 (17:58 +0100)]
xen-netback: add support for multicast control
Xen's PV network protocol includes messages to add/remove ethernet
multicast addresses to/from a filter list in the backend. This allows
the frontend to request the backend only forward multicast packets
which are of interest thus preventing unnecessary noise on the shared
ring.
The canonical netif header in git://xenbits.xen.org/xen.git specifies
the message format (two more XEN_NETIF_EXTRA_TYPEs) so the minimal
necessary changes have been pulled into include/xen/interface/io/netif.h.
To prevent the frontend from extending the multicast filter list
arbitrarily a limit (XEN_NETBK_MCAST_MAX) has been set to 64 entries.
This limit is not specified by the protocol and so may change in future.
If the limit is reached then the next XEN_NETIF_EXTRA_TYPE_MCAST_ADD
sent by the frontend will be failed with NETIF_RSP_ERROR.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabio Estevam [Wed, 2 Sep 2015 16:25:59 +0000 (13:25 -0300)]
bgmac: Update fixed_phy_register()
Commit
a5597008dbc2 ("phy: fixed_phy: Add gpio to determine link up/down.")
added a new argument to fixed_phy_register(), but missed to update bgmac
driver, causing the following build failure:
drivers/net/ethernet/broadcom/bgmac.c:1450:2: error: too few arguments to function 'fixed_phy_register'
Add the missing argument.
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Wed, 2 Sep 2015 12:00:36 +0000 (14:00 +0200)]
sock, diag: fix panic in sock_diag_put_filterinfo
diag socket's sock_diag_put_filterinfo() dumps classic BPF programs
upon request to user space (ss -0 -b). However, native eBPF programs
attached to sockets (SO_ATTACH_BPF) cannot be dumped with this method:
Their orig_prog is always NULL. However, sock_diag_put_filterinfo()
unconditionally tries to access its filter length resp. wants to copy
the filter insns from there. Internal cBPF to eBPF transformations
attached to sockets don't have this issue, as orig_prog state is kept.
It's currently only used by packet sockets. If we would want to add
native eBPF support in the future, this needs to be done through
a different attribute than PACKET_DIAG_FILTER to not confuse possible
user space disassemblers that work on diag data.
Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 2 Sep 2015 15:04:23 +0000 (08:04 -0700)]
Merge branch 'for-4.3' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- a new PIDs controller is added. It turns out that PIDs are actually
an independent resource from kmem due to the limited PID space.
- more core preparations for the v2 interface. Once cpu side interface
is settled, it should be ready for lifting the devel mask.
for-4.3-unified-base was temporarily branched so that other trees
(block) can pull cgroup core changes that blkcg changes depend on.
- a non-critical idr_preload usage bug fix.
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: pids: fix invalid get/put usage
cgroup: introduce cgroup_subsys->legacy_name
cgroup: don't print subsystems for the default hierarchy
cgroup: make cftype->private a unsigned long
cgroup: export cgrp_dfl_root
cgroup: define controller file conventions
cgroup: fix idr_preload usage
cgroup: add documentation for the PIDs controller
cgroup: implement the PIDs subsystem
cgroup: allow a cgroup subsystem to reject a fork
Linus Torvalds [Wed, 2 Sep 2015 15:03:25 +0000 (08:03 -0700)]
Merge branch 'for-4.3' of git://git./linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:
"Minor cleanups"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: clean up of schunk->map[] assignment in pcpu_setup_first_chunk
percpu: update incorrect comment for this_cpu_*() operations
Linus Torvalds [Wed, 2 Sep 2015 15:02:20 +0000 (08:02 -0700)]
Merge branch 'for-4.3' of git://git./linux/kernel/git/tj/wq
Pull workqueue updates from Tejun Heo:
"Only three trivial changes for workqueue this time - doc, MAINTAINERS
and EXPORT_SYMBOL updates"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: fix some docbook warnings
workqueue: Make flush_workqueue() available again to non GPL modules
workqueue: add myself as a dedicated reviwer
Linus Torvalds [Wed, 2 Sep 2015 15:00:54 +0000 (08:00 -0700)]
Merge branch 'for-4.3' of git://git./linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:
"Nothing interesting. A couple device specific minor updates and a
kernel doc change"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: pata_arasam_cf: Use devm_clk_get
libata: fix libata-core.c kernel-doc warning
ata: sata_rcar: Remove obsolete sata-r8a779* platform_device_id entries
David S. Miller [Wed, 2 Sep 2015 04:19:17 +0000 (21:19 -0700)]
flow_dissector: Use 'const' where possible.
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Wed, 2 Sep 2015 01:11:04 +0000 (18:11 -0700)]
flow_dissector: Fix function argument ordering dependency
Commit
c6cc1ca7f4d70c ("flowi: Abstract out functions to get flow hash
based on flowi") introduced a bug in __skb_set_sw_hash where we
require a dependency on evaluating arguments in a function in order.
There is no such ordering enforced in C, so this incorrect. This
patch fixes that by splitting out the arguments. This bug was
found via a compiler warning that keys may be uninitialized.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 2 Sep 2015 03:17:17 +0000 (20:17 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-09-01
This series contains updates to i40e, ixgbe and ixgbevf.
Anjali fixes a bug in i40e where the port is not receiving multicast or VLAN
tagged packets in promiscuous mode. Which can occur when a software bridge
is created on top of the device.
Don adds support in ixgbe that indicates the presence of management firmware.
Added support for entering low power link up state on devices that support
it when the device is closing or suspending. Updated the driver to report
unknown bus speed and width since IOSF does not report a PCIe bus speed or
width for X550 devices. Also added the new bus type for integrated I/O
interface (IOSF). Cleaned up of redundant code in ixgbe.
Mark adds support for UDP-encapsulation transmit checksum and for VXLAN
receive offloads. Introduces a helper function to do the register access
and processing to avoid needless PHY access on copper PHYs. Added support
for reporting 2.5G link speed. Fixed warnings resulting from redundant
initializations of the get_bus_info field.
Maninder Singh updates the ixgbe driver to use kzalloc instead of kcalloc
for allocation of one thing.
Tom Barbette adds support for ethtool to change the rxfh indirection table
and/or key using ethtool interface.
Emil resolves an issue where users were not able to dynamically set number
of queues for 82598 via ethtool -L.
Alex Williamson removes bimodal SR-IOV disabling behavior since it is
confusing to users and results in a state where the PF is broken for other
uses unless the user sets sriov_numvfs to zero prior to unbinding the
device.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 2 Sep 2015 02:45:46 +0000 (19:45 -0700)]
Merge tag 'pm+acpi-4.3-rc1' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI updates from Rafael Wysocki:
"From the number of commits perspective, the biggest items are ACPICA
and cpufreq changes with the latter taking the lead (over 50 commits).
On the cpufreq front, there are many cleanups and minor fixes in the
core and governors, driver updates etc. We also have a new cpufreq
driver for Mediatek MT8173 chips.
ACPICA mostly updates its debug infrastructure and adds a number of
fixes and cleanups for a good measure.
The Operating Performance Points (OPP) framework is updated with new
DT bindings and support for them among other things.
We have a few updates of the generic power domains framework and a
reorganization of the ACPI device enumeration code and bus type
operations.
And a lot of fixes and cleanups all over.
Included is one branch from the MFD tree as it contains some
PM-related driver core and ACPI PM changes a few other commits are
based on.
Specifics:
- ACPICA update to upstream revision
20150818 including method
tracing extensions to allow more in-depth AML debugging in the
kernel and a number of assorted fixes and cleanups (Bob Moore, Lv
Zheng, Markus Elfring).
- ACPI sysfs code updates and a documentation update related to AML
method tracing (Lv Zheng).
- ACPI EC driver fix related to serialized evaluations of _Qxx
methods and ACPI tools updates allowing the EC userspace tool to be
built from the kernel source (Lv Zheng).
- ACPI processor driver updates preparing it for future introduction
of CPPC support and ACPI PCC mailbox driver updates (Ashwin
Chaugule).
- ACPI interrupts enumeration fix for a regression related to the
handling of IRQ attribute conflicts between MADT and the ACPI
namespace (Jiang Liu).
- Fixes related to ACPI device PM (Mika Westerberg, Srinidhi
Kasagar).
- ACPI device registration code reorganization to separate the
sysfs-related code and bus type operations from the rest (Rafael J
Wysocki).
- Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause,
Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss).
- ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan
Xinhui, Rafael J Wysocki).
- cpufreq core cleanups on top of the previous changes allowing it to
preseve its sysfs directories over system suspend/resume (Viresh
Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior).
- cpufreq fixes and cleanups related to governors (Viresh Kumar).
- cpufreq updates (core and the cpufreq-dt driver) related to the
turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz).
- New DT bindings for Operating Performance Points (OPP), support for
them in the OPP framework and in the cpufreq-dt driver plus related
OPP framework fixes and cleanups (Viresh Kumar).
- cpufreq powernv driver updates (Shilpasri G Bhat).
- New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen).
- Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups
and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean).
- intel_pstate driver updates including Skylake-S support, support
for enabling HW P-states per CPU and an additional vendor bypass
list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao).
- cpuidle core fixes related to the handling of coupled idle states
(Xunlei Pang).
- intel_idle driver updates including Skylake Client support and
support for freeze-mode-specific idle states (Len Brown).
- Driver core updates related to power management (Andy Shevchenko,
Rafael J Wysocki).
- Generic power domains framework fixes and cleanups (Jon Hunter,
Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson).
- Device PM QoS framework update to allow the latency tolerance
setting to be exposed to user space via sysfs (Mika Westerberg).
- devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect
exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas).
- System sleep support updates (Alan Stern, Len Brown, SungEun Kim).
- rockchip-io AVS support updates (Heiko Stuebner).
- PM core clocks support fixup (Colin Ian King).
- Power capping RAPL driver update including support for Skylake H/S
and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi).
- Generic device properties framework fixes related to the handling
of static (driver-provided) property sets (Andy Shevchenko).
- turbostat and cpupower updates (Len Brown, Shilpasri G Bhat,
Shreyas B Prabhu)"
* tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits)
cpufreq: speedstep-lib: Use monotonic clock
cpufreq: powernv: Increase the verbosity of OCC console messages
cpufreq: sfi: use kmemdup rather than duplicating its implementation
cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor()
cpufreq: rename cpufreq_real_policy as cpufreq_user_policy
cpufreq: remove redundant 'policy' field from user_policy
cpufreq: remove redundant 'governor' field from user_policy
cpufreq: update user_policy.* on success
cpufreq: use memcpy() to copy policy
cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event
cpufreq: mediatek: Add MT8173 cpufreq driver
dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings
PM / Domains: Fix typo in description of genpd_dev_pm_detach()
PM / Domains: Remove unusable governor dummies
PM / Domains: Make pm_genpd_init() available to modules
PM / domains: Align column headers and data in pm_genpd_summary output
powercap / RAPL: disable the 2nd power limit properly
tools: cpupower: Fix error when running cpupower monitor
PM / OPP: Drop unlikely before IS_ERR(_OR_NULL)
PM / OPP: Fix static checker warning (broken 64bit big endian systems)
...
Linus Torvalds [Wed, 2 Sep 2015 02:37:56 +0000 (19:37 -0700)]
Merge tag 'devicetree-for-4.3' of git://git./linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
- added Frank Rowand as DT maintainer in preparation for Grant's
retirement.
- generic MSI binding documentation and a few other minor doc updates
- fix long standing issue with DT platorm device unregistration
- fix loop forever bug in of_find_matching_node_by_address()
* tag 'devicetree-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
MAINTAINERS: Add Frank Rowand as DT maintainer
mtd: nand: pxa3xx: add optional dma for pxa architecture
Documentation: DT: cpsw: document missing compatible
Docs: dt: add generic MSI bindings
drivercore: Fix unregistration path of platform devices
of/address: Don't loop forever in of_find_matching_node_by_address().
of: Add vendor prefix for JEDEC Solid State Technology Association
of/platform: add function to populate default bus
of: Add vendor prefix for Sharp Corporation
Linus Torvalds [Wed, 2 Sep 2015 01:46:42 +0000 (18:46 -0700)]
Merge branch 'for-next' of git://git./linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
"The usual stuff from trivial tree for 4.3 (kerneldoc updates, printk()
fixes, Documentation and MAINTAINERS updates)"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
MAINTAINERS: update my e-mail address
mod_devicetable: add space before */
scsi: a100u2w: trivial typo in printk
i2c: Fix typo in i2c-bfin-twi.c
treewide: fix typos in comment blocks
Doc: fix trivial typo in SubmittingPatches
proportions: Spelling s/consitent/consistent/
dm: Spelling s/consitent/consistent/
aic7xxx: Fix typo in error message
pcmcia: Fix typo in locking documentation
scsi/arcmsr: Fix typos in error log
drm/nouveau/gr: Fix typo in nv10.c
[SCSI] Fix printk typos in drivers/scsi
staging: comedi: Grammar s/Enable support a/Enable support for a/
Btrfs: Spelling s/consitent/consistent/
README: GTK+ is a acronym
ASoC: omap: Fix typo in config option description
mm: tlb.c: Fix error message
ntfs: super.c: Fix error log
fix typo in Documentation/SubmittingPatches
...
Linus Torvalds [Wed, 2 Sep 2015 01:44:28 +0000 (18:44 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/livepatching
Pull livepatching fix from Jiri Kosina:
"Livepatching error handling fix"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: Improve error handling in klp_disable_func()
Linus Torvalds [Wed, 2 Sep 2015 01:39:09 +0000 (18:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
"to receive patches queued for 4.3 merge window in HID tree. Highlights:
- a lot of improvements (regarding supported features and devices) to
Wacom driver, from Aaron Skomra and Jason Gerecke
- a lot of functional fixes and support for large I2C transfer to
cp2112 driver, from Ellen Wang
- HW support improvements to RMI driver, from Andrew Duggan
- quite some small fixes and device ID additions all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (44 commits)
HID: wacom: wacom_setup_numbered_buttons is local to wacom_wac
HID: wacom: Add support for Express Key Remote.
HID: wacom: Set button bits based on a new numbered_buttons
HID: quirks: add QUIRK_NOGET for an other TPV touchscreen
HID: usbhid: Fix the check for HID_RESET_PENDING in hid_io_error
HID: i2c-hid: Only disable irq wake if it was successfully enabled during suspend
HID: wacom: Use tablet-provided touch height/width values for INTUOSHT
HID: gembird: add new driver to fix Gembird JPD-DualForce 2
HID: lenovo: Hide middle-button press until release
HID: lenovo: Add missing return-value check
HID: lenovo: Use constants for axes names
HID: wacom: Simplify 'wacom_pl_irq'
HID: wacom: Do not repeatedly attempt to set device mode on error
HID: wacom: Do not repeatedly attempt to set device mode on error
HID: wacom: Remove WACOM_QUIRK_NO_INPUT
HID: wacom: Replace WACOM_QUIRK_MONITOR with WACOM_DEVICETYPE_WL_MONITOR
HID: wacom: Use calculated pkglen for wireless touch interface
HID: sony: Fix DS4 controller reporting rate issues
HID: chicony: Add support for Acer Aspire Switch 12
HID: hid-lg: Add USBID for Logitech G29 Wheel
...
Linus Torvalds [Wed, 2 Sep 2015 01:34:22 +0000 (18:34 -0700)]
Merge tag 'edac_for_4.3' of git://git./linux/kernel/git/bp/bp
Pull EDAC fixes from Borislav Petkov:
"Two minor fixlets this time: AMD MCE decoding correction and
xgene_edac cleanup"
* tag 'edac_for_4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
EDAC, mce_amd: Don't emit 'CE' for Deferred error
EDAC, xgene: Drop owner assignment from platform_driver
Mark Rustad [Wed, 29 Jul 2015 23:00:38 +0000 (16:00 -0700)]
ixgbe: Resolve "initialized field overwritten" warnings
Resolve warnings resulting from redundant initialization of the
get_bus_info field in the mac_ops_X550* structures.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Alex Williamson [Fri, 10 Jul 2015 21:31:34 +0000 (15:31 -0600)]
ixgbe: Remove bimodal SR-IOV disabling
When unbinding an SR-IOV device with VFs configured from ixgbe, the
driver behaves in one of two ways. If max_vfs was specified, the
SR-IOV state is disabled, removing the VFs. The occurs regardless of
whether the VF count was later modified through sysfs. If however
max_vfs is zero, such as by not specifying the module parameter, the
VFs persist after the PF is unbound from ixgbe. If the PF is then
bound to vfio-pci to be assigned to a VM, the PF is non-functional.
>From the comment, commit
da36b64736cf ("ixgbe: Implement PCI SR-IOV
sysfs callback operation") clearly intended this alternate behavior,
but probably didn't realize the PF doesn't work in this mode.
This bimodal behavior is confusing to users and results in a state
where the PF is broken for other uses unless the user sets
sriov_numvfs to zero prior to unbinding the device. Remove this
behavior so that VFs are removed and the PF is functional for other
uses after unbind, regardless of the way VFs are enabled.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Fri, 10 Jul 2015 21:19:22 +0000 (14:19 -0700)]
ixgbe: Add support for reporting 2.5G link speed
Now that we can do 2.5G link speed, we need to be able to report it.
Also change the nested triadic involved in creating the log message
to instead use a simpler switch statement to set a string pointer.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Emil Tantilov [Thu, 9 Jul 2015 19:28:59 +0000 (12:28 -0700)]
ixgbe: fix bounds checking in ixgbe_setup_tc for 82598
This patch resolves an issue where users were not able to dynamically
set number of queues for 82598 via ethtool -L
Reported-by: Tal Abudi <talabudi@gmail.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tom Barbette [Fri, 26 Jun 2015 13:40:18 +0000 (15:40 +0200)]
ixgbe: support for ethtool set_rxfh
Allows to change the rxfh indirection table and/or key using
ethtool interface.
Signed-off-by: Tom Barbette <tom.barbette@ulg.ac.be>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Fri, 26 Jun 2015 00:49:57 +0000 (17:49 -0700)]
ixgbe: Avoid needless PHY access on copper phys
Avoid a needless PHY access on copper phys to save the 10ms wait
time for each PHY access. A helper function is introduced to
actually do the register access and process the contents.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Fri, 19 Jun 2015 23:14:57 +0000 (19:14 -0400)]
ixgbe: cleanup to use cached mask value
We already cache this FW/SW semaphore mask so might as well use it
for consistency.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Fri, 19 Jun 2015 16:23:36 +0000 (12:23 -0400)]
ixgbe: Remove second instance of lan_id variable
This patch removes the redundant lan_id in the phy struct and uses
the bus version. Both variables exist and intend to represent the
STATUS register LAN_ID field. However, phy.lan_id is not bit shifted
so the phy.lan_id = 0x0 for LAN Id 0 and phy.lan_id = 0x4 for LAN Id 1.
Where bus.lan_id is bit shifted so bus.lan_id = 0x0 for LAN Id 0 and
bus.lan_id = 0x1 for LAN Id 1. There seems no need for the additional
lan_id variable and this should make the code less confusing.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Maninder Singh [Fri, 19 Jun 2015 04:07:55 +0000 (09:37 +0530)]
ixgbe: use kzalloc for allocating one thing
Use kzalloc rather than kcalloc(1..
The semantic patch that makes this change is as follows:
// <smpl>
@@
@@
- kcalloc(1,
+ kzalloc(
...)
// </smpl>
and removing checkpatch below CHECK:
CHECK: Prefer kzalloc(sizeof(*fwd_adapter)...) over
kzalloc(sizeof(struct ixgbe_fwd_adapter)...)
Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Reviewed-by: Vaneet Narang <v.narang@samsung.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Wed, 2 Sep 2015 00:00:24 +0000 (17:00 -0700)]
flow: Move __get_hash_from_flowi{4,6} into flow_dissector.c
These cannot live in net/core/flow.c which only builds when XFRM is
enabled.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Thu, 18 Jun 2015 20:31:42 +0000 (16:31 -0400)]
ixgbe: Remove unused PCI bus types
The ixgbe never has as very doubtfully ever will support either
PCI or PCI-X devices. So remove the unused types from the
ixgbe_bus_type. Thanks to Alex Duyck for suggesting this.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Thu, 18 Jun 2015 17:24:06 +0000 (13:24 -0400)]
ixgbe: add new bus type for intergrated I/O interface (IOSF)
With this patch we add support for a new bus type ixgbe_bus_type_internal.
X550em devices use IOSF and not PCIe bus so this new type is to accommodate
them.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Thu, 18 Jun 2015 00:59:59 +0000 (20:59 -0400)]
ixgbe: add get_bus_info method for X550
Added ixgbe_get_bus_info_X550em to X550 code. ixgbe_get_bus_info_X550em
sets bus.width to ixgbe_bus_width_unknown and bus.speed to
ixgbe_bus_speed_unknown, because IOSF does not report a PCIe bus
width or speed.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Wed, 17 Jun 2015 21:34:31 +0000 (17:34 -0400)]
ixgbe: Add support for entering low power link up state
When the device is closing or suspending, call ixgbe_enter_lplu to
enter low power link up state on devices that support it. When this
is done, prevent the phy from being reset in the ixgbe_down path
so that link is present when calling ixgbe_enter_lplu.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Mon, 15 Jun 2015 18:33:25 +0000 (11:33 -0700)]
ixgbe: Add support for VXLAN RX offloads
Add support for VXLAN RX offloads for the X55x devices that support
them.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Mark Rustad [Mon, 15 Jun 2015 18:33:20 +0000 (11:33 -0700)]
ixgbe: Add support for UDP-encapsulated tx checksum offload
By using GSO for UDP-encapsulated packets, all ixgbe devices can
be directed to generate checksums for the inner headers because
the outer UDP checksum can be zero. So point the machinery at the
inner headers and have the hardware generate the checksum.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 1 Sep 2015 23:46:08 +0000 (16:46 -0700)]
flow_dissector: Don't use bit fields.
Just have a flags member instead.
In file included from include/linux/linkage.h:4:0,
from include/linux/kernel.h:6,
from net/core/flow_dissector.c:1:
In function 'flow_keys_hash_start',
inlined from 'flow_hash_from_keys' at net/core/flow_dissector.c:553:34:
>> include/linux/compiler.h:447:38: error: call to '__compiletime_assert_459' declared with attribute error: BUILD_BUG_ON failed: FLOW_KEYS_HASH_OFFSET % sizeof(u32)
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Rustad [Thu, 11 Jun 2015 18:02:20 +0000 (11:02 -0700)]
ixgbe: Check whether FDIRCMD writes actually complete
Wait up to about 100 us for FDIRCMD writes to complete and return
failure indications.
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Thu, 11 Jun 2015 00:42:30 +0000 (20:42 -0400)]
ixgbe: Assign set_phy_power dynamically where needed
There are various reasons why this method may or may not need to be
defined and some of these we don't know until runtime. So we will
set the value in get_invariants.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Don Skidmore [Thu, 11 Jun 2015 00:05:02 +0000 (20:05 -0400)]
ixgbe: add new function to check for management presence
This patch adds a support function that will indicate for the
existence of management FW.
Signed-off-by: Donald C Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Anjali Singhai Jain [Tue, 28 Jul 2015 17:02:00 +0000 (13:02 -0400)]
i40e: Set defport behavior for the Main VSI when in promiscuous mode
This fixes bugs where the port is not receiving multicast or VLAN tagged
packets when in promiscuous mode. This can occur when a SW bridge is
created on top of the device.
This also fixes issues where the promiscuous behavior setting was not
being preserved across a reset caused by features being enabled or
disabled.
We are using defport instead of doing a true promiscuous mode because we do
not need to receive the SRIOV or VMDq VSI directed traffic which would suck
up bandwidth and is really not intended for the SW bridge.
In addition, with defport we get VLAN promiscuous behavior which is not
possible from the VSI level promiscuous setting.
Change-ID: Ie21985eac32d5af1c02e9d71c6430a90d5bab40f
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Linus Torvalds [Tue, 1 Sep 2015 23:13:25 +0000 (16:13 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace
Pull user namespace updates from Eric Biederman:
"This finishes up the changes to ensure proc and sysfs do not start
implementing executable files, as the there are application today that
are only secure because such files do not exist.
It akso fixes a long standing misfeature of /proc/<pid>/mountinfo that
did not show the proper source for files bind mounted from
/proc/<pid>/ns/*.
It also straightens out the handling of clone flags related to user
namespaces, fixing an unnecessary failure of unshare(CLONE_NEWUSER)
when files such as /proc/<pid>/environ are read while <pid> is calling
unshare. This winds up fixing a minor bug in unshare flag handling
that dates back to the first version of unshare in the kernel.
Finally, this fixes a minor regression caused by the introduction of
sysfs_create_mount_point, which broke someone's in house application,
by restoring the size of /sys/fs/cgroup to 0 bytes. Apparently that
application uses the directory size to determine if a tmpfs is mounted
on /sys/fs/cgroup.
The bind mount escape fixes are present in Al Viros for-next branch.
and I expect them to come from there. The bind mount escape is the
last of the user namespace related security bugs that I am aware of"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs: Set the size of empty dirs to 0.
userns,pidns: Force thread group sharing, not signal handler sharing.
unshare: Unsharing a thread does not require unsharing a vm
nsfs: Add a show_path method to fix mountinfo
mnt: fs_fully_visible enforce noexec and nosuid if !SB_I_NOEXEC
vfs: Commit to never having exectuables on proc and sysfs.
Linus Torvalds [Tue, 1 Sep 2015 22:42:28 +0000 (15:42 -0700)]
Merge branch 'x86-core-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 clockevent update from Thomas Gleixner:
"A single commit, which converts HPET clockevents driver to the new
callbacks"
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/hpet: Migrate to new set_state interface
Linus Torvalds [Tue, 1 Sep 2015 22:20:51 +0000 (15:20 -0700)]
Merge branch 'x86-apic-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner:
"This udpate contains:
- rework the irq vector array to store a pointer to the irq
descriptor instead of the irq number to avoid a lookup of the irq
descriptor in the irq entry path
- lguest interrupt handling cleanups
- conversion of the local apic timer to the new clockevent callbacks
- preparatory changes for the irq argument removal of interrupt flow
handlers"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/irq: Do not dereference irq descriptor before checking it
tools/lguest: Clean up include dir
tools/lguest: Fix redefinition of struct virtio_pci_cfg_cap
x86/irq: Store irq descriptor in vector array
genirq: Provide irq_desc_has_action
x86/irq: Get rid of an indentation level
x86/irq: Rename VECTOR_UNDEFINED to VECTOR_UNUSED
x86/irq: Replace numeric constant
x86/irq: Protect smp_cleanup_move
x86/lguest: Do not setup unused irq vectors
x86/lguest: Clean up lguest_setup_irq
x86/apic: Drop local_irq_save/restore in timer callbacks
x86/apic: Migrate apic timer to new set_state interface
x86/irq: Use access helper irq_data_get_affinity_mask()
x86/irq: Use accessor irq_data_get_irq_handler_data()
x86/irq: Use accessor irq_data_get_node()
David S. Miller [Tue, 1 Sep 2015 22:06:24 +0000 (15:06 -0700)]
Merge branch 'flow-dissector-features'
Tom Herbert says:
====================
flow_dissector: Paramterize dissection and other features
This patch set adds some new capabilities to flow_dissector:
- Add flags to flow dissector functions to control dissection
- Flag to stop dissection when L3 header is seen (don't
dissect L4)
- Flag to stop dissection when encapsulation is detected
- Flag to parse first fragment of fragmented packet. This
may provide L4 ports
- Added new reporting in key_control
- Packet is a fragment
- Packet is a first fragment
- Packet has encapsulation
Also:
- Make __skb_set_sw_hash a general function
- Create functions to get a flow hash based on flowi4 or flowi6
structures without an reference to an skbuff
- Ignore flow dissector return value from ___skb_get_hash. Just
use whatever key fields are found to make a hash
Tested:
Ran 200 netperf TCP_RR instances for IPv6 and IPv4. Did not see any
regression. Ran UDP_RR with 10000 byte request and response size
for IPv4 and IPv6, no regression observed however I did see better
performance with IPv6 flow labels due to use of flow labels for L4
hash.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:33 +0000 (09:24 -0700)]
flow_dissector: Ignore flow dissector return value from ___skb_get_hash
In ___skb_get_hash ignore return value from skb_flow_dissect_flow_keys.
A failure in that function likely means that there was a parse error,
so we may as well use whatever fields were found before the error was
hit. This is also good because it means we won't keep trying to derive
the hash on subsequent calls to skb_get_hash for the same packet.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:32 +0000 (09:24 -0700)]
flow_dissector: Add control/reporting of encapsulation
Add an input flag to flow dissector on rather dissection should stop
when encapsulation is detected (IP/IP or GRE). Also, add a key_control
flag that indicates encapsulation was encountered during the
dissection.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:31 +0000 (09:24 -0700)]
flow_dissector: Add flag to stop parsing when an IPv6 flow label is seen
Add an input flag to flow dissector on rather dissection should be
stopped when a flow label is encountered. Presumably, the flow label
is derived from a sufficient hash of an inner transport packet so
further dissection is not needed (that is ports are not included in
the flow hash). Using the flow label instead of ports has the additional
benefit that packet fragments should hash to same value as non-fragments
for a flow (assuming that the same flow label is used).
We set this flag by default in for skb_get_hash.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:30 +0000 (09:24 -0700)]
flow_dissector: Add flag to stop parsing at L3
Add an input flag to flow dissector on rather dissection should be
stopped when an L3 packet is encountered. This would be useful if a
caller just wanted to get IP addresses of the outermost header (e.g.
to do an L3 hash).
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:29 +0000 (09:24 -0700)]
flow_dissector: Support IPv6 fragment header
Parse NEXTHDR_FRAGMENT. When seen account for it in the fragment bits of
key_control. Also, check if first fragment should be parsed.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:28 +0000 (09:24 -0700)]
flow_dissector: Add control/reporting of fragmentation
Add an input flag to flow dissector on rather dissection should be
attempted on a first fragment. Also add key_control flags to indicate
that a packet is a fragment or first fragment.
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 1 Sep 2015 16:24:27 +0000 (09:24 -0700)]
flow_dissector: Add flags argument to skb_flow_dissector functions
The flags argument will allow control of the dissection process (for
instance whether to parse beyond L3).
Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>