Andy Lutomirski [Wed, 8 Oct 2014 16:02:13 +0000 (09:02 -0700)]
x86,kvm,vmx: Preserve CR4 across VM entry
CR4 isn't constant; at least the TSD and PCE bits can vary.
TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.
This adds a branch and a read from cr4 to each vm entry. Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact. A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
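The "fix up the vmcs instead of restoring cr4" idea in the commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions: the struct, its field names, and the write counter are hypothetical stand-ins, not the KVM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for per-vcpu state; not the real KVM structure. */
struct vcpu_sketch {
    uint64_t vmcs_host_cr4;   /* value last written to the simulated VMCS host-CR4 field */
    unsigned int write_count; /* counts simulated VMCS writes, for illustration only */
};

/*
 * Called on every simulated VM entry with the current host CR4 value.
 * The common case is one compare and a not-taken branch; the field is
 * rewritten only when CR4 changed since the previous entry.
 */
void sync_host_cr4(struct vcpu_sketch *v, uint64_t current_cr4)
{
    assert(v != NULL);
    if (v->vmcs_host_cr4 != current_cr4) {
        v->vmcs_host_cr4 = current_cr4;
        v->write_count++;
    }
}
```

Consecutive entries with an unchanged CR4 cost only the comparison, which is the case the commit message expects to be overwhelmingly common.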
Linus Torvalds [Sat, 18 Oct 2014 16:31:37 +0000 (09:31 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Include fixes for netrom and dsa (Fabian Frederick and Florian
Fainelli)
2) Fix FIXED_PHY support in stmmac, from Giuseppe CAVALLARO.
3) Several SKB use after free fixes (vxlan, openvswitch, vxlan,
ip_tunnel, fou), from Li RongQing.
4) fec driver PTP support fixes from Luwei Zhou and Nimrod Andy.
5) Use after free in virtio_net, from Michael S Tsirkin.
6) Fix flow mask handling for megaflows in openvswitch, from Pravin B
Shelar.
7) ISDN gigaset and capi bug fixes from Tilman Schmidt.
8) Fix route leak in ip_send_unicast_reply(), from Vasily Averin.
9) Fix two eBPF JIT bugs on x86, from Alexei Starovoitov.
10) TCP_SKB_CB() reorganization caused a few regressions, fixed by Cong
Wang and Eric Dumazet.
11) Don't overwrite end of SKB when parsing malformed sctp ASCONF
chunks, from Daniel Borkmann.
12) Don't call sock_kfree_s() with NULL pointers, this function also has
the side effect of adjusting the socket memory usage. From Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (90 commits)
bna: fix skb->truesize underestimation
net: dsa: add includes for ethtool and phy_fixed definitions
openvswitch: Set flow-key members.
netrom: use linux/uaccess.h
dsa: Fix conversion from host device to mii bus
tipc: fix bug in bundled buffer reception
ipv6: introduce tcp_v6_iif()
sfc: add support for skb->xmit_more
r8152: return -EBUSY for runtime suspend
ipv4: fix a potential use after free in fou.c
ipv4: fix a potential use after free in ip_tunnel_core.c
hyperv: Add handling of IP header with option field in netvsc_set_hash()
openvswitch: Create right mask with disabled megaflows
vxlan: fix a free after use
openvswitch: fix a use after free
ipv4: dst_entry leak in ip_send_unicast_reply()
ipv4: clean up cookie_v4_check()
ipv4: share tcp_v4_save_options() with cookie_v4_check()
ipv4: call __ip_options_echo() in cookie_v4_check()
atm: simplify lanai.c by using module_pci_driver
...
Linus Torvalds [Sat, 18 Oct 2014 16:30:41 +0000 (09:30 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull Sparc bugfix from David Miller:
"Sparc64 AES ctr mode bug fix"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix FPU register corruption with AES crypto offload.
Linus Torvalds [Sat, 18 Oct 2014 16:29:59 +0000 (09:29 -0700)]
Merge git://git./linux/kernel/git/davem/ide
Pull IDE cleanup from David Miller:
"One IDE driver cleanup"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
Drivers: ide: Remove typedef atiixp_ide_timing
Catalin Marinas [Fri, 17 Oct 2014 16:38:49 +0000 (17:38 +0100)]
futex: Ensure get_futex_key_refs() always implies a barrier
Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's
nothing to wake up) changes the futex code to avoid taking a lock when
there are no waiters. This code has been subsequently fixed in commit
11d4616bd07f (futex: revert back to the explicit waiter counting code).
Both the original commit and the fix-up rely on get_futex_key_refs() to
always imply a barrier.
However, for private futexes, none of the cases in the switch statement
of get_futex_key_refs() would be hit and the function completes without
the memory barrier that is required before checking the "waiters" in
futex_wake() -> hb_waiters_pending(). The consequence is a race with a
thread waiting on a futex on another CPU, allowing the waker thread to
read "waiters == 0" while the waiter thread has read "futex_val ==
locked" (in kernel).
Without this fix, the problem (user space deadlocks) can be seen with
Android bionic's mutex implementation on an arm64 multi-cluster system.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Matteo Franchin <Matteo.Franchin@arm.com>
Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up)
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
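The race described in the futex commit above has the classic store-buffering shape: each side writes one variable and then reads the other. A minimal C11 model, assuming a hypothetical two-variable struct rather than the kernel's hash-bucket data structures, shows where the full barriers have to sit:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two shared variables; not the kernel's structures. */
struct futex_model {
    atomic_int futex_val; /* 1 = locked, 0 = unlocked */
    atomic_int waiters;   /* number of registered waiters */
};

/* Waiter side: register as a waiter, full barrier, then re-check the futex word. */
bool waiter_must_sleep(struct futex_model *f)
{
    assert(f != NULL);
    atomic_fetch_add_explicit(&f->waiters, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&f->futex_val, memory_order_relaxed) == 1;
}

/* Waker side: release the futex word, full barrier, then look for waiters. */
bool waker_must_wake(struct futex_model *f)
{
    assert(f != NULL);
    atomic_store_explicit(&f->futex_val, 0, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&f->waiters, memory_order_relaxed) > 0;
}
```

With both fences present, at least one side is guaranteed to observe the other's write, so the combination "waiter saw locked" and "waker saw no waiters" cannot occur. Dropping the fence on the waker side is the analogue of the missing barrier for private futexes.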
Eric Dumazet [Fri, 17 Oct 2014 19:45:55 +0000 (12:45 -0700)]
bna: fix skb->truesize underestimation
skb->truesize is not meant to be tracking amount of used bytes
in an skb, but amount of reserved/consumed bytes in memory.
For instance, if we use a single byte in last page fragment,
we have to account the full size of the fragment.
skb->truesize can be very different from skb->len, and it has
a very specific safety purpose.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Rasesh Mody <rasesh.mody@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
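The difference between bytes used and bytes reserved in the truesize commit above can be shown with a small sketch. The fragment struct and the 2048-byte fragment size in the usage note are illustrative assumptions, not taken from the bna driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one receive fragment; not the kernel's skb_frag_t. */
struct frag_sketch {
    size_t used;     /* payload bytes actually stored in the fragment */
    size_t reserved; /* bytes of memory set aside for the fragment */
};

/* Payload length: sum of the bytes in use (the analogue of skb->len). */
size_t total_len(const struct frag_sketch *frags, size_t n)
{
    size_t sum = 0;

    assert(frags != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += frags[i].used;
    return sum;
}

/* Memory charge: sum of the bytes reserved (the analogue of skb->truesize). */
size_t total_truesize(const struct frag_sketch *frags, size_t n)
{
    size_t sum = 0;

    assert(frags != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += frags[i].reserved;
    return sum;
}
```

For two 2048-byte fragments where the second holds a single byte, the length is 2049 while the memory charge is 4096, which is the "single byte in last page fragment" case from the commit message.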
Florian Fainelli [Fri, 17 Oct 2014 23:02:13 +0000 (16:02 -0700)]
net: dsa: add includes for ethtool and phy_fixed definitions
net/dsa/slave.c uses functions and structures declared in phy_fixed.h
but does not explicitly include it, while dsa.h needs structure
declarations for 'struct ethtool_wolinfo' and 'struct ethtool_eee'. Fix
those by including the correct header files.
Fixes: ec9436baedb6 ("net: dsa: allow drivers to do link adjustment")
Fixes: ce31b31c68e7 ("net: dsa: allow updating fixed PHY link information")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Fri, 17 Oct 2014 20:56:31 +0000 (13:56 -0700)]
openvswitch: Set flow-key members.
This patch adds the missing memset() calls that are required to initialize
flow key members. For example, for an IP flow we need to initialize
ip.frag in all cases.
Found by inspection.
This bug was introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
("openvswitch: Eliminate memset() from flow_extract").
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Fri, 17 Oct 2014 20:00:22 +0000 (22:00 +0200)]
netrom: use linux/uaccess.h
Replace asm/uaccess.h with linux/uaccess.h.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Fri, 17 Oct 2014 19:30:58 +0000 (12:30 -0700)]
dsa: Fix conversion from host device to mii bus
Commit b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
replaces mii_bus with a generic host_dev, and introduces
dsa_host_dev_to_mii_bus() to support conversion from host_dev to mii_bus.
However, in some cases it uses to_mii_bus to perform that conversion.
Since host_dev is not the phy bus device but typically a platform device,
this fails and results in a crash with the affected drivers.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81781d35>] __mutex_lock_slowpath+0x75/0x100
PGD 406783067 PUD 406784067 PMD 0
Oops: 0002 [#1] SMP
...
Call Trace:
[<ffffffff810a538b>] ? pick_next_task_fair+0x61b/0x880
[<ffffffff81781de3>] mutex_lock+0x23/0x37
[<ffffffff81533244>] mdiobus_read+0x34/0x60
[<ffffffff8153b95a>] __mv88e6xxx_reg_read+0x8a/0xa0
[<ffffffff8153b9bc>] mv88e6xxx_reg_read+0x4c/0xa0
Fixes: b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Fri, 17 Oct 2014 19:25:28 +0000 (15:25 -0400)]
tipc: fix bug in bundled buffer reception
In commit ec8a2e5621db2da24badb3969eda7fd359e1869f ("tipc: same receive
code path for connection protocol and data messages") we overlooked the
possibility that an arriving message extracted from a bundle buffer
may be a multicast message. Such messages need to be delivered to
the socket via a separate function, tipc_sk_mcast_rcv(). As a result,
small multicast messages arriving as members of a bundle buffer will be
silently dropped.
This commit corrects the error by considering this case in the function
tipc_link_bundle_rcv().
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 17 Oct 2014 16:17:20 +0000 (09:17 -0700)]
ipv6: introduce tcp_v6_iif()
Commit 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line
misses") added a regression for SO_BINDTODEVICE on IPv6.
This is because we still use inet6_iif(), which expects that the IP6 control
block is still at the beginning of skb->cb[].
This patch adds tcp_v6_iif() helper and uses it where necessary.
Because __inet6_lookup_skb() is used by TCP and DCCP, we add an iif
parameter to it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Acked-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Fri, 17 Oct 2014 14:32:25 +0000 (15:32 +0100)]
sfc: add support for skb->xmit_more
Don't ring the doorbell, and don't do PIO. This will also prevent
TX Push, because there will be more than one buffer waiting when
the doorbell is rung.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
hayeswang [Fri, 17 Oct 2014 08:55:08 +0000 (16:55 +0800)]
r8152: return -EBUSY for runtime suspend
Remove the call to cancel_delayed_work_sync() for runtime suspend,
because it would cause a deadlock. Instead, return -EBUSY to
prevent the device from suspending if the net is running and
the delayed work is pending or running. The delayed work would
try to wake up the device later, so suspending is not
necessary.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 08:53:47 +0000 (16:53 +0800)]
ipv4: fix a potential use after free in fou.c
pskb_may_pull() may change skb->data and make the uh pointer obsolete,
so reload uh and guehdr.
Fixes: 37dd0247 ("gue: Receive side for Generic UDP Encapsulation")
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
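The pointer-reload rule behind this fix (and the similar vxlan, ip_tunnel and openvswitch fixes in this series) can be modelled in userspace. The sketch below assumes a hypothetical packet struct and a helper that always moves the data area, standing in for the fact that pskb_may_pull() may do so; none of these names exist in the kernel.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer; stands in for an skb whose data area may move. */
struct pkt_sketch {
    unsigned char *data;
    size_t len;
};

/*
 * Models a call that may reallocate the data area. It always moves the
 * buffer here so that the effect is visible. Returns 1 on success, 0 on
 * allocation failure (the packet is left untouched in that case).
 */
int may_pull_sketch(struct pkt_sketch *p, size_t need)
{
    assert(p != NULL && p->data != NULL && need > 0);

    size_t new_len = need > p->len ? need : p->len;
    unsigned char *fresh = calloc(1, new_len);

    if (fresh == NULL)
        return 0;
    memcpy(fresh, p->data, p->len);
    free(p->data);      /* any pointer into the old area is now dangling */
    p->data = fresh;
    p->len = new_len;
    return 1;
}

/* Reads the first header byte safely: the header pointer is taken after the pull. */
int first_header_byte(struct pkt_sketch *p, size_t need)
{
    if (!may_pull_sketch(p, need))
        return -1;

    const unsigned char *hdr = p->data; /* reload; never reuse a pointer from before the call */
    return hdr[0];
}
```

A header pointer computed before may_pull_sketch() would point into freed memory afterwards, which is the use after free these commits remove.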
Li RongQing [Fri, 17 Oct 2014 08:53:23 +0000 (16:53 +0800)]
ipv4: fix a potential use after free in ip_tunnel_core.c
pskb_may_pull() may change skb->data and make the eth pointer obsolete,
so set eth after pskb_may_pull().
Fixes: 3d7b46cd ("ip_tunnel: push generic protocol handling to ip_tunnel module")
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Thu, 16 Oct 2014 21:47:58 +0000 (14:47 -0700)]
hyperv: Add handling of IP header with option field in netvsc_set_hash()
If the IP header has an options field at the end, this patch gets the
port numbers after that field and computes the hash. The general
parser skb_flow_dissect() is used here.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pravin B Shelar [Fri, 17 Oct 2014 04:55:45 +0000 (21:55 -0700)]
openvswitch: Create right mask with disabled megaflows
If megaflows are disabled, the userspace does not send the netlink attribute
OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask.
sw_flow_mask_set() sets every byte (in 'range') of the mask to 0xff, even the
bytes that represent padding for struct sw_flow, or the bytes that represent
fields that may not be set during ovs_flow_extract().
This is a problem, because when we extract a flow from a packet,
we no longer memset() the struct sw_flow to 0.
This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(),
which operates on the netlink attributes rather than on the mask key. Using
this approach we are sure that only the bytes that the user provided in the
flow are matched.
Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we
now return with an error.
This bug was introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
("openvswitch: Eliminate memset() from flow_extract").
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 06:06:16 +0000 (14:06 +0800)]
vxlan: fix a free after use
pskb_may_pull() may change skb->data and make the eth pointer obsolete,
so eth needs to be reloaded.
Fixes: 91269e390d062 ("vxlan: using pskb_may_pull as early as possible")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Fri, 17 Oct 2014 06:03:08 +0000 (14:03 +0800)]
openvswitch: fix a use after free
pskb_may_pull(), called by arphdr_ok(), can change skb->data, so move the arp
assignment after arphdr_ok() to avoid using freed memory.
Fixes: 0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.")
Cc: Jesse Gross <jesse@nicira.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vasily Averin [Wed, 15 Oct 2014 12:24:02 +0000 (16:24 +0400)]
ipv4: dst_entry leak in ip_send_unicast_reply()
ip_setup_cork(), called inside ip_append_data(), steals the dst entry from rt to cork,
and in case of errors in __ip_append_data() nobody frees the stolen dst entry.
Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")
Signed-off-by: Vasily Averin <vvs@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 15 Oct 2014 21:33:22 +0000 (14:33 -0700)]
ipv4: clean up cookie_v4_check()
We can retrieve opt from skb, no need to pass it as a parameter.
And opt should always be non-NULL, no need to check.
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 15 Oct 2014 21:33:21 +0000 (14:33 -0700)]
ipv4: share tcp_v4_save_options() with cookie_v4_check()
cookie_v4_check() allocates ip_options_rcu in the same way
as tcp_v4_save_options(), so we can just make it a helper function.
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Wed, 15 Oct 2014 21:33:20 +0000 (14:33 -0700)]
ipv4: call __ip_options_echo() in cookie_v4_check()
Commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
missed that cookie_v4_check() still calls ip_options_echo() which uses
IPCB(). It should use TCPCB() at TCP layer, so call __ip_options_echo()
instead.
Fixes: commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Cc: Krzysztof Kolasa <kkolasa@winsoft.pl>
Cc: Eric Dumazet <edumazet@google.com>
Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Opdenacker [Wed, 15 Oct 2014 07:45:50 +0000 (09:45 +0200)]
atm: simplify lanai.c by using module_pci_driver
This simplifies the lanai.c driver by using
the module_pci_driver() macro, at the expense
of losing only debugging messages.
Signed-off-by: Michael Opdenacker <michael.opdenacker@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Thu, 16 Oct 2014 13:47:51 +0000 (15:47 +0200)]
netlink: fix description of portid
Avoid confusion between pid and portid.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 16 Oct 2014 18:42:51 +0000 (14:42 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/jkirsher/net
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2014-10-16
This series contains updates to fm10k and ixgbe.
Matthew provides two fixes for fm10k, first sets the flag to fetch the
host state before kicking off the service task that reads the host
state when bringing the interface up. The second makes sure that we
release the mailbox lock after detecting an error and before we return
the error code.
Andy Zhou provides a compile fix for fm10k, when the driver is compiled
into the kernel and the VXLAN driver is compiled as a module.
Emil provides a fix for ixgbe to prevent against a panic by trying
to dereference a NULL pointer in ixgbe_ndo_set_vf_spoofchk().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Emil Tantilov [Thu, 16 Oct 2014 15:49:02 +0000 (15:49 +0000)]
ixgbe: check for vfs outside of sriov_num_vfs before dereference
The check for vfinfo is not sufficient because it does not protect
against specifying a vf that is outside of the sriov_num_vfs range.
All of the ndo functions have a check for it except for
ixgbe_ndo_set_vf_spoofchk().
The following patch is all we need to protect against this panic:
ip link set p96p1 vf 0 spoofchk off
BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
IP: [<ffffffffa044a1c1>] ixgbe_ndo_set_vf_spoofchk+0x51/0x150 [ixgbe]
Reported-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Acked-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
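A minimal sketch of the range check described in the ixgbe commit above, assuming hypothetical adapter and per-VF structs (the real driver's types and field names differ):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-VF state and adapter; not the ixgbe structures. */
struct vf_info_sketch {
    bool spoofchk_enabled;
};

struct adapter_sketch {
    unsigned int num_vfs;          /* number of valid entries in vfinfo */
    struct vf_info_sketch *vfinfo; /* NULL when no VFs are configured */
};

/* Rejects out-of-range VF indices before touching the vfinfo array. */
int set_vf_spoofchk_sketch(struct adapter_sketch *a, unsigned int vf, bool setting)
{
    assert(a != NULL);
    if (vf >= a->num_vfs)
        return -EINVAL;
    a->vfinfo[vf].spoofchk_enabled = setting;
    return 0;
}
```

With no VFs configured, a request for "vf 0" is rejected by the range check instead of dereferencing a NULL vfinfo pointer.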
Andy Zhou [Sat, 4 Oct 2014 06:19:11 +0000 (06:19 +0000)]
fm10k: Add CONFIG_FM10K_VXLAN configuration option
Compiling with CONFIG_FM10K=y and VXLAN=m results in a linking error:
drivers/built-in.o: In function `fm10k_open':
(.text+0x1f9d7a): undefined reference to `vxlan_get_rx_port'
make: *** [vmlinux] Error 1
The fix follows the same strategy as I40E.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Matthew Vick [Fri, 3 Oct 2014 00:43:35 +0000 (00:43 +0000)]
fm10k: Unlock mailbox on VLAN addition failures
After grabbing the mailbox lock and detecting an error, the lock must be
released before the error code can be returned.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
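The rule stated in the fm10k mailbox commit above is commonly written with a single unlock label that every exit path goes through. The following is a sketch with a hypothetical flag-based lock and a simulated failure, not the fm10k code:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mailbox with a flag standing in for the real lock. */
struct mbx_sketch {
    bool locked;
};

void mbx_lock(struct mbx_sketch *m)
{
    assert(m != NULL && !m->locked);
    m->locked = true;
}

void mbx_unlock(struct mbx_sketch *m)
{
    assert(m != NULL && m->locked);
    m->locked = false;
}

/* 'fail' simulates an error detected while the lock is held. */
int update_vlan_sketch(struct mbx_sketch *m, bool fail)
{
    int err = 0;

    mbx_lock(m);
    if (fail) {
        err = -EIO;
        goto out_unlock; /* error path still releases the lock */
    }
    /* ... the actual update would happen here ... */
out_unlock:
    mbx_unlock(m);
    return err;
}
```

Returning directly from inside the `if (fail)` branch would leave the lock held, which is the bug class the commit addresses.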
Matthew Vick [Thu, 2 Oct 2014 05:10:18 +0000 (05:10 +0000)]
fm10k: Check the host state when bringing the interface up
Set the flag to fetch the host state before kicking off the service task
that reads the host state when bringing the interface back up.
Signed-off-by: Matthew Vick <matthew.vick@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Li RongQing [Thu, 16 Oct 2014 01:17:18 +0000 (09:17 +0800)]
vxlan: using pskb_may_pull as early as possible
pskb_may_pull() should be used to check whether skb->data has enough space;
skb->len cannot ensure that.
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Thu, 16 Oct 2014 00:49:41 +0000 (08:49 +0800)]
vxlan: fix a use after free in vxlan_encap_bypass
Once netif_rx() is done, the skb it handled may have been freed
and should not be used.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Wed, 15 Oct 2014 19:03:41 +0000 (21:03 +0200)]
openvswitch: use vport instead of p
All functions used struct vport *vport except
ovs_vport_find_upcall_portid.
This fixes 1 kerneldoc warning
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Wed, 15 Oct 2014 19:03:18 +0000 (21:03 +0200)]
openvswitch: kerneldoc warning fix
s/sock/gs
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Claudiu Manoil [Wed, 15 Oct 2014 16:11:46 +0000 (19:11 +0300)]
gianfar: Add FCS to rx buffer size (fix)
For each Rx frame the eTSEC writes its FCS (Frame Check Sequence)
to the Rx buffer.
The eTSEC h/w manual states in the "Receive Buffer Descriptor Field
Descriptions" table:
"Data length is the number of octets written by the eTSEC into this BD's
data buffer if L is cleared (the value is equal to MRBLR), or, if L is
set, the length of the frame including *CRC*, FCB (if RCTRL[PRSDEP] > 00),
preamble (if MACCFG2[PreAmRxEn]=1), time stamp (if RCTRL[TS] = 1) and
any padding (RCTRL[PAL])."
Though the FCS bytes are removed by the driver before passing the skb
to the net stack, the Rx buffer size computation does not currently
take into account the FCS bytes (4 bytes).
Because the Rx buffer size is a multiple of 512 bytes, leaving out the
FCS is not a problem for the default MTU of 1500, as the Rx buffer size
is 1536 in this case. However, for custom MTUs, where the difference
between the MTU size and the Rx buffer size is smaller, this can be a
problem as the computed Rx buffer size won't be enough to accommodate
the FCS for a received frame that is big enough (close to MTU size).
In such case the received frame is considered to be incomplete (L flag
not set in the RxBD status) and silently dropped.
Note that the driver does not currently support S/G on Rx, so it has to
compute its Rx buffer size based on the MTU of the device.
Reported-by: Kristian Otnes <kotnes@cisco.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
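The arithmetic in the gianfar commit above can be checked with a short sketch. The 4-byte FCS, the 512-byte granularity and the 1536 result for MTU 1500 come from the commit message; the 14-byte Ethernet header and the exact rounding formula are assumptions made for illustration and may not match the driver line for line.

```c
#include <assert.h>

enum {
    ETH_HEADER_LEN = 14, /* assumed Ethernet header size */
    FCS_LEN = 4,         /* frame check sequence written by the hardware */
    BUF_STEP = 512       /* Rx buffer size granularity */
};

/* Rounds (mtu + header [+ FCS]) up to the next multiple of BUF_STEP. */
unsigned int rx_buffer_size(unsigned int mtu, int include_fcs)
{
    unsigned int frame = mtu + ETH_HEADER_LEN + (include_fcs ? FCS_LEN : 0);

    assert(mtu > 0);
    return (frame + BUF_STEP - 1) / BUF_STEP * BUF_STEP;
}

/* Size of a full-MTU frame as the hardware writes it, FCS included. */
unsigned int max_frame_written(unsigned int mtu)
{
    return mtu + ETH_HEADER_LEN + FCS_LEN;
}
```

Under these assumptions, MTU 1500 gives a 1536-byte buffer either way and the 1518-byte frame fits. An MTU of 1522 gives 1536 without the FCS, but the hardware writes 1540 bytes, so the frame no longer fits; including the FCS in the computation bumps the buffer to 2048.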
Michael S. Tsirkin [Wed, 15 Oct 2014 13:23:28 +0000 (16:23 +0300)]
virtio_net: fix use after free
Commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 ("net: Remove
ndo_xmit_flush netdev operation, use signalling instead.")
added code that looks at skb->xmit_more after the skb has
been put in TX VQ. Since some paths process the ring and free the skb
immediately, this can cause use after free.
Fix by storing xmit_more in a local variable.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
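The "store xmit_more in a local variable" fix from the virtio_net commit above follows a general pattern: copy what is needed out of an object before handing over ownership. A userspace sketch, with a hypothetical buffer struct and an enqueue helper that frees immediately to model the worst case:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical transmit buffer; only the flag of interest is modelled. */
struct skb_sketch {
    bool xmit_more;
};

/* Models queueing the buffer: ownership passes on and it may be freed at once. */
void enqueue_and_maybe_free(struct skb_sketch *skb)
{
    free(skb);
}

/* Returns true when the simulated device should be kicked after this packet. */
bool xmit_sketch(struct skb_sketch *skb)
{
    assert(skb != NULL);

    bool more = skb->xmit_more;  /* copy before handing the buffer off */

    enqueue_and_maybe_free(skb); /* skb must not be touched after this line */
    return !more;
}
```

Reading `skb->xmit_more` after the enqueue call would be a read of freed memory on the paths that free the buffer immediately.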
Nimrod Andy [Wed, 15 Oct 2014 09:30:12 +0000 (17:30 +0800)]
net: fec: ptp: fix convergence issue to support LinuxPTP stack
The iMX6SX IEEE 1588 module has a hardware issue in capturing the ATVR register.
The current software flow is:
ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
ts_counter_ns = ENET0->ATVR;
The ATVR value read this way is not the expected one, which prevents the LinuxPTP stack from converging.
The ENET Block Guide chapter for the iMX6SX (PELE) addresses the issue:
After setting ENET_ATCR[Capture], some clock cycles are needed before the counter
value is captured in the register clock domain. The wait is at least
6 clock cycles of the slower clock between the register clock and the 1588 clock.
So something like the following is needed:
ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
wait();
ts_counter_ns = ENET0->ATVR;
For iMX6SX, the 1588 ts_clk is fixed to 25 MHz and the register clock is 66 MHz, so the
wait must be longer than 240 ns (40 ns * 6). The patch adds a 1 us delay
before the CPU reads the ATVR register.
Changes V2:
Modify the commit/comments log to describe the issue clearly.
Signed-off-by: Fugang Duan <B38611@freescale.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
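The wait-time arithmetic in the fec commit above (6 cycles of the slower clock) can be reproduced with a few lines of C. The function names are hypothetical; the clock rates in the usage note are the ones quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Period of a clock in whole nanoseconds (rounded down), given its rate in Hz. */
uint64_t period_ns(uint64_t hz)
{
    assert(hz > 0);
    return 1000000000ull / hz;
}

/* Minimum wait after setting the capture bit: 6 cycles of the slower clock. */
uint64_t min_capture_wait_ns(uint64_t reg_clk_hz, uint64_t ts_clk_hz)
{
    uint64_t slower_hz = reg_clk_hz < ts_clk_hz ? reg_clk_hz : ts_clk_hz;

    return 6 * period_ns(slower_hz);
}
```

For a 66 MHz register clock and a 25 MHz timestamp clock, the slower clock has a 40 ns period and the minimum wait is 240 ns, so the 1 us delay chosen by the patch leaves a comfortable margin.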
Himangi Saraogi [Thu, 14 Aug 2014 16:44:30 +0000 (22:14 +0530)]
Drivers: ide: Remove typedef atiixp_ide_timing
The Linux kernel coding style guidelines suggest not using typedefs
for structure types. This patch gets rid of the typedef for
atiixp_ide_timing.
The following Coccinelle semantic patch detects the case:
@tn1@
type td;
@@
typedef struct { ... } td;
@script:python tf@
td << tn1.td;
tdres;
@@
coccinelle.tdres = td;
@@
type tn1.td;
identifier tf.tdres;
@@
-typedef
struct
+ tdres
{ ... }
-td
;
@@
type tn1.td;
identifier tf.tdres;
@@
-td
+ struct tdres
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anish Bhatt [Wed, 15 Oct 2014 07:26:47 +0000 (00:26 -0700)]
cxgb4i : Fix -Wmaybe-uninitialized warning.
Identified by kbuild test robot. csk family is always set to be AF_INET or
AF_INET6, so skb will always be initialized to some value but there is no harm
in silencing the warning anyways.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Fixes: f42bb57c61fd ('cxgb4i : Fix -Wunused-function warning')
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Herbert [Tue, 14 Oct 2014 22:19:06 +0000 (15:19 -0700)]
net: Add ndo_gso_check
Add ndo_gso_check which a device can define to indicate whether it
is capable of doing GSO on a packet. This function would be called from
the stack to determine whether software GSO needs to be done. A
driver should populate this function if it advertises GSO types for
which there are combinations that it wouldn't be able to handle. For
instance a device that performs UDP tunneling might only implement
support for transparent Ethernet bridging type of inner packets
or might have limitations on lengths of inner headers.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
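The optional-callback idea behind ndo_gso_check can be sketched as follows. Everything here is hypothetical: the struct layouts, the 64-byte inner-header limit, and the simplified decision function, which ignores the feature flags the real stack also consults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a packet; only the inner header length is modelled. */
struct pkt_info_sketch {
    size_t inner_hdr_len;
};

/* Hypothetical device with an optional per-packet GSO capability callback. */
struct dev_sketch {
    bool (*gso_check)(const struct pkt_info_sketch *pkt);
};

/* Example device limit (made up): inner headers of at most 64 bytes. */
bool short_inner_headers_only(const struct pkt_info_sketch *pkt)
{
    return pkt->inner_hdr_len <= 64;
}

/* Stack-side decision: fall back to software GSO when the device declines. */
bool needs_software_gso(const struct dev_sketch *dev, const struct pkt_info_sketch *pkt)
{
    assert(dev != NULL && pkt != NULL);
    if (dev->gso_check == NULL)
        return false; /* no callback: rely on the advertised GSO types */
    return !dev->gso_check(pkt);
}
```

A device that leaves the callback unset behaves as before, while a device with restrictions can decline individual packets and let the stack segment them in software.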
Giuseppe CAVALLARO [Wed, 15 Oct 2014 05:30:41 +0000 (07:30 +0200)]
stmmac: fix sti compatibililies
This patch fixes the stmmac data compatibilities for
all the SoCs inside the platform file.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 15 Oct 2014 05:48:18 +0000 (07:48 +0200)]
Merge branch 'for-3.18-consistent-ops' of git://git./linux/kernel/git/tj/percpu
Pull percpu consistent-ops changes from Tejun Heo:
"Way back, before the current percpu allocator was implemented, static
and dynamic percpu memory areas were allocated and handled separately
and had their own accessors. The distinction has been gone for many
years now; however, the now duplicate two sets of accessors remained
with the pointer based ones - this_cpu_*() - evolving various other
operations over time. During the process, we also accumulated other
inconsistent operations.
This pull request contains Christoph's patches to clean up the
duplicate accessor situation. __get_cpu_var() uses are replaced
with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().
Unfortunately, the former sometimes is tricky thanks to C being a bit
messy with the distinction between lvalues and pointers, which led to
a rather ugly solution for cpumask_var_t involving the introduction of
this_cpu_cpumask_var_ptr().
This converts most of the uses but not all. Christoph will follow up
with the remaining conversions in this merge window and hopefully
remove the obsolete accessors"
* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
irqchip: Properly fetch the per cpu offset
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
Revert "powerpc: Replace __get_cpu_var uses"
percpu: Remove __this_cpu_ptr
clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
sparc: Replace __get_cpu_var uses
avr32: Replace __get_cpu_var with __this_cpu_write
blackfin: Replace __get_cpu_var uses
tile: Use this_cpu_ptr() for hardware counters
tile: Replace __get_cpu_var uses
powerpc: Replace __get_cpu_var uses
alpha: Replace __get_cpu_var
ia64: Replace __get_cpu_var uses
s390: cio driver &__get_cpu_var replacements
s390: Replace __get_cpu_var uses
mips: Replace __get_cpu_var uses
MIPS: Replace __get_cpu_var uses in FPU emulator.
arm: Replace __this_cpu_ptr with raw_cpu_ptr
...
Linus Torvalds [Wed, 15 Oct 2014 05:30:52 +0000 (07:30 +0200)]
Merge tag 'llvmlinux-for-v3.18' of git://git.linuxfoundation.org/llvmlinux/kernel
Pull LLVM updates from Behan Webster:
"These patches remove the use of VLAIS using a new SHASH_DESC_ON_STACK
macro.
Some of the previously accepted VLAIS removal patches haven't used
this macro. I will push new patches to consistently use this macro in
all those older cases for 3.19"
[ More LLVM patches coming in through subsystem trees, and LLVM itself
needs some fixes that are already in many distributions but not in
released versions of LLVM. Some day this will all "just work" - Linus ]
* tag 'llvmlinux-for-v3.18' of git://git.linuxfoundation.org/llvmlinux/kernel:
crypto: LLVMLinux: Remove VLAIS usage from crypto/testmgr.c
security, crypto: LLVMLinux: Remove VLAIS from ima_crypto.c
crypto: LLVMLinux: Remove VLAIS usage from libcrc32c.c
crypto: LLVMLinux: Remove VLAIS usage from crypto/hmac.c
crypto, dm: LLVMLinux: Remove VLAIS usage from dm-crypt
crypto: LLVMLinux: Remove VLAIS from crypto/.../qat_algs.c
crypto: LLVMLinux: Remove VLAIS from crypto/omap_sham.c
crypto: LLVMLinux: Remove VLAIS from crypto/n2_core.c
crypto: LLVMLinux: Remove VLAIS from crypto/mv_cesa.c
crypto: LLVMLinux: Remove VLAIS from crypto/ccp/ccp-crypto-sha.c
btrfs: LLVMLinux: Remove VLAIS
crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code
Linus Torvalds [Wed, 15 Oct 2014 05:23:49 +0000 (07:23 +0200)]
Merge tag 'iommu-updates-v3.18' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This pull-request includes:
- change in the IOMMU-API to convert the former iommu_domain_capable
function to just iommu_capable
- various fixes in handling RMRR ranges for the VT-d driver (one fix
requires a device driver core change which was acked by Greg KH)
- the AMD IOMMU driver now assigns and deassigns complete alias
groups to fix issues with devices using the wrong PCI request-id
- MMU-401 support for the ARM SMMU driver
- multi-master IOMMU group support for the ARM SMMU driver
- various other small fixes all over the place"
* tag 'iommu-updates-v3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (41 commits)
iommu/vt-d: Work around broken RMRR firmware entries
iommu/vt-d: Store bus information in RMRR PCI device path
iommu/vt-d: Only remove domain when device is removed
driver core: Add BUS_NOTIFY_REMOVED_DEVICE event
iommu/amd: Fix devid mapping for ivrs_ioapic override
iommu/irq_remapping: Fix the regression of hpet irq remapping
iommu: Fix bus notifier breakage
iommu/amd: Split init_iommu_group() from iommu_init_device()
iommu: Rework iommu_group_get_for_pci_dev()
iommu: Make of_device_id array const
amd_iommu: do not dereference a NULL pointer address.
iommu/omap: Remove omap_iommu unused owner field
iommu: Remove iommu_domain_has_cap() API function
IB/usnic: Convert to use new iommu_capable() API function
vfio: Convert to use new iommu_capable() API function
kvm: iommu: Convert to use new iommu_capable() API function
iommu/tegra: Convert to iommu_capable() API function
iommu/msm: Convert to iommu_capable() API function
iommu/vt-d: Convert to iommu_capable() API function
iommu/fsl: Convert to iommu_capable() API function
...
Linus Torvalds [Wed, 15 Oct 2014 05:05:03 +0000 (07:05 +0200)]
Merge tag 'clk-for-linus-3.18' of git://git.linaro.org/people/mike.turquette/linux
Pull clock tree updates from Mike Turquette:
"The clk tree changes for 3.18 are dominated by clock drivers. Mostly
fixes and enhancements to existing drivers as well as new drivers.
This tag contains a bit more arch code than I usually take due to some
OMAP2+ changes. Additionally it contains the restart notifier
handlers which are merged as a dependency into several trees.
The PXA changes are the only messy part. Due to having a stable tree
I had to revert one patch and follow up with one more fix near the tip
of this tag. Some dead code is introduced but it will soon become
live code after 3.18-rc1 is released as the rest of the PXA family is
converted over to the common clock framework.
Another trend in this tag is that multiple vendors have started to
push the complexity of changing their CPU frequency into the clock
driver, whereas this used to be done in CPUfreq drivers.
Changes to the clk core include a generic gpio-clock type and a
clk_set_phase() function added to the top-level clk.h api. Due to
some confusion on the fbdev mailing list the kernel boot parameters
documentation was updated to further explain the clk_ignore_unused
parameter, which is often required by users of the simplefb driver.
Finally some fixes to the locking around the clock debugfs stuff was
done to prevent deadlocks when interacting with other subsystems."
* tag 'clk-for-linus-3.18' of git://git.linaro.org/people/mike.turquette/linux: (99 commits)
clk: pxa clocks build system fix
Revert "arm: pxa: Transition pxa27x to clk framework"
clk: samsung: register restart handlers for s3c2412 and s3c2443
clk: rockchip: add restart handler
clk: rockchip: rk3288: i2s_frac adds flag to set parent's rate
doc/kernel-parameters.txt: clarify clk_ignore_unused
arm: pxa: Transition pxa27x to clk framework
dts: add devicetree bindings for pxa27x clocks
clk: add pxa27x clock drivers
arm: pxa: add clock pll selection bits
clk: dts: document pxa clock binding
clk: add pxa clocks infrastructure
clk: gpio-gate: Ensure gpiod_ APIs are prototyped
clk: ti: dra7-atl-clock: Mark the device as pm_runtime_irq_safe
clk: ti: LLVMLinux: Move __init outside of type definition
clk: ti: consider the fact that of_clk_get() might return an error
clk: ti: dra7-atl-clock: fix a memory leak
clk: ti: change clock init to use generic of_clk_init
clk: hix5hd2: add I2C clocks
clk: hix5hd2: add watchdog0 clocks
...
Linus Torvalds [Wed, 15 Oct 2014 04:58:16 +0000 (06:58 +0200)]
Merge tag 'mfd-for-linus-3.18' of git://git./linux/kernel/git/lee/mfd
Pull MFD updates from Lee Jones:
"Changes to existing drivers:
- DT clean-ups in da9055-core, max14577, rn5t618, arizona, hi6421, stmpe, twl4030
- Export symbols for use in modules in max14577
- Plenty of static code analysis/Coccinelle fixes throughout the SS
- Regmap clean-ups in arizona, wm5102, wm5110, da9052, tps65217, rk808
- Remove unused/duplicate code in da9052, 88pm860x, ti_ssp, lpc_sch, arizona
- Bug fixes in ti_am335x_tscadc, da9052, ti_am335x_tscadc, rtsx_pcr
- IRQ fixups in arizona, stmpe, max14577
- Regulator related changes in axp20x
- Pass DMA coherency information from parent => child in MFD core
- Rename DT document files for consistency
- Add ACPI support to the MFD core
- Add Andreas Werner to MAINTAINERS for MEN F21BMC
New drivers/supported devices:
- New driver for MEN 14F021P00 Board Management Controller
- New driver for Ricoh RN5T618 PMIC
- New driver for Rockchip RK808
- New driver for HiSilicon Hi6421 PMIC
- New driver for Qualcomm SPMI PMICs
- Add support for Intel Braswell in lpc_ich
- Add support for Intel 9 Series PCH in lpc_ich
- Add support for Intel Quark ILB in lpc_sch"
[ Delayed to after the power/reset pull due to Kconfig problems with
recursive Kconfig select/depends-on chains. - Linus ]
* tag 'mfd-for-linus-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (79 commits)
mfd: cros_ec: wait for completion of commands that return IN_PROGRESS
i2c: i2c-cros-ec-tunnel: Set retries to 3
mfd: cros_ec: move locking into cros_ec_cmd_xfer
mfd: cros_ec: stop calling ->cmd_xfer() directly
mfd: cros_ec: Delay for 50ms when we see EC_CMD_REBOOT_EC
MAINTAINERS: Adds Andreas Werner to maintainers list for MEN F21BMC
mfd: arizona: Correct mask to allow setting micbias external cap
mfd: Add ACPI support
Revert "mfd: wm5102: Manually apply register patch"
mfd: ti_am335x_tscadc: Update logic in CTRL register for 5-wire TS
mfd: dt-bindings: atmel-gpbr: Rename doc file to conform to naming convention
mfd: dt-bindings: qcom-pm8xxx: Rename doc file to conform to naming convention
mfd: Inherit coherent_dma_mask from parent device
mfd: Document DT bindings for Qualcomm SPMI PMICs
mfd: Add support for Qualcomm SPMI PMICs
mfd: dt-bindings: pm8xxx: Add new compatible string
mfd: axp209x: Drop the parent supplies field
mfd: twl4030-power: Use 'ti,system-power-controller' as alternative way to support system power off
mfd: dt-bindings: twl4030-power: Use the standard property to mark power control
mfd: syscon: Add Atmel GPBR DT bindings documention
...
Linus Torvalds [Wed, 15 Oct 2014 04:56:23 +0000 (06:56 +0200)]
Merge tag 'for-v3.18' of git://git.infradead.org/battery-2.6
Pull power supply and reset updates from Sebastian Reichel:
- Initial support for the following chips
* max77836 (charger)
* max14577 (charger)
* bq27742 (battery gauge)
* ltc2952 (poweroff)
* stih416 (restart)
* syscon-reboot (restart)
* gpio-restart (restart)
- cleanup of power supply core
- misc fixes in power supply and reset drivers
* tag 'for-v3.18' of git://git.infradead.org/battery-2.6: (48 commits)
power: ab8500_fg: Fix build warning
Documentation: charger: max14577: Update the date of introducing ABI
power: reset: corrections for simple syscon reboot driver
Documentation: power: reset: Add documentation for generic SYSCON reboot driver
power: reset: Add generic SYSCON register mapped reset
bq27x00_battery: Fix flag reading for bq27742
power: reset: use restart_notifier mechanism for msm-poweroff
power: Add simple gpio-restart driver
power: reset: st: Provide DT bindings for ST's Power Reset driver
power: reset: Add restart functionality for STiH41x platforms
power: charger-manager: Fix NULL pointer exception with missing cm-fuel-gauge
power: max14577: Fix circular config SYSFS dependency
power: gpio-charger: do not use gpio value directly
power: max8925: Use of_get_child_by_name
power: max8925: Fix NULL ptr dereference on memory allocation failure
bq27x00_battery: Add support to bq27742
Documentation: charger: max14577: Document exported sysfs entry
devicetree: mfd: max14577: Add device tree bindings document
power: max17040: Add ID for MAX77836 Fuel Gauge block
charger: max14577: Configure battery-dependent settings from DTS and sysfs
...
Conflicts:
drivers/power/reset/Kconfig
drivers/power/reset/Makefile
Linus Torvalds [Wed, 15 Oct 2014 04:46:01 +0000 (06:46 +0200)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
"There is the long-awaited discard support for RBD (Guangliang Zhao,
Josh Durgin), a pile of RBD bug fixes that didn't belong in late -rc's
(Ilya Dryomov, Li RongQing), a pile of fs/ceph bug fixes and
performance and debugging improvements (Yan, Zheng, John Spray), and a
smattering of cleanups (Chao Yu, Fabian Frederick, Joe Perches)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (40 commits)
ceph: fix divide-by-zero in __validate_layout()
rbd: rbd workqueues need a resque worker
libceph: ceph-msgr workqueue needs a resque worker
ceph: fix bool assignments
libceph: separate multiple ops with commas in debugfs output
libceph: sync osd op definitions in rados.h
libceph: remove redundant declaration
ceph: additional debugfs output
ceph: export ceph_session_state_name function
ceph: include the initial ACL in create/mkdir/mknod MDS requests
ceph: use pagelist to present MDS request data
libceph: reference counting pagelist
ceph: fix llistxattr on symlink
ceph: send client metadata to MDS
ceph: remove redundant code for max file size verification
ceph: remove redundant io_iter_advance()
ceph: move ceph_find_inode() outside the s_mutex
ceph: request xattrs if xattr_version is zero
rbd: set the remaining discard properties to enable support
rbd: use helpers to handle discard for layered images correctly
...
Linus Torvalds [Wed, 15 Oct 2014 04:43:27 +0000 (06:43 +0200)]
Merge branch 'CVE-2014-7970' of git://git./linux/kernel/git/luto/linux
Pull pivot_root() fix from Andy Lutomirski.
Prevent a leak of unreachable mounts.
* 'CVE-2014-7970' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux:
mnt: Prevent pivot_root from creating a loop in the mount tree
David S. Miller [Wed, 15 Oct 2014 04:29:08 +0000 (00:29 -0400)]
Merge branch 'cxgb4'
Anish Bhatt says:
====================
ipv6 and related cleanup for cxgb4/cxgb4i
This patch set removes some duplicated/extraneous code from cxgb4i, guards
cxgb4 against compilation failure based on the ipv6 tristate, stops enabling
ipv6 related code by default irrespective of the ipv6 tristate, and fixes
a refcnt issue.
-Anish
v2 : Provide more detailed commit messages, make subject more concise as
recommended by Dave Miller.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Anish Bhatt [Wed, 15 Oct 2014 03:07:24 +0000 (20:07 -0700)]
cxgb4i: Remove duplicate call to dst_neigh_lookup()
There is an extra call to dst_neigh_lookup() left over in cxgb4i that can cause
an unreleased refcnt issue. Remove the extraneous call.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Fixes: 759a0cc5a3e1b ('cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api')
Signed-off-by: David S. Miller <davem@davemloft.net>
Anish Bhatt [Wed, 15 Oct 2014 03:07:23 +0000 (20:07 -0700)]
cxgb4i : Fix -Wunused-function warning
A bunch of ipv6 related code is left on by default. While this causes no
compilation issues, there is no need to have this enabled by default. Guard
with an ipv6 check, which also takes care of a -Wunused-function warning.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anish Bhatt [Wed, 15 Oct 2014 03:07:22 +0000 (20:07 -0700)]
cxgb4 : Fix build failure in cxgb4 when ipv6 is disabled/not in-built
cxgb4 ipv6 does not guard against ipv6 being disabled, or the standard
ipv6 module vs inbuilt tri-state issue. This was fixed for cxgb4i & iw_cxgb4
but missed for cxgb4.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Anish Bhatt [Wed, 15 Oct 2014 03:07:21 +0000 (20:07 -0700)]
cxgb4i : Remove duplicated CLIP handling code
cxgb4 already handles CLIP updates, thanks to a previous changeset for iw_cxgb4,
so there is no need to have this functionality in cxgb4i. Remove the duplicated code.
Signed-off-by: Anish Bhatt <anish@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 15 Oct 2014 02:37:58 +0000 (19:37 -0700)]
sparc64: Fix FPU register corruption with AES crypto offload.
The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the
key material is preloaded into the FPU registers, and then we loop
over and over doing the crypt operation, reusing those pre-cooked key
registers.
There are intervening blkcipher*() calls between the crypt operation
calls. And those might perform memcpy() and thus also try to use the
FPU.
The sparc64 kernel FPU usage mechanism is designed to allow such
recursive uses, but with a catch.
There has to be a trap between the two FPU using threads of control.
The mechanism works by, when the FPU is already in use by the kernel,
allocating a slot for FPU saving at trap time. Then if, within the
trap handler, we try to use the FPU registers, the pre-trap FPU
register state is saved into the slot. Then at trap return time we
notice this and restore the pre-trap FPU state.
Over the long term there are various more involved ways we can make
this work, but for a quick fix let's take advantage of the fact that
the situation where this happens is very limited.
All sparc64 chips that support the crypto instructions are also using
the Niagara4 memcpy routine, and that routine only uses the FPU for
large copies where we can't get the source aligned properly to a
multiple of 8 bytes.
We look to see if the FPU is already in use in this context, and if so
we use the non-large copy path which only uses integer registers.
Furthermore, we also limit this special logic to when we are doing a
kernel copy, rather than a user copy.
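The path selection described above can be sketched in plain C. This is only an illustrative model, not the sparc64 assembly: the function name, the enum, and the LARGE_COPY_BYTES cutoff are assumptions made for the example, since the text does not give the real threshold.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the real routine is hand-written assembly. */
enum copy_path { COPY_INTEGER_ONLY, COPY_FPU_LARGE };

/* Assumed cutoff for a "large" copy; the actual value is not stated. */
#define LARGE_COPY_BYTES 256

enum copy_path choose_copy_path(size_t len, bool src_8byte_alignable,
                                bool kernel_copy, bool fpu_in_use)
{
    /* Normally only large copies whose source cannot be aligned to a
     * multiple of 8 bytes take the FPU path. */
    bool wants_fpu = len >= LARGE_COPY_BYTES && !src_8byte_alignable;

    /* The quick fix: a kernel copy made while the FPU is already in use
     * falls back to the integer-only path, so preloaded FPU registers
     * are left untouched. */
    if (wants_fpu && kernel_copy && fpu_in_use)
        return COPY_INTEGER_ONLY;

    return wants_fpu ? COPY_FPU_LARGE : COPY_INTEGER_ONLY;
}
```

User copies keep their usual path in this model, matching the restriction to kernel copies stated above.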
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman [Wed, 8 Oct 2014 17:42:27 +0000 (10:42 -0700)]
mnt: Prevent pivot_root from creating a loop in the mount tree
Andy Lutomirski recently demonstrated that when chroot is used to set
the root path below the path for the new ``root'' passed to pivot_root
the pivot_root system call succeeds and leaks mounts.
In examining the code I see that starting with a new root that is
below the current root in the mount tree will result in a loop in the
mount tree after the mounts are detached and then reattached to one
another. This results in all kinds of ugliness, including a leak of the
mounts involved in the mount loop.
Prevent this problem by ensuring that the new mount is reachable from
the current root of the mount tree.
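A minimal sketch of such a reachability check, assuming a simplified mount tree in which every mount only records its parent. The struct and function names are hypothetical and do not mirror the kernel's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified mount node: each mount knows only its parent. */
struct mount_node {
    struct mount_node *parent; /* NULL at the top of the tree */
};

/* Walk upwards from mnt; true if root is mnt itself or one of its ancestors. */
bool mount_reachable_from(const struct mount_node *mnt,
                          const struct mount_node *root)
{
    for (; mnt != NULL; mnt = mnt->parent)
        if (mnt == root)
            return true;
    return false;
}

/* Returns 0 if the pivot may proceed, -1 if it must be refused. */
int check_pivot_root(const struct mount_node *new_root,
                     const struct mount_node *current_root)
{
    return mount_reachable_from(new_root, current_root) ? 0 : -1;
}
```

In this model a new root that sits above the current root is refused, which is the case that would otherwise produce a loop.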
[Added stable cc. Fixes CVE-2014-7970. --Andy]
Cc: stable@vger.kernel.org
Reported-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/87bnpmihks.fsf@x220.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Eric Dumazet [Mon, 13 Oct 2014 13:27:47 +0000 (06:27 -0700)]
tcp: TCP Small Queues and strange attractors
TCP Small Queues tries to keep the number of packets in the qdisc
as small as possible, and depends on a tasklet to feed following
packets at TX completion time.
The choice of a tasklet was driven by latency requirements.
Then, the TCP stack tries to avoid reordering by locking flows with
outstanding packets in the qdisc to a given TX queue.
What can happen is that many flows get attracted by a low-performing
TX queue, and the cpu servicing TX completion has to feed packets for all of
them, making this cpu 100% busy in softirq mode.
This became particularly visible with the latest skb->xmit_more support.
The strategy adopted in this patch is to detect when tcp_wfree() is called
from ksoftirqd and let the outstanding queue for this flow drain
before feeding additional packets, so that skb->ooo_okay can be set
to allow select_queue() to select the optimal queue.
Incoming ACKs are normally handled by different cpus, so this patch
gives these cpus more chance to take over the burden of feeding
the qdisc with future packets.
Tested:
lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 &
lpaa23:~# sar -n DEV 1 10 | grep eth1
06:16:18 AM eth1 595448.00 1190564.00 38381.09 1760253.12 0.00 0.00 1.00
06:16:19 AM eth1 594858.00 1189686.00 38340.76 1758952.72 0.00 0.00 0.00
06:16:20 AM eth1 597017.00 1194019.00 38480.79 1765370.29 0.00 0.00 1.00
06:16:21 AM eth1 595450.00 1190936.00 38380.19 1760805.05 0.00 0.00 0.00
06:16:22 AM eth1 596385.00 1193096.00 38442.56 1763976.29 0.00 0.00 1.00
06:16:23 AM eth1 598155.00 1195978.00 38552.97 1768264.60 0.00 0.00 0.00
06:16:24 AM eth1 594405.00 1188643.00 38312.57 1757414.89 0.00 0.00 1.00
06:16:25 AM eth1 593366.00 1187154.00 38252.16 1755195.83 0.00 0.00 0.00
06:16:26 AM eth1 593188.00 1186118.00 38232.88 1753682.57 0.00 0.00 1.00
06:16:27 AM eth1 596301.00 1192241.00 38440.94 1762733.09 0.00 0.00 0.00
Average:    eth1 595457.30 1190843.50 38381.69 1760664.84 0.00 0.00 0.50
lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog
backlog 7606336b 2513p requeues 167982
backlog 224072b 74p requeues 566
backlog 581376b 192p requeues 5598
backlog 181680b 60p requeues 1070
backlog 5305056b 1753p requeues 110166 // Here, this TX queue is attracting flows
backlog 157456b 52p requeues 1758
backlog 672216b 222p requeues 3025
backlog 60560b 20p requeues 24541
backlog 448144b 148p requeues 21258
lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect
Immediate jump to full bandwidth, and traffic is properly
sharded across all TX queues.
lpaa23:~# sar -n DEV 1 10 | grep eth1
06:16:46 AM eth1 1397632.00 2795397.00 90081.87 4133031.26 0.00 0.00 1.00
06:16:47 AM eth1 1396874.00 2793614.00 90032.99 4130385.46 0.00 0.00 0.00
06:16:48 AM eth1 1395842.00 2791600.00 89966.46 4127409.67 0.00 0.00 1.00
06:16:49 AM eth1 1395528.00 2791017.00 89946.17 4126551.24 0.00 0.00 0.00
06:16:50 AM eth1 1397891.00 2795716.00 90098.74 4133497.39 0.00 0.00 1.00
06:16:51 AM eth1 1394951.00 2789984.00 89908.96 4125022.51 0.00 0.00 0.00
06:16:52 AM eth1 1394608.00 2789190.00 89886.90 4123851.36 0.00 0.00 1.00
06:16:53 AM eth1 1395314.00 2790653.00 89934.33 4125983.09 0.00 0.00 0.00
06:16:54 AM eth1 1396115.00 2792276.00 89984.25 4128411.21 0.00 0.00 1.00
06:16:55 AM eth1 1396829.00 2793523.00 90030.19 4130250.28 0.00 0.00 0.00
Average:    eth1 1396158.40 2792297.00 89987.09 4128439.35 0.00 0.00 0.50
lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog
backlog 7900052b 2609p requeues 173287
backlog 878120b 290p requeues 589
backlog 1068884b 354p requeues 5621
backlog 996212b 329p requeues 1088
backlog 984100b 325p requeues 115316
backlog 956848b 316p requeues 1781
backlog 1080996b 357p requeues 3047
backlog 975016b 322p requeues 24571
backlog 990156b 327p requeues 21274
(All 8 TX queues get a fair share of the traffic)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Oct 2014 21:05:23 +0000 (17:05 -0400)]
Merge branch 'qlcnic'
Rajesh Borundia says:
====================
qlcnic: Bug fixes
This series fixes the following issues.
* We were programming maximum number of arguments supported by
adapter instead of required in a command.
* Destroy tx command requires three arguments instead of two.
Please apply these patches to net.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Tue, 14 Oct 2014 11:41:46 +0000 (07:41 -0400)]
qlcnic: Fix number of arguments in destroy tx context command
o Number of arguments taken by destroy tx command is three
instead of two.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Tue, 14 Oct 2014 11:41:45 +0000 (07:41 -0400)]
qlcnic: Fix programming number of arguments in a command.
o Initially we were programming maximum number of arguments.
Instead we should program number of arguments required in
a command.
o Maximum number of arguments for 82xx adapter is four. Fix it
for GET_ESWITCH_STATS command.
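The idea of programming only the arguments a command needs, rather than the adapter maximum, can be sketched as follows. This is an assumption-laden model and not the qlcnic code: struct cmd_model, program_cmd_args() and the plain array standing in for the argument registers are hypothetical, and only the limit of four comes from the text above.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ARGS 4 /* the text gives four as the 82xx maximum */

/* Hypothetical command descriptor, for illustration only. */
struct cmd_model {
    uint32_t arg[MODEL_MAX_ARGS];
    size_t num; /* arguments this particular command requires */
};

/* Writes only the required arguments into regs. Returns how many were
 * programmed, or 0 if the command asks for more than the maximum. */
size_t program_cmd_args(const struct cmd_model *cmd, uint32_t *regs)
{
    size_t i;

    if (cmd->num > MODEL_MAX_ARGS)
        return 0;
    for (i = 0; i < cmd->num; i++)
        regs[i] = cmd->arg[i];
    return i;
}
```

A command that needs three arguments leaves the fourth register slot untouched in this model.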
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Rustad [Tue, 14 Oct 2014 13:28:38 +0000 (06:28 -0700)]
genl_magic: Resolve logical-op warnings
Resolve "logical 'and' applied to non-boolean constant" warnings
that appear in W=2 builds by adding !! to a bit test.
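As a small illustration of the idiom (not the genl_magic macros themselves), the sketch below applies !! to a bit test so the result is exactly 0 or 1 before it is used in a logical expression. FLAG_MANDATORY and both function names are hypothetical.

```c
#define FLAG_MANDATORY 0x4000u /* hypothetical flag bit, for illustration */

/* !! collapses any non-zero masked value to exactly 1. */
int flag_is_set(unsigned int flags)
{
    return !!(flags & FLAG_MANDATORY);
}

/* The normalized bit test can then be combined with && as a boolean. */
int flag_set_and_enabled(unsigned int flags, int enabled)
{
    return !!(flags & FLAG_MANDATORY) && enabled;
}
```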
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Oct 2014 21:02:37 +0000 (17:02 -0400)]
net: Trap attempts to call sock_kfree_s() with a NULL pointer.
Unlike normal kfree() it is never right to call sock_kfree_s() with
a NULL pointer, because sock_kfree_s() also has the side effect of
discharging the memory from the sockets quota.
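A userspace model of why a NULL pointer is a problem here, assuming a simple per-socket byte counter. The names (struct sock_model, model_sock_kmalloc, model_sock_kfree_s) are hypothetical stand-ins, and the stderr message stands in for whatever warning the kernel emits.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket's option-memory accounting. */
struct sock_model {
    long omem_alloc; /* bytes currently charged to this socket */
};

void *model_sock_kmalloc(struct sock_model *sk, size_t size)
{
    void *mem = malloc(size);

    if (mem != NULL)
        sk->omem_alloc += (long)size; /* charge only on success */
    return mem;
}

void model_sock_kfree_s(struct sock_model *sk, void *mem, size_t size)
{
    if (mem == NULL) {
        /* Trap the misuse: uncharging here would discharge memory
         * that was never charged to the socket. */
        fprintf(stderr, "model_sock_kfree_s: called with NULL\n");
        return;
    }
    free(mem);
    sk->omem_alloc -= (long)size;
}
```

With the guard in place, a failed allocation followed by a free leaves the counter unchanged instead of driving it negative.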
Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Tue, 14 Oct 2014 19:35:08 +0000 (12:35 -0700)]
rds: avoid calling sock_kfree_s() on allocation failure
It is okay to free a NULL pointer but not okay to mischarge the socket optmem
accounting. Compile test only.
Reported-by: rucsoftsec@gmail.com
Cc: Chien Yen <chien.yen@oracle.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Shenai [Tue, 14 Oct 2014 20:24:14 +0000 (01:54 +0530)]
cxgb4: Fix FW flash logic using ethtool
Use t4_fw_upgrade instead of t4_load_fw to write firmware into FLASH, since
t4_load_fw doesn't co-ordinate with the firmware and the adapter can get hosed
enough to require a power cycle of the system.
Based on original work by Casey Leedom <leedom@chelsio.com>
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Oct 2014 20:40:49 +0000 (16:40 -0400)]
Merge branch 'stmmac'
Giuseppe Cavallaro says:
====================
stmmac: review and fix the dwmac-sti glue-logic
This patch reviews the whole glue logic adopted on STi SoCs, which
was buggy.
In the old glue logic there was a lot of confusion when setting up the
retiming, especially for STiD127 where, for example, bits 6 and 7
(in the GMAC control register) have a different meaning from the one
used for STiH4xx SoCs. So we cannot adopt the same glue for all these
SoCs.
Moreover, Gigabit on STiD127 didn't work and, for all the SoCs, the RGMII
couldn't run when the speed was 10Mbps (because the clock was not properly
managed).
Note that the phy clock needs to be provided by the platform, as
documented in the related binding file (updated as a consequence).
The old code supported too many configurations that were never adopted or validated.
This made the code very complex to maintain and debug in case of issues.
The patch simplifies all the configurations as commented in the tables
inside the file, and it has been tested on all the boards
based on the SoCs mentioned.
With this patch, dwmac-sti is also ready to support new configurations that
will be available on the next SoC generations.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Giuseppe CAVALLARO [Tue, 14 Oct 2014 06:12:56 +0000 (08:12 +0200)]
stmmac: dwmac-sti: review the glue-logic for STi4xx and STiD127 SoCs
This patch reviews the whole glue logic adopted on STi SoCs, which
was buggy.
In the old glue logic there was a lot of confusion when setting up the
retiming, especially for STiD127 where, for example, bits 6 and 7
(in the GMAC control register) have a different meaning from the one
used for STiH4xx SoCs. So we cannot adopt the same glue for all these
SoCs.
Moreover, Gigabit on STiD127 didn't work and, for all the SoCs, the RGMII
couldn't run when the speed was 10Mbps (because the clock was not properly
managed).
Note that the phy clock needs to be provided by the platform, as
documented in the related binding file (updated as a consequence).
The old code supported too many configurations that were never adopted or validated.
This made the code very complex to maintain and debug in case of issues.
The patch simplifies all the configurations as commented in the tables
inside the file, and it has been tested on all the boards
based on the SoCs mentioned.
With this patch, dwmac-sti is also ready to support new configurations that
will be available on the next SoC generations.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giuseppe CAVALLARO [Tue, 14 Oct 2014 06:12:55 +0000 (08:12 +0200)]
stmmac: make the STi Layer compatible to STiH407
This adds the missing compatibility to the STiH407 SoC.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Giuseppe CAVALLARO [Tue, 14 Oct 2014 06:11:54 +0000 (08:11 +0200)]
stmmac: platform: fix FIXED_PHY support.
On several STi platforms (e.g. stihxxx-b2120) an Ethernet switch is
embedded and connected to the stmmac via RGMII mode. So this is managed
by using the FIXED_PHY. In that case, the support in the platform needs
to be fixed to allow the stmmac to talk to the switch via fixed-link
by using the phy_bus_name property.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Tue, 14 Oct 2014 18:21:04 +0000 (11:21 -0700)]
dsa: mv88e6171: Fix tag_protocol check
tag_protocol is now an enum, so drivers have to check against it.
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Oct 2014 20:09:38 +0000 (16:09 -0400)]
Merge branch 'xgene'
Iyappan Subramanian says:
====================
Adding SGMII based 1GbE basic support to APM X-Gene SoC ethernet driver.
v2: Address comments from v1
* Split the patchset into two, the first one being preparatory patch
* Added link_state function pointer to the xgene_mac_ops structure
* Added xgene_indirect_ctl structure for indirect read/write arguments
v1:
* Initial version
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Tue, 14 Oct 2014 00:05:35 +0000 (17:05 -0700)]
drivers: net: xgene: Add SGMII based 1GbE ethtool support
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Tue, 14 Oct 2014 00:05:34 +0000 (17:05 -0700)]
drivers: net: xgene: Add SGMII based 1GbE support
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Tue, 14 Oct 2014 00:05:33 +0000 (17:05 -0700)]
drivers: net: xgene: Preparing for adding SGMII based 1GbE
- Added link_state function pointer to the xgene_mac_ops structure
- Moved ring manager (pdata->rm) assignment to xgene_enet_setup_ops
- Removed unused variable (pdata->phy_addr) and macro (FULL_DUPLEX)
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian [Tue, 14 Oct 2014 00:05:32 +0000 (17:05 -0700)]
dtb: Add SGMII based 1GbE node to APM X-Gene SoC device tree
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov [Tue, 14 Oct 2014 09:08:54 +0000 (02:08 -0700)]
net: filter: move common defines into bpf_common.h
Userspace programs that use eBPF instruction macros need to include two files:
uapi/linux/filter.h and uapi/linux/bpf.h.
Move common macro definitions that are shared between classic BPF and eBPF
into uapi/linux/bpf_common.h, so that a user app can include just the one bpf.h file.
Cc: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Tue, 14 Oct 2014 17:01:14 +0000 (19:01 +0200)]
caif_usb: use target structure member in memset
parent cfusbl was used instead of first structure member 'layer'
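The pattern can be sketched with a stand-in structure. The types and fields below are hypothetical, chosen only to show a memset that names the member it clears and takes its size from that member.

```c
#include <string.h>

/* Hypothetical stand-ins for the layer and its enclosing structure. */
struct layer_model {
    int id;
    void *up;
};

struct usb_layer_model {
    struct layer_model layer;     /* first member */
    unsigned char tx_eth_hdr[14]; /* must survive the reset */
};

void usb_layer_reset(struct usb_layer_model *this)
{
    /* Clear the member itself, sized from the member, instead of
     * passing the parent pointer with the member's size. */
    memset(&this->layer, 0, sizeof(this->layer));
}
```

Both forms touch the same bytes while 'layer' stays the first member; the explicit form keeps working if the layout changes.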
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Tue, 14 Oct 2014 17:00:55 +0000 (19:00 +0200)]
caif_usb: remove redundant memory message
Let MM subsystem display out of memory messages.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fabian Frederick [Mon, 13 Oct 2014 20:21:46 +0000 (22:21 +0200)]
caif: replace kmalloc/memset 0 by kzalloc
Also add blank line after declaration
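A userspace analogue of the change, assuming calloc() as the counterpart of kzalloc() and malloc() plus memset() as the counterpart of kmalloc() plus memset(). The struct and function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical object type, for illustration only. */
struct caif_like_obj {
    int id;
    char name[16];
};

/* Before: allocate, then clear in a second step. */
struct caif_like_obj *alloc_two_step(void)
{
    struct caif_like_obj *obj = malloc(sizeof(*obj));

    if (obj == NULL)
        return NULL;
    memset(obj, 0, sizeof(*obj));
    return obj;
}

/* After: a single call that returns zeroed memory. */
struct caif_like_obj *alloc_zeroed(void)
{
    struct caif_like_obj *obj = calloc(1, sizeof(*obj));

    return obj;
}
```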
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Mon, 13 Oct 2014 16:51:07 +0000 (22:21 +0530)]
drivers: net: cpsw: remove child devices while driver detach
Remove all the child devices from the system to make sure that re-inserting the
cpsw module doesn't fail on child devices populated by of_platform_populate().
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Mon, 13 Oct 2014 16:51:06 +0000 (22:21 +0530)]
drivers: net: davinci_cpdma: remove spinlock as SOFTIRQ-unsafe lock order detected
Remove the spinlock in cpdma_desc_pool_destroy(), as there is no active cpdma
channel and iounmap should be called without acquiring the lock.
root@dra7xx-evm:~# modprobe -r ti_cpsw
[ 50.539743]
[ 50.541312] ======================================================
[ 50.547796] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
[ 50.554826]
3.14.19-02124-g95c5b7b #308 Not tainted
[ 50.559939] ------------------------------------------------------
[ 50.566416] modprobe/1921 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[ 50.573347] (vmap_area_lock){+.+...}, at: [<
c01127fc>] find_vmap_area+0x10/0x6c
[ 50.581132]
[ 50.581132] and this task is already holding:
[ 50.587249] (&(&pool->lock)->rlock#2){..-...}, at: [<
bf017c74>] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma]
[ 50.597766] which would create a new lock dependency:
[ 50.603048] (&(&pool->lock)->rlock#2){..-...} -> (vmap_area_lock){+.+...}
[ 50.610296]
[ 50.610296] but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 50.618601] (&(&pool->lock)->rlock#2){..-...}
... which became SOFTIRQ-irq-safe at:
[ 50.626829] [<
c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c
[ 50.632677] [<
bf01773c>] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma]
[ 50.640437] [<
bf0177e8>] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma]
[ 50.647289] [<
bf017908>] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma]
[ 50.654512] [<
bf017984>] cpdma_chan_process+0x3c/0x54 [davinci_cpdma]
[ 50.661455] [<
bf0277e8>] cpsw_poll+0x14/0xa8 [ti_cpsw]
[ 50.667038] [<
c05844f4>] net_rx_action+0xc0/0x1e8
[ 50.672150] [<
c0048234>] __do_softirq+0xcc/0x304
[ 50.677183] [<
c004873c>] irq_exit+0xa8/0xfc
[ 50.681751] [<
c000eeac>] handle_IRQ+0x50/0xb0
[ 50.686513] [<
c0008638>] gic_handle_irq+0x28/0x5c
[ 50.691628] [<
c06590a4>] __irq_svc+0x44/0x5c
[ 50.696289] [<
c0658ab4>] _raw_spin_unlock_irqrestore+0x34/0x44
[ 50.702591] [<
c065a9c4>] do_page_fault.part.9+0x144/0x3c4
[ 50.708433] [<
c065acb8>] do_page_fault+0x74/0x84
[ 50.713453] [<
c00083dc>] do_DataAbort+0x34/0x98
[ 50.718391] [<
c065923c>] __dabt_usr+0x3c/0x40
[ 50.723148]
[ 50.723148] to a SOFTIRQ-irq-unsafe lock:
[ 50.728893] (vmap_area_lock){+.+...}
... which became SOFTIRQ-irq-unsafe at:
[ 50.736476] ... [<
c06584e8>] _raw_spin_lock+0x28/0x38
[ 50.741876] [<
c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[ 50.747908] [<
c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[ 50.754210] [<
c011486c>] get_vm_area_caller+0x3c/0x48
[ 50.759692] [<
c0114be0>] vmap+0x40/0x78
[ 50.763900] [<
c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[ 50.769835] [<
c093eac0>] start_kernel+0x320/0x388
[ 50.774952] [<
80008074>] 0x80008074
[ 50.778793]
[ 50.778793] other info that might help us debug this:
[ 50.778793]
[ 50.787181] Possible interrupt unsafe locking scenario:
[ 50.787181]
[ 50.794295]        CPU0                    CPU1
[ 50.799042]        ----                    ----
[ 50.803785]   lock(vmap_area_lock);
[ 50.807446]                                local_irq_disable();
[ 50.813652]                                lock(&(&pool->lock)->rlock#2);
[ 50.820782]                                lock(vmap_area_lock);
[ 50.827086]   <Interrupt>
[ 50.829823]     lock(&(&pool->lock)->rlock#2);
[ 50.834490]
[ 50.834490] *** DEADLOCK ***
[ 50.834490]
[ 50.840695] 4 locks held by modprobe/1921:
[ 50.844981] #0: (&__lockdep_no_validate__){......}, at: [<c03e53e8>] driver_detach+0x44/0xb8
[ 50.854038] #1: (&__lockdep_no_validate__){......}, at: [<c03e53f4>] driver_detach+0x50/0xb8
[ 50.863102] #2: (&(&ctlr->lock)->rlock){......}, at: [<bf017c34>] cpdma_ctlr_destroy+0x1c/0x114 [davinci_cpdma]
[ 50.873890] #3: (&(&pool->lock)->rlock#2){..-...}, at: [<bf017c74>] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma]
[ 50.884871]
the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[ 50.892827] -> (&(&pool->lock)->rlock#2){..-...} ops: 167 {
[ 50.898703] IN-SOFTIRQ-W at:
[ 50.901995] [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c
[ 50.909476] [<bf01773c>] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma]
[ 50.918878] [<bf0177e8>] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma]
[ 50.927366] [<bf017908>] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma]
[ 50.936218] [<bf017984>] cpdma_chan_process+0x3c/0x54 [davinci_cpdma]
[ 50.944794] [<bf0277e8>] cpsw_poll+0x14/0xa8 [ti_cpsw]
[ 50.952009] [<c05844f4>] net_rx_action+0xc0/0x1e8
[ 50.958765] [<c0048234>] __do_softirq+0xcc/0x304
[ 50.965432] [<c004873c>] irq_exit+0xa8/0xfc
[ 50.971635] [<c000eeac>] handle_IRQ+0x50/0xb0
[ 50.978035] [<c0008638>] gic_handle_irq+0x28/0x5c
[ 50.984788] [<c06590a4>] __irq_svc+0x44/0x5c
[ 50.991085] [<c0658ab4>] _raw_spin_unlock_irqrestore+0x34/0x44
[ 50.999023] [<c065a9c4>] do_page_fault.part.9+0x144/0x3c4
[ 51.006510] [<c065acb8>] do_page_fault+0x74/0x84
[ 51.013171] [<c00083dc>] do_DataAbort+0x34/0x98
[ 51.019738] [<c065923c>] __dabt_usr+0x3c/0x40
[ 51.026129] INITIAL USE at:
[ 51.029335] [<c06585a4>] _raw_spin_lock_irqsave+0x38/0x4c
[ 51.036729] [<bf017d78>] cpdma_chan_submit+0x4c/0x2f0 [davinci_cpdma]
[ 51.045225] [<bf02863c>] cpsw_ndo_open+0x378/0x6bc [ti_cpsw]
[ 51.052897] [<c058747c>] __dev_open+0x9c/0x104
[ 51.059287] [<c05876ec>] __dev_change_flags+0x88/0x160
[ 51.066420] [<c05877e4>] dev_change_flags+0x18/0x48
[ 51.073270] [<c05ed51c>] devinet_ioctl+0x61c/0x6e0
[ 51.080029] [<c056ee54>] sock_ioctl+0x5c/0x298
[ 51.086418] [<c01350a4>] do_vfs_ioctl+0x78/0x61c
[ 51.092993] [<c01356ac>] SyS_ioctl+0x64/0x74
[ 51.099200] [<c000e580>] ret_fast_syscall+0x0/0x48
[ 51.105956] }
[ 51.107696] ... key at: [<bf019000>] __key.21312+0x0/0xfffff650 [davinci_cpdma]
[ 51.115912] ... acquired at:
[ 51.119019] [<c00899ac>] lock_acquire+0x9c/0x104
[ 51.124138] [<c06584e8>] _raw_spin_lock+0x28/0x38
[ 51.129341] [<c01127fc>] find_vmap_area+0x10/0x6c
[ 51.134547] [<c0114960>] remove_vm_area+0x8/0x6c
[ 51.139659] [<c0114a7c>] __vunmap+0x20/0xf8
[ 51.144318] [<c001c350>] __arm_iounmap+0x10/0x18
[ 51.149440] [<bf017d08>] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma]
[ 51.156560] [<bf026294>] cpsw_remove+0x48/0x8c [ti_cpsw]
[ 51.162407] [<c03e62c8>] platform_drv_remove+0x18/0x1c
[ 51.168063] [<c03e4c44>] __device_release_driver+0x70/0xc8
[ 51.174094] [<c03e5458>] driver_detach+0xb4/0xb8
[ 51.179212] [<c03e4a6c>] bus_remove_driver+0x4c/0x90
[ 51.184693] [<c00b024c>] SyS_delete_module+0x10c/0x198
[ 51.190355] [<c000e580>] ret_fast_syscall+0x0/0x48
[ 51.195661]
[ 51.197217]
the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[ 51.205986] -> (vmap_area_lock){+.+...} ops: 520 {
[ 51.211032] HARDIRQ-ON-W at:
[ 51.214321] [<c06584e8>] _raw_spin_lock+0x28/0x38
[ 51.221090] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[ 51.228750] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[ 51.236690] [<c011486c>] get_vm_area_caller+0x3c/0x48
[ 51.243811] [<c0114be0>] vmap+0x40/0x78
[ 51.249654] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[ 51.257239] [<c093eac0>] start_kernel+0x320/0x388
[ 51.263994] [<80008074>] 0x80008074
[ 51.269474] SOFTIRQ-ON-W at:
[ 51.272769] [<c06584e8>] _raw_spin_lock+0x28/0x38
[ 51.279525] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[ 51.287190] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[ 51.295126] [<c011486c>] get_vm_area_caller+0x3c/0x48
[ 51.302245] [<c0114be0>] vmap+0x40/0x78
[ 51.308094] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[ 51.315669] [<c093eac0>] start_kernel+0x320/0x388
[ 51.322423] [<80008074>] 0x80008074
[ 51.327906] INITIAL USE at:
[ 51.331112] [<c06584e8>] _raw_spin_lock+0x28/0x38
[ 51.337775] [<c011376c>] alloc_vmap_area.isra.28+0xb8/0x300
[ 51.345352] [<c0113a44>] __get_vm_area_node.isra.29+0x90/0x134
[ 51.353197] [<c011486c>] get_vm_area_caller+0x3c/0x48
[ 51.360224] [<c0114be0>] vmap+0x40/0x78
[ 51.365977] [<c09442f0>] check_writebuffer_bugs+0x54/0x1a0
[ 51.373464] [<c093eac0>] start_kernel+0x320/0x388
[ 51.380131] [<80008074>] 0x80008074
[ 51.385517] }
[ 51.387260] ... key at: [<c0a66948>] vmap_area_lock+0x10/0x20
[ 51.393841] ... acquired at:
[ 51.396945] [<c00899ac>] lock_acquire+0x9c/0x104
[ 51.402060] [<c06584e8>] _raw_spin_lock+0x28/0x38
[ 51.407266] [<c01127fc>] find_vmap_area+0x10/0x6c
[ 51.412478] [<c0114960>] remove_vm_area+0x8/0x6c
[ 51.417592] [<c0114a7c>] __vunmap+0x20/0xf8
[ 51.422252] [<c001c350>] __arm_iounmap+0x10/0x18
[ 51.427369] [<bf017d08>] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma]
[ 51.434487] [<bf026294>] cpsw_remove+0x48/0x8c [ti_cpsw]
[ 51.440336] [<c03e62c8>] platform_drv_remove+0x18/0x1c
[ 51.446000] [<c03e4c44>] __device_release_driver+0x70/0xc8
[ 51.452031] [<c03e5458>] driver_detach+0xb4/0xb8
[ 51.457147] [<c03e4a6c>] bus_remove_driver+0x4c/0x90
[ 51.462628] [<c00b024c>] SyS_delete_module+0x10c/0x198
[ 51.468289] [<c000e580>] ret_fast_syscall+0x0/0x48
[ 51.473584]
[ 51.475140]
[ 51.475140] stack backtrace:
[ 51.479703] CPU: 0 PID: 1921 Comm: modprobe Not tainted 3.14.19-02124-g95c5b7b #308
[ 51.487744] [<c0016090>] (unwind_backtrace) from [<c0012060>] (show_stack+0x10/0x14)
[ 51.495865] [<c0012060>] (show_stack) from [<c0652a20>] (dump_stack+0x78/0x94)
[ 51.503444] [<c0652a20>] (dump_stack) from [<c0086f18>] (check_usage+0x408/0x594)
[ 51.511293] [<c0086f18>] (check_usage) from [<c00870f8>] (check_irq_usage+0x54/0xb0)
[ 51.519416] [<c00870f8>] (check_irq_usage) from [<c0088724>] (__lock_acquire+0xe54/0x1b90)
[ 51.528077] [<c0088724>] (__lock_acquire) from [<c00899ac>] (lock_acquire+0x9c/0x104)
[ 51.536291] [<c00899ac>] (lock_acquire) from [<c06584e8>] (_raw_spin_lock+0x28/0x38)
[ 51.544417] [<c06584e8>] (_raw_spin_lock) from [<c01127fc>] (find_vmap_area+0x10/0x6c)
[ 51.552726] [<c01127fc>] (find_vmap_area) from [<c0114960>] (remove_vm_area+0x8/0x6c)
[ 51.560935] [<c0114960>] (remove_vm_area) from [<c0114a7c>] (__vunmap+0x20/0xf8)
[ 51.568693] [<c0114a7c>] (__vunmap) from [<c001c350>] (__arm_iounmap+0x10/0x18)
[ 51.576362] [<c001c350>] (__arm_iounmap) from [<bf017d08>] (cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma])
[ 51.586494] [<bf017d08>] (cpdma_ctlr_destroy [davinci_cpdma]) from [<bf026294>] (cpsw_remove+0x48/0x8c [ti_cpsw])
[ 51.597261] [<bf026294>] (cpsw_remove [ti_cpsw]) from [<c03e62c8>] (platform_drv_remove+0x18/0x1c)
[ 51.606659] [<c03e62c8>] (platform_drv_remove) from [<c03e4c44>] (__device_release_driver+0x70/0xc8)
[ 51.616237] [<c03e4c44>] (__device_release_driver) from [<c03e5458>] (driver_detach+0xb4/0xb8)
[ 51.625264] [<c03e5458>] (driver_detach) from [<c03e4a6c>] (bus_remove_driver+0x4c/0x90)
[ 51.633749] [<c03e4a6c>] (bus_remove_driver) from [<c00b024c>] (SyS_delete_module+0x10c/0x198)
[ 51.642781] [<c00b024c>] (SyS_delete_module) from [<c000e580>] (ret_fast_syscall+0x0/0x48)
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mugunthan V N [Mon, 13 Oct 2014 16:51:05 +0000 (22:21 +0530)]
drivers: net: davinci_cpdma: remove kfree on objects allocated with devm_* apis
Memory allocated with the devm_* APIs must not be freed with kfree(),
so remove the kfree() calls.
Fixes: e194312854ed ('drivers: net: davinci_cpdma: Convert kzalloc() to devm_kzalloc().')
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Prashant Sreedharan [Mon, 13 Oct 2014 16:21:42 +0000 (09:21 -0700)]
tg3: Add skb->xmit_more support
Ring TX doorbell only if xmit_more is not set or the queue is stopped.
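The doorbell rule above can be sketched in plain C. This is an illustration only, not the tg3 code: struct tx_ring_sketch, should_ring_doorbell() and queue_packet() are hypothetical names, and the "doorbell" is just a counter standing in for the MMIO write.

```c
#include <stdbool.h>

/* Hypothetical stand-in for driver TX state; not the tg3 structures. */
struct tx_ring_sketch {
	unsigned int producer;        /* next descriptor index filled by software */
	unsigned int doorbell;        /* last index published to the "hardware" */
	unsigned int doorbell_writes; /* how many doorbell writes were issued */
	bool queue_stopped;           /* TX queue has been stopped */
};

/* Ring only when no further packet is promised, or the queue is stopped. */
static bool should_ring_doorbell(bool xmit_more, bool queue_stopped)
{
	return !xmit_more || queue_stopped;
}

/* Queue one packet; the doorbell write is deferred while xmit_more is set. */
static void queue_packet(struct tx_ring_sketch *ring, bool xmit_more)
{
	ring->producer++;
	if (should_ring_doorbell(xmit_more, ring->queue_stopped)) {
		ring->doorbell = ring->producer;
		ring->doorbell_writes++;
	}
}
```

With three packets queued back to back and xmit_more set on the first two, this sketch issues a single doorbell write covering all three descriptors.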
Suggested-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Prashant Sreedharan <prashant@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Mon, 13 Oct 2014 14:34:10 +0000 (16:34 +0200)]
ipv4: fix nexthop attlen check in fib_nh_match
fib_nh_match does not match nexthops correctly. Example:
ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \
nexthop via 192.168.122.13 dev eth0
ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \
nexthop via 192.168.122.15 dev eth0
The del command succeeds and the route is removed, even though the
nexthops do not match. With this patch applied, the nexthops are
compared correctly and the result is:
RTNETLINK answers: No such process
Please consider this for stable trees as well.
Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config")
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 11 Oct 2014 22:17:29 +0000 (15:17 -0700)]
tcp: fix tcp_ack() performance problem
We worked hard to improve tcp_ack() performance, by not accessing
skb_shinfo() in fast path (cd7d8498c9a5 tcp: change tcp_skb_pcount()
location)
We still have one spurious access because of ACK timestamping,
added in commit e1c8a607b281 ("net-timestamp: ACK timestamp for
bytestreams")
By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set,
we can avoid two cache line misses for the common case.
While we are at it, add two prefetchw() :
One in tcp_ack() to bring skb at the head of write queue.
One in tcp_clean_rtx_queue() loop to bring following skb,
as we will delete skb from the write queue and dirty skb->next->prev.
Add a couple of [un]likely() clauses.
After this patch, tcp_ack() is no longer the most CPU-consuming
function in the TCP stack.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yan, Zheng [Tue, 14 Oct 2014 07:38:01 +0000 (15:38 +0800)]
ceph: fix divide-by-zero in __validate_layout()
The 'stripe_unit' field is 64 bits; casting it to 32 bits can result in zero.
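A minimal sketch of the truncation hazard, assuming nothing about the real __validate_layout() beyond what the message says. truncate_to_32() and object_size_is_multiple() are hypothetical helpers, and the zero check shown is one possible guard, not necessarily the one the patch uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Keeps only the low 32 bits, as a cast to a 32-bit type does. */
static uint32_t truncate_to_32(uint64_t v)
{
	return (uint32_t)v;
}

/*
 * Hypothetical validation step: do the modulo at full 64-bit width and
 * reject a zero divisor explicitly, so no division by zero can occur.
 */
static bool object_size_is_multiple(uint64_t object_size, uint64_t stripe_unit)
{
	if (stripe_unit == 0)
		return false;
	return object_size % stripe_unit == 0;
}
```

A stripe_unit of 1ULL << 32 is nonzero as a 64-bit value but becomes 0 after truncate_to_32(), which is the kind of value that would turn a 32-bit modulo into a division by zero.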
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Ilya Dryomov [Fri, 10 Oct 2014 14:36:07 +0000 (18:36 +0400)]
rbd: rbd workqueues need a resque worker
Need to use WQ_MEM_RECLAIM for our workqueues to prevent I/O lockups
under memory pressure - we sit on the memory reclaim path.
Cc: stable@vger.kernel.org # 3.17, needs backporting for 3.16
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Tested-by: Micha Krause <micha@krausam.de>
Reviewed-by: Sage Weil <sage@redhat.com>
Ilya Dryomov [Fri, 10 Oct 2014 12:39:05 +0000 (16:39 +0400)]
libceph: ceph-msgr workqueue needs a resque worker
Commit f363e45fd118 ("net/ceph: make ceph_msgr_wq non-reentrant")
effectively removed WQ_MEM_RECLAIM flag from ceph_msgr_wq. This is
wrong - libceph is very much a memory reclaim path, so restore it.
Cc: stable@vger.kernel.org # needs backporting for < 3.12
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Tested-by: Micha Krause <micha@krausam.de>
Reviewed-by: Sage Weil <sage@redhat.com>
Fabian Frederick [Thu, 9 Oct 2014 21:16:35 +0000 (23:16 +0200)]
ceph: fix bool assignments
Fix some coccinelle warnings:
fs/ceph/caps.c:2400:6-10: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2401:6-15: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2402:6-17: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2403:6-22: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2404:6-22: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2405:6-19: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2440:4-20: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2469:3-16: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2490:2-18: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2519:3-7: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2549:3-12: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2575:2-6: WARNING: Assignment of bool to 0/1
fs/ceph/caps.c:2589:3-7: WARNING: Assignment of bool to 0/1
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Ilya Dryomov [Mon, 6 Oct 2014 14:40:27 +0000 (18:40 +0400)]
libceph: separate multiple ops with commas in debugfs output
For requests with multiple ops, separate ops with commas instead of \t,
which is a field separator here.
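The separator change can be illustrated with a small userspace sketch. format_ops() is a hypothetical helper, not the libceph debugfs code (which writes through the kernel's seq_file interface); it only shows joining op names with commas so that a tab stays reserved as the field separator.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Join n op names into buf, separated by commas.
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_ops(char *buf, size_t len, const char *const *ops, size_t n)
{
	size_t used = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';
	for (size_t i = 0; i < n; i++) {
		/* No separator before the first op, a comma before the rest. */
		int w = snprintf(buf + used, len - used, "%s%s",
				 i ? "," : "", ops[i]);
		if (w < 0 || (size_t)w >= len - used)
			return -1; /* output would not fit */
		used += (size_t)w;
	}
	return (int)used;
}
```

A line consumer can then split on tabs to get the fields and split the ops field on commas, without the two levels colliding.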
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Ilya Dryomov [Thu, 2 Oct 2014 13:22:29 +0000 (17:22 +0400)]
libceph: sync osd op definitions in rados.h
Bring in missing osd ops and strings, use macros to eliminate multiple
points of maintenance.
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Fabian Frederick [Tue, 30 Sep 2014 20:07:50 +0000 (22:07 +0200)]
libceph: remove redundant declaration
ceph_release_page_vector was defined twice in libceph.h
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
John Spray [Fri, 12 Sep 2014 15:58:49 +0000 (16:58 +0100)]
ceph: additional debugfs output
MDS session state and client global ID are
useful instrumentation when testing.
Signed-off-by: John Spray <john.spray@redhat.com>
John Spray [Fri, 19 Sep 2014 12:51:08 +0000 (13:51 +0100)]
ceph: export ceph_session_state_name function
...so that it can be used from the ceph debugfs
code when dumping session info.
Signed-off-by: John Spray <john.spray@redhat.com>
Yan, Zheng [Tue, 16 Sep 2014 12:35:17 +0000 (20:35 +0800)]
ceph: include the initial ACL in create/mkdir/mknod MDS requests
The current code sets a new file/directory's initial ACL in a non-atomic
manner.
The client first sends a request to the MDS to create the new
file/directory, then sets the initial ACL after the new file/directory
has been created successfully.
The fix is to include the initial ACL in create/mkdir/mknod MDS requests,
so the MDS can create the file/directory and set the initial ACL in
one request.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Yan, Zheng [Tue, 16 Sep 2014 11:15:28 +0000 (19:15 +0800)]
ceph: use pagelist to present MDS request data
The current code uses a page array to present MDS request data. Pages in
the array are allocated/freed by the caller of ceph_mdsc_do_request(). If
the request is interrupted, the pages can be freed while they are still
being used by the request message.
The fix is to use a pagelist to present MDS request data. A pagelist is
reference counted.
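Why reference counting removes the lifetime problem can be sketched as follows. struct pagelist_sketch and its helpers are hypothetical and deliberately simplified: the counter here is a plain int, whereas kernel code would use an atomic counter, and the actual pages are omitted.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified pagelist: only the lifetime logic is shown. */
struct pagelist_sketch {
	int refcnt;    /* number of holders; NOT atomic in this sketch */
	size_t length; /* bytes of payload (unused here) */
};

/* Allocate with one reference, owned by the caller. */
static struct pagelist_sketch *pagelist_alloc(void)
{
	struct pagelist_sketch *pl = calloc(1, sizeof(*pl));

	if (pl)
		pl->refcnt = 1;
	return pl;
}

/* Take an extra reference, e.g. for the in-flight request message. */
static struct pagelist_sketch *pagelist_get(struct pagelist_sketch *pl)
{
	pl->refcnt++;
	return pl;
}

/* Drop a reference; returns 1 if this was the last one and pl was freed. */
static int pagelist_put(struct pagelist_sketch *pl)
{
	if (--pl->refcnt > 0)
		return 0;
	free(pl);
	return 1;
}
```

If the caller is interrupted and drops its reference early, the data survives until the message drops its own reference, which is the property the page array lacked.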
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Yan, Zheng [Tue, 16 Sep 2014 09:50:45 +0000 (17:50 +0800)]
libceph: reference counting pagelist
This allows a pagelist to present data that may be sent multiple times.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Yan, Zheng [Thu, 18 Sep 2014 08:11:12 +0000 (16:11 +0800)]
ceph: fix llistxattr on symlink
Only regular files and directories have vxattrs.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
John Spray [Tue, 9 Sep 2014 18:26:01 +0000 (19:26 +0100)]
ceph: send client metadata to MDS
Implement version 2 of CEPH_MSG_CLIENT_SESSION syntax,
which includes additional client metadata to allow
the MDS to report on clients by user-sensible names
like hostname.
Signed-off-by: John Spray <john.spray@redhat.com>
Reviewed-by: Yan, Zheng <zyan@redhat.com>
David S. Miller [Tue, 14 Oct 2014 19:05:39 +0000 (15:05 -0400)]
Merge branch 'isdn'
Tilman Schmidt says:
====================
Coverity patches for drivers/isdn
Here's a series of patches for the ISDN CAPI subsystem and the
Gigaset ISDN driver.
Patches 1 to 7 are specific fixes for Coverity warnings.
Patches 8 to 11 fix related problems with the handling of invalid
CAPI command codes I noticed while working on this.
Patch 12 fixes an unrelated problem I noticed during the subsequent
regression tests.
It would be great if these could still be merged.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>