git.openwrt.org Git - openwrt/staging/blogic.git/log

lib/checksum.c: fix carry in csum_tcpudp_nofold

The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.

Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify

Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081

This patch avoids calling rtnl_notify if the device ndo_bridge_getlink
handler does not return any bytes in the skb.

Alternately, the skb->len check can be moved inside rtnl_notify.

For the bridge vlan case described in 92081, there is also a fix needed
in bridge driver to generate a proper notification. Will fix that in
subsequent patch.

v2: rebase patch on net tree

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tcp_stretch_acks'

Neal Cardwell says:

====================
fix stretch ACK bugs in TCP CUBIC and Reno

This patch series fixes the TCP CUBIC and Reno congestion control
modules to properly handle stretch ACKs in their respective additive
increase modes, and in the transitions from slow start to additive
increase.

This finishes the project started by commit 9f9843a751d0a2057 ("tcp:
properly handle stretch acks in slow start"), which fixed behavior for
TCP congestion control when handling stretch ACKs in slow start mode.

Motivation: In the Jan 2015 netdev thread 'BW regression after "tcp:
refine TSO autosizing"', Eyal Perry documented a regression that Eric
Dumazet determined was caused by improper handling of TCP stretch
ACKs.

Background: LRO, GRO, delayed ACKs, and middleboxes can cause "stretch
ACKs" that cover more than the RFC-specified maximum of 2
packets. These stretch ACKs can cause serious performance shortfalls
in common congestion control algorithms, like Reno and CUBIC, which
were designed and tuned years ago with receiver hosts that were not
using LRO or GRO, and were instead ACKing every other packet.

Testing: at Google we have been using this approach for handling
stretch ACKs for CUBIC datacenter and Internet traffic for several
years, with good results.

v2:
* fixed return type of tcp_slow_start() to be u32 instead of int
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix timing issue in CUBIC slope calculation

This patch fixes a bug in CUBIC that causes cwnd to increase slightly
too slowly when multiple ACKs arrive in the same jiffy.

If cwnd is supposed to increase at a rate of more than once per jiffy,
then CUBIC was sometimes too slow. Because the bic_target is
calculated for a future point in time, calculated with time in
jiffies, the cwnd can increase over the course of the jiffy while the
bic_target calculated as the proper CUBIC cwnd at time
t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only
increases on jiffy tick boundaries.

So since the cnt is set to:
ca->cnt = cwnd / (bic_target - cwnd);
as cwnd increases but bic_target does not increase due to jiffy
granularity, the cnt becomes too large, causing cwnd to increase
too slowly.

For example:
- suppose at the beginning of a jiffy, cwnd=40, bic_target=44
- so CUBIC sets:
   ca->cnt =  cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10
- suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai()
   increases cwnd to 41
- so CUBIC sets:
   ca->cnt =  cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13

So now CUBIC will wait for 13 packets to be ACKed before increasing
cwnd to 42, insted of 10 as it should.

The fix is to avoid adjusting the slope (determined by ca->cnt)
multiple times within a jiffy, and instead skip to compute the Reno
cwnd, the "TCP friendliness" code path.

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix stretch ACK bugs in CUBIC

Change CUBIC to properly handle stretch ACKs in additive increase mode
by passing in the count of ACKed packets to tcp_cong_avoid_ai().

In addition, because we are now precisely accounting for stretch ACKs,
including delayed ACKs, we can now remove the delayed ACK tracking and
estimation code that tracked recent delayed ACK behavior in
ca->delayed_ack.

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix stretch ACK bugs in Reno

Change Reno to properly handle stretch ACKs in additive increase mode
by passing in the count of ACKed packets to tcp_cong_avoid_ai().

In addition, if snd_cwnd crosses snd_ssthresh during slow start
processing, and we then exit slow start mode, we need to carry over
any remaining "credit" for packets ACKed and apply that to additive
increase by passing this remaining "acked" count to
tcp_cong_avoid_ai().

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix the timid additive increase on stretch ACKs

tcp_cong_avoid_ai() was too timid (snd_cwnd increased too slowly) on
"stretch ACKs" -- cases where the receiver ACKed more than 1 packet in
a single ACK. For example, suppose w is 10 and we get a stretch ACK
for 20 packets, so acked is 20. We ought to increase snd_cwnd by 2
(since acked/w = 20/10 = 2), but instead we were only increasing cwnd
by 1. This patch fixes that behavior.

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: stretch ACK fixes prep

LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that
cover more than the RFC-specified maximum of 2 packets. These stretch
ACKs can cause serious performance shortfalls in common congestion
control algorithms that were designed and tuned years ago with
receiver hosts that were not using LRO or GRO, and were instead
politely ACKing every other packet.

This patch series fixes Reno and CUBIC to handle stretch ACKs.

This patch prepares for the upcoming stretch ACK bug fix patches. It
adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future
fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and
changes all congestion control algorithms to pass in 1 for the ACKed
count. It also changes tcp_slow_start() to return the number of packet
ACK "credits" that were not processed in slow start mode, and can be
processed by the congestion control module in additive increase mode.

In future patches we will fix tcp_cong_avoid_ai() to handle stretch
ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start
and additive increase mode.

Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Don't OOPS on socket AIO, from Christoph Hellwig.

2) Scheduled scans should be aborted upon RFKILL, from Emmanuel
    Grumbach.

3) Fix sleep in atomic context in kvaser_usb, from Ahmed S Darwish.

4) Fix RCU locking across copy_to_user() in bpf code, from Alexei
    Starovoitov.

5) Lots of crash, memory leak, short TX packet et al bug fixes in
    sh_eth from Ben Hutchings.

6) Fix memory corruption in SCTP wrt.  INIT collitions, from Daniel
    Borkmann.

7) Fix return value logic for poll handlers in netxen, enic, and bnx2x.
    From Eric Dumazet and Govindarajulu Varadarajan.

8) Header length calculation fix in mac80211 from Fred Chou.

9) mv643xx_eth doesn't handle highmem correctly in non-TSO code paths.
    From Ezequiel Garcia.

10) udp_diag has bogus logic in it's hash chain skipping, copy same fix
    tcp diag used.  From Herbert Xu.

11) amd-xgbe programs wrong rx flow control register, from Thomas
    Lendacky.

12) Fix race leading to use after free in ping receive path, from Subash
    Abhinov Kasiviswanathan.

13) Cache redirect routes otherwise we can get a heavy backlog of rcu
    jobs liberating DST_NOCACHE entries.  From Hannes Frederic Sowa.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
  net: don't OOPS on socket aio
  stmmac: prevent probe drivers to crash kernel
  bnx2x: fix napi poll return value for repoll
  ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too
  sh_eth: Fix DMA-API usage for RX buffers
  sh_eth: Check for DMA mapping errors on transmit
  sh_eth: Ensure DMA engines are stopped before freeing buffers
  sh_eth: Remove RX overflow log messages
  ping: Fix race in free in receive path
  udp_diag: Fix socket skipping within chain
  can: kvaser_usb: Fix state handling upon BUS_ERROR events
  can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
  can: kvaser_usb: Send correct context to URB completion
  can: kvaser_usb: Do not sleep in atomic context
  ipv4: try to cache dst_entries which would cause a redirect
  samples: bpf: relax test_maps check
  bpf: rcu lock must not be held when calling copy_to_user()
  net: sctp: fix slab corruption from use after free on INIT collisions
  net: mv643xx_eth: Fix highmem support in non-TSO egress path
  sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers
  ...

net: don't OOPS on socket aio

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

stmmac: prevent probe drivers to crash kernel

In the case when alloc_netdev fails we return NULL to a caller. But there is no
check for NULL in the probe drivers. This patch changes NULL to an error
pointer. The function description is amended to reflect what we may get
returned.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'powerpc-3.19-5' of git://git./linux/kernel/git/mpe/linux

Pull powerpc fixes from Michael Ellerman:
"Two powerpc fixes"

* tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared
powerpc/xmon: Fix another endiannes issue in RTAS call from xmon

Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/rusty/linux

Pull one more module fix from Rusty Russell:
"SCSI was using module_refcount() to figure out when the module was
  unloading: this broke with new atomic refcounting.  The code is still
  suspicious, but this solves the WARN_ON()"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  scsi: always increment reference count

bnx2x: fix napi poll return value for repoll

With the commit d75b1ade567ffab ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. When in busy_poll is we return 0
in napi_poll. We should return budget.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
ipsec 2015-01-26

Just two small fixes for _decode_session6() where we
might decode to wrong header information in some rare
situations.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too

Lubomir Rintel reported that during replacing a route the interface
reference counter isn't correctly decremented.

To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>:
| [root@rhel7-5 lkundrak]# sh -x lal
| + ip link add dev0 type dummy
| + ip link set dev0 up
| + ip link add dev1 type dummy
| + ip link set dev1 up
| + ip addr add 2001:db8:8086::2/64 dev dev0
| + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20
| + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10
| + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20
| + ip link del dev0 type dummy
| Message from syslogd@rhel7-5 at Jan 23 10:54:41 ...
| kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2
|
| Message from syslogd@rhel7-5 at Jan 23 10:54:51 ...
| kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2

During replacement of a rt6_info we must walk all parent nodes and check
if the to be replaced rt6_info got propagated. If so, replace it with
an alive one.

Fixes: 4a287eba2de3957 ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag")
Reported-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sh_eth'

Ben Hutchings says:

====================
Fixes for sh_eth #3

I'm continuing review and testing of Ethernet support on the R-Car H2
chip. This series fixes the last of the more serious issues I've found.

These are not tested on any of the other supported chips.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Fix DMA-API usage for RX buffers

- Use the return value of dma_map_single(), rather than calling
virt_to_page() separately
- Check for mapping failue
- Call dma_unmap_single() rather than dma_sync_single_for_cpu()

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Check for DMA mapping errors on transmit

dma_map_single() may fail if an IOMMU or swiotlb is in use, so
we need to check for this.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Ensure DMA engines are stopped before freeing buffers

Currently we try to clear EDRRR and EDTRR and immediately continue to
free buffers.  This is unsafe because:

- In general, register writes are not serialised with DMA, so we still
  have to wait for DMA to complete somehow
- The R8A7790 (R-Car H2) manual states that the TX running flag cannot
  be cleared by writing to EDTRR
- The same manual states that clearing the RX running flag only stops
  RX DMA at the next packet boundary

I applied this patch to the driver to detect DMA writes to freed
buffers:

> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1098,7 +1098,14 @@ static void sh_eth_ring_free(struct net_device *ndev)
>   /* Free Rx skb ringbuffer */
>   if (mdp->rx_skbuff) {
>   for (i = 0; i < mdp->num_rx_ring; i++)
> + memcpy(mdp->rx_skbuff[i]->data,
> +        "Hello, world", 12);
> + msleep(100);
> + for (i = 0; i < mdp->num_rx_ring; i++) {
> + WARN_ON(memcmp(mdp->rx_skbuff[i]->data,
> +        "Hello, world", 12));
>   dev_kfree_skb(mdp->rx_skbuff[i]);
> + }
>   }
>   kfree(mdp->rx_skbuff);
>   mdp->rx_skbuff = NULL;

then ran the loop:

    while ethtool -G eth0 rx 128 ; ethtool -G eth0 rx 64; do echo -n .; done

and 'ping -f' toward the sh_eth port from another machine.  The
warning fired several times a minute.

To fix these issues:

- Deactivate all TX descriptors rather than writing to EDTRR
- As there seems to be no way of telling when RX DMA is stopped,
  perform a soft reset to ensure that both DMA enginess are stopped
- To reduce the possibility of the reset truncating a transmitted
  frame, disable egress and wait a reasonable time to reach a
  packet boundary before resetting
- Update statistics before resetting

(The 'reasonable time' does not allow for CS/CD in half-duplex
mode, but half-duplex no longer seems reasonable!)

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Remove RX overflow log messages

If RX traffic is overflowing the FIFO or DMA ring, logging every time
this happens just makes things worse. These errors are visible in the
statistics anyway.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-3.19-20150127' of git://git./linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-01-27

this is another pull request for net/master which consists of 4 patches.

All 4 patches are contributed by Ahmed S. Darwish, he fixes more problems in
the kvaser_usb driver.

David, please merge net/master to net-next/master, as we have more kvaser_usb
patches in the queue, that target net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

ping: Fix race in free in receive path

An exception is seen in ICMP ping receive path where the skb
destructor sock_rfree() tries to access a freed socket. This happens
because ping_rcv() releases socket reference with sock_put() and this
internally frees up the socket. Later icmp_rcv() will try to free the
skb and as part of this, skb destructor is called and which leads
to a kernel panic as the socket is freed already in ping_rcv().

-->|exception
-007|sk_mem_uncharge
-007|sock_rfree
-008|skb_release_head_state
-009|skb_release_all
-009|__kfree_skb
-010|kfree_skb
-011|icmp_rcv
-012|ip_local_deliver_finish

Fix this incorrect free by cloning this skb and processing this cloned
skb instead.

This patch was suggested by Eric Dumazet

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

udp_diag: Fix socket skipping within chain

While working on rhashtable walking I noticed that the UDP diag
dumping code is buggy. In particular, the socket skipping within
a chain never happens, even though we record the number of sockets
that should be skipped.

As this code was supposedly copied from TCP, this patch does what
TCP does and resets num before we walk a chain.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

can: kvaser_usb: Fix state handling upon BUS_ERROR events

While being in an ERROR_WARNING state, and receiving further
bus error events with error counters still in the ERROR_WARNING
range of 97-127 inclusive, the state handling code erroneously
reverts back to ERROR_ACTIVE.

Per the CAN standard, only revert to ERROR_ACTIVE when the
error counters are less than 96.

Moreover, in certain Kvaser models, the BUS_ERROR flag is
always set along with undefined bits in the M16C status
register. Thus use bitwise operators instead of full equality
for checking that register against bus errors.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT

On some x86 laptops, plugging a Kvaser device again after an
unplug makes the firmware always ignore the very first command.
For such a case, provide some room for retries instead of
completely exiting the driver init code.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: kvaser_usb: Send correct context to URB completion

Send expected argument to the URB completion hander: a CAN
netdevice instead of the network interface private context
`kvaser_usb_net_priv'.

This was discovered by having some garbage in the kernel
log in place of the netdevice names: can0 and can1.

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

can: kvaser_usb: Do not sleep in atomic context

Upon receiving a hardware event with the BUS_RESET flag set,
the driver kills all of its anchored URBs and resets all of
its transmit URB contexts.

Unfortunately it does so under the context of URB completion
handler `kvaser_usb_read_bulk_callback()', which is often
called in an atomic context.

While the device is flooded with many received error packets,
usb_kill_urb() typically sleeps/reschedules till the transfer
request of each killed URB in question completes, leading to
the sleep in atomic bug. [3]

In v2 submission of the original driver patch [1], it was
stated that the URBs kill and tx contexts reset was needed
since we don't receive any tx acknowledgments later and thus
such resources will be locked down forever. Fortunately this
is no longer needed since an earlier bugfix in this patch
series is now applied: all tx URB contexts are reset upon CAN
channel close. [2]

Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF
event, which is the recommended handling method advised by
the device manufacturer.

[1] http://article.gmane.org/gmane.linux.network/239442
    http://www.webcitation.org/6Vr2yagAQ

[2] can: kvaser_usb: Reset all URB tx contexts upon channel close
    889b77f7fd2bcc922493d73a4c51d8a851505815

[3] Stacktrace:

<IRQ>  [<ffffffff8158de87>] dump_stack+0x45/0x57
[<ffffffff8158b60c>] __schedule_bug+0x41/0x4f
[<ffffffff815904b1>] __schedule+0x5f1/0x700
[<ffffffff8159360a>] ? _raw_spin_unlock_irqrestore+0xa/0x10
[<ffffffff81590684>] schedule+0x24/0x70
[<ffffffff8147d0a5>] usb_kill_urb+0x65/0xa0
[<ffffffff81077970>] ? prepare_to_wait_event+0x110/0x110
[<ffffffff8147d7d8>] usb_kill_anchored_urbs+0x48/0x80
[<ffffffffa01f4028>] kvaser_usb_unlink_tx_urbs+0x18/0x50 [kvaser_usb]
[<ffffffffa01f45d0>] kvaser_usb_rx_error+0xc0/0x400 [kvaser_usb]
[<ffffffff8108b14a>] ? vprintk_default+0x1a/0x20
[<ffffffffa01f5241>] kvaser_usb_read_bulk_callback+0x4c1/0x5f0 [kvaser_usb]
[<ffffffff8147a73e>] __usb_hcd_giveback_urb+0x5e/0xc0
[<ffffffff8147a8a1>] usb_hcd_giveback_urb+0x41/0x110
[<ffffffffa0008748>] finish_urb+0x98/0x180 [ohci_hcd]
[<ffffffff810cd1a7>] ? acct_account_cputime+0x17/0x20
[<ffffffff81069f65>] ? local_clock+0x15/0x30
[<ffffffffa000a36b>] ohci_work+0x1fb/0x5a0 [ohci_hcd]
[<ffffffff814fbb31>] ? process_backlog+0xb1/0x130
[<ffffffffa000cd5b>] ohci_irq+0xeb/0x270 [ohci_hcd]
[<ffffffff81479fc1>] usb_hcd_irq+0x21/0x30
[<ffffffff8108bfd3>] handle_irq_event_percpu+0x43/0x120
[<ffffffff8108c0ed>] handle_irq_event+0x3d/0x60
[<ffffffff8108ec84>] handle_fasteoi_irq+0x74/0x110
[<ffffffff81004dfd>] handle_irq+0x1d/0x30
[<ffffffff81004727>] do_IRQ+0x57/0x100
[<ffffffff8159482a>] common_interrupt+0x6a/0x6a

Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Merge tag 'mac80211-for-davem-2015-01-23' of git://git./linux/kernel/git/jberg/mac80211

Another set of last-minute fixes:
* fix station double-removal when suspending while associating
* fix the HT (802.11n) header length calculation
* fix the CCK radiotap flag used for monitoring, a pretty
old regression but a simple one-liner
* fix per-station group-key handling

Signed-off-by: David S. Miller <davem@davemloft.net>

ipv4: try to cache dst_entries which would cause a redirect

Not caching dst_entries which cause redirects could be exploited by hosts
on the same subnet, causing a severe DoS attack. This effect aggravated
since commit f88649721268999 ("ipv4: fix dst race in sk_dst_get()").

Lookups causing redirects will be allocated with DST_NOCACHE set which
will force dst_release to free them via RCU. Unfortunately waiting for
RCU grace period just takes too long, we can end up with >1M dst_entries
waiting to be released and the system will run OOM. rcuos threads cannot
catch up under high softirq load.

Attaching the flag to emit a redirect later on to the specific skb allows
us to cache those dst_entries thus reducing the pressure on allocation
and deallocation.

This issue was discovered by Marcelo Leitner.

Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bpf'

Alexei Starovoitov says:

====================
bpf: fix two bugs

Michael Holzheu caught two issues (in bpf syscall and in the test).
Fix them. Details in corresponding patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

samples: bpf: relax test_maps check

hash map is unordered, so get_next_key() iterator shouldn't
rely on particular order of elements. So relax this test.

Fixes: ffb65f27a155 ("bpf: add a testsuite for eBPF maps")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bpf: rcu lock must not be held when calling copy_to_user()

BUG: sleeping function called from invalid context at mm/memory.c:3732
in_atomic(): 0, irqs_disabled(): 0, pid: 671, name: test_maps
1 lock held by test_maps/671:
#0: (rcu_read_lock){......}, at: [<0000000000264190>] map_lookup_elem+0xe8/0x260
Call Trace:
([<0000000000115b7e>] show_trace+0x12e/0x150)
[<0000000000115c40>] show_stack+0xa0/0x100
[<00000000009b163c>] dump_stack+0x74/0xc8
[<000000000017424a>] ___might_sleep+0x23a/0x248
[<00000000002b58e8>] might_fault+0x70/0xe8
[<0000000000264230>] map_lookup_elem+0x188/0x260
[<0000000000264716>] SyS_bpf+0x20e/0x840

Fix it by allocating temporary buffer to store map element value.

Fixes: db20fd2b0108 ("bpf: add lookup/update/delete/iterate methods to BPF maps")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: sctp: fix slab corruption from use after free on INIT collisions

When hitting an INIT collision case during the 4WHS with AUTH enabled, as
already described in detail in commit 1be9a950c646 ("net: sctp: inherit
auth_capable on INIT collisions"), it can happen that we occasionally
still remotely trigger the following panic on server side which seems to
have been uncovered after the fix from commit 1be9a950c646 ...

[  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
[  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
[  533.940559] PGD 5030f2067 PUD 0
[  533.957104] Oops: 0000 [#1] SMP
[  533.974283] Modules linked in: sctp mlx4_en [...]
[  534.939704] Call Trace:
[  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
[  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
[  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
[  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
[  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
[  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
[  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
[  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

... or depending on the the application, for example this one:

[ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
[ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
[ 1370.054568] PGD 633c94067 PUD 0
[ 1370.070446] Oops: 0000 [#1] SMP
[ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
[ 1370.963431] Call Trace:
[ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
[ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
[ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
[ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
[ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

With slab debugging enabled, we can see that the poison has been overwritten:

[  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
[  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
[  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
[  669.826424]  __slab_alloc+0x4bf/0x566
[  669.826433]  __kmalloc+0x280/0x310
[  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
[  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
[  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
[  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
[  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
[  669.826635]  __slab_free+0x39/0x2a8
[  669.826643]  kfree+0x1d6/0x230
[  669.826650]  kzfree+0x31/0x40
[  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
[  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
[  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]

Since this only triggers in some collision-cases with AUTH, the problem at
heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
when having refcnt 1, once directly in sctp_assoc_update() and yet again
from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
the already kzfree'd memory, which is also consistent with the observation
of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
at a later point in time when poison is checked on new allocation).

Reference counting of auth keys revisited:

Shared keys for AUTH chunks are being stored in endpoints and associations
in endpoint_shared_keys list. On endpoint creation, a null key is being
added; on association creation, all endpoint shared keys are being cached
and thus cloned over to the association. struct sctp_shared_key only holds
a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
keeps track of users internally through refcounting. Naturally, on assoc
or enpoint destruction, sctp_shared_key are being destroyed directly and
the reference on sctp_auth_bytes dropped.

User space can add keys to either list via setsockopt(2) through struct
sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
with refcount 1 and in case of replacement drops the reference on the old
sctp_auth_bytes. A key can be set active from user space through setsockopt()
on the id via sctp_auth_set_active_key(), which iterates through either
endpoint_shared_keys and in case of an assoc, invokes (one of various places)
sctp_auth_asoc_init_active_key().

sctp_auth_asoc_init_active_key() computes the actual secret from local's
and peer's random, hmac and shared key parameters and returns a new key
directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
the reference if there was a previous one. The secret, which where we
eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
intitial refcount of 1, which also stays unchanged eventually in
sctp_assoc_update(). This key is later being used for crypto layer to
set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().

To close the loop: asoc->asoc_shared_key is freshly allocated secret
material and independant of the sctp_shared_key management keeping track
of only shared keys in endpoints and assocs. Hence, also commit 4184b2a79a76
("net: sctp: fix memory leak in auth key management") is independant of
this bug here since it concerns a different layer (though same structures
being used eventually). asoc->asoc_shared_key is reference dropped correctly
on assoc destruction in sctp_association_free() and when active keys are
being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
to remove that sctp_auth_key_put() from there which fixes these panics.

Fixes: 730fc3d05cd4 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'akpm' (patches from Andrew Morton)

Merge misc fixes from Andrew Morton:
"Six fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  drivers/rtc/rtc-s5m.c: terminate s5m_rtc_id array with empty element
  printk: add dummy routine for when CONFIG_PRINTK=n
  mm/vmscan: fix highidx argument type
  memcg: remove extra newlines from memcg oom kill log
  x86, build: replace Perl script with Shell script
  mm: page_alloc: embed OOM killing naturally into allocation slowpath

net: mv643xx_eth: Fix highmem support in non-TSO egress path

Commit 69ad0dd7af22b61d9e0e68e56b6290121618b0fb
Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date: Mon May 19 13:59:59 2014 -0300

net: mv643xx_eth: Use dma_map_single() to map the skb fragments

caused a nasty regression by removing the support for highmem skb
fragments. By using page_address() to get the address of a fragment's
page, we are assuming a lowmem page. However, such assumption is incorrect,
as fragments can be in highmem pages, resulting in very nasty issues.

This commit fixes this by using the skb_frag_dma_map() helper,
which takes care of mapping the skb fragment properly. Additionally,
the type of mapping is now tracked, so it can be unmapped using
dma_unmap_page or dma_unmap_single when appropriate.

This commit also fixes the error path in txq_init() to release the
resources properly.

Fixes: 69ad0dd7af22 ("net: mv643xx_eth: Use dma_map_single() to map the skb fragments")
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sh_eth'

Ben Hutchings says:

====================
Fixes for sh_eth #2

I'm continuing review and testing of Ethernet support on the R-Car H2
chip. This series fixes more of the issues I've found, but it won't be
the last set.

These are not tested on any of the other supported chips.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers

In order to stop the RX path accessing the RX ring while it's being
stopped or resized, we clear the interrupt mask (EESIPR) and then call
free_irq() or synchronise_irq().  This is insufficient because the
interrupt handler or NAPI poller may set EESIPR again after we clear
it.  Also, in sh_eth_set_ringparam() we currently don't disable NAPI
polling at all.

I could easily trigger a crash by running the loop:

   while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done

and 'ping -f' toward the sh_eth port from another machine.

To fix this:
- Add a software flag (irq_enabled) to signal whether interrupts
  should be enabled
- In the interrupt handler, if the flag is clear then clear EESIPR
  and return
- In the NAPI poller, if the flag is clear then don't set EESIPR
- Set the flag before enabling interrupts in sh_eth_dev_init() and
  sh_eth_set_ringparam()
- Clear the flag and serialise with the interrupt and NAPI
  handlers before clearing EESIPR in sh_eth_close() and
  sh_eth_set_ringparam()

After this, I could run the loop for 100,000 iterations successfully.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Fix crash or memory leak when resizing rings on device that is down

If the device is down then no packet buffers should be allocated.
We also must not touch its registers as it may be powered off.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Detach net device when stopping queue to resize DMA rings

We must only ever stop TX queues when they are full or the net device
is not 'ready' so far as the net core, and specifically the watchdog,
is concerned. Otherwise, the watchdog may fire *immediately* if no
packets have been added to the queue in the last 5 seconds.

What's more, sh_eth_tx_timeout() will likely crash if called while
we're resizing the TX ring.

I could easily trigger this by running the loop:

while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: Fix padding of short frames on TX

If an skb to be transmitted is shorter than the minimum Ethernet frame
length, we currently set the DMA descriptor length to the minimum but
do not add zero-padding. This could result in leaking sensitive
data. We also pass different lengths to dma_map_single() and
dma_unmap_single().

Use skb_padto() to pad properly, before calling dma_map_single().

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers: net: cpsw: discard dual emac default vlan configuration

In Dual EMAC, the default VLANs are used to segregate Rx packets between
the ports, so adding the same default VLAN to the switch will affect the
normal packet transfers. So returning error on addition of dual EMAC
default VLANs.

Even if EMAC 0 default port VLAN is added to EMAC 1, it will lead to
break dual EMAC port separations.

Fixes: d9ba8f9e6298 (driver: net: ethernet: cpsw: dual emac interface implementation)
Cc: <stable@vger.kernel.org> # v3.9+
Reported-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'cls_bpf'

Daniel Borkmann says:

====================
Two cls_bpf fixes

Found them while doing a review on act_bpf and going over the
cls_bpf code again. Will also address the first issue in act_bpf
as it needs to be fixed there, too.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: cls_bpf: fix auto generation of per list handles

When creating a bpf classifier in tc with priority collisions and
invoking automatic unique handle assignment, cls_bpf_grab_new_handle()
will return a wrong handle id which in fact is non-unique. Usually
altering of specific filters is being addressed over major id, but
in case of collisions we result in a filter chain, where handle ids
address individual cls_bpf_progs inside the classifier.

Issue is, in cls_bpf_grab_new_handle() we probe for head->hgen handle
in cls_bpf_get() and in case we found a free handle, we're supposed
to use exactly head->hgen. In case of insufficient numbers of handles,
we bail out later as handle id 0 is not allowed.

Fixes: 7d1d65cb84e1 ("net: sched: cls_bpf: add BPF-based classifier")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: cls_bpf: fix size mismatch on filter preparation

In cls_bpf_modify_existing(), we read out the number of filter blocks,
do some sanity checks, allocate a block on that size, and copy over the
BPF instruction blob from user space, then pass everything through the
classic BPF checker prior to installation of the classifier.

We should reject mismatches here, there are 2 scenarios: the number of
filter blocks could be smaller than the provided instruction blob, so
we do a partial copy of the BPF program, and thus the instructions will
either be rejected from the verifier or a valid BPF program will be run;
in the other case, we'll end up copying more than we're supposed to,
and most likely the trailing garbage will be rejected by the verifier
as well (i.e. we need to fit instruction pattern, ret {A,K} needs to be
last instruction, load/stores must be correct, etc); in case not, we
would leak memory when dumping back instruction patterns. The code should
have only used nla_len() as Dave noted to avoid this from the beginning.
Anyway, lets fix it by rejecting such load attempts.

Fixes: 7d1d65cb84e1 ("net: sched: cls_bpf: add BPF-based classifier")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'linux-can-fixes-for-3.19-20150121' of git://git./linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-01-21

this is a pull request for v3.19, net/master, which consists of a single patch.

Viktor Babrian fixes the issue in the c_can dirver, that the CAN interface
might continue to send frames after the interface has been shut down.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-3.19-fixes' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
"The lifetime rules of cgroup hierarchies always have been somewhat
  counter-intuitive and cgroup core tried to enforce that hierarchies
  w/o userland-visible usages must die in finite amount of time so that
  the controllers can be reused for other hierarchies; unfortunately,
  this can't be implemented reasonably for the memory controller - the
  kmemcg part doesn't have any way to forcefully drain the existing
  usages, leading to an interruptible hang if a following mount attempts
  to use the controller in any way.

  So, it seems like we're stuck with "hierarchies live on till they die
  whenever that may be" at least for now.  This pretty much confines
  attaching controllers to hierarchies to before the hierarchies are
  actively used by making dynamic configurations post active usages
  unreliable.  This has never been reliable and should be fine in
  practice given how cgroups are used.

  After the patch, hierarchies aren't killed if it isn't already
  drained.  A following mount attempt of the same mount options will
  reuse the existing hierarchy.  Mount attempts with differing options
  will fail w/ -EBUSY"

* 'for-3.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: prevent mount hang due to memory controller lifetime

Merge tag 'regulator-v3.19-rc6' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
"One correctness fix here for the s2mps11 driver which would have
  resulted in some of the regulators being completely broken together
  with a fix for locking in regualtor_put() (which is fortunately rarely
  called at all in practical systems)"

* tag 'regulator-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: s2mps11: Fix wrong calculation of register offset
  regulator: core: fix race condition in regulator_put()

Merge tag 'spi-v3.19-rc6' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
"A few driver specific fixes here, some fixes for issues introduced and
  discovered during recent work on the DesignWare driver (which has been
  getting a lot of attention recently) and a couple of other drivers.
  All serious things for people who run into them"

* tag 'spi-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: dw: amend warning message
  spi: sh-msiof: fix MDR1_FLD_MASK value
  spi: dw-mid: fix FIFO size
  spi: dw: Fix detecting FIFO depth
  spi/pxa2xx: Clear cur_chip pointer before starting next message

drivers/rtc/rtc-s5m.c: terminate s5m_rtc_id array with empty element

Array of platform_device_id elements should be terminated with empty
element.

Fixes: 5bccae6ec458 ("rtc: s5m-rtc: add real-time clock driver for s5m8767")
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

printk: add dummy routine for when CONFIG_PRINTK=n

There are missing dummy routines for log_buf_addr_get() and
log_buf_len_get() for when CONFIG_PRINTK is not set causing build
failures.

This patch adds these dummy routines at the appropriate location.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/vmscan: fix highidx argument type

for_each_zone_zonelist_nodemask wants an enum zone_type argument, but is
passed gfp_t:

  mm/vmscan.c:2658:9:    expected int enum zone_type [signed] highest_zoneidx
  mm/vmscan.c:2658:9:    got restricted gfp_t [usertype] gfp_mask
  mm/vmscan.c:2658:9: warning: incorrect type in argument 2 (different base types)
  mm/vmscan.c:2658:9:    expected int enum zone_type [signed] highest_zoneidx
  mm/vmscan.c:2658:9:    got restricted gfp_t [usertype] gfp_mask

convert argument to the correct type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memcg: remove extra newlines from memcg oom kill log

Commit e61734c55c24 ("cgroup: remove cgroup->name") added two extra
newlines to memcg oom kill log messages.  This makes dmesg hard to read
and parse.  The issue affects 3.15+.

Example:

  Task in /t                          <<< extra #1
   killed as a result of limit of /t
                                      <<< extra #2
  memory: usage 102400kB, limit 102400kB, failcnt 274712

Remove the extra newlines from memcg oom kill messages, so the messages
look like:

  Task in /t killed as a result of limit of /t
  memory: usage 102400kB, limit 102400kB, failcnt 240649

Fixes: e61734c55c24 ("cgroup: remove cgroup->name")
Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

x86, build: replace Perl script with Shell script

Commit e6023367d779 ("x86, kaslr: Prevent .bss from overlaping initrd")
added Perl to the required build environment. This reimplements in
shell the Perl script used to find the size of the kernel with bss and
brk added.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Rob Landley <rob@landley.net>
Acked-by: Rob Landley <rob@landley.net>
Cc: Anca Emanuel <anca.emanuel@gmail.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Junjie Mao <eternal.n08@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: page_alloc: embed OOM killing naturally into allocation slowpath

The OOM killing invocation does a lot of duplicative checks against the
task's allocation context.  Rework it to take advantage of the existing
checks in the allocator slowpath.

The OOM killer is invoked when the allocator is unable to reclaim any
pages but the allocation has to keep looping.  Instead of having a check
for __GFP_NORETRY hidden in oom_gfp_allowed(), just move the OOM
invocation to the true branch of should_alloc_retry().  The __GFP_FS
check from oom_gfp_allowed() can then be moved into the OOM avoidance
branch in __alloc_pages_may_oom(), along with the PF_DUMPCORE test.

__alloc_pages_may_oom() can then signal to the caller whether the OOM
killer was invoked, instead of requiring it to duplicate the order and
high_zoneidx checks to guess this when deciding whether to continue.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge remote-tracking branches 'spi/fix/dw', 'spi/fix/msiof' and 'spi/fix/pxa2xx' into spi-linus

Merge branch 's390'

Ursula Braun says:

====================
s390/qeth patches for net

here are two s390/qeth patches built for net.
One patch is quite large, but we would like to fix the locking warning
seen in recent kernels as soon as possible. But if you want me to submit
these patches for net-next, I will do.
Or Gerlitz says:
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

390/qeth: Fix locking warning during qeth device setup

Do not wait for channel command buffers in IPA commands.
The potential wait could be done while holding a spin lock and causes
in recent kernels such a bug if kernel lock debugging is enabled:

kernel: BUG: sleeping function called from invalid context at drivers/s390/net/qeth_core_main.c:
794
kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 2031, name: NetworkManager
kernel: 2 locks held by NetworkManager/2031:
kernel:  #0:  (rtnl_mutex){+.+.+.}, at: [<00000000006e0d7a>] rtnetlink_rcv+0x32/0x50
kernel:  #1:  (_xmit_ETHER){+.....}, at: [<00000000006cfe90>] dev_set_rx_mode+0x30/0x50
kernel: CPU: 0 PID: 2031 Comm: NetworkManager Not tainted 3.18.0-rc5-next-20141124 #1
kernel:        00000000275fb1f0 00000000275fb280 0000000000000002 0000000000000000
               00000000275fb320 00000000275fb298 00000000275fb298 00000000007e326a
               0000000000000000 000000000099ce2c 00000000009b4988 000000000000000b
               00000000275fb2e0 00000000275fb280 0000000000000000 0000000000000000
               0000000000000000 00000000001129c8 00000000275fb280 00000000275fb2e0
kernel: Call Trace:
kernel: ([<00000000001128b0>] show_trace+0xf8/0x158)
kernel:  [<000000000011297a>] show_stack+0x6a/0xe8
kernel:  [<00000000007e995a>] dump_stack+0x82/0xb0
kernel:  [<000000000017d668>] ___might_sleep+0x170/0x228
kernel:  [<000003ff80026f0e>] qeth_wait_for_buffer+0x36/0xd0 [qeth]
kernel:  [<000003ff80026fe2>] qeth_get_ipacmd_buffer+0x3a/0xc0 [qeth]
kernel:  [<000003ff80105078>] qeth_l3_send_setdelmc+0x58/0xf8 [qeth_l3]
kernel:  [<000003ff8010b1fe>] qeth_l3_set_ip_addr_list+0x2c6/0x848 [qeth_l3]
kernel:  [<000003ff8010bbb4>] qeth_l3_set_multicast_list+0x434/0xc48 [qeth_l3]
kernel:  [<00000000006cfe9a>] dev_set_rx_mode+0x3a/0x50
kernel:  [<00000000006cff90>] __dev_open+0xe0/0x140
kernel:  [<00000000006d02a0>] __dev_change_flags+0xa0/0x178
kernel:  [<00000000006d03a8>] dev_change_flags+0x30/0x70
kernel:  [<00000000006e14ee>] do_setlink+0x346/0x9a0
...

The device driver has plenty of command buffers available
per channel for channel command communication.
In the extremely rare case when there is no command buffer
available, return a NULL pointer and issue a warning
in the kernel log. The caller handles the case when
a NULL pointer is encountered and returns an error.

In the case the wait for command buffer is possible
(because no lock is held as in the OSN case), still wait
until a channel command buffer is available.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Reviewed-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qeth: clean up error handling

In the functions that are registering and unregistering MAC
addresses in the qeth-handled hardware, remove callback functions
that are unnesessary, as only the return code is analyzed.
Translate hardware response codes to semi-standard 'errno'-like
codes for readability.

Add kernel-doc description to the internal API function
qeth_send_control_data().

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Reviewed-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipv6: Fix __ip6_route_redirect

In my last commit (a3c00e4: ipv6: Remove BACKTRACK macro), the changes in
__ip6_route_redirect is incorrect.  The following case is missed:
1. The for loop tries to find a valid gateway rt. If it fails to find
   one, rt will be NULL.
2. When rt is NULL, it is set to the ip6_null_entry.
3. The newly added 'else if', from a3c00e4, will stop the backtrack from
   happening.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 3.19-rc6

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"Hopefully the last round of fixes for 3.19

   - regression fix for the LDT changes
   - regression fix for XEN interrupt handling caused by the APIC
     changes
   - regression fixes for the PAT changes
   - last minute fixes for new the MPX support
   - regression fix for 32bit UP
   - fix for a long standing relocation issue on 64bit tagged for stable
   - functional fix for the Hyper-V clocksource tagged for stable
   - downgrade of a pr_err which tends to confuse users

  Looks a bit on the large side, but almost half of it are valuable
  comments"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Change Fast TSC calibration failed from error to info
  x86/apic: Re-enable PCI_MSI support for non-SMP X86_32
  x86, mm: Change cachemode exports to non-gpl
  x86, tls: Interpret an all-zero struct user_desc as "no segment"
  x86, tls, ldt: Stop checking lm in LDT_empty
  x86, mpx: Strictly enforce empty prctl() args
  x86, mpx: Fix potential performance issue on unmaps
  x86, mpx: Explicitly disable 32-bit MPX support on 64-bit kernels
  x86, hyperv: Mark the Hyper-V clocksource as being continuous
  x86: Don't rely on VMWare emulating PAT MSR correctly
  x86, irq: Properly tag virtualization entry in /proc/interrupts
  x86, boot: Skip relocs when load address unchanged
  x86/xen: Override ACPI IRQ management callback __acpi_unregister_gsi
  ACPI: pci: Do not clear pci_dev->irq in acpi_pci_irq_disable()
  x86/xen: Treat SCI interrupt as normal GSI interrupt

Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"From the irqchip departement you get:

   - regression fix for omap-intc

   - regression fix for atmel-aic-common

   - functional correctness fix for hip04

   - type mismatch fix for gic-v3-its

   - proper error pointer check for mtd-sysirq

  Mostly one and two liners except for the omap regression fix which is
  slightly larger than desired"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: atmel-aic-common: Prevent clobbering of priority when changing IRQ type
  irqchip: omap-intc: Fix legacy DMA regression
  irqchip: gic-v3-its: Fix use of max with decimal constant
  irqchip: hip04: Initialize hip04_cpu_map to 0xffff
  irqchip: mtk-sysirq: Use IS_ERR() instead of NULL pointer check

Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
"A set of small fixes:

   - regression fix for exynos_mct clocksource

   - trivial build fix for kona clocksource

   - functional one liner fix for the sh_tmu clocksource

   - two validation fixes to prevent (root only) data corruption in the
     kernel via settimeofday and adjtimex.  Tagged for stable"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: adjtimex: Validate the ADJ_FREQUENCY values
  time: settimeofday: Validate the values of tv from user
  clocksource: sh_tmu: Set cpu_possible_mask to fix SMP broadcast
  clocksource: kona: fix __iomem annotation
  clocksource: exynos_mct: Fix bitmask regression for exynos4_mct_write

Merge tag 'armsoc-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
"A week's worth of fixes for various ARM platforms.  Diff wise, the
  largest fix is for OMAP to deal with how GIC now registers interrupts
  (irq_domain_add_legacy() -> irq_domain_add_linear() changes).

  Besides this, a few more renesas platforms needed the GIC instatiation
  done for legacy boards.  There's also a fix that disables coherency of
  mvebu due to issues, and a few other smaller fixes"

* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: dts: add baud rate to Juno stdout-path
  ARM: dts: imx25: Fix PWM "per" clocks
  bus: mvebu-mbus: fix support of MBus window 13
  Merge tag 'mvebu-fixes-3.19-3' of git://git.infradead.org/linux-mvebu into fixes
  ARM: mvebu: completely disable hardware I/O coherency
  ARM: OMAP: Work around hardcoded interrupts
  ARM: shmobile: r8a7779: Instantiate GIC from C board code in legacy builds
  ARM: shmobile: r8a7778: Instantiate GIC from C board code in legacy builds
  arm: boot: dts: dra7: enable dwc3 suspend PHY quirk

Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
"A couple of fixes - deadlock in CIFS and build breakage in cris serial
  driver (resurfaced f_dentry in there)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  VFS: Convert file->f_dentry->d_inode to file_inode()
  fix deadlock in cifs_ioctl_clone()

Merge tag 'dm-3.19-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:
"Two stable fixes for dm-cache and one 3.19 DM core fix:

   - fix potential for dm-cache metadata corruption via stale metadata
     buffers being used when switching an inactive cache table to
     active; this could occur due to each table having it's own bufio
     client rather than sharing the client between tables.

   - fix dm-cache target to properly account for discard IO while
     suspending otherwise IO quiescing could complete prematurely.

   - fix DM core's handling of multiple internal suspends by maintaining
     an 'internal_suspend_count' and only resuming the device when this
     count drops to zero"

* tag 'dm-3.19-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: fix handling of multiple internal suspends
  dm cache: fix problematic dual use of a single migration count variable
  dm cache: share cache-metadata object across inactive and active DM tables

Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull two block layer fixes from Jens Axboe:
"Two small patches that should make it into 3.19:

   - a fixup from me for NVMe, making the cq_vector a signed variable.
     Otherwise our -1 comparison fails, and commit 2b25d981790b doesn't
     do what it was supposed to.

   - a fixup for the hotplug handling for blk-mq from Ming Lei, using
     the proper kobject referencing to ensure we release resources at
     the right time"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: fix hctx/ctx kobject use-after-free
  NVMe: cq_vector should be signed

net: dsa: set slave MII bus PHY mask

When registering a mdio bus, Linux assumes than every port has a PHY and tries
to scan it. If a switch port has no PHY registered, DSA will fail to register
the slave MII bus. To fix this, set the slave MII bus PHY mask to the switch
PHYs mask.

As an example, if we use a Marvell MV88E6352 (which is a 7-port switch with no
registered PHYs for port 5 and port 6), with the following declared names:

static struct dsa_chip_data switch_cdata = {
[...]
.port_names[0] = "sw0",
.port_names[1] = "sw1",
.port_names[2] = "sw2",
.port_names[3] = "sw3",
.port_names[4] = "sw4",
.port_names[5] = "cpu",
};

DSA will fail to create the switch instance. With the PHY mask set for the
slave MII bus, only the PHY for ports 0-4 will be scanned and the instance will
be successfully created.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ipvlan: fix incorrect usage of IS_ERR() macro in IPv6 code path.

The ip6_route_output() always returns a valid dst pointer unlike in IPv4
case. So the validation has to be different from the IPv4 path. Correcting
that error in this patch.

This was picked up by a static checker with a following warning -

drivers/net/ipvlan/ipvlan_core.c:380 ipvlan_process_v6_outbound()
warn: 'dst' isn't an ERR_PTR

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: llc: use correct size for sysctl timeout entries

The timeout entries are sizeof(int) rather than sizeof(long), which
means that when they were getting read we'd also leak kernel memory
to userspace along with the timeout values.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

netxen: fix netxen_nic_poll() logic

NAPI poll logic now enforces that a poller returns exactly the budget
when it wants to be called again.

If a driver limits TX completion, it has to return budget as well when
the limit is hit, not the number of received packets.

Reported-and-tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: d75b1ade567f ("net: less interrupt masking in NAPI")
Cc: Manish Chopra <manish.chopra@qlogic.com>
Acked-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

enic: fix rx napi poll return value

With the commit d75b1ade567ffab ("net: less interrupt masking in NAPI") napi repoll
is done only when work_done == budget. When we are in busy_poll we return 0 in
napi_poll. We should return budget.

Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'wireless-drivers-for-davem-2015-01-20' of git://git./linux/kernel/git/kvalo/wireless-drivers

ath9k:

* fix an IRQ storm caused by commit 872b5d814f99

iwlwifi:

* A fix for scan that fixes a firmware assertion

* A fix that improves roaming behavior. Same fix has been tested for
  a while in iwldvm. This is a bit of a work around, but the real fix
  should be in mac80211 and will come later.

* A fix for BARs that avoids a WARNING.

* one fix for rfkill while scheduled scan is running.
  Linus's system hit this issue. WiFi would be unavailable
  after this has happpened because of bad state in cfg80211.

Signed-off-by: David S. Miller <davem@davemloft.net>

ARM: dts: imx6sx: correct i.MX6sx sdb board enet phy address

The commit (3d125f9c91c5) cause i.MX6SX sdb enet cannot work. The cause is
the commit add mdio node with un-correct phy address.

The patch just correct i.MX6sx sdb board enet phy address.

V2:
* As Shawn's suggestion that unit-address should match 'reg' property, so
update ethernet-phy unit-address.

Acked-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Fugang Duan <B38611@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

pULL SCSI fixes from James Bottomley:
"This consists of four real fixes and three MAINTAINER updates.

  Three of the fixes are obvious (the DIX and atomic allocation are bug
  on and warn on fixes and the other is just trivial) and the ipr one is
  a bit more involved but is required because without it, the card
  double completes aborted commands and causes a kernel oops"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  MAINTAINERS: ibmvscsi driver maintainer change
  MAINTAINERS: ibmvfc driver maintainer change
  MAINTAINERS: Remove self as isci maintainer
  scsi_debug: test always evaluates to false, || should be used instead
  scsi: Avoid crashing if device uses DIX but adapter does not support it
  scsi_debug: use atomic allocation in resp_rsup_opcodes
  ipr: wait for aborted command responses

Merge git://www.linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:
"This will fix reboot issues with the imx2_wdt driver and it also drops
  some forgotten owner assignments from platform_drivers"

* git://www.linux-watchdog.org/linux-watchdog:
  watchdog: drop owner assignment from platform_drivers
  watchdog: imx2_wdt: Disable power down counter on boot
  watchdog: imx2_wdt: Improve power management support.

Merge branch 'hwmon-for-linus' of git://git./linux/kernel/git/jdelvare/staging

Pull hwmon update from Jean Delvare:
"This contains a single thing: a new driver for the temperature sensor
  embedded in the Intel 5500/5520/X58 chipsets.

  Sorry for the late request, it's been so long since I last sent a pull
  request and I've been so busy with other tasks meanwhile that I simply
  forgot about these patches.  But given that this is a new driver, it
  can't introduce any regression so I thought it could still be OK.

  This has been in linux-next for months now"

* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (i5500_temp) Convert to use ATTRIBUTE_GROUPS macro
  hwmon: (i5500_temp) Convert to module_pci_driver
  hwmon: (i5500_temp) Don't bind to disabled sensors
  hwmon: (i5500_temp) Convert to devm_hwmon_device_register_with_groups
  hwmon: (i5500_temp) New driver for the Intel 5500/5520/X58 chipsets

Merge tag 'media/v3.19-4' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
  - fix some race conditions caused by a regression on videobuf2
  - fix a interrupt release bug on cx23885
  - fix support for Mygica T230 and HVR4400
  - fix compilation breakage when USB is not selected on tlg2300
  - fix capabilities report on ompa3isp, soc-camera, rcar_vin and
    pvrusb2

* tag 'media/v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] omap3isp: Correctly set QUERYCAP capabilities
  [media] cx23885: fix free interrupt bug
  [media] pvrusb2: fix missing device_caps in querycap
  [media] vb2: fix vb2_thread_stop race conditions
  [media] rcar_vin: Update device_caps and capabilities in querycap
  [media] soc-camera: fix device capabilities in multiple camera host drivers
  [media] Fix Mygica T230 support
  [media] cx23885: Split Hauppauge WinTV Starburst from HVR4400 card entry
  [media] tlg2300: Fix media dependencies

dm: fix handling of multiple internal suspends

Commit ffcc393641 ("dm: enhance internal suspend and resume interface")
attempted to handle multiple internal suspends on the same device, but
it did that incorrectly. When these functions are called in this order
on the same device the device is no longer suspended, but it should be:
dm_internal_suspend_noflush
dm_internal_suspend_noflush
dm_internal_resume

Fix this bug by maintaining an 'internal_suspend_count' and resuming
the device when this count drops to zero.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

hwmon: (i5500_temp) Convert to use ATTRIBUTE_GROUPS macro

Use ATTRIBUTE_GROUPS macro to simplify the code a bit.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Jean Delvare <jdelvare@suse.de>

hwmon: (i5500_temp) Convert to module_pci_driver

Use module_pci_driver to simplify the code a bit.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jean Delvare <jdelvare@suse.de>

hwmon: (i5500_temp) Don't bind to disabled sensors

On many motherboards, for an unknown reason, the thermal sensor seems
to be disabled and will return a constant temperature value of 36.5
degrees Celsius. Don't bind to the device in that case, so that we
don't report this bogus value to userspace.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Romain Dolbeau <romain@dolbeau.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (i5500_temp) Convert to devm_hwmon_device_register_with_groups

Use devm_hwmon_device_register_with_groups() to simplify the code a
bit.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Romain Dolbeau <romain@dolbeau.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>

hwmon: (i5500_temp) New driver for the Intel 5500/5520/X58 chipsets

The Intel 5500, 5520 and X58 chipsets embed a digital thermal sensor.
This new driver supports it.

Note that on many boards the sensor seems to be disabled and reports
the minimum value (36.5 degrees Celsius) all the time.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Tested-by: Romain Dolbeau <romain@dolbeau.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>

amd-xgbe: Use proper Rx flow control register

Updated hardware documention shows the Rx flow control settings were
moved from the Rx queue operation mode register to a new Rx queue flow
control register. The old flow control settings are now reserved areas
of the Rx queue operation mode register. Update the code to use the new
register.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"We have a few fixes in my for-linus branch.

  Qu Wenruo's batch fix a regression between some our merge window pull
  and the inode_cache feature.  The rest are smaller bugs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: Don't call btrfs_start_transaction() on frozen fs to avoid deadlock.
  btrfs: Fix the bug that fs_info->pending_changes is never cleared.
  btrfs: fix state->private cast on 32 bit machines
  Btrfs: fix race deleting block group from space_info->ro_bgs list
  Btrfs: fix incorrect freeing in scrub_stripe
  btrfs: sync ioctl, handle errors after transaction start

Merge tag 'platform-drivers-x86-v3.19-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86

Pull platform driver fix from Darren Hart:
"Revert keyboard backlight sysfs support and documentation.

  The support for the dell-laptop keyboard backlight was flawed and the
  fix:

        https://lkml.org/lkml/2015/1/14/539

  was more invasive that I felt comfortable sending at RC5.

  This series reverts the support for the dell-laptop keyboard backlight
  as well as the documentation for the newly created sysfs attributes.

  We'll get this implemented correctly for 3.20"

* tag 'platform-drivers-x86-v3.19-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86:
  Revert "platform: x86: dell-laptop: Add support for keyboard backlight"
  Revert "Documentation: Add entry for dell-laptop sysfs interface"

Merge tag 'pci-v3.19-fixes-1' of git://git./linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
"These are fixes for:

   - a resource management problem that causes a Radeon "Fatal error
     during GPU init" on machines where the BIOS programmed an invalid
     Root Port window.  This was a regression in v3.16.

   - an Atheros AR93xx device that doesn't handle PCI bus resets
     correctly.  This was a regression in v3.14.

   - an out-of-date email address"

* tag 'pci-v3.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Update Richard Zhu's email address
  sparc/PCI: Clip bridge windows to fit in upstream windows
  powerpc/PCI: Clip bridge windows to fit in upstream windows
  parisc/PCI: Clip bridge windows to fit in upstream windows
  mn10300/PCI: Clip bridge windows to fit in upstream windows
  microblaze/PCI: Clip bridge windows to fit in upstream windows
  ia64/PCI: Clip bridge windows to fit in upstream windows
  frv/PCI: Clip bridge windows to fit in upstream windows
  alpha/PCI: Clip bridge windows to fit in upstream windows
  x86/PCI: Clip bridge windows to fit in upstream windows
  PCI: Add pci_claim_bridge_resource() to clip window if necessary
  PCI: Add pci_bus_clip_resource() to clip to fit upstream window
  PCI: Pass bridge device, not bus, when updating bridge windows
  PCI: Mark Atheros AR93xx to avoid bus reset
  PCI: Add flag for devices where we can't use bus reset

Merge tag 'devicetree-for-linus' of git://git./linux/kernel/git/glikely/linux

Pull devicetree bug fixes and documentation updates from Grant Likely:
"A few bugfixes for the new DT overlay feature, documentation updates,
  spelling corrections, and changes to MAINTAINERS.  Nothing earth
  shattering here"

* tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux:
  of/unittest: Overlays with sub-devices tests
  of/platform: Handle of_populate drivers in notifier
  of/overlay: Do not generate duplicate nodes
  devicetree: document the "qemu" and "virtio" vendor prefixes
  devicetree: document ARM bindings for QEMU's Firmware Config interface
  Documentation: of: fix typo in graph bindings
  dma-mapping: fix debug print to display correct dma_pfn_offset
  of: replace Asahi Kasei Corp vendor prefix
  ARM: dt: GIC: Spelling s/specific/specifier/, s/flaggs/flags/
  dt/bindings: arm-boards: Spelling s/pointong/pointing/
  MAINTAINERS: Update DT website and git repository
  MAINTAINERS: drop DT regex matching on of_get_property and of_match_table

Merge tag 'imx-fixes-3.19-2' of git://git./linux/kernel/git/shawnguo/linux into fixes

Merge "ARM: imx: fixes for 3.19, 2nd round" from Shawn Guo:

The i.MX fixes for 3.19, 2nd round:
- Correct pwm clock assignment in i.MX25 device tree to fix the broken
pwm support on i.MX25

* tag 'imx-fixes-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx25: Fix PWM "per" clocks

Signed-off-by: Olof Johansson <olof@lixom.net>

arm64: dts: add baud rate to Juno stdout-path

Without explicit command-line parameters, the Juno UART ends up running
at 57600 baud in the kernel, which is at odds with the 115200 baud used
by the rest of the firmware. Since commit 7914a7c5651a5161 now lets us
fix this by specifying default options in stdout-path, do so.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'mvebu-fixes-3.19-4' of git://git.infradead.org/linux-mvebu into fixes

Merge "mvebu/fixes #3" from Andrew Lunn:

mvebu fixes for 3.19. (Part 4)

bus: mvebu-mbus: fix support of MBus window 13

* tag 'mvebu-fixes-3.19-4' of git://git.infradead.org/linux-mvebu:
bus: mvebu-mbus: fix support of MBus window 13
ARM: mvebu: completely disable hardware I/O coherency

Signed-off-by: Olof Johansson <olof@lixom.net>

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Three small fixes.

  Two for x86 and one avoids that sparse bails out"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: SYSENTER emulation is broken
  KVM: x86: Fix of previously incomplete fix for CVE-2014-8480
  KVM: fix sparse warning in include/trace/events/kvm.h

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
"Another round of small ARM fixes.

  restore_user_regs early stack deallocation is buggy in the presence of
  FIQs which switch to SVC mode, and could lead to corrupted registers
  being returned to a user process given an inopportune FIQ event.

  Another bug was spotted in the ARM perf code where it could lose track
  of perf counter overflows, leading to incorrect perf results.

  Lastly, a bug in arm_add_memory() was spotted where the memory sizes
  aren't properly rounded.  As most people pass properly rounded sizes,
  this hasn't been noticed"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8292/1: mm: fix size rounding-down of arm_add_memory() function
  ARM: 8255/1: perf: Prevent wraparound during overflow
  ARM: 8266/1: Remove early stack deallocation from restore_user_regs

Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull two arm64 fixes from Will Deacon:
"Arm64 fixes seem to come in pairs recently.  We've got a fix for
  removing device-tree blobs when doing a make clean and another one
  addressing a missing include, which fixes build failures in -next for
  allmodconfig (spotted by Mark's buildbot).

  Summary from signed tag:

   - fix cleaning of .dtbs following directory restructuring
   - fix allmodconfig build breakage in -next due to missing include"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: dump: Fix implicit inclusion of definition for PCI_IOBASE
  arm64: Add dtb files to archclean rule

Revert "platform: x86: dell-laptop: Add support for keyboard backlight"

This reverts commit 02b2aaaa57ab41504e8d03a3b2ceeb9440a2c188.

This interface was determined to be flawed and required too invasive a
fix for the RC cycle. This will be revisited in 3.20.

Signed-off-by: Darren Hart <dvhart@linux.intel.com>

Revert "Documentation: Add entry for dell-laptop sysfs interface"

This reverts commit 3161293ba6dfceee9c1efe75185677445def05d4.

This interface was determined to be flawed and required too invasive a
fix for the RC cycle. This will be revisited in 3.20.

Signed-off-by: Darren Hart <dvhart@linux.intel.com>

dm cache: fix problematic dual use of a single migration count variable

Introduce a new variable to count the number of allocated migration
structures.  The existing variable cache->nr_migrations became
overloaded.  It was used to:

i) track of the number of migrations in flight for the purposes of
    quiescing during suspend.

ii) to estimate the amount of background IO occuring.

Recent discard changes meant that REQ_DISCARD bios are processed with
a migration.  Discards are not background IO so nr_migrations was not
incremented.  However this could cause quiescing to complete early.

(i) is now handled with a new variable cache->nr_allocated_migrations.
cache->nr_migrations has been renamed cache->nr_io_migrations.
cleanup_migration() is now called free_io_migration(), since it
decrements that variable.

Also, remove the unused cache->next_migration variable that got replaced
with with prealloc_structs a while ago.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

dm cache: share cache-metadata object across inactive and active DM tables

If a DM table is reloaded with an inactive table when the device is not
suspended (normal procedure for LVM2), then there will be two dm-bufio
objects that can diverge. This can lead to a situation where the
inactive table uses bufio to read metadata at the same time the active
table writes metadata -- resulting in the inactive table having stale
metadata buffers once it is promoted to the active table slot.

Fix this by using reference counting and a global list of cache metadata
objects to ensure there is only one metadata object per metadata device.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org